Karan Sharma

Resetting the WAL Log in PostgreSQL After a Crash

1 minutes (260 words)

In an event of a DB crash or an unclean shutdown (host node failure), sometimes Postgres cannot find a valid checkpoint to start. The logs look like:

2021-02-16 04:09:44.977 UTC [26] LOG:  database system was shut down at 2021-02-16 03:50:02 UTC
2021-02-16 04:09:44.978 UTC [26] LOG:  record with incorrect prev-link 1DF0000/7BB0000 at 0/D786C2B0
2021-02-16 04:09:44.978 UTC [26] LOG:  invalid primary checkpoint record
2021-02-16 04:09:44.978 UTC [26] PANIC:  could not locate a valid checkpoint record
2021-02-16 04:09:45.252 UTC [25] LOG:  startup process (PID 26) was terminated by signal 6: Aborted
2021-02-16 04:09:45.252 UTC [25] LOG:  aborting startup due to startup process failure
2021-02-16 04:09:45.254 UTC [25] LOG:  database system is shut down

The line to notice here is:

2021-02-16 04:09:44.978 UTC [26] PANIC:  could not locate a valid checkpoint record

In such a scenario, perform a WAL Log Reset with pg_resetwal:

Quoting from the docs:

pg_resetwal clears the write-ahead log (WAL) and optionally resets some other control information stored in the pg_control file. This function is sometimes needed if these files have become corrupted. It should be used only as a last resort, when the server will not start due to such corruption.

$ su postgres
$ pg_resetwal /var/lib/postgresql/data

Postgres should initialise now. But as the docs clarify further:

After running this command, it should be possible to start the server, but bear in mind that the database might contain inconsistent data due to partially-committed transactions. You should immediately dump your data, run initdb, and reload. After reload, check for inconsistencies and repair as needed.

Tags: #Postgres