When PostgreSQL cannot find a valid checkpoint from which to begin recovery, it fails with the error "PANIC: could not locate a valid checkpoint record". This can happen when the PostgreSQL container is stopped or restarted without a clean shutdown.
Prevent data corruption
For stateful applications like databases, the first method is to use StatefulSets rather than Deployments; this is a Kubernetes best practice. To preserve the stability and integrity of your database, StatefulSets offer guarantees regarding the ordering and uniqueness of pods. A StatefulSet also waits for a pod to terminate fully before creating its replacement, so two PostgreSQL instances do not end up writing to the same volume during a restart.
The second method is to safely shut down your PostgreSQL instance by using a preStop hook in Kubernetes.
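Kubernetes invokes the preStop hook before it terminates a container. Only once the hook has finished does the container receive the termination signal. The hook has to complete within the pod's termination grace period (terminationGracePeriodSeconds, 30 seconds by default), so set the grace period longer than any timeout used in the hook.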
Apply the preStop hook in a Kubernetes configuration for PostgreSQL:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  replicas: 1
  serviceName: "postgres"
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:latest
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
        lifecycle:
          preStop:
            exec:
              # The -D path must match the mountPath above.
              # pg_ctl refuses to run as root; if the container runs as root,
              # wrap the command in: su postgres -c '...'
              command: ["/bin/sh", "-c", "pg_ctl -D /var/lib/postgresql/data -w -t 60 -m fast stop"]
  volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
This preStop hook runs the pg_ctl stop command to shut down the PostgreSQL server:
- -D specifies the directory where the database files live (it must match the mountPath of the data volume),
- -w waits until the server shuts down,
- -t 60 sets the wait timeout in seconds, and
- -m fast performs a "fast" shutdown, which rolls back all active transactions, disconnects clients immediately, and shuts down.
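To confirm that a shutdown really was clean, you can inspect the control file with pg_controldata, which ships with PostgreSQL and prints a "Database cluster state" line. The sketch below is a minimal, hypothetical helper (the name cluster_state is not part of PostgreSQL) that extracts that value from pg_controldata output read on standard input.

```shell
# Hypothetical helper: print the value of the "Database cluster state" line
# from pg_controldata output supplied on stdin.
cluster_state() {
  sed -n 's/^Database cluster state:[[:space:]]*//p'
}
```

Assuming the data directory is mounted at /var/lib/postgresql/data, you would run pg_controldata /var/lib/postgresql/data | cluster_state inside the container. A cleanly stopped server reports "shut down", while a running server reports "in production".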
For AWS ECS deployments, set the minimum and maximum running tasks to 0% and 100%, respectively, so that the old task is stopped before its replacement starts.
Recover from an error state
The error can be fixed by running the pg_resetwal command inside the container.
However, because the pod is in the CrashLoopBackOff state, you cannot connect to the container. The pod must first be brought to a stable state.
Follow the steps below to run pg_resetwal manually in the container.
You can stabilize the pod by introducing a delay in the postgres-deployment.yaml file.
- Scale down the deployment: Reduce the replica count of the deployment to zero using the following command:
kubectl scale deployment postgres --replicas=0
- Introduce a delay: Modify the postgres-deployment.yaml file to include a sleep command that replaces the normal startup for 600 seconds, keeping the container running without starting PostgreSQL, then apply the change (for example, with kubectl apply -f postgres-deployment.yaml).
Here is an example of how to add it:
spec:
  containers:
  - name: postgres
    image: postgres:latest
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "sleep 600;" ]
- Scale up the deployment: Bring the deployment back up by increasing the replica count to one using the following command:
kubectl scale deployment postgres --replicas=1
- Execute pg_resetwal in the pod: Once the pod is running the sleep command, open a shell in it and run pg_resetwal as the postgres user. Here is how you can do this:
kubectl exec -it deployment/postgres -- /bin/bash
su postgres
pg_resetwal /var/lib/postgresql/data
- Proceed despite the warning: If pg_resetwal prints a warning and refuses to proceed, force it as follows:
pg_resetwal -f /var/lib/postgresql/data
After running this command, you should receive a confirmation message stating “Write-ahead log reset.”
After pg_resetwal has completed successfully, remove the sleep command from postgres-deployment.yaml and redeploy. The PostgreSQL server should then start normally.
Remember that the pg_resetwal command is a measure of last resort and carries the risk of data loss or inconsistency.
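Given that risk, one cautious option is to archive the data directory before resetting the WAL, while the container is sleeping and PostgreSQL is not running. The following is a minimal sketch, not part of the procedure above: backup_pgdata is a hypothetical helper name, and the archive location is an assumption that you should replace with storage that outlives the pod and has enough free space.

```shell
# Hypothetical helper: archive a data directory before a destructive repair.
# Usage: backup_pgdata <data_dir> <archive_path>
backup_pgdata() {
  data_dir=$1
  archive=$2
  # Refuse to continue if the data directory does not exist.
  [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
  # Store paths relative to the parent so the archive unpacks as <dirname>/...
  tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")"
}
```

For example, backup_pgdata /var/lib/postgresql/data /backup/pgdata.tar.gz, where /backup is an assumed mount point. pg_resetwal also accepts -n (--dry-run), which prints the values it would write without modifying anything, so you can review them before running the real command.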
Always make sure to maintain regular backups of your PostgreSQL databases and consider setting up high availability and replication solutions for your production databases.