“The backup that only exists in one place doesn’t exist at all.”
The Problem: Barman’s Dirty Secret
I had what I thought was a solid PostgreSQL backup strategy for my Immich database. CloudNativePG with Barman Cloud Plugin, two ObjectStores configured—one for Backblaze B2, one for Cloudflare R2. Daily ScheduledBackups for each destination. Belt and suspenders.
Then I checked the actual buckets.
B2: Full backups, WAL files, everything present.
R2: Empty. Nothing. Not a single file.
Both ScheduledBackups were writing to B2. The R2 ObjectStore was configured, referenced in the ScheduledBackup—and completely ignored.
Turns out, Barman Cloud Plugin has a limitation I hadn’t spotted. The barmanObjectName parameter in ScheduledBackup? It’s ignored. The plugin only uses whatever’s configured in the cluster’s plugin configuration. Both my scheduled backups were hitting the same destination.
This isn’t a theoretical concern. If Backblaze goes down, there’s no second copy to fall back on—every backup lives only in B2. My 3-2-1 backup strategy was actually a 2-1-1.
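For context, this is roughly what the misleading setup looked like—a sketch with illustrative names and schedules, where the two ScheduledBackups differ only in the barmanObjectName parameter. Because the plugin ignores that parameter, both write to whatever ObjectStore the Cluster’s plugin configuration references:

```yaml
# Sketch: two ScheduledBackups that differ only in barmanObjectName.
# The Barman Cloud Plugin ignores the parameter and uses the ObjectStore
# from the Cluster's plugin configuration, so both land in B2.
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: immich-daily-b2
spec:
  schedule: "0 3 * * *"
  method: plugin
  cluster:
    name: immich18
  pluginConfiguration:
    name: barman-cloud.cloudnative-pg.io
    parameters:
      barmanObjectName: immich-b2   # effectively the only destination
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: immich-daily-r2
spec:
  schedule: "0 4 * * *"
  method: plugin
  cluster:
    name: immich18
  pluginConfiguration:
    name: barman-cloud.cloudnative-pg.io
    parameters:
      barmanObjectName: immich-r2   # silently ignored
```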
The Alternative: pgBackRest
After researching alternatives, I found three options for PostgreSQL backups with CloudNativePG plugins:
- Barman (what I was using) - Single destination limitation
- WAL-G - Similar architecture, no multi-repo support
- pgBackRest - Native multi-repository support
The pgBackRest plugin from Dalibo looked promising. pgBackRest has native multi-repository support—WAL archiving goes to ALL configured repositories simultaneously. The catch: the plugin is experimental, has 16 GitHub stars, and the documentation is thin.
I gave it a shot anyway.
Deploying the pgBackRest Plugin
Unlike Barman, which ships a Helm chart, the Dalibo pgBackRest plugin requires manual deployment. I created a new directory structure:
```text
kubernetes/apps/database/cloudnative-pg/pgbackrest/
├── kustomization.yaml
├── crd.yaml           # Repository CRD
├── rbac.yaml          # ServiceAccount, ClusterRole, RoleBinding
├── certificate.yaml   # Self-signed TLS for plugin communication
├── deployment.yaml    # The controller
└── service.yaml       # Exposes the controller to CNPG
```
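The kustomization.yaml is nothing special—it just pulls the other manifests together (a minimal sketch matching the tree above):

```yaml
# kubernetes/apps/database/cloudnative-pg/pgbackrest/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: database
resources:
  - crd.yaml
  - rbac.yaml
  - certificate.yaml
  - deployment.yaml
  - service.yaml
```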
The controller deployment is straightforward:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pgbackrest-controller
  namespace: database
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pgbackrest-controller
  template:
    metadata:
      labels:
        app: pgbackrest-controller   # must match the selector above
    spec:
      containers:
        - name: pgbackrest-controller
          image: registry.hub.docker.com/dalibo/cnpg-pgbackrest-controller:latest
          args:
            - operator
            - --server-cert=/server/tls.crt
            - --server-key=/server/tls.key
            - --client-cert=/client/tls.crt
            - --server-address=:9090
            - --log-level=debug
          env:
            - name: SIDECAR_IMAGE
              value: registry.hub.docker.com/dalibo/cnpg-pgbackrest-sidecar:latest
          volumeMounts:
            - mountPath: /server
              name: server
            - mountPath: /client
              name: client
      volumes:
        - name: server
          secret:
            secretName: pgbackrest-controller-server-tls
        - name: client
          secret:
            secretName: pgbackrest-controller-client-tls
```
The Service Name Gotcha
My first deployment crashed with a cryptic error:
```text
stanza creation failed: can't parse pgbackrest JSON: invalid character 'P'
```
After way too much debugging, I found the issue. I’d named the service pgbackrest. Kubernetes automatically creates environment variables for services: PGBACKREST_SERVICE_HOST, PGBACKREST_PORT, etc.
pgBackRest interprets any PGBACKREST_* environment variable as configuration. The sidecar was trying to parse PGBACKREST_PORT_9090_TCP_ADDR as a pgBackRest option and choking on the JSON output.
The fix: rename the service to cnpg-pgbackrest. No more environment variable conflicts.
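For reference, the renamed Service looks roughly like this (a sketch; the port matches the --server-address flag above, and the selector is assumed to match the controller’s pod labels):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cnpg-pgbackrest   # anything that doesn't start with "pgbackrest"
  namespace: database
spec:
  selector:
    app: pgbackrest-controller
  ports:
    - name: grpc
      port: 9090
      targetPort: 9090
```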
Leader Lease Conflict
After fixing the service name, the controller was stuck:
```text
attempting to acquire leader lease database/822e3f5c.cnpg.io...
```
Both Barman and pgBackRest plugins use the same leader election lease name. I had to disable Barman first:
```bash
flux suspend helmrelease barman-cloud -n database
kubectl scale deployment barman-cloud-plugin-barman-cloud -n database --replicas=0
```
Once Barman released the lease, pgBackRest acquired it and started working.
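If you hit the same conflict, you can check who currently holds the lease with plain kubectl (it’s a standard coordination.k8s.io Lease):

```bash
# Show the current holder of the contested lease
kubectl get lease 822e3f5c.cnpg.io -n database \
  -o jsonpath='{.spec.holderIdentity}{"\n"}'
```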
Configuring Multi-Repository Backups
The Repository CR is where the magic happens. You can define multiple S3 repositories and pgBackRest will archive WAL to all of them:
```yaml
apiVersion: pgbackrest.dalibo.com/v1
kind: Repository
metadata:
  name: immich18-repository
  namespace: database
spec:
  repoConfiguration:
    stanza: immich18
    archive:
      async: true
      pushQueueMax: 1GiB
  s3Repositories:
    # Primary: Backblaze B2
    - bucket: ${IMMICH_PG_BACKUP_B2_BUCKET}
      endpoint: s3.us-east-005.backblazeb2.com
      region: us-east-005
      repoPath: /immich18
      retentionPolicy:
        full: 14
        fullType: count
      secretRef:
        accessKeyId:
          name: immich-cnpg-secret
          key: b2-access-key-id
        secretAccessKey:
          name: immich-cnpg-secret
          key: b2-secret-access-key
    # Secondary: Cloudflare R2 (use Flux variable substitution)
    - bucket: ${IMMICH_PG_BACKUP_R2_BUCKET}
      endpoint: ${CLOUDFLARE_ACCOUNT_ID}.r2.cloudflarestorage.com
      region: auto
      repoPath: /immich18
      retentionPolicy:
        full: 14
        fullType: count
      secretRef:
        accessKeyId:
          name: immich-cnpg-secret
          key: r2-access-key-id
        secretAccessKey:
          name: immich-cnpg-secret
          key: r2-secret-access-key
```
Note the R2 bucket and endpoint use Flux variable substitution (${VARIABLE_NAME}) instead of hardcoded values. These get populated from an ExternalSecret that pulls from 1Password, with the Flux Kustomization configured to substitute variables from that secret. This keeps sensitive values like Cloudflare account IDs out of git history.
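On the Flux side this is the standard postBuild substitution; a sketch, assuming a Kustomization per app and a cluster-secrets Secret produced by the ExternalSecret (names and paths are illustrative):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: postgres18-immich
  namespace: flux-system
spec:
  interval: 30m
  path: ./kubernetes/apps/database/cloudnative-pg/postgres18-immich
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  postBuild:
    substituteFrom:
      # Secret holding IMMICH_PG_BACKUP_*_BUCKET and CLOUDFLARE_ACCOUNT_ID
      - kind: Secret
        name: cluster-secrets
```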
The cluster just needs to reference the plugin:
```yaml
spec:
  plugins:
    - name: pgbackrest.dalibo.com
      parameters:
        repositoryRef: immich18-repository
```
The Backup Target Problem
First backup attempt failed:
```text
ERROR: [056]: unable to find primary cluster - cannot proceed
HINT: are all available clusters in recovery?
```
CNPG defaults to running backups on replicas to reduce load on the primary. But pgBackRest can’t run backups from replicas without SSH access to the primary—which doesn’t exist in Kubernetes.
The fix is simple: tell CNPG to run backups on the primary:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: immich18-daily-b2
spec:
  schedule: "0 3 * * *"
  target: primary   # This is the key
  method: plugin
  cluster:
    name: immich18
  pluginConfiguration:
    name: pgbackrest.dalibo.com
```
The Multi-Repo Full Backup Discovery
With the target fixed, backups started working. WAL archiving was going to both repositories—I could see files appearing in both B2 and R2. But when I checked the full backups:
```bash
# B2
aws s3 ls s3://<your-bucket>/immich18/ --profile backblaze-b2 --recursive | wc -l
1285

# R2
aws s3 ls s3://<your-bucket>/immich18/ --profile cloudflare-r2 --region auto --recursive | wc -l
11
```
WAL archives were in both. Full backup was only in B2.
This is actually intentional behavior in pgBackRest. From the documentation:
- WAL archiving: Pushes to ALL configured repositories simultaneously
- Full backups: Only runs against ONE repository (defaults to repo1)
The reasoning makes sense—full backups are large and expensive. Doing them twice doubles storage costs. WAL goes everywhere for redundancy.
But for disaster recovery, I needed full backups in both locations.
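You can also see the asymmetry straight from pgBackRest with `pgbackrest info --repo`, run inside the plugin’s sidecar (the container name depends on how the plugin injects it into the instance pods—substitute yours):

```bash
# Compare what each repository holds for the stanza
kubectl exec -n database immich18-1 -c <sidecar-container> -- \
  pgbackrest info --stanza=immich18 --repo=1
kubectl exec -n database immich18-1 -c <sidecar-container> -- \
  pgbackrest info --stanza=immich18 --repo=2
```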
The selectedRepository Parameter
After digging through the plugin source code, I found the solution. The plugin accepts a selectedRepository parameter:
```go
selectedRepo, ok := request.Parameters["selectedRepository"]
if !ok {
	selectedRepo = "1" // use first repo by default
}
```
Note: the parameter is selectedRepository, not repo. The code defaults to repository 1 if not specified.
Testing it:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: pgbackrest-r2-test
spec:
  cluster:
    name: immich18
  method: plugin
  target: primary
  pluginConfiguration:
    name: pgbackrest.dalibo.com
    parameters:
      selectedRepository: "2"
```
The logs confirmed it:
```text
{"msg":"using repo","repo":"PGBACKREST_REPO=2"}
```
After waiting for the backup to complete:
```bash
aws s3 ls s3://<your-bucket>/immich18/ --profile cloudflare-r2 --region auto --recursive | wc -l
1232
```
Full backup in R2.
The Final Configuration: Two ScheduledBackups
To get true dual-destination full backups, I created two ScheduledBackups—one for each repository:
```yaml
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: immich18-daily-b2
  namespace: database
spec:
  schedule: "0 3 * * *"
  immediate: true
  backupOwnerReference: self
  method: plugin
  target: primary
  cluster:
    name: immich18
  pluginConfiguration:
    name: pgbackrest.dalibo.com
    parameters:
      selectedRepository: "1"
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: immich18-daily-r2
  namespace: database
spec:
  # Offset by 1 hour to avoid concurrent runs
  schedule: "0 4 * * *"
  immediate: false
  backupOwnerReference: self
  method: plugin
  target: primary
  cluster:
    name: immich18
  pluginConfiguration:
    name: pgbackrest.dalibo.com
    parameters:
      selectedRepository: "2"
```
Key details:
- Schedules are offset by 1 hour to avoid concurrent backup operations
- immediate: true only on the first one (don’t need two immediate backups)
- Both target the primary explicitly
Verifying the Setup
After deployment, both buckets showed full backups:
| Bucket | Files | Full Backup |
|--------|-------|-------------|
| B2 | 1285+ | 20251217-034103F |
| R2 | 1232+ | 20251217-035928F |
WAL archiving continues to push to both repositories automatically. The Repository CR status shows the recovery window:
```yaml
status:
  recoveryWindow:
    firstBackup:
      label: 20251217-034103F
      type: full
    lastBackup:
      label: 20251217-035928F
      type: full
```
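That status lives on the CR itself, so a quick check looks like this (assuming the CRD’s plural is repositories):

```bash
kubectl get repositories.pgbackrest.dalibo.com immich18-repository -n database -o yaml
```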
Lessons Learned
| Assumption | Reality | Fix |
|------------|---------|-----|
| Barman supports multiple ObjectStores per ScheduledBackup | barmanObjectName is ignored | Switch to pgBackRest |
| Service name pgbackrest is fine | Creates conflicting PGBACKREST_* env vars | Use cnpg-pgbackrest |
| Plugins can share leader leases | They use the same lease name | Disable Barman first |
| Backups run on any cluster member | pgBackRest needs the primary | Add target: primary |
| Parameter name is repo | It’s selectedRepository | Read the source code |
| pgBackRest multi-repo means full backups everywhere | WAL yes, full backups no | Create two ScheduledBackups |
Is pgBackRest Worth It?
For the specific use case of multi-destination full backups, yes. The Dalibo plugin is experimental but functional. The key advantages over Barman:
- True WAL replication: WAL files go to all repositories simultaneously
- Explicit repository selection: You can target specific repos for full backups
- Better retention policies: Per-repository retention configuration
The downsides:
- No Helm chart (manual deployment required)
- Experimental status (16 stars on GitHub)
- Documentation is sparse (I had to read source code)
For a homelab where I’m willing to debug issues, it’s the right choice. For production, I’d wait for the plugin to mature or implement external replication (rclone sync between buckets).
Part 2: The Full Migration
With pgBackRest working for Immich, I decided to go all-in and migrate my entire PostgreSQL infrastructure:
- Rename immich18 to postgres18-immich - Clearer naming convention
- Create postgres18-cluster - New cluster to replace postgres17 (which used Barman)
- Migrate all 46 databases from postgres17 to postgres18-cluster
Shared Credentials
Rather than managing separate S3 credentials for each cluster, I consolidated to shared pgBackRest credentials:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: cnpg-secret
  namespace: database
spec:
  secretStoreRef:
    kind: ClusterSecretStore
    name: onepassword-connect
  target:
    name: cnpg-secret
  data:
    # pgBackRest B2 - shared across all clusters
    - secretKey: b2-access-key-id
      remoteRef:
        key: backblaze
        property: BACKBLAZE_PGBACKREST_ACCESS_KEY
    - secretKey: b2-secret-access-key
      remoteRef:
        key: backblaze
        property: BACKBLAZE_PGBACKREST_SECRET_ACCESS_KEY
    # pgBackRest R2 - shared across all clusters
    - secretKey: r2-access-key-id
      remoteRef:
        key: cloudflare
        property: CLOUDFLARE_PGBACKREST_ACCESS_KEY
    - secretKey: r2-secret-access-key
      remoteRef:
        key: cloudflare
        property: CLOUDFLARE_PGBACKREST_SECRET_ACCESS_KEY
```
Each cluster gets its own S3 bucket but shares the same credentials, simplifying secret management.
Creating postgres18-cluster
The new cluster uses standard CloudNativePG PostgreSQL 18 (no VectorChord needed - that’s only for Immich):
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres18-cluster
  namespace: database
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:18@sha256:7f374e054e46fdefd64b52904e32362949703a75c05302dca8ffa1eb78d41891
  storage:
    size: 20Gi
    storageClass: openebs-hostpath
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: 256MB
      wal_keep_size: 2GB
  plugins:
    - name: pgbackrest.dalibo.com
      parameters:
        repositoryRef: postgres18-cluster-repository
  bootstrap:
    initdb:
      database: postgres
      owner: postgres
```
It has its own Repository CR pointing to the nerdz-postgres-cluster bucket:
```yaml
apiVersion: pgbackrest.dalibo.com/v1
kind: Repository
metadata:
  name: postgres18-cluster-repository
  namespace: database
spec:
  repoConfiguration:
    stanza: postgres18-cluster
    archive:
      async: true
      pushQueueMax: 1GiB
  s3Repositories:
    - bucket: nerdz-postgres-cluster
      endpoint: s3.us-east-005.backblazeb2.com
      region: us-east-005
      repoPath: /postgres18-cluster
      retentionPolicy:
        full: 14
        fullType: count
      secretRef:
        accessKeyId:
          name: cnpg-secret
          key: b2-access-key-id
        secretAccessKey:
          name: cnpg-secret
          key: b2-secret-access-key
    - bucket: nerdz-postgres-cluster
      endpoint: ${CLOUDFLARE_ACCOUNT_ID}.r2.cloudflarestorage.com
      region: auto
      repoPath: /postgres18-cluster
      retentionPolicy:
        full: 14
        fullType: count
      secretRef:
        accessKeyId:
          name: cnpg-secret
          key: r2-access-key-id
        secretAccessKey:
          name: cnpg-secret
          key: r2-secret-access-key
```
The Migration: pg_dumpall Direct Pipe
With postgres18-cluster running and healthy, it was time to migrate 46 databases (~1.8GB total) from postgres17.
Step 1: Scale down all apps
```bash
# Downloads namespace (14 apps)
kubectl scale deploy -n downloads autobrr bazarr bazarr-foreign bazarr-uhd \
  dashbrr prowlarr radarr radarr-uhd readarr sabnzbd sonarr sonarr-foreign \
  sonarr-uhd whisparr --replicas=0

# Home namespace
kubectl scale deploy -n home atuin linkwarden manyfold paperless --replicas=0

# Home-automation namespace
kubectl scale deploy -n home-automation n8n teslamate --replicas=0

# Other namespaces
kubectl scale deploy -n games romm --replicas=0
kubectl scale deploy -n observability gatus --replicas=0
kubectl scale deploy -n plane plane-admin-wl plane-api-wl plane-beat-worker-wl \
  plane-live-wl plane-space-wl plane-worker-wl --replicas=0
kubectl scale deploy -n security pocket-id --replicas=0
```
Step 2: Verify no connections
```bash
kubectl exec -n database postgres17-1 -c postgres -- psql -U postgres -c \
  "SELECT datname, usename, client_addr, state FROM pg_stat_activity \
   WHERE datname IS NOT NULL ORDER BY datname;"
```
Only the query connection itself should appear.
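If a stray client is still connected (a CronJob, a forgotten port-forward), it can be terminated before dumping—a hedged example in the same kubectl-exec style:

```bash
# Kick out any remaining client sessions except this one
kubectl exec -n database postgres17-1 -c postgres -- psql -U postgres -c \
  "SELECT pg_terminate_backend(pid) FROM pg_stat_activity \
   WHERE datname IS NOT NULL AND pid <> pg_backend_pid();"
```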
Step 3: Direct pipe migration
```bash
kubectl exec -n database postgres17-1 -c postgres -- pg_dumpall -U postgres | \
  kubectl exec -i -n database postgres18-cluster-1 -c postgres -- psql -U postgres
```
This pipes the dump directly between clusters - no intermediate file needed. The only “errors” are expected:
```text
ERROR: role "postgres" already exists
ERROR: role "streaming_replica" already exists
```
These roles already exist in the target cluster.
Step 4: Verify the migration
```bash
# Check database counts match
kubectl exec -n database postgres17-1 -c postgres -- psql -U postgres -c \
  "SELECT COUNT(*) FROM pg_database WHERE datname NOT IN ('template0', 'template1');"
# Result: 46

kubectl exec -n database postgres18-cluster-1 -c postgres -- psql -U postgres -c \
  "SELECT COUNT(*) FROM pg_database WHERE datname NOT IN ('template0', 'template1');"
# Result: 46

# Spot check critical data
kubectl exec -n database postgres18-cluster-1 -c postgres -- psql -U postgres -d teslamate -c \
  "SELECT COUNT(*) FROM positions;"
# Result: 3,140,499 (matches source)
```
Step 5: Update ExternalSecrets
All apps needed their database hostname updated from postgres17-rw.database.svc.cluster.local to postgres18-cluster-rw.database.svc.cluster.local:
```bash
find kubernetes/apps -name "*.yaml" -exec grep -l "postgres17-rw" {} \; | \
  xargs sed -i 's/postgres17-rw\.database\.svc\.cluster\.local/postgres18-cluster-rw.database.svc.cluster.local/g'
```
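After committing and pushing, a reconcile rolls the new hostnames out without waiting for the sync interval (standard Flux CLI; the Kustomization name depends on your layout):

```bash
git add -A && git commit -m "feat(database): point apps at postgres18-cluster"
git push
flux reconcile kustomization flux-system --with-source
```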
Step 6: Scale apps back up
```bash
kubectl scale deploy -n downloads autobrr bazarr bazarr-foreign bazarr-uhd \
  dashbrr prowlarr radarr radarr-uhd readarr sabnzbd sonarr sonarr-foreign \
  sonarr-uhd whisparr --replicas=1
# ... repeat for other namespaces
```
Adding PgBouncer Connection Pooling
For better connection management, I added PgBouncer poolers to both clusters:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: postgres18-cluster-pooler
  namespace: database
spec:
  cluster:
    name: postgres18-cluster
  instances: 2
  type: rw
  pgbouncer:
    poolMode: session
    parameters:
      max_client_conn: "500"
      default_pool_size: "100"
```
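Apps that should go through PgBouncer point at the Pooler’s Service instead of the cluster’s -rw Service—CNPG creates a Service named after the Pooler resource. A sketch of the env wiring (variable names are app-specific):

```yaml
# Example app environment pointing at the pooler instead of postgres18-cluster-rw
env:
  - name: DB_HOSTNAME
    value: postgres18-cluster-pooler.database.svc.cluster.local
  - name: DB_PORT
    value: "5432"
```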
Cleanup
With everything migrated and verified:
```bash
# Suspend the old Kustomization
flux suspend kustomization cloudnative-pg-cluster17 -n database

# Delete the cluster
kubectl delete cluster postgres17 -n database

# Remove from git
rm -rf kubernetes/apps/database/cloudnative-pg/cluster17/
# Edit ks.yaml to remove the cluster17 Kustomization
git add -A && git commit -m "chore(database): remove postgres17 cluster"
git push
```
Final Architecture
```text
kubernetes/apps/database/cloudnative-pg/
├── app/                        # CNPG operator + shared secrets
├── pgbackrest/                 # pgBackRest controller
├── postgres18-immich/          # Immich cluster (VectorChord)
│   ├── postgres18-immich.yaml
│   ├── repository.yaml         # B2: nerdz-postgres-immich
│   ├── scheduledbackup.yaml    # B2 @ 03:00, R2 @ 04:00
│   ├── pooler.yaml
│   └── service.yaml
└── postgres18-cluster/         # Main cluster (all other apps)
    ├── postgres18-cluster.yaml
    ├── repository.yaml         # B2: nerdz-postgres-cluster
    ├── scheduledbackup.yaml    # B2 @ 03:00, R2 @ 04:00
    ├── pooler.yaml
    └── service.yaml
```
| Cluster | Purpose | Image | Backups |
|---------|---------|-------|---------|
| postgres18-immich | Immich only | VectorChord PG18 | B2 + R2 via pgBackRest |
| postgres18-cluster | All other apps (46 DBs) | Standard PG18 | B2 + R2 via pgBackRest |
The Complete Lessons Learned
| Assumption | Reality | Fix |
|------------|---------|-----|
| Barman supports multiple ObjectStores per ScheduledBackup | barmanObjectName is ignored | Switch to pgBackRest |
| Service name pgbackrest is fine | Creates conflicting PGBACKREST_* env vars | Use cnpg-pgbackrest |
| Plugins can share leader leases | They use the same lease name | Disable Barman first |
| Backups run on any cluster member | pgBackRest needs the primary | Add target: primary |
| Parameter name is repo | It’s selectedRepository | Read the source code |
| pgBackRest multi-repo means full backups everywhere | WAL yes, full backups no | Create two ScheduledBackups |
| pg_dumpall needs intermediate storage | Direct pipe works fine | pg_dumpall \| psql between pods |
| CNPG clusters can scale to 0 | Minimum is 1 instance | Suspend Kustomization + delete cluster |
Is It Worth It?
Absolutely. The migration took about 2 hours total:
- Zero data loss: All 46 databases migrated with verified row counts
- Minimal downtime: apps were stopped only for the duration of the dump and restore
- True dual-destination backups: Both B2 and R2 now have full backups
- Cleaner architecture: Shared credentials, consistent naming, PgBouncer pooling
- PostgreSQL 18: Latest version with performance improvements
The pgBackRest plugin is experimental, but it works. For a homelab, it’s the right choice.
This post documents part of the ongoing work on my home-ops repository. The pgBackRest plugin is from Dalibo.