Restic-based backup
Attention
Scribe has been renamed to VolSync!
The Scribe project has been renamed to VolSync, and it has a new home. Come join us at our new location:
Documentation: https://volsync.readthedocs.io
Artifact Hub: https://artifacthub.io/packages/helm/backube-helm-charts/volsync
Scribe supports taking backups of PersistentVolume data using the Restic-based data mover. A ReplicationSource defines the backup policy (target, frequency, and retention), while a ReplicationDestination is used for restores.
The Restic mover differs from most of Scribe’s other movers because it is not meant for synchronizing data between clusters. This mover is specifically meant for data backup.
Specifying a repository
For both backup and restore operations, it is necessary to specify a backup repository for Restic. The repository and connection information are defined in a restic-config Secret.
Below is an example showing how to use a repository stored on Minio.
apiVersion: v1
kind: Secret
metadata:
  name: restic-config
type: Opaque
stringData:
  # The repository url
  RESTIC_REPOSITORY: s3:http://minio.minio.svc.cluster.local:9000/restic-repo
  # The repository encryption key
  RESTIC_PASSWORD: my-secure-restic-password
  # ENV vars specific to the chosen back end
  # https://restic.readthedocs.io/en/stable/030_preparing_a_new_repo.html
  AWS_ACCESS_KEY_ID: access
  AWS_SECRET_ACCESS_KEY: password
This Secret will be referenced for both backup (ReplicationSource) and for restore (ReplicationDestination).
Note
If necessary, the repository will be automatically initialized (i.e., restic init) during the first backup.
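Because the repository path should be unique for each PV, a second volume needs its own Secret. The following is a minimal sketch of what that might look like, assuming the same Minio endpoint as above. The Secret name, bucket name, and password are hypothetical placeholders, and the sketch assumes a bucket of that name is available on the Minio instance.

```yaml
# Hypothetical Secret for a second PVC; name, bucket, and password are illustrative only
apiVersion: v1
kind: Secret
metadata:
  name: restic-config-otherdata
type: Opaque
stringData:
  # A different bucket than restic-repo, so this volume gets its own repository
  RESTIC_REPOSITORY: s3:http://minio.minio.svc.cluster.local:9000/restic-repo-otherdata
  # Encryption key for this repository
  RESTIC_PASSWORD: another-secure-restic-password
  # Same Minio credentials as in the example above
  AWS_ACCESS_KEY_ID: access
  AWS_SECRET_ACCESS_KEY: password
```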
Configuring backup
A backup policy is defined by a ReplicationSource object that uses the restic replication method.
---
apiVersion: scribe.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: mydata-backup
spec:
  # The PVC to be backed up
  sourcePVC: mydata
  trigger:
    # Take a backup every 30 minutes
    schedule: "*/30 * * * *"
  restic:
    # Prune the repository (repack to free space) every 2 weeks
    pruneIntervalDays: 14
    # Name of the Secret with the connection information
    repository: restic-config
    # Retention policy for backups
    retain:
      hourly: 6
      daily: 5
      weekly: 4
      monthly: 2
      yearly: 1
    # Clone the source volume prior to taking a backup to ensure a
    # point-in-time image.
    copyMethod: Clone
Backup options
There are a number of additional configuration options not shown in the above example. Scribe’s Restic mover options closely follow those of Restic itself.
- accessModes
When using a copyMethod of Clone or Snapshot, this field allows overriding the access modes for the point-in-time (PiT) volume. The default is to use the access modes from the source PVC.
- capacity
When using a copyMethod of Clone or Snapshot, this allows overriding the capacity of the PiT volume. The default is to use the capacity of the source volume.
- copyMethod
This specifies the method used to create a PiT copy of the source volume. Valid values are:
Clone - Create a new volume by cloning the source PVC (i.e., use the source PVC as the volumeSource for the new volume).
None - Do not create a PiT copy. The Scribe data mover will directly use the source PVC.
Snapshot - Create a VolumeSnapshot of the source PVC, then use that snapshot to create the new volume. This option should be used for CSI drivers that support snapshots but not cloning.
- storageClassName
This specifies the name of the StorageClass to use when creating the PiT volume. The default is to use the same StorageClass as the source volume.
- volumeSnapshotClassName
When using a copyMethod of Snapshot, this specifies the name of the VolumeSnapshotClass to use. If not specified, the cluster default will be used.
- cacheCapacity
This determines the size of the Restic metadata cache volume. This volume contains cached metadata from the backup repository. It must be large enough to hold the non-pruned repository metadata. The default is 1 Gi.
- cacheStorageClassName
This is the name of the StorageClass that should be used when provisioning the cache volume. It defaults to .spec.storageClassName, then to the name of the StorageClass used by the source PVC.
- cacheAccessModes
This is the access mode(s) that should be used to provision the cache volume. It defaults to .spec.accessModes, then to the access modes used by the source PVC.
- pruneIntervalDays
This determines the number of days between running restic prune on the repository. The prune operation repacks the data to free space, but it can also generate significant I/O traffic as a part of the process. Setting this option allows a trade-off between storage consumption (from no longer referenced data) and access costs.
- repository
This is the name of the Secret (in the same Namespace) that holds the connection information for the backup repository. The repository path should be unique for each PV.
- retain
This has sub-fields for hourly, daily, weekly, monthly, and yearly that allow setting the number of each type of backup to retain. There is an additional field, within, that can be used to specify a time period during which all backups should be retained. See Restic’s documentation on --keep-within for more information.
When more than the specified number of backups are present in the repository, they will be removed via Restic’s forget operation, and the space will be reclaimed during the next prune.
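As an illustration of how several of these options might be combined, here is a sketch of a ReplicationSource that uses a Snapshot-based PiT copy instead of a clone. The field names come from the option list above; the object name, the VolumeSnapshotClass and StorageClass names, and all values are hypothetical. The within value assumes the field accepts a duration in the same form as Restic’s --keep-within.

```yaml
---
apiVersion: scribe.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: mydata-backup-snapshot   # hypothetical name
spec:
  sourcePVC: mydata
  trigger:
    # Take a backup at the top of every hour
    schedule: "0 * * * *"
  restic:
    repository: restic-config
    # Prune weekly instead of every 2 weeks
    pruneIntervalDays: 7
    retain:
      daily: 7
      # Keep all backups from the last 3 days (assumed Restic duration syntax)
      within: 3d
    # Use a VolumeSnapshot for the PiT copy, e.g. when the CSI driver cannot clone
    copyMethod: Snapshot
    volumeSnapshotClassName: my-snapclass    # hypothetical class name
    # Provision the cache volume larger than the default
    cacheCapacity: 2Gi
    cacheStorageClassName: my-storageclass   # hypothetical class name
```

Fields that are omitted, such as accessModes and capacity, fall back to the defaults described above.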
Performing a restore
Data from a backup can be restored using the ReplicationDestination CR. In most cases, it is desirable to perform a single restore into an empty PersistentVolume.
For example, create a PVC to hold the restored data:
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: datavol
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
Restore the data into datavol:
---
apiVersion: scribe.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: datavol-dest
spec:
  trigger:
    manual: restore-once
  restic:
    # Name of the Secret with the connection information
    repository: restic-config
    destinationPVC: datavol
    copyMethod: None
In the above example, the data will be written directly into the new PVC since it is specified via destinationPVC, and no snapshot will be created since a copyMethod of None is used.
The restore operation only needs to be performed once, so instead of using a cronspec-based schedule, a manual trigger is used. After the restore completes, the ReplicationDestination object can be deleted.
Note
Currently, Scribe only supports restoring the latest backup. However, older backups may be present in the repository (according to the retain parameters). Those can be accessed directly using the Restic utility plus the connection information and credentials from the repository Secret.
Restore options
There are a number of additional configuration options not shown in the above example.
- accessModes
When Scribe creates the destination volume, this specifies the accessModes for the PVC. The value should be ReadWriteOnce or ReadWriteMany.
- capacity
When Scribe creates the destination volume, this value is used to determine its size. This need not match the size of the source volume, but it must be large enough to hold the incoming data.
- copyMethod
This specifies how the data should be preserved at the end of each synchronization iteration. Valid values are:
None - Do not create a point-in-time copy of the data.
Snapshot - Create a VolumeSnapshot at the end of each iteration.
- destinationPVC
Instead of having Scribe automatically provision the destination volume (using capacity, accessModes, etc.), the name of a pre-existing PVC may be specified here.
- storageClassName
When Scribe creates the destination volume, this specifies the name of the StorageClass to use. If omitted, the system default StorageClass will be used.
- volumeSnapshotClassName
When using a copyMethod of Snapshot, this value specifies the name of the VolumeSnapshotClass to use when creating a snapshot.
- cacheCapacity
This determines the size of the Restic metadata cache volume. This volume contains cached metadata from the backup repository. It must be large enough to hold the non-pruned repository metadata. The default is 1 Gi.
- cacheStorageClassName
This is the name of the StorageClass that should be used when provisioning the cache volume. It defaults to .spec.storageClassName, then to the name of the StorageClass used by the source PVC.
- cacheAccessModes
This is the access mode(s) that should be used to provision the cache volume. It defaults to .spec.accessModes, then to the access modes used by the source PVC.
- repository
This is the name of the Secret (in the same Namespace) that holds the connection information for the backup repository. The repository path should be unique for each PV.
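As a sketch of the alternative to destinationPVC, the following ReplicationDestination lets Scribe provision the destination volume itself and preserves the restored data as a VolumeSnapshot. The field names come from the option list above; the object name, class names, and sizes are hypothetical, and accessModes is assumed to take a list as it does on a PVC.

```yaml
---
apiVersion: scribe.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: datavol-dest-provisioned   # hypothetical name
spec:
  trigger:
    manual: restore-once
  restic:
    repository: restic-config
    # No destinationPVC: Scribe creates the volume from the settings below
    capacity: 3Gi
    accessModes:
      - ReadWriteOnce
    storageClassName: my-storageclass       # hypothetical class name
    # Create a VolumeSnapshot once the restore iteration finishes
    copyMethod: Snapshot
    volumeSnapshotClassName: my-snapclass   # hypothetical class name
    cacheCapacity: 1Gi
```

The capacity need not match the original source volume, but it must be large enough to hold the restored data.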