UPDATE: Starting with version 9, this is no longer required. Please read more here > Ultimate Guide to Version 9
In Veeam Backup & Replication v7, the Backup Copy Job was implemented. By design, this job type uses the “forward incremental forever” method, which was later introduced in version 8 for primary backup jobs as well.
The “forever” aspect of these job types is implemented by performing a transform of the oldest incremental backup file (VIB) in the chain once the desired retention is achieved. The transform process will merge data blocks of the VIB file into the full backup file (VBK). Such transforms are also referred to as synthetic operations.
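To make the transform concrete, here is a minimal toy model in Python. It is only a sketch of the idea described above, not Veeam code: the class name `BackupChain`, the block dictionaries and the `transforms` counter are all made up for illustration, and it assumes a restore point count of one full plus its incrementals.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BackupChain:
    """Toy model of a forever forward incremental chain (hypothetical)."""
    retention: int                                   # desired restore points
    full: Optional[Dict[str, int]] = None            # stands in for the VBK
    increments: List[Dict[str, int]] = field(default_factory=list)  # VIBs, oldest first
    transforms: int = 0                              # synthetic merges performed

    def add_restore_point(self, changed_blocks: Dict[str, int]) -> None:
        if self.full is None:
            # First cycle: all blocks land in the full backup file.
            self.full = dict(changed_blocks)
            return
        self.increments.append(dict(changed_blocks))
        # Restore points = 1 full + N incrementals.
        while 1 + len(self.increments) > self.retention:
            oldest = self.increments.pop(0)
            # Transform: merge the oldest VIB's blocks into the VBK.
            self.full.update(oldest)
            self.transforms += 1


chain = BackupChain(retention=3)
chain.add_restore_point({"a": 1, "b": 1})  # cycle 1: full
chain.add_restore_point({"a": 2})          # cycle 2: incremental
chain.add_restore_point({"b": 3})          # cycle 3: retention reached
chain.add_restore_point({"c": 4})          # cycle 4: oldest VIB is merged
print(chain.full, len(chain.increments), chain.transforms)
```

From the fourth cycle onwards, every new restore point triggers one merge. That recurring read and write against the full backup file is the workload this post tries to avoid on deduplication appliances.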
The purpose of this blog post is to provide a workaround for users of Veeam Backup & Replication v7 or v8 who use deduplication appliances as the backup repository for the Backup Copy Job, so they can avoid such synthetic operations.
Deduplication Storage and Synthetic Operations
So, why avoid synthetic operations? Synthetic full backups have been around for many years, and they were introduced to mitigate the issue of having to perform traditional active full backups with large amounts of data. In larger infrastructures, active full backups simply do not scale. For more information about how synthetic full backups are created, please refer to the Veeam User Guide > How Synthetic Full Backup Works
When using the Backup Copy Job, archive restore points (grandfather-father-son or GFS) will be created as synthetic full backups, while the daily transform process will also read and write existing data blocks in the backup repository. The transform process and creating “synthetic full” restore points are by nature random workloads (1x random read, 1x random write).
In addition to extremely efficient data reduction mechanisms, most deduplication appliances implement a level of read-ahead to speed up restore operations. The read-ahead provides a significant performance boost for sequential operations such as full VM restores, whereas it typically imposes a severe penalty on any synthetic operation such as transforming incremental backups and creating full backups. In addition to affecting synthetic operations, the impact of read-ahead extends to features such as Instant VM Recovery, file and item-level recoveries with the Explorer products and Virtual Lab, but I will leave that for a future blog post. Until then, I will simply refer to this post on the Veeam Community Forums (it applies to any version) > Version 7 and Deduplication devices.
Some deduplication appliances implement APIs to offload synthetic operations to the appliance itself, such as EMC DataDomain DDBoost. DDBoost is supported natively in Veeam Backup & Replication v8, and thus synthetic operations run much faster, while the penalty during restores remains unchanged. Veeam also has native support for ExaGrid, and while the implementation is a bit different, it can be considered as efficient as DDBoost for synthetic operations.
This blog post is aimed at deduplication appliances where such implementations cannot be leveraged natively in Veeam Backup & Replication, e.g. EMC DataDomain without DDBoost, HP StoreOnce, NetApp SteelStore, Quantum DXi or Dell DR4000 and DR6000.
To prevent any synthetic operations from occurring, the following steps can be used to configure the Backup Copy Job so that a single chain never exceeds its configured retention period. Instead of continuing to work on the same backup chain, a new chain will be created by forcing Veeam Backup & Replication to perform an active full as if it were the very first cycle of the Backup Copy Job.
- Configure the repository as Rotated Media
- Use the PowerShell script provided below to relocate the chain of the Backup Copy Job once it has reached the desired retention
The script will look up the configured simple retention points for the Backup Copy Job, and relocate these to a folder prefixed with “Archive” in the same folder as the original Backup Copy Job target folder. Please note there is no retention handling implemented, apart from the parameters explained later.
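The relocation step can be modelled roughly as follows. This Python sketch is not the PowerShell script from this post. The function name `archive_chain_if_full`, the `Archive_<job>_<timestamp>` folder naming and the choice to count only `.vbk` and `.vib` files are assumptions made for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def archive_chain_if_full(job_dir: Path, retention: int) -> Optional[Path]:
    """Move the chain in job_dir to a sibling 'Archive...' folder once it
    holds `retention` restore points. Returns the archive folder, or None
    if the chain is still below the configured retention."""
    # Assumption: one .vbk or .vib file per restore point.
    files = sorted(p for p in job_dir.iterdir()
                   if p.suffix.lower() in {".vbk", ".vib"})
    if len(files) < retention:
        return None
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # Hypothetical naming scheme; only the "Archive" prefix is from the post.
    archive_dir = job_dir.parent / f"Archive_{job_dir.name}_{stamp}"
    archive_dir.mkdir()
    for f in files:
        shutil.move(str(f), str(archive_dir / f.name))
    return archive_dir
```

Once the chain has been moved away, the job folder no longer contains a full backup file, which is what lets the rotated media behaviour described in the next section start a new chain with an active full.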
1. Rotated Media
As described in the article from the Veeam manual (Backup Repositories with Rotated Drives), it is possible to force the software to recreate the backup chain if it no longer exists. We can leverage this feature to force an active full backup and thus avoid a synthetic operation. It is as simple as a checkbox in the advanced repository configuration:
For Veeam Backup & Replication older than v8, the following registry key is available: ForceCreateMissingVBK (DWORD). In version 7 Patch 3, an additional registry key was introduced: ForceDeleteBackupFiles (DWORD).
Before setting these registry keys, please read more about their behaviour in the following knowledge base article, as they can end up wiping your entire repository if not handled with care. As the names imply, they were invented for rotated media > Veeam KB1154
2. Post Job Script
Save the script to C:\Veeam\VeeamActiveBackupCopy.ps1. Ensure that your Veeam service account has permissions to access this folder, and also full permissions to access and rename directories in the backup repository. Configure the script as post job activity for your Backup Copy Job using this command:
C:\windows\system32\WindowsPowerShell\v1.0\powershell.exe -Command C:\Veeam\VeeamActiveBackupCopy.ps1 -DeleteIncremental $false -DeleteOldChain $false
The script takes two parameters: -DeleteIncremental and -DeleteOldChain. The default setting for both is $false, so the script will simply relocate the backup files to the designated “Archive” folder.
- -DeleteIncremental $true will remove incremental backups from previous chains once a new chain is added. This can be compared to GFS-style retention.
- -DeleteOldChain $true will remove both previous full and incremental backups, only preserving the most recent backups.
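The effect of -DeleteIncremental and -DeleteOldChain on previously archived chains can be summarised in a small Python model. It is a sketch only: the function `files_to_delete` and the sample file names are hypothetical, and the case where both switches are $true is treated here as "delete everything", which is an assumption rather than documented behaviour of the script.

```python
from typing import List


def files_to_delete(archived_chains: List[List[str]],
                    delete_incremental: bool = False,
                    delete_old_chain: bool = False) -> List[str]:
    """Return the files in previous (archived) chains that would be removed.
    The current, active chain is never part of archived_chains."""
    doomed = []
    for chain in archived_chains:
        for name in chain:
            is_incremental = name.lower().endswith(".vib")
            # delete_old_chain removes fulls and incrementals alike;
            # delete_incremental removes only the .vib files.
            if delete_old_chain or (delete_incremental and is_incremental):
                doomed.append(name)
    return doomed


previous = [["w1.vbk", "w1-1.vib", "w1-2.vib"], ["w2.vbk", "w2-1.vib"]]
print(files_to_delete(previous))                           # nothing removed
print(files_to_delete(previous, delete_incremental=True))  # only .vib files
print(files_to_delete(previous, delete_old_chain=True))    # every old file
```

With both switches left at $false nothing is removed, so the archive folders keep growing until they are cleaned up by other means.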
The default behaviour and -DeleteIncremental $true can be seen in the following two screenshots.
To verify that your script works, simply check that the post job activity has completed successfully in your Backup Copy Job session:
Kudos to my colleague Tom Sightler for providing the part of the PowerShell script which will automatically detect the name of the job. It really simplifies the usage of the script!
If you found this blog post to be helpful, I would be happy to see you sharing it with your social networks.