RMAN and Data Domain space efficiency; measuring the impact of RMAN compression and encryption on Data Domain capacity consumption

The old paradigm of backing up a large Oracle database to tape every night, or writing the backup to a NAS share that is then mysteriously swept to tape by some unseen force (the Sys Admins), is becoming unsustainable: backup windows are shrinking, 24/7 availability is becoming the norm, and databases keep getting larger.

Backup appliances, such as Oracle’s ZDLRA or EMC’s Data Domain, offer many performance and manageability advantages over traditional approaches, and much of the work of traditional backup software is now baked into the appliance itself.

A modern database backup and recovery appliance should include data encryption, compression, de-duplication and remote replication as standard features, offloading those functions from the host CPUs, which we want to dedicate to running the database software.

EMC’s Data Domain includes SISL technology – Stream-Informed Segment Layout – to yield some very impressive data reduction numbers using a combination of de-duplication and compression.

But what happens if we choose to use RMAN features in addition to those offered by the backup appliance?  What happens if we try to encrypt, de-duplicate or compress an RMAN backupset that is itself encrypted or compressed?

I decided to run some tests to measure the impact of RMAN backupset compression and encryption on the capacity efficiency of Data Domain.

To do this I created an Oracle 11gR2 database on OEL 6.4 and generated a 10GiB Swingbench SOE dataset, which yields about 22GiB of data when indexing is accounted for.  Adding in some additional space for the data dictionary, the auxiliary tablespace and some sundry bits and pieces, my on-disk database ended up about 32GiB in size.

The test server was a VMware Linux guest with 6GB of RAM and 4 vCPUs.

I needed a method to compare the impact of various RMAN settings, so I created eight Data Domain storage units, each one of which was exported via NFS to my database host.  The eight storage units were as follows:

Storage Unit         Purpose
/data/col1/NCNE_1    SU for a single RMAN backupset – not compressed or encrypted
/data/col1/NCNE_A    SU for 3 RMAN backupsets – not compressed or encrypted
/data/col1/ENCR_1    SU for a single RMAN backupset – using RMAN encryption but not compression
/data/col1/ENCR_A    SU for 3 RMAN backupsets – using RMAN encryption but not compression
/data/col1/COMP_1    SU for a single RMAN backupset – using RMAN compression but not encryption
/data/col1/COMP_A    SU for 3 RMAN backupsets – using RMAN compression but not encryption
/data/col1/COEN_1    SU for a single RMAN backupset – using RMAN compression and encryption
/data/col1/COEN_A    SU for 3 RMAN backupsets – using RMAN compression and encryption

DD Boost was not used for this test. Oracle supports encrypting RMAN backupsets written to disk, but writing an encrypted backupset to an SBT device, which is how DD Boost presents itself to RMAN, requires Oracle Secure Backup. That carries an additional Oracle license fee and is not supported by DD Boost.

The database backed up for this test had never been written to this Data Domain before, which ensured that de-duplication could not occur between the test Storage Units and existing Storage Units used for other backups.

Archivelogs were not backed up as part of the test. Although archivelogs compress, they do not dedupe, and including them could have skewed the overall numbers.

A single 1Gb NIC was used for the test. This test was about capacity, not performance.

The RMAN script used to back up the database is shown below, in this example with encryption and compression enabled. For the testing I used AES128 encryption for those RMAN backupsets that were encrypted.

#
# RMAN script to perform level 0 backup to an NFS-mounted Data Domain storage unit
#

configure controlfile autobackup on;
configure retention policy to recovery window of 3 days;
configure default device type to disk;
configure backup optimization on;

CONFIGURE ENCRYPTION ALGORITHM 'AES128';
CONFIGURE ENCRYPTION FOR DATABASE ON;

configure device type disk backup type to backupset parallelism 4;
configure channel device type disk maxopenfiles 1;

run {

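   # note: this format applies to sbt channels only; the disk channels used here keep the default autobackup format
   set controlfile autobackup format for device type sbt to "CONTROLFILE.%F";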

   backup as compressed backupset filesperset 1 section size 32G format '/nfs_mount/dd0205_coen_a/%u_%p' incremental level = 0 (database);

}

#
# eof

And so to the results. After all the backups had completed, the Data Domain interface was checked to determine overall space efficiency:

[Figure: storage unit comparison]

With RMAN compression and encryption turned off, our single 31.9GiB RMAN backupset compressed down to 3.7GiB, a respectable 8.7X reduction.

But then backing up that same database another three times, which is logically 95.6GiB of data, only required 0.4GiB of space on the Data Domain. The de-duplication was eliminating most of the backup, and Data Domain compression was taking care of what was left.

The overall data reduction after the three follow-up backups was an impressive 271.3X.

But when RMAN backupset compression is enabled, things don’t look so good.

RMAN compression reduced the 31.9GiB backup down to 6.9GiB, not as efficient as the Data Domain’s 3.7GiB but still good. Data Domain was able to find some minimal additional efficiency and the final backup required 6.6GiB of space on the Data Domain.

However, the three subsequent RMAN compressed backups did not benefit from de-duplication. Each subsequent backup was also 6.9GiB, yielding a further 20.6GiB, which Data Domain was only able to compress down to 18.9GiB for an overall efficiency of 1.1X.

Even with FILESPERSET of 1, RMAN compression means that any minor change near the start of a backupset disrupts the byte patterns of every block backed up after it, so de-duplication can bring little benefit.
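The effect can be illustrated with a toy model. The sketch below is not how Data Domain or RMAN work internally: it assumes fixed 8KiB segments fingerprinted with SHA-1, and uses zlib as a stand-in for RMAN compression. It builds a compressible "datafile", overwrites 16 bytes in place near the start to simulate a small change between backups, and then compares how many segments the second backup shares with the first, both uncompressed and compressed.

```python
import hashlib
import zlib

SEGMENT = 8192  # toy fixed segment size in bytes (an assumption, not Data Domain's)

def segment_hashes(data):
    """Split data into fixed-size segments and fingerprint each one."""
    return [hashlib.sha1(data[i:i + SEGMENT]).digest()
            for i in range(0, len(data), SEGMENT)]

def shared_fraction(first, second):
    """Fraction of the second backup's segments already stored by the first."""
    stored = set(segment_hashes(first))
    hashes = segment_hashes(second)
    return sum(h in stored for h in hashes) / len(hashes)

# Compressible "datafile" made of numbered text rows.
original = b"".join(b"order %08d status SHIPPED warehouse %04d\n" % (i, i % 97)
                    for i in range(200_000))

# Second backup: the same data with 16 bytes overwritten in place near the start.
changed = bytearray(original)
changed[4096:4112] = b"X" * 16
changed = bytes(changed)

raw_shared = shared_fraction(original, changed)
compressed_shared = shared_fraction(zlib.compress(original), zlib.compress(changed))

print(f"segments shared, uncompressed: {raw_shared:.1%}")
print(f"segments shared, compressed:   {compressed_shared:.1%}")
```

In the uncompressed case only the one modified segment is new. Once each copy has been compressed first, the output streams diverge from the point of the change, so far fewer segments (if any) can be matched against what is already stored.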

Turning off RMAN backupset compression and enabling RMAN backupset encryption shows a different story.

The encrypted backupset is 31.9GiB – the same size as the non-encrypted/non-compressed backupset. And after it has been written to the Data Domain, compression only yields a modest 1.4X efficiency for a final backup size of 22.6GiB.

Three subsequent RMAN encrypted backups yield 95.7GiB of data. Whereas the non-encrypted backupsets were able to de-dupe, the encrypted ones do not, and so Data Domain is only able to achieve some modest compression on the encrypted data for a total size of 67.9GiB.

Compare that to the 0.4GiB that the three non-encrypted backupsets were reduced down to!

Finally I tested compressing and encrypting the RMAN backupset.

A single compressed and encrypted backup generated a 6.9GiB RMAN backupset, the same as the non-encrypted but compressed backupset. Data Domain however was not able to achieve any further compression of this data, and in fact the interface showed the data inflating very slightly to 7.0GiB.

Similarly repeating the test a further three times yields 20.6GiB of RMAN backupsets, which require 21.0GiB of space on the Data Domain.
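To put the four configurations side by side, the short Python sketch below recomputes the reduction ratios from the sizes quoted in the text. The dictionary layout and labels are my own, and because the inputs are the rounded GiB figures from the prose, the results will not exactly match the ratios reported by the Data Domain interface (for example, 95.6 / 0.4 gives about 239X rather than the 271.3X shown, since 0.4GiB is itself a rounded value). The "total" column assumes the space for the first backup and the three follow-up backups can simply be added.

```python
def reduction_ratio(logical_gib, physical_gib):
    """Data reduction factor: what RMAN wrote divided by what Data Domain stored."""
    return logical_gib / physical_gib

# config -> ((logical, physical) for the first backup,
#            (logical, physical) for the three follow-up backups), in GiB
results = {
    "no compression, no encryption": ((31.9, 3.7), (95.6, 0.4)),
    "RMAN compression":              ((6.9, 6.6), (20.6, 18.9)),
    "RMAN encryption":               ((31.9, 22.6), (95.7, 67.9)),
    "RMAN compression + encryption": ((6.9, 7.0), (20.6, 21.0)),
}

for label, (first, followups) in results.items():
    total_physical = first[1] + followups[1]
    print(f"{label:32s} "
          f"first {reduction_ratio(*first):6.1f}X  "
          f"follow-ups {reduction_ratio(*followups):6.1f}X  "
          f"total on Data Domain {total_physical:5.1f}GiB")
```

A ratio below 1.0, as in the compressed and encrypted case, means the data occupied slightly more space on the Data Domain than RMAN wrote.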

What is also of interest here is the time required to complete the backups.

The basic non-compressed, non-encrypted backup took 9m 17s to back up over the single 1Gb link.  Backup speeds were consistent across additional backups, with the three subsequent backups taking 36m 49s combined.

The RMAN compressed backup took 14m 36s, with the three subsequent executions taking a combined 43m 48s.  So even though Oracle RMAN was able to reduce the size of the backup from 31.9GiB to 6.9GiB, the cost of doing so was a substantial amount of host CPU.

One surprise for me was the RMAN encrypted backup.  Using AES128 encryption, I expected the CPU to be somewhat taxed for this operation, but it was not.  The RMAN encrypted backup also completed in 9m 17s, with the three subsequent executions taking 36m 49s – identical to the non-compressed/non-encrypted backup.

However, when compressing and encrypting the RMAN backupset, RMAN required 15m 37s to complete a single backup, and 46m 18s to complete the subsequent three backups.
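As a quick worked comparison, the sketch below converts the single-backup times quoted above into seconds and expresses each one as a percentage overhead against the non-compressed, non-encrypted baseline. The helper names and labels are my own; only the elapsed times come from the test.

```python
def seconds(minutes, secs):
    """Convert an 'Xm Ys' elapsed time to seconds."""
    return minutes * 60 + secs

def overhead_pct(elapsed, baseline):
    """Extra elapsed time relative to the baseline, as a percentage."""
    return (elapsed - baseline) / baseline * 100

# Elapsed time of a single backup for each configuration, from the text above.
single_backup = {
    "no compression, no encryption": seconds(9, 17),
    "RMAN compression": seconds(14, 36),
    "RMAN encryption": seconds(9, 17),
    "RMAN compression + encryption": seconds(15, 37),
}

baseline = single_backup["no compression, no encryption"]
for label, elapsed in single_backup.items():
    print(f"{label:32s} {elapsed:4d}s  {overhead_pct(elapsed, baseline):+5.0f}%")
```

On these figures RMAN compression adds a little under 60% to the elapsed time of a single backup, and compression plus encryption a little under 70%, even though far less data crosses the network.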

The impact of Oracle compression on encrypted data can be further highlighted in this chart, taken from an Oracle article.  I added the color to highlight the negative impact of some combinations of Oracle TDE with RMAN compressed backups.

[Figure: data path for an encrypted and compressed RMAN backupset]

Essentially, when using RMAN compression, encrypted data must be decrypted, compressed and then re-encrypted.  We have seen the severe performance impact of Oracle RMAN backupset compression in the examples above, and combining it with TDE adds further complexity and overhead.

Since the Data Domain provides in-line Data at Rest Encryption for RMAN backups, plus compression and de-duplication, the best approach seems to be to offload all of those functions to the backup appliance for maximum security, performance and efficiency.

For those DBAs who need additional layers of security, Oracle’s TDE solution provides excellent security at the column or tablespace level.

 
