Replicating an RMAN Backupset with Data Domain and DD Boost

In this post, we will replicate an RMAN backupset using RMAN managed replication and DD Boost.

In an earlier post, I showed how it was possible to use Data Domain’s Mtree replication to ensure that your Oracle RMAN backups are safely replicated to a second backup appliance when using Data Domain as an NFS target.

Replicating the RMAN backupset is important to protect databases from failures that might affect a whole data center, such as prolonged power or network failures, or events such as a flood that might destroy infrastructure.

These scenarios are rare, but should they occur, having the RMAN backups of our databases available at a second location can be the difference between an organization surviving such an event or not.

The DD Boost MML Version 1.1.1.3 supports Version 2 of the RMAN SBT API, allowing it to write backupsets simultaneously to more than one Data Domain. This has the advantage that the RMAN catalog is aware of all copies of the backup, but the backup may also take longer to complete as writes to all locations must be acknowledged before RMAN can process the next part of the backup.

RMAN uses the copies directive to write up to four copies of the backup. When using the copies directive, the backup_tape_io_slaves parameter must be set to TRUE in the database parameter file.
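As a minimal sketch, the parameter can be checked and enabled from SQL*Plus as shown below. This assumes the instance was started with an spfile; with SCOPE = SPFILE the new value only takes effect at the next instance restart.

```sql
-- Check the current setting of the parameter.
SELECT name, value
FROM   v$parameter
WHERE  name = 'backup_tape_io_slaves';

-- Enable tape I/O slaves in the spfile (assumes an spfile is in use).
-- The change applies after the next instance restart.
ALTER SYSTEM SET backup_tape_io_slaves = TRUE SCOPE = SPFILE;
```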

The RMAN format clause must also include the secondary locations to be written to.

In this example, we have a primary Data Domain called rstdd0205mgmt.emc.com, and a secondary Data Domain called rstdd0204mgmt.emc.com.

The Data Domain storage unit to write the replicated backupset to must have the same name on both targets. In this example I have created an Mtree called boost_managed_rep on both Data Domains. This Mtree is not replicated by Data Domain, as RMAN will handle the replication instead.

We will set the default device to sbt as we need RMAN to use the DD Boost MML library. I have configured the backup to use four channels.

configure controlfile autobackup off;
configure retention policy to recovery window of 3 days;
configure default device type to sbt;
configure backup optimization off;

configure channel device type sbt parms 
'BLKSIZE=1048576,
SBT_LIBRARY=/u01/app/oracle/product/11.2.0/dbhome_1/lib/libddobk.so,
ENV=(STORAGE_UNIT=boost_managed_rep,
BACKUP_HOST=rstdd0205mgmt.emc.com,
ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
)';

configure device type sbt backup type to backupset parallelism 4;

run {

   sql 'alter system switch logfile';

   backup as backupset copies 2 incremental level 0 filesperset 64 section size 16G
     database tag 'FULL_BACKUP_DF_REP' 
     format '%u_%p', 'rstdd0204mgmt.emc.com/%u_%p';

   sql 'alter system switch logfile';

   backup as backupset copies 2 filesperset 64 
     archivelog all tag 'FULL_BACKUP_AL_REP'
     format '%u_%p', 'rstdd0204mgmt.emc.com/%u_%p';

   backup as backupset copies 2 
     spfile tag 'FULL_BACKUP_SP_REP'
     format '%u_%p', 'rstdd0204mgmt.emc.com/%u_%p';

   backup as backupset copies 2
     format '%u_%p', 'rstdd0204mgmt.emc.com/%u_%p'
     current controlfile tag 'FULL_BACKUP_CF_REP';

}

The format clause here names the primary copy %u_%p, where %u generates a unique eight-character name and %p is the piece number within the backupset. The second format string, rstdd0204mgmt.emc.com/%u_%p, directs the second copy of each piece to the second Data Domain.

Note that when using RMAN managed replication with DD Boost, it is not necessary to use the %c variable in the backupset name.

Datafile backupsets will be tagged FULL_BACKUP_DF_REP, archivelog backupsets FULL_BACKUP_AL_REP, the SPFILE backup FULL_BACKUP_SP_REP, and the controlfile backup FULL_BACKUP_CF_REP.

Running the script from RMAN yields the following output – note some output has been removed in the interests of clarity:

Starting backup at 11-AUG-15
allocated channel: ORA_SBT_TAPE_1
channel ORA_SBT_TAPE_1: SID=150 device type=SBT_TAPE
channel ORA_SBT_TAPE_1: Data Domain Boost API
allocated channel: ORA_SBT_TAPE_2
channel ORA_SBT_TAPE_2: SID=149 device type=SBT_TAPE
channel ORA_SBT_TAPE_2: Data Domain Boost API
allocated channel: ORA_SBT_TAPE_3
channel ORA_SBT_TAPE_3: SID=18 device type=SBT_TAPE
channel ORA_SBT_TAPE_3: Data Domain Boost API
allocated channel: ORA_SBT_TAPE_4
channel ORA_SBT_TAPE_4: SID=147 device type=SBT_TAPE
channel ORA_SBT_TAPE_4: Data Domain Boost API
channel ORA_SBT_TAPE_1: starting incremental level 0 datafile backup set
channel ORA_SBT_TAPE_1: specifying datafile(s) in backup set
input datafile file number=00007 name=+DATA/nyc11/datafile/soe.272.886021463
backing up blocks 1 through 2097152

<output removed to aid clarity>

channel ORA_SBT_TAPE_4: starting piece 1 at 11-AUG-15
channel ORA_SBT_TAPE_4: finished piece 1 at 11-AUG-15 with 2 copies and tag FULL_BACKUP_DF_REP
piece handle=8kqeas90_1 comment=API Version 2.0,MMS Version 1.1.1.3
piece handle=rstdd0204mgmt.emc.com/8kqeas90_1 comment=API Version 2.0,MMS Version 1.1.1.3
channel ORA_SBT_TAPE_4: backup set complete, elapsed time: 00:00:07
channel ORA_SBT_TAPE_1: finished piece 5 at 11-AUG-15 with 2 copies and tag FULL_BACKUP_DF_REP
piece handle=88qeas2u_5 comment=API Version 2.0,MMS Version 1.1.1.3
piece handle=rstdd0204mgmt.emc.com/88qeas2u_5 comment=API Version 2.0,MMS Version 1.1.1.3
channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:01:44
channel ORA_SBT_TAPE_3: finished piece 7 at 11-AUG-15 with 2 copies and tag FULL_BACKUP_DF_REP
piece handle=88qeas2u_7 comment=API Version 2.0,MMS Version 1.1.1.3
piece handle=rstdd0204mgmt.emc.com/88qeas2u_7 comment=API Version 2.0,MMS Version 1.1.1.3
channel ORA_SBT_TAPE_3: backup set complete, elapsed time: 00:01:29
channel ORA_SBT_TAPE_2: finished piece 6 at 11-AUG-15 with 2 copies and tag FULL_BACKUP_DF_REP
piece handle=88qeas2u_6 comment=API Version 2.0,MMS Version 1.1.1.3
piece handle=rstdd0204mgmt.emc.com/88qeas2u_6 comment=API Version 2.0,MMS Version 1.1.1.3
channel ORA_SBT_TAPE_2: backup set complete, elapsed time: 00:02:04
Finished backup at 11-AUG-15

In the above output, note that each piece handle appears twice: once for the copy written to the primary Data Domain through the default channel settings, and once for the copy sent to the second Data Domain.

We can verify this by inspecting the RMAN catalog:

RMAN> list backupset of controlfile;


List of Backup Sets
===================

BS Key  Type LV Size
------- ---- -- ----------
85906   Full    31.00M
  Control File Included: Ckp SCN: 5325628      Ckp time: 11-AUG-15

  Backup Set Copy #2 of backup set 85906
  Device Type Elapsed Time Completion Time Compressed Tag
  ----------- ------------ --------------- ---------- ---
  SBT_TAPE    00:00:06     11-AUG-15       NO         FULL_BACKUP_CF_REP

    List of Backup Pieces for backup set 85906 Copy #2
    BP Key  Pc# Status      Media                   Piece Name
    ------- --- ----------- ----------------------- ----------
    85909   1   AVAILABLE   boost_managed_rep       rstdd0204mgmt.emc.com/8pqeasbe_1

  Backup Set Copy #1 of backup set 85906
  Device Type Elapsed Time Completion Time Compressed Tag
  ----------- ------------ --------------- ---------- ---
  SBT_TAPE    00:00:06     11-AUG-15       NO         FULL_BACKUP_CF_REP

    List of Backup Pieces for backup set 85906 Copy #1
    BP Key  Pc# Status      Media                   Piece Name
    ------- --- ----------- ----------------------- ----------
    85908   1   AVAILABLE   boost_managed_rep       8pqeasbe_1

The rest of the RMAN backup process will follow this same pattern, with identical backup pieces being written to both the primary and secondary Data Domains, and the RMAN catalog showing both pieces.
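The same check can be made for every piece at once rather than one backupset at a time. The query below is a sketch that reads v$backup_piece on the source database and assumes the four tags used in the backup script above. It lists any piece that does not have exactly two available copies, so an empty result means every piece was written to both Data Domains.

```sql
-- List backup pieces from the tagged backups that do not have
-- exactly two available copies. No rows returned is the good case.
SELECT   set_stamp, set_count, piece#, COUNT(*) AS copies
FROM     v$backup_piece
WHERE    tag IN ('FULL_BACKUP_DF_REP', 'FULL_BACKUP_AL_REP',
                 'FULL_BACKUP_SP_REP', 'FULL_BACKUP_CF_REP')
AND      status = 'A'   -- 'A' means the piece is marked available
GROUP BY set_stamp, set_count, piece#
HAVING   COUNT(*) <> 2;
```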

Once the backup completes we can restore the database to a second host.

In the restore operation we will restore from the secondary Data Domain to verify that the replicated backupset is usable to recover from a failure.

First, let’s check the database sequence number at the end of the backup. We can obtain this using the following SQL:

set linesize 132

col handle for a40
col media for a20

select
  vbr.sequence#,
  vbr.first_change#,
  vbp.copy#,
  vbp.handle,
  vbp.media
from 
  v$backup_redolog vbr, 
  v$backup_piece vbp
where 1=1
and vbr.set_stamp = vbp.set_stamp
and vbr.set_count = vbp.set_count
and vbp.piece# = 1
and vbp.tag = 'FULL_BACKUP_AL_REP'
order by 1,3;

Executing this script while connected to the source database shows that the most recent archived log in the backup is sequence 3513, which we will use as the recovery target.

 SEQUENCE# FIRST_CHANGE#      COPY# HANDLE                                   MEDIA
---------- ------------- ---------- ---------------------------------------- --------------------
      3511       5323176          1 8lqeasar_1                               boost_managed_rep
      3511       5323176          2 rstdd0204mgmt.emc.com/8lqeasar_1         boost_managed_rep
      3512       5325223          1 8mqeasar_1                               boost_managed_rep
      3512       5325223          2 rstdd0204mgmt.emc.com/8mqeasar_1         boost_managed_rep
      3513       5325527          1 8nqeasar_1                               boost_managed_rep
      3513       5325527          2 rstdd0204mgmt.emc.com/8nqeasar_1         boost_managed_rep
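If only the target sequence is needed, a shorter variant of the query can return it directly. This is a sketch under the assumptions of this example: the copies on the secondary Data Domain are identified by the rstdd0204mgmt.emc.com/ prefix in the piece handle, and the control file holds backups from a single incarnation.

```sql
-- Highest archived log sequence per thread that exists in a backup
-- piece written to the secondary Data Domain.
SELECT   vbr.thread#,
         MAX(vbr.sequence#) AS last_backed_up_sequence
FROM     v$backup_redolog vbr,
         v$backup_piece   vbp
WHERE    vbr.set_stamp = vbp.set_stamp
AND      vbr.set_count = vbp.set_count
AND      vbp.tag       = 'FULL_BACKUP_AL_REP'
AND      vbp.handle LIKE 'rstdd0204mgmt.emc.com/%'
GROUP BY vbr.thread#;
```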

I am going to restore the database to a different server using the replication target Data Domain as my backup source.

To achieve this I am going to use the RMAN set dbid command, which allows RMAN to restore from the backupsets of the specified database.

The database identifier can be seen from the RMAN catalog as follows:

RMAN> list db_unique_name all;

List of Databases
DB Key  DB Name  DB ID            Database Role    Db_unique_name
------- ------- ----------------- ---------------  ------------------
63367   NYC11    334041916        PRIMARY          NYC11
9936    MYCLONE  527884335        PRIMARY          XIO11WSB_CLONE      
36652   XIO11WSB 2386171380       PRIMARY          XIO11WSB            
36652   XIO11WSB 2386171380       STANDBY          XIO11WSB_XIO11WSB
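If the source database is still reachable, the DBID can also be read from it directly as a cross-check. The query below is a small sketch using v$database.

```sql
-- Run while connected to the source database.
SELECT dbid, name, db_unique_name
FROM   v$database;
```

For the source database in this example, the DBID returned should match the catalog entry for NYC11.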

So we will instruct RMAN to restore database ID 334041916 in our recovery script. There are various methods to achieve this recovery. In this example I am going to restore the controlfiles first, and then mount the database.

Once that is done I will restore the datafiles, set the recovery target, and roll the database forward to the desired point. We should then be able to open our recovered database.

The RMAN recovery script looks as follows:

run {
 
  allocate channel ch1 device type sbt maxopenfiles 4 parms 
  'BLKSIZE=1048576,
  SBT_LIBRARY=/u01/app/oracle/product/11.2.0/dbhome_1/lib/libddobk.so,
  ENV=(STORAGE_UNIT=boost_managed_rep,
  BACKUP_HOST=rstdd0204mgmt.emc.com,
  ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
  )';

  set dbid 334041916;

  restore controlfile;

  release channel ch1;
}

sql 'alter database mount';

run {

  allocate channel ch1 device type sbt maxopenfiles 4 parms 
  'BLKSIZE=1048576,
  SBT_LIBRARY=/u01/app/oracle/product/11.2.0/dbhome_1/lib/libddobk.so,
  ENV=(STORAGE_UNIT=boost_managed_rep,
  BACKUP_HOST=rstdd0204mgmt.emc.com,
  ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
  )';
  
  restore database;
  
  set until sequence 3513;

  recover database;

  release channel ch1;
}

As you can see in the script above, the sbt channel is using the secondary Data Domain rstdd0204mgmt.emc.com, not the primary target of the original backup.

The target instance must be started in a no-mount state. We will use the following simple INIT.ORA parameter file:

_disk_sector_size_override=TRUE
audit_file_dest='/u01/app/oracle/admin/nyc11/adump'
audit_trail ='db'
control_files = (ora_control1, ora_control2)
compatible='11.2.0.4.0'
db_block_size=8192
db_domain=''
db_name='NYC11'
db_recovery_file_dest='/u01/app/oracle/fast_recovery_area'
db_recovery_file_dest_size=64G
diagnostic_dest='/u01/app/oracle'
dispatchers='(PROTOCOL=TCP) (SERVICE=NYC11XDB)'
memory_target=1G
open_cursors=300
processes = 150
remote_login_passwordfile='EXCLUSIVE'
undo_tablespace='UNDOTBS1'

Note that our source database uses a 4K redo block size, so our target must also have the _disk_sector_size_override parameter set, as the ASM disks are using 512-byte sector LUNs.
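To confirm the redo block size before building the parameter file, the online redo log groups on the source can be inspected. This is a sketch that assumes an 11.2 or later database, where v$log exposes a blocksize column.

```sql
-- Run on the source database: blocksize is reported in bytes,
-- so 4096 indicates 4K redo blocks.
SELECT   group#, thread#, blocksize
FROM     v$log
ORDER BY group#;
```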

Also note that the db_recovery_file_dest_size must be set large enough to allow RMAN to restore all the archive logs needed to complete the recovery.
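As a rough guide to sizing, the volume of archived redo held in the backup can be estimated on the source database. The query below is only a sketch: it assumes the FULL_BACKUP_AL_REP tag from the backup script and treats the total as an upper bound, since only some of these logs may be needed for a given recovery.

```sql
-- Approximate size in MB of the archived logs in the tagged backup.
SELECT ROUND(SUM(vbr.blocks * vbr.block_size) / 1024 / 1024) AS archived_log_mb
FROM   v$backup_redolog vbr,
       v$backup_piece   vbp
WHERE  vbr.set_stamp = vbp.set_stamp
AND    vbr.set_count = vbp.set_count
AND    vbp.piece#    = 1
AND    vbp.copy#     = 1   -- count each log once, not once per copy
AND    vbp.tag       = 'FULL_BACKUP_AL_REP';
```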

Now we can start the recovery instance:

SQL> startup nomount pfile=$ORACLE_HOME/dbs/initnyc11.ora
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2260088 bytes
Variable Size             671089544 bytes
Database Buffers          390070272 bytes
Redo Buffers                5517312 bytes
SQL>

We can now launch the RMAN recovery. We will only connect to the target database and the RMAN catalog. We will not connect to the primary database:

[oracle@rstemc64vm31 rman_clone]$ rman target / catalog rman/rman@rcat

Recovery Manager: Release 11.2.0.4.0 - Production on Tue Aug 11 14:48:52 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: NYC11 (not mounted)
connected to recovery catalog database

I have annotated the output to highlight certain areas of interest.

RMAN> run {
2>  
3>   allocate channel ch1 device type sbt maxopenfiles 4 parms 'BLKSIZE=1048576,SBT_LIBRARY=/u01/app/oracle/product/11.2.0/dbhome_1/lib/libddobk.so,ENV=(STORAGE_UNIT=boost_managed_rep,BACKUP_HOST=rstdd0204mgmt.emc.com,ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1)';
4> 
5>   set dbid 334041916;
6> 
7>   restore controlfile;
8> 
9>   release channel ch1;
10> }
allocated channel: ch1
channel ch1: SID=63 device type=SBT_TAPE
channel ch1: Data Domain Boost API

executing command: SET DBID
database name is "NYC11" and DBID is 334041916

Starting restore at 11-AUG-15

channel ch1: starting datafile backup set restore
channel ch1: restoring control file
channel ch1: reading from backup piece 8pqeasbe_1
channel ch1: piece handle=8pqeasbe_1 tag=FULL_BACKUP_CF_REP
channel ch1: restored backup piece 1
channel ch1: restore complete, elapsed time: 00:00:01
output file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/ora_control1
output file name=/u01/app/oracle/product/11.2.0/dbhome_1/dbs/ora_control2
Finished restore at 11-AUG-15

released channel: ch1

The first run block of the RMAN recovery script has completed. It selected backup piece 8pqeasbe_1 which was tagged as FULL_BACKUP_CF_REP to restore the controlfiles.

RMAN is now able to mount the recovery instance:

RMAN> 
RMAN> sql 'alter database mount';
sql statement: alter database mount

Now the second run block is executed, which launches the data file restore operations:

RMAN> run {
2> 
3>   allocate channel ch1 device type sbt maxopenfiles 4 parms 'BLKSIZE=1048576,SBT_LIBRARY=/u01/app/oracle/product/11.2.0/dbhome_1/lib/libddobk.so,ENV=(STORAGE_UNIT=boost_managed_rep,BACKUP_HOST=rstdd0204mgmt.emc.com,ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1)';
4> 
5>   restore database;
6> 
7>   set until sequence 3513;
8> 
9>   recover database;
10> 
11>   release channel ch1;
12> }
allocated channel: ch1
channel ch1: SID=63 device type=SBT_TAPE
channel ch1: Data Domain Boost API

Starting restore at 11-AUG-15
Starting implicit crosscheck backup at 11-AUG-15
Finished implicit crosscheck backup at 11-AUG-15

Starting implicit crosscheck copy at 11-AUG-15
Crosschecked 4 objects
Finished implicit crosscheck copy at 11-AUG-15

searching for all files in the recovery area
cataloging files...
no files cataloged


channel ch1: starting datafile backup set restore
channel ch1: specifying datafile(s) to restore from backup set
channel ch1: restoring datafile 00002 to +DATA/nyc11/datafile/sysaux.257.885932477
channel ch1: restoring datafile 00005 to +DATA/nyc11/datafile/example.265.885932543
channel ch1: reading from backup piece 89qeas2v_1
channel ch1: piece handle=89qeas2v_1 tag=FULL_BACKUP_DF_REP
channel ch1: restored backup piece 1
channel ch1: restore complete, elapsed time: 00:00:15
channel ch1: starting datafile backup set restore
channel ch1: specifying datafile(s) to restore from backup set
channel ch1: restoring datafile 00001 to +DATA/nyc11/datafile/system.256.885932475
channel ch1: restoring datafile 00004 to +DATA/nyc11/datafile/users.259.885932477
channel ch1: reading from backup piece 8bqeas31_1
channel ch1: piece handle=8bqeas31_1 tag=FULL_BACKUP_DF_REP
channel ch1: restored backup piece 1

<output removed to aid clarity>

channel ch1: reading from backup piece 88qeas2u_8
channel ch1: piece handle=88qeas2u_8 tag=FULL_BACKUP_DF_REP
channel ch1: restored backup piece 8
channel ch1: restore complete, elapsed time: 00:00:15
Finished restore at 11-AUG-15

We can see that RMAN has selected the backup we made earlier, tagged as FULL_BACKUP_DF_REP, and restored the data files from the backup on the second Data Domain.

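RMAN now executes the last part of the second run block, setting the recovery target and restoring the archive logs from the RMAN backup. Because set until sequence stops recovery just before the named sequence, the last log applied is 3512: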

executing command: SET until clause

Starting recover at 11-AUG-15

starting media recovery

channel ch1: starting archived log restore to default destination
channel ch1: restoring archived log
archived log thread=1 sequence=3512
channel ch1: reading from backup piece 8mqeasar_1
channel ch1: piece handle=8mqeasar_1 tag=FULL_BACKUP_AL_REP
channel ch1: restored backup piece 1
channel ch1: restore complete, elapsed time: 00:00:01
archived log file name=/u01/app/oracle/fast_recovery_area/NYC11/archivelog/2015_08_11/o1_mf_1_3512_bwnl6f7k_.arc thread=1 sequence=3512
channel default: deleting archived log(s)
archived log file name=/u01/app/oracle/fast_recovery_area/NYC11/archivelog/2015_08_11/o1_mf_1_3512_bwnl6f7k_.arc RECID=6603 STAMP=887469069
media recovery complete, elapsed time: 00:00:00
Finished recover at 11-AUG-15

released channel: ch1

Now that the recovery is complete, we can open the recovered database. We will need to use the open resetlogs command since the database has been restored to a different server, and there are currently no online redo logs present on this machine.

[oracle@rstemc64vm31 rman_clone]$ sqlplus "/ as sysdba" 

SQL*Plus: Release 11.2.0.4.0 Production on Tue Aug 11 15:14:40 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> alter database open resetlogs;

Database altered.

Our database has now been restored to a different host using a replicated backupset.
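A quick sanity check on the restored instance can confirm the result. The queries below are a minimal sketch: the first shows the open mode and the new resetlogs time, and the second lists the status of each restored data file.

```sql
-- Run on the restored database after the open resetlogs.
SELECT name, dbid, open_mode, resetlogs_time
FROM   v$database;

-- Every data file should be listed as ONLINE or SYSTEM.
SELECT   file#, status, name
FROM     v$datafile
ORDER BY file#;
```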

Remember that full testing of all aspects of your Oracle data protection architecture is absolutely critical to be confident that you are able to recover from a failure.
