Intro
If there is something I honestly always struggled with as a DBA, it’s RMAN incarnation. Both mystical like its name suggests and vital, ensuring absolute consistency of your database and backups. But deep down, you always know it’ll back fire one day out of the shadows. Because that’s where it hides, lurking until a human error happens as one messes up with archive logs or RMAN retentions. It’s no surprise that my skin starts crawling whenever I see that word show up in an RMAN Horror error. Today we’ll just explore one of the many cases leading to this ORA-19906 error as it can be anything.
Use case and Configuration (RMAN Refresh)
- This didn’t occur during a proper Database restore but rather during a refresh from production.
- I was asked to replace the client’s legacy procedure to refresh 12c databases from production by a time effective alternative which is why I picked RMAN duplicate method (more popular for creating standby DBs). let’s describe the course of events prior to the incident
- Configuration:
Source: Oracle 12c NonCDB with enabled archive logging and backup. OS: Windows server
Test: Similar setup than the production where databases used to be refreshed using a simple restore
procedure by-weekly.
2. The steps
- Drop the old refreshed database to be replaced and keep a copy of he pfile
- Import the password file from source instance to destination DB and rename it to fit the target instance
- Add appropriate entry into "tnsnames.ora" to allow connections to target DB from the duplicate server.
ProDB-source =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = Prodserver)(PORT = 1521))
(SEND_BUF_SIZE = 65535)
(RECV_BUF_SIZE = 65535)
(CONNECT_DATA = (SERVER = DEDICATED)
(SERVICE_NAME = ProDB))) - Directories for data and logs as some system parameters will be adjusted in RMAN duplicate script.
- Directories necessary for starting the duplicate database are already there (.i.e audit)
- Make the backup files from the source database available to the destination server(\\Share\PBKUP$\)
- Start the duplicate instance in NoMount mode using the pfile copy saved earlier
- RMAN Connect to the duplicate database as (AUXILIARY) and source DB as TARGET
$ set ORACLE_SID=testDB
SQL> STARTUP NOMOUNT PFILE='**\initTargetDB.ora';
C:> rman TARGET sys@ProdDB-SOURCE AUXILIARY / msglog=targetDB_duplicate.log - Run Rman duplicate script (make sure the last controlfile backup is older than last archivelog backup)
@rman_duplicate.rman
Run {
DUPLICATE DATABASE TO testDB
until SCN XXXXXXXXX; # to be checked in prod
SPFILE # params to adjust
parameter_value_convert ('ProDB','targetDB')
set db_file_name_convert='D:\ORACLE\ORADATA\ProDB\','D:\ORACLE\ORADATA\targetDB\'
'F:\ORACLE\ORADATA\ProDB\TEMP\','F:\ORACLE\ORADATA\targetDB\TEMP\
set log_file_name_convert='F:\ORACLE\ORADATA\ProDB\','F:\ORACLE\ORADATA\targetDB\'
set db_name='testDB'
set db_unique_name='testDB'
set db_recovery_file_dest_size='xx'
set sga_target='xx'
set pga_aggregate_target='xx'
BACKUP LOCATION '\\Share\PBKUP$\'
NOFILENAMECHECK;
}
Error New incarnation created
The duplicate command completes a restore but fails when the recovery fails
Oracle instance shut down
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 05/28/2020 12:45:11
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
ORA-00283: recovery session canceled due to errors
RMAN-11003: failure during parse/execution of SQL statement: alter database recover logfile
'F:\ORACLE\TesttDB\ARCHIVELOG\2020_05_28\O1_MF_1_16380_HDZTLDHN_.ARC'
ORA-00283: recovery session canceled due to errors
ORA-19906: recovery target incarnation changed during recovery
Alert log message:
alter database recover if needed
start until change 15446103440 using backup controlfile
Media Recovery Start Started logmerger process
Thu May 28 05:13:23 2020 Parallel Media Recovery started with 4 slaves
Setting recovery target incarnation to 3 # What the heck ?????
ORA-279 signalled during: alter database recover if needed
... alter database recover logfile
'F:\ORACLE\testDB\ARCHIVELOG\2020_05_28\O1_MF_1_16380_HDZ05NW3_.ARC'
Thu May 28 05:13:50 2020 Errors with log O1_MF_1_16380_HDZ05NW3_.ARC
Thu May 28 05:13:50 2020 Media Recovery failed with error 19906
Recovery interrupted!
ORA-00283: recovery session canceled due to errors
ORA-19906: recovery target incarnation changed during recovery
Investigation:
== DUPLICATE target
>>> Incarnation 3 on production that doesn’t exist
RMAN> list incarnation;
List of Database Incarnations
DB Key IncKey DB Name DB ID STATUS Reset SCN Reset Time
------- ------ ------- ---------- ------- ---------- ----------
1 1 ProDB 1322653252 ORPHAN 1 03-NOV-11
2 2 ProDB 1322653252 ORPHAN 994063 18-JAN-13
3 3 ProDB 1322653252 CURRENT 14331907298 25-APR-19 == Ghost incarnation== PROD source
RMAN> list incarnation;
List of Database Incarnations
DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time
------ ------- ------- ---------- ------- --------- -----------
1 1 ProDB 1322653252 PARENT 1 03-nov-2011
2 2 ProDB 1322653252 CURRENT 994063 18-jan-2013 == valid Incarnation
Incarnation 3 on production doesn’t exist but is visible in the duplicate Database.
Let’s try to check and clean any leftover in the backup location '\\Share\PBKUP$\'RMAN> crosscheck backupset;
Crosschecked 6 objects
There was 13 files in that directory among which 7 where outdated and not even spotted by RMAN crosscheck . Cleaning them and crosschecking didn’t fix the problem so
Real Root cause
- I was so obsessed with the source backup location that I forgot to read the full path of the archive involved in the error.
- In fact, this happened because the FRA directory on the duplicate server wasn’t cleaned before running the RMAN duplicate script. Hence RMAN will implicitly catalog all old files including the Archive logs and add them to the media recovery after the restore is done (triggering new incarnation that actually belonged to the past) .
'F:\ORACLE\testDB\ARCHIVELOG\2020_05_28\O1_MF_1_16380_HDZ05NW3_.ARC'
Solution I removed all old Archive logs and the duplicate finished with no errors after that.
Conclusion
The Rman lesson of the day to remember is: "Always clean up the house before moving in..".
That is clean the existing FRA folders on the duplicate destination before the refresh ;).
PS: You’d be surprise to know that this error popped up for other users a decade ago on RMAN 10g
No comments:
Post a Comment