Lots of people seem to like Oracles Recovery manager. I am not one of them.
I think this is because of a lack of understanding on my part of how it works. It is
a complex beast, and at the same time has some annoying limitations.
I like to automate things. I have a number of scripts to call RMAN to
do backups and restores in common situations. These fail far too often
for my liking. I feel I should look into why. Maybe I can learn to love
RMAN? We shall see.
We got the following error:
RMAN-06569: DATABASE: PROD does not match previous DATABASE: TEST
This is because we have done a clone of production to test using a
SAN snapshot, or by copying the files, then a create control file
command. Then we forget to run a nid, or else a backup kicks in
before we get round to it.
The next time you try to do a duplicate, it gets confused as to which
database is being used. To fix this, you need to log in to
production, and unregister the database:
Logging into the catalogue we can see that it has taken TEST as a standby
for PROD:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
| RMAN> connect catalog user/password@CATALOG
connected to recovery catalog database
RMAN> list db_unique_name of database PROD;
List of Databases
DB Key DB Name DB ID Database Role Db_unique_name
------- ------- ---------------- --------------- ------------------
10796359 PROD 2538967823 PRIMARY PROD
10796359 PROD 2538967823 STANDBY TEST
10796359 PROD 2538967823 STANDBY STANDBY1
10796359 PROD 2538967823 STANDBY STANDBY2
|
The solution to this is to log into RMAN from the primary PROD database, and
unregister the test database that was registered erroneously by the backups.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
| $ rman target /
Recovery Manager: Release 12.1.0.2.0 - Production on Tue Jan 2 16:40:34 2018
Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved.
connected to target database: PROD (DBID=2538967823)
RMAN> connect catalog user/password@CATALOG
connected to recovery catalog database
RMAN> unregister db_unique_name 'TEST';
database db_unique_name is "TEST", db_name is "PROD" and DBID is 2538967823
Want to unregister the database with target db_unique_name (enter YES or NO)? YES
database with db_unique_name TEST unregistered from the recovery catalog
|
Then if we list the databases, we get a tidy list, and more importantly, the
duplicate works.
1
2
3
4
5
6
7
8
9
| RMAN> list db_unique_name of database PROD;
List of Databases
DB Key DB Name DB ID Database Role Db_unique_name
------- ------- ---------------- --------------- ------------------
10796359 PROD 2538967823 PRIMARY PROD
10796359 PROD 2538967823 STANDBY STANDBY1
10796359 PROD 2538967823 STANDBY STANDBY2
|
It is also possible to unregister a database using a PL/SQL call:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
| $ sqlplus
SQL*Plus: Release 12.1.0.2.0 Production on Mon Jan 8 16:58:16 2018
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Enter user-name: user/password@CATALOG
Last Successful login time: Mon Jan 08 2018 16:57:33 +00:00
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning option
SQL> exec dbms_rcvcat.unregisterdatabase(35409171,3917556746);
PL/SQL procedure successfully completed.
SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning option
|
I found after this that the newer incarnation was also unregistered from the
catalogue, so I registered it again.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
| $ rman target /
Recovery Manager: Release 12.1.0.2.0 - Production on Mon Jan 8 17:16:44 2018
Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved.
connected to target database: TEST (DBID=3919541964)
RMAN> connect catalog user/password@CATALOG
connected to recovery catalog database
RMAN> list backup of database;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of list command at 01/08/2018 17:17:10
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20001: target database not found in recovery catalog
RMAN> register database;
database registered in recovery catalog
starting full resync of recovery catalog
full resync complete
RMAN> list backup of database;
List of Backup Sets
===================
BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
36124190 Incr 0 8.75M SBT_TAPE 00:00:07 01-JAN-18
BP Key: 36126997 Status: AVAILABLE Compressed: NO Tag: HOT_DB_BK_LEVEL0
Handle: bk_1670_1_964234591 Media: @aaaab
List of Datafiles in backup set 36124190
File LV Type Ckp SCN Ckp Time Name
---- -- ---- ---------- --------- ----
105 0 Incr 15297541132164 01-JAN-18 /TEST/filename/file.dbf
...
|
But, to make sure this doesn’t happen again, we need to make sure
that after a clone that isn’t done by RMAN, we use the nid tool
to give the database a new database identifier (DBID), as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
| ORACLE_SID=TEST
ORAENV_ASK=NO
. oraenv
# Nid needs the database in mount mode
sqlplus / as sysdba <<-!
shutdown immediate;
startup mount
exit
!
# Give the database a new DBID
nid TARGET=/ LOGFILE=nid_$ORACLE_SID.log
cat nid_$ORACLE_SID.log
#nid leaves the database down. Start it up
# and open resetlogs
sqlplus / as sysdba <<-!
startup mount
alter database open resetlogs;
shutdown immediate;
startup
exit
!
# Lastly, register the database in the recovery
# catalogue.
rman target / <<-!
connect catalog user/password@CATALOG
register database;
exit
!
|
The real solution is ensuring this is run faithfully after any
clone done without the use of RMAN.