Different Scenarios
In my previous post on this topic, I noted that you could use the snapshot time
on the recover database command to recover the database from a SAN snapshot.
I realise that there are different possible scenarios and my write up wasn’t clear
on which approach is applicable when. Also, the test I did was unrealistic as I
used the logs after the snapshot was taken, and the whole point of using
SAN snapshots is that they contain everything required for a crash consistent
recovery.
Start up in place
If you had a crash consistent snapshot, this can be mounted in the place of the
original and started. Oracle will recover as if the database had crashed, and
all is well. Easy!
Start up a copy
More often, a table will be accidentally corrupted, and we will be asked to
restore it from the snapshot, but we want to keep the rest of the data in the
database is good, we want to leave that as it is. This is more tricky, and is
where the new command comes in.
Here it is really useful to have the create controlfile command. It is possible
to construct it from the files on the snapshot, but this is fiddly. It would
be worth having a cron job to periodically issue:
1
| alter database backup controlfile to trace as '/filesystem/CONTROLFILE.sql' resetlogs;
|
The main thing is to have all the files.
Edit the controlfile as follows:
- Change the paths of the files if the filesystem is mounted in a different place
- Change the name of the database
- Insert the word SET to change the name of the database in the file headers.
So the first line of the create controlfile command looks like this:
1
| CREATE CONTROLFILE REUSE SET DATABASE "RECO" RESETLOGS FORCE LOGGING ARCHIVELOG
|
Now, I have a healthy paranoia, such that before running this command I check
the files in the create controlfile command actually exist like this:
1
2
3
4
5
6
7
8
9
| grep \' CONTROLFILE.sql | cut -f2 -d\' | while read aline
do
if ls $aline;
then
:
else
echo $aline missing;
fi
done
|
I also check the files on the filesystem are in the create controlfile command
1
2
3
4
5
6
7
8
9
| find . -type f -name \*.dbf | while read aline
do
if grep $aline CONTROLFILE.sql
then
echo $aline found
else
echo '****' $aline missing
fi
done
|
Now I am pretty confident in the commands. Ensure an init,ora exists for the database,
and is correct, then run the controlfile command. This will start up the database, create
the controlfile, but fail to recover because it doesn’t know where the redo is.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
| SQL> @CONTROLFILE
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size 3712904 bytes
Variable Size 1694500984 bytes
Database Buffers 436207616 bytes
Redo Buffers 13062144 bytes
Control file created.
ORA-00279: change 15295737437889 generated at 01/19/2018 12:02:30 needed for
thread 1
ORA-00289: suggestion : /RECO/archive/RECO_80_1_963178734.arc
ORA-00280: change 15295737437889 for thread 1 is in sequence #80
ORA-00308: cannot open archived log '/RECO/archive/RECO_80_1_963178734.arc'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Database altered.
Database altered.
ALTER DATABASE OPEN RESETLOGS
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/RECO/system/system01.dbf'
ALTER TABLESPACE TEMP ADD TEMPFILE '/RECO/temp/temp01.dbf'
*
ERROR at line 1:
ORA-01109: database not open
|
Next I made an interesting mistake. If the snapshot time is too early, the database
will tell you. If this happens you can add a second until it works. Or in my case, just
specify the correct year!
1
2
3
4
| SQL> recover database until cancel using backup controlfile snapshot time '19-JAN-2017 13:19:01';
ORA-00283: recovery session canceled due to errors
ORA-19839: snapshot datafile checkpoint time is greater than snapshot time
ORA-01110: data file 1: '/RECO/system/system01.dbf'
|
The database needs to use the online redo log from the old database and it doesn’t
know about this because of the new controlfile. To prove it needs recovery from the redo log I will
try to cancel it.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
| SQL> recover database until cancel using backup controlfile snapshot time '19-JAN-2018 13:19:01';
ORA-00279: change 15295737437889 generated at 01/19/2018 12:02:30 needed for
thread 1
ORA-00289: suggestion : /RECO/archive/RECO_80_1_963178734.arc
ORA-00280: change 15295737437889 for thread 1 is in sequence #80
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
CANCEL
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/RECO/system/system01.dbf'
ORA-01112: media recovery not started
|
So, I specify the latest online redo log, and it recovers OK.
1
2
3
4
5
6
7
8
9
10
11
| SQL> recover database until cancel using backup controlfile snapshot time '19-JAN-2018 13:19:01';
ORA-00279: change 15295737437889 generated at 01/19/2018 12:02:30 needed for
thread 1
ORA-00289: suggestion : /RECO/archive/RECO_80_1_963178734.arc
ORA-00280: change 15295737437889 for thread 1 is in sequence #80
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/RECO/flash/PROD/onlinelog/o1_mf_4_f3m1s2y3_.log
Log applied.
Media recovery complete.
|
So now I should be able to open the database resetlogs and all will be fine right?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
| SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00344: unable to re-create online log
'/RECO/flash/RECO/onlinelog/o1_mf_4_%u_.log'
ORA-27044: unable to write the header block of file
Linux-x86_64 Error: 28: No space left on device
Additional information: 3
SQL> !df -h /RECO
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vold42p1 1.5T 1.5T 2.5G 100% /d48
|
Oh dear. I have run out of space on the filesystem. I will just clear down
the old archive redo logs and the flash recovery area, and try again.
1
2
3
4
5
6
7
| QL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00392: log 4 of thread 1 is being cleared, operation not allowed
ORA-00312: online log 4 thread 1:
'/RECO/flash/RECO/onlinelog/o1_mf_4_%u_.log'
|
Looking on the internet it seems that because the open resetlogs failed
the database is in a confused state. Lets try clearing the log by hand.
1
2
3
4
5
6
7
8
9
| SQL> alter database clear logfile group 4;
Database altered.
SQL> alter database open resetlogs;
Database altered.
SQL>
|
Phew, that worked! It is probably worth having enough space for this though!