Over the years my opinion of what big data is has changed. Way back in the VAX days, a 1G database was not only big data, it was fricken huge data. Now words like giga, tera, and peta are thrown around casually. Exa is still reserved for the really fricken big.
Now, back to the restore. The customer has a data warehouse that has been populated with many years' worth of data, and the consumers of that data are always asking for more. The easy way to keep a backup of this massive amount of data is to create a standby database. So we have a standby database: archive logs are shipped to it, and I get a report every morning on its health. Everything is hunky-dory, or so I thought.
The customer was doing a big data push, and for reasons that were beyond my control, it was decided that Data Guard would be shut down. I resisted and was overruled. After two months of the standby database being shut down, I was finally told to bring the standby instance back in sync with the primary database. Time for the gremlins to come into the picture. Issues I faced:
- A DBA did an open resetlogs on the primary database instance. I learned this after pulling 2.5T of archive logs down the pipe.
- So, I did an SCN-based incremental backup of the primary, starting at the oldest SCN in the standby database. Again, after copying down 2.5T of data, one of the managers got a bit anxious and decided to break the SAN replication before the files had finished replicating. The files were used to do an RMAN recover on the standby database, and it was a failure.
What finally worked:
- Do a complete cold backup of the data warehouse.
- Send some large fast drives to the data center.
- Encrypt the drives.
- Courier the drives to the DR site.
- Decrypt and uncompress the backup.
- Restore the backup.
- scp the archive logs to the DR site.
- Recover the standby database.
- Have a beer and get some sleep.
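For the last two database steps above, the work amounts to mounting the standby, making it aware of the shipped archive logs, and letting managed recovery chew through them. A minimal sketch, assuming a physical standby with the restored datafiles already in place; the archive log path and name are placeholders, not the customer's actual layout:

```sql
-- Run on the standby instance after the restored backup is in place.
STARTUP MOUNT;

-- Make the standby aware of each copied archive log
-- ('/u01/arch/log_1234.arc' is a made-up example path).
ALTER DATABASE REGISTER LOGFILE '/u01/arch/log_1234.arc';

-- Resume managed recovery and return control to the session.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```

With a large pile of logs, cataloging the whole directory from RMAN (`CATALOG START WITH '/u01/arch/'`) beats registering them one at a time.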
Lessons learned:
- Under no circumstances shall anyone shut down the standby database.
- Under no circumstances shall anyone do an open resetlogs on any database that has a Data Guard standby.
- Do a cost-benefit analysis of shipping drives out to the primary site before doing long copies over a pipe that is shared by many. For turnaround time, ship the drives out. We have now decided to keep a set of drives at the data center so we can do a much quicker restore of the database. The decision you make will be based on your patience, and your pain tolerance for having a bad standby database.
- If there are only a handful of archive logs missing, scp is a wonderful thing.
- If there are many more archive logs than that to transfer, put them on a drive and ship it to the DR site to recover the standby database.
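The cost-benefit math in that lesson is worth doing before the copy starts, not 2.5T in. A rough sketch; the 2.5T figure is from this incident, but the link speeds and usable-bandwidth fraction below are illustrative guesses, not measurements:

```python
def transfer_hours(size_tb, link_mbps, utilization=1.0):
    """Hours to move size_tb (decimal terabytes) over a link_mbps link,
    given the fraction of the bandwidth you actually get to use."""
    bits = size_tb * 1e12 * 8                     # terabytes -> bits
    return bits / (link_mbps * 1e6 * utilization) / 3600

# 2.5T of archive logs over a shared 100 Mb/s WAN, half of it usable:
print(round(transfer_hours(2.5, 100, 0.5)))       # ~111 hours -- call the courier
# The same 2.5T onto a local drive at ~150 MB/s (1200 Mb/s):
print(round(transfer_hours(2.5, 1200)))           # ~5 hours, plus a day in transit
```

Once the shared-pipe estimate runs past a couple of days, overnighting encrypted drives wins, which is exactly why we now keep a set at the data center.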