Life of an Oracle DBA: Grumps, Data Guard, upgrades and tuning, oh my

There are very few things that will get my dander up. One is wasting my time; the other is “Not My Job.” Last week I experienced both; it took a while, but the person who told me it was not their job wound up doing what needed to be done, and a good deal of my time got wasted along the way. This gets back to my mantra: “You may be asked to do things that are outside of your job description; if so, embrace it and do the best job you can.” You will make yourself more valuable to your employer, your customers, and your teammates.

I also ran into a problem with Data Guard that I had not seen before.

ORA-00342: archived log does not have expected resetlogs SCN 137433238

ORA-00334: archived log: '/var/opt/data/sor01/arch/1_191_768933256.dbf'

This was weird. This ORA error has to do with a different incarnation of the database. What caused the issue? My hunch: someone did an open resetlogs.
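A quick way to check that hunch is to compare the incarnation history on the primary and the standby; if the primary shows a newer incarnation with a higher resetlogs SCN than the one the standby is recovering under, somebody did an open resetlogs. Something along these lines on both sides (just a sanity check I would run, not part of the original output):

SQL> select incarnation#, resetlogs_change#, resetlogs_time, status
     from v$database_incarnation order by incarnation#;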

I checked the last log sequence generated on the primary against the last sequence received and applied on the standby.

primary> select thread#, max(sequence#) "Last Primary Seq Generated"
         from v$archived_log val, v$database vdb
         where val.resetlogs_change# = vdb.resetlogs_change#
         group by thread# order by 1;

   THREAD# Last Primary Seq Generated
---------- --------------------------
         1                        847

standby> select thread#, max(sequence#) "Last Standby Seq Received"
         from v$archived_log val, v$database vdb
         where val.resetlogs_change# = vdb.resetlogs_change#
         group by thread# order by 1;

   THREAD# Last Standby Seq Received
---------- -------------------------
         1                       847

But wait, there's more:

standby> select thread#, max(sequence#) "Last Standby Seq Applied"
         from v$archived_log val, v$database vdb
         where val.resetlogs_change# = vdb.resetlogs_change#
         and val.applied in ('YES','IN-MEMORY')
         group by thread# order by 1;

   THREAD# Last Standby Seq Applied
---------- ------------------------
         1                      190

Okay, I can see that the standby database wants log 191. But that still doesn't tell me the real issue, so I needed more information to figure out what the problem was.

What is the SCN stored in the standby's controlfile?

standby> select db_unique_name, to_char(current_scn, '999999999999999999') as Standby_Current_SCN
         from v$database;

DB_UNIQUE_NAME                 STANDBY_CURRENT_SCN
------------------------------ -------------------
<redacted>                               199126920

I still had not found the root cause and had wasted a lot of time trying to get the archive logs to apply, so I decided to take an incremental backup of the primary database starting from the oldest SCN applied to the standby database.

standby> select min(to_char(fhscn, '99999999999999999999')) as SCN_to_be_used
         from X$KCVFH;

SCN_TO_BE_USED
---------------------
            199126921

So now I have a place to start my incremental backup from. But I prefer to live on the conservative side of life: these SCNs are not matching up very well, so I'm going to take the lower of the two for my backup.

RMAN> BACKUP INCREMENTAL FROM SCN 199126920 DATABASE FORMAT '/var/opt/backup/ForStandby_%U' TAG 'FORSTANDBY';

After the backup is done, the plan is to scp the backup pieces to the standby site and do the restore there. But wait, port 22 is not open, so I am going to have to rely on SAN replication instead. I'll let you know how the restore goes; the files are still replicating.
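For the record, once the backup pieces do make it over, the standby-side roll forward is the usual recipe: stop managed recovery, catalog the pieces, recover with NOREDO, then refresh the standby controlfile from the primary and restart recovery. Roughly this (a sketch only; the ForStandbyCTRL.bck file name is made up and your paths will differ):

-- on the standby
SQL>  alter database recover managed standby database cancel;
RMAN> catalog start with '/var/opt/backup/ForStandby';
RMAN> recover database noredo;
-- on the primary, grab a fresh standby controlfile
RMAN> backup current controlfile for standby format '/var/opt/backup/ForStandbyCTRL.bck';
-- back on the standby
RMAN> shutdown immediate;
RMAN> startup nomount;
RMAN> restore standby controlfile from '/var/opt/backup/ForStandbyCTRL.bck';
RMAN> alter database mount;
SQL>  alter database recover managed standby database disconnect from session;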

Migrate 10g on Solaris to 11g on Linux

There are three small databases that needed to be migrated from 10g on Solaris to 11g on Linux. Because they were small, I decided the easiest way to do the move was with Data Pump. The next set has a lot of data and is moving to a RAC environment, so I will experiment with transportable tablespaces for those.
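For the curious, the Data Pump route really is about as simple as it gets; here is a rough sketch with made-up directory and dump file names (full export from the 10g side, full import into a pre-created 11g database):

-- on the 10g Solaris source
SQL> create directory dp_mig as '/var/opt/dpump';
$ expdp system full=y directory=dp_mig dumpfile=mig_%U.dmp logfile=mig_exp.log

-- copy the dump files to the Linux box, then on the 11g target
SQL> create directory dp_mig as '/var/opt/dpump';
$ impdp system full=y directory=dp_mig dumpfile=mig_%U.dmp logfile=mig_imp.log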

Tuning Oracle

We have a web-based OLTP application where the database was running at 100% CPU. Logins were slowing down and user requests were backing up; in short, it was an extreme-terror kind of day. Looking at the database I found that most of the sessions were piling up on the same query. My first question was: what triggered hundreds of users to start executing the same query at the same time, and to keep it up all day? The second question was: what can I do to speed things up?
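Spotting the statement everybody is stuck on is the easy part; when hundreds of sessions pile up on one cursor it jumps right out of v$session. Roughly the kind of query I use (nothing fancy, and the column alias is mine); feed the top sql_id into v$sql to pull the statement text:

SQL> select sql_id, count(*) as active_sessions
     from v$session
     where status = 'ACTIVE' and sql_id is not null
     group by sql_id
     order by 2 desc;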

It turned out our 75,000+ user base had received a message telling them to go do something, so that explained my first question. The second question was a bit harder to address. There was one index that was getting a fast full index scan for every execution, and it started showing up in v$session_longops. The index is a compressed index, and when I looked at it closely I found the columns were ordered from highest to lowest cardinality. Index key compression works best when the low-cardinality columns lead, so reversing the column order would give much better compression. And because this index mostly sees fast full index scans, shrinking the number of blocks in the index directly reduces the work those scans have to do.

SQL> drop index <schema>.<index_name>;

SQL> create index <schema>.<index_name> on <schema>.<table_name>
     (lowest_cardinality, next_lowest, highest_cardinality)
     tablespace indx compress 2;

A generic compressed-index example (sorry, I can't give specifics, NDA and all ya'know):

SQL> create index app.cust_indx on app.custs (sex, eye_color, cust_id) tablespace indx compress 2;
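If you want to sanity-check the compression prefix before rebuilding, ANALYZE INDEX ... VALIDATE STRUCTURE populates INDEX_STATS with the optimizer's suggested prefix length and the space it expects to save. Using the generic index above (note that VALIDATE STRUCTURE locks the underlying table, so pick a quiet window):

SQL> analyze index app.cust_indx validate structure;
SQL> select name, opt_cmpr_count, opt_cmpr_pctsave from index_stats;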

By changing the order of the columns in the compressed index, the number of blocks needed to store the index dropped by 50%. The load finally returned to quasi-normal and requests were getting serviced in a timely manner. It does look like we have maxed out this database server, though; thank goodness we are going to be upgrading it from single-instance Oracle 10g on Solaris to Oracle 11g RAC on Linux.
