Update to my earlier #quicktip on setting #plsql scope and warnings.

I noticed an error in my code for setting PLSCOPE_SETTINGS and PLSQL_WARNINGS in the QuickTip logon.sql. What I did was get the instance name out of v$instance to figure out whether I was connecting to a production environment or one of the lower environments. The problem with this is that not everyone is going to have permission to select from sys.v_$instance. The better way is to use sys_context to get the instance name; that way you won’t have to chase down additional privileges from your DBA.

<code>

select sys_context('userenv','instance_name')
 into sInst
 from dual;

</code>

Here is the corrected code for my logon.sql

<code>

DECLARE
  sInst VARCHAR2(1);
BEGIN
  -- rlockard: 2018/02/23 commented out getting instance name from v$instance.
  -- use the more appropriate sys_context('userenv','instance_name') instead.

  --select upper(SUBSTR(instance_name, 1,1))
  --INTO sInst
  --FROM SYS.V_$INSTANCE;

  select upper(substr(sys_context('userenv','instance_name'), 1, 1))
    into sInst
    from dual;

  -- test to see if this is a production instance.
  -- all production instances start with P, so ...
  -- if it's not a production instance, set up
  -- session properties appropriate for dev / test / sandbox.
  IF sInst != 'P' THEN
    execute immediate 'ALTER SESSION SET PLSCOPE_SETTINGS=' || '''IDENTIFIERS:ALL''';
    execute immediate 'ALTER SESSION SET PLSQL_WARNINGS=' || '''ENABLE:ALL''';
  END IF;
END;
/

</code>
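
If you want to confirm the settings actually stick for code you compile in the session, user_plsql_object_settings works without needing any extra privileges. The procedure name below is just a placeholder for whatever you compile after logging on.

<code>

select name, type, plscope_settings, plsql_warnings
  from user_plsql_object_settings
 where name = 'MY_TEST_PROC'; -- placeholder: any object compiled in this session

</code>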

Turn off the #http #listener in #Oracle #STIG

When locking down a database (applying STIGs), you need to check whether the listener is serving HTTP. If you don’t need the HTTP service, turn it off; that reduces the attack surface.

Step 1) Is http running?
[oracle@vbgeneric db_1]$ lsnrctl stat | grep HTTP
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=vbgeneric)(PORT=8081))(Presentation=HTTP)(Session=RAW))
[oracle@vbgeneric db_1]$

Step 2) Turn off http
RLOCKARD@orcl> select version from v$instance;
VERSION
-----------------
12.1.0.2.0

RLOCKARD@orcl12c> sho parameter dispatchers

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dispatchers                          string      (PROTOCOL=TCP)
max_dispatchers                      integer

RLOCKARD@orcl12c> exec dbms_xdb.sethttpport(0);
PL/SQL procedure successfully completed.

RLOCKARD@orcl12c> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SYS@orcl12c> sho parameter dispatchers

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dispatchers                          string      (PROTOCOL=TCP)
max_dispatchers                      integer
SYS@orcl12c>

[oracle@vbgeneric db_1]$ lsnrctl stop

LSNRCTL for Linux: Version 12.1.0.2.0 – Production on 15-SEP-2016 09:25:29

Copyright (c) 1991, 2014, Oracle. All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1)))
The command completed successfully
[oracle@vbgeneric db_1]$ lsnrctl start

LSNRCTL for Linux: Version 12.1.0.2.0 – Production on 15-SEP-2016 09:25:34

Copyright (c) 1991, 2014, Oracle. All rights reserved.

Starting /u01/app/oracle/product/12.1.0.2/db_1/bin/tnslsnr: please wait…

TNSLSNR for Linux: Version 12.1.0.2.0 – Production
System parameter file is /u01/app/oracle/product/12.1.0.2/db_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/diag/tnslsnr/vbgeneric/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=0.0.0.0)(PORT=1521)))
Notice the HTTP endpoint is gone from the listening endpoints:
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 12.1.0.2.0 – Production
Start Date 15-SEP-2016 09:25:34
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Default Service orcl12c
Listener Parameter File /u01/app/oracle/product/12.1.0.2/db_1/network/admin/listener.ora
Listener Log File /u01/app/oracle/diag/tnslsnr/vbgeneric/listener/alert/log.xml
Listening Endpoints Summary…
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=0.0.0.0)(PORT=1521)))
Services Summary…
Service “orcl12c” has 1 instance(s).
Instance “orcl12c”, status UNKNOWN, has 1 handler(s) for this service…
The command completed successfully
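
As a final sanity check from inside the database, you can also ask XDB what port it thinks it is serving HTTP on; after sethttpport(0) it should report 0.

select dbms_xdb.gethttpport from dual;  -- expect 0 once the HTTP endpoint is disabled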

That was easy.

Oracle DBA Interview tips

I have spent the past several weeks interviewing potential Senior Oracle DBAs. Most did not make the cut. Why?

  1. Resume:
    1. If you have 10 years' experience at 10 different places, that tells me you will most likely quit or be fired within a year. Job hopping is not impressive. Now that said, if you are working in a consulting capacity (contract to contract), that is pretty normal; please indicate that on your resume.
    2. Putting a PhD in public policy from a diploma mill will not impress me. We may invite you to the phone interview just to figure out who you are. But then you better perform like a PhD.
    3. If you put <name your technology> Expert on your resume, you better be able to back that up in the interview. I will ask hard questions on the things you say you are an expert on.
  2. Technical questions.
    1. Don't BS your way through the answer. It’s okay to not know the answer; these questions are designed to show how you think. If you do not know the answer, what I’m looking to hear is “I don’t know the answer, but this is how I would go about finding it.” It’s okay to think out loud; that gives us an idea of how you solve problems. And no, saying you would open an SR for every problem is not the right way to solve a problem.
    2. Once you have answered the question, indicate you are ready for the next question. Don’t ramble on, please, I just start hearing “bla bla bla.”
  3. Energy
    1. OMG, we did an interview today where we were wondering if the person suffered from depression. Okay, you have been out of work, you feel bad about yourself, get over it. We need someone who will come in and be ready to work. So please get yourself pumped up, be happy to talk to us, be professional. Even if it’s the ninth phone interview you did this week and everyone said no. You need to convince me and my coworkers you are ready to come in and get the job done.
    2. Watch Amy Cuddy https://www.youtube.com/watch?v=Ks-_Mh1QhMc  explaining how body language shapes who you are.
  4. Attitude
    1. There is a difference between confidence and condescension. One gentleman was quite condescending about our current environment, then rolled his eyes when we talked about how we are starting a consolidation effort (which is what we were interviewing him for).

Who did we invite back for a face to face?

  1. She showed energy: the young lady spent five minutes explaining her history to us and was proud of her work (we were wondering if she had one too many espressos). She then followed up the interview with a thank-you email addressed to each one of us that summarized why she is a good fit for our team. She showed us that she believes in herself. She missed a couple of questions, but she indicated she did not know the answer off the top of her head and talked through the logic. Hey, you get a lot of points with me if you show me you can think.
  2. He said, “I have not worked with TDE, so I can’t answer those questions; I would go to the Oracle documentation and look it up.” He showed us that he knows what he does not know.

That’s pretty much it. Once you start talking, you have about seven seconds to get my attention. Show you are competent, show that you know what you don’t know and, for Pete's sake, show some energy. Show us that you believe in yourself. We want to hire you. We want you to succeed.

Insert performance on a unique constraint.

In the past, I have read about the performance of exception handlers, and all that is ever said is “there is performance overhead.” Okay, the generic answer of “performance overhead” tells us nothing.

One of the great pleasures of what I do is code reviews.  I get to learn techniques from other developers and I get to pass on some of my knowledge.  Code reviews are a win-win all around.

Recently, I reviewed some code that used the following block to enforce a unique constraint (tables and variables have been changed to protect intellectual property). There is a unique constraint on ins_demo.column1, so the select will do an index unique scan for every call. The use case for this example is a transactional system where different users will be entering data.
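
For reference, the test objects look roughly like this (the names and datatype are stand-ins, per the note above about changing names):

create table ins_demo (column1 varchar2(30));

alter table ins_demo
  add constraint ins_demo_uk unique (column1);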

begin
  select count(*)
    into dummy
    from ins_demo
   where column1 = p_value;

  if dummy = 0 then
    insert into ins_demo values (p_value);
    return true;
  else
    return false;
  end if;
end;

My knee-jerk reaction was to change this from a select + insert to an insert with an exception handler to improve performance.

begin
  insert into ins_demo values (p_value);
  return true;
exception
  when dup_val_on_index then
    return false;
end;

Before making recommendations, first do no harm. So, in this case, I decided to run some performance tests against the select + insert and the insert with an exception handler. The test attempts 100,000 inserts, going from 100% success to 100% failures.
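
For reference, here is a minimal sketch of the kind of harness used for the insert-with-exception case; the key format is made up, and you would pre-seed ins_demo with some percentage of the keys to vary the rejection rate.

declare
  l_start    pls_integer;
  l_inserted pls_integer := 0;
  l_rejected pls_integer := 0;
begin
  l_start := dbms_utility.get_time;

  for i in 1 .. 100000 loop
    begin
      insert into ins_demo values ('KEY_' || to_char(i));
      l_inserted := l_inserted + 1;
    exception
      when dup_val_on_index then
        l_rejected := l_rejected + 1;  -- duplicate rejected by the unique constraint
    end;
  end loop;

  dbms_output.put_line('elapsed seconds: ' || to_char((dbms_utility.get_time - l_start) / 100));
  dbms_output.put_line('inserted       : ' || l_inserted);
  dbms_output.put_line('rejected       : ' || l_rejected);
end;
/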

The results of the test have a few surprises.

Select+Insert, 100% success:
  execution time = 21.93 seconds
  insert CPU = 9.6
  select CPU = 2.55

Insert with exception, 100% success:
  execution time = 14.29 seconds
  insert CPU = 9.95
  recursive CPU = 8.96

Select+Insert, 100% rejected due to duplicate rows:
  execution time = 5.86 seconds
  insert CPU = 0
  select CPU = 1.73

Insert with exception, 100% rejected due to duplicate rows:
  execution time = 135.03 seconds
  insert CPU = 16.85
  exception CPU = 20.7

We can learn a few things from this.

1) In the select + insert method, as the number of rejections increased, execution time decreased, insert CPU decreased, and select CPU decreased. This is due to a couple of things: a unique index scan is faster when the data is found, and as more rows are rejected due to unique index violations, the insert is executed fewer times.

2) In the insert with exception handler method, execution time increases as the number of rows rejected due to unique index violations increases. This is because Oracle attempts the insert and then must roll it back when the duplicate is detected.

I will post a more detailed analysis with the raw performance data tomorrow.

Life of an Oracle DBA – The Basics Access Paths: Index Range Scan

An index range scan gets rowids by walking adjacent index entries. Once Oracle has gathered the rowids, the rows are returned in ascending order of the index keys (unless you use a descending sort).


From our previous example, we saw that if you use an equality predicate on a non-unique index, Oracle will do an index range scan.


An index range scan will also be done if your query has a range in the predicate.

Let's drop the normal b-tree index we created and create a unique index on t1(object_id).

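A minimal sketch of that demo (t1 is the copy of dba_objects built later in this series; the object_id values are placeholders):

drop index t1_idx;
create unique index t1_idx on t1(object_id);
analyze index t1_idx compute statistics;

set autotrace traceonly explain
select owner, object_name
  from t1
 where object_id between 100 and 200;  -- range predicate => INDEX RANGE SCAN on T1_IDX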

We used a between clause and wound up getting an index range scan on t1_idx.

But what if we did this a little differently? Let's create an index on owner and use an equality predicate.

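Again, roughly what that demo looked like (the index name is an assumption):

create index t1_owner_idx on t1(owner);
analyze table t1 compute statistics;
analyze index t1_owner_idx compute statistics;

set autotrace traceonly explain
select object_name
  from t1
 where owner = 'SYS';  -- equality on a non-unique index => INDEX RANGE SCAN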

The first thing we notice is that we are still doing an index range scan. We analyzed the table, we analyzed the index, but the optimizer still got the cardinality estimate all wrong. We are going to come back and revisit this problem in a later post. You really need to get the cardinality right; Maria Colgan of Oracle Optimizer fame speaks frequently on this subject.

What are the where clauses that can cause the optimizer to do an index range scan?

LIKE, >, <, =, BETWEEN

You should also realize an index range scan can be bounded or unbounded.

Examples of bounded:

WHERE column = 'x'
WHERE column > 1 and column < 5
WHERE column LIKE 'x%'
WHERE column between 1 and 5

Examples of unbounded:

WHERE column > 5
WHERE column < 5

We are going to come back after we address access paths, join methods and join orders to talk about histograms and getting cardinality just right.

Life of an Oracle DBA – The Basics Access Paths: Index Unique Scan

The index unique scan is good for getting one row from an index and returning its rowid. You will get an index unique scan if you have a unique index or primary key on the table.

In this example, we created a unique index on object_id. When we executed the query, the optimizer did an index unique scan, and it did just what it was designed to do: it returned a single row.
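
Roughly, the demo looked like this (the object_id value is a placeholder):

create unique index t1_idx on t1(object_id);
analyze index t1_idx compute statistics;

set autotrace traceonly explain
select owner, object_name
  from t1
 where object_id = 12345;  -- equality on a unique index => INDEX UNIQUE SCAN
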
Now let's try this with a normal b-tree index.

SQL> drop index t1_idx;

Index dropped.

SQL> create index t1_idx on t1(object_id);

Index created.

SQL> analyze index t1_idx compute statistics;

Index analyzed.

As you can see here, because we used a normal index, the optimizer decided to do an index range scan.

Next we will be covering the index range scan.

Life of an Oracle DBA – The Basics Access Paths: Table Access By Rowid.

Dang, that just sounds fast, and it is. A rowid points to the datafile, the block, and the location of the row within the block. So, what is this good for, and how do you get Oracle to access by rowid? If you want to access a specific row or set of rows and can derive the rowid, you can get a lot of speed accessing the data. To get the rowid, you either pass it to the query, like I did here, or get it through an index.

But what is a rowid? In Oracle 8 and above, a rowid is 10 bytes that break down to:

Bits 1 - 32: data object number (object_id)
Bits 33 - 42: relative file number
Bits 43 - 64: block number
Bits 65 - 80: row number inside the block

You can use the package dbms_rowid to get interesting information from rowid.

SQL> variable vrowid varchar2(20);
SQL> execute :vrowid := 'AABAkfAAIAAIdZKAAA'

PL/SQL procedure successfully completed.

SQL> select dbms_rowid.rowid_object(:vrowid) "object",
  2         dbms_rowid.rowid_relative_fno(:vrowid) "file",
  3         dbms_rowid.rowid_block_number(:vrowid) "block",
  4         dbms_rowid.rowid_row_number(:vrowid) "row"
  5    from dual;

    object       file      block        row
---------- ---------- ---------- ----------
    264479          8    2217546          0

Let's look at getting a row from the database by using the rowid. One thing you will notice is that we did not have to make a round trip to the index t1_idx to find the rowid, because we already had it; Oracle was able to go directly to the row and return it to us. I’m thinking I could have done this better, because using trace files and autotrace will create a lot of additional recursive SQL in the trace file, such as delete from plan_table and insert into plan_table.

alter session set timed_statistics = true;
alter session set statistics_level = all;
alter session set sql_trace = true;
alter session set max_dump_file_size = unlimited;
alter session set tracefile_identifier = owi_trace;
alter session set events '10046 trace name context forever, level 12';

set autotrace on

select object_type, object_name from t1 where rowid = :vrowid;

OBJECT_TYPE             OBJECT_NAME
----------------------- ------------------------------
TABLE                   FILE$

Execution Plan
----------------------------------------------------------
Plan hash value: 2759577407

-----------------------------------------------------------------------------------
| Id | Operation                  | Name | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT           |      |    1 |    44 |     1   (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY USER ROWID | T1   |    1 |    44 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access(CHARTOROWID(:VROWID))

Statistics
----------------------------------------------------------
          6  recursive calls
          4  db block gets
          5  consistent gets
          0  physical reads
        760  redo size
        596  bytes sent via SQL*Net to client
        488  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

select object_type, object_name
from
 t1 where rowid = :vrowid

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      2      0.00       0.00          0          0         13           0
Fetch        2      0.00       0.00          0          1          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        5      0.00       0.00          0          1         13           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 56

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY USER ROWID T1 (cr=1 pr=0 pw=0 time=71 us)

Elapsed times include waiting on following events:
  Event waited on                            Times   Max. Wait  Total Waited
  ----------------------------------------  Waited  ----------  ------------
  SQL*Net message to client                      3        0.00          0.00
  SQL*Net message from client                    3        1.58          1.58
********************************************************************************

What does this tell me? I got the data in 71 microseconds. That's fast. Now, I'm leaving out all of the recursive work that Oracle did to execute the command.

Let's look at some more timing information.

mrskew.exe \stage\*.trc > H:\tmp\x.txt

CALL-NAME                     DURATION       %  CALLS      MEAN       MIN       MAX
SQL*Net message from client  11.256462   98.9%      4  2.814115  0.001859  9.667408
EXEC                          0.120000    1.1%      3  0.040000  0.000000  0.070000
log file sync                 0.001551    0.0%      2  0.000776  0.000513  0.001038
SQL*Net message to client     0.000022    0.0%      4  0.000005  0.000004  0.000006
PARSE                         0.000000    0.0%      2  0.000000  0.000000  0.000000
FETCH                         0.000000    0.0%      2  0.000000  0.000000  0.000000
TOTAL (6)                    11.378035  100.0%     17  0.669296  0.000000  9.667408

Using mrskew from Method R, I get some more interesting information from the trace file. What is this telling me? SQL*Net message from client was the big wait event, taking 98.9% of the time; that's time spent waiting on the client and the network. After that, EXEC was my next highest component, with 3 calls taking 1.1% of the time. BTW: this data includes all recursive SQL. So what does the EXEC timing look like?

mrskew.exe \stage\*.trc --name="EXEC" --ebucket > H:\tmp\x.txt

RANGE {min <= e < max}            DURATION       %  CALLS      MEAN       MIN       MAX
   0.000000      0.000001         0.000000    0.0%      1  0.000000  0.000000  0.000000
   0.000001      0.000010         0.000000    0.0%      0
   0.000010      0.000100         0.000000    0.0%      0
   0.000100      0.001000         0.000000    0.0%      0
   0.001000      0.010000         0.000000    0.0%      0
   0.010000      0.100000         0.120000  100.0%      2  0.060000  0.050000  0.070000
   0.100000      1.000000         0.000000    0.0%      0
   1.000000     10.000000         0.000000    0.0%      0
  10.000000    100.000000         0.000000    0.0%      0
 100.000000   1000.000000         0.000000    0.0%      0
1000.000000      Infinity         0.000000    0.0%      0
TOTAL (2)                         0.120000  100.0%      3  0.040000  0.000000  0.070000

Life of an Oracle DBA – The Basics Access Paths: Full Table Scan

For the next few weeks I am going to be going over some of the basics of what you need to know to be a good Oracle DBA. Feel free to speak up if there is something you think I should cover. I should let you know that fast is not always cheap and cheap is not always fast. Think of cheap as Oracle using as few resources as possible to satisfy the query. But if you are using parallel query, then Oracle sets the goal to finish the query as fast as possible and to heck with the resources. Now be careful here: you don’t want to be running an OLTP application with 70+ thousand people hitting it and all of them running parallel query. If you do, you are going to starve the machine of CPU. I’ve seen it happen, and it’s not pretty.

Yes Virginia there is a Maria Colgan who is always happy to explain the difference between cheap and fast. If you don’t know Maria, she is one of the smartest people in the Oracle community and heads up the Oracle Optimizer Group. Her blog is at: https://blogs.oracle.com/optimizer/

For my first set of installments I am going to go over access paths (how does Oracle get to the data you want.)

Full table scan, otherwise known as “Dang it, why won’t Oracle use my indexes?” There are a number of factors the optimizer considers to select the best access path to the data:

What predicates have been passed in the query?

What indexes exist?

What degree of parallelism is being used?

What is db_file_multiblock_read_count set to?

What hints have been put on the query?

Cardinality – you are going to be reading a lot about that, so start studying.

The Oracle cost based optimizer determines the fastest or most efficient way to access data, and sometimes the optimizer will select a full table scan. If you pass a predicate in the query that is not indexed, then Oracle has no choice but to do a full table scan. If the cost based optimizer decides it is cheaper to do a full table scan than to use an index, then Oracle will do the full table scan. I know this is hard to believe, but it’s true. Here is a simple example: if Oracle needs to read 100 blocks of an index and then do another 100 I/Os against the table to get your data, as opposed to doing a full table scan of 150 blocks, the cost based optimizer may decide it would just be cheaper to do the full table scan.

SQL> create table t1 as select * from dba_objects;

Table created.

SQL> analyze table t1 compute statistics;

Table analyzed.

SQL> create index t1_idx on t1(object_type);

Index created.

SQL> analyze index t1_idx compute statistics;

Index analyzed.

SQL> set autotrace on

SQL> select owner, object_name
  2  from t1
  3  where object_type in ('TABLE','INDEX');

————snip a lot————-

--------------------------------------------------------------------------
| Id | Operation         | Name | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |      | 2440 | 90280 |   109  (34)| 00:00:01 |
|* 1 | TABLE ACCESS FULL | T1   | 2440 | 90280 |   109  (34)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_TYPE"='INDEX' OR "OBJECT_TYPE"='TABLE')

Hey Maria, Oracle forgot that I had an index on T1. Waaaaaaaa. Oh, it’s cheaper to just read the whole table.

select /*+ index (t1 t1_idx) */
       owner, object_name
  from t1
 where object_type in ('TABLE','INDEX');

—————————————————————————————

---------------------------------------------------------------------------------------
| Id | Operation                    | Name   | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT             |        | 2440 | 90280 |   121   (3)| 00:00:01 |
|  1 |  INLIST ITERATOR             |        |      |       |            |          |
|  2 |   TABLE ACCESS BY INDEX ROWID| T1     | 2440 | 90280 |   121   (3)| 00:00:01 |
|* 3 |    INDEX RANGE SCAN          | T1_IDX | 2440 |       |    10  (10)| 00:00:01 |
---------------------------------------------------------------------------------------

Hey Maria, I got Oracle to use my index but it cost more. Waaaaaaa. Oh, I had to do more I/Os to get my data.

As db_file_multiblock_read_count increases, the cost of a full table scan decreases. As parallelism increases, Oracle shifts from executing the query as cheaply as possible to executing it as fast as possible. If you put hints in your SQL query, those hints will be used instead of what the Oracle optimizer believes would be the best way to access the data.
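
If you want to see what your system is using, a quick check (the value itself is platform and configuration dependent):

show parameter db_file_multiblock_read_count

-- or query it directly:
select value
  from v$parameter
 where name = 'db_file_multiblock_read_count';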

When Oracle does a full table scan, blocks are read into the buffer cache and rows that don’t meet the predicate are filtered out.

Life of an Oracle DBA: Big Data restore

Over the years, my opinion of what big data is has changed. Way back in the VAX days, a 1G database was not only big data, it was fricken huge data. Now words like gig, tera, and peta are thrown around casually. Exa is still reserved for really fricken big.

Now, back to the restore. The customer has a data warehouse that has been populated with many years' worth of data, and the consumers of the data are always asking for more. The easy way to keep a backup of this massive amount of data is to create a standby database. So we have a standby database, archive logs are shipped to it, and I get a report every morning on the health of the standby. Everything is hunky-dory, or so I thought.

The customer was doing a big data push, and for reasons that are beyond my control, it was decided that Data Guard would be shut down. I resisted and was overruled. After two months of the standby database being shut down, I was finally told to bring the standby instance back in sync with the primary database. Time for the gremlins to come into the picture. Issues I faced:

  1. A DBA did an open resetlogs on the primary database instance. I learned this after copying 2.5T of archive logs down the pipe.

  2. So, I did a diff of the database, starting at the oldest SCN in the standby database (essentially an incremental backup from that SCN; see the sketch after this list). Again, after copying down 2.5T of data, one of the managers got a bit anxious and decided to break the SAN replication before the files were finished replicating. The files were used to do an RMAN recover of the standby database, and it was a failure.

  3. Do a complete cold backup of the data warehouse.

    1. Send some large fast drives to the data center.

    2. Encrypt the drives

    3. Courier the drives to the DR site

    4. Decrypt and uncompress the backup

    5. Restore the backup

    6. scp the archive logs to the DR site.

    7. Recover the standby database

  4. Have a beer and get some sleep.
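
For reference, the roll-forward mentioned in item 2 is the standard "incremental from SCN" approach. A hedged sketch of the steps, with the SCN, paths, and tag as placeholders:

-- On the standby: find the SCN to roll forward from
select min(checkpoint_change#) from v$datafile_header;

-- On the primary, in RMAN: back up everything changed since that SCN
--   RMAN> backup incremental from scn 1234567890 database
--           format '/stage/stby_fwd_%U' tag 'STBY_FWD';

-- On the standby, in RMAN, after shipping the backup pieces over:
--   RMAN> catalog start with '/stage/';
--   RMAN> recover database noredo;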

Lessons learned:

  1. Under no circumstances shall anyone shut down the standby database.

  2. Under no circumstances shall anyone do an open resetlogs on any database that has a Data Guard instance.

  3. Do a cost-benefit analysis of shipping drives out to the primary site before doing long copies over a pipe that is shared by many. For turnaround time, ship the drives out. We have now decided to keep a set of drives at the data center so we can do a much quicker restore of the database. The decision you make will be based on your patience and your pain tolerance for having a bad standby database.

    1. If there is a handful of archive logs missing, scp is a wonderful thing.

    2. If there are many more than a handful of archive logs to transfer, put them on the drives and ship them to the DR site to recover the standby database.

Life of a Oracle DBA. What is choking my Oracle database server?

The week was dominated by tuning. The database server was pegged at 100% CPU all week, and I have tuned the offending queries to within an inch of their lives. The customer wants to know why this is happening, and I'm asking the business end what is driving more users to be active on the website.

Early in my career I learned about the top command. With top I learned a few things:

CPU is pegged at 100%. In this example the CPU has been pegged all day long, and the user experience (UE) is suffering.

ldap (ns-slapd) is the top single process.

Oracle owns all of the other top processes.

load averages: 90.29, 92.29, 76.555                                     13:48:45
390 processes: 297 sleeping, 76 running, 17 on cpu
CPU states: 0.0% idle, 95.7% user, 4.3% kernel, 0.0% iowait, 0.0% swap
Memory: 32G real, 1285M free, 28G swap in use, 5354M swap free

  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 1069 ds5user   45  32    0    0K    0K run     31:26  3.04% ns-slapd
 5416 oracle    11  23    0    0K    0K run      4:14  1.15% oracle
 4322 oracle    11  22    0    0K    0K run      7:59  1.13% oracle
 4916 oracle    11  22    0    0K    0K run      4:41  1.11% oracle
 5937 oracle    11  23    0    0K    0K run      1:05  1.10% oracle
 5494 oracle    11  22    0    0K    0K run      2:13  1.09% oracle
 4910 oracle    11  22    0    0K    0K run      4:59  1.08% oracle

Let's dig into a few of the top Oracle processes and see the query that is running.

SQL> @pid 5416
old  10: and spid = &1
new  10: and spid = 5416

SPID            SID    SERIAL# USERNAME   SQL_ID
------------ ------ ---------- ---------- -------------
5416            953         55 IFSUSER    butxjxuvw2m3y

Here is the pid.sql query I use.

SQL> l
  1  select p.spid, s.sid, s.serial#, s.username, s.sql_id from v$process p, v$session s where s.paddr = p.addr and s.status = 'ACTIVE' and spid = &1
SQL>

So, sql_id butxjxuvw2m3y is my top CPU consumer. I want to see what the query is and what the plan for the query is.
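
One quick way to pull both the text and the plan from the cursor cache, given the sql_id (the format option here is just the basic one):

select *
  from table(dbms_xplan.display_cursor('butxjxuvw2m3y', null, 'TYPICAL'));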

SQL_ID butxjxuvw2m3y
--------------------
select .. from VINBOX this_ where (this_.STATUS<>:1 and this_.RECEIVED>=:2 and (this_.ASSIGNEE=:3 or (this_.RESO in (:4, :5, :6, :7, :8, :9, :10, :11, :12, :13, :14) and (this_.ASSIGNEE is null)))) order by lower(this_.RECEIVED) desc

Plan hash value: 3079052936

------------------------------------------------------------------------------------------------------------
| Id | Operation                            | Name              | Rows | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                     |                   |      |       |       |  6300 (100)|          |
|  1 |  PX COORDINATOR                      |                   |      |       |       |            |          |
|  2 |   PX SEND QC (ORDER)                 | :TQ10001          |  873 |  122K |       |  6300   (8)| 00:00:13 |
|  3 |    SORT ORDER BY                     |                   |  873 |  122K |  312K |  6300   (8)| 00:00:13 |
|  4 |     PX RECEIVE                       |                   |    1 |    77 |       |     3   (0)| 00:00:01 |
|  5 |      PX SEND RANGE                   | :TQ10000          |    1 |    77 |       |     3   (0)| 00:00:01 |
|  6 |       TABLE ACCESS BY INDEX ROWID    | INSTANCE          |    1 |    77 |       |     3   (0)| 00:00:01 |
|  7 |        NESTED LOOPS                  |                   |  873 |  122K |       |  6270   (8)| 00:00:13 |
|  8 |         NESTED LOOPS                 |                   |  871 | 58357 |       |  3831  (11)| 00:00:08 |
|  9 |          PX BLOCK ITERATOR           |                   |      |       |       |            |          |
| 10 |           INDEX FAST FULL SCAN       | FORM_IDX06        |  871 | 36582 |       |  1185  (33)| 00:00:03 |
| 11 |          TABLE ACCESS BY INDEX ROWID | CUSTOM_FORM       |    1 |    25 |       |     3   (0)| 00:00:01 |
| 12 |           INDEX RANGE SCAN           | CUSTOM_FORM_IDX03 |    1 |       |       |     2   (0)| 00:00:01 |
| 13 |         INDEX RANGE SCAN             | INSTANCE_IDX03    |    1 |       |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

I did this for a few of the top processes and found everyone is clicking on the inbox link. Now this has been tuned in the past; so how am I going to deal with this? Well first thing, as the app sits right now, a person can click on the link over and over again. We are putting in a change to disable the link until the inbox comes up. This will save me from users that keep clicking the link when they don’t get their inbox fast enough. The next thing I need to do is move through the query and see where I can optimize it.

Now let's look to see what the user load is on Oracle.

SQL> get users
  1  select username, sql_id, count(*) from v$session where status = 'ACTIVE' group by username, sql_id order by 1,3
SQL> /

USERNAME   SQL_ID          COUNT(*)
---------- ------------- ----------
APPUSER    9vztbhnd0r47b          1
=================SNIP==============
APPUSER    4c176cpxqvyrs          5
APPUSER    24mdr8u4u6vyz         10
APPUSER    3dkmsrzyjbuj5         44
RLOCKARD   5a921h5xhxxcg          1
           8820rg9jt4c14          2
                                 25

18 rows selected.

There are two versions of the inbox query running right now, totaling 54 sessions. Well, the inbox is fairly static during the day; can we use a materialized view? There is a cost: what type of refresh should we be using? One hour, 30 minutes, 5 minutes?
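
If we go the materialized view route, the shape of it would be something like this; the name and the 5-minute refresh interval are placeholders for the discussion, and the defining query would be the tuned inbox SQL.

create materialized view inbox_mv
  build immediate
  refresh complete
  start with sysdate next sysdate + 5/1440  -- complete refresh every 5 minutes
as
select * from VINBOX;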

There will come a time when the hardware you have in place will not support the user load. In the past month we have seen a 300% increase in the number of queries hitting the database and a 100% increase in the user base.

The customer wants a breakdown by date/hour/user/sql_id/execution count to see what is hammering the system. From this report I can look at the date, time of day, and user, and find out what query is getting executed and how many times. This is one of those reports that is very verbose, so I spool it to a file and bring it into Excel to present to the customer. I also ran the same report without sql_id to show date/hour/user/SQL count against the database. Both of these reports will give you a week's worth of data to help you drill down into usage times.

set linesize 256
set pagesize 10000

select to_char(trunc(sample_time,'HH'),'MMDDHH24:MI'), username, sql_id, count(*)
  from DBA_HIST_ACTIVE_SESS_HISTORY h, dba_users u
 where h.user_id = u.user_id
   and u.username = 'IFSUSER'
 group by to_char(trunc(sample_time,'HH'),'MMDDHH24:MI'), username, sql_id
 order by 1,2;

081709:00  APPUSER  0ut13pyz2862b   1090
081709:00  APPUSER  624p6ufu0vf67   1590
081709:00  APPUSER  0887h37jcvrq0   1677
081709:00  APPUSER  3dkmsrzyjbuj5  14854

Well the sql_id 3dkmsrzyjbuj5 is executing a lot so I wanted to dig into that to see what it is. No surprise, it’s the inbox query … AGAIN.

We have ordered more hardware to put the database on. It's bigger: more CPUs, more memory, yada yada yada. When the hardware is in, we will migrate this database from Solaris / Oracle 10g to Linux / Oracle 11g.