Thursday, June 3, 2010

Performance tuning in Oracle


                During my previous company project we faced lots of issues regarding performance of a sql query and after lots of debugging, goggling we came with some findings, which we have implemented in existing queries and was able to improve the performance of those existing non performing sql queries.

So here am going to share/discuss all those findings we have, which I think every developer should consider in performance point of view while writing any sql query

Performance tuning parameters which we are going to discuss are
• Index
• Table Partitioning
• Parallel Processing
• Where clause
• Join conditions
• Function
• Explain Plan used for debugging purpose.

Index

In Oracle what does it mean by index on table?

As a name itself indicates it’s nothing but  a sequence number given to an entity from the collection/bunch of entity to recognize correct and exact entity by using some logic .In oracle point view when we create an index on any of the field ,oracle creates/gives a number entry (may be unique) for that particular record. So when we start searching that field record from table which is containing large amount of data in it and if we have an index created on that particular field it becomes very fast to get that particular record .Oracle itself maintains various algorithms to search index field record entry from table data.
So whenever performance issue comes in oracle database, first we have to check whether indexes are created or not  on the table fields which is most commonly going to be used in where clause. If we take real time example, in book library there are number of books present and each book has particular code written on it, what that code is? It is the index created by librarian to search those books faster than normal case. Index creation format chosen by librarian may be of any type of his convenience and same applies with search algorithm as well .In Oracles as well it has its own standards and methods of procedures for  index creation and its searching algorithm .User need not  to worry about it .
We can create an index on any of field from database table,

The syntax for creating a index is:

CREATE [UNIQUE] INDEX index_name
ON table_name (column1, column2, . column_n)
[ COMPUTE STATISTICS ];
UNIQUE indicates that the combination of values in the indexed columns must be unique.

COMPUTE STATISTICS tells Oracle to collect statistics during the creation of the index. The statistics are then used by the optimizer to choose a "plan of execution" when SQL statements are executed.

For example:
CREATE INDEX supplier_idx
ON supplier (supplier_name);

In this example, we've created an index on the supplier table called supplier_idx. It consists of only one field - the supplier_name field.

We can create composite index also, means index created on two different table fields.

CREATE [UNIQUE] INDEX index_name
ON table_name (function1, function2, . function_n)
[ COMPUTE STATISTICS ];

There is more number of ways to create, rename as well as drop an index.

Now where should we use indexes  and where should not
·         If we are not using any where clause in the sql query , then there is no need in wasting time on indexing.
·         When table contains small amount of the data , then there is no need of using indexing.
·         When table contains huge amount of the data ,put index on the field which is being used in where clause.
·         Don’t create indexes on all the fields present in the table because it’s difficult for oracle to maintain synchronization between all the index fields present in table and also it adds the time value difference in scanning ,so in these cases we loss the basic purpose of indexing.
  
Table partitioning

What does it mean by table partitioning?
let’s discuss direct technically, as it’s very easy in understanding the concept behind it..!!

Consider we have a table named as CallDetail having large amount of the data inside it and it is not partitioned . Daily millions of the records get added to it .Now consider we want to generate a report from table data for a specific period of time. let’s say we have table data for last 3 months and we want to generate a report for a first month .So if we written an sql query  on the table which is not partitioned ,it will search whole data present inside the table i.e. 3 months of data, which doesn’t make a good sense . why should we search last 2 months data ? , if we are explicitly only looking for first month of data, here we are clearly wasting our time and resources for looking into unwanted data.
Here partitioning comes into picture .Oracle provides different types of partitioning features for oracle user .In example If we partitioned CallDetail table on the table field called “received_date” which is of data type as Date and we are partitioned table on month basis. In this situation Oracle internally split table into three parts , each part represent one month of the table data .Its explicitly an internal operation performed by database itself and use don’t have to do anything else and externally it looks like a single table .Now If we run the same sql having where clause on received_date field , oracle internally went to scan only first table (table for first month) and didn’t went for rest two months data scanning  . In this case we are definitely going to save time which were initially spend for scanning unwanted two months of table data.

Different types of the table partitioning methods provided by Oracle.

• Range Partitioning
• Hash Partitioning
• List Partitioning
• Composite Partitioning

You can get the syntax and examples for the same @ below mentioned links.


Few point to be consider while using table partitioning…

• Table partitioning generally created / used on the table having large amount of table data.
• How  take the advantages of the partitioning , it’s all depends on end user. If we created partitioning on some of the field , but never used that field in where clause , then it doesn’t make any sense
    having table partitioned .
                • Partitioning generally created on the date field from table.

Parallel Processing

It’s quite simple to discuss parallel processing as everyone is aware of its dictionary meaning, same applies here as well .In oracle parallel processing applies on the table level scanning.

For example :
select /*+ PARALLEL(transact, 2) */ gl.debitglcode, gl.creditglcode, SUM(transact.amount) amount
from appcrm.transaction transact;

In above example we are assigning two separate threads to scan table data in parallel way, so it will definitely  save some amount of time than if it was assigned to a single thread.

Few points to be consider while using parallel processing…

• Always put/assign parallel processing on the table having large amount of data rather than table having small amount of data .
            • Don’t assign number of threads blindly as more number of threads more CPU it needs , which really  doesn’t good for any app­lication…So in short use this feature quite smartly .
                • Generally number of threads assign in parallel processing are in even number .

Where clause and Joins

“Where clause” is the most important part in any kind of the performance tuning , because generally most of the issues regarding performance are generated because of the incorrect use of where clause condition and always ends with modification in the where clause condition. All features mentioned above provided by oracle to user, but to use it correctly and smartly all depends on the use of the where clause . So always be cautious while using it and use it smartly.

Lets discuss this with an example,
Suppose I have a sql ,which uses two tables Calldetail (having large amount of data inside it ) and SubscriptionType (having very few/less amount of data)

Select c.ban , c.amount from
calldetail c ,
subscriptionType t
Where receiveddate > startDate
and receivedDate <  endDate
and t.id = c.id
and t.id = 0;

        If we saw above sql ,here we are using where clause condition on larger table first and smaller table later , means we are using larger table as Driver table .In this case it will first search the whole data from Calldetail table and then go and check condition on smaller table .So rather doing this if we check condition on smaller table first and then check condition on  larger table and because of this approach if condition checked on smaller table fails query wont execute condition put on larger table as we used “And” condition here.

        Similarly we are not putting alias name before mentioning received date so oracle will check from both the tables where that field present so if we explicitly mentions the table/alias name before mentioning any filed , that will also help to reduce the execution time.

        Similarly we haven’t put alias name before mentioning receiveDate so oracle will internally check for both the tables for the presence of receiveDate field. So In this case if we explicitly mention table/alias names before mentioning any table field, that will help to reduce execution time.

        If partitioning present on the tables always try to use partitioned field in where clause .it will definitely gives result set in faster manner .


So to get the result set in faster way, ideally query should be written as below

Select c.ban , c.amount
from calldetail c ,
subscriptionType t
Where t.id = c.id
and t.id = 0
and c.receiveddate > t.startDate
and c.receivedDate < t.endDate 
Functions and few more things

• When we have to perform some repetitive operations in sql query or in cursor , always try to use oracle/Database provided function. Oracle have mechanism to maintain a cache of the operation we performed so if we use the same value to perform same operation and if we use function ,we can take the benefits of the cache memory (I didn’t go deep regarding this feature so not confident about it , but functions always make our code more readable as well).
               
• If we are using some calculated value again and again , try to write stored procedure and use variable inside it to store calculated value in variable and use it rather than calculating again and again .

• In some "for looping " cases use "where clause" rather than using “For Looping”.


Explain plan for Debugging purpose

Whenever you face any issue regarding the performance of the sql always debug using “Explain plan for query” .It will provide you all detail information regarding the execution of sql like ,

• How many joins formed inside the sql.
• Which types of joins are formed in query result set.
• Is there any Cartesian product happening in query  (if its , then it’s clear indication that you wrote an incorrect sql )
• How much total time it will going to take to execute whole sql ,Also it provide all details regarding time it has taken to execute every single join.
• Is sql going for whole table scan or partition scan ?
You will get more information about what is “Explain plan for” and how to use it in my next blog J

3 comments:

  1. It's nice. Simple to understand. I would be happy to know more about Where clause dos and dont as it is very common.

    ReplyDelete
  2. need to look in to parallel threads..looks interesting..

    ReplyDelete