From:
Richard2011
Views: 14
Comments: 0
At Exampdf, people can get all the required exam information for Oracle 1Z0-007 test, which cover all the knowledge points of the actual test.
Passcert offer the latest 1Z0-007 certification exam prepare material and 100% ensure pass exam in the first attempt or will give a full defund.still now Stop wasting time and money re-taking failed certification exams and start becoming more produc (more)
Passcert offer the latest 1Z0-007 certification exam prepare material and 100% ensure pass exam in the first attempt or will give a full defund.still now Stop wasting time and money re-taking failed certification exams and start becoming more productive. (less)
Slide 1: Applying a Blockcentric Approach to Oracle Tuning
Overview
In recent years Oracle tuning methodology has shifted away from ratios and percentages to waits and events. The new methods focus on the key component to user satisfaction, response time. By determining elapsed time for each set of operations and identifying the bottlenecks, system performance can be improved, with a corresponding increase in user satisfaction. This paper presents a new method for application tuning that shifts the focus away from rows and towards blocks. Simply stated, tune by reducing the number of blocks that a statement must read. It is built upon a foundation of solid research from the leaders (see references) in Oracle tuning and troubleshooting. The presentation is very high level and will not provide all the answers. In fact, the intent is to make you begin to ask questions, and provide some guidance in finding the answers that are right for your database. “Trust, But Verify.” Your current perceptions of tuning may be challenged. Be patient, kind reader, and all shall be made clear (I hope). In the meantime, indulge my tuning heresy. “Burn him. Burn him at the stake. He speaks not the truth”
Oracle’s I/O Model
The cornerstone of the Blockcentric Approach is the Oracle I/O subsystem, how it reads and writes data. Most of the work that the database does is the storage and retrieval of data. Let's set the foundation (of nice dry, burnable wood) for the approach.
It’s The Block
When a query requests a single-byte column from a single row in a table, the block is read in its entirety. In order to make large read operations more efficient, Oracle can request multiple blocks from disk with a single operation. These multiblock reads are most often associated with full table scans, but are also used with full index scans. Full table scans read ALL blocks from the segment header up to the high water mark, the last block ever formatted to hold data.
Logical Reads
A logical read is one where the block is already resident in memory. However, it is not as simple as looking up a memory address. The data block address must be determined and then hashed to a lookup value. Latches must be acquired, chains must be walked, additional latches allocated, if possible. If a read consistent version must be created from undo, the undo block(s) must be read using the same process and the undo applied.
Physical Reads
A physical read is where the block is not found in memory and must be requested from the storage subsystem. Many of the current disk storage systems do not always issue a disk read request for each block read request
Slide 2: that Oracle issues. Blocks may be stored in hardware caches or prefetched while Oracle is processing the block and preparing the next request.
Block Sizes
As databases have grown, so have standard block sizes. Not too many versions ago, many databases used 2k for a block size. Now 8k is the defacto minimum, with some braving the waters of 32k. Larger block sizes may mean more data per block and a higher density of rows. This allows for more rows to be read with a single I/O request. While this has many benefits, it also has implications for tuning.
Mythconceptions Buffer Cache Hit Ratio
The higher the buffer cache hit ratio the better the statement This Buffer Cache Hit Ratio (BCHR) has long been a keystone of the tuning foundation. Although many performance specialists regard this measure as something less than useful, it still persists among texts, documentation, papers and other sources. When faced with a less than optimal BCHR, the usual response was to increase the block buffers. If the ratio was still not good enough, add more memory so you could increase the amount devoted to block buffers. Unfortunately, there is little correlation between this ratio and the user’s satisfaction with system performance. This ratio does not measure the response time. When faced with a slower than desired query, a user is rarely satisfied with the response that it is "as good as it can be since the statement’s BCHR is 99.999%". In reality, this does not address the user issue. A statement with a lower BCHR may perform better If you compare two statements that return the same data and one has a higher BCHR, is this the better performer? Not necessarily! (“Burn him…”) The key is to compare the amount of resource that each statement consumes, as measured by i/o operations, both logical and physical. By calculating the comparative costs, the less resource intensive one can be identified. In order to determine statement cost, we need to assign costs to each of 2 components, logical i/os and physical i/os. Please note that we are not addressing either the parse nor execute phases of sql. Testing and research have shown that a read from memory is not nearly as expensive as has been accepted. How could this be? A logical i/o is not simply reading from disk, but it requires cpu processing, latch acquisition, consistent read generation, etc. A physical i/o does not necessarily require a disk read as the block may already reside in the disk cache. In reality, physical I/O operations are only 40 times as expensive as logical I/Os, not 10,000 times as is commonly accepted. Where does 40 come from? This is based upon data gathered and analyzed by Hotsos, et.al. The data is not tied to a particular Oracle release or server platform. Several independent sources have similar numbers, so the data can be presumed to be accurate.
Slide 3: The intent is not to provide a 'HERE IS THE MAGIC NUMBER' solution, but rather a simple method for calculating relative costs. There is a relatively simple formula for determining statement cost in terms of I/O. Statement_cost = logical_ios + (physical_ios * 40). Let’s apply this formula to two statements that return the exact same data. If the first statement has a 99% BCHR and does 6000 logical and 60 physical reads, while the second statement has a 75% BCHR and 400 logical and 100 physical reads, the second statement will require less resource and should run faster. To calculate the cost of each statement, multiply the number of physical reads by 40 and add to the number of logical reads. Using this method, the first statement will cost 8400 (6000 + (60*40)) while the second costs 4400 (400 + (100*40)), thus indicating that the second statement is actually better performing. Identifying Tuning Candidates There are 2 ways to determine the statements most in need of tuning. You can wait until the line of disgruntled users is beginning to form outside your cubicle or you can look at what is currently being executed in the database. Many tuning tools will allow you to see the current sql statements with statistical information. V$SQLAREA contains the columns that will provide you the information you need. If you sort according to the sum of DISK_READS and BUFFER_GETS (in descending order), the most 'expensive' statements will be listed first. In the example below, the first statement listed is consuming almost 20 times the resource as the second 'worst' statement. Also notice that the top 4 most resource intensive statements have a 100% Buffer Cache Hit Ratio…all buffer gets and no disk reads!
select address, substr(sql_text,1,50), disk_reads disk, buffer_gets buffer, executions execs from v$sqlarea where (disk_reads + buffer_gets) > 100000 order by (disk_reads+buffer_gets) desc; ADDRESS ---------------0000000387DFD0B0 00000003889FC7C0 0000000384327720 00000003855230D8 0000000389C1A098 0000000387E59AE0 0000000389816890 0000000387E3AFF0 0000000389C62170 0000000389B22088 SUBSTR(SQL_TEXT,1,50) DISK BUFFER EXECS -------------------------------------------------- ----- -------- ------DELETE FROM USER_INFORMATION_V WHERE CONNECTED_USE 0 99857399 85968 DELETE FROM CONNECTED_SESSION WHERE ID = :1 0 5059551 4324 DELETE FROM USER_INFORMATION_V WHERE CONNECTED_USE 0 5049585 4324 select count(1) from "VIRT"."USER_INFORMATION" whe 0 5020164 4324 SELECT di.db_name from global_name gn, db_name 8 3012275 501844 BEGIN CreateNewUserAndConnection_v3(:1, :2, :3, : 69 2825583 30550 INSERT INTO ACTIVE_USER_V (ID, CON 50 2235420 30550 UPDATE concurrent_users_v set running_co 22 1760499 33152 select count(*) from dual 4 945086 315018 SELECT user_session_seq.NEXTVAL FROM DUAL 0 165386 54359
Index Usage
If more than 15% of the rows are returned use an index This ratio has long been one of the most important considerations to create and use an index. If the number of rows to be returned by the query was less than 15% (5% for parallel query), then it was ‘more efficient’ to use an index. This is a result of reading several index blocks to locate the requested value and the number of blocks that can be read at a single time with multiple blocks.
Slide 4: The number of rows to be read is not a valid criteria , as Oracle does not read rows. The rows of interest could be clustered together, scattered amongst many blocks or have chained/migrated so that each row requires several blocks to be read. The percentage of blocks read is the criteria Remember, when a query retrieves rows, it reads blocks. If the same number of rows are clustered in a few blocks or scattered amongst most of the blocks, different access paths are indicated. If the table is badly fragmented with many empty blocks below the high water mark, index access paths may produce better performing statements. It may be better to use an index to return 100% of the rows and a full table scan to return less than 1% of the rows. (“Sniff, sniff…is that burning wood I smell?”) The more telling criteria are the number of blocks to be returned. Ask the question, "How many blocks must this statement read in order to satisfy the query?" Answer the question in proportion to the number of blocks in the table that are below the high water mark, not the number of blocks that contain data. There are three areas that highlight this issue. The examples below are from a combination of tests, using a 1 million row table. Table Block Fragmentation Table Block Fragmentation occurs when there are empty blocks in the table below the high watermark. The usual culprits are massive deletes and failed transactions. When rows are deleted and blocks become 'empty' they are still allocated to the table so that they can be reused. If the table contains a large proportion of empty blocks below the high water mark, an index may be the best choice, even when all of the rows are to be returned by the query. If a full table scan is performed, many of the blocks read will be empty of data. An index lookup will only read those blocks that actually contain data. Retrieve 100% of the Rows using Full Table Scan Using an Index I/Os 14,769 8,102
As you can see from the above example, with a badly fragmented table, using an index instead of a full table scan consumes 45% less I/O. The table had 1 million rows inserted, then all but 30,000 rows were deleted so that all the contents of individual blocks were removed, leaving completely empty blocks while others were filled with data. When a table has a high high watermark, the I/O savings are significantly higher. Even though a small fraction of the blocks actually contain data, the highwater mark is set as if the table contained 10 times the data. Retrieve 100% of the Rows using Full Table Scan Using an Index I/Os 13,406 1,560
In this example, the difference in I/O is very significant, with an Index read consuming almost 1/10 the amount of I/O that a Full Table Scan uses. This table had 1 million rows inserted and then all but 10,000 rows were deleted. All the rows were in the blocks at the ‘beginning’ of the segment, in the same extent as the segment header.
Slide 5: Scattered Data If the data of interest is scattered among a high proportion of the blocks, a full table scan may be a better performing access path even when a very small fraction of the rows are to be returned. One key factor to consider is the density of the rows within blocks. For a very small row size with a large block size, scattering becomes more common. Retrieve 1% of the Rows using Using an Index Full Table Scan I/Os 13,451 13,180
In this example, the full table scan performed fewer I/Os. The I/O savings are not nearly as dramatic as the other examples, but it does indicate that the accepted rule of indexing does provide correct answers at all times. Because the data is scattered, all of the table blocks (and a significant number of index blocks) must be read. Clustered Data If the data of interest is clustered, index scans may produce better results. Depending on the health of the index, an index may be more efficient with over 15% of the rows. Clustering can occur naturally, as in the case of daily sales records, or it may be artificial, as in the case of partitioned tables. Retrieve 7% of the Rows using an Index Data Scattered Data Clustered Data Partitioned (Scan of Partition) I/Os 7,282 7,122 3,751
In this example, the clustering of data made a small (< 5%) improvement in the number of I/Os. Oracle provides partitioning as a method to enforce data clustering and enable the sql engine to take advantage of it. Here, the number of I/Os drops by almost 50% by scanning the table partition.
Determining Block Selectivity
To determine the block selectivity for either a table or a data value, you need to determine how many blocks are below the high watermark and how many blocks are used by data. The tables must be analyzed and you must be able to execute the dbms_rowid package. In the user_table view, the column blocks indicates the number of blocks below the high watermark. This is the number of blocks read by a full table scan. There is one condition that makes it very difficult to determine the block selectivity of individual data values. If a row is chained or migrated, the dbms_rowid package only reads the rowid of the head piece. If the value of chain_cnt in the user_table view is high, it may cause the calculations to be incorrect.
Identifying Table Block Fragmentation
For each table, use the following query to determine the number of blocks containing data and those that are below the high watermark. Replace sequential_data with the table name that you want to examine.
select count(dbms_rowid.rowid_block_number(s.rowid)), blocks from sequential_data s, user_tables
Slide 6: where table_name = 'SEQUENTIAL_DATA' group by blocks;
In the below example, the query was executed for 3 different tables. It indicates that a full table scan of hwm_data will read 6483 blocks, while only 76 of the blocks actually contain data, while a full table scan of sequential_data will access very few empty blocks. Table sequential_data frag_data hwm_data
Blocks In Use Blocks below HWM % Of Used Blocks
6348 1967 76
6406 6483 6483
99% 30% 1%
Identifying Block Selectivity for Data Values
In the past, indexing was based upon the % of Rows rule. However, this is not a reliable indicator as we have seen. It is better to identify the number of blocks to be read for a given data value. Using the num_rows and blocks columns from the user_tables view, we can determine the block selectivity for a given value. In the statement below, we are checking the block selectivity for the column small_random_num in the sequential_data table.
select t.small_random_num data_value, count(t.small_random_num) data_rows, num_rows, round(((count(t.small_random_num)/num_rows)*100),2) pct_rows, count(distinct dbms_rowid.rowid_block_number(t.rowid)) data_blocks, blocks, round(((count(distinct dbms_rowid.rowid_block_number(t.rowid))/blocks)*100) ,2) pct_blocks from sequential_data t, user_tables where table_name = 'SEQUENTIAL_DATA' group by blocks, num_rows, t.small_random_num;
The query provides the following data, which indicates that there is a slight discrepancy between the row level selectivity of a value and the block level selectivity. This indicates that the data value of 0 exists in almost 25% of the data blocks.
DATA_VALUE DATA_ROWS NUM_ROWS PCT_ROWS DATA_BLOCKS BLOCKS PCT_BLOCKS ---------- --------- -------- -------- ----------- ------ ---------0 152495 1000000 15.25 1564 6406 24.41 1 148751 1000000 14.88 1572 6406 24.54 2 147019 1000000 14.70 1584 6406 24.73 3 149802 1000000 14.98 1567 6406 24.46 4 132798 1000000 13.28 1396 6406 21.79 5 133659 1000000 13.37 1409 6406 22.00 6 135476 1000000 13.55 1424 6406 22.23
If we execute the same query (modified for the new table) against the frag_data table, the results are quite different. In this case, the data value of 0 exists in 5% of the blocks.
DATA_VALUE DATA_ROWS NUM_ROWS PCT_ROWS DATA_BLOCKS BLOCKS PCT_BLOCKS ---------- --------- -------- -------- ----------- ------ ----------
Slide 7: 0 1 2 3 4 5 6
4431 4410 4753 4755 3887 4105 4139
30480 30480 30480 30480 30480 30480 30480
14.54 14.47 15.59 15.60 12.75 13.47 13.58
322 325 350 347 284 297 291
6483 6483 6483 6483 6483 6483 6483
4.97 5.01 5.40 5.35 4.38 4.58 4.49
Examining a different data value in the sequential_data table, where the data value is very, very scattered, the results are quite dramatic. Even though the individual values (not all shown) compose an extremely small percentage of the rows, the value exists in almost every block in the table!
DATA_VALUE DATA_ROWS NUM_ROWS PCT_ROWS DATA_BLOCKS BLOCKS PCT_BLOCKS ---------- --------- -------- -------- ----------- ------ ---------0 6348 1000000 .63 6348 6406 99.09 1 6348 1000000 .63 6348 6406 99.09 2 6348 1000000 .63 6348 6406 99.09 3 6348 1000000 .63 6348 6406 99.09 4 6348 1000000 .63 6348 6406 99.09 5 6348 1000000 .63 6348 6406 99.09 6 6348 1000000 .63 6348 6406 99.09 7 6348 1000000 .63 6348 6406 99.09 8 6348 1000000 .63 6348 6406 99.09 9 6348 1000000 .63 6348 6406 99.09 10 6348 1000000 .63 6348 6406 99.09
Conclusion
I realize that I have presented no definitive answers to tuning and probably generated far more doubt and questions. If I have, then I have succeeded. At some point, you have probably raised your eyebrows and said, “That is WRONG!” My response, “Prove it!” No, seriously, set up a test case, run it, send me the results. Of course, if your test case proves me right, send it to me as well. Any and all information related to this approach will help the Oracle community. I can be reached at optimaldba@yahoo.com and the ongoing results of the research will be posted at www.optimaldba.com.
References
The number of papers, presentations and books are too numerous to list individually. Instead, here is a list of websites and 1 book that has provided valuable information and research. I am deeply indebted to the authors for their hard work that I have synthesized within this article. Oracle Performance Tuning 101 – Vaidyanatha, Deshpande & Kostelac. THE performance tuning manual. “Learn it, Know it, Live it.” www.hotsos.com - Cary Millsap’s website. His Oracle Internal’s presentation at IOUG-A Live 2002 provided me with the ‘Ah-Ha’ moment that inspired this approach. www.evdbt.com - Tim Gorman’s site, with great contributions by Jeff Maresh. Excellent research, papers and presentations.
Slide 8: www.orapub.com - Craig Shallahamer’s site. Another of the gurus that consistently provide new ideas backed with solid research.