Performance Analysis and Tuning for DB2 UDB
Metron Technology Limited
This paper is based on the author's experiences in performance analysis and tuning for DB2 UDB on distributed platforms. The paper is intended as an introduction to DB2 performance, particularly for those who already have a grasp of relational database concepts through products like SQL Server or Oracle. The paper compares and contrasts DB2 and Oracle, describes the primary sources of DB2 performance and resource consumption data, and covers memory management, physical data storage, indexing & sorting, and locking. The paper concludes with a summary of some new features expected in DB2.
1. Introduction
The IBM engineers who were working on System R in the early Seventies would no doubt have been astonished had they been told that they were about to give birth to a huge world-wide industry worth many billions of dollars. Several database management systems have been developed, based on that original work by IBM. In what has become a bitterly competitive industry, the consensus of opinion is that only three major offerings will survive: SQL Server from Microsoft, Oracle from Oracle Corporation and DB2 UDB from IBM.
Microsoft's SQL Server is only available for Windows. Versions of Oracle are now available for the vast majority of platforms in use in the industry today. IBM take a middle course, with versions of DB2 available for their proprietary z-Series mainframes, Windows, and the major UNIX implementations including Linux, IBM's own AIX, HP-UX from Hewlett-Packard, and Solaris from Sun Microsystems.
This paper concentrates on DB2 UDB for distributed systems and is intended as an introduction to DB2 performance concepts for those with some experience of other DBMSs. It is by no means an exhaustive or complete picture, but highlights the topics that I have found to be most useful, and have the greatest potential impact on performance.
Section 2 compares and contrasts DB2 UDB with Oracle. No attempt is made to assess their relative merits. The choice of DBMS must, of course, be driven by the application and end-user requirements in each case.
Section 3 outlines the primary sources of DB2 performance and resource consumption data. The following four sections form the bulk of the material, covering memory management, physical data storage, indexing & sorting, and locking. The paper concludes with some important new features that will be provided in the upcoming release of DB2.
2. Oracle and DB2
There are many more similarities than differences between DB2 and Oracle. In common with many other areas of computing technology however, the use of similar terminology to mean different things serves to cloud and complicate the issue - particularly for the DBA who has to administer both products.
In Oracle, any given server can have multiple Oracle instances. The instance consists of the processes (SMON, DBWR etc.), the System Global Area (SGA) and the initialisation parameters. Each instance has a database that consists of the log, data and control files.
An instance in DB2 (also sometimes referred to as the database manager) means something quite different. In DB2 an instance is an occurrence of the DB2 software. Any given server can have multiple instances that could, for example, be running different DB2 release levels. The instance consists of the instance memory, processes (db2sysc, db2log etc.) and instance parameters. Each instance can have multiple databases. The database consists of the data and log files, the database memory (bufferpool, utilheap etc.) and the database parameters.
The primary memory structure for an Oracle instance is the System Global Area (SGA). This contains the Database Buffer Cache, the Log Buffer, and the Shared Pool (principally the library cache and the dictionary cache).
Records are read into the database buffer cache and are kept there whilst being modified. The reading and writing of pages to and from the buffer cache can occur asynchronously. The direct equivalent to this in DB2 is the bufferpool, although a DB2 database can have multiple bufferpools, each of which can have different page sizes (see section 4.1 for more on DB2 bufferpools).
The DB2 equivalent to Oracle's library cache (which is used to store parsed SQL) is the package cache. The DB2 equivalent to Oracle's dictionary cache (which is used to store database definition objects) is the catalog cache.
The primary Oracle background processes are: -
" PMON performs recovery when a process fails
" SMON performs instance recovery at start-up
" DBWR writes updated blocks from the buffer cache to the database files
" LGWR writes data from the redo log buffer to the redo logs
" ARCH copies redo logs to archive locations.
The primary DB2 background processes are: -
" db2sysc start-up/shutdown of an instance
" db2pfchr asynchronously reads data in the bufferpools from disk (pre-fetch)
" db2pclnr writes modified pages from the bufferpools to disk (page-cleaning)
" db2loggr writes the log buffer to the log file
" db2wdog performs recovery when a process fails (watch-dog)
" db2agent compiles and executes SQL and returns the result (co-ordinator agent)
" db2dlock finds and resolves deadlocks
2.4 Physical Data Storage
Both Oracle and DB2 use the term tablespace in a similar way. In Oracle a tablespace is a set of data files that hold the table data. In DB2 a tablespace is a set of containers, which can be mapped to file locations on disk, or an entire disk drive. DB2 allows the definition of two types of tablespace; these are described in more detail in section 5.
Extents are also used similarly in Oracle and DB2. In both cases an extent is a set of contiguous data blocks that is assigned to a table or index. A DB2 tablespace will usually have multiple containers, and DB2 will write one extent to a container before moving on to the next.
2.5 Redo and Rollback
Oracle has two logs - the rollback segments, which contain before images used for read consistency and instance recovery, and the redo logs, which contain after-images for roll-forward recovery. DB2 has one log structure that handles the equivalent of both redo and rollback. In both cases the performance of the logging subsystem is critical to the performance of the database as a whole, and the logs should be kept on their own disks, away from table data and indexes.
2.6 Parallelism
Both Oracle and DB2 support intra-query parallelism, where a query or database operation is split up and each portion is handled by a separate processor 'node'. There are major differences, however, in the way that they have implemented this partitioning.
In its Extended Enterprise Edition (usually abbreviated to 'EEE' and pronounced 'Triple-E') DB2, like SQL Server, uses a 'shared nothing' approach, where each node has its own subset of the database files. Each node processes its portion of the query against its own data, and the results are then co-ordinated before being passed back to the application.
Oracle uses a 'shared everything' approach, where the entire disk subsystem is available to all of the processing nodes. Each approach has its benefits. The Oracle approach is much more resilient to individual node failure. The approach taken by DB2 and SQL Server is ultimately more scalable.
3. Sources of DB2 Performance Data
There are two monitors supplied as standard with the database manager - the Snapshot Monitor and the Event Monitor. The Snapshot Monitor allows performance data to be captured at regular intervals. The Event Monitor logs a set of data when each requested event occurs. These two monitors are discussed separately in sections 3.1 and 3.2 below.
3.1 Snapshot Monitor
The snapshot monitor captures a limited amount of basic information by default. A number of monitor switches are available to turn on collection of additional metrics. Collection of additional data will, of course, incur more overhead costs. As always, the utility of the information must be balanced against the cost of collecting it.
The snapshot monitor switches (and the additional information that they provide) are shown in figure 1 below: -
Switch        Information returned
TABLE         activity by table, including number of rows read/written
STATEMENT     resource consumption and occurrence count for each SQL statement
SORT          sort performance, including SORTHEAP usage
LOCK          number of locks held, deadlocks
BUFFERPOOL    I/O and timing information
UOW           unit of work start/end times and completion status
Figure 1. Snapshot Monitor Switches
See Appendix A for a listing of the snapshot monitor output metrics and the switches required to collect them.
The monitor switches can be set from the command line processor (CLP). The following command turns on the BUFFERPOOL monitor switch at the database manager level: -
update dbm cfg using DFT_MON_BUFPOOL on
To see the current state of the switches you can use the command: -
get database manager monitor switches
This command produces the output shown in figure 2 below: -
DBM System Monitor Information Collected
Switch list for node 0
Buffer Pool Activity (BUFFERPOOL) = ON 05-28-2002 14:21
Lock Information (LOCK) = OFF
Sorting Information (SORT) = OFF
SQL Statement Information (STATEMENT) = OFF
Table Activity Information (TABLE) = OFF
Unit of Work Information (UOW) = OFF
Figure 2. Snapshot Monitor Switch Status
This shows that the BUFFERPOOL switch is 'ON', along with the date and time that the switch was changed. The monitor switches can also be set using the db2MonitorSwitches() API.
To view the snapshot monitor data you can use the following CLP command which will show the bufferpool snapshot data: -
get snapshot for BUFFERPOOLS on sample
The output provided by this command is shown in figure 3 below.
Bufferpool name = IBMDEFAULTBP
Database name = SAMPLE
Database path = F:\DB2
Input database alias = SAMPLE
Buffer pool data logical reads = 34
Buffer pool data physical reads = 15
Buffer pool data writes = 0
Buffer pool index logical reads = 57
Buffer pool index physical reads = 33
Total buffer pool read time (ms) = 3
Total buffer pool write time (ms) = 0
Asynchronous pool data page reads = 0
Asynchronous pool data page writes = 0
Buffer pool index writes = 0
Asynchronous pool index page reads = 0
Asynchronous pool index page writes = 0
Total elapsed asynchronous read time = 0
Total elapsed asynchronous write time = 0
Asynchronous read requests = 0
Direct reads = 30
Direct writes = 0
Direct read requests = 2
Direct write requests = 0
Direct reads elapsed time (ms) = 0
Direct write elapsed time (ms) = 0
Database files closed = 0
Data pages copied to extended storage = 0
Index pages copied to extended storage = 0
Data pages copied from extended storage= 0
Index pages copied from extended storage = 0
Figure 3. Bufferpool Snapshot
In the same way as for the monitor switches, there is an API to get the snapshot monitor data - db2GetSnapshot().
Whilst an individual snapshot of the database provides some useful information, it is only when regular snapshots are taken and analyzed that a clear picture of the pattern of activity and resource consumption can be derived. There are a number of commercially available monitoring tools that can assist in this process.
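Because the snapshot output is a simple series of 'name = value' lines, it can also be parsed programmatically and stored for later comparison. The following Python sketch is my own illustration (not a DB2-supplied tool); it turns a snapshot report like the one in figure 3 into a dictionary of metrics: -

```python
def parse_snapshot(text):
    """Parse DB2 snapshot output ('name = value' lines) into a dict.

    Numeric values are converted to int; everything else (bufferpool
    names, paths, timestamps) is kept as a string.
    """
    metrics = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip headings and blank lines
        name, _, value = line.partition("=")
        name, value = name.strip(), value.strip()
        try:
            metrics[name] = int(value)
        except ValueError:
            metrics[name] = value
    return metrics
```

Successive snapshots parsed in this way can be subtracted from one another to derive interval rates, rather than working with the raw cumulative counters.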
3.2 Event Monitor
The event monitor is used to trap and record statistics about events (or 'state changes') occurring to various database entities. These entities include transactions, deadlocks, statements etc. An event monitor must first be created using the following statement: -
CREATE EVENT MONITOR mytab FOR TABLES WRITE TO FILE 'E:\MONOUT'
The entry in the FOR clause is an event type. The supported event types are shown in figure 4 below, including a summary of the type of data that is recorded: -
Event Type     Data Recorded
Tables         Rows read and written
Bufferpools    Data for pre-fetchers, page cleaners and bufferpool counters
Tablespaces    Direct I/O and bufferpool data
Database       Database-level counters
Connections    Application-level counters
Transactions   Start/stop time, CPU, locking and logging metrics
Statements     Start/stop time, CPU, SQL text
Deadlocks      Locks in contention and the applications involved
Figure 4. Monitor Event Types
Once created, the monitor must be activated using the following statement: -
SET EVENT MONITOR mytab STATE 1
The monitor will then trap the required events and write the output to a file (in this case in the directory 'E:\MONOUT') or a named pipe. Disconnecting from the database will terminate the monitor and flush the events to the required location. If the database is still active, the monitor must be 'flushed' using the statement: -
FLUSH EVENT MONITOR mytab
In order to convert the output file and make it readable, use the db2evmon command as follows: -
db2evmon -path E:\MONOUT
The output from db2evmon in figure 5 below is from a TABLE event monitor that shows 8 rows being updated in the employee table.
EVENT LOG HEADER
Event Monitor name: MYTAB
Server Product ID: SQL07021
Version of event monitor data: 6
Byte order: LITTLE ENDIAN
Number of nodes in db2 instance: 1
Codepage of database: 1252
Country code of database: 1
Server instance name: DB2
Database Name: SAMPLE
Database Path: F:\DB2\NODE0000\SQL00001\
First connection timestamp: 05-29-2002 08:41:29.61156
Event Monitor Start time: 05-29-2002 08:43:36.85323
3) Table Event ...
Table schema: DB2ADMIN
Table name: EMPLOYEE
Record is the result of a flush: FALSE
Table type: User
Rows read: 104
Rows written: 8
Overflow Accesses: 0
Page reorgs: 0
Table event timestamp: 05-29-2002 08:48:00.419560
4) Table Event ...
Table schema: SYSIBM
Table name: SYSTABLES
Record is the result of a flush: FALSE
Table type: Catalog
Rows read: 309
Rows written: 0
Overflow Accesses: 0
Page reorgs: 0
Table event timestamp: 05-29-2002
Figure 5. TABLE Event Monitor Output
4. Memory Management
4.1 Buffer Pools
Bufferpools are memory areas used to temporarily store data that is required by an application. If an update is required, the application will update the data in the bufferpool. The updated pages will subsequently be copied to disk. Both the retrieval of data into the bufferpool, and the writing of updated (or 'dirty') pages to disk, can occur asynchronously.
Clearly the way in which the bufferpools are used can have a very significant impact on performance. DB2 works very hard behind the scenes to improve the chances of the application finding that a page of data that it needs is already in the bufferpool, in order to avoid physical disk I/O. Regardless of what DBMS you are using, actual physical disk I/O is death to database performance.
The frequency with which DB2 finds that a page that it needs is already in the bufferpool is called the bufferpool hit-ratio. For good OLTP performance this should be as high as possible. The formula for the overall bufferpool hit-ratio is: -
100 * (1 - (BDPR + BIPR) / (BDLR + BILR))
where BDPR is 'Buffer pool data physical reads', BIPR is 'Buffer pool index physical reads', BDLR is 'Buffer pool data logical reads' and BILR is 'Buffer pool index logical reads'. All of these metrics are available from the Snapshot Monitor for each bufferpool.
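As a minimal sketch (the function name and the zero-reads guard are my own additions), the calculation can be expressed in Python, using the counters from the sample bufferpool snapshot in figure 3: -

```python
def bufferpool_hit_ratio(bdlr, bilr, bdpr, bipr):
    """Overall bufferpool hit-ratio (%): 100 * (1 - physical reads / logical reads)."""
    logical_reads = bdlr + bilr
    if logical_reads == 0:
        return 100.0  # no logical reads yet, so nothing has missed the pool
    return 100.0 * (1.0 - (bdpr + bipr) / logical_reads)

# Counters from figure 3: 34 data + 57 index logical reads,
# 15 data + 33 index physical reads.
ratio = bufferpool_hit_ratio(34, 57, 15, 33)  # roughly 47.3%
```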
The achievable buffer cache hit-ratio will vary considerably from installation to installation and will be heavily dependent on the characteristics of the application. There is therefore no such thing as a good (or bad) hit-ratio. The performance analyst should track this ratio's behaviour and respond to any sudden deterioration.
Before increasing bufferpool sizes in an attempt to improve the bufferpool hit-ratio, it is important to consider the overall memory availability from an operating system viewpoint, in order to ensure that there is sufficient real memory available. It is perfectly possible to increase bufferpool sizes (giving an improvement in the hit-ratio) to the extent that the operating system starts to page the bufferpool to and from disk - thus generating far more I/O than was saved by improving the hit-ratio!
Whilst a good bufferpool hit-ratio is desirable, it is not the Holy Grail of DB2 performance. It is easy to consume very considerable CPU resources in scanning and sorting tables that are held completely in the bufferpool - a hit ratio of 100% is not necessarily indicative of a healthy database!
OLTP applications generally benefit from the use of multiple bufferpools. Try to put data that is accessed in different ways into different bufferpools. For example, do not mix randomly accessed and sequentially pre-fetched data in the same bufferpool.
4.2 Catalog and Package Caches
The catalog cache (analogous to Oracle's dictionary cache) is used to store database definition objects. The catalog cache hit-ratio is calculated using the following formula: -
100 * (1 - ( CCI / CCL ))
where CCI is 'Catalog cache inserts' and CCL is 'Catalog cache lookups'.
The package cache (analogous to Oracle's library cache) is used to store parsed SQL. The package cache hit-ratio is calculated using the following formula: -
100 * (1 - ( PCI / PCL ))
where PCI is 'Package cache inserts' and PCL is 'Package cache lookups'.
In both cases the metrics are available from the default database snapshot monitor, and the target value for the hit-ratio is 95% or above. Again in both cases, the number of cache overflows (also available from the snapshot monitor) should be zero.
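Since both caches use the same form of calculation, the two checks can be sketched with a single Python helper (a hypothetical illustration; the function names are my own): -

```python
def cache_hit_ratio(inserts, lookups):
    """Catalog or package cache hit-ratio (%): 100 * (1 - inserts / lookups)."""
    if lookups == 0:
        return 100.0  # no lookups yet
    return 100.0 * (1.0 - inserts / lookups)

def cache_is_healthy(inserts, lookups, overflows):
    """Targets from the text: hit-ratio of 95% or above, and zero cache overflows."""
    return cache_hit_ratio(inserts, lookups) >= 95.0 and overflows == 0
```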
4.3 Sort Heap
The SORTHEAP is used as a workspace area for in-memory sorts. The important metric to look for here is the number of sort heap overflows. This metric is not provided by the default Snapshot Monitor; the monitor switch 'SORT' must be turned on to obtain it.
If a sort will not fit into the sort heap it will overflow into the TEMPSPACE tablespace via the bufferpool. Overflows can have a very significant impact on performance. However, increasing the SORTHEAP size to avoid overflows carries the same caution on overall memory availability as described earlier for bufferpool sizing. Section 6 provides additional guidance on sort performance.
5. Physical Data Storage
The relationship between the logical constructs of the tables and the physical data storage is defined in tablespaces. The physical storage is called a container, which can be a file location on disk or an entire disk drive. Each tablespace will usually consist of multiple containers. DB2 allows for two types of tablespace, system managed space (SMS) or database managed space (DMS).
In an SMS tablespace the operating system controls the storage space, space is allocated as required and the containers are directories in the file space of the operating system. In a DMS tablespace the database manager controls the storage space, and the space is pre-allocated. SMS tablespaces are very simple to administer and space is not allocated until required. DMS tablespaces are much faster but can waste space.
Since the requirement for temporary tablespaces is highly dynamic and application-dependent, these are usually configured using SMS. The TEMPSPACE tablespace should have at least three containers on separate physical disks.
6. Indexing and Sorting
Choosing the right indexing strategy for your tables is vitally important for achievement and maintenance of good performance. The appropriate indexing strategy cannot be derived solely from an examination of the data. You need to look at the way that the data will be used. Good candidates for indexes are columns which: -
" have a low update frequency
" are often used in ORDER BY clauses
" are often used in WHERE clauses
" have a sufficient number of distinct values (cardinality) for the index to be chosen by the optimizer
One indexing technique that is particularly useful where range scans are required is a clustered index. When an index is defined as a clustering index, the data is physically stored in the sequence defined by the index. If, for example, a query asks for all orders placed between two dates, and a clustering index has been defined on date, DB2 simply uses the index to find the first order in the date range, and then sequentially reads the data for the rest.
Figure 6. Clustered vs. non-clustered indexes
A covering index is where the index itself contains all the data required to satisfy a query. This can be achieved by a composite index which includes all the data requested in the SELECT clause, or by the addition of data columns to the index through the use of the INCLUDE clause when the index is defined.
The DB2 optimizer will examine the statistics for each table when deciding on an access path, including whether to use an index. These statistics are not maintained automatically and must be periodically updated using the RUNSTATS utility. If the statistics are not kept up to date, there is a danger that the optimizer will choose a less-than-optimal execution plan.
DB2 provides an invaluable tool for checking the choices made by the optimizer - the Visual Explain Plan. Figure 7 below shows the execution plan itself, with the steps that the optimizer has chosen in order to execute the query and the cost of each step.
Figure 7. Visual Explain Access Plan
Simply double-clicking on each element of the chart will provide further details as shown in figure 8 below.
Figure 8. Visual Explain Step Details
A useful indicator of missing indexes in an OLTP system is a high number of rows read per transaction. For the purposes of this calculation, 'transaction' is the sum of commits and rollbacks. The actual number that you should be aiming for here is highly dependent on the characteristics of the application, but the higher the number is (for an OLTP system), the more certain you can be that there are missing indexes. DB2's Index Advisor can also be used to suggest (and even implement) the creation of additional indexes.
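This indicator is a simple ratio; as an own-sketch in Python (the names are assumed, not a DB2 API): -

```python
def rows_read_per_transaction(rows_read, commits, rollbacks):
    """Rows read per 'transaction', where a transaction is a commit or a rollback."""
    transactions = commits + rollbacks
    if transactions == 0:
        return 0.0  # nothing has completed yet
    return rows_read / transactions
```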
Whilst missing indexes are detrimental to performance, so are redundant indexes! Remember that it takes CPU and I/O resources (at update time) to maintain an index, and indexes consume space just like data. One technique for addressing this is to trap SQL statements from your applications, run an Explain Plan for each, and derive a list of those indexes which are never chosen by the optimizer.
Many DB2 systems spend a considerable percentage of their resources simply sorting data. The guiding principle here should be that the only good sort is the sort that doesn't happen. If DB2 can use an index to retrieve data in the sequence requested in an ORDER BY clause, then it can avoid the need to sort the data before it is given back to the application. Judicious use of clustering indexes can also reduce or eliminate the need for sorts.
Some sorting is unavoidable, so make sure that your SORTHEAP size is sufficient to prevent sort overflows as described in section 4.3. Try to make sure that not more than 5% of your sorts overflow.
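The 5% guideline above can be checked mechanically; the following is a minimal sketch of my own, assuming the total-sorts and sort-overflows counters are available from the Snapshot Monitor with the 'SORT' switch on: -

```python
def sort_overflow_pct(total_sorts, sort_overflows):
    """Percentage of sorts that spilled out of SORTHEAP into TEMPSPACE."""
    if total_sorts == 0:
        return 0.0
    return 100.0 * sort_overflows / total_sorts

def sorts_are_healthy(total_sorts, sort_overflows):
    """Guideline from the text: no more than 5% of sorts should overflow."""
    return sort_overflow_pct(total_sorts, sort_overflows) <= 5.0
```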
7. Locking
Applications have to lock some database objects in order to ensure consistency and completeness. Well-designed applications do not hold locks for longer than absolutely necessary, or at a higher level than required (for example, holding a table lock when a row lock would do).
All of the current locks for a DB2 database are held in a single memory area; the size of this area is defined by the locklist parameter. If locklist runs out of space, or any given application occupies more than 22% (10% on AIX) of the locklist space, a process called lock escalation starts, in an attempt to reduce the space required. Lock escalation achieves this by replacing the multiple row locks held by an application on a table with a single table lock. Lock escalation can have a dramatic adverse impact on performance and should be avoided at all costs. The number of lock escalations is provided by the default Snapshot Monitor.
The percentage of free locklist space can be derived from the Snapshot Monitor with the 'LOCK' switch set to 'on'. Make sure that you have a sufficient locklist size to keep average occupancy at less than 50%.
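The 50% occupancy guideline can be sketched in the same style as the earlier ratios (again an own-illustration; the parameter names are assumptions): -

```python
def locklist_occupancy_pct(used_pages, locklist_pages):
    """Percentage of the locklist area currently in use."""
    if locklist_pages == 0:
        return 0.0
    return 100.0 * used_pages / locklist_pages

def locklist_is_healthy(used_pages, locklist_pages):
    """Guideline from the text: keep average occupancy at less than 50%."""
    return locklist_occupancy_pct(used_pages, locklist_pages) < 50.0
```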
Another important parameter here is LOCKTIMEOUT, which defines, in seconds, how long an application should wait for a lock before backing out and re-starting. This should be set to no more than 15 seconds for an OLTP environment. Unfortunately, DB2 provides a deadlock monitor (Snapshot Monitor 'LOCK' switch required) but not a lock timeout monitor.
8. Upcoming DB2 Features
This section provides a list of just some of the performance and tuning features that may be included in the upcoming release of DB2 UDB, scheduled for late 2002.
" Monitor Data - both the Snapshot and Event monitor data will be available through table functions, and so can be retrieved using SQL and simply stored in DB2 tables for analysis
" Health Center - a GUI tool providing cross-platform diagnostics, alerting and corrective action
" Performance Configuration Advisor - a new CLP command - AUTOCONFIGURE which can also be used as an option at database creation time
" db2trc - dramatic performance and usability improvements for db2 trace
" MDC - Multi-dimensional clustering for tables
9. Conclusions
No magic or divine intervention is required in order to address DB2 performance and tuning. By using the data sources and following the simple recommendations included in this paper, good performance will be achievable in the majority of cases. It is not possible in a paper such as this to provide more than a brief introduction. There is a wealth of material available from both IBM and others on DB2 performance topics; some of the resources that I have found most useful are included in the bibliography at the end of the paper.
One final thought. You can obtain a huge amount of useful information about performance from both the operating system and DB2 itself. It is vitally important that decisions are made by looking at both sources, rather than one in isolation. That said, by far the most important source of information is the user community. Only by talking to the people who actually use the database can you make the right decisions (particularly where indexing strategies are concerned), and only the users will be able to tell you where to focus your tuning efforts, and whether you have been successful!
All product and company names referred to in this paper are used for identification purposes only and may be trademarks of their respective owners.
Bibliography
Melnyk, Zikopoulos et al., DB2: The Complete Reference, Osborne, ISBN 0-07-213344-9