onstat -g iof is the onstat command most useful for monitoring read and write activity to and from the disk subsystem. Given the sensitivity of the database engine to disk access times, monitoring disk activity is critically important in extracting maximum performance from the IDS engine. The onstat -D command can give you similar information, but onstat -g iof is somewhat simpler and easier to understand.
This is true in OLTP systems and even more true in DSS systems. Decision support systems rely more on IDS's inherent parallelism and smart disk access to reduce the times needed for the many sequential scans that are typically seen in DSS systems. The major objective of disk tuning is to spread the activity as evenly as possible across multiple fast disk drives. The general rule is "the more spindles, the better." Thus, if you have otherwise equally attractive opportunities to use either four 2-gig drives or one 8-gig drive, the four-drive setup will give you better performance, assuming that other access factors are equal.
$ onstat -g iof

Informix Dynamic Server Version 7.30.UC3 -- On-Line -- Up 2 days 22:14:37 -- 18464 Kbytes

AIO global files:
gfd  pathname          totalops  dskread  dskwrite  io/s
3    rootdbs                  0        0         0   0.0
4    config_data              0        0         0   0.0
5    eba_data                 0        0         0   0.0
6    demotest_data            0        0         0   0.0
7    /dev/vg00/rraw0          0        0         0   0.0
8    /dev/vg01/rraw1          0        0         0   0.0
9    app1dbs_data             0        0         0   0.0
10   logspace_data            0        0         0   0.0
11   tempspace_data           0        0         0   0.0
Chunk numbers are not listed in the onstat -g iof output, but the chunks are listed in the same order as in onstat -d. Field meanings are:
• gfd | Global file descriptor. Lower-numbered descriptors were opened earlier |
• pathname | Location of the chunk |
• totalops | Total of all disk operations to the chunk |
• dskread | Read operations from the disk |
• dskwrite | Write operations to the disk |
• io/s | I/O operations per second |
The system from which this sample output was taken was inactive, but should you see one chunk with much more activity per second than the others, you might want to run oncheck -pe to find which tables or fragments reside on that chunk and consider modifying either your table locations or your fragmentation strategy.
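As a monitoring aid, the onstat -g iof output lends itself to simple scripted analysis. The sketch below is a minimal, hypothetical example (not an Informix utility) that parses the output and ranks chunks by io/s, assuming the column layout shown in the sample above; the threshold value is illustrative and should be tuned to your system.

```python
def hot_chunks(onstat_iof_output, threshold=0.0):
    """Parse `onstat -g iof` output and return (pathname, io/s) pairs for
    chunks whose io/s exceeds `threshold`, busiest first. The column layout
    (gfd, pathname, totalops, dskread, dskwrite, io/s) is assumed from the
    sample output above."""
    hot = []
    for line in onstat_iof_output.splitlines():
        fields = line.split()
        # Data lines have six fields and start with a numeric gfd;
        # the banner and header lines are skipped by these checks.
        if len(fields) == 6 and fields[0].isdigit():
            pathname = fields[1]
            ios = float(fields[5])
            if ios > threshold:
                hot.append((pathname, ios))
    return sorted(hot, key=lambda pair: pair[1], reverse=True)
```

Feeding this the captured output of onstat -g iof during a busy period gives you a quick ranking of which chunks deserve an oncheck -pe investigation.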
The ideal situation would be to have each chunk represent a different disk. What is really important for performance is not reads and writes by chunk but reads and writes per disk spindle. With a one-to-one chunk-to-spindle relationship, the disk activity becomes obvious. What this would mean, though, is that the maximum physical disk size you could use would be 2 gigabytes, as that is the maximum chunk size that IDS can address. With the advent of larger disks and the need to use them, this is not always feasible. Another way to make the data easier to analyze in that case is to follow a stringent chunk naming convention. Thus, if you have four 2-gig chunks all residing on an 8-gig disk called diskA, name the chunks in such a way that you can identify the physical disk from the chunk name, something like this:
dAchunk1, dAchunk2, dAchunk3, dAchunk4, dBchunk5, dBchunk6, etc.
While this will not alleviate any performance problems that arise from using larger disks, it will at least make the monitoring easier.
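With such a convention in place, per-chunk numbers can be rolled up to per-spindle numbers mechanically. The sketch below is a hypothetical helper, assuming the naming scheme above where the first two characters of a chunk name (e.g., "dA" in "dAchunk1") identify the physical disk.

```python
from collections import defaultdict

def ios_per_disk(chunk_ios):
    """Roll per-chunk io/s figures up to per-disk totals, assuming the
    naming convention described above: the first two characters of each
    chunk name identify the physical disk it resides on."""
    per_disk = defaultdict(float)
    for chunk_name, ios in chunk_ios:
        per_disk[chunk_name[:2]] += ios
    return dict(per_disk)
```

Applied to the (pathname, io/s) pairs gathered from onstat -g iof, this shows at a glance whether one spindle is carrying a disproportionate share of the load.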
This command seems to have disappeared from the "onstat --" help screen, but as of IDS 7.30 it still exists in the utility. Whether this omission on Informix's part is purposeful is open to discussion.
This command presents information about the IDS I/O queue lengths and operations. There is one line of output for every physical chunk in the system. These are identified by "gfd <number>" where gfd is the global file descriptor as in the onstat -g iof output. There is also one queue for each CPU-VP.
D:\INFORMIX\bin> onstat -g ioq

Informix Dynamic Server Version 7.30.TC3 -- On-Line -- Up 00:27:13 -- 9536 Kbytes

AIO I/O queues:
q name/id    len  maxlen  totalops  dskread  dskwrite  dskcopy
kio 0          0       0         0        0         0        0
kio 1          0       3       121      113         8        0
adt 0          0       0         0        0         0        0
msc 0          0       1      1015        0         0        0
aio 0          0       1        47       14         0        0
pio 0          0       0         0        0         0        0
lio 0          0       0         0        0         0        0
gfd 3          0       0         0        0         0        0
gfd 4          0       0         0        0         0        0
gfd 5          0       0         0        0         0        0
gfd 6          0       0         0        0         0        0
Meanings of the output fields are:
• q name/id | Queue name (kio, adt, msc, aio, pio, lio, or gfd for a chunk queue) and its identifier |
• len | Current length of the queue |
• maxlen | Maximum length the queue has reached since the engine came online |
• totalops | Total operations handled by the queue |
• dskread | Read operations from the disk |
• dskwrite | Write operations to the disk |
• dskcopy | Disk copy operations |
If you notice an excessive number in the maxlen column for any of the aio queues, it may be possible to increase your disk performance by adding an additional AIO-VP thread.
To use this command to debug performance problems, look for abnormally high numbers in the len and totalops fields for a particular gfd, which will map to a particular chunk. If you see high numbers for particular gfds, your disk access is skewed and you should look at either moving some tables around or fragmenting large tables that are located on these devices.
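That inspection can also be scripted. The following is a hypothetical sketch (not an Informix tool) that scans onstat -g ioq output for gfd queues whose maxlen stands out, assuming the column layout in the sample above; the threshold of 10 is purely illustrative.

```python
def busy_queues(onstat_ioq_output, maxlen_threshold=10):
    """Scan `onstat -g ioq` output and return (gfd, maxlen) pairs for chunk
    queues whose maxlen exceeds the threshold. Column layout (q name/id,
    len, maxlen, totalops, dskread, dskwrite, dskcopy) is assumed from the
    sample output above."""
    flagged = []
    for line in onstat_ioq_output.splitlines():
        fields = line.split()
        # Chunk-queue lines have eight fields and begin with "gfd".
        if len(fields) == 8 and fields[0] == "gfd":
            gfd = int(fields[1])
            maxlen = int(fields[3])
            if maxlen > maxlen_threshold:
                flagged.append((gfd, maxlen))
    return flagged
```

Any gfd this flags can then be mapped back to its chunk via the ordering in onstat -d, pointing you at the tables to consider moving or fragmenting.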
While onstat -g ioq gives you information about the queue lengths for the various disk chunks, the -g iov report gives you I/O per second and total I/O operations broken down into reads, writes, and copies. It also tells you how many times these VPs were awakened from a sleeping state and how many I/O operations were performed, on average, per wakeup.
This is broken down by virtual processors, so you can get a feeling for how many virtual processors your system actually needs. In the sample below, there are 18 AIO-VPs and none of them is doing anything. If you were to see many VPs that had low numbers of I/O per wakeup during a relatively busy period, you would want to consider cutting back on the number of AIO-VPs and using their resources for other tasks.
In addition to the aio class of virtual processors, you will also see:
• msc | Handles miscellaneous writes, such as writes to the system log |
• pio | Writes to the physical log |
• lio | Writes to the logical log |
• aio | "Normal" AIO-VPs which access data from the disk |
$ onstat -g iov

Informix Dynamic Server Version 7.30.UC3 -- On-Line -- Up 2 days 22:14:34 -- 18464 Kbytes

AIO I/O vps:
class/vp  s  io/s  totalops  dskread  dskwrite  dskcopy  wakeups  io/wup  errors
msc  0    i   0.0         0        0         0        0        0     0.0       0
aio  0    i   0.0         0        0         0        0        0     0.0       0
aio  1    i   0.0         0        0         0        0        0     0.0       0
aio  2    i   0.0         0        0         0        0        0     0.0       0
aio  3    i   0.0         0        0         0        0        0     0.0       0
aio  4    i   0.0         0        0         0        0        0     0.0       0
aio  5    i   0.0         0        0         0        0        0     0.0       0
aio  6    i   0.0         0        0         0        0        0     0.0       0
aio  7    i   0.0         0        0         0        0        0     0.0       0
aio  8    i   0.0         0        0         0        0        0     0.0       0
aio  9    i   0.0         0        0         0        0        0     0.0       0
aio 10    i   0.0         0        0         0        0        0     0.0       0
aio 11    i   0.0         0        0         0        0        0     0.0       0
aio 12    i   0.0         0        0         0        0        0     0.0       0
aio 13    i   0.0         0        0         0        0        0     0.0       0
aio 14    i   0.0         0        0         0        0        0     0.0       0
aio 15    i   0.0         0        0         0        0        0     0.0       0
aio 16    i   0.0         0        0         0        0        0     0.0       0
aio 17    i   0.0         0        0         0        0        0     0.0       0
pio  0    i   0.0         0        0         0        0        0     0.0       0
lio  0    i   0.0         0        0         0        0        0     0.0       0
Under the appropriate circumstances, Dynamic Server can bypass the LRU queues when it performs a sequential scan. A sequential scan that avoids the LRU queues is termed a light scan. Light scans can be used only for sequential scans of large data tables and are the fastest means for performing these scans. System catalog tables and tables smaller than the size of the buffer pool do not use light scans. Light scans are allowed under Dirty Read (including nonlogging databases) and Repeatable Read isolation levels. Repeatable Read full-table scans obtain a shared lock on the table. A light scan is used only in Committed Read isolation if the table has a shared lock. Light scans are never allowed under Cursor Stability isolation.
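The eligibility rules above can be condensed into a small predicate. The sketch below is a hypothetical summary of the conditions as described, not an engine API; the isolation-level names are illustrative strings, and page counts stand in for the table-versus-buffer-pool size comparison.

```python
def light_scan_eligible(table_pages, bufferpool_pages, isolation,
                        is_system_catalog=False, has_shared_lock=False):
    """Summarize the light-scan conditions described above. Isolation
    names ('dirty_read', etc.) are illustrative, not engine constants."""
    # System catalog tables and tables no larger than the buffer pool
    # never use light scans.
    if is_system_catalog or table_pages <= bufferpool_pages:
        return False
    # Dirty Read (including nonlogging databases) and Repeatable Read
    # allow light scans; RR full-table scans take a shared table lock.
    if isolation in ("dirty_read", "repeatable_read"):
        return True
    # Committed Read allows a light scan only if the table has a shared lock.
    if isolation == "committed_read":
        return has_shared_lock
    # Cursor Stability never allows light scans.
    return False
```

This is only a reading aid for the rules in the paragraph above; the engine makes the actual decision internally.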
C:\INFORMIX\bin> onstat -g lsc

Informix Dynamic Server Version 7.30.TC3 -- On-Line -- Up 00:14:54 -- 536 Kbytes

Light Scan Info
descriptor  address  next_lpage  next_ppage  ppage_left  bufcnt  look_aside
There is an environment variable, LIGHT_SCANS, that can be set to the value FORCE in some IDS systems and that may help you ensure that light scans are used. Try it in your engine to see if it helps.
This command provides you with a summary of the types and numbers of AIO (async I/O) threads.
$ onstat -g iog

Informix Dynamic Server Version 7.30.UC3 -- On-Line -- Up 2 days 22:14:45 -- 18464 Kbytes

AIO global info:
 6 aio classes
12 open files
64 max global files
Big buffers are used for long read and write operations. They are more efficient than normal buffers and reside in the virtual portion of shared memory. These big buffers are composed of several pages of shared memory. When the server needs to access multiple contiguous pages for either reads or writes, it will use big buffers. Examples are sequential scan reads and some sorted writes. The server also uses the big buffers during checkpoint writes, which is what makes checkpoint writes more efficient than normal writes. Once a series of pages is read into a big buffer, the pages are then automatically copied into regular buffers.
The onstat -g iob command shows to what extent the various IO-VP classes are able to utilize these big buffers. It shows read and write activity as well as holes and holes per operation. The concept of holes is not defined anywhere in Informix documentation, so we are left to guess at the meanings.
This command doesn't really affect anything that happens in the database, but it will give you a feeling for just what is happening under the hood. There's not much you can do to act on this data, however.
$ onstat -g iob

Informix Dynamic Server Version 7.30.UC3 -- On-Line -- Up 2 days 22:14:49 -- 18464 Kbytes

AIO big buffer usage summary:
        reads                                          writes
class   pages  ops  pgs/op  holes  hl-ops  hls/op     pages  ops  pgs/op
kio         0    0    0.00      0       0    0.00         0    0    0.00
adt         0    0    0.00      0       0    0.00         0    0    0.00
msc         0    0    0.00      0       0    0.00         0    0    0.00
aio         0    0    0.00      0       0    0.00         0    0    0.00
pio         0    0    0.00      0       0    0.00         0    0    0.00
lio         0    0    0.00      0       0    0.00         0    0    0.00
IDS can speed up DSS queries by using parallel threads to accomplish DSS queries. This is most useful when the tables in question are fragmented across multiple disk spindles. This onstat command allows the DBA to monitor many factors by partition. Among the most important data elements in this command are the isrd (ISAM read) and iswrt (ISAM write) fields. These do not correspond directly to disk reads and disk writes, but they are a good indicator of the amount of activity to those partitions.
$ onstat -g ppf

Informix Dynamic Server Version 7.30.UC3 -- On-Line -- Up 2 days 22:14:53 -- 18464 Kbytes

Partition profiles
partnum  lkrqs  lkwts  dlks  touts  isrd  iswrt  isrwt  isdel  bfrd  bfwrt  seqsc
15           0      0     0      0     0      0      0      0     0      0      0
23           0      0     0      0     0      0      0      0     0      0      0
1048577      0      0     0      0     0      0      0      0     0      0      0
1048578      0      0     0      0     0      0      0      0     0      0      0
...
7340211      0      0     0      0     0      0      0      0     0      0      0
8388609      0      0     0      0     0      0      0      0     0      0      0
9437185      0      0     0      0     0      0      0      0     0      0      0
Field meanings are:
• partnum | Partition number |
• lkrqs | Lock requests |
• lkwts | Lock waits |
• dlks | Deadlocks |
• touts | Timeouts |
• isrd | Read calls |
• iswrt | Write calls |
• isrwt | Rewrites (updates) |
• isdel | Deletes |
• bfrd | Buffer reads |
• bfwrt | Buffer writes |
• seqsc | Sequential scans |
To relate the partnums to actual table names, you must go to the system catalog tables. The partnum field appears as partn in the sysfragments table, along with the tabid (table ID). Take the tabid to the systables table to get the tabname. This works only if there are actually fragmented tables in the database; if there are no fragmented tables, there will be no rows in sysfragments.
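The lookup described above is a simple two-step join. The sketch below mirrors it over plain in-memory rows standing in for query results; the dict-based rows and the function name are illustrative, not an Informix API.

```python
def partnum_to_tabname(sysfragments_rows, systables_rows):
    """Build a partnum -> tabname map, mirroring the catalog lookup
    described above (sysfragments.partn -> tabid -> systables.tabname).
    Rows are plain dicts standing in for query results; in a live
    database the equivalent query would be roughly:

        SELECT f.partn, t.tabname
        FROM sysfragments f, systables t
        WHERE f.tabid = t.tabid;
    """
    tabid_to_name = {row["tabid"]: row["tabname"] for row in systables_rows}
    return {row["partn"]: tabid_to_name[row["tabid"]]
            for row in sysfragments_rows}
```

With this mapping in hand, the partnum column of onstat -g ppf translates directly into table (or fragment) names, making skewed isrd/iswrt counts much easier to interpret.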