Queries that utilize the parallel features of the database may be expensive both in terms disk accesses and memory utilization. These decision support queries are identified by the fact that they are queries that run with a PDQ priority greater than zero. One of the things that the database administrator needs to be aware of is the status of the PDQPRIORITY for any queries that are running. If a query is run with a high PDQPRIORITY, it can hog the system resources to such an extent that the entire system slows to a crawl.
All memory and other resources used in DSS queries is controlled by the Memory Grant Manager in IDS. It controls:
Number of concurrent queries (specified in DS_MAX_QUERIES in ONCON-FIG)
Number of scan threads
Number of PDQ threads
Amount of memory and CPU resources used
No, we're not talking about particle physics or strange quarks here. This is complicated stuff, but it's not rocket science. When Informix allocates memory to decision support queries, it allocates the memory in chunks, the size of which is defined by the ONCONFIG setup of the engine. For each system, the minimal amount of memory allocated to a query is called a quantum.
A quantum is defined as
DS_TOTAL_MEMORY / DS_MAX_QUERIES
When a decision support query is submitted to the engine, the engine first registers the query with the Memory Grant Manager, which is the mechanism that allocates the memory and other resources for the query. MGM will reserve an amount of memory for the query equal to:
DS_TOTAL_MEMORY * ( 1 / 100 * PDQPRIORITY ), rounded down to the nearest quantum
This shows the importance of managing the PDQPRIORITY correctly. If the PDQPRIORITY is set to 100 percent, the first decision support query will go out and reserve the entire contents of the decision support memory, thus blocking any further queries from running. We'll look at the output from onstat -g mgm in several sections, since the report is fairly long and each section is different:
$ onstat -g mgm Informix Dynamic Server Version 7.30.UC3 -- On-Line -- Up 2 days 22:16:32 -- 18464 Kbytes Memory Grant Manager (MGM) -------------------------- MAX_PDQPRIORITY: 100 DS_MAX_QUERIES: 2 DS_MAX_SCANS: 1048576 DS_TOTAL_MEMORY: 256 KB
This first section simply lists configuration data from the ONCONFIG file. The DS_MAX_SCANS parameter is usually defaulted to a very high number in the engine, in this case equal to (1024 * 1024). This parameter is the total number of scan threads that the engine may allocate.
When a DSS query on a fragmented table runs, the engine attempts to allocate a number of scan threads equal to:
reserved_threads = min (nfrags, (DS_MAX_SCANS * PDQPRIORITY / 100 *MAX_PDQPRIORITY / 100) )
Usually, this will mean the number of threads reserved for a query will equal the number of fragments (nfrags) to be scanned. If for some reason the scan threads cannot be allocated, the query will be held in a ready state until the threads become free.
Queries: Active Ready Maximum 0 0 2 Memory: Total Free Quantum (KB) 256 256 128 Scans: Total Free Quantum 1048576 1048576 1
The second section gives information about the number of active queries and number of ready queries, along with the DS_MAX QUERIES parameter from the ONCONFIG file. If you see a positive number under "Ready," it means that something is blocking these queries from running and that you could possibly speed up performance by eliminating the blockage.
The next line gives total and free decision support memory, as well as the calculated figure for the smallest amount that can be allocated (the quantum). If you see a very small amount of free memory, you have another source of blockage.
The third line is the total and free number of scans, as well as the least number of scan threads that can be allocated at any one time.
Load Control: (Memory) (Scans) (Priority) (Max Queries) (Reinit) Gate 1 Gate 2 Gate 3 Gate 4 Gate 5 (Queue Length) 0 0 0 0 0
This next section gives a summary of the queries' status. When a query is registered with the MGM, one of the first things that MGM does is test for resources that may be required by the query. If any needed resources or conditions cannot be met, the query is held in a ready state and is added to a queue waiting for one of the following:
Memory: Queries waiting for other queries to release memory. These other queries have already reserved all the available memory and no memory is now available.
Scans: Your system has already reached the DS_MAX_SCANS parameter and this query has to wait until some scan threads free up. With the default 1048576 DS_MAX_SCANS, you should never see queries waiting here.
Priority: The PDQPRIORITY does more than just allocate memory. Threads are run in order of PDQPRIORITY. If a query is run at a lower PDQPRIORITY, it cannot run until all queries with a higher priority have run.
Max Queries: You've reached the DS_MAX_QUERIES parameter limit and this query cannot run until another query finishes.
Reinit: If the user or DBA changes either of the Informix environmental variables DS_TOTAL_MEMORY or DS_MAX_PARAMETERS, the change does not become effective until all currently running queries are completed. Any new queries registered before this happens must wait in the reinit queue until all the other queries have completed and the parameters can be reinitialized.
Active Queries: --------------- Session Query Priority Thread Memory Scans Gate 23 c053f23 50 c12345cl 1024/1024 3 / 4 Ready Queries: None
This section gives more information about the queries that are active and queued up and waiting for resources. The session ID is the same as is found in onstat -g ses. The Thread column is the number of the primary thread that registered the query with MGM. The Memory field is not the amount of memory as might be expected, rather it is the number of internal calls to get memory.
Scans tells the number of scans used/allocated. The Gate field lists the queue that the query is waiting in (only for ready queries; an active query is not waiting for any resources).
Free Resource Average # Minimum # -------------- --------------- --------- Memory 0.0 +- 0.0 32 Scans 0.0 +- 0.0 1048576 Queries Average # Maximum # Total # -------------- --------------- --------- ------- Active 0.0 +- 0.0 0 0 Ready 0.0 +- 0.0 0 0 Resource/Lock Cycle Prevention count: 0
This final section may be of more use than any of the others in tuning your PDQ usage. It gives the averages, minimums, and totals for various queries since either the engine was restarted or since any of the parameters were dynamically changed with an onmode command. The free resources section shows averages and minimums for memory and scans. The minimum number are minimum free numbers, not the minimums actually seen. In the queries section, you get averages, minimums, and totals for active and ready queues. The maximum field contains the maximum sizes of the queues during the period. Total field gives the total number of each queue. Here, you can get a count of exactly how many of your queries are running as DSS queries.
The Average fields in both sections have one interesting data item that you will not see in any other reports. Note that these numbers have a "+-" in them. The second number is actually a calculated standard deviation from the norms in the first numbers. If you are attempting to make some sense out of these numbers, make sure you understand the statistical concepts of standard deviation so that you don't try to make assumptions based on insufficient data. Basically, if the +- figure is large, it means that there was a lot of deviation from the averages. If you lay with your head in the oven and your feet in the icebox, you'll be comfortable in the mean, but the standard deviation will kill you.
Since MGM only reserves memory for queries and doesn't force its usage, you may find that it is overallocating memory resources for your DSS queries. If you suspect that's the case, check the output of the onstat -g mem command and see how much of the memory is actually being used by the query in question. If there's a lot of unused memory, you could benefit by running this query with a much lower PDQPRIORITY, thus freeing resources for other uses.
The amount of memory available to your DSS system is a primary factor in the system's overall performance. One task that the DBA has is to determine whether or not the system has enough memory for its intended use in the first place, and later to monitor the usage and allocation of this memory.
The memory being displayed is in the virtual portion of the shared memory segment. The DBA needs to continuously monitor memory utilization with the aim of either managing the utilization or detecting when the system needs more memory. Actual memory in use doesn't become an issue with IDS unless there is not enough of it, and then it becomes a big issue. When the system begins to start swapping or paging out in an attempt to deal with memory shortages, the system's performance goes down the drain. The virtual portion of shared memory is used by both OLTP and DSS queries, but requirements for OLTP are minimal. It's the DSS queries that use up the virtual portion of shared memory. DSS queries use the shared memory for various activities, including:
Session pools
Thread pools
Sort pools
Joins
Sorts
Group operations
The first issue is usually balancing the use of shared memory between DSS and OLTP queries. This is best done by manipulating the value of the DS_TOTAL_MEMORY parameter in ONCONFIG. Use this to restrict the amount of shared memory that is used by DSS. Then monitor the memory usage by OLTP queries and see how much memory the remaining queries utilize. If you find that the OLTP queries do not use all of the remaining portion of the virtual shared memory, increase the DS_TOTAL_MEMORY to take advantage of this unused memory.
For pure DSS systems, increase the DS_TOTAL_MEMORY to it's maximum, which is half the total shared memory configured in the IDS system. For such pure DSS systems, reduce the amount of BUFFERS in ONCONFIG and use as much of the memory as possible for DSS memory. If you do this, keep in mind that for sequential scans in DSS systems, the most efficient means of reading the tables are by light scans, and for a table to be eligible to use light scans, it must be larger than your total BUFFERS. All of this monitoring of memory usage is done by the onstat -g mem command. This command can be executed in one of three flavors:
• Onstat -g mem | Global memory information | |
• Onstat -g mem | <pool name> | Memory usage for a named pool |
• Onstat -g mem | <session> | Memory usage for a numbered session |
Since the last two invocations are simply subsets of the first, we'll look at the global memory with onstat -g mem:
$ onstat -g mem Informix Dynamic Server Version 7.30.TC3 -- On-Line -- Up 00:21:46 --9536 Kbytes Pool Summary: name class addr totalsize freesize #allocfrag #freefrag resident R c002018 417792 12144 2 2 res-buff R c068018 835584 12144 2 2 global V c152018 540672 162672 612 55 mt V c154018 1417216 639760 888 45 rsam V c18c018 212992 760 87 1 aio V c1ce0l8 1720320 151632 198 47 gls V c1d2018 49152 2240 709 6 dictpool V cld6018 49152 1936 34 5 procpool V c1d8018 8192 1992 9 1 XTF_mem V c20e018 409600 17464 4 3 main_loop() V c310018 40960 23680 115 4 2 V c31c018 8192 3336 8 1 3 V c33e018 8192 3336 8 1 4 V c360018 16384 3048 207 1 lgflushpool V c4be018 8192 7912 3 1 aslogflush V c4c0018 16384 4632 108 1 btclean V c4d4018 16384 4632 108 1 onmode_mon V c4fe018 16384 4632 108 1 510 V c502018 57344 27360 272 8 511 V c4fc018 8192 3336 8 1 Blkpool Summary: name class addr size #blks global V c15a1d8 0 0
The output is divided into two sections, the pool summary and the block pool summary. Both outputs share the following fields:
• Name | Pool name or session ID. The names that consist of numbers only are the session pools. In the example, we have session pools for session 3, session 4, session 510, and session 511. |
• Class | R=resident, V=virtural |
• Addr | Address in shared memory |
The Pool Summary section has these fields that are specific to this section:
• Totalsize | Total size in bytes of the pool |
• Freesize | Total unused bytes in the pool |
• #allocatedfrag | Number of allocated fragments |
• #freefrag | Number of free fragments |
The block pool summary uses these fields in addition to the first three that are common to both sections:
• size | Total size in bytes of the pool |
• #blks | Number of 8K blocks |
The major fields of interest are the pool names, classes, total and free sizes. By looking at these outputs, you can get a pretty good idea of how your memory is being used and what it is being used for.
In the onstat -g mem output above, we showed sessions 2, 3, and 4 as being active. We can use the onstat -g ses command to get more information about these sessions.
$ onstat -g ses 3 Informix Dynamic Server Version 7.30.TC3 -- On-Line -- Up 00:43:59 -- 9536 Kbytes session #RSAM total used id user tty pid hostname threads memory memory 3 informix - 0 - 0 8192 4856 Memory pools count 1 name class addr totalsize freesize #allocfrag #freefrag 3 V c33e018 8192 3336 8 1 name free used name free used overhead 0 120 scb 0 96 ostcb 0 40 sqscb 0 4552 fragman 0 48
We can also invoke the onstat -g ses command without giving it a session number. Depending upon how you get interested in a particular session, you may run onstat -g ses first and then drill down into a particular session of interest.
C:INFORMIXetc>onstat -g ses Informix Dynamic Server Version 7.30.TC3 -- On-Line -- Up 00:46:49 — 9536 Kbytes session #RSAM total used id user tty pid hostname threads memory memory 1104 informix - 0 - 0 8192 4856 4 informix - 0 - 0 16384 13336 3 informix - 0 - 0 8192 4856 2 informix - 0 - 0 8192 4856
The last two fields of the onstat -g mem command discussed allocated and free pool fragments. Use the next three onstat -g commands to further drill down and look at the actual composition of those fragments.
$ onstat -g afr 3 Informix Dynamic Server Version 7.30.TC3 -- On-Line -- Up 00:56:45 -- 9536 Kbytes Allocations for pool name 3: addr size memid c33e000 120 overhead c33e078 48 scb c33eOa8 48 scb C33e0d8 40 ostcb c33el00 4456 sqscb c33f268 48 fragman c33f298 40 sqscb C33f2c0 56 sqscb
Use this command to drill down into the information about memory pools to get information about free pool fragments.
$ onstat -g ffr 3 Informix Dynamic Server Version 7.30.TC3 -- On-Line -- Up 00:58:49 -- 9536 Kbytes Free list for pool name 3: addr size c33f2f8 3336
This command simply summarizes the individual pool fragment allocations from the onstat-g afr command.
$ onstat -g ufr 3 Informix Dynamic Server Version 7.30.TC3 -- On-Line -- Up 01:00:50 -- 9536 Kbytes Memory usage for pool name 3: size memid 120 overhead 96 scb 40 ostcb 4552 sqscb 48 fragman
This onstat -g command shows how much resident and virtual shared memory is allocated and used by the IDS engine.
$ onstat -g seg Informix Dynamic Server Version 7.30.UC3 -- On-Line -- Up 2 days 22:13:20 — 18464 Kbytes Segment Summary: id key addr size ovhd class blkused blkfree 4 1381451777 c0e96000 10706944 1204 R 1300 7 (shared) 1381451777 cl8cc000 8200192 736 V 657 344 Total: - - 18907136 - - 1957 351 (* segment locked in memory)
The id, key, and addr fields are implementation-specific means of identifying the shared memory. They differ between NT and UNIX, which have different mechanisms for handling shared memory. The major points of interest are in the size (total size in bytes), class (Virtual or Resident), blkused (blocks used), and blkfree (blocks free) fields. The total field should equal the final field in the header for onstat in Kbytes.
One highly implementation-specific caveat comes in systems running on the Hewlett Packard HP/UX 10.XX operating system on PA-RISC processors. HP systems have some problems with the algorithms for adding shared memory segments. These systems should be monitored carefully with the onstat -g seg command, and they should never be allowed to grow beyond three shared memory segments. Once the fourth segment is added, performance goes into the toilet. The trick here is to allocate enough shared memory at boot time by adjusting the SHMVIRTSIZE parameter to the largest number that is likely to be needed, and then adjust the SHMADD and SHMTOT parameters in the ONCONFIG file to disallow any dynamic growth of the shared memory beyond the amount already configured in the first three segments. This is a hardware limitation of the PA-RISC processor. This processor has 4 special-purpose address registers that are the only way to access shared memory. If you get more than 4 shared memory segments, the system is forced to thrash by swapping the shared memory base addresses into and out of these 4 registers.
The next two onstat -g commands show bitmaps for the resident and nonresident segments of shared memory. Each bit set represents an 8-kilobyte block in use. If the bit is not set, that block is not in use. This command lists the block usage for the communication message area of the resident section.
$ onstat -g rbm Informix Dynamic Server Version 7.30.UC3 -- On-Line — Up 2 days 22:13:26 — 18464 Kbytes Block bitmap for resident segment address c0e96000: (bitmap address = c0e9625c, bitmap size = 292) c0e9625c:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff c0e9627c:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff c0e9629c:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff c0e962bc:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff c0e962dc:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff c0e962fc:fffffOOO 00000000 eObGOOOO 00000000 00000001 00000001 00000000 00000000 c0e9631c:00000000 00020000 c0e96000 52574801 52574801 00001000 00000001 00a36000 c0e9633c:007d2000 00000000 00000000 00000000 00000000 02000000 c0e96000 00000000 c0e9635c:00000000 00000000 c0e96000 c209eOOO cOe9a800 00000000 00000000 00000000 c0e9637c:00000000
Prints block bit map for the nonresident segments, one bit per 8-kilobyte block. Bit set indicates block free. By converting the values to binary, you can determine how much of the nonresident (virtual) segment is actually in use at any one time. If you see "00000000" for many segments, this indicates that the virtual portion of shared memory is not being used to its fullest and you may be able to cut back on the nonresident portion, thus freeing up memory for other uses.
$ onstat -g nbm Informix Dynamic Server Version 7.30.UC3 -- On-Line — Up 2 days 22:13:31 -- 18464 Kbytes Block bitmap for virtual segment address cl8cc000: (bitmap address = c18cc25c, bitmap size = 128) cl8cc25c:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff cl8cc27c:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff cl8cc29c:ffffffff ffffffff fffffffe ffffffff ffc22888 00000003 80000000 00000000 cl8cc2bc:00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000