|
|
|
Происхождение этого драфта неизвестно 1. Introduction1.1 What is performance?This may seem like an obvious question, but it is not. Different people have different views of what they mean by 'performance'. Also, their view may change depending on circumstances. When discussing performance, the first thing to do is to ensure that all the parties involved agree as to what they are talking about. 1.2 Defining terms of referenceThis follows from 1.1 above. When conducting any sort of performance exercise (tuning, investigation, benchmark) one needs to have a clear statement of objectives, metrics, constraints etc. 1.3 Performance investigationsWhen conducting a performance investigation one should be careful to investigate all areas and not just concentrate on one. For example, a poorly designed/implemented application can result in very poor performance no matter how well OnLine is tuned. 2. ApplicationsA very important area. A badly designed or implemented application can drag down the performance of the most highly tuned OnLine server. 2.1 Using ESQL/CProperly used, ESQL/C is the most performant solution for writing Informix client applications. However, it is not necessarily the easiest to use. Most of the performance related issues that apply to ESQL/C apply to ESQL/COBOL, 4GL and NewEra as well. 2.2 SQL usage
2.3 Data Design
2.4 IndexingAvoid highly duplicate indexes. Highly duplicate indexes are inefficient and degrade performance. If you feel you need to index on a column containing highly duplicate values, try making the index a composite with a second column which contains (nearly) unique values. The server can still use this index for access through the duplicate column but the index will be much more efficiently processed. Choose indexes carefully The objective of indexing is to facilitate rapid data access. Only create indexes that give a significant performance improvement to SQL operations. Use the Query Plan and testing to determine the best indexing strategy. Don't over index Too many indexes are as bad as not enough/incorrect indexes. Each index present on a table adds storage overhead. Indexes also add processing overhead to inserts, deletes and maybe updates on the table. From V7.2: don't need ascending and descending indexes on the same columns V7.2 can process an index in reverse so any index is effectively both ascending and descending Use FILLFACTOR to allow room for inserts Optimiser's choice of a suitable index is based on the order of index creation E.g. table with index on colA and another index on colA, colB and colC. A filter on colA would use the index that was created first, which may not be the best one. Utilise index only reads/scans Consider:
If colb is small (e.g. an integer) and cola is indexed then it might be better to make the index a composite on cola and colb. This query could then be satisfied by just reading the index entry without accessing the data row at all. This technique is only worthwhile for frequently executed queries. 2.5 Stored ProceduresCan reduce client /server traffic Use SPs to reduce client server traffic by moving complex application logic to the server (of course this increases the server processing overhead). Avoid small SPs SPs are quite useful for enforcing application logic etc. but they can be a performance hit. Avoid small SPs with few statements. Avoid nested or recursive SPs SPs which call other SPs, which call other SPs : are bad for performance SPs can considerably reduce Online's ability to parallelise queries OnLine cannot currently parallelise most queries involving SPs, though it can parallelise the individual SQL statements within the SP. SPs used in an expression in a WHERE clause will be evaluated for every row examined 2.6 Query PlanThe Optimiser Query Plan (produced by SET EXPLAIN ON) is very useful in looking at the efficiency of SQL statements. One can see how the effects of statistics, data distributions, indexes, fragmentation and SQL statement composition affect query performance. Note that the writing of the query plan has a significant impact on performance so do NOT have SET EXPLAIN ON when doing timings or for live running. 3. Loading data3.1 Load methods:High Performance Loader (Parallel load) Very flexible, parallel load utility. Available as from OnLine 7.2. High performance. Custom ESQL load programs Can provide high performance, depending on sophistication and complexity. Bulk Loader (BLOADER) Earlier version of parallel load utility. Works with older versions of OnLine. Not officially supported. Good performance. DBLOAD Standard load utility. Loads from ASCII files. Adequate performance for smaller data volumes. DBACCESS LOAD statement An easy way to load delimited ASCII files. 3.2 FET_BUF_SIZE affects insert cursor bufferNote that, contrary to the documentation, FET_BUF_SIZE affects the size of the buffer used for insert cursors as well as fetch cursors and is therefore very useful when loading data using ESQL/C, DBLOAD and DBACCESS (but not 4GL). 3.3 Turn off logging during loadingLoading data is faster without logging. 3.4 Add indexes, Referential Integrity and constraints after data is loadedThe presence of indexes, RI etc. can greatly degrade load performance. 4. OnLine4.1 General points4.2 Cooked vs. Raw disk spaceAlways use raw space, never use cooked. By raw space we mean a raw disk partition accessed via a character mode special file (a 'c' in the first position of the output from ls -l). We do not mean: Block mode special files associated with disk partitions (a 'b' in the first position of the output of ls -l) Volume manager control devices Anything else that isn't a proper raw disk partition 4.3 Update StatisticsThis is very important. Ensure that your data distributions and statistics are up to date by running the appropriate UPDATE STATISTICS commands when the distribution of values in any relevant columns has changed significantly. Run UPDATE STATISTICS HIGH for columns that head an index and use MEDIUM for all other columns. There is an ATG utility that will generate appropriate UPDATE STATISTICS statements. Run the updates using multiple simultaneous jobs to improve overall throughput as internal parallelism is low for UPDATE STATISTICS processing. You may need to increase the value of DBUPSPACE environment variable. 4.4 Read AheadUseful for non-parallel sequential scans (data only, index only and index/data). Monitor via onstat -p If sum of ixda-RA, idx-RA, and da-RA is close to RA-pgsused then try increasing read ahead parameters. If the sum is much smaller than RA-pgsused, decrease the parameters. Generally keep RA_PAGES <= 32 and RA_THRESHOLD around half of RA_PAGES 4.5 OLTP vs. DSSOLTP characteristics:
DSS characteristics:
4.6 Disk Layout and FragmentationChoice of disk hardware:
Separate OnLine disk space from other system activity
Separate Physical and Logical logs For any OnLine instance where there is a lot of write activity (insert, update and delete), separate the logical logs and physical logs onto different disks / controllers. Try not to have anything else on the disks where the physical and logical logs are located Alternate logical logs on different disks Avoids contention between current log and log backup. Minimise head movement If possible, allocate raw partitions to minimise disk head movement. Control table fragmentation Try and prevent tables becoming fragmented into many extents. This is especially important with OnLine V5.0 and earlier.
For V7.1 and later:
For V5.0 and earlier
For V6.0
For V7.1 and later
4.7 OPTCOMPINDOLTP - set to 0 For OLTP, one normally wants to inhibit the use of hash joins as they are processor intensive. DSS - set to 2 For DSS, hash joins can give good join performance. 4.8 LRUSMinimum 4, otherwise = NUMCPUVPS 4.9 Page Cleaners< 20 disks, 1 thread per disk >= 20 & <= 100 disks, 1 thread per 2 disks > 100 disks, 1 thread per 4 disks 4.10 LRU_MIN_DIRTY, LRU_MAX_DIRTYUse these to control when page cleaning occurs. For OLTP generally want cleaning to occur evenly and not all at checkpoints 4.11 PDQPRIORITY / MGM ParametersDSS queries are controlled by PDQPRIORITY and MGM. A DSS query is any query for which PDQPRIORITY is > 0. DS_MAX_QUERIES Specifies the maximum number of concurrent DSS queries. DS_TOTAL_MEMORY Specifies maximum memory allowed for all DSS operations. Allocated from the virtual portion. DS_MAX_SCANS Specifies the maximum number of concurrent DSS scans. MAXPDQPRIORITY Specifies the maximum allowed PDQPRIORITY. PDQPRIORITY Controls DSS query resource usage and parallelism. Single DSS queries in isolation For best performance of single DSS queries being run in isolation (e.g. a single query benchmark) set PDQPRIORITY to 100. (also make sure MAXPDQPRIORITY is set to 100 in ONCONFIG file) Multiple DSS queries Choose suitable value for PDQPRIORITY for each query depending on what you want to achieve. Mixed DSS and OLTP workload This is tricky to tune for. The objective is normally to try and prevent the DSS queries degrading the OLTP performance too much. OLTP only Can either disable PDQ altogether (PDQPRIORITY / MAXPDQPRIORITY = 0) or set PDQPRIORITY / MAXPDQPRIORITY to 1 to allow parallel scan only (be very careful not to get throttled by DS_TOTAL_MEMORY, DS_MAX_SCANS, DS_MAX_QUERIES etc.) 4.12 BuffersFor OLTP, want large buffer pool and high cache hit rate For typical OLTP workloads, best performance is achieved when most data access is to a page in the OnLine cache. To achieve this one needs a correctly sized cache. The efficiency of the cache is shown by onstat -p. This shows a %cached figure for both reads and writes. You want to get these as high as possible, within the constraints imposed by available memory. It is often possible to get read hit rates above 95% and write hit rates above 85%. For DSS, a small buffer pool usually suffices For typical DSS workloads, most access is via light sequential scans which do not go through the buffer pool. As a result there is no need for a large buffer pool. Instead the memory should be allocated to DS_TOTAL_MEMORY to make it available for scan buffers, hash joins etc. 4.13 Buffered vs. Unbuffered loggingBuffered logging more performant than unbuffered With buffered logging, the logical log buffer is only flushed to disk when it becomes full. With unbuffered logging, every commit forces a log buffer flush. Note that as the logical log buffer is common to all databases, just one high activity database using unbuffered logging can significantly reduce the benefit of using buffered logging. Buffered logging less 'safe' than unbuffered Because the log buffer is only flushed when it is full, if there is an OnLine or system failure, the contents of the log buffer will be lost. Of course at the time of failure, the buffer may well contain commit records that have not yet been written to disk. OnLine fast recovery will roll back those transactions. All databases will be self consistent but may be inconsistent with the applications view of them as transactions that the application thought were successfully committed have in fact been rolled back. 4.14 LOGBUFFVery important when using buffered logging Less important for unbuffered logging Unless you have a lot of clients doing write activity or are using in-table BLOBS Monitor with onstat -l Tune to have pages/io around about 90%-95% of buffer size on average. Keep increasing until no improvement seen. For mainly unbuffered logging may not be able to get close to this figure. Not too important for systems with little write activity 4.15 PHYSBUFFImportant for instances with a lot of write activity Monitor with onstat -l Tune so that pages/io are around 90%-95% of buffer size on average. 4.16 CKPTINTVLDuring normal operations, two main events cause a checkpoint to occur: the checkpoint interval expiring or the physical log becoming 75% full. The amount of work that occurs between checkpoints determines how long fast recovery will take after a system failure. If recovery time is important for you, configure the checkpoint interval to ensure acceptable recovery time. Otherwise, you can make the interval high and allow the physical log 75% full condition to drive the bulk of your checkpoints. 4.17 KAIO vs. AIOKAIO Kernel AIO (KAIO) is not available in all ports / platforms. Where it is available it is usually the most efficient form of disk I/O but is only available when using raw disk space. When using KAIO, you still need to configure one or two AIO VPs for explain output etc. Informix AIO In the absence of KAIO, the Informix AIO mechanism is a quite efficient. Start with 2 AIO VPs per disk that contains OnLine data (as opposed to logical/physical logs). Monitor AIO VP performance using onstat -g ioq and onstat -g iov. Add or remove AIO VPs as necessary. It does not hurt too much to have a couple of AIO VPs that are mostly idle. 4.18 ConnectivityClient <-> OnLine vs. OnLine <-> OnLine Choose best communications mechanism Local clients: Shared Memory or Stream Pipes For clients running on the same machine as the OnLine server, choose shared memory (ipcshm) or stream pipes (ipcstr) connectivity. Where available, stream pipes is generally faster, more flexible and more secure than shared memory. Avoid network connections (TCP/IP) for local clients as the performance is much reduced compared to ipcshm or ipcstr connections. Remote clients: TCP/IP Use TCP/IP for remote clients. Only use Netware IPX/SPX if there are specific reasons to do so. Investigate effects of varying the fetch buffer size (FET_BUF_SIZE environment variable) and the socket buffer size (set in sqlhosts file) Minimise client/server exchanges using the techniques described earlier. Poll threads and NET VPs Poll threads handle incoming data from clients Data from server to client is sent by individual sqlexec threads A single poll thread can handle about 200 typical clients Varies depending on many factors, only a guideline For lots of clients, very active clients, or large amounts of data input (e.g. lots of inserts), may need more poll threads Poll threads can run in CPU VPs (inline poll) or NET VPs Only one 'inline' protocol at any one time. Inline Poll threads Best performance with lower numbers of clients Helps TCP/IP considerably Imposes additional overhead on CPU VPs Can't have more poll threads than CPU VPs NET VPs for poll threads For large numbers of clients NET VPs may offer better performance Can have as many poll threads as you want Offloads processing from CPU VPs Avoid the Relay Module Try and avoid using the Special Purpose Relay Module to connect pre 6.0 clients to >= 6.0 OnLine. IT IS A PERFORMANCE KILLER. Move to native 6.x/7.x clients ASAP. 4.19 Dictionary, Distribution and Stored Procedure CacheOnLine cache's data distributions, data dictionary information and Stored Procedures Individual caches, separate from main buffer cache Distribution cache Monitor with onstat -g dsc Control size with DS_HASHSIZE (number of hash buckets) and DS_POOLSIZE (maximum number of entries) Dictionary cache Monitor with onstat -g dic Control size with DD_HASHSIZE (number of hash buckets) and DD_HASHMAX (maximum entries per bucket) Stored Procedure cache Monitor with onstat -g prc Control size with PC_HASHSIZE (number of hash buckets) and PC_POOLSIZE (maximum number of entries) 4.20 Monitoring SessionsSessions can be monitored using onstat and SMI For problem sessions, investigate:
4.21 Single CPU VP flagAllows OnLine to eliminate some mutex operations If you are only using one CPU VP and are sure you will not want to add more dynamically, be sure to set the SINGLE_CPU_VP flag on in the ONCONFIG file. This allows OnLine to eliminate many mutex calls used for inter CPU VP synchronisation. 4.22 MULTIPROCESSOR flagOptimises internal OnLine operations for MP systems Set this flag correctly based on the number of processors in your system. It allows OnLine to optimise internal operations such as latching and VP idling based on the type of machine (uniprocessor or multiprocessor). Currently, other than performance, there is no problem if the flag is incorrectly set. This may not remain true in future releases. 4.23 Monitoring OnLine performanceonstat Command line tool. Displays information about many different aspects of OnLine operation. SMI Similar information to onstat but via an SQL interface. Can develop custom monitoring tools. Onperf / Xtree Graphical (X/Motif) based performance tool available on some platforms. Can display query trees within OnLine server. Oncheck Check table fragmentation, index integrity etc. etc. 5. O/S and Hardware5.1 Always look at the whole pictureThere is usually little point in just looking at OnLine's performance metrics. One needs to consider the system as a whole e.g. disk / controller utilisation, CPU utilisation, network utilisation etc. 5.2 O/S tools: sar, vmstat, iostat etc.Unix vendors provide various system monitoring utilities. Some common command line utilities are sar, vmstat and iostat. Many vendors offer graphical utilities (e.g. GlancePlus on HP). You must be very careful when using these utilities. They often report seemingly similar metrics, but the exact meaning of a particular metric can often differ radically between different Unix vendors, or even different Unix releases from the same vendor. Also, these utilities often suffer from subtle bugs that make some metrics unreliable. 5.3 Disks and controllersUse available O/S monitoring tools to investigate disk I/O Try to balance I/O evenly across all disks (not just OnLine disks) and controllers RAID5 can have poor write performance unless there is a large hardware cache Try to keep %busy below ~30%. If it reaches ~60% you will see degraded performance. Do not be fooled into worrying to much about iowait time. This metric is difficult to interpret meaningfully and is mostly misleading. Try to minimise head movement on each disk by careful positioning of raw partitions. OS / hardware mirroring can often help performance Generally helps performance more than OnLine mirroring Does depend on the sophistication of the vendors implementation. 5.4 NetworkDifferent Unix vendors offer different levels of control over TCP/IP parameters. Some you can often affect are: Number of Stream Buffers More buffers may improve performance for heavy client / server loads. Packet Size For reliable networks where clients often send / receive large amounts of data, increasing the packet size may improve throughput. For unreliable networks or situations where clients send / receive small amounts of data a smaller packet size may be better The packet size should correlate to the socket buffer size specified in the sqlhosts file. Connect Queue Length This can affect the performance of connection requests when many requests arrive at once. Increasing this value can reduce the number of client connection retries. 5.6 CPU usageCPU usage monitoring is an important aspect of tuning OnLine performance High system or OnLine CPU usage is not necessarily indicative of a performance problem You want as much of the available to be resource to be utilised as possible. This only becomes a problem where processing is delayed due to a lack of resource. You can monitor the overall CPU usage of OnLine via onstat -p Individual VP CPU usage can be monitored via onstat -g glo Overall system CPU usage can be monitored via sar, vmstat, etc. What should you be looking for:
6. Conclusion6.1 Define the problemClearly define objectives, metrics and criteria. 6.2 Tune the applicationIf the application is badly designed / implemented, nothing else matters much! 6.3 Tune OnLineOptimise OnLine for your target workload. 6.4 Tune the system as a wholeEnsure system resources are adequate and correctly allocated. |
|
Украинская баннерная сеть
|