19 Optimizing Your Queries

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

19.
Optimizing Your Queries

MySQL allows us to analyze individual queries to see how long they take and exactly how they are executed on the database. By looking at this information and armed with an understanding of how MySQL tries to optimize your queries for you, you can sometimes improve performance. In this chapter we'll look at the following:

• Finding slow queries

• Benchmarking your queries

• Using the slow query log

• Using EXPLAIN to see how queries are executed

• Understanding MySQL's built-in query optimization

• Optimization tips

Finding Slow Queries

To optimize any application, you must first find out which parts of it take up the most execution time. There are a few ways to do this:

• Observation: Frequently, the reason we notice that optimization is needed is that one particular query runs at the speed of continental drift.

• Benchmarking: Test your application to see which parts of it are slow.

• Slow query log: This log tracks slow queries, as you might expect from its name.

After a slow query has been identified, you can work out how it is being executed by MySQL by using the EXPLAIN statement and can take steps to optimize the query.

It is not usually necessary to attempt to speed up all queries in an application. Generally, only a small portion of your code takes up most of the execution time. Unless your system is very, very busy, the programming time spent attempting to accelerate parts that do not take much time is wasted.

Benchmarking Your Queries

In any kind of programming optimization problem, benchmarking is useful. By benchmarking, we mean specifically timing how long your queries take. This is best done by running a query many times and seeing how long it takes on average. A single execution of the query will be subject to load issues, so it may give unpredictable results. You will also find that the second time you run a query, it should be quicker because the query is cached.

You can, of course, use external scripts or programs to run queries multiple times. For an example, download the source distribution of MySQL and look at the benchmarking code that is included in the sql-bench directory.

You can also test the speed of evaluation of any expression (including a query) in MySQL by using the built-in BENCHMARK() function. For example:

select benchmark(1000000, 6*9);

This will produce something similar to the following results:

1 row in set (0.25 sec)

The function takes two parameters: the number of times to evaluate the expression (in this case, one million) and the expression we want to evaluate (in this case, six times nine).

We do not really want to see the result of the SELECT query. The BENCHMARK() function always returns zero. What we are interested here is the time the query took to execute. In this sample output, you can see that evaluating 6×9 one million times on my system took a quarter of a second.

You can also pass BENCHMARK() a query, for example:

select benchmark(10000000,
'select employee.name, department.name
from employee, department
where employee.departmentID=department.departmentID'),

Using the Slow Query Log

The slow query log can be used to track which of your queries is running slowly. You define what is meant by “slowly,” but it must be measured in whole seconds.

You can turn on slow query logging with the --log-slow-queries=filename option when starting the MySQL server or in your configuration file. If you also turn on the --log-long-format option, all queries that run without using an index are also logged. This can help you to see where you should be targeting your optimization efforts.

You can define what is meant by a slow query by using the long_query_time variable. You can set this in your configuration file or by using the SET command. This variable is in seconds.

You can read the slow query log manually because it's just a text file. You may find it more useful to view a summary of the queries that are running slowly. You can do this by running the mysqldumpslow script (in the scripts directory of your MySQL installation). This is a Perl script, so Windows users note that you will need to install Perl if you do not already have it. (Get it from www.activestate.com.)

One current limitation of MySQL is that you cannot configure it to log slow queries that take less than one second. On systems that process a large number of relatively simple queries, one second would be very, very slow. The administrator might want to be informed when queries take more than one-tenth of a second or some other fraction. This will be addressed in a future version.

Using EXPLAIN to See How Queries Are Executed

The EXPLAIN command tells MySQL to explain to you how a query will be executed. As a very simple example, we might type this:

explain
select e.name, d.name
from employee e, department d
where e.departmentID = d.departmentID;

As you can see, we have simply prefixed an unremarkable query with the word EXPLAIN. This will not actually execute the query, but will instead return some information about how MySQL plans to execute the query. You should see some results similar to the following:

2 rows in set (0.00 sec)

What does this all mean? You can see at a glance that there is one row in the result set per table in the query. (You can see which table is being discussed in the index column.) The order of the rows shows the order in which the tables will be joined.

The columns in this table are as listed here:

• id This is a sequence number. If there is more than one SELECT in a query—for example, if you are using a subquery—each SELECT is numbered.

• select_type This is the type of SELECT statement being performed. Most of the time this will be SIMPLE as in the preceding example, meaning that it's a plain vanilla SELECT. If you are using subqueries, the outer query will be marked PRIMARY and the inner queries will be marked SUBSELECT or DEPENDENT SUBSELECT for correlated subqueries.

• table This is the table this row is about.

• type This is one of the most important columns for optimization. It tells you how the table is being joined to the other tables in the query.

• possible_keys This tells you which indexes could have been used in the query. It will be NULL if there are no relevant indexes.

• key This tells you which index was selected for use in the query. It will be NULL if no index was selected.

• key_len This is the length of the index MySQL decided to use.

• ref This is the value being compared to the key to decide whether to select rows.

• rows This is an estimate of the number of rows from this table MySQL will read to generate the results of the query. You can work out how many rows will be read overall by multiplying the rows' values together. This gives a basic benchmark for how fast the query will run.

• Extra Additional information may be given in this column. For example, the comment Using index means that MySQL can retrieve the result of the query completely from an index without reading data from the table.

So in this case, what is the output telling us?

The join type ALL for the employee table tells us that all rows will be scanned in this table. This is obviously going to be slow if the table has a lot of data in it. In fact, the join type ALL is the worst possible result. You will get this typically if a table has no useful index. The obvious optimization here is to add an index. We'll look at this in a moment.

The row for department has join type eq_ref, which means that a single row will be read from the department table for each row in the employee table. This is one of the best types. The only better values for type are system and const, meaning that the table has only one matching row and can effectively be treated as a constant. This part of the join we are pretty happy with.

These are the other possible values for type:

• ref All rows with matching index values will be read from the table. This is the next best option after eq_ref and represents situations in which you are dealing with non-unique keys.

• range This is not as good as eq_ref or even ref. It means that all rows in a particular range will be read from the table.

• index This is better than ALL, but worse than the other types mentioned previously. Seeing index means that the complete index will be scanned. This is preferable to scanning the complete table, but is far from ideal.

Next, let's look at the values of the possible_keys and key columns. The department table has one option—PRIMARY, the primary key, which is the one used. The employee table has the value NULL in both of these columns, meaning that there's no key to use and therefore no key will be used. Again, this is a pretty strong hint that we should be adding another index!

Based on this information and assuming that we're going to execute this query reasonably frequently and therefore would like it to be faster, we'll create another index as follows:

create index ename_did on employee(name, departmentID);

If we then rerun EXPLAIN, we get the following output:

2 rows in set (0.00 sec)

You can see that there are some changes. The type for employee is now index because we now have an appropriate index to refer to. The new index is listed as a possible key, but it is not actually being used. Under Extra you will see that only the index for this table is used, rather than the table itself. This should be slightly faster.

The most basic use of EXPLAIN is to see where you can make better use of indexes to speed up your queries, but you might discover other approaches to speed up a query.

Understanding MySQL's Built-In Query Optimization

MySQL applies many optimization rules to queries.

MySQL uses its estimated number of rows (as shown in EXPLAIN) to work out the best order in which to join tables. If you notice that its estimate is off, you may want to experiment with using a STRAIGHT JOIN to force the table order. Benchmarking before and after cases will tell you whether you are helping or hindering.

To choose an index, MySQL looks for the most relevant index that spans less than 30% of the rows. If it can't find an index fitting these criteria, the table will be scanned instead. (This was what happened in the EXPLAIN query we looked at earlier, after we added the new index.)

Expressions in WHERE clauses are optimized in a similar way to the way many programming compilers optimize expressions. For example, unnecessary parentheses in expressions are removed. This is one reason you should feel free to make your queries more readable with parentheses.

If a query can be resolved wholly from indexes, it will be done without any reference to the actual rows in the table. Evaluation of COUNT(*) is also evaluated without reading or counting the rows in a table because this data is stored separately.

A more comprehensive list of the optimizations MySQL performs is in the manual, so we have not reproduced it here. But even that list is incomplete. You can read through the source code if you are interested in more information about how the query optimizer works.

Optimization Tips

There are three main things you can do to optimize your queries:

• Add indexes. If you have a frequent query on an unindexed column, add an index. For more information about how MySQL uses indexing, refer to Chapter 18, “Optimizing Your Database.” Remember though that although an appropriate index might speed up operations that need to find data in the table, keeping indexes up-to-date increases the time required to write data. Do not add indexes that will not be used.

• Use ANALYZE TABLE. (See Chapter 18 for syntax.) This updates the information MySQL stores about key distribution. This information is used to decide the order in which tables are joined. If MySQL seems to be joining your tables in a strange order, try ANALYZE TABLE.

• Use OPTIMIZE TABLE. (See Chapter 18 for syntax.) This defragments the table storage, sorts the indexes, and updates the table statistics as used by the query optimizer.

Summary

• Find slow queries using BENCHMARK() or the slow query log.

• See how queries are executed using EXPLAIN.

• Speed things up by adding indexes and recheck with EXPLAIN.

• ANALYZE TABLE and OPTIMIZE TABLE will help the MySQL query optimizer to do its job well.

Quiz

1.	The `BENCHMARK()` function returns a. the result of the benchmarked expression b. the time taken to execute the benchmarked expression c. zero, regardless of the parameters d. none of the above
2.	You run `EXPLAIN` on a query and it tells you that the `type` for one of the tables in the join is `ALL`. This means that a. only one row will be read from the table b. all the rows with matching index values will be read from the table c. all of the index will be scanned d. all of the table will be scanned
3.	You run `EXPLAIN` on a query and it tells you that the `type` for one of the tables in the join is `eq_ref`. This means that a. only one row will be read from the table b. all the rows with matching index values will be read from the table c. all of the index will be scanned d. all of the table will be scanned
4.	You run `EXPLAIN` on a query and it tells you that the `type` for one of the tables in the join is `index`. This means that a. only one row will be read from the table b. all the rows with matching index values will be read from the table c. all of the index will be scanned d. all of the table will be scanned
5.	The output of `EXPLAIN` tells you that MySQL is joining two tables in a suboptimal order. You should a. run `ANALYZE TABLE` on the two tables b. force the join order using `STRAIGHT JOIN` c. either a) or b) d. neither a) nor b)

Exercises

Consider the following query:

select department.name
from client, assignment, employee, department
where client.name='Telco Inc'
and client.clientID = assignment.clientID
and assignment.employeeID = employee.employeeID
and employee.departmentID = department.departmentID;

Use BENCHMARK to time this query over a large number of executions, and use EXPLAIN to see how it is executed. See whether you can increase the speed of execution by adding an index and rebenchmarking.

Answers

Quiz

1.	c
2.	d
3.	a
4.	c
5.	c

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 19 Optimizing Your Queries

Create new playlist

Sign In

Sign Up

19.Optimizing Your Queries

Finding Slow Queries

Benchmarking Your Queries

Using the Slow Query Log

Using EXPLAIN to See How Queries Are Executed

Understanding MySQL's Built-In Query Optimization

Optimization Tips

Summary

Quiz

Exercises

Answers

Quiz

Table of Contents for
19 Optimizing Your Queries

19.
Optimizing Your Queries