2.6.1.10. Database query optimization
This article explains the logic of optimizing database queries, since most programmers test their programs with a small number of records in the tables, and the site owner's problems begin later, when he fills the product catalogs.
For example, there is a request:
SELECT p.product_id, (SELECT AVG(rating) AS total FROM mc_review r1 WHERE r1.product_id = p.product_id AND r1.STATUS ='1' GROUP BY r1.product_id) AS rating FROM mc_product p LEFT JOIN mc_product_description pd ON (p.product_id = pd.product_id) LEFT JOIN mc_product_to_store p2s ON (p.product_id = p2s.product_id) WHERE pd.language_id = '2' AND p.STATUS = '1' AND p.date_available <= NOW() AND p2s.store_id = '0' AND p.product_id IN (SELECT pt.product_id FROM mc_product_tag pt WHERE pt.language_id = '2' AND LOWER(pt.tag) LIKE '%roxolana%') ORDER BY rating ASC LIMIT 0,20
If the request is executed with the condition EXPLAIN
at the beginning, we will get the query execution scheme:
Sample type | Table | Type | Possible keys | Key | Key length | Link | Strings | Add. information |
---|---|---|---|---|---|---|---|---|
PRIMARY | p | ALL | PRIMARY | 2907 | Using where; Using filesort |
|||
PRIMARY | pd | eq_ref | PRIMARY | PRIMARY | 8 | mebelnyc_db.p.product_id, const | 1 | Using where; Using index |
PRIMARY | p2s | eq_ref | PRIMARY | PRIMARY | 8 | mebelnyc_db.p.product_id, const | 1 | Using where; Using index |
DEPENDENT SUBQUERY | pt | ALL | 6803 | Using where | ||||
DEPENDENT SUBQUERY | r1 | ref | product_id | product_id | 4 | mebelnyc_db.p.product_id | 1 | Using where |
- If we remove from this query the condition
LIMIT
then it will return 2907 records. It is 2907 times that the nested in the condition will be executedSELECT
inquiry. If this part of the query is put into a separate query, then it will reduce the load on the database by 2907/20 = 145 times. Although, judging by the name of the request, we can conclude that in such an interesting way the author of the program tries to count product statistics every time a visitor visits the site, which can be recalculated, for example, once a day or even better — when adding a review to the product and added to a separate column of the tablemc_product
to get rid of this subquery. - In condition
WHERE
we see a subquery that is executed in the conditionIN
... If the author of the program in the conditionIN
specified not a subquery, but just static values, for exampleIN (121, 1235, 43554)
then MySQL would use the index and run quickly. But with nested queries, the situation is completely different — MySQL executes them without using indexes, or rather like this —FIN_IN_SET(p.product_id, '121,1235,43554')
... In such cases, you need to write the request separately, and then substitute the result of its execution into the conditionIN
.