18.104.22.168. Database optimization
Using examples of errors encountered by hosting users, we will consider how you can optimize the database.
An example of a table that consists of more than 40,000 records and to which requests are sent, which create a considerable load on the server, since there are a lot of them:
CREATE TABLE `links` ( `id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, `source` INT(11) UNSIGNED NOT NULL, `category` INT(11) UNSIGNED NOT NULL, `title` text NOT NULL, `description` text NOT NULL, `text` text NOT NULL, `link` text NOT NULL, `publication` datetime NOT NULL, `scan` datetime NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `unique` (`id`) USING BTREE, KEY `normal` (`source`,`category`,`title`(100),`publication`,`description`(100),`text`(100),`scan`) USING BTREE ) ENGINE=InnoDB AUTO_INCREMENT=40000 DEFAULT CHARSET=utf8
An example of incoming requests:
SELECT COUNT(*) FROM links WHERE link = 'http://www.......com.ua/31-08-2014/trolleybook-podgotovil-prazdnichnuyu-programmu-k-limba-noastr-/n96349/' AND UNIX_TIMESTAMP(publication) = 1409472019
By executing a request with the prefix
EXPLAIN, we see that the entire table is scanned to get the results — about 40 thousand rows. To optimize the work of the table, add an index on the field
publication. This operation will already be enough to increase the speed of work and reduce the load on the MySQL server processor.
Adding an indexAn index can be added in any of the following ways:
- Through SQLqueries:
ALTER TABLE `table_name` ADD INDEX(`column_name`);
In the request:
table_name- the name of the table,
column_name— the name of the field for which you want to add an index.
As a supplement, we recommend:
- Change field type
- Remove index
UNIQon the field
id... There is no need for it, since this function is already performed by the index.
PRIMARYon the same field.
- Index with title
normaleven more doubtful, since the server has to work hard to create it. It is simply huge, and it is extremely unlikely that the search occurs in all the fields that are specified in it. Most likely it is «dead» an index that hinders rather than helps. Developers should be aware that the index will only be used if the request contains fields that are in it from left to right. So, for example, this index will be used if in the condition
WHEREfields will be specified
source... If the field
sourcewill not be specified in the request, then the index will not be used.
And finally, there is a question regarding the need to use the request itself
COUNT(*). It is very likely that its author wants in this way to obtain information about the presence of a record in the table, and not the total number of records in the table with the specified URL for the specified date. In this case, it is correct to use the query without
COUNT(*) with the condition
SELECT id FROM links WHERE link = 'http://www.......com.ua/31-08-2014/trolleybook-podgotovil-prazdnichnuyu-programmu-k-limba-noastr-/n96349/' AND UNIX_TIMESTAMP(publication) = 1409472019 LIMIT 1
In this case, the database will stop the search when the first record is found, and in the case of
COUNT will scan the entire table.
PS Despite the fact that there is a search in two columns in the SQL query
link, the field was not added to the index
link... This is due to the fact that for this table the field
publication with a date is pretty unique and will contain at most a few lines with different
link... Therefore, the overhead of forming an index for the field
link significantly exceeds the cost of scanning multiple rows at a single index