网站运营 | 站长学院 | 技术文档 | 成语 | 歇后语 | 桌面壁纸 | 帝国时代 | 代码收藏 | IP地址查询 | 生活百科 | 生日密码 | CSS压缩 | 用户评论

Table Design and MySQL Index Details

【 hackmysql.com 更新时间:2011-04-16 | 字体:
[导读]It's not often I get to work with a true homemade database design. In this case the customer (or their developer to be accurate) had designed the whole database backend of their website from scratch....
It's not often I get to work with a true homemade database design. In this case the customer (or their developer to be accurate) had designed the whole database backend of their website from scratch. However, problems arose as usual: An overly powerful server seeing sustained high loads and CPU usage by MySQL—which is actually an understatement given that the load would spike to 50+. After enabling slow query logging and examining the log a day later one query stood out in terms of frequency:
# Query_time: 7 Lock_time: 1 Rows_sent: 2 Rows_examined: 45023
SELECT * FROM hints WHERE game_id = 374 ORDER BY date DESC;
Seven seconds is bad for a query this simple, as is examining 45,000 rows to return 2. Furthermore, this query was logged over 4,000 times in one day, meaning MySQL spent about 28,000 seconds a day on this query, or 7 hours. Let's first consider design and later we'll see how it effects the details of indexes. (You may notice the numbers between examples don't match; this is because some of the examples were really taken from the production server and others were re-created lab experiements.)
      mysql> DESCRIBE hints;
      +------------+------------------+------+-----+---------+----------------+
      | Field      | Type             | Null | Key | Default | Extra          |
      +------------+------------------+------+-----+---------+----------------+
      | hints_id   | int(10) unsigned |      | PRI | NULL    | auto_increment |
      | game_id    | int(11)          |      |     | 0       |                |
      | hint_title | text             |      |     |         |                |
      | hint       | text             |      |     |         |                |
      | status     | text             |      |     |         |                |
      | date       | text             |      |     |         |                |
      +------------+------------------+------+-----+---------+----------------+
   
Unless there's some special condition I was not made aware of, the date column should not be type TEXT. What complicates the issue more is that the data in the date column is not normalized. That is, there are different representations of data (dates) in the column: Some dates are typical YYYY-MM-DD format and others are Unix timestamps. The first issue this creates is increased code complexity: Having to account for data that may be in different formats. Secondly, it's space inefficient. A date as TEXT will require 10 + 2 bytes or, times 45,000 rows, 540k. As a DATE column type, 3 bytes or 135k. The third issue involves fixinig the query: How to index a date column as TEXT. Indexes on TEXT columns require a prefix length; that is, since TEXT columns are truly variable you have to tell MySQL how much of it you want to index. In this case, since the data is not normalized and everything counts down to the last second for dates and times, we would have to index 10 bytes, another 450k for the index (minus compression). Finally, it's error prone: A DATE column type will reliaibly have a date. A TEXT column may have a DATE and a recipe for biscuits. Perhaps I'm just being overly zealous and picky; afterall, what's another 450k here or there, or a little more code to determine if the date is DATE or a Unix timestamp? Nothing, actually, but the point is: Why add these complications when they're completely unnecessary? Just use a DATE colum type. In short: KISS.

Index Details

      mysql> SHOW INDEXES FROM hints;
      +-------+------------+----------+--------------+-------------+-------------+----------+
      | Table | Non_unique | Key_name | Seq_in_index | Column_name | Cardinality | Sub_part |
      +-------+------------+----------+--------------+-------------+-------------+----------+
      | hints |          0 | PRIMARY  |            1 | hints_id    |       45021 |     NULL |
      +-------+------------+----------+--------------+-------------+-------------+----------+

      mysql> EXPLAIN SELECT * FROM hints WHERE game_id = 374 ORDER BY date DESC;
      +-------+------+---------------+------+---------+------+-------+-----------------------------+
      | table | type | possible_keys | key  | key_len | ref  | rows  | Extra                       |
      +-------+------+---------------+------+---------+------+-------+-----------------------------+
      | hints | ALL  | NULL          | NULL |    NULL | NULL | 45021 | Using where; Using filesort |
      +-------+------+---------------+------+---------+------+-------+-----------------------------+
   
MySQL executes the query by doing a full table scan (type: ALL) plus a filesort (caused by ORDER BY). Not surprising since there are no indexes MySQL can use (possible_keys: NULL). We at least want to prevent a table scan and reduce the number of rows examined. To do this we can create an index, but the result isn't going to be as spectacular as in was in Case 1: Basic Indexes:
      mysql> CREATE INDEX game_id_date ON hints (game_id, date(10));
      Query OK, 1001 rows affected (0.04 sec)
      Records: 1001  Duplicates: 0  Warnings: 0
      
      mysql> EXPLAIN SELECT * FROM hints WHERE game_id = 374 ORDER BY date DESC;
      +-------+------+---------------+--------------+---------+-------+------+-----------------------------+
      | table | type | possible_keys | key          | key_len | ref   | rows | Extra                       |
      +-------+------+---------------+--------------+---------+-------+------+-----------------------------+
      | hints | ref  | game_id_date  | game_id_date |       5 | const |    1 | Using where; Using filesort |
      +-------+------+---------------+--------------+---------+-------+------+-----------------------------+
   
Success: No table scan and only one row examined. However, Extra doesn't say "Using index" even though our multi-column key includes both game_id and date. MySQL won't retrieve the values from the index because "SELECT * " requires more values: It requires all 6 columns for every matching row but the index only has 2 columns. In Case 1 only columns that were part of the key were included in the SELECT criteria, which is why things worked out better there than here. If we alter the table we can at least get rid of the filesort. The filesort is caused by date being TEXT. If we drop the index, normalize the data, change the column type to DATE, and re-add the index (these commands left out for brevity):
      mysql> EXPLAIN SELECT * FROM hints WHERE game_id = 374 ORDER BY date DESC;
      +-------+------+---------------+--------------+---------+-------+------+-------------+
      | table | type | possible_keys | key          | key_len | ref   | rows | Extra       |
      +-------+------+---------------+--------------+---------+-------+------+-------------+
      | hints | ref  | game_id_date  | game_id_date |       5 | const |    1 | Using where |
      +-------+------+---------------+--------------+---------+-------+------+-------------+
   
Not a slow query anymore. And it goes to show that using efficient column types is important, unless you like filesorts.

A Bigger, Slower Query

The following query was logged about 3,700 in the same day:
# Query_time: 8 Lock_time: 0 Rows_sent: 1 Rows_examined: 303908
SELECT * FROM links WHERE
   game_id = 5 OR (other_id = 200 AND class = 'articles') AND
   has_link = 'true' LIMIT 0,1;
Of the slow queries logged (there were four total), this query was first in terms of the number of rows examined (the previous query we looked at was second). Let's start by looking at the existing indexes and EXPLAIN (the table structure won't help us and it's sufficient to say that any column ending in "_id" is an INTEGER and all others are, unfortunately, TEXT):
      mysql> SHOW INDEXES FROM links;
      +-------+------------+----------+--------------+-------------+-------------+----------+
      | Table | Non_unique | Key_name | Seq_in_index | Column_name | Cardinality | Sub_part |
      +-------+------------+----------+--------------+-------------+-------------+----------+
      | links |          0 | PRIMARY  |            1 | links_id    |      307551 |     NULL |
      | links |          1 | other_id |            1 | other_id    |        NULL |     NULL |
      | links |          1 | game_id  |            1 | game_id     |        NULL |     NULL |
      +-------+------------+----------+--------------+-------------+-------------+----------+

      mysql> EXPLAIN SELECT * FROM links WHERE
          -> game_id = 5 OR (other_id = 200 AND class = 'articles') AND
          -> has_link = 'true' LIMIT 0,1;
      +-------+------+------------------+------+---------+------+--------+-------------+
      | table | type | possible_keys    | key  | key_len | ref  | rows   | Extra       |
      +-------+------+------------------+------+---------+------+--------+-------------+
      | links | ALL  | other_id,game_id | NULL |    NULL | NULL | 307551 | Using where |
      +-------+------+------------------+------+---------+------+--------+-------------+
   
Of course EXPLAIN predicts a full table scan (type: ALL) because there's no suitable key to do otherwise (key: NULL). Something has to be done about this query but simply adding an index like we usually do isn't so easy anymore because there's multiple conditions. In this case we get lucky:
      mysql> SELECT DISTINCT(has_link) FROM links;
      +----------+
      | has_link |
      +----------+
      | true     |
      +----------+
   
Ah ha! If every has_link = 'true' then we probably don't need that AND condition. (Not to mention, referring to effecient column types again, has_link as TEXT 1.8 Megs, as BOOL (TINYINT) 308k.) Now the query becomes:
SELECT * FROM links WHERE game_id = 5 OR (other_id = 200 AND class = 'articles') LIMIT 0,1;
We would like MySQL to use an index to find matching rows (avoid a table scan) but this won't be possible given the OR conditions. For it to be possible, the conditions have to form a leftmost prefix of an index. In this case there are two conditions, one on each side of OR. You can't have one leftmost prefix OR another leftmost prefix for the same index. In other words, an index can't start with game_id OR other_id. Therefore, MySQL 4.x cannot use an index with this query. We're left with two options: Reduce the query even further or use MySQL 5.x.

In this case we can simply make the query into two queries:
      mysql> EXPLAIN SELECT * FROM links WHERE game_id = 5 LIMIT 0,1;
      +-------+------+---------------+------+---------+-------+------+-------------+
      | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
      +-------+------+---------------+------+---------+-------+------+-------------+
      | links | ref  | gid           | gid  |       5 | const |   36 | Using where |
      +-------+------+---------------+------+---------+-------+------+-------------+

      mysql> EXPLAIN SELECT * FROM links WHERE other_id = 200 AND class = 'articles' LIMIT 0,1;
      +-------+------+---------------+------+---------+-------+------+-------------+
      | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
      +-------+------+---------------+------+---------+-------+------+-------------+
      | links | ref  | oid           | oid  |       5 | const |    1 | Using where |
      +-------+------+---------------+------+---------+-------+------+-------------+
   
This is a much better approach: These two queries can execute in a fraction of the time than the original. And since either or will match what we're looking for we can try the first query, if it matches we're done, if not try the second.

Or we can use the index merge ability in MySQL 5. Don't feel alone if you've ever wanted to yell at MySQL 4 for not just using multiple indexes when it was clear that was the solution. MySQL 5 has heard our lament:
      mysql> EXPLAIN SELECT * FROM links WHERE game_id = 5 OR (other_id = 200 AND class = 'articles') LIMIT 0,1;
      +----+-------------+-------+-------------+---------------+---------+---------+------+------+-------------+
      | id | select_type | table | type        | possible_keys | key     | key_len | ref  | rows | Extra       |
      +----+-------------+-------+-------------+---------------+---------+---------+------+------+-------------+
      |  1 | SIMPLE      | links | index_merge | gid,oid       | gid,oid | 5,5     | NULL |   37 | Using where |
      +----+-------------+-------+-------------+---------------+---------+---------+------+------+-------------+
   
Notice in the previous two queries the first matched 36 rows, the second 1 for 37 total and this index merge matches 37 rows—fun stuff huh? Makes you feel better that "it all adds up."

Final Thoughts

The customer had the correct idea: Redesign the whole database. Although the queries could have been fixed well enough to keep the server load reasonable, this is a case where form hinders function. A well designed database—following normal form rules, normalizing data, using efficient column types, reducing query conditions—is much easier to optimze.

原文:http://www.chinawobo.com/case2

友荐云推荐
  • 转载请注明来源:网站运营 网址:http://www.chinawobo.com/ 向您的朋友推荐此文章
  • 特别声明: 本站除部分特别声明禁止转载的专稿外的其他文章可以自由转载,但请务必注明出处和原始作者。文章版权归文章原始作者所有<。对于被本站转载文章的个人和网站<,我们表示深深的谢意。如果本站转载的文章有版权问题请联系我们,我们会尽快予以更正<<。
RSS订阅
  • QQ邮箱
  • 填写您的邮件地址,订阅我们的精彩内容:
更多
© 2014 网站运营 - T086.com(原itlearner.com)
  • 歹徒抢劫超市前请人占卜 梦到头发着火代表会有钱? 2018-05-27
  • 薪资倒退至17年前水平 台北网友叹买房困难 2018-05-27
  • 四川馆藏国家重点档案195万余卷 2018-05-27
  • 张天爱变胖了?淡雅甜笑险露双下巴 2018-05-26
  • 论对汉文化东传的基本认识 2018-05-26
  • 4月台州全市新建商品住宅成交3444套 2018-05-25
  • 河北盐山:“点、线、面”结合推进人才强县 2018-05-25
  • 北京市幼儿园年内完成全员培训 2018-05-25
  • 河南职业技术学院全国技能(导游类)大赛获佳绩 2018-05-24
  • Футбол -- Арены ЧМ-2018 -- Стадион Нижний Новгород 2018-05-24
  • 美媒:美国对华战略缺乏头脑 2018-05-23
  • 中国钢企布局沿海 柳钢瞄准东盟经济圈 2018-05-23
  • 2018的偶像元年是否只是一场美丽的谎言?偶像练习生创造101选秀 2018-05-22
  • 上海临港有个“儿童大学” 2018-05-20
  • 2017年世界大学学术排名发布 清华进前50大学学术排名大学排行榜 2018-05-20
  • 微商货源 | 冠珠陶瓷 | 6688电视家官网 | 中康体检网 | 安徽极热网 | 北京赛车开奖号码 | 快猴网 | 迪威乐云商devmsn | 易奇八字 | wwe美国职业摔角 | 八字算命 | 河南旅游景点大全 | 冠珠陶瓷 | 广东旅游景点大全 |
    RunTime:18.63ms QueryTime:7