MySQL 故障诊断:一个 ALTER TALBE 执行了很久,你慌不慌?¶
先了解下 MySQL 数据字典¶
当我们对一张大表执行了一个 ALTER TABLE 操作,执行了很久,也不知道是否执行完成,进程在那挂着,此时的你,干瞪眼,进度看不到,进程不敢杀,就问你慌不慌?
在具体解决这个慌乱之前,我们先了解下 MySQL 的数据字典。
数据字典我们用一句话来概括,就是数据的数据,用于查询数据库中数据的信息内容。
MySQL 在 5.7 开始,对数据字典的使用有了很大的改进,使用上更加的方便,提供的能力也更高。performance_schema 可以查询事务信息、获取元数据锁、跟踪事件、统计内存使用情况等等。
到这里你是不是发现了什么?
performance_schema 这个是什么,是不是用它来让我们解除慌张?
有关 MySQL 数据字典,可以看我的另外一个 Chat:
[MySQL 地基基础:数据字典](./MySQL 地基基础:数据字典.md)
使用一些重要的重要功能¶
既然了解到 MySQL 数据字典可以帮助我们,那它是如何实现呢?
我们先看看官网是如何解释的
https://dev.mysql.com/doc/refman/5.7/en/monitor-alter-table-performance-schema.html
You can monitor ALTER TABLE progress for InnoDB tables using Performance Schema.
There are seven stage events that represent different phases of ALTER TABLE. Each stage event reports a running total of WOR_COMPLETED and WORK_ESTIMATED for the overall ALTER TABLE operation as it progresses through its different phases. WORK_ESTIMATED is calculated using a formula that takes into account all of the work that ALTER TABLE performs, and may be revised during ALTER TABLE processing. WORK_COMPLETED and WORK_ESTIMATED values are an abstract representation of all of the work performed by ALTER TABLE.
大意就是我们可以在 Performance Schema 看到 ALTER TABLE 的进度,包括执行的时间,大概剩余的时间。
想要使用这个能力,我们需要开启几个功能。
Enable the stage/innodb/alter% instruments
-> SET ENABLED = 'YES' -> WHERE NAME LIKE 'stage/innodb/alter%';
Enable the stage event consumer tables
-> SET ENABLED = 'YES' -> WHERE NAME ='events_stages_current';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> UPDATE performance_schema.setup_consumers
-> SET ENABLED = 'YES' -> WHERE NAME ='events_stages_history';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> UPDATE performance_schema.setup_consumers
-> SET ENABLED = 'YES' -> WHERE NAME ='events_stages_history_long';
功能开启了,接下来我们进行直观的验证环节。
直观的观察事件执行进度¶
首先,我们有一张大表(你可以用 SysBench 建一个,或者其他各种途径都可以),这里我已经有一张大表 sbtest.sbtest1,表结构如下:
mysql> desc sbtest.sbtest1;
+-------+-----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| k | int(11) | NO | MUL | 0 | |
| c | char(120) | NO | | | |
| pad | char(60) | NO | | | |
+-------+-----------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
数据量 500W:
mysql> select count(\*) from sbtest.sbtest1;
+----------+
| count(\*) |
+----------+
| 5000000 |
+----------+
1 row in set (0.67 sec)
新增一个字段:
重头戏来了,查看一下进度:
mysql> select * from performance_schema.events_stages_current\\G;
**** **** **** **** **** **** ***1. row** **** **** **** **** **** **** *
THREAD_ID: 28 EVENT_ID: 14 END_EVENT_ID: NULL EVENT_NAME: stage/innodb/alter table (read PK and internal sort) SOURCE: TIMER_START: 159726265417733000 TIMER_END: 159819571346680000 TIMER_WAIT: 93305928947000 WORK_COMPLETED: 118256 WORK_ESTIMATED: 302958
NESTING_EVENT_ID: 13
NESTING_EVENT_TYPE: STATEMENT
1 row in set (0.00 sec)
......
mysql> select * from performance_schema.events_stages_current\\G;
**** **** **** **** **** **** ***1. row** **** **** **** **** **** **** *
THREAD_ID: 28 EVENT_ID: 14 END_EVENT_ID: NULL EVENT_NAME: stage/innodb/alter table (read PK and internal sort) SOURCE: TIMER_START: 159726265417733000 TIMER_END: 159910492100061000 TIMER_WAIT: 184226682328000 WORK_COMPLETED: 230688 WORK_ESTIMATED: 302958
多执行几次,发现数据是有变化的,这些内容代表了什么呢? * THREAD_ID:线程 ID * EVENT_ID:事件 ID * END_EVENT_ID:结束事件 ID * EVENT_NAME:事件名称,说明了当前执行的事件 * SOURCE:源码位置 * TIMER_START:事件开始时间(皮秒) * TIMER_END:事件结束时间(皮秒,如果没有执行完成,时间就是当前之间) * TIMER_WAIT:事件等待事件(皮秒) * WORK_COMPLETED:任务完成情况 * WORK_ESTIMATED:任务估算情况 * NESTING_EVENT_ID:事件对应的父事件 ID * NESTING_EVENT_TYPE:父事件类型(STATEMENT、STAGE、WAIT)
收下这个常用的 SQL¶
- 查看事件任务完成情况:
mysql> SELECT pt.INFO, ec.THREAD_ID, ec.EVENT_NAME, ec.WORK_COMPLETED, ec.WORK_ESTIMATED, pt.STATE FROM performance_schema.events_stages_current ec left join performance_schema.threads th on ec.thread_id = th.thread_id left join information_schema.PROCESSLIST pt on th.PROCESSLIST_ID = pt.ID where pt.INFO like 'ALTER%';
+-------------------------------------------+-----------+------------------------------------------------------+----------------+----------------+----------------+
| INFO | THREAD_ID | EVENT_NAME | WORK_COMPLETED | WORK_ESTIMATED | STATE |
+-------------------------------------------+-----------+------------------------------------------------------+----------------+----------------+----------------+
| alter table sbtest.sbtest1 add d char(20) | 28 | stage/innodb/alter table (read PK and internal sort) | 201496 | 308223 | altering table |
+-------------------------------------------+-----------+------------------------------------------------------+----------------+----------------+----------------+
1 row in set (0.25 sec)
- 查看任务完成事件:
mysql> select stmt.sql_text as sql_text, concat(work_completed, '/' , work_estimated) as progress, (stage.timer_end - stmt.timer_start) / 1e12 as current_seconds, (stage.timer_end - stmt.timer_start) / 1e12 * (work_estimated-work_completed) / work_completed as remaining_seconds from performance_schema.events_stages_current stage, performance_schema.events_statements_current stmt where stage.thread_id = stmt.thread_id and stage.nesting_event_id = stmt.event_id;
+-------------------------------------------+---------------+-----------------+--------------------+
| sql_text | progress | current_seconds | remaining_seconds |
+-------------------------------------------+---------------+-----------------+--------------------+
| alter table sbtest.sbtest1 add d char(20) | 135192/308223 | 102.461512532 | 131.13954949201502 |
+-------------------------------------------+---------------+-----------------+--------------------+
1 row in set (0.00 sec)
总结¶
这样我们通过 MySQL 的这个数据字典就可以很直观地看到 ALTER 的执行情况了,当你看到这样的执行进度,是不是就不那么慌了。 ```