ANALYZE

This statement updates the statistics that TiDB builds on tables and indexes. It is recommended to run ANALYZE after performing a large batch update or import of records, or when you notice that query execution plans are sub-optimal.

TiDB will also automatically update its statistics over time as it discovers that they are inconsistent with its own estimates.

Currently, TiDB collects statistical information in two ways: full collection (implemented using the ANALYZE TABLE statement) and incremental collection (implemented using the ANALYZE INCREMENTAL TABLE statement). For detailed usage of these two statements, refer to introduction to statistics

Synopsis

AnalyzeTableStmt

ANALYZE TABLE - 图1

AnalyzeOptionListOpt

ANALYZE TABLE - 图2

AnalyzeOptionList

ANALYZE TABLE - 图3

AnalyzeOption

ANALYZE TABLE - 图4

AnalyzeColumnOption

ANALYZE TABLE - 图5

TableNameList

ANALYZE TABLE - 图6

TableName

ANALYZE TABLE - 图7

ColumnNameList

ANALYZE TABLE - 图8

IndexNameList

ANALYZE TABLE - 图9

PartitionNameList

ANALYZE TABLE - 图10

  1. AnalyzeTableStmt ::=
  2. 'ANALYZE' ( 'TABLE' ( TableNameList ( 'ALL COLUMNS' | 'PREDICATE COLUMNS' ) | TableName ( 'INDEX' IndexNameList? | AnalyzeColumnOption | 'PARTITION' PartitionNameList ( 'INDEX' IndexNameList? | AnalyzeColumnOption )? )? ) | 'INCREMENTAL' 'TABLE' TableName ( 'PARTITION' PartitionNameList )? 'INDEX' IndexNameList? ) AnalyzeOptionListOpt
  3. AnalyzeOptionListOpt ::=
  4. ( WITH AnalyzeOptionList )?
  5. AnalyzeOptionList ::=
  6. AnalyzeOption ( ',' AnalyzeOption )*
  7. AnalyzeOption ::=
  8. ( NUM ( 'BUCKETS' | 'TOPN' | ( 'CMSKETCH' ( 'DEPTH' | 'WIDTH' ) ) | 'SAMPLES' ) ) | ( FLOATNUM 'SAMPLERATE' )
  9. AnalyzeColumnOption ::=
  10. ( 'ALL COLUMNS' | 'PREDICATE COLUMNS' | 'COLUMNS' ColumnNameList )
  11. TableNameList ::=
  12. TableName (',' TableName)*
  13. TableName ::=
  14. Identifier ( '.' Identifier )?
  15. ColumnNameList ::=
  16. Identifier ( ',' Identifier )*
  17. IndexNameList ::=
  18. Identifier ( ',' Identifier )*
  19. PartitionNameList ::=
  20. Identifier ( ',' Identifier )*

Examples

  1. mysql> CREATE TABLE t1 (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, c1 INT NOT NULL);
  2. Query OK, 0 rows affected (0.11 sec)
  1. mysql> INSERT INTO t1 (c1) VALUES (1),(2),(3),(4),(5);
  2. Query OK, 5 rows affected (0.03 sec)
  3. Records: 5 Duplicates: 0 Warnings: 0
  1. mysql> ALTER TABLE t1 ADD INDEX (c1);
  2. Query OK, 0 rows affected (0.30 sec)
  1. mysql> EXPLAIN SELECT * FROM t1 WHERE c1 = 3;
  2. +------------------------+---------+-----------+------------------------+---------------------------------------------+
  3. | id | estRows | task | access object | operator info |
  4. +------------------------+---------+-----------+------------------------+---------------------------------------------+
  5. | IndexReader_6 | 10.00 | root | | index:IndexRangeScan_5 |
  6. | └─IndexRangeScan_5 | 10.00 | cop[tikv] | table:t1, index:c1(c1) | range:[3,3], keep order:false, stats:pseudo |
  7. +------------------------+---------+-----------+------------------------+---------------------------------------------+
  8. 2 rows in set (0.00 sec)

The status of the current statistics is pseudo, which means the statistics is inaccurate.

  1. mysql> ANALYZE TABLE t1;
  2. Query OK, 0 rows affected (0.13 sec)
  3. mysql> EXPLAIN SELECT * FROM t1 WHERE c1 = 3;
  4. +------------------------+---------+-----------+------------------------+-------------------------------+
  5. | id | estRows | task | access object | operator info |
  6. +------------------------+---------+-----------+------------------------+-------------------------------+
  7. | IndexReader_6 | 1.00 | root | | index:IndexRangeScan_5 |
  8. | └─IndexRangeScan_5 | 1.00 | cop[tikv] | table:t1, index:c1(c1) | range:[3,3], keep order:false |
  9. +------------------------+---------+-----------+------------------------+-------------------------------+
  10. 2 rows in set (0.00 sec)

The statistics is now correctly updated and loaded.

MySQL compatibility

TiDB differs from MySQL in both the statistics it collects and how it makes use of statistics during query execution. While this statement is syntactically similar to MySQL, the following differences apply:

  1. TiDB might not include very recently committed changes when running ANALYZE TABLE. After a batch-update of rows, you might need to sleep(1) before executing ANALYZE TABLE in order for the statistics update to reflect these changes. #16570.
  2. ANALYZE TABLE takes significantly longer to execute in TiDB than MySQL. This performance difference can be partially mitigated by enabling fast analyze with SET GLOBAL tidb_enable_fast_analyze=1. Fast analyze makes use of sampling, leading to less accurate statistics. Its usage is still considered experimental.

MySQL does not support the ANALYZE INCREMENTAL TABLE statement. TiDB supports incremental collection of statistics. For detailed usage, refer to incremental collection.

See also