Dynamic Partition

Dynamic partition is a new feature introduced in Doris version 0.12. It’s designed to manage partition’s Time-to-Life (TTL), reducing the burden on users.

At present, the functions of dynamically adding partitions and dynamically deleting partitions are realized.

Dynamic partitioning is only supported for Range partitions.

Node: This feature will be disabled when synchronized by CCR. If this table is copied by CCR, that is, PROPERTIES contains is_being_synced = true, it will be displayed as enabled in show create table, but will not actually take effect. When is_being_synced is set to false, these features will resume working, but the is_being_synced property is for CCR peripheral modules only and should not be manually set during CCR synchronization.

Noun Interpretation

  • FE: Frontend, the front-end node of Doris. Responsible for metadata management and request access.
  • BE: Backend, Doris’s back-end node. Responsible for query execution and data storage.

Principle

In some usage scenarios, the user will partition the table according to the day and perform routine tasks regularly every day. At this time, the user needs to manually manage the partition. Otherwise, the data load may fail because the user does not create a partition. This brings additional maintenance costs to the user.

Through the dynamic partitioning feature, users can set the rules of dynamic partitioning when building tables. FE will start a background thread to create or delete partitions according to the rules specified by the user. Users can also change existing rules at runtime.

Usage

Establishment of tables

The rules for dynamic partitioning can be specified when the table is created or modified at runtime. Currently,dynamic partition rules can only be set for partition tables with single partition columns.

  • Specified when creating table

    1. CREATE TABLE tbl1
    2. (...)
    3. PROPERTIES
    4. (
    5. "dynamic_partition.prop1" = "value1",
    6. "dynamic_partition.prop2" = "value2",
    7. ...
    8. )
  • Modify at runtime

    1. ALTER TABLE tbl1 SET
    2. (
    3. "dynamic_partition.prop1" = "value1",
    4. "dynamic_partition.prop2" = "value2",
    5. ...
    6. )

Dynamic partition rule parameters

The rules of dynamic partition are prefixed with dynamic_partition.:

  • dynamic_partition.enable

    Whether to enable the dynamic partition feature. Can be specified as TRUE or FALSE. If not filled, the default is TRUE. If it is FALSE, Doris will ignore the dynamic partitioning rules of the table.

  • dynamic_partition.time_unit(required parameters)

    The unit for dynamic partition scheduling. Can be specified as HOUR,DAY, WEEK, MONTH and YEAR, means to create or delete partitions by hour, day, week, month and year, respectively.

    When specified as HOUR, the suffix format of the dynamically created partition name is yyyyMMddHH, for example, 2020032501. When the time unit is HOUR, the data type of partition column cannot be DATE.

    When specified as DAY, the suffix format of the dynamically created partition name is yyyyMMdd, for example, 20200325.

    When specified as WEEK, the suffix format of the dynamically created partition name is yyyy_ww. That is, the week of the year of current date. For example, the suffix of the partition created for 2020-03-25 is 2020_13, indicating that it is currently the 13th week of 2020.

    When specified as MONTH, the suffix format of the dynamically created partition name is yyyyMM, for example, 202003.

    When specified as YEAR, the suffix format of the dynamically created partition name is yyyy, for example, 2020.

  • dynamic_partition.time_zone

    The time zone of the dynamic partition, if not filled in, defaults to the time zone of the current machine’s system, such as Asia/Shanghai, if you want to know the supported TimeZone, you can found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones.

  • dynamic_partition.start

    The starting offset of the dynamic partition, usually a negative number. Depending on the time_unit attribute, based on the current day (week / month), the partitions with a partition range before this offset will be deleted. If not filled, the default is -2147483648, that is, the history partition will not be deleted.

  • dynamic_partition.end(required parameters)

    The end offset of the dynamic partition, usually a positive number. According to the difference of the time_unit attribute, the partition of the corresponding range is created in advance based on the current day (week / month).

  • dynamic_partition.prefix(required parameters)

    The dynamically created partition name prefix.

  • dynamic_partition.buckets

    The number of buckets corresponding to the dynamically created partitions.

  • dynamic_partition.replication_num

    The replication number of dynamic partition.If not filled in, defaults to the number of table’s replication number.

  • dynamic_partition.start_day_of_week

    When time_unit is WEEK, this parameter is used to specify the starting point of the week. The value ranges from 1 to 7. Where 1 is Monday and 7 is Sunday. The default is 1, which means that every week starts on Monday.

  • dynamic_partition.start_day_of_month

    When time_unit is MONTH, this parameter is used to specify the start date of each month. The value ranges from 1 to 28. 1 means the 1st of every month, and 28 means the 28th of every month. The default is 1, which means that every month starts at 1st. The 29, 30 and 31 are not supported at the moment to avoid ambiguity caused by lunar years or months.

  • dynamic_partition.create_history_partition

    The default is false. When set to true, Doris will automatically create all partitions, as described in the creation rules below. At the same time, the parameter max_dynamic_partition_num of FE will limit the total number of partitions to avoid creating too many partitions at once. When the number of partitions expected to be created is greater than max_dynamic_partition_num, the operation will fail.

    When the start attribute is not specified, this parameter has no effect.

  • dynamic_partition.history_partition_num

    When create_history_partition is true, this parameter is used to specify the number of history partitions. The default value is -1, which means it is not set.

  • dynamic_partition.hot_partition_num

    Specify how many of the latest partitions are hot partitions. For hot partition, the system will automatically set its storage_medium parameter to SSD, and set storage_cooldown_time.

    Note: If there is no SSD disk path under the storage path, configuring this parameter will cause dynamic partition creation to fail.

    hot_partition_num is all partitions in the previous n days and in the future.

    Let us give an example. Suppose today is 2021-05-20, partition by day, and the properties of dynamic partition are set to: hot_partition_num=2, end=3, start=-3. Then the system will automatically create the following partitions, and set the storage_medium and storage_cooldown_time properties:

    1. p20210517: ["2021-05-17", "2021-05-18") storage_medium=HDD storage_cooldown_time=9999-12-31 23:59:59
    2. p20210518: ["2021-05-18", "2021-05-19") storage_medium=HDD storage_cooldown_time=9999-12-31 23:59:59
    3. p20210519: ["2021-05-19", "2021-05-20") storage_medium=SSD storage_cooldown_time=2021-05-21 00:00:00
    4. p20210520: ["2021-05-20", "2021-05-21") storage_medium=SSD storage_cooldown_time=2021-05-22 00:00:00
    5. p20210521: ["2021-05-21", "2021-05-22") storage_medium=SSD storage_cooldown_time=2021-05-23 00:00:00
    6. p20210522: ["2021-05-22", "2021-05-23") storage_medium=SSD storage_cooldown_time=2021-05-24 00:00:00
    7. p20210523: ["2021-05-23", "2021-05-24") storage_medium=SSD storage_cooldown_time=2021-05-25 00:00:00
  • dynamic_partition.reserved_history_periods

    The range of reserved history periods. It should be in the form of [yyyy-MM-dd,yyyy-MM-dd],[...,...] while the dynamic_partition.time_unit is “DAY, WEEK, MONTH and YEAR”. And it should be in the form of [yyyy-MM-dd HH:mm:ss,yyyy-MM-dd HH:mm:ss],[...,...] while the dynamic_partition.time_unitis "HOUR". And no more spaces expected. The default value is“NULL”`, which means it is not set.

    Let us give an example. Suppose today is 2021-09-06,partitioned by day, and the properties of dynamic partition are set to:

    time_unit="DAY/WEEK/MONTH/YEAR", end=3, start=-3, reserved_history_periods="[2020-06-01,2020-06-20],[2020-10-31,2020-11-15]".

    The system will automatically reserve following partitions in following period :

    1. ["2020-06-01","2020-06-20"],
    2. ["2020-10-31","2020-11-15"]

    or

    time_unit="HOUR", end=3, start=-3, reserved_history_periods="[2020-06-01 00:00:00,2020-06-01 03:00:00]".

    The system will automatically reserve following partitions in following period :

    1. ["2020-06-01 00:00:00","2020-06-01 03:00:00"]

    Otherwise, every [...,...] in reserved_history_periods is a couple of properties, and they should be set at the same time. And the first date can’t be larger than the second one.

  • dynamic_partition.storage_medium

    SinceVersion 1.2.3

    Specifies the default storage medium for the created dynamic partition. HDD is the default, SSD can be selected.

    Note that when set to SSD, the hot_partition_num property will no longer take effect, all partitions will default to SSD storage media and the cooldown time will be 9999-12-31 23:59:59.

Create History Partition Rules

When create_history_partition is true, i.e. history partition creation is enabled, Doris determines the number of history partitions to be created based on dynamic_partition.start and dynamic_partition.history_partition_num.

Assuming the number of history partitions to be created is expect_create_partition_num, the number is as follows according to different settings.

  1. create_history_partition = true

    • dynamic_partition.history_partition_num is not set, i.e. -1.
      expect_create_partition_num = end - start;

    • dynamic_partition.history_partition_num is set
      expect_create_partition_num = end - max(start, -history_partition_num);

  2. create_history_partition = false
    No history partition will be created, expect_create_partition_num = end - 0;

When expect_create_partition_num is greater than max_dynamic_partition_num (default 500), creating too many partitions is prohibited.

Examples:

  1. Suppose today is 2021-05-20, partition by day, and the attributes of dynamic partition are set to create_history_partition=true, end=3, start=-3, history_partition_num=1, then the system will automatically create the following partitions.

    1. p20210519
    2. p20210520
    3. p20210521
    4. p20210522
    5. p20210523
  2. history_partition_num=5 and keep the rest attributes as in 1, then the system will automatically create the following partitions.

    1. p20210517
    2. p20210518
    3. p20210519
    4. p20210520
    5. p20210521
    6. p20210522
    7. p20210523
  3. history_partition_num=-1 i.e., if you do not set the number of history partitions and keep the rest of the attributes as in 1, the system will automatically create the following partitions.

    1. p20210517
    2. p20210518
    3. p20210519
    4. p20210520
    5. p20210521
    6. p20210522
    7. p20210523

Notice

If some partitions between dynamic_partition.start and dynamic_partition.end are lost due to some unexpected circumstances when using dynamic partition, the lost partitions between the current time and dynamic_partition.end will be recreated, but the lost partitions between dynamic_partition.start and the current time will not be recreated.

Example

  1. Table tbl1 partition column k1, type is DATE, create a dynamic partition rule. By day partition, only the partitions of the last 7 days are kept, and the partitions of the next 3 days are created in advance.

    1. CREATE TABLE tbl1
    2. (
    3. k1 DATE,
    4. ...
    5. )
    6. PARTITION BY RANGE(k1) ()
    7. DISTRIBUTED BY HASH(k1)
    8. PROPERTIES
    9. (
    10. "dynamic_partition.enable" = "true",
    11. "dynamic_partition.time_unit" = "DAY",
    12. "dynamic_partition.start" = "-7",
    13. "dynamic_partition.end" = "3",
    14. "dynamic_partition.prefix" = "p",
    15. "dynamic_partition.buckets" = "32"
    16. );

    Suppose the current date is 2020-05-29. According to the above rules, tbl1 will produce the following partitions:

    1. p20200529: ["2020-05-29", "2020-05-30")
    2. p20200530: ["2020-05-30", "2020-05-31")
    3. p20200531: ["2020-05-31", "2020-06-01")
    4. p20200601: ["2020-06-01", "2020-06-02")

    On the next day, 2020-05-30, a new partition will be created p20200602: [" 2020-06-02 "," 2020-06-03 ")

    On 2020-06-06, because dynamic_partition.start is set to 7, the partition 7 days ago will be deleted, that is, the partition p20200529 will be deleted.

  2. Table tbl1 partition column k1, type is DATETIME, create a dynamic partition rule. Partition by week, only keep the partition of the last 2 weeks, and create the partition of the next 2 weeks in advance.

    1. CREATE TABLE tbl1
    2. (
    3. k1 DATETIME,
    4. ...
    5. )
    6. PARTITION BY RANGE(k1) ()
    7. DISTRIBUTED BY HASH(k1)
    8. PROPERTIES
    9. (
    10. "dynamic_partition.enable" = "true",
    11. "dynamic_partition.time_unit" = "WEEK",
    12. "dynamic_partition.start" = "-2",
    13. "dynamic_partition.end" = "2",
    14. "dynamic_partition.prefix" = "p",
    15. "dynamic_partition.buckets" = "8"
    16. );

    Suppose the current date is 2020-05-29, which is the 22nd week of 2020. The default week starts on Monday. Based on the above rules, tbl1 will produce the following partitions:

    1. p2020_22: ["2020-05-25 00:00:00", "2020-06-01 00:00:00")
    2. p2020_23: ["2020-06-01 00:00:00", "2020-06-08 00:00:00")
    3. p2020_24: ["2020-06-08 00:00:00", "2020-06-15 00:00:00")

    The start date of each partition is Monday of the week. At the same time, because the type of the partition column k1 is DATETIME, the partition value will fill the hour, minute and second fields, and all are 0.

    On 2020-06-15, the 25th week, the partition 2 weeks ago will be deleted, ie p2020_22 will be deleted.

    In the above example, suppose the user specified the start day of the week as "dynamic_partition.start_day_of_week" = "3", that is, set Wednesday as the start of week. The partition is as follows:

    1. p2020_22: ["2020-05-27 00:00:00", "2020-06-03 00:00:00")
    2. p2020_23: ["2020-06-03 00:00:00", "2020-06-10 00:00:00")
    3. p2020_24: ["2020-06-10 00:00:00", "2020-06-17 00:00:00")

    That is, the partition ranges from Wednesday of the current week to Tuesday of the next week.

    • Note: 2019-12-31 and 2020-01-01 are in same week, if the starting date of the partition is 2019-12-31, the partition name is p2019_53, if the starting date of the partition is 2020-01 -01, the partition name is p2020_01.
  3. Table tbl1 partition column k1, type is DATE, create a dynamic partition rule. Partition by month without deleting historical partitions, and create partitions for the next 2 months in advance. At the same time, set the starting date on the 3rd of each month.

    1. CREATE TABLE tbl1
    2. (
    3. k1 DATE,
    4. ...
    5. )
    6. PARTITION BY RANGE(k1) ()
    7. DISTRIBUTED BY HASH(k1)
    8. PROPERTIES
    9. (
    10. "dynamic_partition.enable" = "true",
    11. "dynamic_partition.time_unit" = "MONTH",
    12. "dynamic_partition.end" = "2",
    13. "dynamic_partition.prefix" = "p",
    14. "dynamic_partition.buckets" = "8",
    15. "dynamic_partition.start_day_of_month" = "3"
    16. );

    Suppose the current date is 2020-05-29. Based on the above rules, tbl1 will produce the following partitions:

    1. p202005: ["2020-05-03", "2020-06-03")
    2. p202006: ["2020-06-03", "2020-07-03")
    3. p202007: ["2020-07-03", "2020-08-03")

    Because dynamic_partition.start is not set, the historical partition will not be deleted.

    Assuming that today is 2020-05-20, and set 28th as the start of each month, the partition range is:

    1. p202004: ["2020-04-28", "2020-05-28")
    2. p202005: ["2020-05-28", "2020-06-28")
    3. p202006: ["2020-06-28", "2020-07-28")

Modify Dynamic Partition Properties

You can modify the properties of the dynamic partition with the following command

  1. ALTER TABLE tbl1 SET
  2. (
  3. "dynamic_partition.prop1" = "value1",
  4. ...
  5. );

The modification of certain attributes may cause conflicts. Assume that the partition granularity was DAY and the following partitions have been created:

  1. p20200519: ["2020-05-19", "2020-05-20")
  2. p20200520: ["2020-05-20", "2020-05-21")
  3. p20200521: ["2020-05-21", "2020-05-22")

If the partition granularity is changed to MONTH at this time, the system will try to create a partition with the range ["2020-05-01", "2020-06-01"), and this range conflicts with the existing partition. So it cannot be created. And the partition with the range ["2020-06-01", "2020-07-01") can be created normally. Therefore, the partition between 2020-05-22 and 2020-05-30 needs to be filled manually.

Check Dynamic Partition Table Scheduling Status

You can further view the scheduling of dynamic partitioned tables by using the following command:

  1. mysql> SHOW DYNAMIC PARTITION TABLES;
  2. +-----------+--------+----------+-------------+------+--------+---------+-----------+----------------+---------------------+--------+------------------------+----------------------+-------------------------+
  3. | TableName | Enable | TimeUnit | Start | End | Prefix | Buckets | StartOf | LastUpdateTime | LastSchedulerTime | State | LastCreatePartitionMsg | LastDropPartitionMsg | ReservedHistoryPeriods |
  4. +-----------+--------+----------+-------------+------+--------+---------+-----------+----------------+---------------------+--------+------------------------+----------------------+-------------------------+
  5. | d3 | true | WEEK | -3 | 3 | p | 1 | MONDAY | N/A | 2020-05-25 14:29:24 | NORMAL | N/A | N/A | [2021-12-01,2021-12-31] |
  6. | d5 | true | DAY | -7 | 3 | p | 32 | N/A | N/A | 2020-05-25 14:29:24 | NORMAL | N/A | N/A | NULL |
  7. | d4 | true | WEEK | -3 | 3 | p | 1 | WEDNESDAY | N/A | 2020-05-25 14:29:24 | NORMAL | N/A | N/A | NULL |
  8. | d6 | true | MONTH | -2147483648 | 2 | p | 8 | 3rd | N/A | 2020-05-25 14:29:24 | NORMAL | N/A | N/A | NULL |
  9. | d2 | true | DAY | -3 | 3 | p | 32 | N/A | N/A | 2020-05-25 14:29:24 | NORMAL | N/A | N/A | NULL |
  10. | d7 | true | MONTH | -2147483648 | 5 | p | 8 | 24th | N/A | 2020-05-25 14:29:24 | NORMAL | N/A | N/A | NULL |
  11. +-----------+--------+----------+-------------+------+--------+---------+-----------+----------------+---------------------+--------+------------------------+----------------------+-------------------------+
  12. 7 rows in set (0.02 sec)
  • LastUpdateTime: The last time of modifying dynamic partition properties
  • LastSchedulerTime: The last time of performing dynamic partition scheduling
  • State: The state of the last execution of dynamic partition scheduling
  • LastCreatePartitionMsg: Error message of the last time to dynamically add partition scheduling
  • LastDropPartitionMsg: Error message of the last execution of dynamic deletion partition scheduling

Advanced Operation

FE Configuration Item

  • dynamic_partition_enable

    Whether to enable Doris’s dynamic partition feature. The default value is false, which is off. This parameter only affects the partitioning operation of dynamic partition tables, not normal tables. You can modify the parameters in fe.conf and restart FE to take effect. You can also execute the following commands at runtime to take effect:

    MySQL protocol:

    ADMIN SET FRONTEND CONFIG ("dynamic_partition_enable" = "true")

    HTTP protocol:

    curl --location-trusted -u username:password -XGET http://fe_host:fe_http_port/api/_set_config?dynamic_partition_enable=true

    To turn off dynamic partitioning globally, set this parameter to false.

  • dynamic_partition_check_interval_seconds

    The execution frequency of dynamic partition threads defaults to 3600 (1 hour), that is, scheduling is performed every 1 hour. You can modify the parameters in fe.conf and restart FE to take effect. You can also modify the following commands at runtime:

    MySQL protocol:

    ADMIN SET FRONTEND CONFIG ("dynamic_partition_check_interval_seconds" = "7200")

    HTTP protocol:

    curl --location-trusted -u username:password -XGET http://fe_host:fe_http_port/api/_set_config?dynamic_partition_check_interval_seconds=432000

Converting dynamic and manual partition tables to each other

For a table, dynamic and manual partitioning can be freely converted, but they cannot exist at the same time, there is and only one state.

Converting Manual Partitioning to Dynamic Partitioning

If a table is not dynamically partitioned when it is created, it can be converted to dynamic partitioning at runtime by modifying the dynamic partitioning properties with ALTER TABLE, an example of which can be seen with HELP ALTER TABLE.

When dynamic partitioning feature is enabled, Doris will no longer allow users to manage partitions manually, but will automatically manage partitions based on dynamic partition properties.

NOTICE: If dynamic_partition.start is set, historical partitions with a partition range before the start offset of the dynamic partition will be deleted.

Converting Dynamic Partitioning to Manual Partitioning

The dynamic partitioning feature can be disabled by executing ALTER TABLE tbl_name SET ("dynamic_partition.enable" = "false") and converting it to a manual partition table.

When dynamic partitioning feature is disabled, Doris will no longer manage partitions automatically, and users will have to create or delete partitions manually by using ALTER TABLE.

Common problem

  1. After creating the dynamic partition table, it prompts Could not create table with dynamic partition when fe config dynamic_partition_enable is false

    Because the main switch of dynamic partition, that is, the configuration of FE dynamic_partition_enable is false, the dynamic partition table cannot be created.

    At this time, please modify the FE configuration file, add a line dynamic_partition_enable=true, and restart FE. Or execute the command ADMIN SET FRONTEND CONFIG (“dynamic_partition_enable” = “true”) to turn on the dynamic partition switch.

  2. Replica settings for dynamic partitions

    Dynamic partitions are automatically created by scheduling logic inside the system. When creating a partition automatically, the partition properties (including the number of replicas of the partition, etc.) are all prefixed with dynamic_partition, rather than the default properties of the table. for example:

    1. CREATE TABLE tbl1 (
    2. `k1` int,
    3. `k2` date
    4. )
    5. PARTITION BY RANGE(k2)()
    6. DISTRIBUTED BY HASH(k1) BUCKETS 3
    7. PROPERTIES
    8. (
    9. "dynamic_partition.enable" = "true",
    10. "dynamic_partition.time_unit" = "DAY",
    11. "dynamic_partition.end" = "3",
    12. "dynamic_partition.prefix" = "p",
    13. "dynamic_partition.buckets" = "32",
    14. "dynamic_partition.replication_num" = "1",
    15. "dynamic_partition.start" = "-3",
    16. "replication_num" = "3"
    17. );

    In this example, no initial partition is created (partition definition in PARTITION BY clause is empty), and DISTRIBUTED BY HASH(k1) BUCKETS 3, "replication_num" = "3", "dynamic_partition is set. replication_num" = "1 and "dynamic_partition.buckets" = "32".

    We make the first two parameters the default parameters for the table, and the last two parameters the dynamic partition-specific parameters.

    When the system automatically creates a partition, it will use the two configurations of bucket number 32 and replica number 1 (that is, parameters dedicated to dynamic partitions). Instead of the two configurations of bucket number 3 and replica number 3.

    When a user manually adds a partition through the ALTER TABLE tbl1 ADD PARTITION statement, the two configurations of bucket number 3 and replica number 3 (that is, the default parameters of the table) will be used.

    That is, dynamic partitioning uses a separate set of parameter settings. The table’s default parameters are used only if no dynamic partition-specific parameters are set. as follows:

    1. CREATE TABLE tbl2 (
    2. `k1` int,
    3. `k2` date
    4. )
    5. PARTITION BY RANGE(k2)()
    6. DISTRIBUTED BY HASH(k1) BUCKETS 3
    7. PROPERTIES
    8. (
    9. "dynamic_partition.enable" = "true",
    10. "dynamic_partition.time_unit" = "DAY",
    11. "dynamic_partition.end" = "3",
    12. "dynamic_partition.prefix" = "p",
    13. "dynamic_partition.start" = "-3",
    14. "dynamic_partition.buckets" = "32",
    15. "replication_num" = "3"
    16. );

    In this example, if dynamic_partition.replication_num is not specified separately, the default parameter of the table is used, which is "replication_num" = "3".

    And the following example:

    1. CREATE TABLE tbl3 (
    2. `k1` int,
    3. `k2` date
    4. )
    5. PARTITION BY RANGE(k2)(
    6. PARTITION p1 VALUES LESS THAN ("2019-10-10")
    7. )
    8. DISTRIBUTED BY HASH(k1) BUCKETS 3
    9. PROPERTIES
    10. (
    11. "dynamic_partition.enable" = "true",
    12. "dynamic_partition.time_unit" = "DAY",
    13. "dynamic_partition.end" = "3",
    14. "dynamic_partition.prefix" = "p",
    15. "dynamic_partition.start" = "-3",
    16. "dynamic_partition.buckets" = "32",
    17. "dynamic_partition.replication_num" = "1",
    18. "replication_num" = "3"
    19. );

    In this example, there is a manually created partition p1. This partition will use the default settings for the table, which are 3 buckets and 3 replicas. The dynamic partitions automatically created by the subsequent system will still use the special parameters for dynamic partitions, that is, the number of buckets is 32 and the number of replicas is 1.

More Help

For more detailed syntax and best practices for using dynamic partitions, see SHOW DYNAMIC PARTITION Command manual, you can also enter HELP ALTER TABLE in the MySql client command line for more help information.