Improving MySQL Performance with Partitioning (Part 2)

From:http://hi.baidu.com/zhangguanshi/blog/item/9ff5a1eb303927dbd439c925.html
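= Preliminary Conclusions =

  • Partitioned and unpartitioned tables take up roughly the same disk space (data and index files).
  • When the query filters on a column that has no index, the partitioned table is far faster than the unpartitioned one.
  • When the queried column is indexed, the gap narrows, with the partitioned table only slightly faster.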


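= Final Conclusions =

  • For large data volumes, partitioning is recommended.
  • Remove columns that are not needed.
  • According to the manual, increasing myisam_max_sort_file_size may improve the performance of partitioning operations on MyISAM tables.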
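
As an illustration of the last point, myisam_max_sort_file_size can be inspected and raised at runtime. This is only a sketch: the 8 GB figure is an arbitrary example value rather than a recommendation, and changing a global variable requires administrative privileges.

```sql
-- Show the current limit (in bytes) on the temporary file MyISAM may use
-- when it rebuilds indexes.
SHOW VARIABLES LIKE 'myisam_max_sort_file_size';

-- Raise the limit for the running server.
-- 8589934592 bytes = 8 GB; this is an example value only.
SET GLOBAL myisam_max_sort_file_size = 8589934592;
```

A value set this way lasts until the server restarts; to keep it, the same setting would go in the [mysqld] section of the server configuration file.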

Partitioning Statements in Detail

    1. RANGE type
CREATE TABLE users (
       uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
       name VARCHAR(30) NOT NULL DEFAULT '',
       email VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY RANGE (uid) (
       PARTITION p0 VALUES LESS THAN (3000000)
       DATA DIRECTORY = '/data0/data'
       INDEX DIRECTORY = '/data1/idx',
       PARTITION p1 VALUES LESS THAN (6000000)
       DATA DIRECTORY = '/data2/data'
       INDEX DIRECTORY = '/data3/idx',
       PARTITION p2 VALUES LESS THAN (9000000)
       DATA DIRECTORY = '/data4/data'
       INDEX DIRECTORY = '/data5/idx',
       PARTITION p3 VALUES LESS THAN MAXVALUE
       DATA DIRECTORY = '/data6/data'
       INDEX DIRECTORY = '/data7/idx'
);

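Here the users table is split into four partitions, with a boundary at every 3 million uid values. Each partition has its own directories for data files and index files. If those directories sit on physically separate disks, disk I/O throughput can improve as well. The per-partition DATA DIRECTORY and INDEX DIRECTORY options used here assume MyISAM tables.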
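
One way to confirm that a query reads only the partition it needs is to look at the partitions column of the execution plan. The following is a minimal sketch against the users table above; the uid values are arbitrary. EXPLAIN PARTITIONS is the syntax of MySQL 5.1 through 5.6; from 5.7 on, plain EXPLAIN reports the same column.

```sql
-- uid 4500000 lies between 3000000 and 6000000,
-- so only p1 should appear in the partitions column.
EXPLAIN PARTITIONS SELECT * FROM users WHERE uid = 4500000;

-- A range that crosses the 3000000 boundary should list both p0 and p1.
EXPLAIN PARTITIONS SELECT * FROM users WHERE uid BETWEEN 2999990 AND 3000010;
```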

    2. LIST type
CREATE TABLE category (
     cid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
     name VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY LIST (cid) (
     PARTITION p0 VALUES IN (0,4,8,12)
     DATA DIRECTORY = '/data0/data'
     INDEX DIRECTORY = '/data1/idx',
     PARTITION p1 VALUES IN (1,5,9,13)
     DATA DIRECTORY = '/data2/data'
     INDEX DIRECTORY = '/data3/idx',
     PARTITION p2 VALUES IN (2,6,10,14)
     DATA DIRECTORY = '/data4/data'
     INDEX DIRECTORY = '/data5/idx',
     PARTITION p3 VALUES IN (3,7,11,15)
     DATA DIRECTORY = '/data6/data'
     INDEX DIRECTORY = '/data7/idx'
);

The table is split into four partitions, each with its own data and index file directories.
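
A LIST partitioned table has no catch-all partition comparable to RANGE with MAXVALUE, so a row whose cid appears in none of the lists is rejected. A short sketch, assuming the category table above; the category names inserted are made up for illustration.

```sql
-- cid 5 is listed under p1, so this row is stored in p1.
INSERT INTO category (cid, name) VALUES (5, 'books');

-- cid 16 appears in no VALUES IN list. MySQL refuses the row with an
-- error along the lines of "Table has no partition for value 16".
INSERT INTO category (cid, name) VALUES (16, 'music');
```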

    3. HASH type
CREATE TABLE users (
     uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
     name VARCHAR(30) NOT NULL DEFAULT '',
     email VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY HASH (uid) PARTITIONS 4 (
     PARTITION p0
     DATA DIRECTORY = '/data0/data'
     INDEX DIRECTORY = '/data1/idx',
     PARTITION p1
     DATA DIRECTORY = '/data2/data'
     INDEX DIRECTORY = '/data3/idx',
     PARTITION p2
     DATA DIRECTORY = '/data4/data'
     INDEX DIRECTORY = '/data5/idx',
     PARTITION p3
     DATA DIRECTORY = '/data6/data'
     INDEX DIRECTORY = '/data7/idx'
);

The table is split into four partitions, each with its own data and index file directories.
Example:

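CREATE TABLE ti2 (id INT, amount DECIMAL(7,2), tr_date DATE)
    ENGINE=MyISAM
    PARTITION BY HASH( MONTH(tr_date) )
    PARTITIONS 6;

-- Switch the client delimiter so the procedure body can contain semicolons.
DELIMITER //
CREATE PROCEDURE load_ti2()
BEGIN
    DECLARE v INT DEFAULT 0;
    -- Insert 80,000 rows with pseudo-random dates starting at 1995-01-01.
    WHILE v < 80000 DO
        INSERT INTO ti2
        VALUES (v, '3.14', ADDDATE('1995-01-01', (RAND(v)*3652) MOD 365));
        SET v = v + 1;
    END WHILE;
END
//
DELIMITER ;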
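
With HASH partitioning, MySQL stores a row in partition MOD(expr, N), where expr is the partitioning expression and N the number of partitions. The sketch below loads the test data and compares the server's per-partition row counts with the same split computed by hand. It assumes the ti2 table and load_ti2 procedure above were created in the current database and that the client delimiter is back to a semicolon.

```sql
-- Fill ti2 with the 80,000 test rows.
CALL load_ti2();

-- Row counts per partition as recorded by the server.
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'ti2';

-- The same distribution computed directly from the partitioning expression.
SELECT MOD(MONTH(tr_date), 6) AS partition_no, COUNT(*) AS row_count
FROM ti2
GROUP BY partition_no
ORDER BY partition_no;
```

Because MONTH() returns 1 to 12 and there are 6 partitions, each partition receives two calendar months, for example January and July both map to partition 1.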

    4. KEY type
CREATE TABLE users (
     uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
     name VARCHAR(30) NOT NULL DEFAULT '',
     email VARCHAR(30) NOT NULL DEFAULT ''
)
PARTITION BY KEY (uid) PARTITIONS 4 (
     PARTITION p0
     DATA DIRECTORY = '/data0/data'
     INDEX DIRECTORY = '/data1/idx',
     PARTITION p1
     DATA DIRECTORY = '/data2/data'
     INDEX DIRECTORY = '/data3/idx',
     PARTITION p2
     DATA DIRECTORY = '/data4/data'
     INDEX DIRECTORY = '/data5/idx',
     PARTITION p3
     DATA DIRECTORY = '/data6/data'
     INDEX DIRECTORY = '/data7/idx'
);

The table is split into four partitions, each with its own data and index file directories.
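
KEY partitioning works like HASH, except that the hashing function is supplied by MySQL rather than by a user expression. The column list may also be left empty, in which case the primary key is used. A minimal sketch; the table name users_by_key is hypothetical.

```sql
CREATE TABLE users_by_key (
     uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
     name VARCHAR(30) NOT NULL DEFAULT '',
     email VARCHAR(30) NOT NULL DEFAULT ''
)
-- Empty column list: MySQL partitions on the primary key (uid).
PARTITION BY KEY () PARTITIONS 4;
```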

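  5. Subpartitioning
    Subpartitioning divides each partition of a RANGE or LIST partitioned table a second time. The subpartitions can be of type HASH or KEY. For example:

      • Subpartition a RANGE partitioned table by HASH: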
    CREATE TABLE users (
         uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         name VARCHAR(30) NOT NULL DEFAULT '',
         email VARCHAR(30) NOT NULL DEFAULT ''
    )
    PARTITION BY RANGE (uid) SUBPARTITION BY HASH (uid % 4) SUBPARTITIONS 2(
         PARTITION p0 VALUES LESS THAN (3000000)
         DATA DIRECTORY = '/data0/data'
         INDEX DIRECTORY = '/data1/idx',
         PARTITION p1 VALUES LESS THAN (6000000)
         DATA DIRECTORY = '/data2/data'
         INDEX DIRECTORY = '/data3/idx'
    );
      • Or subpartition a RANGE partitioned table by KEY:
    CREATE TABLE users (
         uid INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         name VARCHAR(30) NOT NULL DEFAULT '',
         email VARCHAR(30) NOT NULL DEFAULT ''
    )
    PARTITION BY RANGE (uid) SUBPARTITION BY KEY(uid) SUBPARTITIONS 2(
         PARTITION p0 VALUES LESS THAN (3000000)
         DATA DIRECTORY = '/data0/data'
         INDEX DIRECTORY = '/data1/idx',
         PARTITION p1 VALUES LESS THAN (6000000)
         DATA DIRECTORY = '/data2/data'
         INDEX DIRECTORY = '/data3/idx'
    );
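
To see what either of the two definitions above produces, the partition metadata can be queried. A sketch, assuming one of the subpartitioned users tables exists in the current database:

```sql
-- One row per subpartition: 2 partitions x 2 subpartitions = 4 rows.
SELECT PARTITION_NAME, SUBPARTITION_NAME, SUBPARTITION_METHOD, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'users'
ORDER BY PARTITION_ORDINAL_POSITION, SUBPARTITION_ORDINAL_POSITION;
```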

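Partition Management

    • Dropping a partition
ALTER TABLE users DROP PARTITION p0;
Drops partition p0 together with all rows stored in it. DROP PARTITION applies to RANGE and LIST partitions only.
    • Rebuilding partitions
        • Rebuilding RANGE partitions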
      ALTER TABLE users REORGANIZE PARTITION p0,p1 INTO (PARTITION p0 VALUES LESS THAN (6000000));

      Merges the original p0 and p1 into a new p0 partition.

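        • Rebuilding LIST partitions
      ALTER TABLE category REORGANIZE PARTITION p0,p1 INTO (PARTITION p0 VALUES IN(0,1,4,5,8,9,12,13));

      Merges the original p0 and p1 into a new p0 partition. (category is the LIST partitioned table defined earlier.)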

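        • Rebuilding HASH/KEY partitions
      ALTER TABLE users COALESCE PARTITION 2;

      For HASH and KEY partitions, use COALESCE PARTITION. The number is how many partitions to remove, so a table with 4 partitions ends up with 2. COALESCE can only reduce the partition count; to increase it, use ADD PARTITION.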

    • Adding partitions
        • Adding a LIST partition
      ALTER TABLE category ADD PARTITION (PARTITION p4 VALUES IN (16,17,18,19)
      DATA DIRECTORY = '/data8/data'
      INDEX DIRECTORY = '/data9/idx');
        • Adding HASH/KEY partitions
      ALTER TABLE users ADD PARTITION PARTITIONS 4;

      The number is how many partitions to add, not the new total: a table with 4 partitions grows to 8.

    • Adding partitioning to an existing table
alter table results partition by RANGE (month(ttime))
(PARTITION p0 VALUES LESS THAN (1),
PARTITION p1 VALUES LESS THAN (2) , PARTITION p2 VALUES LESS THAN (3) ,
PARTITION p3 VALUES LESS THAN (4) , PARTITION p4 VALUES LESS THAN (5) ,
PARTITION p5 VALUES LESS THAN (6) , PARTITION p6 VALUES LESS THAN (7) ,
PARTITION p7 VALUES LESS THAN (8) , PARTITION p8 VALUES LESS THAN (9) ,
PARTITION p9 VALUES LESS THAN (10) , PARTITION p10 VALUES LESS THAN (11),
PARTITION p11 VALUES LESS THAN (12),
PARTITION P12 VALUES LESS THAN (13) );
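
The RANGE merge shown under rebuilding can also be run in reverse, and the result of any management statement can be checked afterwards. The sketch below assumes the users table still has the merged p0 (uid below 6000000) from that example and that results has been partitioned as above. The boundary values are simply the ones used earlier.

```sql
-- Split the merged p0 back into two RANGE partitions. Together the new
-- partitions must cover the same range as the partition they replace.
ALTER TABLE users REORGANIZE PARTITION p0 INTO (
    PARTITION p0 VALUES LESS THAN (3000000),
    PARTITION p1 VALUES LESS THAN (6000000)
);

-- Inspect the resulting partition layout.
SHOW CREATE TABLE users;

-- Turn a partitioned table back into an ordinary one, keeping all rows.
ALTER TABLE results REMOVE PARTITIONING;
```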

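By default, every column used in the partitioning expression must be part of the table's PRIMARY KEY. There are two ways to satisfy this restriction:

    1. Partition by the id column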
mysql> ALTER TABLE np_pk
->      PARTITION BY HASH( TO_DAYS(added) )
->      PARTITIONS 4;
ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function
However, this statement using the id column for the partitioning column is valid, as shown here:
mysql> ALTER TABLE np_pk
->      PARTITION BY HASH(id)
->      PARTITIONS 4;
Query OK, 0 rows affected (0.11 sec)
Records: 0 Duplicates: 0 Warnings: 0
    2. Drop the existing primary key and create a new one that includes the partitioning column
mysql> alter table results drop PRIMARY KEY;
Query OK, 5374850 rows affected (7 min 4.05 sec)
Records: 5374850 Duplicates: 0 Warnings: 0
mysql> alter table results add PRIMARY KEY(id, ttime);
Query OK, 5374850 rows affected (6 min 14.86 sec)
Records: 5374850 Duplicates: 0 Warnings: 0
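
Each of the two statements above rebuilt the table, at roughly six to seven minutes apiece. MySQL accepts several alterations in one ALTER TABLE, so the key change can be sketched as a single statement that rebuilds the table once. This is an alternative suggestion rather than what the original session ran, and any time saved is an expectation, not a measurement.

```sql
-- Replace the primary key in one pass so that it includes ttime,
-- the column used in the partitioning expression.
ALTER TABLE results
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (id, ttime);

-- Confirm the new key before applying PARTITION BY RANGE (MONTH(ttime)).
SHOW INDEX FROM results WHERE Key_name = 'PRIMARY';
```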