6

I have a partitioned table on the basis of the Primary key having 1.5 Billion rows in it. But when I check the .ibd files only 80 ibd show in GB's whereas 120ibd shows 1MB file. I would like to know how does the distribution work when I limit Partition with PARTITIONS 200. My structure is as below. Given that id & checksum both are unique all the time.

CREATE TABLE `mapping` (
  `id` varchar(50) NOT NULL DEFAULT '',
  `sha_checksum` varchar(50) NOT NULL DEFAULT '',
  `path_id` varchar(90) DEFAULT NULL,
  `file_id` varchar(35) DEFAULT NULL,
  `server_id` CHAR(6) DEFAULT NULL,
  PRIMARY KEY (`id`,`sha_checksum`)
) ENGINE=InnoDB 
/*!50100 PARTITION BY KEY (id,sha_checksum)
PARTITIONS 200 */

Please help me to understand where I am wrong.

Rohan
  • 61
  • 1
  • A query like `SELECT partition_name, table_rows FROM INFORMATION_SCHEMA.PARTITIONS WHERE table_name='mapping';` may help you understand the row distribution among partitions. – jynus Feb 09 '15 at 13:48
  • thanks jynus, i can see 50% of my partition is empty. This explain 1M file. But it doesn't explain as to why it is empty. – Rohan Feb 10 '15 at 08:16

1 Answers1

1

Rather than trying to explain why most of the partitions seem empty, let me argue against that flavor of PARTITIONing.

To put it bluntly, PK & BY KEY(id, sha1) gains no performance, nor any other benefit that I can imagine.

Note that to get "partition pruning", you have to specify both the id and the sha_checksum. Performance will be essentially the same as having a non-partitioned table with the same PRIMARY KEY.

Since the records are randomly scattered around, I can think of no use for "transportable tablespaces".

What version of MySQL are you using?

What character set are you using? (I'm thinking of space considerations, especially for the CHAR column.)

Please describe id and the application; I can help you devise a better PARTITION scheme, or continue to argue against it.

Another thing to note: More than about 50 partitions slows things down for a variety of reasons, one of which is that all partitions are opened when an INSERT occurs. (Very new versions have 'fixed' this failure to 'prune'.)

Rick James
  • 66,863
  • 4
  • 38
  • 92
  • Thanks Rick, Im currently running Percona 5.6.21-70.1 with utf8 Charterset, rows in CHAR column is Fixed width with exact 6 char data. Based on id and sha_checksum we get the data location. Also from parition info table i can see nearly 50% partition in empty; ` mysql> SELECT count(*) FROM INFORMATION_SCHEMA.PARTITIONS WHERE table_name='mapping' and table_rows<>0; +----------+ | count(*) | +----------+ | 100| +----------+ ` – Rohan Feb 10 '15 at 06:48
  • CHAR(6) with utf8 always takes 18 bytes, even if you have only ascii characters. CHAR(6) CHARSET ascii _might_ be more suitable for server_id and save 12 bytes per row. – Rick James Feb 10 '15 at 17:39
  • Id + sha --> data location. Yes, but partitioning does not help at all (I claim). Why are many empty? I don't know. – Rick James Feb 10 '15 at 17:40