Changing Default Distribution

Nick Lamb

Default Distribution

ClustrixDB uses a hashing algorithm to determine where a table (representation) or index should reside within a cluster. The columns selected for hashing are referred to as the "distribution key" for that representation, and they determine the placement of data across nodes. To accomplish this, each index contains metadata about its distribution key. By default, the distribution key uses only the first column of an index (DISTRIBUTE = 1), regardless of how many columns comprise the index. This is true for all indexes, including the primary key.

For more on this, including how to modify the distribution key, see our documentation for DISTRIBUTE.

Increasing Distribution

It can sometimes be beneficial to modify the distribution key to include additional columns of an index or table. This is an online (DDL) operation, but it requires a rewrite of the representation, so it can take some time and require additional free space to complete for large tables and indexes. As with any other ALTER operation, this should be run during off-peak hours.
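As a sketch of the general form (table and index names here are placeholders; the PRIMARY KEY variant is the one used in the example later in this article, and the named-index variant is our assumption based on the same pattern):

-- Redistribute a primary key across its first two columns.
ALTER TABLE mydb.mytable PRIMARY KEY DISTRIBUTE = 2;

-- Assumed equivalent form for a named secondary index.
ALTER TABLE mydb.mytable INDEX idx_example DISTRIBUTE = 2;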

The following sections describe circumstances where a distribution change should be considered.

Column Uniqueness

If the first column of the key does not contain a sufficient number of unique values, it can prevent the database from distributing the representation evenly across the cluster. This can lead to very large slices that grow beyond the split threshold, a wide variance between the smallest and largest slice sizes, or both, which in turn can cause queries that access those slices to vary in performance. A general guideline is that the first column of a key should contain at least an order of magnitude more unique values than the representation's number of slices. For example, for a table with 600 slices, the first key column should contain at least 6,000 unique values.
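As a quick sketch of how to check this guideline, count the distinct values in the first column of the key and compare the result against the SLICES value shown in the table's DDL (the table and column names here are placeholders):

-- Distinct values in the first key column; compare this against the
-- /*$ SLICES=... */ hint visible in the SHOW CREATE TABLE output.
SELECT COUNT(DISTINCT col1) FROM foo.bar;
SHOW CREATE TABLE foo.bar;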

You can see which representations the system thinks would benefit from a change in distribution with the following query. It reports the current distribution and the total number of columns in each key. You'll want to increase distribution based on the uniqueness of the columns, which may not mean setting it equal to the total number of columns in the key. Note: single-column non-primary keys that aren't unique enough for proper distribution can be redistributed to include the primary key.

SELECT db.name           AS `database`,
       rel.name          AS `table`,
       rep.name          AS `index`,
       rep.cont_keycount AS total_keycount,
       hd.range_keycount AS current_keycount,
       bytes,
       rhd.slice_size_deviance,
       node_write_deviance,
       vdev_write_deviance,
       node_read_deviance,
       vdev_read_deviance
FROM   system.rebalancer_hash_distributions rhd
       JOIN system.representations rep
         ON rep.representation = rhd.hash_dist
       JOIN system.relations rel
         ON rel.table = rep.relation
       JOIN system.databases db
         ON db.db = rel.db
       JOIN system.range_hash_distributions hd
         ON hd.representation = rep.representation
WHERE  rhd.needs_redistribute
       AND rep.cont_keycount != hd.range_keycount;

To determine how many columns you should distribute across, you need to look at the uniqueness of each column in the key. For a sample representation on (`col1`,`col2`) belonging to table foo.bar, you could query the number of rows in the table and the uniqueness of each column with:


CREATE TABLE `bar` (
  `col1` bigint(20) unsigned not null,
  `col2` bigint(20) unsigned not null,
  PRIMARY KEY (`col1`,`col2`) /*$ DISTRIBUTE=1 */
) CHARACTER SET utf8 /*$ REPLICAS=2 SLICES=50 CONTAINER=layered */;

explain select * from foo.bar;
+--------------------------------------------------------------+------------+-------------+
| Operation                                                    | Est. Cost  | Est. Rows   |
+--------------------------------------------------------------+------------+-------------+
| stream_combine                                               | 6690015.46 |   786516.00 |
|   index_scan 1 := bar.__idx_bar__PRIMARY                     |  103914.13 |   199750.30 |
+--------------------------------------------------------------+------------+-------------+
2 rows in set (0.10 sec)

select count(distinct (`col1`)),count(distinct(`col2`)) from foo.bar;
+-----------------------------+---------------------------+
| count(distinct (`col1`))    | count(distinct(`col2`))   |
+-----------------------------+---------------------------+
|                           5 |                      5200 |
+-----------------------------+---------------------------+
1 row in set (0.00 sec)


In this example, col1 is not very unique, so the data will be hash-distributed into only 5 buckets. If the second column is added to the distribution key by running ALTER TABLE foo.bar PRIMARY KEY DISTRIBUTE=2; the slices will be more equal in size and easier to balance across the cluster.
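Applied to this example, the change is a single online statement (as noted above, run it off-peak for large tables):

ALTER TABLE foo.bar PRIMARY KEY DISTRIBUTE = 2;

-- Afterward, the rebalancer query from the "Column Uniqueness" section
-- should no longer flag this representation as needing redistribution.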

Increase Parallelism

Changing the distribution of a key also changes the behavior of some planner operators, which can allow queries run against the table to use more parallel operators. In the example below, increasing the DISTRIBUTE value of the primary key allows the planner to use dist_stream_aggregate, because the data being aggregated is now distributed throughout the cluster. This is more overall work for the cluster, but it results in lower latency for this query (and potentially other queries against that representation). Conversely, a longer distribution key can make queries more expensive when the distributed column(s) are part of a GROUP BY list, because more of the aggregation will need to be done by the GTM (Global Transaction Manager) node.

Original Schema and Explain:


CREATE TABLE `bar` (
  `iDate` datetime not null,
  `c_code`   int(10) unsigned not null,
  `counter`  bigint(20) unsigned not null,
  `unique_count` bigint(20) unsigned not null,
  `area` varchar(32) not null,
  PRIMARY KEY (`iDate`,`c_code`,`area`) /*$ DISTRIBUTE=1 */
) CHARACTER SET utf8 /*$ REPLICAS=2 SLICES=400 CONTAINER=layered */;

sql> explain select sum(counter) from test.bar where iDate = '2017-06-22';
+---------------------------------------------------------------------------------+-------------+-------------+
| Operation                                                                       | Est. Cost   | Est. Rows   |
+---------------------------------------------------------------------------------+-------------+-------------+
| stream_aggregate expr0 := sum((1 . "counter"))                                  | 54050340.69 | 71114370.00 |
|   index_scan 1 := bar.__idx_bar__PRIMARY, iDate = param(0)                      | 39827461.69 | 71114370.00 |
+---------------------------------------------------------------------------------+-------------+-------------+
2 rows in set (2.80 sec)

Primary Key DISTRIBUTE=3:


CREATE TABLE `bar` (
  `iDate` datetime not null,
  `c_code`   int(10) unsigned not null,
  `counter`  bigint(20) unsigned not null,
  `unique_count` bigint(20) unsigned not null,
  `area` varchar(32) not null,
  PRIMARY KEY (`iDate`,`c_code`,`area`) /*$ DISTRIBUTE=3 */
) CHARACTER SET utf8 /*$ REPLICAS=2 SLICES=400 CONTAINER=layered */;

sql> explain select sum(counter) from test.bar where iDate = '2017-06-22';
+-------------------------------------------------------------------------------------+-------------+-----------+
| Operation                                                                           | Est. Cost   | Est. Rows |
+-------------------------------------------------------------------------------------+-------------+-----------+
| stream_aggregate expr0 := sum((0 . "expr0"))                                        | 25188878.89 |    800.00 |
|   stream_combine                                                                    | 25188713.89 |    800.00 |
|     dist_stream_aggregate expr0 := sum((1 . "counter"))                             |    31485.79 |      1.00 |
|       index_scan 1 := bar.__idx_bar__PRIMARY, iDate = param(0)                      |    30928.61 |  55217.97 |
+-------------------------------------------------------------------------------------+-------------+-----------+
4 rows in set (0.00 sec)
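To make the GROUP BY caveat above concrete, a hypothetical aggregate against the same table that groups by the distributed columns would look like the following; with DISTRIBUTE=3, each node can pre-aggregate its own slices, but per the note above, more of the final aggregation lands on the GTM node:

-- Hypothetical query: the distributed columns appear in the GROUP BY
-- list, so more of the aggregation work falls to the GTM node.
select iDate, c_code, area, sum(counter)
from test.bar
group by iDate, c_code, area;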

Note: This change will add broadcasts to the system, which in some cases can affect global cluster performance. Broadcasts increase traffic across the network and, when unnecessary, add contention without a performance benefit. If you see a decrease in performance after increasing the distribution of a popular table, you can measure broadcasts with the following query:


SELECT statement, 
       broadcasts, 
       exec_ms, 
       counts, 
       rows_read 
FROM   system.qpc_queries 
ORDER  BY broadcasts DESC 
LIMIT  5\G


If this query suddenly shows the representation you've changed as having a very high number of broadcasts, it may be necessary to reduce the distribution of that representation.
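Reducing distribution uses the same ALTER syntax as increasing it; as a sketch, reverting the earlier example back to a single-column distribution key would be:

-- Revert to distributing on only the first column of the primary key.
ALTER TABLE foo.bar PRIMARY KEY DISTRIBUTE = 1;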
