Sometimes you may encountered the error "Query compilation used too much memory" in response to a large, complex query, or run across a warning with the same text in the clustrix.log. The solution for this type of error is to increase the global max_sierra_opt_mem_MB and max_sierra_tot_mem_MB parameters, and this will generally allowed the query to compile, though warnings in the logs may persist (and initial compilation may take several seconds).
This article will explain what the "compilation used too much memory" message means, what the max_sierra_* global variables actually do, and other ways to handle issues compiling large complex queries.
Compilation phases and memory limits
Glossing over a great deal of complexity, for the purposes of memory utilization we are concerned with three phases of compilation:
- Parsing the query
- Optimizing the query plan
- Generating executable instructions (query fragments)
The amount of memory which can be used for parsing the query is determined by max_sierra_parse_mem_MiB (default 256). This guard is in place to prevent a database crash in the face of a pathologically large query (the query which prompted the addition of this guard selected 40K columns). If a query encounters this limit, it will error with:
ERROR 1 (HY000) at line 1:  Sierra used too much working memory: Query parsing
This limit can be raised to a max of 512MB, but please contact support if you encounter this issue, so we can help determine if the parser may be doing something improper in consuming so much memory.
Typically the most memory-intensive phase is the second, query optimization. This is where the planner tries to find the best approach to answering the query, from potentially very many possible approaches. This includes selecting join order, which indexes to use, and applying a variety of optimizations, such as converting LEFT JOIN to INNER JOIN when possible or converting a bunch of OR'd col='value' to an equivalent IN SET. The optimization process involves searching the space of all possible query plans, comparing estimated execution costs to determine which is most likely to execute fastest (this is covered in much greater detail in the Query Optimizer section of the Clustrix Documentation on docs.clutrix.com). This space of all possible solutions grows larger with more complex queries, in particular as the number of JOINs increases -- number of possible JOIN orderings is n! (n factorial, i.e. 1*2*3*...*n) . The number of available indexes also contributes to the size of the search space. As we work through all these possible plans, we have to keep track of what we've discovered so far, which can add up to a significant amount of memory.
The amount of memory allowed for the optimization phase is controlled by max_sierra_opt_mem_MiB (default 80). When the compiler exceeds this limit, it will stop exploring the search space for a more optimal plan, and proceed to the next phase of compilation with the best plan it's found so far. This plan could be perfectly good, or pretty bad, in which case execution will be more expensive than it could be, with higher latency and greater load placed on the cluster. Importantly, though, in this case the query will execute and return results to the user, with no indication of problems. The indication that we did not finish optimization because of memory constraints will be a WARNING like the following in clustrix.log:
WARNING rigr/sierra/plan_parse.c:282 sierra_plan_impl(): Compilation used too much memory; took 4.230127sec: SELECT `t`.`id` AS `t0_c0`, `t`.`user_id` AS `t0_c1`, `t`.`item_id` AS `t0_c2`, ...
INFO rigr/sierra/plan_parse.c:287 sierra_plan_impl(): Cost=239.49296827979, rows=2.653641
INFO rigr/sierra/plan_parse.c:289 sierra_plan_impl(): LEXPRS PEXPRS DUPEXP GROUP ADGRP MEMORY
INFO rigr/sierra/plan_parse.c:290 sierra_plan_impl(): 11732 37457 10331 2996 2996 117714kB TOTAL
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 5 10986 1076 5 5 3828kB nljoin_rule
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 0 10986 1081 0 0 1459kB nljoin_rule_par
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 10962 0 6060 2451 2451 81795kB join_commute_rule
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 0 8567 792 0 0 6526kB join_to_msjoin
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 0 2782 0 0 0 568kB enforce_combine
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 0 1826 0 0 0 2036kB enforce_merge
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 0 1826 0 0 0 4171kB enforce_sort
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 424 0 1037 268 268 6999kB table_scan_sarge
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 0 178 0 0 0 70kB phys_filter
INFO rigr/sierra/plan_parse.c:295 sierra_plan_impl(): 55 111 278 55 55 2275kB pklookup_to_nljoin
The WARNING message shows the query in question, and the following lines show how much memory is being used by the various optimization rules. In this example we can see that join_commute_rule is the main culprit of our memory consumption, at over 80MB. This is typical when we have a very large number of JOINs.
It is possible to increase the sierra_max_opt_mem_MiB global (up to 1024MB), but bear in mind that the memory limit is also effectively a time limit, and so increasing the memory available may result in initial compilation taking significantly longer (though the query plan cache (QPC) will amortize this cost). Also make sure that sierra_max_opt_mem_MiB is well below max_sierra_tot_mem_MiB, as described below.
Once we've parsed, and searched available plans for an optimal plan, we must compile the plan into query fragments to be distributed to the nodes for parallel execution. This phase requires memory as well, and memory consumption is bounded by max_sierra_tot_mem_MiB, default 128. If we exceed max_sierra_tot_mem_MiB, the query will error as follows:
ERROR 1 (HY000):  Sierra used too much memory: Compilation
Note that max_sierra_tot_mem_MiB includes the memory used during optimization (they use the same heap); that is, if we've consumed 80MB during optimization, and supposing max_sierra_tot_mem_MiB is default 128, then we may have less than 48MB available for the remainder of compilation. Thus it is important to retain some gap between max_sierra_tot_mem_MiB and sierra_max_opt_mem_MiB. If you are receiving the ERROR "Sierra used too much memory: Compilation", ensure that sierra_max_opt_mem_MiB is well below max_sierra_tot_mem_MiB.
Working set memory
There is an additional portion of memory used during compilation, called the working set memory. It is used by the compiler as sort of a scratch area, so it grows and shrinks throughout the compilation process (as opposed to the memory used above, which typically grows through the entire compilation). In v5.0, this is gated by the global variable max_sierra_working_mem_MiB, and defaults to 512. A query exceeding this limit would see an error like:
ERROR 1 (HY000):  Sierra used too much working memory: Compilation
While this global can be increased to 1024, usage of such a large amount of memory in the working set may indicate a problem, so we encourage you to contact support if you encounter this error. Also note that this working memory limit is separate from the max_sierra_tot_mem_MiB limit, and working set memory not counted in the max_sierra_tot_mem_MiB check.
Reducing compilation memory requirements with query hints
To avoid long compilation times, you can reduce the amount of work the compiler must do during optimization, by eliminating some of the plan permutations it must explore, i.e. join ordering and index selection.
Reducing optimization space with STRAIGHT_JOIN
The most effective strategy is to provide join ordering by using STRAIGHT_JOIN rather than JOIN; this tells the compiler to use the ordering provided in the query for this particular join. Note that this is provided for each JOIN between two tables, i.e. `foo` STRAIGHT_JOIN `bar` ON (foo.id = bar.pid); Clustrix does not currently support MySQL's SELECT STRAIGHT_JOIN syntax. Clustrix does support LEFT STRAIGHT_JOIN (unlike MySQL).
You can reduce planner memory consumption significantly by providing join ordering for some but not all joined tables -- remember that the cost of exploring joins increases factorially, so just eliminating one has a big impact. Take care when adding STRAIGHT_JOINs, considering whether your data may change over time such that the forced join ordering will be suboptimal.
Here's an example of a query our QA folks came up with in one of their test cases, which happens to trigger the "compilation used too much memory warning":
action AS op,
DATABASES.name AS `database`,
relations.name AS relation,
representations.name AS representation,
NULL AS bytes,
NULL AS started,
NULL AS finished,
NULL AS error,
NULL AS running,
IF(queued < From_unixtime(1350755986), 'leftover', '') AS leftover
JOIN system.rebalancer_copies USING (id)
JOIN system.slices USING (slice)
JOIN system.representations USING (representation)
ON ( relations.table = representations.relation )
JOIN system.DATABASES USING (db)
WHERE id NOT IN (SELECT id
While not a ton of joins on the face of it, the issue arises from the fact that rebalancer_copies is a view, built from many joins and sub-selects (some of these on further views) such that it involves 14 joins all told. This results in 3-4 second compilation time and the compilation used too much memory warnings. Can we shorten compilation and eliminate these warnings by adding some judicious STRAIGHT_JOINs?
In this case, I just forced the join ordering between rebalancer_queued_activity and rebalancer_copies, by changing the second line of the FROM clause to:
STRAIGHT_JOIN system.rebalancer_copies ON (rebalancer_copies.id = rebalancer_queued_activity.id)
(STRAIGHT_JOIN does not currently support USING syntax, so I changed it to a standard ON.)
This query now compiles in 0.2 seconds. In this case that view was blowing out the complexity of the query, so forcing the join ordering of the view had very significant impact. But in fact, forcing a STRAIGHT_JOIN in nearly any of these cases will greatly reduce compilation time and eliminate the log warnings; this is because join permutations grow factorially as noted earlier, so just removing (or forcing the order of) one join has a big impact. In this case, as long as we allow rebalancer_queued_activity to be first in the join ordering (since it is usually quite small), we are unlikely to prevent a reasonably good plan.
Reducing optimization space with USE INDEX
Where there are many indexes available for a table, it may also be helpful to hint the planner with index to use, if this is obvious to the DBA. This can be done using the USE INDEX (indexname) syntax, similar to MySQL.
Factors which can significantly increase compiler memory usage
- Number of JOINs
- This includes views which include more JOINs
- Sub-selects are also treated as JOINs
- 15 tables can be problematic
- Indicated by high memory costs in rules: nljoin_rule, nljoin_rule_par, join_commute_rule, filter_inner_join_rule
- Mitigate by specifying join ordering using STRAIGHT_JOIN
- IN sets are internally converted to mtables (memory tables) and thus create additional JOIN complexity
- Indicated by high costs in rules above, plus inner_join_mtable_find
- Mitigate by converting IN to multiple OR clauses, e.g. a.thing IN (1,2,3) => (a.thing = 1 OR a.thing = 2 OR a.thing = 3); particularly helpful when several IN SETs in a query
- e.g. INSERT INTO foo VALUES (1, concat('abc','def')), (2, concat('ghi','jkl')), ... (10000, concat('zyx','wvu'));
- Note that this is not an issue when there is no parameter, or the same parameter is used repeatedly, e.g. now() or from_unixtime(N) where N is the same for all rows.
- This will fail in parsing: Sierra used too much working memory: Parsing plan
Get in touch with Clustrix Support!
Finally, if you are having difficulty applying these workarounds, or are not achieving desired results, please contact Clustrix Support, either through the forums, or customers with support contracts may open a ticket.