How to speed up data import on AWS

Rupert Harwood

When importing data into a Clustrix system, it is best practice to use the clustrix_import tool. clustrix_import is designed to take advantage of the parallel nature of clustered databases when importing MySQL dumps into a Clustrix cluster. Importing a mysqldump file with the mysql client results in single-threaded inserts, which are much slower than clustrix_import.

Detailed information on importing data to Clustrix can be found on our documentation site: Importing Data

If your import is running slowly or failing, there are a few things to check:

  1. Imports are often much faster when you run clustrix_import locally on a node rather than from outside the cluster.
  2. Although the root user does not need a password to access SQL from localhost, clustrix_import does require one.
  3. We highly recommend running clustrix_import in a screen session so that your import continues to run if you are disconnected from the node.
  4. If you have been using triggers in MySQL, run mysqldump with the "--skip-triggers" argument.
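
Items 2 through 4 can be sketched as a small shell script. This is only an illustration: the database name, dump path, session name, and the CLX_ROOT_PASSWORD variable are hypothetical placeholders, and the clustrix_import options shown (-u, -p, -i) are assumed rather than taken from this article, so please confirm them against the Importing Data documentation. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only. DB_NAME, DUMP_FILE, SESSION and CLX_ROOT_PASSWORD are
# hypothetical placeholders; replace them with your own values.
DB_NAME="${DB_NAME:-mydb}"
DUMP_FILE="${DUMP_FILE:-/tmp/mydb.sql}"
SESSION="${SESSION:-clx_import}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
# Note: dry-run output includes the password argument.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Item 4: create the dump without trigger definitions.
dump_without_triggers() {
    run mysqldump --skip-triggers --result-file="$DUMP_FILE" "$DB_NAME"
}

# Item 3: start the import in a detached, named screen session so it
# survives a dropped connection (reattach with: screen -r "$SESSION").
# Item 2: a root password is passed explicitly.
# ASSUMPTION: the -u/-p/-i options of clustrix_import; verify before use.
import_in_screen() {
    run screen -dmS "$SESSION" clustrix_import \
        -u root -p "${CLX_ROOT_PASSWORD:-}" -i "$DUMP_FILE"
}
```

With the defaults, calling dump_without_triggers prints the mysqldump command and import_in_screen prints the screen command; setting DRY_RUN=0 executes them instead. Running the script on a cluster node also covers item 1.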

If you are still having problems, please let us know, and include the clustrix_import output to help us understand where/how things are slow.

