CockroachDB supports importing data from CSV/TSV or SQL dump files.
Import from Tabular Data (CSV)
If you have data exported in a tabular format (e.g., CSV or TSV), you can use the IMPORT statement.
To use this statement, though, you must also have some kind of remote file server (such as Amazon S3 or a custom file server) that all your nodes can access.
Import from Generic SQL Dump
You can execute batches of INSERT statements stored in .sql files (including those generated by cockroach dump) from the command line, importing data into your cluster.
$ cockroach sql --database=[database name] < statements.sql
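If the .sql file is produced by a script rather than by cockroach dump, rows can be grouped into multi-row INSERT statements as the file is written. The following is a minimal sketch in Python, not part of CockroachDB's tooling: the function names `sql_literal` and `batched_inserts` and the `batch_size` parameter are hypothetical, and table and column names are assumed to be trusted identifiers that need no quoting.

```python
from itertools import islice


def sql_literal(value):
    """Render a Python value as a simple SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Treat everything else as a string; double any embedded single quotes.
    return "'" + str(value).replace("'", "''") + "'"


def batched_inserts(table, columns, rows, batch_size=500):
    """Yield multi-row INSERT statements with at most batch_size rows each.

    Assumption: table and columns are trusted identifiers (not quoted here).
    """
    row_iter = iter(rows)
    column_list = ", ".join(columns)
    while True:
        batch = list(islice(row_iter, batch_size))
        if not batch:
            break
        values = ",\n  ".join(
            "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in batch
        )
        yield f"INSERT INTO {table} ({column_list}) VALUES\n  {values};"


if __name__ == "__main__":
    sample_rows = [(1, "alice"), (2, "bob"), (3, "o'connor")]
    for statement in batched_inserts("users", ["id", "name"], sample_rows, batch_size=2):
        print(statement)
```

Writing the yielded statements to statements.sql gives a file that can be piped to cockroach sql as shown above; `batch_size` is the knob to tune for a given schema.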
Grouping each INSERT statement to include approximately 500-10,000 rows will provide the best performance. The number of rows depends on row size, column families, and number of indexes; smaller rows and less complex schemas can benefit from larger groups of INSERTs, while larger rows and more complex schemas benefit from smaller groups.
Import from PostgreSQL Dump
If you're importing data from a PostgreSQL deployment, you can load it more quickly by importing the .sql file generated by the pg_dump command.
.sql files generated by pg_dump provide better performance because they use the COPY statement instead of bulk INSERT statements.
Create PostgreSQL SQL File
Which pg_dump command you want to use depends on whether you want to import your entire database or only specific tables:
Entire database:
$ pg_dump [database] > [filename].sql
Specific tables:
$ pg_dump -t [table] [database] > [filename].sql
For more details, see PostgreSQL's documentation on pg_dump.
Reformat SQL File
After generating the .sql file, you need to perform a few editing steps before importing it:
- Remove all statements from the file besides the CREATE TABLE and COPY statements.
- Manually add the table's PRIMARY KEY constraint to the CREATE TABLE statement. This has to be done manually because PostgreSQL attempts to add the primary key after creating the table, but CockroachDB requires the primary key be defined upon table creation.
- Review any other constraints to ensure they're properly listed on the table.
- Remove any unsupported elements.
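The first step in the list above (keeping only the CREATE TABLE and COPY statements) can be roughly automated. The sketch below is a heuristic, not an official tool: the function name `keep_create_and_copy` is hypothetical, and it assumes the plain-text pg_dump layout in which a CREATE TABLE block ends with a line ending in `);` and COPY data ends with a line containing only `\.`. The PRIMARY KEY constraint still has to be added by hand, and the output should be reviewed before importing.

```python
import sys


def keep_create_and_copy(lines):
    """Return only the CREATE TABLE and COPY blocks from pg_dump output lines.

    Assumes plain-format pg_dump output: CREATE TABLE blocks close with a
    line ending in ");" and COPY data is terminated by a line equal to "\\.".
    """
    kept = []
    mode = None  # None (outside a block), "create", or "copy"
    for line in lines:
        text = line.rstrip("\n")
        if mode is None:
            if text.startswith("CREATE TABLE "):
                mode = "create"
            elif text.startswith("COPY ") and text.endswith("FROM stdin;"):
                mode = "copy"
            else:
                continue  # drop SET, ALTER TABLE, comments, and so on
        kept.append(text)
        if mode == "create" and text.endswith(");"):
            mode = None  # end of the CREATE TABLE statement
        elif mode == "copy" and text == "\\.":
            mode = None  # end of the COPY data section
    return kept


if __name__ == "__main__":
    # Example: python filter_dump.py < dump.sql > filtered.sql
    for kept_line in keep_create_and_copy(sys.stdin):
        print(kept_line)
```

Tracking the current block matters because COPY data rows are arbitrary text; without it, a data row that happens to look like a statement could be misclassified.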
Import Data
After reformatting the file, you can import it through psql:
$ psql -p [port] -h [node host] -d [database] -U [user] < [file name].sql
For reference, CockroachDB uses these defaults:
[port]: 26257
[user]: root