Export and Import
Moving documents in and out of Planck stores is a manifest-driven operation. You write a single YAML file that describes what to move, in which format, and how the nested structures should map onto flat files. The same manifest works for both directions. You can run it once, or you can let the workbench scheduler run it on a cron schedule.
This page is the operator guide for that flow. It covers the manifest format, the path data takes through the engine during each operation, and the difference between catalog-level export/import (which is what this page is about) and the file-level backup/restore that planctl gives you for disaster recovery.
One note on scope: backup, gc, WAL truncate, stats, export, import, and restore all run as scheduler tasks inside the workbench. Replication is a separate, continuous path between primary and replica, and it is not driven by the scheduler.
Where it runs
Export and import live in three places.
- Workbench UI. The control plane on port 2369. There is a Run Now dialog and a Schedule dialog, both reachable from Server Overview or from the Schedules panel. This is the usual entry point.
- Workbench HTTP API. POST /api/export and POST /api/import. Both take the same YAML manifest that the UI uses. Handy when you are scripting from CI or from your own ops tooling.
- Scheduler. A persisted manifest plus a cron expression. The scheduler resolves any template variables at execution time and runs the same engine path that the Run Now dialog uses.
The engine itself (the planck binary) is the thing that actually reads or writes the documents. The workbench parses the manifest, translates the optional YQL filter into the engine's query form, and then forwards the request over the wire protocol.
Note that planctl does not have its own export or import subcommand. For one-off catalog moves, use the workbench UI or the HTTP API. For full-host disaster recovery, use planctl backup and planctl restore. The difference between the two flows is covered below.
The manifest
Every operation, scheduled or one-shot, takes one YAML manifest. It is the single input that covers every format and every layout.
Top-level fields
| Field | Type | Required | Description |
|---|---|---|---|
| store | string | Yes | Store namespace, for example orders or stores.orders. |
| format | string | Yes | bson, json, or csv. |
| output_dir | string | Yes | Output directory for export, source directory or file path for import. |
| query | string | No | YQL filter to export a subset of documents. Export only. |
| fields | array | No | Field type hints for JSON import. See JSON type hints. |
| entities | array | No | Required for CSV. Optional for JSON and BSON. Defines the file mapping. |
Entity definitions
The entities list is how you describe a multi-file layout. Each entry is one CSV file or one logical JSON shard.
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Entity name. Used for file naming and parent references. |
| role | string | Yes | parent or child. |
| file | string | Yes | Filename for this entity's data. |
| parent | string | No | Parent entity name. Use this to nest a child under another child. Defaults to the root parent. |
| parent_field | string | Child only | Array field on the parent document where this entity nests. |
| join_key | string | Child only | Column that links child rows to the parent. It is read directly from the CSV header, not from fields. |
| fields | array | Yes | List of field descriptors for document columns. The join_key column does not need to be declared here. |
Field descriptors
| Property | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Field name in the document. |
| type | string | Yes | Data type used for coercion. |
Supported types:
| Type | Maps to |
|---|---|
| string | Text values. |
| int | Integer (i64). |
| double | Floating-point (f64). |
| bool | Boolean. |
| datetime | Timestamp in epoch milliseconds. |
| objectid | BSON ObjectId (12 bytes, written as a 24-character hex string). |
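The three tables above amount to a small set of rules that can be checked before a manifest is submitted. The following is a minimal sketch, assuming the manifest has already been parsed from YAML into a Python dict. The function name validate_manifest and the error wording are hypothetical, and the checks the workbench actually performs may differ.

```python
VALID_FORMATS = {"bson", "json", "csv"}
VALID_TYPES = {"string", "int", "double", "bool", "datetime", "objectid"}


def validate_manifest(manifest: dict) -> list:
    """Return a list of problems; an empty list means no problem was found."""
    errors = []
    for key in ("store", "format", "output_dir"):
        if not manifest.get(key):
            errors.append(f"missing required field: {key}")

    fmt = manifest.get("format")
    if fmt and fmt not in VALID_FORMATS:
        errors.append(f"unknown format: {fmt}")

    entities = manifest.get("entities") or []
    if fmt == "csv" and not entities:
        errors.append("entities is required for csv")

    names = {entity.get("name") for entity in entities}
    for entity in entities:
        label = entity.get("name") or "<unnamed>"
        for key in ("name", "role", "file", "fields"):
            if not entity.get(key):
                errors.append(f"{label}: missing {key}")
        role = entity.get("role")
        if role and role not in ("parent", "child"):
            errors.append(f"{label}: role must be parent or child")
        if role == "child":
            # parent_field and join_key are only required on child entities.
            for key in ("parent_field", "join_key"):
                if not entity.get(key):
                    errors.append(f"{label}: child entities need {key}")
            parent = entity.get("parent")
            if parent and parent not in names:
                errors.append(f"{label}: unknown parent {parent}")
        for field in entity.get("fields") or []:
            if field.get("type") not in VALID_TYPES:
                errors.append(f"{label}.{field.get('name')}: unsupported type")
    return errors
```

Running a check like this in CI catches a missing join_key or a misspelled type before the manifest reaches the engine.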
Hierarchies
Children can nest under children, to any depth. The parent field on each child names another entity, and together these references form a tree.
orders (parent)
+-- items (child of orders)
|   +-- attributes (child of items)
|   |   +-- tags (child of attributes)
|   +-- reviews (child of items)
+-- payments (child of orders)
On export, the engine walks the tree top down from the root, writes the parent rows, and then writes each child file with the join key injected from the parent it belongs to.
On import, entities are sorted deepest-first, loaded, and grouped by join key. The importer then walks the tree bottom up, embedding child groups as arrays under the configured parent_field until the root document is fully assembled. The root documents are inserted into the store as BSON, and the engine flushes once the batch is done.
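The deepest-first ordering can be derived from the parent references alone. The following is a minimal sketch, assuming entities are dicts shaped like the manifest entries and that the references contain no cycle. The helper names entity_depths and deepest_first are hypothetical, not engine functions.

```python
def entity_depths(entities):
    """Map entity name -> depth, with the root parent at depth 0."""
    root = next(e["name"] for e in entities if e["role"] == "parent")
    # A child without an explicit parent nests under the root.
    parent_of = {
        e["name"]: e.get("parent", root)
        for e in entities
        if e["role"] == "child"
    }
    depths = {root: 0}

    def depth(name):
        # Assumes the parent references are acyclic.
        if name not in depths:
            depths[name] = depth(parent_of[name]) + 1
        return depths[name]

    for name in parent_of:
        depth(name)
    return depths


def deepest_first(entities):
    """Order entities so that every child is loaded before its parent."""
    depths = entity_depths(entities)
    return sorted(entities, key=lambda e: depths[e["name"]], reverse=True)
```

For the tree shown earlier, this places tags first and orders last, so each level is already grouped by the time its parent is read.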
Filtering exports
The optional query field accepts a YQL expression. Without it, the export covers every document in the store.
query: 'orders.filter(status = "completed" and total > 100)'
For scheduled exports you usually want a relative date range. Hardcoded timestamps go stale the moment they ship. The scheduler resolves a small set of template variables to epoch milliseconds at execution time:
| Variable | Resolves to |
|---|---|
| ${today} | Start of the current day. |
| ${yesterday} | Start of the previous day. |
| ${tomorrow} | Start of the next day. |
| ${now} | Current timestamp. |
| ${week_ago} | Seven days before ${now}. |
| ${month_ago} | Thirty days before ${now}. |
A typical nightly export of the previous day's sales:
query: "sales.filter(sale_date >= ${yesterday} and sale_date < ${today})"
The variables are resolved at run time, not at the moment you save the schedule.
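To make the substitution concrete, here is a minimal sketch of how the variables in the table could be resolved. It assumes day boundaries are computed in UTC, which this page does not state, and that the caller passes a timezone-aware UTC timestamp. The function name resolve_templates is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def resolve_templates(text: str, now: datetime) -> str:
    """Replace ${name} placeholders with epoch-millisecond integers."""
    # Assumption: "start of day" means midnight UTC.
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    values = {
        "now": now,
        "today": today,
        "yesterday": today - timedelta(days=1),
        "tomorrow": today + timedelta(days=1),
        "week_ago": now - timedelta(days=7),
        "month_ago": now - timedelta(days=30),
    }

    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Unknown names are left untouched rather than guessed at.
            return match.group(0)
        return str((values[name] - EPOCH) // timedelta(milliseconds=1))

    return re.sub(r"\$\{(\w+)\}", substitute, text)
```

Because the function takes the current time as an argument, the same stored query yields a different window on every run, which is the behavior a recurring schedule needs.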
Running an operation
From the workbench UI
There are two entry points, and both open the same dialog:
- Server Overview, Schema tab. Click Export or Import next to any store. The dialog opens with the store namespace pre-filled.
- Schedules panel. Create a task with export or import as its type, then paste or upload a manifest.
The dialog supports two modes:
| Mode | Behavior |
|---|---|
| Run Now | Executes immediately. The dialog shows progress and a final result panel. |
| Schedule | Persists the manifest as a scheduler task and runs it on the given cron. |
Manifests can be typed straight into the editor or uploaded from disk. The dialog accepts .yaml, .yml, and .txt files, loads the content into the editor, and lets you review it before running.
Scheduling fields
When you save as a scheduled task:
| Field | Required | Description |
|---|---|---|
| Name | Yes | Schedule name, for example nightly-orders-export. |
| Cron | Yes | Standard 5-field cron expression. |
| Description | No | Free-form note for operators. |
Common cron presets:
| Preset | Expression |
|---|---|
| Daily, 2 am | 0 2 * * * |
| Daily, 4 am | 0 4 * * * |
| Weekly, Sun 3 am | 0 3 * * 0 |
| Hourly | 0 * * * * |
Scheduled export and import tasks show up in the Schedules panel alongside backup, gc, WAL truncate, stats, and restore tasks. They can be paused, resumed, edited, or run on demand from the same panel.
From the HTTP API
The same manifest, posted as the request body:
curl -X POST https://workbench.example.com:2369/api/export \
-H "Content-Type: application/yaml" \
-H "Authorization: Bearer $WB_KEY" \
--data-binary @orders-export.yaml
POST /api/import takes the manifest the same way. Both are meant for scripted use from CI or from ops automation.
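For tooling written in Python rather than shell, the same call can be built with the standard library. This is a minimal sketch that mirrors the curl command: the endpoint paths, content type, and bearer header come from this page, while the helper name build_manifest_request is hypothetical. The response format is not documented here, so the sketch stops at building the request.

```python
import urllib.request


def build_manifest_request(base_url: str, operation: str,
                           manifest_yaml: str, token: str) -> urllib.request.Request:
    """Build the POST request for /api/export or /api/import."""
    if operation not in ("export", "import"):
        raise ValueError("operation must be 'export' or 'import'")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/{operation}",
        data=manifest_yaml.encode("utf-8"),  # the manifest is the raw request body
        method="POST",
        headers={
            "Content-Type": "application/yaml",
            "Authorization": f"Bearer {token}",
        },
    )
```

Pass the returned object to urllib.request.urlopen to send it. Keeping construction separate from sending makes the request easy to inspect in a dry run.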
Manifest examples
JSON, whole store
store: orders
format: json
output_dir: /data/exports/orders
JSON is self-describing, so entities is not needed here. The export covers every document in the store.
JSON, filtered export
store: orders
format: json
output_dir: /data/exports/shipped-orders
query: "orders.filter(TotalDue > 10000)"
JSON import, auto-inferred types
store: orders
format: json
output_dir: /data/imports/orders.json
For a JSON import, output_dir is the path to the file. The file must be a single JSON array of objects. With no fields section, the types come straight from the JSON syntax:
- JSON string to BSON string.
- JSON number with no decimal to BSON int64.
- JSON number with a decimal to BSON double.
- JSON boolean to BSON boolean.
- JSON null to BSON null.
- JSON object to BSON embedded document.
- JSON array to BSON array.
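The inference rules above can be mirrored in a few lines to preview what an import would produce. This is a minimal sketch that uses Python's json module as a stand-in for the engine's parser. The function names are hypothetical, and edge cases such as exponent notation (1e2) may be classified differently by the engine.

```python
import json


def load_documents(text: str) -> list:
    """Parse an import file and enforce the 'single array of objects' shape."""
    docs = json.loads(text)
    if not isinstance(docs, list) or not all(isinstance(d, dict) for d in docs):
        raise ValueError("expected a single JSON array of objects")
    return docs


def inferred_bson_type(value) -> str:
    """Name the BSON type that auto-inference would pick for a parsed value."""
    if value is None:
        return "null"
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "embedded document"
    if isinstance(value, list):
        return "array"
    raise TypeError(f"not a JSON value: {value!r}")
```

Mapping inferred_bson_type over the first document of a file is a quick way to spot fields that need a type hint.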
JSON type hints
Auto-inference is convenient, but it does not always produce the BSON type you actually want. The common cases are numeric strings that should become integers, ISO date strings that should become timestamps, and integer literals that should become doubles.
Add a fields block to coerce specific fields. Any field that you do not list keeps its auto-inferred type.
store: orders
format: json
output_dir: /data/imports/orders.json
fields:
  - name: EmployeeID
    type: int
  - name: CustomerID
    type: int
  - name: TotalDue
    type: double
  - name: OrderDate
    type: datetime
  - name: IsOnline
    type: bool
Coercion rules:
| Declared type | JSON value | BSON result |
|---|---|---|
| int | "289" (string) | int64 289 |
| int | 289 (number) | int64 289 |
| double | "9.99" (string) | double 9.99 |
| double | 100 (number) | double 100.0 |
| bool | "true", "1", "yes" (string) | boolean true |
| bool | 1 (number) | boolean true |
| bool | 0 (number) | boolean false |
| datetime | "2024-01-15" (string) | int64 1705276800000 (epoch ms) |
| datetime | "2024-01-15T10:30:00Z" (string) | int64 1705314600000 (epoch ms) |
| string | anything | stored as written (default). |
The supported datetime formats are YYYY-MM-DD (midnight UTC) and YYYY-MM-DDTHH:MM:SSZ (ISO 8601 UTC). The T may also be a space.
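The coercion table can be reproduced with a short function, which is useful for checking a sample file before importing it. This is a minimal sketch, not the engine's implementation. It assumes that bool strings outside the listed set coerce to false, that matching is case-insensitive, and that numeric datetime values are already epoch milliseconds. None of these is stated on this page.

```python
from datetime import datetime, timedelta

TRUE_STRINGS = {"true", "1", "yes"}
UNIX_EPOCH = datetime(1970, 1, 1)


def parse_datetime_ms(text: str) -> int:
    """Parse YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ (T may be a space) as UTC."""
    normalized = text.strip().replace(" ", "T")
    for fmt in ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d"):
        try:
            moment = datetime.strptime(normalized, fmt)
        except ValueError:
            continue
        # Both formats are UTC, so naive arithmetic against the epoch is safe.
        return (moment - UNIX_EPOCH) // timedelta(milliseconds=1)
    raise ValueError(f"unsupported datetime format: {text!r}")


def coerce(value, declared: str):
    """Coerce one parsed JSON value to its declared type."""
    if declared == "string":
        return value  # stored as written
    if declared == "int":
        return int(value)
    if declared == "double":
        return float(value)
    if declared == "bool":
        if isinstance(value, str):
            # Assumption: any string outside TRUE_STRINGS is false.
            return value.strip().lower() in TRUE_STRINGS
        return bool(value)
    if declared == "datetime":
        if isinstance(value, str):
            return parse_datetime_ms(value)
        return int(value)  # assumption: numbers are already epoch ms
    raise ValueError(f"unsupported type: {declared}")
```

The two datetime rows in the table serve as a built-in check: both strings should produce the epoch values listed there.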
BSON
store: products
format: bson
output_dir: /data/exports/products
Documents are written out as length-prefixed BSON blobs. The same manifest, with output_dir pointing at a .bson file, performs an import.
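If you need to inspect an exported file, the length prefix makes it easy to split into documents. This is a minimal sketch, assuming the file is a plain concatenation of BSON documents, each starting with the little-endian int32 total size defined by the BSON specification. If the engine adds its own framing on top, the offsets will differ. The function name split_bson_documents is hypothetical.

```python
import struct


def split_bson_documents(data: bytes) -> list:
    """Split a byte string into raw BSON documents using each size prefix."""
    docs = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 5:
            raise ValueError("truncated document header")
        # The int32 size counts the whole document, including these 4 bytes.
        (size,) = struct.unpack_from("<i", data, offset)
        if size < 5 or offset + size > len(data):
            raise ValueError(f"invalid document size {size} at offset {offset}")
        doc = data[offset:offset + size]
        if doc[-1] != 0:
            raise ValueError("document is missing its terminating null byte")
        docs.append(doc)
        offset += size
    return docs
```

Counting the returned documents is a cheap sanity check that an export was not cut short.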
CSV, flat
store: products
format: csv
output_dir: /data/exports/products
entities:
  - name: products
    role: parent
    file: products.csv
    fields:
      - name: ProductID
        type: int
      - name: ProductNumber
        type: string
      - name: ProductName
        type: string
      - name: StandardCost
        type: double
      - name: ListPrice
        type: double
One CSV, the listed columns, the declared types. The same manifest also runs as an import: the engine reads each row, coerces the fields to their declared BSON types, and inserts the resulting document.
CSV, parent and child
store: orders
format: csv
output_dir: /data/exports/orders
entities:
  - name: orders
    role: parent
    file: orders.csv
    fields:
      - name: OrderDate
        type: string
      - name: CustomerID
        type: int
      - name: SubTotal
        type: double
      - name: TotalDue
        type: double
  - name: details
    role: child
    parent_field: SalesOrderDetails
    join_key: CustomerID
    file: order_details.csv
    fields:
      - name: SalesOrderDetailID
        type: int
      - name: ProductID
        type: int
      - name: OrderQty
        type: int
      - name: UnitPrice
        type: double
      - name: LineTotal
        type: double
Export flattens the orders into orders.csv and their SalesOrderDetails arrays into order_details.csv, with CustomerID injected from the parent row. Import goes the other way around: rows in order_details.csv are grouped by CustomerID, then embedded as the SalesOrderDetails array on the matching order before it is inserted.
CSV, three levels deep
store: orders
format: csv
output_dir: /data/exports/orders-full
entities:
  - name: orders
    role: parent
    file: orders.csv
    fields:
      - name: OrderDate
        type: string
      - name: CustomerID
        type: int
      - name: TotalDue
        type: double
  - name: details
    role: child
    parent_field: SalesOrderDetails
    join_key: CustomerID
    file: order_details.csv
    fields:
      - name: SalesOrderDetailID
        type: int
      - name: ProductID
        type: int
      - name: OrderQty
        type: int
      - name: UnitPrice
        type: double
  - name: attributes
    role: child
    parent: details
    parent_field: Attributes
    join_key: SalesOrderDetailID
    file: detail_attributes.csv
    fields:
      - name: AttrName
        type: string
      - name: AttrValue
        type: string
Here attributes nests under details because of the parent: details line. Without that line it would nest under the root instead.
CSV, flattening a sub-document
store: customers
format: csv
output_dir: /data/exports/customers
entities:
  - name: customers
    role: parent
    file: customers.csv
    fields:
      - name: CustomerID
        type: int
      - name: FirstName
        type: string
      - name: LastName
        type: string
      - name: FullName
        type: string
  - name: addresses
    role: child
    parent_field: Address
    join_key: CustomerID
    file: customer_addresses.csv
    fields:
      - name: Street
        type: string
      - name: City
        type: string
      - name: State
        type: string
      - name: ZipCode
        type: string
      - name: Country
        type: string
A nested Address sub-document is flattened into its own CSV, with CustomerID carried across.
Scheduled export, daily sales
store: sales
format: csv
output_dir: /data/exports/daily-sales
query: "sales.filter(sale_date >= ${yesterday} and sale_date < ${today})"
entities:
  - name: sales
    role: parent
    file: sales.csv
    fields:
      - { name: sale_id, type: int }
      - { name: sale_date, type: datetime }
      - { name: register_id, type: string }
      - { name: total, type: double }
  - name: line_items
    role: child
    parent_field: items
    join_key: sale_id
    file: sale_items.csv
    fields:
      - { name: sku, type: string }
      - { name: quantity, type: int }
      - { name: price, type: double }
Save this with cron 0 1 * * * and the scheduler will write yesterday's sales to two CSVs at 1 am every night.
What happens under the hood
Export
- The workbench receives the manifest YAML.
- If query is set, the workbench parses the YQL into the engine's query form.
- The parsed manifest, along with any parsed query, is sent to the engine over the wire protocol.
- The engine flushes the memtable so that the export sees the most recent writes.
- The engine scans the store, applies the predicate, and walks the entity tree to write each file. For each child, the join key from the parent row is injected into the child file.
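The last export step, where the join key is injected into each child row, can be sketched for a single parent and child level. This is an illustration with hypothetical names and in-memory rows, not the engine's code. It assumes the join key is a field on the parent document and leaves out type formatting and CSV writing.

```python
def flatten(documents, parent_fields, child_field, child_fields, join_key):
    """Split nested documents into parent rows and child rows."""
    parent_rows, child_rows = [], []
    for doc in documents:
        parent_rows.append({name: doc.get(name) for name in parent_fields})
        for item in doc.get(child_field) or []:
            # The child row starts with the join key copied from its parent.
            row = {join_key: doc.get(join_key)}
            row.update({name: item.get(name) for name in child_fields})
            child_rows.append(row)
    return parent_rows, child_rows
```

For deeper trees the same step repeats per level, with each child entity acting as the parent of the entities nested under it.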
Import
- The workbench receives the manifest YAML and forwards it to the engine.
- The engine sorts the entities deepest-first.
- At each depth, the child file is loaded and the rows are grouped by their join_key column (read from the CSV header, not from the fields list).
- The importer walks back up the tree, embedding the already-grouped children as arrays under the configured parent_field.
- The root parent file is read row by row. The matching child groups are embedded, the document is assembled, and the row is inserted as BSON.
- After the batch finishes, the engine flushes to persist the new documents.
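The group-then-embed steps above can be sketched with plain dicts. This is a minimal illustration under stated assumptions: rows are already parsed and coerced, and the join key column is dropped from the embedded child, which this page does not specify. The helper names group_by and assemble are hypothetical.

```python
from collections import defaultdict


def group_by(rows, join_key):
    """Group child rows by their join key value."""
    groups = defaultdict(list)
    for row in rows:
        child = dict(row)
        # Assumption: the join key column is not kept on the embedded child.
        key = child.pop(join_key)
        groups[key].append(child)
    return groups


def assemble(parent_rows, child_rows, join_key, parent_field):
    """Embed each group of children as an array on its matching parent."""
    groups = group_by(child_rows, join_key)
    documents = []
    for row in parent_rows:
        doc = dict(row)
        doc[parent_field] = groups.get(doc[join_key], [])
        documents.append(doc)
    return documents
```

A three-level import calls assemble twice, deepest level first: attributes are embedded into details, and the resulting details are then embedded into orders.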
Scheduled execution
- The scheduler reads the stored manifest from the schedule document.
- The template variables are resolved to epoch milliseconds at that moment.
- The resolved manifest then takes the same path as a Run Now operation. The result is logged back to the schedule task, so you can see it in the Schedules panel.
Export and import vs backup and restore
It is worth being explicit about the split here. Both live in the workbench scheduler, but they are not the same operation.
| Concern | Export / import | Backup / restore |
|---|---|---|
| Driven by | YAML manifest | App or system selector |
| Scope | A single store, optionally filtered | A whole app's data dir, or the workbench system DB |
| Format | JSON, BSON, CSV | File-level snapshot (vlog segments, B+ tree, WAL) |
| Purpose | Seed data, cross-environment migration | Disaster recovery |
| CLI surface | None. Use the workbench UI, POST /api/export, or POST /api/import | planctl backup and planctl restore |
| Scheduler task type | export, import | backup, restore |
planctl backup takes a one-shot file-level snapshot of the running app and downloads it to your machine:
planctl backup --app myapp --profile prod --output ./backups
planctl restore writes a snapshot back into the target host. The mode is implicit in which flags are present:
# restore an app
planctl restore --app myapp --backup ./backups/myapp-2025-06-01.tar --profile prod
# restore one service on an app
planctl restore --app myapp --service orders --backup ./backups/orders.tar --profile prod
# restore the workbench's own system DB
planctl restore --system --backup ./backups/system-2025-06-01.tar
Use planctl backup and planctl restore for the bit-for-bit recovery scenario. For everything else, use the export/import manifests.
Practical notes
- Use absolute paths for output_dir. Relative paths resolve against the engine process's working directory, which is rarely where you want them to land.
- Declare types for every CSV column. CSV carries no type information of its own, so the manifest is the only source of truth. For JSON imports, add a fields block for any field where auto-inference produces the wrong BSON type.
- Schedule heavy operations off-hours. Export scans the whole store. Import batches the inserts. Both consume CPU, disk, and memory in proportion to the data set.
- Use template variables for date-bounded scheduled exports. A hardcoded date in a recurring task is a bug waiting to ship.
- Test with Run Now before scheduling. The same manifest, the same engine path. If it works once, it will work on a cron too.
- One manifest, both directions. The same file can drive an export on the source host and an import on the target host. That is the intended way to move data between environments.