Migration SQL Grammar
Overview
This grammar reference applies to migration scripts used for schema and data migrations in Kinotic. Migration scripts use a SQL dialect designed for schema and data management.
All statements must end with a semicolon (;). Identifiers must start with a letter or underscore and can contain letters, numbers, and underscores. Strings are enclosed in single quotes ('...').
Statements Overview
CREATE TABLECREATE COMPONENT TEMPLATECREATE INDEX TEMPLATEREINDEXINSERTUPDATEDELETE- Comments
CREATE TABLE
Creates an Elasticsearch index with the specified field mappings.
Syntax:
CREATE TABLE [IF NOT EXISTS] <index_name> (<column_name> <type> [, <column_name> <type>]*) ;
IF NOT EXISTS(optional): Only create the index if it does not already exist.<type>: See Supported Types.
Example:
CREATE TABLE IF NOT EXISTS products (
name TEXT,
sku KEYWORD,
price DOUBLE,
inStock BOOLEAN,
createdAt DATE
) ;
CREATE COMPONENT TEMPLATE
Creates a reusable component template that can be referenced by index templates.
Syntax:
CREATE COMPONENT TEMPLATE <template_name> (<definition> [, <definition>]*) ;
Definitions:
| Definition | Allowed Values | Description |
|---|---|---|
NUMBER_OF_SHARDS | Integer (e.g., 1, 3) | Number of primary shards for the index |
NUMBER_OF_REPLICAS | Integer (e.g., 0, 1) | Number of replica shards for the index |
<column_name> <type> | See Supported Types | Field mapping (name and type) |
Example:
CREATE COMPONENT TEMPLATE base_settings (
NUMBER_OF_SHARDS = 3,
NUMBER_OF_REPLICAS = 1,
createdAt DATE,
updatedAt DATE
) ;
CREATE INDEX TEMPLATE
Creates an index template that applies settings and mappings to indices matching a pattern.
Syntax:
CREATE INDEX TEMPLATE <template_name> FOR '<pattern>' USING '<component_template>'
[WITH (<definition> [, <definition>]*)] ;
WITH (...)(optional): Additional definitions as in component templates.
Example:
CREATE INDEX TEMPLATE logs_template FOR 'logs-*' USING 'base_settings'
WITH (NUMBER_OF_REPLICAS = 2, level KEYWORD) ;
REINDEX
Copies documents from one index to another with optional transformations.
Syntax:
REINDEX <source_index> INTO <dest_index> [WITH (<option> [, <option>]*)] ;
Options:
| Option | Allowed Values | Description |
|---|---|---|
CONFLICTS | ABORT, PROCEED | How to handle version conflicts: abort or proceed |
MAX_DOCS | Integer (e.g., 1000) | Maximum number of documents to reindex |
SLICES | AUTO, Integer (e.g., 2) | Number of slices for parallel processing; AUTO selects automatically based on shard count |
SIZE | Integer (e.g., 500) | Batch size for reindexing |
SOURCE_FIELDS | Comma-separated list (e.g., 'field1,field2') | Restrict source fields to copy |
QUERY | String (Lucene query syntax) | Query to filter source documents |
SCRIPT | String (Painless script) | Script to transform documents during reindex |
WAIT | TRUE, FALSE | If TRUE, wait for completion; if FALSE, return task ID |
SKIP_IF_NO_SOURCE | TRUE, FALSE | If TRUE, skip if source index does not exist (default FALSE) |
Example:
REINDEX old_products INTO new_products WITH (
CONFLICTS = PROCEED,
SLICES = AUTO,
QUERY = 'status:active',
SKIP_IF_NO_SOURCE = TRUE
) ;
If SKIP_IF_NO_SOURCE = TRUE, the reindex operation will be skipped (no error) if the source index does not exist. This is useful for idempotent migrations.
INSERT
Inserts a document into an index.
Syntax:
INSERT INTO <index_name> [(<column_name> [, <column_name>]*)]
VALUES (<expression> [, <expression>]*) [WITH REFRESH] ;
WITH REFRESH(optional): Immediately refresh the index after insert, making the document searchable.<expression>: Literal value, parameter (?), or field reference.
Example:
INSERT INTO products (name, sku, price)
VALUES ('Widget', 'WDG-001', 9.99) WITH REFRESH ;
The id column
When the column list includes id, its value is used as the document's unique storage identifier. If the column list omits id, a random storage identifier is auto-generated.
This matters because every find-by-id lookup resolves the document by its _id. A row inserted without an id column cannot be retrieved by its logical identifier later — it will only be discoverable via search.
As a rule: if the target table has a logical id field, always include the id column in the INSERT so _id stays in sync with the document's id value.
-- Recommended: id column promoted to _id
INSERT INTO users (id, email, active)
VALUES ('user-001', 'jane@example.com', true)
WITH REFRESH ;
-- Avoid when the table has a logical id field: the row will get a random _id
INSERT INTO users (email, active) VALUES ('jane@example.com', true) ;
UPDATE
Updates documents matching a where clause.
Syntax:
UPDATE <index_name> SET <field> = <expression> [, <field> = <expression>]*
WHERE <where_clause> [WITH REFRESH] ;
WITH REFRESH(optional): Immediately refresh the index after update.<expression>: Literal, parameter, or binary expression (e.g.,age + 1).
Example:
UPDATE products SET price = 12.99, updatedAt = '2024-01-15'
WHERE sku == 'WDG-001' WITH REFRESH ;
DELETE
Deletes documents matching a where clause.
Syntax:
DELETE FROM <index_name> WHERE <where_clause> [WITH REFRESH] ;
WITH REFRESH(optional): Immediately refresh the index after delete.
Example:
DELETE FROM products WHERE inStock == false WITH REFRESH ;
Comments
-- This is a comment
Comments start with -- and continue to the end of the line. Comments are ignored by the parser.
Where Clauses
Where clauses are used in UPDATE and DELETE statements.
Syntax:
<field> <operator> <value>
(<where_clause>)
<where_clause> AND <where_clause>
<where_clause> OR <where_clause>
Operators: ==, !=, <, >, <=, >=
Values: Literal, parameter (?), string, integer, boolean
Example:
WHERE status == 'archived' AND createdAt < '2023-01-01'
WHERE (category == 'electronics' OR category == 'appliances') AND price > 100
Supported Types
| Type | Description |
|---|---|
TEXT | Full-text searchable string; analyzed and tokenized |
KEYWORD | Exact-match string; not tokenized |
KEYWORD NOT INDEXED | Exact-match string; stored but not searchable |
INTEGER | 32-bit signed integer |
INTEGER NOT INDEXED | 32-bit signed integer; stored but not searchable |
LONG | 64-bit signed integer |
LONG NOT INDEXED | 64-bit signed integer; stored but not searchable |
FLOAT | 32-bit floating-point number |
FLOAT NOT INDEXED | 32-bit floating-point number; stored but not searchable |
DOUBLE | 64-bit floating-point number |
DOUBLE NOT INDEXED | 64-bit floating-point number; stored but not searchable |
BOOLEAN | true / false value |
BOOLEAN NOT INDEXED | true / false value; stored but not searchable |
DATE | ISO 8601 date/datetime string |
DATE NOT INDEXED | ISO 8601 date/datetime string; stored but not searchable |
JSON | Arbitrary JSON object; leaf values indexed as searchable strings (see JSON type) |
JSON NOT INDEXED | Arbitrary JSON object; stored as-is without indexing any sub-fields |
BINARY | Base64-encoded binary data; stored but not searchable |
GEO_POINT | Geographic point (lat, lon) |
GEO_SHAPE | Geographic shape (polygon, line, etc.) |
UUID | UUID stored as an exact-match string |
UUID NOT INDEXED | UUID stored as an exact-match string; not searchable |
DECIMAL | Decimal number stored as a 64-bit float |
DECIMAL NOT INDEXED | Decimal number stored as a 64-bit float; not searchable |
OBJECT (...) | Embedded object with declared sub-fields; schema is fixed |
OBJECT (...) NOT INDEXED | Embedded object; stored but sub-fields are not searchable |
NESTED (...) | Array of embedded objects; each element is independently queryable |
UNION (...) | One of several named object variants; all variant fields are merged into a single object |
UNION (...) NOT INDEXED | Union field; stored but not searchable |
The NOT INDEXED variant of each type stores the value but excludes it from the search index. This reduces storage and indexing overhead for fields that only need to be returned in results, not queried.
JSON Type
The JSON type stores an entire JSON object as a single field. All leaf values within the object — regardless of nesting depth — are indexed and queryable as strings. This approach avoids requiring a fixed schema for nested data, making it well-suited for semi-structured or dynamic payloads.
Key behaviors:
- Sub-fields are queryable by dot-path. A field defined as
metadata JSONcan be filtered withmetadata.version == '2'ormetadata.source.region == 'us-east'. - All comparisons are string-based. Even numeric or date values stored inside a JSON field are compared as strings. Use
TEXT,INTEGER,DOUBLE, etc. for fields that require numeric or range queries. - Dynamic nesting is supported. The sub-field structure does not need to be declared in the schema — any JSON shape can be stored and its leaf values will be indexed automatically.
Example:
CREATE TABLE events (
id UUID,
ts DATE,
payload JSON
) ;
A record with payload = {"source": {"region": "us-east"}, "retries": 3} can then be filtered as:
WHERE payload.source.region == 'us-east'
Note that retries would be compared as the string '3', not the number 3.
For JSON NOT INDEXED, the object is stored and returned in query results but none of its sub-fields can be queried.
Composite Types
OBJECT, NESTED, and UNION allow structured sub-documents to be defined inline. All composite types enforce dynamic: strict on their sub-fields — new sub-fields must be added via ALTER TABLE, not inserted ad-hoc.
Composite types are recursive: OBJECT inside NESTED, NESTED inside OBJECT, and other combinations are all valid.
OBJECT
Defines a single embedded object. The sub-fields are stored inline with the parent document.
Syntax:
<column_name> OBJECT (<sub_column> <type> [, <sub_column> <type>]*) [NOT INDEXED]
NOT INDEXED(optional): disables the object. The data is stored but cannot be searched or accessed via field paths.
Example:
CREATE TABLE persons (
id KEYWORD,
address OBJECT (street TEXT, city KEYWORD, state KEYWORD, zip KEYWORD)
);
-- Store but do not index the payload object
CREATE TABLE events (
id KEYWORD,
payload OBJECT (raw TEXT, size INTEGER) NOT INDEXED
);
NESTED
Defines an array of embedded objects where each element is independently queryable. Use NESTED instead of OBJECT when the field holds a list of items and you need to query across the list without cross-element matching.
Syntax:
<column_name> NESTED (<sub_column> <type> [, <sub_column> <type>]*)
NESTED does not support NOT INDEXED.
Example:
CREATE TABLE articles (
id KEYWORD,
tags NESTED (label TEXT, value KEYWORD)
);
UNION
Defines a field that can hold one of several named object variants. All variant fields are merged into a single flat object. Fields that appear in multiple variants must have the same type.
Syntax:
<column_name> UNION (
<VariantName> (<sub_column> <type> [, <sub_column> <type>]*) [, ...]
) [NOT INDEXED]
NOT INDEXED(optional): disables the merged object.- Include a shared discriminator field (e.g.
kind KEYWORD) in each variant so application code can identify which variant it received.
Example:
CREATE TABLE assets (
id KEYWORD,
item UNION (
Book (kind KEYWORD, title TEXT, isbn KEYWORD),
Video (kind KEYWORD, title TEXT, duration INTEGER)
)
);
The resulting item field is a single object with properties kind, title, isbn, and duration.
kind KEYWORD discriminator) are always safe.Expressions
- Literals:
'string',123,true,false - Parameters:
? - Binary Expressions:
<field> + <value>,<field> - <value>, etc.
Reserved Keywords
All keywords in the grammar are reserved and case-insensitive:
ABORT, ADD, ALTER, AND, AUTO, BINARY, BOOLEAN, COLUMN, COMPONENT, CONFLICTS, CREATE, DATE, DECIMAL, DELETE, DOUBLE, EXISTS, FLOAT, FOR, FROM, GEO_POINT, GEO_SHAPE, IF, INDEX, INDEXED, INSERT, INTEGER, INTO, JSON, KEYWORD, LONG, MAX_DOCS, NESTED, NOT, NUMBER_OF_REPLICAS, NUMBER_OF_SHARDS, OBJECT, OR, PROCEED, QUERY, REFRESH, REINDEX, SCRIPT, SET, SIZE, SLICES, SOURCE_FIELDS, TABLE, TEMPLATE, TEXT, TRUE, FALSE, UNION, UPDATE, USING, UUID, VALUES, WAIT, WHERE, WITH, SKIP_IF_NO_SOURCE