Reference

Migration SQL Grammar

Complete SQL grammar reference for migration scripts.

Overview

This grammar reference applies to migration scripts used for schema and data migrations in Kinotic. Migration scripts use a SQL dialect designed for schema and data management.

All statements must end with a semicolon (;). Identifiers must start with a letter or underscore and can contain letters, numbers, and underscores. Strings are enclosed in single quotes ('...').

Statements Overview

  • CREATE TABLE
  • CREATE COMPONENT TEMPLATE
  • CREATE INDEX TEMPLATE
  • REINDEX
  • INSERT
  • UPDATE
  • DELETE
  • Comments

CREATE TABLE

Creates an Elasticsearch index with the specified field mappings.

Syntax:

CREATE TABLE [IF NOT EXISTS] <index_name> (<column_name> <type> [, <column_name> <type>]*) ;
  • IF NOT EXISTS (optional): Only create the index if it does not already exist.
  • <type>: See Supported Types.

Example:

CREATE TABLE IF NOT EXISTS products (
    name TEXT,
    sku KEYWORD,
    price DOUBLE,
    inStock BOOLEAN,
    createdAt DATE
) ;

CREATE COMPONENT TEMPLATE

Creates a reusable component template that can be referenced by index templates.

Syntax:

CREATE COMPONENT TEMPLATE <template_name> (<definition> [, <definition>]*) ;

Definitions:

DefinitionAllowed ValuesDescription
NUMBER_OF_SHARDSInteger (e.g., 1, 3)Number of primary shards for the index
NUMBER_OF_REPLICASInteger (e.g., 0, 1)Number of replica shards for the index
<column_name> <type>See Supported TypesField mapping (name and type)

Example:

CREATE COMPONENT TEMPLATE base_settings (
    NUMBER_OF_SHARDS = 3,
    NUMBER_OF_REPLICAS = 1,
    createdAt DATE,
    updatedAt DATE
) ;

CREATE INDEX TEMPLATE

Creates an index template that applies settings and mappings to indices matching a pattern.

Syntax:

CREATE INDEX TEMPLATE <template_name> FOR '<pattern>' USING '<component_template>'
    [WITH (<definition> [, <definition>]*)] ;
  • WITH (...) (optional): Additional definitions as in component templates.

Example:

CREATE INDEX TEMPLATE logs_template FOR 'logs-*' USING 'base_settings'
    WITH (NUMBER_OF_REPLICAS = 2, level KEYWORD) ;

REINDEX

Copies documents from one index to another with optional transformations.

Syntax:

REINDEX <source_index> INTO <dest_index> [WITH (<option> [, <option>]*)] ;

Options:

OptionAllowed ValuesDescription
CONFLICTSABORT, PROCEEDHow to handle version conflicts: abort or proceed
MAX_DOCSInteger (e.g., 1000)Maximum number of documents to reindex
SLICESAUTO, Integer (e.g., 2)Number of slices for parallel processing; AUTO selects automatically based on shard count
SIZEInteger (e.g., 500)Batch size for reindexing
SOURCE_FIELDSComma-separated list (e.g., 'field1,field2')Restrict source fields to copy
QUERYString (Lucene query syntax)Query to filter source documents
SCRIPTString (Painless script)Script to transform documents during reindex
WAITTRUE, FALSEIf TRUE, wait for completion; if FALSE, return task ID
SKIP_IF_NO_SOURCETRUE, FALSEIf TRUE, skip if source index does not exist (default FALSE)

Example:

REINDEX old_products INTO new_products WITH (
    CONFLICTS = PROCEED,
    SLICES = AUTO,
    QUERY = 'status:active',
    SKIP_IF_NO_SOURCE = TRUE
) ;

If SKIP_IF_NO_SOURCE = TRUE, the reindex operation will be skipped (no error) if the source index does not exist. This is useful for idempotent migrations.


INSERT

Inserts a document into an index.

Syntax:

INSERT INTO <index_name> [(<column_name> [, <column_name>]*)]
    VALUES (<expression> [, <expression>]*) [WITH REFRESH] ;
  • WITH REFRESH (optional): Immediately refresh the index after insert, making the document searchable.
  • <expression>: Literal value, parameter (?), or field reference.

Example:

INSERT INTO products (name, sku, price)
    VALUES ('Widget', 'WDG-001', 9.99) WITH REFRESH ;

The id column

When the column list includes id, its value is used as the document's unique storage identifier. If the column list omits id, a random storage identifier is auto-generated.

This matters because every find-by-id lookup resolves the document by its _id. A row inserted without an id column cannot be retrieved by its logical identifier later — it will only be discoverable via search.

As a rule: if the target table has a logical id field, always include the id column in the INSERT so _id stays in sync with the document's id value.

-- Recommended: id column promoted to _id
INSERT INTO users (id, email, active)
    VALUES ('user-001', 'jane@example.com', true)
    WITH REFRESH ;

-- Avoid when the table has a logical id field: the row will get a random _id
INSERT INTO users (email, active) VALUES ('jane@example.com', true) ;

UPDATE

Updates documents matching a where clause.

Syntax:

UPDATE <index_name> SET <field> = <expression> [, <field> = <expression>]*
    WHERE <where_clause> [WITH REFRESH] ;
  • WITH REFRESH (optional): Immediately refresh the index after update.
  • <expression>: Literal, parameter, or binary expression (e.g., age + 1).

Example:

UPDATE products SET price = 12.99, updatedAt = '2024-01-15'
    WHERE sku == 'WDG-001' WITH REFRESH ;

DELETE

Deletes documents matching a where clause.

Syntax:

DELETE FROM <index_name> WHERE <where_clause> [WITH REFRESH] ;
  • WITH REFRESH (optional): Immediately refresh the index after delete.

Example:

DELETE FROM products WHERE inStock == false WITH REFRESH ;

Comments

-- This is a comment

Comments start with -- and continue to the end of the line. Comments are ignored by the parser.


Where Clauses

Where clauses are used in UPDATE and DELETE statements.

Syntax:

<field> <operator> <value>
(<where_clause>)
<where_clause> AND <where_clause>
<where_clause> OR <where_clause>

Operators: ==, !=, <, >, <=, >=

Values: Literal, parameter (?), string, integer, boolean

Example:

WHERE status == 'archived' AND createdAt < '2023-01-01'
WHERE (category == 'electronics' OR category == 'appliances') AND price > 100

Supported Types

TypeDescription
TEXTFull-text searchable string; analyzed and tokenized
KEYWORDExact-match string; not tokenized
KEYWORD NOT INDEXEDExact-match string; stored but not searchable
INTEGER32-bit signed integer
INTEGER NOT INDEXED32-bit signed integer; stored but not searchable
LONG64-bit signed integer
LONG NOT INDEXED64-bit signed integer; stored but not searchable
FLOAT32-bit floating-point number
FLOAT NOT INDEXED32-bit floating-point number; stored but not searchable
DOUBLE64-bit floating-point number
DOUBLE NOT INDEXED64-bit floating-point number; stored but not searchable
BOOLEANtrue / false value
BOOLEAN NOT INDEXEDtrue / false value; stored but not searchable
DATEISO 8601 date/datetime string
DATE NOT INDEXEDISO 8601 date/datetime string; stored but not searchable
JSONArbitrary JSON object; leaf values indexed as searchable strings (see JSON type)
JSON NOT INDEXEDArbitrary JSON object; stored as-is without indexing any sub-fields
BINARYBase64-encoded binary data; stored but not searchable
GEO_POINTGeographic point (lat, lon)
GEO_SHAPEGeographic shape (polygon, line, etc.)
UUIDUUID stored as an exact-match string
UUID NOT INDEXEDUUID stored as an exact-match string; not searchable
DECIMALDecimal number stored as a 64-bit float
DECIMAL NOT INDEXEDDecimal number stored as a 64-bit float; not searchable
OBJECT (...)Embedded object with declared sub-fields; schema is fixed
OBJECT (...) NOT INDEXEDEmbedded object; stored but sub-fields are not searchable
NESTED (...)Array of embedded objects; each element is independently queryable
UNION (...)One of several named object variants; all variant fields are merged into a single object
UNION (...) NOT INDEXEDUnion field; stored but not searchable

The NOT INDEXED variant of each type stores the value but excludes it from the search index. This reduces storage and indexing overhead for fields that only need to be returned in results, not queried.


JSON Type

The JSON type stores an entire JSON object as a single field. All leaf values within the object — regardless of nesting depth — are indexed and queryable as strings. This approach avoids requiring a fixed schema for nested data, making it well-suited for semi-structured or dynamic payloads.

Key behaviors:

  • Sub-fields are queryable by dot-path. A field defined as metadata JSON can be filtered with metadata.version == '2' or metadata.source.region == 'us-east'.
  • All comparisons are string-based. Even numeric or date values stored inside a JSON field are compared as strings. Use TEXT, INTEGER, DOUBLE, etc. for fields that require numeric or range queries.
  • Dynamic nesting is supported. The sub-field structure does not need to be declared in the schema — any JSON shape can be stored and its leaf values will be indexed automatically.

Example:

CREATE TABLE events (
    id      UUID,
    ts      DATE,
    payload JSON
) ;

A record with payload = {"source": {"region": "us-east"}, "retries": 3} can then be filtered as:

WHERE payload.source.region == 'us-east'

Note that retries would be compared as the string '3', not the number 3.

For JSON NOT INDEXED, the object is stored and returned in query results but none of its sub-fields can be queried.


Composite Types

OBJECT, NESTED, and UNION allow structured sub-documents to be defined inline. All composite types enforce dynamic: strict on their sub-fields — new sub-fields must be added via ALTER TABLE, not inserted ad-hoc.

Composite types are recursive: OBJECT inside NESTED, NESTED inside OBJECT, and other combinations are all valid.

OBJECT

Defines a single embedded object. The sub-fields are stored inline with the parent document.

Syntax:

<column_name> OBJECT (<sub_column> <type> [, <sub_column> <type>]*) [NOT INDEXED]
  • NOT INDEXED (optional): disables the object. The data is stored but cannot be searched or accessed via field paths.

Example:

CREATE TABLE persons (
    id KEYWORD,
    address OBJECT (street TEXT, city KEYWORD, state KEYWORD, zip KEYWORD)
);

-- Store but do not index the payload object
CREATE TABLE events (
    id KEYWORD,
    payload OBJECT (raw TEXT, size INTEGER) NOT INDEXED
);

NESTED

Defines an array of embedded objects where each element is independently queryable. Use NESTED instead of OBJECT when the field holds a list of items and you need to query across the list without cross-element matching.

Syntax:

<column_name> NESTED (<sub_column> <type> [, <sub_column> <type>]*)

NESTED does not support NOT INDEXED.

Example:

CREATE TABLE articles (
    id KEYWORD,
    tags NESTED (label TEXT, value KEYWORD)
);

UNION

Defines a field that can hold one of several named object variants. All variant fields are merged into a single flat object. Fields that appear in multiple variants must have the same type.

Syntax:

<column_name> UNION (
    <VariantName> (<sub_column> <type> [, <sub_column> <type>]*) [, ...]
) [NOT INDEXED]
  • NOT INDEXED (optional): disables the merged object.
  • Include a shared discriminator field (e.g. kind KEYWORD) in each variant so application code can identify which variant it received.

Example:

CREATE TABLE assets (
    id KEYWORD,
    item UNION (
        Book  (kind KEYWORD, title TEXT, isbn KEYWORD),
        Video (kind KEYWORD, title TEXT, duration INTEGER)
    )
);

The resulting item field is a single object with properties kind, title, isbn, and duration.

Limitation: shared OBJECT/NESTED/UNION fields across variants. If two variants declare a field with the same name and the same composite type (OBJECT, NESTED, or UNION), only the first variant's sub-field definitions are used. The second variant's sub-fields are silently dropped. Conflict detection only applies to scalar types — it will not throw if two same-named OBJECT fields have different sub-structures.To avoid this, ensure that any field shared across variants with a composite type has identical sub-fields in every variant. Scalar shared fields (such as a kind KEYWORD discriminator) are always safe.

Expressions

  • Literals: 'string', 123, true, false
  • Parameters: ?
  • Binary Expressions: <field> + <value>, <field> - <value>, etc.

Reserved Keywords

All keywords in the grammar are reserved and case-insensitive:

ABORT, ADD, ALTER, AND, AUTO, BINARY, BOOLEAN, COLUMN, COMPONENT, CONFLICTS, CREATE, DATE, DECIMAL, DELETE, DOUBLE, EXISTS, FLOAT, FOR, FROM, GEO_POINT, GEO_SHAPE, IF, INDEX, INDEXED, INSERT, INTEGER, INTO, JSON, KEYWORD, LONG, MAX_DOCS, NESTED, NOT, NUMBER_OF_REPLICAS, NUMBER_OF_SHARDS, OBJECT, OR, PROCEED, QUERY, REFRESH, REINDEX, SCRIPT, SET, SIZE, SLICES, SOURCE_FIELDS, TABLE, TEMPLATE, TEXT, TRUE, FALSE, UNION, UPDATE, USING, UUID, VALUES, WAIT, WHERE, WITH, SKIP_IF_NO_SOURCE

Copyright © 2026