GLOSSARY

Core BatchPipe terms as used in the app and schema.

Core concepts

Workspace

A company-level account and billing boundary. Pipes, API keys, billing, and events are isolated per workspace.

User account

A human identity (email + password) that can belong to multiple workspaces.

Workspace user

A membership mapping between a user account and a workspace, with a role (owner/admin/member).

Pipe

A named stream that receives records and delivers batches to configured destinations. Pipes also define enrichment and limits.

Record

One JSON object ingested through the ingestion API (clients may send one record or an array per request).
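Since clients may send either one object or an array, an ingest handler typically normalizes the body to a list. A minimal sketch of that normalization (illustrative only; not BatchPipe's actual parser):

```python
import json

def parse_records(body: str) -> list[dict]:
    """Parse an ingest request body into a list of records.

    Clients may send one JSON object or a JSON array of objects;
    either way the pipeline works with a list internally.
    """
    data = json.loads(body)
    records = data if isinstance(data, list) else [data]
    for r in records:
        if not isinstance(r, dict):
            raise ValueError("each record must be a JSON object")
    return records
```

So `parse_records('{"user_id": 1}')` and `parse_records('[{"user_id": 1}]')` both yield the same one-record list.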

Credentials and access

API key

A workspace credential used for ingestion (POST /v1/ingest/…) and, with the same Bearer token, for the JSON management API (/v1/pipes, /v1/api-keys, etc.). Keys are stored hashed; the secret is shown only once when created.

API key prefix

A short, non-secret prefix of the key so you can tell which key was used without exposing the secret.
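The hashed-storage and prefix ideas above can be sketched together. This is a generic pattern, not BatchPipe's code; the `bp_` key format and SHA-256 choice are assumptions for illustration:

```python
import hashlib
import secrets

def create_api_key(prefix_len: int = 8) -> tuple[str, str, str]:
    """Generate an API key, returning (secret, prefix, stored_hash).

    The full secret is shown to the user once; only the non-secret
    prefix (for identifying the key) and a hash are persisted.
    """
    secret = "bp_" + secrets.token_hex(24)
    prefix = secret[:prefix_len]  # safe to display and log
    stored_hash = hashlib.sha256(secret.encode()).hexdigest()
    return secret, prefix, stored_hash

def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Check a presented key against the stored hash."""
    return hashlib.sha256(presented.encode()).hexdigest() == stored_hash
```

Because only the hash is stored, a lost secret cannot be recovered; the key must be rotated.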

Allowed origin (CORS)

Optional browser origins allowed for a pipe. CORS is enforced by browsers, so it is not a strong identity check for non-browser clients. When any origin is allowed, allowlist checks are skipped, and successful ingest/preflight responses echo the request Origin in Access-Control-Allow-Origin. With an API key, a valid Bearer token is still required; in CORS-only mode there is no shared secret beyond the ingest URL, so configure limits and treat the URL as sensitive.

Data flow

Ingest (ingestion)

Accepting records into BatchPipe (authenticate, validate, enrich, enforce limits, and buffer/queue). Ingest answers: “How much data did we accept into the pipeline?”

Delivery

Workers take buffered records, build batches, and write/send them to destinations (DB/object storage/HTTP). Delivery answers: “How much did we successfully push out to destinations?”

Batch

A group of records delivered together as a unit, controlled by size/time thresholds.

Destinations

Destination

A configured endpoint where pipe batches are delivered (database, object store, or HTTP endpoint).

Destination config

JSON settings for the destination (connection details, URL, table name, etc.). Treat as sensitive; it should be encrypted at rest.

Destination column mapping

For database destinations, how ingested JSON fields map to destination table columns (name, source field, type, nullable, optional semantic role).

Destination status

Operational state of a destination: active = deliveries permitted, blocked = do not attempt delivery until unblocked (e.g., repeated failures or operator action).

Destination column source field

The JSON field name in the ingested record that should be written into a destination database column. Most values come directly from your ingestion payload (e.g. user_id). Some values can come from BatchPipe enrichment if enabled on the pipe (e.g. an ingestion timestamp field or client IP field).
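A sketch of applying a column mapping, with enrichment fields merged in. The mapping-entry shape (`column`, `source_field`, `nullable`) is an assumption for illustration, not the exact schema:

```python
def map_record_to_row(record: dict, mapping: list[dict], enrichment: dict) -> dict:
    """Build a destination row from an ingested record.

    `enrichment` holds fields BatchPipe adds when enabled on the pipe
    (e.g. an ingestion timestamp or client IP); it overrides any
    same-named payload field.
    """
    merged = {**record, **enrichment}
    row = {}
    for m in mapping:
        value = merged.get(m["source_field"])
        if value is None and not m.get("nullable", True):
            raise ValueError(f"missing required field {m['source_field']!r}")
        row[m["column"]] = value
    return row
```

Only mapped fields reach the destination; unmapped payload fields are simply dropped from the row.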

Destination column type (JSON)

For database destinations, destination_column_type stores the JSON value type of the field: string, number, boolean, object, array, or null. Dates and timestamps are usually ingested as string (e.g. ISO-8601). Delivery uses this to cast values into the actual SQL column type at the destination.
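Before casting to the SQL column type, delivery can validate that a value matches its declared JSON type. A sketch of that check (illustrative; one Python-specific wrinkle is that `bool` is a subclass of `int`, so "number" must exclude booleans explicitly):

```python
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

def check_json_type(value, declared: str) -> bool:
    """Check a value against a declared destination_column_type.

    "null" accepts only None; the other names map onto the
    corresponding Python types for JSON-decoded values.
    """
    if declared == "null":
        return value is None
    if declared == "number" and isinstance(value, bool):
        return False  # True/False are not JSON numbers
    return isinstance(value, JSON_TYPES[declared])
```

Note that a timestamp such as `"2024-05-01T12:00:00Z"` passes as "string" here; converting it to a SQL `timestamp` happens later, at delivery time.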

Destination column semantic role

Optional “special meaning” on top of normal column mapping (primarily for database destinations):

  • dedupe_id: stable identifier; database delivery builds an upsert on these columns (they must match a primary key or unique constraint). Other mapped columns are updated on conflict. If every mapped column is a dedupe key, PostgreSQL uses DO NOTHING, and MySQL uses a no-op assignment on one key column. MySQL delivery uses the row-alias upsert syntax (8.0.19+).
  • record_ts: business/event time (“when it happened”)
  • received_at: when BatchPipe received it (“when we saw it”)
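The PostgreSQL side of the dedupe_id behavior can be sketched as SQL generation. This is illustrative only (psycopg-style `%(name)s` placeholders assumed), not BatchPipe's delivery code:

```python
def build_pg_upsert(table: str, columns: list[str], dedupe_keys: list[str]) -> str:
    """Build a PostgreSQL upsert keyed on the dedupe_id columns.

    Non-key columns are updated on conflict; if every mapped column
    is a dedupe key, fall back to DO NOTHING as described above.
    """
    cols = ", ".join(columns)
    placeholders = ", ".join(f"%({c})s" for c in columns)
    keys = ", ".join(dedupe_keys)
    updates = [c for c in columns if c not in dedupe_keys]
    if updates:
        action = "DO UPDATE SET " + ", ".join(f"{c} = EXCLUDED.{c}" for c in updates)
    else:
        action = "DO NOTHING"
    return (f"INSERT INTO {table} ({cols}) VALUES ({placeholders}) "
            f"ON CONFLICT ({keys}) {action}")
```

For example, mapping `id` as the dedupe key and `name` as a normal column yields `... ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name`, so re-ingesting the same `id` updates the row instead of duplicating it.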

Limits

pl_ingest_max_records_per_second: ingest rate cap (per pipe)

pl_ingest_max_records_per_day: ingest daily record cap (per pipe)

pl_ingest_max_records_per_request: max records in one ingest HTTP body

pl_delivery_max_records_per_flush: max records per delivery flush

pl_delivery_max_seconds_per_flush: max seconds before a time-based flush

pl_delivery_max_bytes_per_flush: max payload bytes per delivery flush

billing_plan_max_seconds_per_flush: plan ceiling for delivery flush timing (per billing_plan tier)
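A sketch of how the per-request and per-day ingest limits might be enforced at request time. The limit names match the glossary; the check logic itself is a hypothetical illustration, not BatchPipe's implementation:

```python
def check_ingest_limits(n_records: int, today_total: int, limits: dict) -> None:
    """Raise ValueError if a request would exceed the pipe's ingest limits.

    `n_records` is the record count in this request body and
    `today_total` is the count already accepted today for the pipe.
    """
    if n_records > limits["pl_ingest_max_records_per_request"]:
        raise ValueError("too many records in one request")
    if today_total + n_records > limits["pl_ingest_max_records_per_day"]:
        raise ValueError("daily record cap exceeded")
```

The per-second cap would be checked the same way against a rolling counter; the `pl_delivery_*` limits apply later, on the worker side, when batches are flushed.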

Billing tables

billing_ingest_raw

Append-only ingestion usage per time window (per workspace + pipe): records and bytes accepted.

billing_delivery_raw

Append-only delivery usage emitted by workers: batches attempted/succeeded/failed and bytes delivered.

billing_usage_daily

Derived daily aggregates used for dashboards and invoicing (computed from the raw tables).
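The raw-to-daily derivation can be sketched as a roll-up over time windows. The raw-row shape here (workspace, pipe, ISO `window_start`, records, bytes) is an assumption for illustration:

```python
from collections import defaultdict
from datetime import datetime

def aggregate_daily(raw_rows: list[dict]) -> dict:
    """Roll up billing_ingest_raw-style windows into daily totals.

    Returns a dict keyed by (workspace, pipe, day) with summed
    records and bytes, i.e. the shape billing_usage_daily stores.
    """
    daily = defaultdict(lambda: {"records": 0, "bytes": 0})
    for row in raw_rows:
        day = datetime.fromisoformat(row["window_start"]).date().isoformat()
        key = (row["workspace"], row["pipe"], day)
        daily[key]["records"] += row["records"]
        daily[key]["bytes"] += row["bytes"]
    return dict(daily)
```

Because the raw tables are append-only, the daily aggregates can be recomputed from scratch at any time if an aggregation bug is found.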

Operational events

Workspace event

A stored alert or signal per workspace (for example destination auth failures, slow delivery, schema mismatch, or backlog growth).