BigQuery and ClickHouse: equivalent concepts

The tables below map each BigQuery concept to its ClickHouse equivalent. For the end-to-end migration walkthrough, see Migrating from BigQuery to ClickHouse Cloud.

Resource hierarchy

BigQuery	ClickHouse	Notes
Organization	Organization	Root node of the hierarchy in both.
Project	Service (region-scoped); warehouse for grouping services with shared storage and independent compute	A service is the data-container analogue; use a warehouse to group services that share storage but scale compute independently. Billing, quotas, and tier are set at the ClickHouse organization level, not per service.
Dataset	Database	Same role — logical container for tables and grant scope.
IAM permissions	Console roles plus SQL grants	Two-layer access: roles in `console.clickhouse.cloud` plus SQL grants in the database. Console users can also be granted DB roles for SQL Console use.

Compute and capacity

BigQuery	ClickHouse	Notes
Slot	Replica (whole node); queries can parallelize across replicas	A replica is a static compute node in a ClickHouse service. See the callout below.
Slot reservation	Vertical and horizontal autoscaling bounds; warehouses for workload isolation	Where BigQuery reservations guarantee slot capacity, ClickHouse sizes via min/max autoscaling bounds. Setting min = max effectively fixes the size.
Quotas	Workload classes plus per-query limits	Workload classes cover memory, CPU, concurrency, and I/O scheduling at runtime. Spend caps have no direct primitive; there's no setting that suspends a service on hitting a cost threshold.

Slot vs replica

Unlike a BigQuery slot, a ClickHouse replica is a static compute node that serves many concurrent queries; it isn't allocated to a query from a pool. Replicas are also much coarser-grained: a slot is closer to a unit of CPU than to a whole node.

Billing and pricing model

ClickHouse Cloud meters compute as per-minute compute units (8 GiB RAM, 2 vCPU) rather than per-TB scanned or flat-rate slot commitments, charges for storage as compressed bytes, and bills backups as a separate line item. There's no equivalent to BigQuery's logical-vs-physical storage choice. As in BigQuery, ClickHouse Cloud charges for public internet egress and cross-region data transfer and offers committed-spend discounts. Managed ingestion (ClickPipes) is metered separately; other features (materialized view refresh, secondary index maintenance, dictionary loads) run on the service's compute. See ClickHouse Cloud pricing for current rates, tiers, and commitment options.

Storage and tables

In ClickHouse, a table's behavior is set at creation time: the engine (MergeTree family) determines merge and storage semantics, and ORDER BY / PARTITION BY / TTL clauses configure physical layout and retention. Many BigQuery per-feature settings map to a clause in the ClickHouse CREATE TABLE statement. Physical schema design also differs between platforms; see the migration guide for design tradeoffs.

BigQuery	ClickHouse	Notes
Table	MergeTree-family table	Engine choice determines storage and merge behavior; pick by access pattern (`MergeTree` for append-mostly facts, `ReplacingMergeTree` for upserts, `AggregatingMergeTree` for pre-aggregations).
Column schema modes (`NULLABLE`, `REQUIRED`, `REPEATED`)	`Nullable(T)` for optional; omit for required; `Array(T)` for repeated; `Array(Tuple(...))` or `Nested` for repeated records	In ClickHouse, columns are non-nullable unless wrapped with `Nullable(T)`. Nullability has a small storage and query cost, so use it only when the column needs nulls.
Schema evolution (add / drop / modify columns)	`ALTER TABLE ... ADD / DROP / MODIFY COLUMN`	Same DDL surface as BigQuery. Many column changes are metadata-only.
Partitioning	`PARTITION BY` clause on the table	Where BigQuery limits partitioning to time-unit, integer-range, or ingestion-time columns, ClickHouse takes an arbitrary expression. Use it for retention (drop a partition) and pruning.
Clustering	`ORDER BY` columns in the table definition	Where BigQuery clustering is best-effort and reorganized in the background, ClickHouse's `ORDER BY` is enforced at insert time and drives the sparse primary index.
External tables / BigLake	`s3` / `gcs` / `azureBlobStorage` table functions for direct file access; Iceberg engine for open catalogs	Object storage and open-table formats are accessed directly through these functions and engines. ClickHouse doesn't provide a unified-governance layer over external storage.
Object tables (SQL access to unstructured files)	`s3` / `gcs` table functions over binary formats	Unstructured objects (images, PDFs, audio) are queried directly through these table functions, not via a dedicated table type.
In-SQL ML inference on objects (`ML.GENERATE_TEXT`, `ML.GENERATE_EMBEDDING`)	`aiGenerate` / `aiEmbed` for text	In-SQL LLM and embedding functions exist, but they operate on text columns, not directly on unstructured objects (images, PDFs, audio). For object content, convert to text first or call a multimodal model from the application layer.
Apache Iceberg	Iceberg engine	Wide open-table format support; see the data lake support matrix for current capabilities.
Default table / partition / dataset expiration	`TTL` clause on the table, column, or partition	`TTL` can be set at table creation or via `ALTER TABLE ... MODIFY TTL`, and applies at the table, column, or partition level rather than the dataset default.
Table snapshot	`CLONE AS` for an instant copy, or a table-level `BACKUP`	ClickHouse has no read-only point-in-time snapshot type. `CLONE AS` gives an instant hardlinked copy; a table-level `BACKUP` covers point-in-time recovery.
Time travel	Point-in-time backup restore	No inline historical query; restore is the only way to reach prior state. See the callout below.
Authorized views	View with `SQL SECURITY DEFINER` (runs with the view-owner's privileges)	See CREATE VIEW for the syntax and the `INVOKER` / `DEFINER` / `NONE` modes.
Row-level security	Row policy — a `WHERE`-style expression evaluated per user	Same role; the policy expression is attached per user (or role) rather than as a BigQuery RLS row access policy resource.
Wildcard tables (`_TABLE_SUFFIX`)	`Merge` table engine (persistent grouping) or `merge()` function (inline)	Same idea, different syntax. `Merge` is a persistent table-of-tables; `merge()` is inline without creating one.
Table clone	`CREATE TABLE ... CLONE AS` within a service, or backup restore into a new service	`CLONE AS` hardlinks the source table's parts (part-level copy-on-write), so no data is physically copied. Copying across services still reads the source fully.

Backups

ClickHouse Cloud backups via the UI are per-service, and create a new service on restore. Finer backup granularity is possible with BACKUP/RESTORE SQL commands.

Updates and deletes

ClickHouse is append-optimized. There's no SQL MERGE statement (unrelated to the Merge and MergeTree engines), and ALTER TABLE … UPDATE / DELETE run as background mutations rather than transactional row writes. BigQuery DML patterns (MERGE, UPDATE, DELETE, dbt incremental updates) typically port to engine choice in ClickHouse: ReplacingMergeTree keeps the latest row by sort key, CollapsingMergeTree cancels rows by inserting a matching row with Sign = -1, and AggregatingMergeTree maintains aggregated state. Engine choice is set at table creation and is non-trivial to change later.

Query model and performance

Query acceleration in ClickHouse comes from three layers: primary-key ordering (a sparse index over the on-disk sort order), secondary indexes on non-key columns, and materialized views.

BigQuery	ClickHouse	Notes
Primary key (advisory)	Primary key — drives the on-disk sort order and the sparse primary index	Where BigQuery's PK is advisory only, ClickHouse's PK is load-bearing — it determines physical layout and is used to prune granules, avoid re-sorts, and short-circuit `LIMIT`. Neither system enforces uniqueness.
Foreign key (advisory)	Wide tables or dictionaries for lookups	ClickHouse has no notion of foreign keys. This doesn't prohibit joins but means referential integrity is left to you to manage at the application level.
Search index	Full-text index	Token index over string columns. Operator surface differs: BigQuery's `SEARCH()` maps to the `hasToken` / `hasAllTokens` / `hasAnyTokens` / `hasPhrase` family in ClickHouse; `LIKE` and `match` can also use the index when the pattern is tokenizable (see the index doc for restrictions).
Vector index	`Array(Float32)` or `Array(BFloat16)` with a vector ANN index; or `QBit` for tunable-precision search	Embeddings store as `Array(Float32)`, or `Array(BFloat16)` to halve storage, with an ANN index accelerating approximate nearest-neighbor lookups. `QBit` keeps full precision while letting you trade bits for speed at query time.
Materialized view	Incremental MV — updates on each insert into a base table	Cost is paid at insert time. BigQuery's auto-refreshed MVs map to this incremental model, not to refreshable MVs.
Scheduled query	Refreshable MV for query-driven scheduled work; external orchestrator (dbt, Airflow) for procedural pipelines	Refreshable MVs run on a cron-style schedule and maintain a result table.
Streaming inserts (`tabledata.insertAll`, Storage Write API)	Asynchronous inserts for server-side batching of small row-by-row writes; `Buffer` engine for in-memory accumulation; ClickPipes for managed streaming pipelines	The raw API surface is `INSERT` over HTTP or the native protocol, which accepts the Arrow / ArrowStream columnar format used by the Storage Write API. For the row-by-row semantics of BigQuery streaming inserts, async inserts are the closer fit, batching on the server. For deduplication, ClickHouse exposes block-level dedup on Replicated tables (`insert_deduplicate`, default-on, recent-blocks window) and user-supplied per-insert dedup via `insert_deduplication_token`; for row-level dedup, use `ReplacingMergeTree`. None of these reproduces the Storage Write API's full exactly-once-by-stream-offset model.
Continuous queries	Streaming table engine (Kafka, NATS, RabbitMQ) feeding a materialized view that writes to a destination table; ClickPipes for managed streaming from Pub/Sub or Kinesis; an incremental MV on a regular table for inserts already landing in ClickHouse	"Always-on transform" maps to engine + MV rather than a single declarative resource. Pick by source: streaming queue → table engine or ClickPipes; ongoing inserts into an existing table → MV.
Dry run	`EXPLAIN ESTIMATE` — reports rows, parts, and marks the query would read; other `EXPLAIN` variants (`PLAN`, `PIPELINE`, `SYNTAX`) cover deeper plan inspection	Covers the plan-inspection role of BigQuery's dry run, not the cost-estimation role. ClickHouse billing isn't per-query, so there's no "bytes that will be billed" answer to return.
Federated queries (Spanner, Cloud SQL, AlloyDB, Bigtable)	External OLTP attached via database engine (PostgreSQL, MySQL, MongoDB, SQLite)	Distinct from external tables in object storage; these attach a live source so its tables are queryable directly. Bigtable has no ClickHouse database-engine equivalent.
Cached results	Query cache	ClickHouse's query cache lives in each replica's memory and is per-user by default; identical queries to different replicas don't share results. Not transactionally consistent.
Sessions / multi-statement queries	`SET` for session-scoped settings and query parameters	`SET name = value` applies a setting for the session's lifetime, and `SET param_name = value` defines query parameters referenced as `{name:Type}`. Multi-statement scripting (`DECLARE`, procedural control flow) has no equivalent; keep that logic in the client or an orchestrator.

Security and governance

Authorized views and row-level security are listed under Storage and tables.

BigQuery	ClickHouse	Notes
Policy tags / column-level access control	Column-level grants on specific columns of a table	Grants apply at the column level. BigQuery's centralized taxonomy/policy-tag governance has no direct equivalent.
Data masking	`CREATE MASKING POLICY` (ClickHouse Cloud); or views and row policies. See data masking patterns	`CREATE MASKING POLICY` is a direct equivalent: function-based column masking applied at query time per role, without changing stored data.
Customer-managed encryption keys (CMEK)	CMEK on the service	Supports key rotation and revocation. See the CMEK page for the current list of supported cloud providers.
AEAD / SQL-level encryption functions	Encryption functions (`encrypt` / `decrypt`)	Covers AES-128/256-CBC/GCM and AEAD modes.
Differential privacy	—	Not a managed feature. Applying noise in a UDF doesn't reproduce BigQuery's epsilon/delta-controlled privacy guarantees; for true DP, use an external library.
VPC Service Controls	Private connectivity — Private Service Connect (GCP), PrivateLink (AWS, Azure), and IP allowlists for ingress restriction	Restricts ingress; doesn't replicate VPC SC's data-exfiltration boundary.

BigQuery	ClickHouse	Notes
Analytics Hub / data exchanges / listings	Read access to a shared database, or a dedicated service with consumer-specific row policies	ClickHouse has no in-product data marketplace; sharing uses standard access primitives.
Data clean rooms	Row policies and authorized views	No managed clean-room product.

Operations and ecosystem

ClickHouse surfaces operational state through system.* tables (queries, sessions, replication, parts, metrics) and the cloud console; managed ingestion is handled by ClickPipes; BI and notebook workflows are typically handled in external systems that read from ClickHouse.

BigQuery	ClickHouse	Notes
BigQuery ML	External training and serving (notebooks, Spark, Vertex AI, feature stores) reading from ClickHouse; see AI/ML in Cloud for managed-side features	ClickHouse has no in-database model training; use it as the analytical store and train externally, with chDB embedding the engine in-process (native Pandas/Arrow) to feed pipelines. For in-SQL LLM inference, see `aiGenerate` / `aiEmbed`.
BI Engine	Direct querying — no separate acceleration tier to provision	Sub-second BI latency comes from the storage engine itself; there's no in-memory cache layer to size or pay for separately.
OMNI / cross-cloud federated query	—	ClickHouse doesn't query in place across clouds. The closest pattern is one service per supported region with data staged into the target service before being queried.
Data sources / file formats	File-format and connector library	Managed connectors (ClickPipes) for sources like Kafka, Pub/Sub, MySQL, Postgres, and object storage; SQL table functions for ad-hoc reads of files in object storage.
Query jobs (ID, history, cancel)	`system.query_log` and `system.processes` for inspection; `KILL QUERY` to cancel	Same information, exposed through system tables instead of a job API.
`INFORMATION_SCHEMA`	Native `system.*` tables for ClickHouse-specific detail, or the ANSI `information_schema` views for tool compatibility	Both surfaces available.
Data Transfer Service	ClickPipes — scheduled and streaming ingestion from SaaS, storage, and OLTP sources	ClickPipes is ClickHouse Cloud's managed connector platform; coverage spans streaming systems, OLTP sources, and object storage.
Audit logs (admin activity, data access, system events)	Cloud audit log for org and service admin events; `system.query_log` for data-access activity within the service	BigQuery's three audit streams collapse to two on ClickHouse Cloud: admin activity to the cloud audit log, query/data access to `system.query_log`. System-event telemetry is exposed through other `system.*` tables rather than a dedicated audit stream.
Change data capture ingestion	ClickPipes for Postgres, MySQL, or Kafka	Managed CDC from OLTP sources.
BigQuery Studio notebooks	Hex or Jupyter with `clickhouse-connect`	No in-product notebook surface. Hex is a first-class partner with a native ClickHouse connector; Jupyter covers the self-managed path through the Python client.
BigQuery DataFrames	chDB for a pandas-compatible in-process API	chDB's DataStore is the analog to BigQuery DataFrames: it runs ClickHouse SQL in-process and returns results in Pandas/Arrow.
Data Canvas	—	No drag-and-drop NL canvas. SQL Console covers ad-hoc query authoring; visual data prep happens in an external orchestrator.
Gemini in BigQuery (SQL generation, code completion)	ClickHouse Agents in the cloud console; Ask-AI in the docs	Agents are conversational: natural-language queries against your data with tool calls and chat workflows. For in-SQL LLM calls analogous to `BQ.ML.GENERATE_TEXT`, see `aiGenerate`. Check the Agents page for the current capability surface.
Knowledge Catalog / data lineage / data quality	`system.*` tables for metadata; external tools (dbt, DataHub) for lineage and quality	ClickHouse exposes metadata via system tables rather than a managed catalog product.
Cross-region replication / managed disaster recovery	Multi-AZ HA within a region (automatic)	Cross-region replication and failover work differently in ClickHouse Cloud. See Data resiliency for the current model.

Next steps

For the end-to-end migration walkthrough, see Migrating from BigQuery to ClickHouse Cloud.
For the conceptual side-by-side, see Comparing ClickHouse Cloud and BigQuery.
For loading data, see Loading data from BigQuery.

Resource hierarchy​

Compute and capacity​

Billing and pricing model​

Storage and tables​

Query model and performance​

Security and governance​

Data sharing​

Operations and ecosystem​

Next steps​