AuditSpine — Capabilities

Compute Tiers

Four compute tiers. One seal model.

From a single analyst laptop to a multi-cloud enterprise pipeline, the same SSAA seal logic applies. Pick the compute tier that matches your infrastructure today — scale up when you need to.

Compute and storage are independent. CT1 (local), CT2 (Docker), and CT3 (GCP cloud) are compute execution tiers; Bronze → Silver → Gold are storage tiers, sealed by SHA-256 at each boundary. The Bronze hash produced on CT1 is verified on CT2 and CT3: identical hashes across all compute tiers show that the seal is environment-independent, in the same way that the Databricks lakehouse architecture decouples compute from storage.
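To make the environment-independence claim concrete, here is a minimal sketch of why a content hash can match across compute tiers. It assumes the hash is taken over a canonical serialization of the rows (sorted keys, sorted rows) rather than over engine-specific file bytes. The helper name `content_hash` and the canonical form are illustrative assumptions, not the SSAA specification.

```python
import hashlib
import json


def content_hash(rows):
    """Hypothetical helper: SHA-256 over a canonical serialization of rows."""
    # Canonicalize each row (sorted keys, fixed separators), then sort the rows,
    # so neither column order nor row order affects the digest.
    encoded = sorted(
        json.dumps(row, sort_keys=True, separators=(",", ":")) for row in rows
    )
    digest = hashlib.sha256()
    for line in encoded:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# The same logical Bronze data as two different engines might return it:
# different column order and different row order.
ct1_rows = [{"trip_id": 1, "fare": 12.5}, {"trip_id": 2, "fare": 7.0}]
ct3_rows = [{"fare": 7.0, "trip_id": 2}, {"fare": 12.5, "trip_id": 1}]

assert content_hash(ct1_rows) == content_hash(ct3_rows)
```

The design point is that the digest depends only on the data, so any tier that reads the same rows can recompute and compare it.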

F-150 — CT1

Analyst Team / PoC

Single machine · <10M rows

Pandas + DuckDB + scikit-learn. Laptop-deployable. Full SSAA seals, local manifests. Zero infrastructure cost to start.

F-250 — CT2

Self-Hosted Spark Shop

Cluster / Docker · 10M–1B rows

Spark + Iceberg + Dagster or Airflow. Existing Databricks investment works here. Delta Lake is a drop-in if you already own it.

Silverado — CT2.5

Cloud-Native Mid-Market

Managed cloud · 100M–10B rows

GCP / AWS / Azure. Cloud Run + GCS + Iceberg. Dagster Cloud or Cloud Composer. Vertex, SageMaker, or Azure ML for model tracking.

Sierra HD 3500 — CT3

Enterprise Regulated

Multi-cloud · >10B rows

Dataproc + BigQuery or Snowflake or Redshift. Airflow / Cloud Composer. Kafka + Flink. Seal chain survives across cloud boundaries.

Tool Compatibility

Available now — and what's coming.

Every tool below has a tested or stubbed StorageAdapter that plugs into the SSAA seal model. Tools marked Available Now are in active use or validated through integration tests. Tools marked Planned with Implementation per Customer Need have adapters committed on the roadmap and built out when your engagement calls for them.
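As an illustration of how an adapter can keep the seal logic independent of the tool, here is a minimal sketch. The `StorageAdapter` interface shown here (a single `iter_objects` method) is an assumption made for the example; the real adapter interface is not described on this page. The two adapter classes and `seal_objects` are likewise hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Iterator, Protocol, Tuple


class StorageAdapter(Protocol):
    """Hypothetical interface: yield (object name, raw bytes) pairs for one stage."""

    def iter_objects(self) -> Iterator[Tuple[str, bytes]]: ...


class InMemoryAdapter:
    """Illustrative adapter backed by a dict of name -> bytes."""

    def __init__(self, objects: dict):
        self.objects = objects

    def iter_objects(self):
        yield from self.objects.items()


class LocalDirAdapter:
    """Illustrative adapter backed by files under a local directory."""

    def __init__(self, root: Path):
        self.root = root

    def iter_objects(self):
        for path in self.root.rglob("*"):
            if path.is_file():
                yield path.relative_to(self.root).as_posix(), path.read_bytes()


def seal_objects(adapter: StorageAdapter) -> str:
    """SHA-256 over (name, content digest) pairs, sorted so listing order is irrelevant."""
    digest = hashlib.sha256()
    for name, data in sorted(adapter.iter_objects()):
        digest.update(name.encode("utf-8") + b"\0")
        digest.update(hashlib.sha256(data).digest())
    return digest.hexdigest()


objects = {"part-0.csv": b"id,fare\n1,12.5\n", "part-1.csv": b"id,fare\n2,7.0\n"}
with tempfile.TemporaryDirectory() as tmp:
    for name, data in objects.items():
        (Path(tmp) / name).write_bytes(data)
    local_seal = seal_objects(LocalDirAdapter(Path(tmp)))
memory_seal = seal_objects(InMemoryAdapter(objects))

# Same content behind two different backends gives the same seal.
assert local_seal == memory_seal
```

Because `seal_objects` only sees names and bytes, supporting another backend in this sketch means writing another small class with `iter_objects`, not changing the hashing code.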

Available Now

30+ adapters tested or stubbed

Compute

Pandas: F-150 baseline. All Bronze/Silver/Gold proofs built here first.
Apache Spark: Local, cluster, K8s/YARN. Tested with real NYC Taxi + IEEE Fraud datasets.
GCP Dataproc Serverless: Sierra HD cloud compute. Cloud Build wired.
GCP Cloud Run Job: Stateless execution. Serverless CT3.
GCP GKE Spark: Spark-on-Kubernetes for enterprise scale.

Storage & Table Format

Apache Iceberg: Best SSAA fit. Immutable snapshots by design. Preferred new-build format.
Delta Lake: ACID + time travel. If you're on Databricks, we use Delta.
DuckDB: Zero-ops local SQL. Excellent for CT1 analytics.
PostgreSQL: ConfigDB + StatusDB. Running in production on our own stack.
Apache Parquet: Universal file format. Used at every compute tier.
GCS / S3 / ADLS: Cloud object stores. Underlay for Iceberg/Delta in cloud.
BigQuery: Sierra HD analytics layer. Adapter built and tested. Native time travel via snapshots.
Snowflake: Adapter built and tested. If you're already on Snowflake, we plug in here.
AWS Redshift: AWS equivalent of BigQuery.
Azure Synapse / ADLS: Azure customer lane.

Orchestration

Dagster: Preferred orchestrator. Asset-based model maps directly to SSAA medallion stages.

Streaming

Apache Kafka: Tested with producer + consumer. Seal at the Bronze ingest watermark.
Azure Event Hubs: Kafka-protocol-compatible. Azure Sierra HD lane.

ML & Model Tracking

scikit-learn / XGBoost: F-150 ML baseline. GBM, RF, LogReg, and AUC metrics all proven.
MLflow: Experiment + model registry seals map directly to SSAA.
Vertex AI: GCP ML platform for Sierra HD.
AWS SageMaker: Model registry + feature store. AWS Sierra HD lane.
Azure ML: Azure Sierra HD ML lane.
Feast (Feature Store): Open-source feature store. Seal feature snapshots at training time.

Governance & Connectors

OPA (Open Policy Agent): Policy-as-code. Gate decisions are sealable artifacts.
Apache Ranger: Enterprise Hadoop/Spark RBAC.
Airbyte: 300+ source connectors. Seal at ingest boundary.
Fivetran: Managed ETL. Same seal point as Airbyte.

Transformation

dbt (DuckDB / Spark / BigQuery): SQL-first transformation with Bronze/Silver/Gold schema targets. dbt project live and tested. dbt run artifacts are SSAA-sealable.
Apache Spark SQL: Distributed SQL transforms via Spark backend. Seal the plan + output at each stage.

Planned with Implementation per Customer Need

Committed — adapter pending

Orchestration

Apache Airflow: Full support committed. DAG run = sealed pipeline execution. Adapter in backlog; activated at contract.
GCP Cloud Composer: Managed Airflow. Same adapter, GCP-native configuration.
Prefect: Dagster alternative. Available on request.

Compute

Apache Beam: Unified batch/streaming. Runs on the Dataflow, Flink, or Spark runner, with a single seal point regardless of runner. Stub adapter committed.

Streaming

Apache Flink: Stateful streaming over Kafka. CEP and stream-time joins. Added when a customer needs it.
GCP Pub/Sub: GCP-native streaming. Pairs with Dataflow / Beam.

Data Formats

Apache Avro: Read Avro from Bronze, transform to Parquet or JSON for Silver. Common in Kafka shops; we handle the schema evolution. Adapter committed.

Transformation

Apache Hive / HiveQL: Legacy Hadoop transforms. Available for customers still on Hive estates.

Storage

Apache Hudi: Upsert-optimized table format. Common in AWS / EMR shops. Added if you bring it.
Apache Hive Metastore: Schema catalog for Spark / Iceberg. Added when you need a Hive-compatible catalog.

Don't see your tool?
Contact us and we will prioritize a discussion of how to support it.

How Sealing Works

The same seal. Every tool. Every tier.

What AuditSpine guarantees — regardless of your stack

Snapshot seal — every pipeline run produces a SHA-256 manifest hash over the current data state. Deterministic and reproducible.

Offline verification — the seal can be verified without re-running the pipeline. An auditor checks the hash; no recompute needed.

Tamper detection — modified, added, or deleted files are immediately detected. The seal breaks on any change.

Append-only audit chain — Bronze seal, Silver seal, Gold seal. Each transformation stage adds to the chain. Nothing overwrites history.

Pipeline metadata embedded — run ID, tier, timestamp, and tool config are embedded in every snapshot. The audit record travels with the data.

Format-agnostic — we seal at the transformation boundary, not inside the tool. Parquet, Avro, Delta, Iceberg — the seal logic is the same.
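The guarantees above can be sketched with a file-level manifest. This is a minimal illustration under stated assumptions, not the AuditSpine implementation: the function names (`build_manifest`, `seal_manifest`, `diff_manifest`), the manifest layout, and the way the previous stage's seal is folded into the next one are all hypothetical. It uses only the Python standard library and hashes raw file bytes, so it does not care whether the files are Parquet, Avro, or anything else.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def build_manifest(root: Path) -> dict:
    """Map each file's relative path to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def seal_manifest(manifest: dict, stage: str, prev_seal: str = "") -> str:
    """Hash the manifest with its stage name and the previous stage's seal (the chain link)."""
    payload = json.dumps(
        {"stage": stage, "prev": prev_seal, "files": manifest},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_manifest(recorded: dict, root: Path) -> dict:
    """Compare a recorded manifest with the files on disk; no pipeline rerun involved."""
    current = build_manifest(root)
    shared = recorded.keys() & current.keys()
    return {
        "modified": sorted(k for k in shared if recorded[k] != current[k]),
        "added": sorted(current.keys() - recorded.keys()),
        "deleted": sorted(recorded.keys() - current.keys()),
    }


with tempfile.TemporaryDirectory() as tmp:
    bronze = Path(tmp)
    (bronze / "a.csv").write_text("id,fare\n1,12.5\n")
    (bronze / "b.csv").write_text("id,fare\n2,7.0\n")

    manifest = build_manifest(bronze)
    bronze_seal = seal_manifest(manifest, "bronze")
    # The Silver seal commits to the Bronze seal; the file digest is a placeholder.
    silver_seal = seal_manifest({"a.parquet": "placeholder-digest"}, "silver", bronze_seal)

    clean = diff_manifest(manifest, bronze)

    (bronze / "a.csv").write_text("id,fare\n1,99.9\n")  # modify a file
    (bronze / "c.csv").write_text("id,fare\n3,1.0\n")   # add a file
    (bronze / "b.csv").unlink()                         # delete a file

    tampered = diff_manifest(manifest, bronze)
    tampered_seal = seal_manifest(build_manifest(bronze), "bronze")

print(clean)
print(tampered)
print(tampered_seal == bronze_seal)
```

In this sketch an auditor needs only the recorded manifest and the files to run `diff_manifest`. Run metadata such as a run ID or timestamp could be added to the sealed payload in the same way as `stage`, though the fields AuditSpine actually embeds are the ones listed above, not the ones shown here.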

We meet you where you are.

Ready to see it on your stack?