Snowflake-Labs/sfguide-build-autonomous-pipelines-for-ai-agents
Python
Captured source
Language: Python
License: Apache-2.0
Stars: 4
Forks: 9
Open issues: 0
Created: 2026-04-29T09:19:30Z
Pushed: 2026-05-30T19:26:56Z
Default branch: main
Fork: no
Archived: no
README:
Build Autonomous Pipelines for AI Agents
End-to-end quickstart for building a real-time Transportation Management System (TMS) data platform using Snowflake DCM Projects, Openflow, Dynamic Tables, and Cortex AI.
Stream real-time data from a Kafka endpoint using the Openflow Kafka connector. In Snowflake, build a three-layer architecture: RAW (ingested Kafka topics), TRANSFORM (two-layer Dynamic Table pipeline — cleansing + analytics), and ANALYTICS (views ready for Cortex Agent consumption).
Folder guide
| Folder | Description |
|---|---|
| [1_bootstrap/](1_bootstrap/README.md) | Account setup: SUMMIT_ADMIN role, grants, DCM project object, Openflow deployment, network access |
| [2_dcm_project/](2_dcm_project/README.md) | DCM project: database, schemas, 9 raw tables, 9 clean DTs, 5 analytic DTs, 5 analytics views, roles |
| [3_generate/](3_generate/README.md) | Synthetic data generator: Kafka/Redpanda producer and consumer |
| [4_openflow/](4_openflow/README.md) | Openflow connector configuration for Kafka ingestion |
| [5_fraud_detection/](5_fraud_detection/README.md) | AI-powered fraud detection: heuristic scorer + Cortex AI enrichment (AI_CLASSIFY, AI_COMPLETE) |
| [6_cortex-agent/](6_cortex-agent/) | Semantic view creation and Cortex Agent (SQL + Cortex Code prompts) |
| [7_streamlit/](7_streamlit/) | Streamlit in Snowflake operations dashboard (Cortex Code prompt) |
| [helpers/](helpers/README.md) | Reference and helper scripts |
| assets/ | Screenshots and diagrams for the quickstart |
Prerequisites
Snowflake Setup:
- Enterprise Snowflake account with Openflow deployment and runtimes enabled
- ACCOUNTADMIN access needed for bootstrap (roles, Openflow deployment, external access integration, network rules)
Kafka Setup:
- We provide a Kafka endpoint for this quickstart — no setup needed
- You can also use your own Kafka cluster (Confluent, Amazon MSK, Redpanda, etc.)
Local workspace:
- Python 3.13+
- Snowflake CLI (snow) v3.16.0+ installed
- Redpanda CLI (rpk) for topic management
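The local workspace requirements above can be checked before starting. The following is a minimal sketch, not part of the repository: it only confirms the Python version and that `snow` and `rpk` are on `PATH`. It does not verify that the Snowflake CLI is v3.16.0 or newer, and the function name `check_prerequisites` is hypothetical.

```python
import shutil
import sys

MIN_PYTHON = (3, 13)             # minimum version from the prerequisites list
REQUIRED_CLIS = ("snow", "rpk")  # Snowflake CLI and Redpanda CLI executables


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a dict mapping each prerequisite to True (found) or False."""
    results = {"python>=3.13": tuple(version_info[:2]) >= MIN_PYTHON}
    for cli in REQUIRED_CLIS:
        # shutil.which returns None when the executable is not on PATH
        results[cli] = which(cli) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

The `version_info` and `which` parameters exist only so the check can be exercised without changing the local machine.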
Quickstart
1. Configure environment
Python environment
# Verify Python 3.13+ is available
python3 --version  # Must show 3.13 or higher

# If below 3.13, install it first:
#   macOS:  brew install python@3.13
#   Ubuntu: sudo apt install python3.13 python3.13-venv
# Then use the explicit path: /opt/homebrew/bin/python3.13 -m venv .venv

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install kafka-python-ng python-dotenv snowflake-cli snowflake-connector-python ipykernel
pip install "snowflake-connector-python[pandas]"
Create a Programmatic Access Token (PAT)
Generate a PAT restricted to the SUMMIT_ADMIN role. Run this in Snowsight or any authenticated session:
ALTER USER ADD PAT summit_admin_pat ROLE_RESTRICTION = 'SUMMIT_ADMIN' DAYS_TO_EXPIRY = 7 COMMENT = 'PAT for summit quickstart';
Copy the token_secret from the output — it is only shown once. You will use it as SNOWFLAKE_PAT in the next step.
> Note: The SUMMIT_ADMIN role must already be granted to your user (Step 2 does this). If you haven't bootstrapped yet, complete Step 2 (Bootstrap the account) first to configure the foundational roles and access for your account.
Environment file
All scripts read parameters from environment variables (.env). Copy the template and fill in your values:
cp .env.template .env
# edit .env with your Snowflake and Kafka credentials
| Env var | Default | Description |
|---|---|---|
| Snowflake connection | | |
| SNOWFLAKE_ACCOUNT | MYORG-MYACCOUNT | Snowflake account identifier (org-account format) |
| SNOWFLAKE_USER | myuser | Snowflake username |
| SNOWFLAKE_ROLE | SUMMIT_ADMIN | Role used for DCM operations |
| SNOWFLAKE_WAREHOUSE | SUMMIT_WH | Warehouse for CLI operations |
| SNOWFLAKE_CONNECTION_NAME | summit | Named connection for snow CLI |
| SNOWFLAKE_PAT | (none) | Programmatic Access Token for authentication |
| DCM Project | | |
| DCM_DATABASE | DCM_DB | Database containing the DCM project object |
| DCM_SCHEMA | PROJECTS | Schema containing the DCM project object |
| DCM_PROJECT | DCM_PROJECT_DEV | Name of the DCM project |
| Kafka | | |
| KAFKA_BOOTSTRAP_SERVERS | localhost:9092 | Broker address |
| KAFKA_USERNAME | kafka_user | SASL username |
| KAFKA_PASSWORD | kafka_pass | SASL password |
| KAFKA_TOPIC_PREFIX | tms | Prefix for all topic names |
| TMS Producer | | |
| TMS_ORDER_COUNT | 1 | Number of orders to generate |
| TMS_DELAY | 1.0 | Seconds between orders |
| TMS_FRAUD_RATE | 0.01 | Fraction of fraudulent orders |
| TMS_RETURN_RATE | 0.02 | Fraction of returned packages |
| TMS_DAYS_BACK | 7 | Spread orders over past N days |
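To make the TMS Producer settings concrete, here is a rough sketch of how a generator could interpret them. It is not the code in 3_generate/: the function names and the order fields (`order_id`, `created_at`, `is_fraud`, `is_returned`) are hypothetical, and only the variable names and defaults come from the table above.

```python
import random
from datetime import datetime, timedelta, timezone

# Defaults copied from the table above; everything else is illustrative.
DEFAULTS = {
    "TMS_ORDER_COUNT": "1",
    "TMS_DELAY": "1.0",
    "TMS_FRAUD_RATE": "0.01",
    "TMS_RETURN_RATE": "0.02",
    "TMS_DAYS_BACK": "7",
}


def load_producer_settings(env):
    """Read TMS_* settings from a mapping such as os.environ, with defaults."""
    def get(key):
        return env.get(key, DEFAULTS[key])

    return {
        "order_count": int(get("TMS_ORDER_COUNT")),
        "delay": float(get("TMS_DELAY")),
        "fraud_rate": float(get("TMS_FRAUD_RATE")),
        "return_rate": float(get("TMS_RETURN_RATE")),
        "days_back": int(get("TMS_DAYS_BACK")),
    }


def generate_orders(settings, now=None, rng=None):
    """Yield hypothetical order dicts (field names are NOT the repo's schema)."""
    if rng is None:
        rng = random.Random()
    if now is None:
        now = datetime.now(timezone.utc)
    window = timedelta(days=settings["days_back"]).total_seconds()
    for i in range(settings["order_count"]):
        yield {
            "order_id": i,
            # Spread orders uniformly over the past N days
            "created_at": now - timedelta(seconds=rng.uniform(0, window)),
            # Each order is flagged independently with the configured rate
            "is_fraud": rng.random() < settings["fraud_rate"],
            "is_returned": rng.random() < settings["return_rate"],
        }
```

A producer loop would typically call `time.sleep(settings["delay"])` between sends to honor TMS_DELAY; that step is left out here so the sketch stays free of Kafka dependencies.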
Configure Snowflake CLI
source .env
bash helpers/setup_snow_cli_connection.sh
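Scripts that use snowflake-connector-python instead of the snow CLI need the same values from .env. The sketch below builds connection keyword arguments from those variables. It assumes the PAT is supplied in place of a password, which should be confirmed against the connector version you install, and the helper name `connection_kwargs` is hypothetical.

```python
import os

REQUIRED = ("SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER", "SNOWFLAKE_PAT")


def connection_kwargs(env=os.environ):
    """Build keyword arguments for snowflake.connector.connect from env vars."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "account": env["SNOWFLAKE_ACCOUNT"],
        "user": env["SNOWFLAKE_USER"],
        # Assumption: the PAT is passed where a password would normally go
        "password": env["SNOWFLAKE_PAT"],
        # Defaults below are the ones listed in the environment table
        "role": env.get("SNOWFLAKE_ROLE", "SUMMIT_ADMIN"),
        "warehouse": env.get("SNOWFLAKE_WAREHOUSE", "SUMMIT_WH"),
    }
```

With the connector installed, usage would look like `snowflake.connector.connect(**connection_kwargs())`. Keeping the dictionary construction separate means the token never needs to be hard-coded in a script.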
Configure Redpanda CLI
source .env
bash helpers/setup_rpk_profile.sh
Verify Kafka connectivity
source .env
python3 helpers/test_kafka_connection.py
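If the connectivity test fails, it can help to separate network problems from credential problems. The following sketch is not the repository's helper script: it only checks that each broker in KAFKA_BOOTSTRAP_SERVERS accepts a TCP connection, without performing any SASL handshake. Both function names are hypothetical.

```python
import socket


def parse_bootstrap_servers(value):
    """Split 'host1:9092,host2:9092' into [(host, port), ...]."""
    servers = []
    for item in value.split(","):
        # rpartition splits on the last colon, so the port is always the tail
        host, _, port = item.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    import os

    value = os.environ.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092")
    for host, port in parse_bootstrap_servers(value):
        status = "reachable" if reachable(host, port) else "unreachable"
        print(f"{host}:{port} {status}")
```

A broker that is reachable here but still fails the real test points toward the SASL username, password, or security settings rather than the network rules.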
2. Bootstrap the account
Run 1_bootstrap/setup.sql as ACCOUNTADMIN in Snowsight or via the CLI:
snow sql -f 1_bootstrap/setup.sql -c
This creates the SUMMIT_ADMIN role, DCM_DB database, SUMMIT_WH warehouse, the Openflow deployment (SUMMIT_DEPLOYMENT), and network access rules for the Kafka broker.
See [1_bootstrap/README.md](1_bootstrap/README.md) for full details.
3. Deploy the DCM project
Update 2_dcm_project/manifest.yml with your account identifier, then:
snow dcm plan --target DCM_DEV --from 2_dcm_project
snow dcm deploy --target DCM_DEV --from 2_dcm_project
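When this step is automated, the plan should gate the deploy. A minimal sketch of that ordering follows, using only the two commands shown above. It assumes the snow CLI exits with a non-zero status when the plan fails, and the function names are hypothetical.

```python
import subprocess

TARGET = "DCM_DEV"             # target name from the commands above
PROJECT_DIR = "2_dcm_project"  # project folder from the commands above


def dcm_command(action, target=TARGET, project_dir=PROJECT_DIR):
    """Return the argument list for `snow dcm plan` or `snow dcm deploy`."""
    if action not in ("plan", "deploy"):
        raise ValueError(f"unsupported action: {action}")
    return ["snow", "dcm", action, "--target", target, "--from", project_dir]


def plan_then_deploy(run=subprocess.run):
    """Run plan first; deploy only runs if plan succeeded."""
    for action in ("plan", "deploy"):
        # check=True raises CalledProcessError on a non-zero exit status,
        # which stops the loop before deploy when plan fails
        run(dcm_command(action), check=True)


if __name__ == "__main__":
    plan_then_deploy()
```

Passing the command as a list avoids shell quoting issues, and the `run` parameter allows a dry run by substituting a function that only prints the command.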
This creates the entire data platform: database, 4 schemas, 9 raw tables, 14 dynamic tables (9 clean + 5 analytic), 5 analytics views, warehouse, and roles with grants.
4. Post-deploy and seed data
# Create the Openflow runtime in the deployed database
snow sql -f 2_dcm_project/scripts/post_deploy.sql \
  --variable "env_suffix=_DEV"…
Excerpt shown — open the source for the full document.
Notability: 3.0/10 (low-traction guide repo)