Repo · Snowflake (Arctic) · published Apr 29, 2026

Snowflake-Labs/sfguide-build-autonomous-pipelines-for-ai-agents

Python

Language: Python

License: Apache-2.0

Stars: 4

Forks: 9

Open issues: 0

Created: 2026-04-29T09:19:30Z

Pushed: 2026-05-30T19:26:56Z

Default branch: main

Fork: no

Archived: no

README:

Build Autonomous Pipelines for AI Agents

End-to-end quickstart for building a real-time Transportation Management System (TMS) data platform using Snowflake DCM Projects, Openflow, Dynamic Tables, and Cortex AI.

Stream real-time data from a Kafka endpoint using the Openflow Kafka connector. In Snowflake, build a three-layer architecture: RAW (ingested Kafka topics), TRANSFORM (two-layer Dynamic Table pipeline — cleansing + analytics), and ANALYTICS (views ready for Cortex Agent consumption).
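The layering described above can be summarized as a small data model. This is only an illustrative sketch: the `Layer` class and its field names are hypothetical, and the object counts are the ones quoted elsewhere in this README (9 raw tables, 9 clean and 5 analytic Dynamic Tables, 5 analytics views), not values read from the DCM project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One layer of the platform (hypothetical model, for illustration only)."""
    name: str
    purpose: str
    tables: int = 0
    dynamic_tables: int = 0
    views: int = 0


# Object counts as stated in this README.
LAYERS = [
    Layer("RAW", "ingested Kafka topics", tables=9),
    Layer("TRANSFORM", "cleansing + analytics Dynamic Tables", dynamic_tables=9 + 5),
    Layer("ANALYTICS", "views ready for Cortex Agent consumption", views=5),
]


def total_dynamic_tables(layers):
    """Sum the Dynamic Tables across all layers."""
    return sum(layer.dynamic_tables for layer in layers)


if __name__ == "__main__":
    for layer in LAYERS:
        print(f"{layer.name:<10} {layer.purpose}")
    print("dynamic tables:", total_dynamic_tables(LAYERS))
```

Data flows in one direction, RAW to TRANSFORM to ANALYTICS, so only the last layer needs to be exposed to the agent.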

Folder guide

| Folder | Description |
|---|---|
| [1_bootstrap/](1_bootstrap/README.md) | Account setup: SUMMIT_ADMIN role, grants, DCM project object, Openflow deployment, network access |
| [2_dcm_project/](2_dcm_project/README.md) | DCM project: database, schemas, 9 raw tables, 9 clean DTs, 5 analytic DTs, 5 analytics views, roles |
| [3_generate/](3_generate/README.md) | Synthetic data generator: Kafka/Redpanda producer and consumer |
| [4_openflow/](4_openflow/README.md) | Openflow connector configuration for Kafka ingestion |
| [5_fraud_detection/](5_fraud_detection/README.md) | AI-powered fraud detection: heuristic scorer + Cortex AI enrichment (AI_CLASSIFY, AI_COMPLETE) |
| [6_cortex-agent/](6_cortex-agent/) | Semantic view creation and Cortex Agent (SQL + Cortex Code prompts) |
| [7_streamlit/](7_streamlit/) | Streamlit in Snowflake operations dashboard (Cortex Code prompt) |
| [helpers/](helpers/README.md) | Reference and helper scripts |
| assets/ | Screenshots and diagrams for the quickstart |

Prerequisites

Snowflake Setup:

  • Enterprise Snowflake account with Openflow deployment and runtimes enabled
  • ACCOUNTADMIN access needed for bootstrap (roles, Openflow deployment, external access integration, network rules)

Kafka Setup:

  • We provide a Kafka endpoint for this quickstart — no setup needed
  • You can also use your own Kafka cluster (Confluent, Amazon MSK, Redpanda, etc.)

Local workspace:

Quickstart

1. Configure environment

Python environment

# Verify Python 3.13+ is available
python3 --version # Must show 3.13 or higher

# If below 3.13, install it first:
# macOS: brew install python@3.13
# Ubuntu: sudo apt install python3.13 python3.13-venv
# Then create the venv with that interpreter explicitly, for example with
# Homebrew on Apple Silicon: /opt/homebrew/bin/python3.13 -m venv .venv

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install kafka-python-ng python-dotenv snowflake-cli "snowflake-connector-python[pandas]" ipykernel
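The version requirement can also be checked from inside Python, which is handy at the top of a script or notebook. A minimal sketch, assuming only the 3.13 minimum stated above; the function name `python_is_supported` is hypothetical and not part of the repository:

```python
import sys

MIN_VERSION = (3, 13)  # minimum version stated in this README


def python_is_supported(version_info=None, minimum=MIN_VERSION):
    """Return True if a (major, minor, ...) version tuple meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        found = sys.version.split()[0]
        sys.exit(f"Python {MIN_VERSION[0]}.{MIN_VERSION[1]}+ required, found {found}")
    print("Python version OK")
```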

Create a Programmatic Access Token (PAT)

Generate a PAT restricted to the SUMMIT_ADMIN role. Run this in Snowsight or any authenticated session:

ALTER USER ADD PAT summit_admin_pat
ROLE_RESTRICTION = 'SUMMIT_ADMIN'
DAYS_TO_EXPIRY = 7
COMMENT = 'PAT for summit quickstart';

Copy the token_secret from the output — it is only shown once. You will use it as SNOWFLAKE_PAT in the next step.

> Note: The SUMMIT_ADMIN role must already be granted to your user; Step 2 (Bootstrap the account) does this. If you haven't bootstrapped yet, complete Step 2 first to configure the foundational roles and access for your account, then return here.
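Once the token is stored as SNOWFLAKE_PAT, a Python session could use it roughly as follows. This is a sketch under assumptions, not code from the repository: it assumes the PAT is passed to `snowflake.connector.connect()` in place of a password, and the helper names `snowflake_connect_kwargs` and `current_role` are hypothetical.

```python
import os

REQUIRED = ("SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER", "SNOWFLAKE_PAT")


def snowflake_connect_kwargs(env=None):
    """Build keyword arguments for snowflake.connector.connect() from env vars."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise ValueError(f"missing environment variables: {', '.join(missing)}")
    return {
        "account": env["SNOWFLAKE_ACCOUNT"],
        "user": env["SNOWFLAKE_USER"],
        # Assumption: the PAT is supplied where a password would normally go.
        "password": env["SNOWFLAKE_PAT"],
        "role": env.get("SNOWFLAKE_ROLE", "SUMMIT_ADMIN"),
        "warehouse": env.get("SNOWFLAKE_WAREHOUSE", "SUMMIT_WH"),
    }


def current_role(env=None):
    """Open a session and return CURRENT_ROLE(); needs network access."""
    # Imported lazily so the kwargs helper works without the connector installed.
    import snowflake.connector

    conn = snowflake.connector.connect(**snowflake_connect_kwargs(env))
    try:
        cur = conn.cursor()
        cur.execute("SELECT CURRENT_ROLE()")
        return cur.fetchone()[0]
    finally:
        conn.close()
```

Because the token is role-restricted, `current_role()` would be expected to report SUMMIT_ADMIN when the session opens successfully.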

Environment file

All scripts read their parameters from environment variables defined in .env. Copy the template and fill in your values:

cp .env.template .env
# edit .env with your Snowflake and Kafka credentials

| Env var | Default | Description |
|---|---|---|
| Snowflake connection | | |
| SNOWFLAKE_ACCOUNT | MYORG-MYACCOUNT | Snowflake account identifier (org-account format) |
| SNOWFLAKE_USER | myuser | Snowflake username |
| SNOWFLAKE_ROLE | SUMMIT_ADMIN | Role used for DCM operations |
| SNOWFLAKE_WAREHOUSE | SUMMIT_WH | Warehouse for CLI operations |
| SNOWFLAKE_CONNECTION_NAME | summit | Named connection for snow CLI |
| SNOWFLAKE_PAT | (none) | Programmatic Access Token for authentication |
| DCM Project | | |
| DCM_DATABASE | DCM_DB | Database containing the DCM project object |
| DCM_SCHEMA | PROJECTS | Schema containing the DCM project object |
| DCM_PROJECT | DCM_PROJECT_DEV | Name of the DCM project |
| Kafka | | |
| KAFKA_BOOTSTRAP_SERVERS | localhost:9092 | Broker address |
| KAFKA_USERNAME | kafka_user | SASL username |
| KAFKA_PASSWORD | kafka_pass | SASL password |
| KAFKA_TOPIC_PREFIX | tms | Prefix for all topic names |
| TMS Producer | | |
| TMS_ORDER_COUNT | 1 | Number of orders to generate |
| TMS_DELAY | 1.0 | Seconds between orders |
| TMS_FRAUD_RATE | 0.01 | Fraction of fraudulent orders |
| TMS_RETURN_RATE | 0.02 | Fraction of returned packages |
| TMS_DAYS_BACK | 7 | Spread orders over past N days |
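To illustrate how the TMS Producer variables might be consumed, the sketch below parses them into typed settings with the defaults from the table. It is not the repository's generator: `ProducerSettings` and `is_fraudulent` are hypothetical names, and treating TMS_FRAUD_RATE as an independent per-order probability is an assumption.

```python
import os
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ProducerSettings:
    """Typed view of the TMS_* variables (defaults taken from the table above)."""
    order_count: int = 1
    delay: float = 1.0
    fraud_rate: float = 0.01
    return_rate: float = 0.02
    days_back: int = 7

    def __post_init__(self):
        # Rates are fractions, so they must lie in [0, 1].
        for name in ("fraud_rate", "return_rate"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")

    @classmethod
    def from_env(cls, env=None):
        """Read settings from a mapping (os.environ by default)."""
        env = os.environ if env is None else env
        return cls(
            order_count=int(env.get("TMS_ORDER_COUNT", cls.order_count)),
            delay=float(env.get("TMS_DELAY", cls.delay)),
            fraud_rate=float(env.get("TMS_FRAUD_RATE", cls.fraud_rate)),
            return_rate=float(env.get("TMS_RETURN_RATE", cls.return_rate)),
            days_back=int(env.get("TMS_DAYS_BACK", cls.days_back)),
        )


def is_fraudulent(settings, rng):
    """Assumed interpretation: flag each order independently with fraud_rate."""
    return rng.random() < settings.fraud_rate


if __name__ == "__main__":
    settings = ProducerSettings.from_env()
    rng = random.Random()
    flagged = sum(is_fraudulent(settings, rng) for _ in range(settings.order_count))
    print(settings, "flagged:", flagged)
```

Parsing once into a frozen dataclass keeps type conversion and range checks in one place, so a malformed value fails at startup rather than partway through a run.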

Configure Snowflake CLI

source .env
bash helpers/setup_snow_cli_connection.sh

Configure Redpanda CLI

source .env
bash helpers/setup_rpk_profile.sh

Verify Kafka connectivity

source .env
python3 helpers/test_kafka_connection.py
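For readers who want to see what such a connectivity check could involve, here is a hedged sketch using `KafkaConsumer` from kafka-python-ng. It is not the contents of helpers/test_kafka_connection.py. The SASL_SSL protocol and SCRAM-SHA-256 mechanism are assumptions that depend on your broker, and the function names are hypothetical.

```python
import os


def kafka_client_config(env=None):
    """Translate the KAFKA_* variables into kafka-python client arguments."""
    env = os.environ if env is None else env
    servers = env.get("KAFKA_BOOTSTRAP_SERVERS", "localhost:9092")
    return {
        "bootstrap_servers": [s.strip() for s in servers.split(",") if s.strip()],
        "security_protocol": "SASL_SSL",    # assumption: adjust to your broker
        "sasl_mechanism": "SCRAM-SHA-256",  # assumption: adjust to your broker
        "sasl_plain_username": env.get("KAFKA_USERNAME", "kafka_user"),
        "sasl_plain_password": env.get("KAFKA_PASSWORD", "kafka_pass"),
    }


def list_topics(env=None):
    """Connect and return the topics that start with KAFKA_TOPIC_PREFIX."""
    # Imported lazily; kafka-python-ng installs the `kafka` package.
    from kafka import KafkaConsumer

    env = os.environ if env is None else env
    prefix = env.get("KAFKA_TOPIC_PREFIX", "tms")
    consumer = KafkaConsumer(**kafka_client_config(env))
    try:
        return sorted(t for t in consumer.topics() if t.startswith(prefix))
    finally:
        consumer.close()


if __name__ == "__main__":
    for topic in list_topics():
        print(topic)
```

An empty result after a successful connection would suggest the producer has not created any topics with that prefix yet, rather than a network problem.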

2. Bootstrap the account

Run 1_bootstrap/setup.sql as ACCOUNTADMIN in Snowsight or via the CLI:

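# -c takes a named Snowflake CLI connection; use one that can run as ACCOUNTADMIN
snow sql -f 1_bootstrap/setup.sql -c <connection_name>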

This creates the SUMMIT_ADMIN role, DCM_DB database, SUMMIT_WH warehouse, the Openflow deployment (SUMMIT_DEPLOYMENT), and network access rules for the Kafka broker.

See [1_bootstrap/README.md](1_bootstrap/README.md) for full details.

3. Deploy the DCM project

Update 2_dcm_project/manifest.yml with your account identifier, then:

snow dcm plan --target DCM_DEV --from 2_dcm_project
snow dcm deploy --target DCM_DEV --from 2_dcm_project

This creates the entire data platform: database, 4 schemas, 9 raw tables, 14 dynamic tables (9 clean + 5 analytic), 5 analytics views, warehouse, and roles with grants.
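If you prefer to drive the two commands from Python, for example in a notebook, a thin wrapper could look like the sketch below. It only shells out to the `snow dcm plan` and `snow dcm deploy` invocations shown above; the function names are hypothetical, and stopping before deploy when plan fails is a design choice of this sketch, not documented behavior of the quickstart.

```python
import subprocess


def dcm_command(action, target="DCM_DEV", source="2_dcm_project"):
    """Build the argument list for `snow dcm plan` or `snow dcm deploy`."""
    if action not in ("plan", "deploy"):
        raise ValueError(f"unsupported action: {action}")
    return ["snow", "dcm", action, "--target", target, "--from", source]


def plan_then_deploy(target="DCM_DEV", source="2_dcm_project", run=subprocess.run):
    """Run plan, then deploy; a failing plan raises and deploy never runs."""
    for action in ("plan", "deploy"):
        # check=True raises CalledProcessError on a non-zero exit code.
        run(dcm_command(action, target, source), check=True)


if __name__ == "__main__":
    plan_then_deploy()
```

The `run` parameter exists so the wrapper can be exercised with a stub instead of the real CLI.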

4. Post-deploy and seed data

# Create the Openflow runtime in the deployed database
snow sql -f 2_dcm_project/scripts/post_deploy.sql \
--variable "env_suffix=_DEV"…
