Snowflake-Labs/ml-pipelines
Language: Python
License: Apache-2.0
Stars: 4
Forks: 1
Open issues: 0
Created: 2026-06-02T18:03:34Z
Pushed: 2026-06-02T18:07:10Z
Default branch: main
Fork: no
Archived: no
README:
Snowflake MLOps Framework
A production-ready framework for deploying and managing machine learning workflows on Snowflake with automated CI/CD.
Overview
This framework enables teams to:
- Deploy ML workflows as scheduled Snowflake DAGs (Tasks)
- Manage features with a versioned feature store
- Automate deployments via GitHub Actions CI/CD
- Share utilities through the ml_utils package
Getting Started
This repo is intended to be used as an example. Copy this directory into your own repository before running any GitHub Actions or making modifications. The CI/CD workflow is self-contained and will work once you configure the required secrets in your repo.
Prerequisites
- Python 3.10
- A Snowflake account with ACCOUNTADMIN access (for initial setup)
- uv package manager
- A Snowflake connection configured in ~/.snowflake/connections.toml
Installation
# Install dependencies and build the shared ml_utils package
uv sync --locked
uv build
---
Environment Setup
The framework uses three isolated environments. Each environment has its own database, role, warehouses, and service users.
| Environment | Database | Role | Usage |
|-------------|----------|------|-------|
| DEV | ML_PIPELINE_DEV_DB | ML_PIPELINE_DEV_ROLE | Local development and experimentation |
| STAGING | ML_PIPELINE_STAGING_DB | ML_PIPELINE_STAGING_ROLE | CI/CD deploys from feature branches |
| PROD | ML_PIPELINE_PROD_DB | ML_PIPELINE_PROD_ROLE | CI/CD deploys from main branch |
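The naming pattern in the table can be written down as a small helper. This is a minimal sketch for illustration only: the function name `resource_names` and the returned dictionary shape are assumptions, not part of the repository.

```python
# Illustrative helper (hypothetical, not from the repository): derive the
# database and role names for an environment from the table's naming pattern.
ENVIRONMENTS = ("DEV", "STAGING", "PROD")


def resource_names(env: str) -> dict[str, str]:
    """Return the database and role names for one environment."""
    env = env.upper()
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return {
        "database": f"ML_PIPELINE_{env}_DB",
        "role": f"ML_PIPELINE_{env}_ROLE",
    }


if __name__ == "__main__":
    for name in ENVIRONMENTS:
        print(name, resource_names(name))
```

Keeping the pattern in one place helps avoid a script accidentally pointing at the PROD database while running under the DEV role.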
Creating DEV, STAGING, and PROD Environments
The setup/setup.sql script creates all required Snowflake resources for a given environment. Run it three times, once per environment, setting the env template variable to DEV, STAGING, or PROD.
For each environment the script will:
- Create the environment role (for example, ML_PIPELINE_DEV_ROLE) and grant it required account-level privileges
- Grant the role to the current user
- Create the database (for example, ML_PIPELINE_DEV_DB) with BASE_DATA and ML_PROJECTS schemas
- Create DATA_STAGE, BUILD_STAGE, and JOB_STAGE stages
- Create a default XS warehouse
- Create a service user (for example, GIT_ACTIONS_STAGING) for CI/CD automation
- Load the included demo dataset (MORTGAGE_LENDING_DEMO_DATA)
Steps:
Use the Snowflake CLI (snow sql) to execute the setup script. The -D flag sets the env template variable used inside the script:
# Create the DEV environment
snow sql -f setup/setup.sql -D "env=DEV" -c
# Create the STAGING environment
snow sql -f setup/setup.sql -D "env=STAGING" -c
# Create the PROD environment
snow sql -f setup/setup.sql -D "env=PROD" -c
After the -c flag, pass the name of a connection in ~/.snowflake/connections.toml that has ACCOUNTADMIN access.
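If you prefer to script the three runs, the commands above can be generated in a loop. The following is a sketch under stated assumptions: the function names are hypothetical, the connection name is whatever you have defined in connections.toml, and only the `snow sql` flags already shown above are used.

```python
import subprocess

SETUP_SCRIPT = "setup/setup.sql"
ENVIRONMENTS = ("DEV", "STAGING", "PROD")


def build_setup_commands(connection: str) -> list[list[str]]:
    """Build one `snow sql` argument list per environment."""
    return [
        ["snow", "sql", "-f", SETUP_SCRIPT, "-D", f"env={env}", "-c", connection]
        for env in ENVIRONMENTS
    ]


def run_setup(connection: str) -> None:
    """Run the setup script for every environment, stopping on the first failure."""
    for command in build_setup_commands(connection):
        subprocess.run(command, check=True)
```

Calling `run_setup("my_admin_connection")` would execute the three commands in order; the connection name here is a placeholder for your own.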
---
Creating PATs and Configuring GitHub Secrets
The CI/CD pipeline authenticates to Snowflake using the GIT_ACTIONS_STAGING and GIT_ACTIONS_PROD service users. You need to generate a Programmatic Access Token (PAT) for each.
1. Generate PATs in Snowflake
For each service user (GIT_ACTIONS_STAGING and GIT_ACTIONS_PROD), generate a password or PAT:
-- As ACCOUNTADMIN or USERADMIN
ALTER USER GIT_ACTIONS_STAGING SET PASSWORD = '';
ALTER USER GIT_ACTIONS_PROD SET PASSWORD = '';
Or, if using key-pair authentication, generate and assign RSA keys per Snowflake documentation.
2. Add GitHub Secrets
In your GitHub repository, go to Settings > Secrets and variables > Actions and add the following secrets:
| Secret Name | Value |
|---|---|
| SNOWFLAKE_ACCOUNT | Your Snowflake account identifier (e.g. myorg-myaccount) |
| SNOWFLAKE_STAGING_USER | GIT_ACTIONS_STAGING |
| SNOWFLAKE_STAGING_PASSWORD | Password/PAT for the STAGING service user |
| SNOWFLAKE_PROD_USER | GIT_ACTIONS_PROD |
| SNOWFLAKE_PROD_PASSWORD | Password/PAT for the PROD service user |
The deploy workflow (.github/workflows/deploy.yml) automatically selects the correct user and password based on the branch:
- Push to `main` uses SNOWFLAKE_PROD_USER / SNOWFLAKE_PROD_PASSWORD
- Push to any other branch uses SNOWFLAKE_STAGING_USER / SNOWFLAKE_STAGING_PASSWORD
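The branch rule above amounts to a two-way choice. The actual selection happens inside the GitHub Actions workflow file, which is not reproduced here; the Python function below is only a hypothetical restatement of that rule, with an assumed name.

```python
def select_secret_names(branch: str) -> tuple[str, str]:
    """Return the (user, password) secret names for a pushed branch.

    Mirrors the documented rule: `main` deploys with the PROD service user,
    every other branch deploys with the STAGING service user.
    """
    if branch == "main":
        return ("SNOWFLAKE_PROD_USER", "SNOWFLAKE_PROD_PASSWORD")
    return ("SNOWFLAKE_STAGING_USER", "SNOWFLAKE_STAGING_PASSWORD")
```

Note that the comparison is exact: under this rule a branch named `Main` or `release/main` would deploy to STAGING.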
---
Local Development
Set these environment variables for local development:
export SNOWFLAKE_CONNECTION="your_connection_name"  # From connections.toml (connection must have role defined)
export SNOWFLAKE_ENVIRONMENT="DEV"
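A script that depends on these two variables may want to fail early when they are missing. The check below is a minimal sketch, not the repository's code: the function name is hypothetical, and restricting SNOWFLAKE_ENVIRONMENT to DEV, STAGING, or PROD is an assumption based on the three environments described earlier.

```python
import os
from collections.abc import Mapping

# Assumption: only the three environments described in this README are valid.
VALID_ENVIRONMENTS = {"DEV", "STAGING", "PROD"}


def read_local_settings(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read and validate the two local-development variables."""
    connection = environ.get("SNOWFLAKE_CONNECTION", "").strip()
    environment = environ.get("SNOWFLAKE_ENVIRONMENT", "").strip().upper()
    if not connection:
        raise RuntimeError("SNOWFLAKE_CONNECTION is not set")
    if environment not in VALID_ENVIRONMENTS:
        raise RuntimeError(
            f"SNOWFLAKE_ENVIRONMENT must be one of {sorted(VALID_ENVIRONMENTS)}, "
            f"got {environment!r}"
        )
    return {"connection": connection, "environment": environment}
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests.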
Deploy Feature Store Locally
python feature_store/setup_feature_store.py
This registers entities and feature views defined in feature_store/config.yml, creates warehouses, and applies environment-appropriate privileges. Feature views are automatically versioned based on their definition; only breaking changes (query, entities, schema) increment the version.
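One way to picture "version only on breaking changes" is to fingerprint just the breaking fields and compare. This README does not show how the framework detects changes, so the following is an illustrative sketch only: the function names, the dictionary shape of a definition, and the integer version are all assumptions.

```python
import hashlib
import json

# The README names query, entities, and schema as the breaking fields.
BREAKING_FIELDS = ("query", "entities", "schema")


def definition_fingerprint(definition: dict) -> str:
    """Hash only the fields whose change counts as breaking."""
    breaking = {field: definition.get(field) for field in BREAKING_FIELDS}
    payload = json.dumps(breaking, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def next_version(current_version: int, old: dict, new: dict) -> int:
    """Increment the version only when a breaking field differs."""
    if definition_fingerprint(old) == definition_fingerprint(new):
        return current_version
    return current_version + 1
```

Under this sketch, editing a non-breaking field such as a description leaves the version unchanged, while editing the query produces a new version.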
Deploy a Project Locally
# Deploy only (creates resources, uploads code, schedules DAGs)
python scripts/deploy_project.py example_project
# Deploy and immediately execute all DAGs (useful for testing)
python scripts/deploy_project.py example_project --run-dag
Create a New Project
./scripts/create_project.sh my_project
This copies the template/ directory into projects/my_project and generates:
- config.yml - DAG and compute resource configuration
- utils.py - Project-level utilities (stage path helpers)
- pip-requirements.txt - Python dependencies for container notebooks and ML Jobs
Clean Up Resources
# Preview what would be deleted
python scripts/cleanup.py example_project --dry-run
# Delete all resources for a project
python scripts/cleanup.py example_project
# Delete specific feature views
python scripts/cleanup.py --features example_features --dry-run
python scripts/cleanup.py --features example_features
# Delete everything (all projects, features, stages)
python scripts/cleanup.py --all --dry-run
python scripts/cleanup.py --all
---
Feature Store
Configuration
Define entities and feature views in feature_store/config.yml:
entities:
  - name: loan_entity
    join_keys:
      - LOAN_ID
warehouses:
  - name: FEATURE_STORE
    warehouse_size: SMALL
    auto_suspend: 300
    default: true
feature_views:
  - name:…
Excerpt shown — open the source for the full document.
Notability
2.0/10. Low-star repo by Snowflake, likely minor tool.