Repo: Snowflake (Arctic), published Jun 2, 2026

Snowflake-Labs/ml-pipelines

Python


Language: Python

License: Apache-2.0

Stars: 4

Forks: 1

Open issues: 0

Created: 2026-06-02T18:03:34Z

Pushed: 2026-06-02T18:07:10Z

Default branch: main

Fork: no

Archived: no

README:

Snowflake MLOps Framework

A production-ready framework for deploying and managing machine learning workflows on Snowflake with automated CI/CD.

Overview

This framework enables teams to:

  • Deploy ML workflows as scheduled Snowflake DAGs (Tasks)
  • Manage features with a versioned feature store
  • Automate deployments via GitHub Actions CI/CD
  • Share utilities through the ml_utils package

Getting Started

This repo is intended to be used as an example. Copy this directory into your own repository before running any GitHub Actions or making modifications. The CI/CD workflow is self-contained and will work once you configure the required secrets in your repo.

Prerequisites

  • Python 3.10
  • A Snowflake account with ACCOUNTADMIN access (for initial setup)
  • uv package manager
  • A Snowflake connection configured in ~/.snowflake/connections.toml

Installation

# Install dependencies and build the shared ml_utils package
uv sync --locked
uv build

---

Environment Setup

The framework uses three isolated environments. Each environment has its own database, role, warehouses, and service users.

| Environment | Database | Role | Usage |
|-------------|----------|------|-------|
| DEV | ML_PIPELINE_DEV_DB | ML_PIPELINE_DEV_ROLE | Local development and experimentation |
| STAGING | ML_PIPELINE_STAGING_DB | ML_PIPELINE_STAGING_ROLE | CI/CD deploys from feature branches |
| PROD | ML_PIPELINE_PROD_DB | ML_PIPELINE_PROD_ROLE | CI/CD deploys from main branch |

Creating DEV, STAGING, and PROD Environments

The setup/setup.sql script creates all required Snowflake resources for a given environment. Run it three times, once per environment, setting the env template variable to DEV, STAGING, or PROD.

For each environment the script will:

  • Create the environment role (ML_PIPELINE_<ENV>_ROLE) and grant it the required account-level privileges
  • Grant the role to the current user
  • Create the database (ML_PIPELINE_<ENV>_DB) with BASE_DATA and ML_PROJECTS schemas
  • Create DATA_STAGE, BUILD_STAGE, and JOB_STAGE stages
  • Create a default XS warehouse
  • Create a service user (GIT_ACTIONS_<ENV>) for CI/CD automation
  • Load the included demo dataset (MORTGAGE_LENDING_DEMO_DATA)
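The naming pattern in the environment table can be sketched in Python as follows. This is only an illustration, assuming every object name follows the pattern shown in the table; `EnvironmentResources` and `resources_for` are hypothetical names and are not part of the repository.

```python
from dataclasses import dataclass

# Environment labels taken from the README's environment table.
VALID_ENVIRONMENTS = ("DEV", "STAGING", "PROD")


@dataclass(frozen=True)
class EnvironmentResources:
    """Hypothetical container for the per-environment object names."""
    database: str
    role: str
    service_user: str


def resources_for(env: str) -> EnvironmentResources:
    """Derive object names from an environment label (assumed naming pattern)."""
    env = env.upper()
    if env not in VALID_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return EnvironmentResources(
        database=f"ML_PIPELINE_{env}_DB",
        role=f"ML_PIPELINE_{env}_ROLE",
        service_user=f"GIT_ACTIONS_{env}",
    )
```

For example, `resources_for("staging").role` evaluates to `ML_PIPELINE_STAGING_ROLE`, which matches the STAGING row of the table.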

Steps:

Use the Snowflake CLI (snow sql) to execute the setup script. The -D flag sets the env template variable:

# Create the DEV environment
snow sql -f setup/setup.sql -D "env=DEV" -c <connection_name>

# Create the STAGING environment
snow sql -f setup/setup.sql -D "env=STAGING" -c <connection_name>

# Create the PROD environment
snow sql -f setup/setup.sql -D "env=PROD" -c <connection_name>

Replace <connection_name> with the name of a connection in ~/.snowflake/connections.toml that has ACCOUNTADMIN access.
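If you prefer to script the three runs, the commands can be assembled in Python. The following is a minimal sketch, not part of the repository: `build_setup_command` and the connection name `my_admin_connection` are hypothetical, and the sketch only prints the commands rather than executing them.

```python
import shlex
from typing import List

SETUP_SCRIPT = "setup/setup.sql"
ENVIRONMENTS = ("DEV", "STAGING", "PROD")


def build_setup_command(env: str, connection: str) -> List[str]:
    """Return the argument list for one `snow sql` setup run."""
    return [
        "snow", "sql",
        "-f", SETUP_SCRIPT,    # SQL file to execute
        "-D", f"env={env}",    # value for the env template variable
        "-c", connection,      # connection name from connections.toml
    ]


if __name__ == "__main__":
    # "my_admin_connection" is a placeholder; substitute your own connection.
    for env in ENVIRONMENTS:
        print(shlex.join(build_setup_command(env, "my_admin_connection")))
```

To execute a command instead of printing it, pass the returned list to `subprocess.run(cmd, check=True)`.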

---

Creating PATs and Configuring GitHub Secrets

The CI/CD pipeline authenticates to Snowflake using the GIT_ACTIONS_STAGING and GIT_ACTIONS_PROD service users. You need to generate a Programmatic Access Token (PAT) for each.

1. Generate PATs in Snowflake

For each service user (GIT_ACTIONS_STAGING and GIT_ACTIONS_PROD), generate a password or PAT:

-- As ACCOUNTADMIN or USERADMIN
ALTER USER GIT_ACTIONS_STAGING SET PASSWORD = '<password>';
ALTER USER GIT_ACTIONS_PROD SET PASSWORD = '<password>';

Or, if using key-pair authentication, generate and assign RSA keys per Snowflake documentation.

2. Add GitHub Secrets

In your GitHub repository, go to Settings > Secrets and variables > Actions and add the following secrets:

| Secret Name | Value |
|---|---|
| SNOWFLAKE_ACCOUNT | Your Snowflake account identifier (e.g. myorg-myaccount) |
| SNOWFLAKE_STAGING_USER | GIT_ACTIONS_STAGING |
| SNOWFLAKE_STAGING_PASSWORD | Password/PAT for the STAGING service user |
| SNOWFLAKE_PROD_USER | GIT_ACTIONS_PROD |
| SNOWFLAKE_PROD_PASSWORD | Password/PAT for the PROD service user |

The deploy workflow (.github/workflows/deploy.yml) automatically selects the correct user and password based on the branch:

  • Push to `main` uses SNOWFLAKE_PROD_USER / SNOWFLAKE_PROD_PASSWORD
  • Push to any other branch uses SNOWFLAKE_STAGING_USER / SNOWFLAKE_STAGING_PASSWORD
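The branch rule above can be expressed as a small Python function. This is only an illustration of the selection logic, assuming the secret names from the table; the actual workflow implements it in .github/workflows/deploy.yml, and `select_credentials` is a hypothetical name.

```python
from typing import Mapping, Tuple


def select_credentials(branch: str, secrets: Mapping[str, str]) -> Tuple[str, str, str]:
    """Pick the target environment and credentials for a pushed branch.

    Pushes to main map to PROD; every other branch maps to STAGING.
    """
    env = "PROD" if branch == "main" else "STAGING"
    user = secrets[f"SNOWFLAKE_{env}_USER"]
    password = secrets[f"SNOWFLAKE_{env}_PASSWORD"]
    return env, user, password
```

A missing secret raises `KeyError`, which surfaces a misconfigured repository early instead of attempting a deploy with empty credentials.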

---

Local Development

Set these environment variables for local development:

export SNOWFLAKE_CONNECTION="your_connection_name" # From connections.toml (connection must have role defined)
export SNOWFLAKE_ENVIRONMENT="DEV"
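A script can validate these two variables before doing any work. The sketch below is an assumption about how such a check might look, not the repository's code; `load_local_settings` is a hypothetical helper, and the set of accepted environment values is taken from the environment table.

```python
import os
from typing import Dict, Mapping, Optional

VALID_ENVIRONMENTS = ("DEV", "STAGING", "PROD")


def load_local_settings(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read and validate SNOWFLAKE_CONNECTION and SNOWFLAKE_ENVIRONMENT."""
    environ = os.environ if environ is None else environ

    connection = environ.get("SNOWFLAKE_CONNECTION", "").strip()
    if not connection:
        raise RuntimeError("SNOWFLAKE_CONNECTION is not set")

    environment = environ.get("SNOWFLAKE_ENVIRONMENT", "").strip().upper()
    if environment not in VALID_ENVIRONMENTS:
        raise RuntimeError(
            f"SNOWFLAKE_ENVIRONMENT must be one of {VALID_ENVIRONMENTS}, got {environment!r}"
        )

    return {"connection": connection, "environment": environment}
```

Passing a mapping explicitly, instead of relying on `os.environ`, keeps the function easy to test.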

Deploy Feature Store Locally

python feature_store/setup_feature_store.py

This registers entities and feature views defined in feature_store/config.yml, creates warehouses, and applies environment-appropriate privileges. Feature views are automatically versioned based on their definition; only breaking changes (query, entities, schema) increment the version.
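One way to implement "only breaking changes increment the version" is to fingerprint the breaking fields and compare fingerprints. The sketch below illustrates that idea under stated assumptions: the field names `query`, `entities`, and `schema` come from the sentence above, while the hashing scheme and the function names are hypothetical and may differ from what setup_feature_store.py actually does.

```python
import hashlib
import json

# Fields whose changes are treated as breaking, per the README.
BREAKING_FIELDS = ("query", "entities", "schema")


def breaking_fingerprint(definition: dict) -> str:
    """Hash only the breaking fields of a feature view definition."""
    payload = {field: definition.get(field) for field in BREAKING_FIELDS}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def next_version(current_version: int, old: dict, new: dict) -> int:
    """Bump the version only when a breaking field has changed."""
    if breaking_fingerprint(old) != breaking_fingerprint(new):
        return current_version + 1
    return current_version
```

With this scheme, editing a non-breaking field such as a description leaves the fingerprint, and therefore the version, unchanged.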

Deploy a Project Locally

# Deploy only (creates resources, uploads code, schedules DAGs)
python scripts/deploy_project.py example_project

# Deploy and immediately execute all DAGs (useful for testing)
python scripts/deploy_project.py example_project --run-dag

Create a New Project

./scripts/create_project.sh my_project

This copies the template/ directory into projects/my_project and generates:

  • config.yml - DAG and compute resource configuration
  • utils.py - Project-level utilities (stage path helpers)
  • pip-requirements.txt - Python dependencies for container notebooks and ML Jobs

Clean Up Resources

# Preview what would be deleted
python scripts/cleanup.py example_project --dry-run

# Delete all resources for a project
python scripts/cleanup.py example_project

# Delete specific feature views
python scripts/cleanup.py --features example_features --dry-run
python scripts/cleanup.py --features example_features

# Delete everything (all projects, features, stages)
python scripts/cleanup.py --all --dry-run
python scripts/cleanup.py --all
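The --dry-run flag follows a common pattern: build the full list of planned actions first, then either print or execute it. The sketch below shows that pattern in isolation. It is not the code in scripts/cleanup.py; `run_cleanup`, the `execute` callback, and the example statement are all hypothetical.

```python
from typing import Callable, Iterable, List


def run_cleanup(
    statements: Iterable[str],
    execute: Callable[[str], None],
    dry_run: bool = True,
) -> List[str]:
    """Print the planned statements, or execute them when dry_run is False."""
    planned = list(statements)
    for statement in planned:
        if dry_run:
            print(f"[dry-run] {statement}")
        else:
            execute(statement)
    return planned


if __name__ == "__main__":
    # Hypothetical statement, for illustration only.
    run_cleanup(["DROP TASK IF EXISTS EXAMPLE_TASK"], execute=print, dry_run=True)
```

Defaulting `dry_run` to True means a caller has to opt in explicitly before anything is deleted.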

---

Feature Store

Configuration

Define entities and feature views in feature_store/config.yml:

entities:
  - name: loan_entity
    join_keys:
      - LOAN_ID

warehouses:
  - name: FEATURE_STORE
    warehouse_size: SMALL
    auto_suspend: 300
    default: true

feature_views:
  - name:…
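Once parsed, a configuration like the excerpt above is an ordinary nested mapping. The sketch below mirrors the visible part of the excerpt as a Python dict and shows two lookups a setup script might need. The helper names are hypothetical, the rule that exactly one warehouse is marked default is an assumption, and feature_views is left out because the excerpt is truncated.

```python
from typing import Dict, List

# Mirrors the visible part of feature_store/config.yml after YAML parsing.
config = {
    "entities": [
        {"name": "loan_entity", "join_keys": ["LOAN_ID"]},
    ],
    "warehouses": [
        {
            "name": "FEATURE_STORE",
            "warehouse_size": "SMALL",
            "auto_suspend": 300,
            "default": True,
        },
    ],
}


def default_warehouse(cfg: dict) -> dict:
    """Return the warehouse flagged default (assumes exactly one is flagged)."""
    defaults = [w for w in cfg.get("warehouses", []) if w.get("default")]
    if len(defaults) != 1:
        raise ValueError(f"expected exactly one default warehouse, found {len(defaults)}")
    return defaults[0]


def join_keys_by_entity(cfg: dict) -> Dict[str, List[str]]:
    """Map each entity name to its join keys."""
    return {e["name"]: list(e["join_keys"]) for e in cfg.get("entities", [])}
```

Failing fast on zero or multiple default warehouses avoids silently picking an arbitrary one.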

Excerpt shown — open the source for the full document.
