Repo · Snowflake (Arctic) · published Jan 12, 2026 · seen 5d

Snowflake-Labs/sfguide-quantitative-research-ai-functions-and-cortex-code

Jupyter Notebook

Language: Jupyter Notebook

License: Apache-2.0

Stars: 3

Forks: 6

Open issues: 0

Created: 2026-01-12T19:07:57Z

Pushed: 2026-02-21T00:04:16Z

Default branch: main

Fork: no

Archived: no

README:

Quant Research and Data Science with Cortex Code and AI Functions using Snowflake Public Data

Transform unstructured earnings call transcripts into actionable investment insights using Snowflake Cortex AI, ML Model Registry, and intelligent agents - all accelerated by Cortex Code.

Built By: Harry Yu, Senior Data Scientist, Finance | Snowflake 📧 [h.yu@snowflake.com](mailto:h.yu@snowflake.com) | 💼 LinkedIn | 💻 GitHub

---

Why This Matters

Financial analysts spend countless hours manually reviewing earnings call transcripts. This guide demonstrates how to systematically process unstructured data at scale using AI Functions (AI_COMPLETE, AI_SQL) - turning raw transcript text into structured sentiment scores, analyst participation metrics, and investment signals that feed directly into quantitative models.

> Full Guide: For detailed architecture, business impact, and use cases, see the Snowflake Developers Guide.

---

What You Will Learn

  • How to use Cortex Code to build entire ML pipelines through natural language
  • How to extract structured insights from unstructured text using AI_COMPLETE()
  • How to train and register ML models in Snowflake's Model Registry
  • How to create semantic search over unstructured data with Cortex Search
  • How to build a Semantic View for natural language SQL queries via Cortex Analyst
  • How to build a Cortex Agent that combines multiple AI tools
  • How to access your agent through Snowflake Intelligence

What You Will Build

  • A sentiment analysis pipeline using AI_COMPLETE() to score earnings call transcripts (1-10 scale)
  • A LightGBM stock prediction model with walk-forward validation, registered in Snowflake Model Registry
  • A Cortex Search service for semantic search over sentiment insights
  • A Semantic View enabling natural language queries via Cortex Analyst
  • A Cortex Agent that orchestrates ML predictions, structured queries, semantic search, and email notifications—all accessible via Snowflake Intelligence
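Walk-forward validation, mentioned for the LightGBM model above, can be sketched in plain Python. This is a minimal expanding-window variant under stated assumptions: the function name `walk_forward_splits` and the parameters `min_train` and `test_size` are hypothetical, and the repository's notebooks may split the data differently.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_periods: int, min_train: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs over time-ordered period indices.

    Expanding window: each fold trains on every period before its test block,
    so the model is never fitted on data from the future.
    """
    if min_train <= 0 or test_size <= 0:
        raise ValueError("min_train and test_size must be positive")
    start = min_train
    while start + test_size <= n_periods:
        train_idx = list(range(0, start))
        test_idx = list(range(start, start + test_size))
        yield train_idx, test_idx
        start += test_size  # the next fold absorbs this test block into training


# Hypothetical example: 10 time-ordered periods, at least 4 for training.
folds = list(walk_forward_splits(n_periods=10, min_train=4, test_size=2))
for train_idx, test_idx in folds:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Each fold's indices would select rows from a time-sorted feature table before fitting and scoring the model on that fold.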

Prerequisites

> Note on Privileges: This guide uses ACCOUNTADMIN for simplicity in demo and learning environments. For production deployments, follow the principle of least privilege by creating a dedicated role with only the specific grants required.
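As a rough illustration of the least-privilege note, the sketch below builds a handful of GRANT statements for a dedicated role. The role name `QUANT_RESEARCH_ROLE` and the helper `least_privilege_grants` are hypothetical, and the privilege list is an assumed subset: the guide's full requirements (for example Cortex, notebook, and Model Registry access) are not listed in this excerpt.

```python
from typing import List


def least_privilege_grants(
    role: str, database: str, schema: str, warehouse: str
) -> List[str]:
    """Build GRANT statements for a dedicated role (illustrative subset only)."""
    fq_schema = f"{database}.{schema}"
    return [
        f"CREATE ROLE IF NOT EXISTS {role}",
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role}",
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role}",
        f"GRANT USAGE ON SCHEMA {fq_schema} TO ROLE {role}",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {fq_schema} TO ROLE {role}",
        f"GRANT CREATE TABLE ON SCHEMA {fq_schema} TO ROLE {role}",
    ]


# Object names come from this guide; the role name is an assumption.
statements = least_privilege_grants(
    role="QUANT_RESEARCH_ROLE",
    database="FSI_DEMO_DB",
    schema="ANALYTICS",
    warehouse="FSI_DEMO_WH",
)
for stmt in statements:
    print(stmt + ";")
```

The printed statements would be reviewed and run by an administrator; the sketch only generates text and does not connect to Snowflake.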

Getting Started

Step 1: Run Setup Script

1. In Snowsight, navigate to Projects > Workspaces
2. Create a new SQL file and copy the contents from `scripts/setup.sql`
3. Run the entire script

The script sets up the complete demo environment:

  • Installs Snowflake Public Data (Free) from Marketplace automatically
  • Creates the database, warehouse, and role
  • Provides pre-computed ML features (FSI_DATA table)
  • Creates tables, stored procedures, and ML model infrastructure
  • Deploys reference notebooks from this repository

Step 2: Choose Your Path

| Path | Description |
|------|-------------|
| [Path A: Cortex Code](#path-a-cortex-code-recommended) | Build everything through natural language prompts |
| [Path B: Notebooks](#path-b-notebooks-optional) | Run pre-built notebooks |

---

Path A: Cortex Code (Recommended)

Build the entire quantitative research pipeline through conversation with Cortex Code.

A0: Run Setup Script

Before starting, run the setup script to create the required database objects:

1. Open a SQL worksheet in Snowsight
2. Copy and run the contents of scripts/setup.sql

This creates the database, schema, role, warehouse, and base tables needed for the lab.

A1: Create a New Notebook

1. Navigate to Projects → Notebooks in Snowsight
2. Click + Notebook (top-right)
3. Configure the notebook:

  • Notebook location: FSI_DEMO_DB.ANALYTICS
  • Notebook warehouse: FSI_DEMO_WH

4. Click Create
5. Delete the auto-populated sample cells (select cell → delete)
6. Add a Python cell with this starter code and run it:

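import pandas as pd
import numpy as np
from snowflake.snowpark.context import get_active_session

# Reuse the session the Snowflake notebook has already opened
session = get_active_session()

# Point the session at the role, warehouse, database, and schema created by setup.sql
session.use_role("FSI_DEMO_ROLE")
session.use_warehouse("FSI_DEMO_WH")
session.use_database("FSI_DEMO_DB")
session.use_schema("ANALYTICS")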

7. Click Packages (top menu) and add:

  • lightgbm
  • scikit-learn
  • snowflake-ml-python
  • matplotlib
  • seaborn
  • statsmodels

8. Click Start to activate the notebook
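Once the notebook has started, a quick cell like the one below can confirm that the packages from step 7 are importable. It is a sketch using only the standard library; the helper `check_modules` is hypothetical, and the import names assume the usual mapping (scikit-learn imports as `sklearn`, snowflake-ml-python as `snowflake.ml`).

```python
import importlib.util
from typing import Dict, Iterable

# Import names for the packages added in step 7 (assumed mapping, see above).
REQUIRED_MODULES = [
    "lightgbm",
    "sklearn",
    "snowflake.ml",
    "matplotlib",
    "seaborn",
    "statsmodels",
]


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Return {module_name: found} without fully importing the leaf modules."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (e.g. "snowflake") is itself missing.
            status[name] = False
    return status


missing = [name for name, ok in check_modules(REQUIRED_MODULES).items() if not ok]
print("Missing packages:", missing or "none")
```

If anything is reported missing, re-open the Packages menu, add it, and restart the notebook before running the prompts.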

A2: Open Cortex Code

Click the Cortex Code icon (bottom-right corner of the notebook).

> Tip: Refresh the page before your first prompt - this helps Cortex Code recognize the notebook context.

A3: Run Prompts in Sequence

Use the following prompts one at a time. Run the generated code after each prompt before moving to the next.

> Tip: You have multiple options to run generated code:
> - Click the + button to add it as a new cell in your notebook
> - Use the Run option when Cortex Code offers it
> - Click the play button to run it within the chat interface

---

Prompt 1: AI Sentiment Extraction

Using FSI_DEMO_DB.ANALYTICS.UNIQUE_TRANSCRIPTS table, extract analyst sentiment from earnings call transcripts.

Use AI_COMPLETE with claude-4-sonnet to analyze each transcript. Focus ONLY on analyst questions and tone (ignore management remarks). Score sentiment on 1-10 scale where 1=extremely negative, 5=neutral, 10=extremely positive.

Return JSON with: score (1-10), reason (brief explanation), analyst_count (number of unique analysts).

Insert results into AI_TRANSCRIPTS_ANALYSTS_SENTIMENTS table with columns: PRIMARY_TICKER, EVENT_TIMESTAMP, EVENT_TYPE, CREATED_AT, SENTIMENT_SCORE, UNIQUE_ANALYST_COUNT, SENTIMENT_REASON.

Filter…
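The JSON contract described in this prompt (score, reason, analyst_count) can be checked before rows are inserted. The sketch below is one possible validator, not the code Cortex Code will generate: the function `parse_sentiment` and the example response are hypothetical, and the mapping from JSON keys to the SENTIMENT_SCORE, UNIQUE_ANALYST_COUNT, and SENTIMENT_REASON columns is inferred from the prompt text.

```python
import json
from typing import Any, Dict


def parse_sentiment(raw: str) -> Dict[str, Any]:
    """Validate one model response against the JSON contract in the prompt."""
    data = json.loads(raw)
    score = int(data["score"])
    if not 1 <= score <= 10:
        raise ValueError(f"score {score} is outside the 1-10 scale")
    analyst_count = int(data["analyst_count"])
    if analyst_count < 0:
        raise ValueError("analyst_count cannot be negative")
    # Keys are renamed to the target table's column names (assumed mapping).
    return {
        "SENTIMENT_SCORE": score,
        "UNIQUE_ANALYST_COUNT": analyst_count,
        "SENTIMENT_REASON": str(data["reason"]).strip(),
    }


# Made-up response for illustration only; real output comes from AI_COMPLETE.
example = (
    '{"score": 7, "reason": "Analysts probed margins but sounded constructive.", '
    '"analyst_count": 5}'
)
row = parse_sentiment(example)
print(row)
```

Rejecting out-of-range scores at this stage keeps malformed model output from reaching the downstream features.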

Excerpt shown — open the source for the full document.

Notability

notability 2.0/10

Low-star tutorial repo, not notable