Snowflake Platform Skill
Build and deploy applications on Snowflake's AI Data Cloud using the snow CLI, Cortex AI functions, Native Apps, and Snowpark.
Quick Start
Install Snowflake CLI
```bash
pip install snowflake-cli
snow --version   # Should show 3.14.0+
```
Configure Connection
```bash
# Interactive setup
snow connection add
```
Or create `~/.snowflake/config.toml` manually:
```toml
[connections.default]
account = "orgname-accountname"
user = "USERNAME"
authenticator = "SNOWFLAKE_JWT"
private_key_path = "~/.snowflake/rsa_key.p8"
```
Test Connection
```bash
snow connection test -c default
snow sql -q "SELECT CURRENT_USER(), CURRENT_ACCOUNT()"
```
When to Use This Skill
Use when:
- Building applications on the Snowflake platform
- Using Cortex AI functions in SQL queries
- Developing Native Apps for the Marketplace
- Setting up JWT key-pair authentication
- Working with Snowpark Python
Don't use when:
- Building Streamlit apps (use the streamlit-snowflake skill)
- Need data engineering/ETL patterns
- Working with BI tools (Tableau, Looker)
Cortex AI Functions
Snowflake Cortex provides LLM capabilities directly in SQL. Functions are in the SNOWFLAKE.CORTEX schema.
Core Functions
| Function | Purpose | GA Status |
|---|---|---|
| COMPLETE / AI_COMPLETE | Text generation from a prompt | GA Nov 2025 |
| SUMMARIZE / AI_SUMMARIZE | Summarize text | GA |
| TRANSLATE / AI_TRANSLATE | Translate between languages | GA Sep 2025 |
| SENTIMENT / AI_SENTIMENT | Sentiment analysis | GA Jul 2025 |
| AI_FILTER | Natural language filtering | GA Nov 2025 |
| AI_CLASSIFY | Categorize text/images | GA Nov 2025 |
| AI_AGG | Aggregate insights across rows | GA Nov 2025 |
COMPLETE Function
```sql
-- Simple prompt
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'llama3.1-70b',
    'Explain quantum computing in one sentence'
) AS response;

-- With conversation history
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'llama3.1-70b',
    [
        {'role': 'system', 'content': 'You are a helpful assistant'},
        {'role': 'user', 'content': 'What is Snowflake?'}
    ]
) AS response;

-- With options
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-large2',
    'Summarize this document',
    {'temperature': 0.3, 'max_tokens': 500}
) AS response;
```
Available Models:
- llama3.1-70b, llama3.1-8b, llama3.2-3b
- mistral-large2, mistral-7b
- snowflake-arctic
- gemma-7b
- claude-3-5-sonnet (200K context)
Model Context Windows (Updated 2025):
| Model | Context Window | Best For |
|---|---|---|
| Claude 3.5 Sonnet | 200,000 tokens | Large documents, long conversations |
| Llama3.1-70b | 128,000 tokens | Complex reasoning, medium documents |
| Llama3.1-8b | 8,000 tokens | Simple tasks, short text |
| Llama3.2-3b | 8,000 tokens | Fast inference, minimal text |
| Mistral-large2 | Variable | Check current docs |
| Snowflake Arctic | Variable | Check current docs |
Token Math: ~4 characters = 1 token. A 32,000 character document ≈ 8,000 tokens.
Error: Input exceeds context window limit → Use smaller model or chunk your input.
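When an input risks exceeding the window, you can estimate tokens up front and chunk the text before calling the model. A minimal Snowpark sketch, assuming a Snowpark version whose Session.sql accepts qmark bind parameters; the 7,000-token budget and the llama3.1-8b choice are illustrative, not recommendations:

```python
from snowflake.snowpark import Session

CHARS_PER_TOKEN = 4       # Rough rule of thumb: ~4 characters per token
MAX_INPUT_TOKENS = 7000   # Illustrative budget for an 8K-context model

def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN

def summarize_long_text(session: Session, text: str) -> list:
    """Split oversized text into chunks that fit the budget and summarize each."""
    max_chars = MAX_INPUT_TOKENS * CHARS_PER_TOKEN
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    summaries = []
    for chunk in chunks:
        row = session.sql(
            "SELECT SNOWFLAKE.CORTEX.COMPLETE('llama3.1-8b', 'Summarize: ' || ?) AS s",
            params=[chunk],
        ).collect()[0]
        summaries.append(row["S"])
    return summaries
```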
SUMMARIZE Function
```sql
-- Single text
SELECT SNOWFLAKE.CORTEX.SUMMARIZE(article_text) AS summary
FROM articles
LIMIT 10;

-- Aggregate across rows (no context window limit)
SELECT AI_SUMMARIZE_AGG(review_text) AS all_reviews_summary
FROM product_reviews
WHERE product_id = 123;
```
TRANSLATE Function
```sql
-- Translate to English (auto-detect source)
SELECT SNOWFLAKE.CORTEX.TRANSLATE(
    review_text,
    '',    -- Empty = auto-detect source language
    'en'   -- Target language
) AS translated
FROM international_reviews;

-- Explicit source language
SELECT AI_TRANSLATE(
    description,
    'es',  -- Source: Spanish
    'en'   -- Target: English
) AS translated
FROM spanish_products;
```
AI_FILTER (Natural Language Filtering)
Performance: As of September 2025, AI_FILTER includes automatic optimization delivering 2-10x speedup and up to 60% token reduction for suitable queries.
```sql
-- Filter with plain English
SELECT *
FROM customer_feedback
WHERE AI_FILTER(
    feedback_text,
    'mentions shipping problems or delivery delays'
);

-- Combine with SQL predicates for maximum optimization
-- The query planner applies standard filters FIRST, then AI on the smaller dataset
SELECT *
FROM support_tickets
WHERE created_date > '2025-01-01'   -- Standard filter applied first
  AND AI_FILTER(description, 'customer is angry or frustrated');
```
Best Practice: Always combine AI_FILTER with traditional SQL predicates (date ranges, categories, etc.) to reduce the dataset before AI processing. This maximizes the automatic optimization benefits.
Throttling: During peak usage, AI function requests may be throttled with retryable errors. Implement exponential backoff for production applications (see the AI Function Throttling entry under Known Issues).
AI_CLASSIFY
```sql
-- Categorize support tickets
SELECT
    ticket_id,
    AI_CLASSIFY(
        description,
        ['billing', 'technical', 'shipping', 'other']
    ) AS category
FROM support_tickets;
```
Billing
Cortex AI functions bill based on tokens:
- ~4 characters = 1 token
- Both input AND output tokens are billed
- Rates vary by model (larger models cost more)
Cost Management at Scale (Community-sourced):
A real-world production case study showed that a single AI_COMPLETE query processing 1.18 billion records cost nearly $5K in credits. Cost drivers to watch:
- Cross-region inference: Models not available in your region incur additional data transfer costs
- Warehouse idle time: Unused compute still bills, but aggressive auto-suspend adds resume overhead
- Large table joins: Complex queries with AI functions multiply costs
```sql
-- This seemingly simple query can be expensive at scale
SELECT
    product_id,
    AI_COMPLETE('mistral-large2', 'Summarize: ' || review_text) AS summary
FROM product_reviews               -- 1 billion rows
WHERE created_date > '2024-01-01';

-- Cost = (input tokens + output tokens) × row count × model rate
-- At scale, this adds up fast
```
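To make that arithmetic concrete, here is a back-of-the-envelope estimator; the credits_per_million_tokens rate is a placeholder, not a published price, so substitute the rate for your model and region:

```python
CHARS_PER_TOKEN = 4  # ~4 characters per token, as above

def estimate_batch_cost(row_count: int,
                        avg_input_chars: int,
                        avg_output_tokens: int,
                        credits_per_million_tokens: float) -> float:
    """Rough cost: (input tokens + output tokens) x row count x per-token rate."""
    input_tokens = avg_input_chars / CHARS_PER_TOKEN
    total_tokens = (input_tokens + avg_output_tokens) * row_count
    return total_tokens / 1_000_000 * credits_per_million_tokens

# Example: 1B reviews, ~400 chars in, ~100 tokens out, at a hypothetical rate of
# 1.0 credits per million tokens -> 200 tokens/row x 1e9 rows = 200,000 credits
print(estimate_batch_cost(1_000_000_000, 400, 100, 1.0))
```

Even with a placeholder rate, the example shows why filtering the input set before applying AI functions matters more than any other optimization.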
Best Practices:
- Filter datasets BEFORE applying AI functions
- Right-size warehouses (don't over-provision)
- Monitor credit consumption with QUERY_HISTORY views
- Consider batch processing instead of row-by-row AI operations
Source: The Hidden Cost of Snowflake Cortex AI (Community blog with billing evidence)
Authentication
JWT Key-Pair Authentication
Critical: Snowflake uses TWO account identifier formats:
| Format | Example | Used For |
|---|---|---|
| Organization-Account | irjoewf-wq46213 | REST API URLs, connection config |
| Account Locator | NZ90655 | JWT claims (iss, sub) |
These are NOT interchangeable!
Discover Your Account Locator
```sql
SELECT CURRENT_ACCOUNT();  -- Returns: NZ90655
```
Generate RSA Key Pair
```bash
# Generate private key (PKCS#8 format required)
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out ~/.snowflake/rsa_key.p8 -nocrypt

# Generate public key
openssl rsa -in ~/.snowflake/rsa_key.p8 -pubout -out ~/.snowflake/rsa_key.pub

# Get fingerprint for JWT claims
openssl rsa -in ~/.snowflake/rsa_key.p8 -pubout -outform DER | \
  openssl dgst -sha256 -binary | openssl enc -base64
```
Register Public Key with User
```sql
-- In a Snowflake worksheet (requires ACCOUNTADMIN or SECURITYADMIN)
ALTER USER my_user SET RSA_PUBLIC_KEY='MIIBIjANBgkq...';
```
JWT Claim Format
```
iss: ACCOUNT_LOCATOR.USERNAME.SHA256:fingerprint
sub: ACCOUNT_LOCATOR.USERNAME
```
Example:
```
iss: NZ90655.JEZWEB.SHA256:jpZO6LvU2SpKd8tE61OGfas5ZXpfHloiJd7XHLPDEEA=
sub: NZ90655.JEZWEB
```
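A sketch of how these claims are typically assembled in Python, assuming the PyJWT and cryptography packages are installed; the key path, user, and account locator are the illustrative values from above:

```python
import base64
import hashlib
from datetime import datetime, timedelta, timezone

import jwt  # PyJWT
from cryptography.hazmat.primitives import serialization

ACCOUNT_LOCATOR = "NZ90655"   # From SELECT CURRENT_ACCOUNT(), NOT the org-account format
USER = "JEZWEB"

with open("/home/me/.snowflake/rsa_key.p8", "rb") as f:
    private_key = serialization.load_pem_private_key(f.read(), password=None)

# Fingerprint = base64(SHA-256 of the DER-encoded public key), as in the openssl step above
public_der = private_key.public_key().public_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
fingerprint = base64.b64encode(hashlib.sha256(public_der).digest()).decode()

now = datetime.now(timezone.utc)
payload = {
    "iss": f"{ACCOUNT_LOCATOR}.{USER}.SHA256:{fingerprint}",
    "sub": f"{ACCOUNT_LOCATOR}.{USER}",
    "iat": now,
    "exp": now + timedelta(minutes=59),
}
token = jwt.encode(payload, private_key, algorithm="RS256")
```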
SPCS Container Authentication (v4.2.0+)
New in January 2026: Connector automatically detects and uses SPCS service identifier tokens when running inside Snowpark Container Services.
```python
# No special configuration needed inside SPCS containers
import snowflake.connector

# Auto-detects the SPCS_TOKEN environment variable
conn = snowflake.connector.connect()
```
This enables seamless authentication from containerized Snowpark services without explicit credentials.
Source: Release v4.2.0
Snow CLI Commands
Project Management
```bash
# Initialize project
snow init

# Execute SQL
snow sql -q "SELECT 1"
snow sql -f query.sql

# View logs
snow logs
```
Native App Commands
```bash
# Development
snow app run        # Deploy and run locally
snow app deploy     # Upload to stage only
snow app teardown   # Remove app

# Versioning
snow app version create V1_0
snow app version list
snow app version drop V1_0

# Publishing
snow app publish --version V1_0 --patch 0

# Release channels
snow app release-channel list
snow app release-channel add-version --channel ALPHA --version V1_0
snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT
```
Streamlit Commands
```bash
snow streamlit deploy --replace
snow streamlit deploy --replace --open
```
Stage Commands
```bash
snow stage list
snow stage copy @my_stage/file.txt ./local/
```
Native App Development
Project Structure
```
my_native_app/
├── snowflake.yml        # Project config
├── manifest.yml         # App manifest
├── setup_script.sql     # Installation script
├── app/
│   └── streamlit/
│       ├── environment.yml
│       └── streamlit_app.py
└── scripts/
    └── setup.sql
```
snowflake.yml
```yaml
definition_version: 2

native_app:
  name: my_app
  package:
    name: my_app_pkg
    distribution: external          # For marketplace
  application:
    name: my_app
  source_stage: stage/dev
  artifacts:
    - src: manifest.yml
      dest: manifest.yml
    - src: setup_script.sql
      dest: setup_script.sql
    - src: app/streamlit/environment.yml
      dest: streamlit/environment.yml
    - src: app/streamlit/streamlit_app.py
      dest: streamlit/streamlit_app.py
  enable_release_channels: true     # For ALPHA/BETA channels
```
manifest.yml
```yaml
manifest_version: 1

artifacts:
  setup_script: setup_script.sql
  default_streamlit: streamlit/streamlit_app.py
```
Note: Do NOT include privileges section - Native Apps can't declare privileges
External Access Integration
Native Apps calling external APIs need this setup:
```sql
-- 1. Create network rule (in a real database, NOT the app package)
CREATE DATABASE IF NOT EXISTS MY_APP_UTILS;

CREATE OR REPLACE NETWORK RULE MY_APP_UTILS.PUBLIC.api_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('api.example.com:443');

-- 2. Create integration
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION my_app_integration
  ALLOWED_NETWORK_RULES = (MY_APP_UTILS.PUBLIC.api_rule)
  ENABLED = TRUE;

-- 3. Grant to app
GRANT USAGE ON INTEGRATION my_app_integration TO APPLICATION MY_APP;

-- 4. CRITICAL: Attach to Streamlit (must repeat after EVERY deploy!)
ALTER STREAMLIT MY_APP.config_schema.my_streamlit
  SET EXTERNAL_ACCESS_INTEGRATIONS = (my_app_integration);
```
Warning: Step 4 resets on every snow app run. Re-run it after each deploy!
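One way to avoid forgetting step 4 is to wrap the deploy in a small script that re-attaches the integration every time. A sketch using the snow CLI; the object names come from the example above and must match your app:

```python
import subprocess

# ALTER STREAMLIT statement from step 4 above; adjust names to your app
REATTACH_SQL = (
    "ALTER STREAMLIT MY_APP.config_schema.my_streamlit "
    "SET EXTERNAL_ACCESS_INTEGRATIONS = (my_app_integration);"
)

# Deploy, then immediately re-attach the integration that the deploy reset
subprocess.run(["snow", "app", "run"], check=True)
subprocess.run(["snow", "sql", "-q", REATTACH_SQL], check=True)
```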
Shared Data Pattern
When your Native App needs data from an external database:
```sql
-- 1. Create shared_data schema in the app package
CREATE SCHEMA IF NOT EXISTS MY_APP_PKG.SHARED_DATA;

-- 2. Create views referencing the external database
CREATE OR REPLACE VIEW MY_APP_PKG.SHARED_DATA.MY_VIEW AS
SELECT * FROM EXTERNAL_DB.SCHEMA.TABLE;

-- 3. Grant REFERENCE_USAGE (CRITICAL!)
GRANT REFERENCE_USAGE ON DATABASE EXTERNAL_DB
  TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;

-- 4. Grant access to the share
GRANT USAGE ON SCHEMA MY_APP_PKG.SHARED_DATA
  TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;
GRANT SELECT ON ALL VIEWS IN SCHEMA MY_APP_PKG.SHARED_DATA
  TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;
```
In setup_script.sql, reference shared_data.view_name (NOT the original database).
Marketplace Publishing
Security Review Workflow
```bash
# 1. Deploy app
snow app run

# 2. Create version
snow app version create V1_0

# 3. Check security review status
snow app version list
# Wait for review_status = APPROVED

# 4. Set release directive
snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT

# 5. Create listing in Snowsight Provider Studio (UI only)
```
Security Review Statuses
| Status | Meaning | Action |
|---|---|---|
| NOT_REVIEWED | Scan hasn't run | Check DISTRIBUTION is EXTERNAL |
| IN_PROGRESS | Scan running | Wait |
| APPROVED | Passed | Can publish |
| REJECTED | Failed | Fix issues or appeal |
| MANUAL_REVIEW | Human reviewing | Wait (can take days) |
Triggers manual review: External access integrations, Streamlit components, network calls.
Provider Studio Fields
| Field | Max Length | Notes |
|---|---|---|
| Title | 72 chars | App name |
| Subtitle | 128 chars | One-liner |
| Description | 10,000 chars | HTML editor |
| Business Needs | 6 max | Select from dropdown |
| Quick Start Examples | 10 max | Title + Description + SQL |
| Data Dictionary | Required | Mandatory for data listings (2025) |
Paid Listing Prerequisites
1. Full Snowflake account (not trial)
2. ACCOUNTADMIN role
3. Provider Profile approved
4. Stripe account configured
5. Provider & Consumer Terms accepted
6. Contact Marketplace Ops
Note: Cannot convert free listing to paid. Must create new listing.
Snowpark Python
Session Setup
```python
from snowflake.snowpark import Session

connection_params = {
    "account": "orgname-accountname",
    "user": "USERNAME",
    "password": "PASSWORD",   # Or use private_key_path
    "warehouse": "COMPUTE_WH",
    "database": "MY_DB",
    "schema": "PUBLIC",
}

session = Session.builder.configs(connection_params).create()
```
DataFrame Operations
```python
# Read a table
df = session.table("MY_TABLE")

# Filter and select
result = (
    df.filter(df["STATUS"] == "ACTIVE")
      .select("ID", "NAME", "CREATED_AT")
      .sort("CREATED_AT", ascending=False)
)

# Execute
result.show()

# Collect to Python
rows = result.collect()
```
Row Access (Common Gotcha)
```python
# WRONG - dict() doesn't work on a Snowpark Row
config = dict(result[0])

# CORRECT - Access columns explicitly
row = result[0]
config = {
    'COLUMN_A': row['COLUMN_A'],
    'COLUMN_B': row['COLUMN_B'],
}
```
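Depending on your Snowpark version, Row may also expose an as_dict() helper, which avoids listing columns by hand:

```python
# Assumes a Snowpark version where Row.as_dict() is available
config = result[0].as_dict()
```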
DML Statistics (v4.2.0+)
New in January 2026: SnowflakeCursor.stats property exposes granular DML statistics for operations where rowcount is insufficient (e.g., CTAS queries).
```python
# Before v4.2.0 - rowcount returns -1 for CTAS
cursor.execute("CREATE TABLE new_table AS SELECT * FROM source WHERE active = true")
print(cursor.rowcount)  # Returns -1 (not helpful!)

# After v4.2.0 - the stats property shows actual row counts
cursor.execute("CREATE TABLE new_table AS SELECT * FROM source WHERE active = true")
print(cursor.stats)     # Returns {'rows_inserted': 1234, 'duplicates': 0, ...}
```
Source: Release v4.2.0
UDFs and Stored Procedures
```python
from snowflake.snowpark.functions import udf, sproc

# Register a UDF
@udf(name="my_udf", replace=True)
def my_udf(x: int) -> int:
    return x * 2

# Register a stored procedure
@sproc(name="my_sproc", replace=True)
def my_sproc(session: Session, table_name: str) -> str:
    df = session.table(table_name)
    count = df.count()
    return f"Row count: {count}"
```
REST API (SQL API v2)
The REST API is the foundation for programmatic Snowflake access from Cloudflare Workers.
Endpoint
https://{org-account}.snowflakecomputing.com/api/v2/statements
Required Headers (CRITICAL)
ALL requests must include these headers - missing Accept causes silent failures:
```typescript
const headers = {
  'Authorization': `Bearer ${jwt}`,
  'Content-Type': 'application/json',
  'Accept': 'application/json',   // REQUIRED - "null" error if missing
  'User-Agent': 'MyApp/1.0',
};
```
Async Query Handling
Even simple queries return asynchronously (HTTP 202). Always implement polling:
```typescript
// Submit returns a statementHandle, not results
const submit = await fetch(url, { method: 'POST', headers, body });
const { statementHandle } = await submit.json();

// Poll until complete
while (true) {
  const status = await fetch(`${url}/${statementHandle}`, { headers });
  if (status.status === 200) break;   // Complete
  if (status.status === 202) {        // Still running
    await sleep(2000);
    continue;
  }
  throw new Error(`Query failed: HTTP ${status.status}`);  // Anything else is an error
}
```
Workers Subrequest Limits
| Plan | Limit | Safe Polling |
|---|---|---|
| Free | 50 | 45 attempts @ 2s = 90s max |
| Paid | 1,000 | 100 attempts @ 500ms = 50s max |
Fetch Timeouts
Workers fetch() has no default timeout. Always attach a timeout signal (AbortSignal.timeout or an AbortController):
```typescript
const response = await fetch(url, {
  signal: AbortSignal.timeout(30000),   // 30 seconds
  headers,
});
```
Cancel on Timeout
Cancel queries when timeout occurs to avoid warehouse costs:
```
POST /api/v2/statements/{statementHandle}/cancel
```
See templates/snowflake-rest-client.ts for complete implementation.
Known Issues
- Account Identifier Confusion
Symptom: JWT auth fails silently, queries don't appear in Query History.
Cause: Using org-account format in JWT claims instead of account locator.
Fix: Use SELECT CURRENT_ACCOUNT() to get the actual account locator.
- External Access Reset
Symptom: API calls fail after snow app run.
Cause: External access integration attachment resets on every deploy.
Fix: Re-run ALTER STREAMLIT ... SET EXTERNAL_ACCESS_INTEGRATIONS after each deploy.
- Release Channel Syntax
Symptom: ALTER APPLICATION PACKAGE ... SET DEFAULT RELEASE DIRECTIVE fails.
Cause: Legacy SQL syntax doesn't work with release channels enabled.
Fix: Use snow CLI: snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT
- Artifact Nesting
Symptom: Files appear in streamlit/streamlit/ instead of streamlit/.
Cause: Directory mappings in snowflake.yml nest the folder name.
Fix: List individual files explicitly in artifacts, not directories.
- REFERENCE_USAGE Missing
Symptom: "A view that is added to the shared content cannot reference objects from other databases"
Cause: Missing GRANT REFERENCE_USAGE ON DATABASE for shared data.
Fix: Always grant REFERENCE_USAGE before snow app run when using external databases.
- REST API Missing Accept Header
Symptom: "Unsupported Accept header null is specified" on polling requests.
Cause: Initial request had Accept: application/json but polling request didn't.
Fix: Use consistent headers helper function for ALL requests (submit, poll, cancel).
- Workers Fetch Hangs Forever
Symptom: Worker hangs indefinitely waiting for Snowflake response.
Cause: Cloudflare Workers' fetch() has no default timeout.
Fix: Always use AbortSignal.timeout(30000) on all Snowflake requests.
- Too Many Subrequests
Symptom: "Too many subrequests" error during polling.
Cause: Polling every 1 second × 600 attempts = 600 subrequests exceeds limits.
Fix: Poll every 2-5 seconds, limit to 45 (free) or 100 (paid) attempts.
- Warehouse Not Auto-Resuming (Perceived)
Symptom: Queries return statementHandle but never complete (code 090001 indefinitely).
Cause: 090001 means "running" not error. Warehouse IS resuming, just takes time.
Fix: Auto-resume works. Wait longer or explicitly resume first: POST /api/v2/warehouses/{wh}:resume
- Memory Leaks in Connector 4.x (Active Issue)
Error: Long-running Python applications show memory growth over time
Source: GitHub Issues #2727, #2725
Affects: snowflake-connector-python 4.0.0 - 4.2.0
Why It Happens:
- SessionManager uses a defaultdict which prevents garbage collection
- SnowflakeRestful.fetch() holds references that leak during query execution
Prevention: Reuse connections rather than creating new ones repeatedly. Fix is in progress via PR #2741 and PR #2726.
```python
# AVOID - creates a new connection each iteration
for i in range(1000):
    conn = snowflake.connector.connect(...)
    cursor = conn.cursor()
    cursor.execute("SELECT 1")
    cursor.close()
    conn.close()

# BETTER - reuse the connection
conn = snowflake.connector.connect(...)
cursor = conn.cursor()
for i in range(1000):
    cursor.execute("SELECT 1")
cursor.close()
conn.close()
```
Status: Fix expected in connector v4.3.0 or later
- AI Function Throttling During Peak Usage
Error: "Request throttled due to high usage. Please retry."
Source: Snowflake Cortex Documentation
Affects: All Cortex AI functions (COMPLETE, FILTER, CLASSIFY, etc.)
Why It Happens: AI/LLM requests may be throttled during high usage periods to manage platform capacity. Throttled requests return errors and require manual retries.
Prevention: Implement retry logic with exponential backoff:
```python
import time
import snowflake.connector

def execute_with_retry(cursor, query, max_retries=3):
    for attempt in range(max_retries):
        try:
            return cursor.execute(query).fetchall()
        except snowflake.connector.errors.DatabaseError as e:
            if "throttled" in str(e).lower() and attempt < max_retries - 1:
                wait_time = 2 ** attempt   # Exponential backoff
                time.sleep(wait_time)
            else:
                raise
```
Status: Documented behavior, no fix planned
References
- Snowflake CLI Documentation
- SQL REST API Reference
- SQL API Authentication
- Cortex AI Functions
- Native Apps Framework
- Snowpark Python
- Marketplace Publishing
Related Skills
- streamlit-snowflake - Streamlit in Snowflake apps