ML Training Example Projects

Purpose: Provide complete, runnable example projects demonstrating ML training workflows from data preparation through deployment.

Activation Triggers:

User requests example projects or starter templates
User wants to see working sentiment classification code
User needs text generation training examples
User mentions RedAI trade classifier
User wants reference implementations
User needs to understand complete training workflows

Key Resources:

scripts/setup-example.sh: Initialize and set up any example project
scripts/run-training.sh: Execute training for any example
scripts/test-inference.sh: Test trained models
examples/sentiment-classification/: Binary sentiment classification (IMDB-style)
examples/text-generation/: GPT-style text generation with LoRA
examples/redai-trade-classifier/: Financial trade classification with Modal deployment
templates/: Scaffolding for new projects

Available Example Projects

Sentiment Classification

Use Case: Binary sentiment analysis (positive/negative reviews)

Features:

DistilBERT fine-tuning for text classification
Custom dataset loading from JSON
Training with validation metrics
Model saving and inference
Production-ready inference API
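
The validation metrics reported by train.py are not listed here. As a minimal sketch, assuming the usual choices for binary sentiment labels (accuracy, precision, recall and F1, with label 1 as the positive class), they could be computed as follows. The function name `binary_metrics` is hypothetical and not taken from the example.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    # Fall back to 0.0 when a denominator is empty (e.g. no positive predictions).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Reporting precision and recall alongside accuracy matters mostly when the two classes are unevenly represented in the validation split.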

Files:

train.py: Complete training script
data.json: Sample training data (50 examples)
inference.py: Inference server
README.md: Setup and usage guide

Dataset Format:

    {"text": "This movie was amazing!", "label": 1}
    {"text": "Terrible waste of time", "label": 0}
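
Whether data.json holds a single JSON array or one object per line is not stated here, so the following sketch accepts both. It reads the records, checks that each has a string `text` and a 0/1 `label`, and returns two parallel lists. The name `load_records` is hypothetical.

```python
import json


def load_records(path):
    """Read {"text": ..., "label": ...} records from a JSON array or JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        content = f.read().strip()
    if content.startswith("["):
        records = json.loads(content)  # one JSON array holding all records
    else:
        # one JSON object per non-empty line
        records = [json.loads(line) for line in content.splitlines() if line.strip()]

    texts, labels = [], []
    for i, rec in enumerate(records):
        valid = (
            isinstance(rec, dict)
            and isinstance(rec.get("text"), str)
            and rec.get("label") in (0, 1)
        )
        if not valid:
            raise ValueError(f"record {i} is not a valid example: {rec!r}")
        texts.append(rec["text"])
        labels.append(rec["label"])
    return texts, labels
```

Failing early on a malformed record is usually cheaper than discovering it as a tensor-shape error halfway through training.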

Text Generation

Use Case: Fine-tune GPT-2 for custom text generation

Features:

GPT-2 small model fine-tuning
LoRA (Low-Rank Adaptation) for efficient training
Custom tokenization
Generation with temperature/top-p sampling
Modal deployment configuration
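
To see why LoRA makes training cheaper: the pretrained weight matrix W (d_out x d_in) stays frozen and only two small matrices are trained, A (r x d_in) and B (d_out x r), whose product is scaled by alpha / r and added to W. The sketch below counts trainable parameters for one matrix. The 768 x 2304 shape is GPT-2 small's fused attention projection and is used purely as an illustration; which layers the example's train.py adapts is an assumption not confirmed here.

```python
def lora_param_counts(d_in, d_out, r):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_in * d_out        # fine-tuning W directly
    lora = r * (d_in + d_out)  # A is r x d_in, B is d_out x r
    return full, lora


def lora_scaling(alpha, r):
    """Factor applied to the low-rank update B @ A before it is added to W."""
    return alpha / r


full, lora = lora_param_counts(d_in=768, d_out=2304, r=8)
print(full, lora, f"{lora / full:.2%}")  # prints: 1769472 24576 1.39%
print(lora_scaling(alpha=16, r=8))       # prints: 2.0
```

With r = 8 the adapter trains roughly 1.4% of the parameters of that matrix, which is where most of the memory savings come from.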

Files:

train.py: LoRA training script
config.yaml: Hyperparameters and model config
generate.py: Text generation script
modal_deploy.py: Modal deployment
README.md: Complete guide

Config Structure:

    model:
      name: gpt2
      max_length: 512

    training:
      epochs: 3
      batch_size: 4
      learning_rate: 2e-4

    lora:
      r: 8
      alpha: 16
      dropout: 0.1
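
A minimal sketch of turning the loaded config into typed objects, assuming three top-level sections (`model`, `training`, `lora`). The class and function names are hypothetical. The explicit `float()` conversion is deliberate: some YAML loaders (PyYAML's default resolver is one) read an exponent written without a decimal point, such as `2e-4`, as a string rather than a number.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    name: str
    max_length: int


@dataclass
class TrainingConfig:
    epochs: int
    batch_size: int
    learning_rate: float


@dataclass
class LoraSettings:
    r: int
    alpha: int
    dropout: float


@dataclass
class Config:
    model: ModelConfig
    training: TrainingConfig
    lora: LoraSettings


def parse_config(raw):
    """Build a typed Config from the dict a YAML loader returns."""
    m, t, lo = raw["model"], raw["training"], raw["lora"]
    return Config(
        model=ModelConfig(name=str(m["name"]), max_length=int(m["max_length"])),
        training=TrainingConfig(
            epochs=int(t["epochs"]),
            batch_size=int(t["batch_size"]),
            learning_rate=float(t["learning_rate"]),  # accepts 2e-4 and "2e-4"
        ),
        lora=LoraSettings(
            r=int(lo["r"]), alpha=int(lo["alpha"]), dropout=float(lo["dropout"])
        ),
    )


# Stand-in for the dict a YAML loader would return for config.yaml.
raw = {
    "model": {"name": "gpt2", "max_length": 512},
    "training": {"epochs": 3, "batch_size": 4, "learning_rate": "2e-4"},
    "lora": {"r": 8, "alpha": 16, "dropout": 0.1},
}
config = parse_config(raw)
print(config.training.learning_rate)  # prints: 0.0002
```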

RedAI Trade Classifier

Use Case: Financial trade classification (buy/sell/hold)

Features:

Multi-class classification for trading signals
Feature engineering from market data
Class imbalance handling
Modal deployment for production inference
Real-time prediction API

Files:

train.py: Training with class weighting
modal_deploy.py: Complete Modal deployment
data_preprocessing.py: Feature engineering
README.md: Trading strategy guide
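
How train.py derives its class weights is not shown here. A common choice, and the assumption behind this sketch, is the inverse-frequency heuristic n_samples / (n_classes * class_count), the same formula scikit-learn uses for `class_weight="balanced"`. The function name is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in sorted(counts.items())
    }


# Six "hold" (1) labels, two "sell" (0) and two "buy" (2): rare classes weigh more.
weights = balanced_class_weights([1, 1, 1, 1, 1, 1, 0, 0, 2, 2])
print(weights)
```

Weights like these are typically passed to a weighted loss, for example the `weight` argument of `torch.nn.CrossEntropyLoss`, so that mistakes on rare buy/sell signals cost more than mistakes on the frequent hold class.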

Model Input:

Price features (open, high, low, close)
Volume indicators
Technical indicators (RSI, MACD, moving averages)
Sentiment scores

Quick Start

Set Up Any Example

    # Initialize example project
    ./scripts/setup-example.sh <project-name>

    # Options: sentiment-classification, text-generation, redai-trade-classifier
    ./scripts/setup-example.sh sentiment-classification

What it does:

Creates project directory
Copies example files
Installs dependencies
Downloads/prepares sample data
Validates environment
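
The checks setup-example.sh performs in its "Validates environment" step are not spelled out here. As a rough sketch of what such a validation might look like, the function below checks the interpreter version and whether the required packages are importable. The function name and the minimum Python version are assumptions, not values taken from the script.

```python
import importlib.util
import sys


def check_environment(packages, min_python=(3, 8)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_python:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, found {found}"
        )
    for name in packages:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


# Import names for the sentiment example (scikit-learn is imported as sklearn).
for problem in check_environment(["transformers", "torch", "datasets", "sklearn"]):
    print(problem)
```

Using `find_spec` avoids actually importing heavy packages such as torch just to confirm they are installed.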

Run Training

    # Train model for any example
    ./scripts/run-training.sh <project-name>

    # Examples:
    ./scripts/run-training.sh sentiment-classification
    ./scripts/run-training.sh text-generation
    ./scripts/run-training.sh redai-trade-classifier

Monitors:

Training progress
Loss curves
Validation metrics
GPU utilization
Checkpoint saving

Test Inference

    # Test trained model
    ./scripts/test-inference.sh <project-name> <input>

    # Examples:
    ./scripts/test-inference.sh sentiment-classification "This product is great!"
    ./scripts/test-inference.sh text-generation "Once upon a time"
    ./scripts/test-inference.sh redai-trade-classifier market_data.json

Common Workflows

Start From Example Template

Choose example based on use case:

Classification → sentiment-classification
Generation → text-generation
Financial ML → redai-trade-classifier

Set up project:

./scripts/setup-example.sh <example-name>

Customize for your data:

Update data loading in train.py
Modify model architecture if needed
Adjust hyperparameters in config

Run training:

./scripts/run-training.sh <example-name>

Deploy:

Local: Use inference.py
Production: Use modal_deploy.py

Extend Example with Custom Data

Prepare data in example format
Replace data files (data.json, config.yaml)
Update preprocessing if needed
Train with same script
Test inference with new data

Deploy Example to Production

All examples include Modal deployment:

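    # Deploy to Modal
    cd examples/<project-name>
    modal deploy modal_deploy.py

    # Get endpoint URL (modal deploy also prints it on success; if your Modal CLI
    # version has no "app show" subcommand, "modal app list" lists deployed apps)
    modal app show <app-name>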

Example Comparison

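| Feature       | Sentiment             | Text Gen     | Trade Classifier   |
|---------------|-----------------------|--------------|--------------------|
| Task Type     | Binary Classification | Generation   | Multi-class        |
| Model         | DistilBERT            | GPT-2 + LoRA | Custom Transformer |
| Training Time | 5-10 min              | 15-30 min    | 10-20 min          |
| GPU Required  | Optional              | Recommended  | Required           |
| Modal Deploy  | ✅                    | ✅           | ✅                 |
| Custom Data   | Easy                  | Moderate     | Advanced           |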

Customization Guide

Sentiment Classification

Change dataset:

    # In train.py, update load_data()
    def load_data(path):
        # Your custom loading logic
        return texts, labels
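
As one illustration of "custom loading logic", here is a sketch of `load_data()` for a CSV file with a header row. The column names (`review`, `sentiment`) and the positive/negative label strings are hypothetical; adjust them to match your own file.

```python
import csv


def load_data(path, text_column="review", label_column="sentiment"):
    """Load texts and 0/1 labels from a CSV file with a header row."""
    label_map = {"negative": 0, "positive": 1}  # same 0/1 convention as data.json
    texts, labels = [], []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            texts.append(row[text_column])
            # A KeyError here means the file contains an unexpected label value.
            labels.append(label_map[row[label_column].strip().lower()])
    return texts, labels
```

Keeping the return value as two parallel lists means the rest of the training script should not need to change, provided it consumes `texts` and `labels` the way the template above suggests.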

Change model:

    # Replace DistilBERT with other models
    model_name = "bert-base-uncased"  # or roberta-base, etc.

Text Generation

Change generation style:

    # In config.yaml
    generation:
      temperature: 0.8  # Higher = more creative
      top_p: 0.9        # Nucleus sampling
      max_length: 200   # Output length
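
To make the effect of `temperature` and `top_p` concrete, the sketch below applies both to a small list of logits in plain Python. It illustrates the technique only; how generate.py applies these settings is not shown here (with Hugging Face Transformers they are typically passed to `model.generate()` together with `do_sample=True`). The function names are hypothetical.

```python
import math
import random


def top_p_filter(logits, temperature=0.8, top_p=0.9):
    """Return {token_index: probability} restricted to the nucleus."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]  # lower temperature sharpens
    peak = max(scaled)                          # subtract max for stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cumulative = [], 0.0
    for i in ranked:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in nucleus)
    return {i: probs[i] / mass for i in nucleus}  # renormalize to sum to 1


def sample_token(logits, temperature=0.8, top_p=0.9, rng=random):
    """Draw one token index from the filtered distribution."""
    dist = top_p_filter(logits, temperature, top_p)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
```

Raising `temperature` flattens the distribution before the cut, so more tokens are needed to reach the `top_p` mass and the output becomes more varied.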

Add custom prompts:

    # In generate.py
    prompts = [
        "Your custom prompt here",
        "Another prompt",
    ]

Trade Classifier

Add features:

    # In data_preprocessing.py
    def engineer_features(df):
        df['rsi'] = calculate_rsi(df['close'])
        df['macd'] = calculate_macd(df['close'])
        # Add your custom indicators
        return df
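
The helper `calculate_rsi` is called above but not defined in this guide. Below is a sketch of one possible implementation, assuming the simple-moving-average variant of RSI over a plain list of closing prices (Wilder's original formulation uses smoothed averages, and the `ta` package listed under Dependencies ships its own RSI indicator, which the example may use instead). For a pandas column, pass `df['close'].tolist()`.

```python
def calculate_rsi(closes, period=14):
    """Simple-average RSI; entries are None until `period` price changes exist."""
    rsi = [None] * len(closes)
    changes = [b - a for a, b in zip(closes, closes[1:])]
    for end in range(period, len(changes) + 1):
        window = changes[end - period:end]  # the last `period` changes up to closes[end]
        avg_gain = sum(c for c in window if c > 0) / period
        avg_loss = sum(-c for c in window if c < 0) / period
        if avg_loss == 0:
            # No losses in the window: 100 if prices rose, 50 if flat
            # (the flat case is a convention chosen for this sketch).
            rsi[end] = 100.0 if avg_gain > 0 else 50.0
        else:
            rsi[end] = 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
    return rsi


print(calculate_rsi([1, 2, 1, 2, 1], period=4))  # prints: [None, None, None, None, 50.0]
```

The leading `None` values mark rows without enough history; drop or fill them before training so they do not reach the model as missing features.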

Change strategy:

    # Update labels in train.py
    # 0 = sell, 1 = hold, 2 = buy
    labels = your_strategy(prices, indicators)
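
`your_strategy` is a placeholder for your own labeling rule. As a sketch of one simple possibility, the version below labels each time step by its forward return: buy if the price rises more than a threshold over the next `horizon` steps, sell if it falls more than the threshold, hold otherwise. The `horizon` and `threshold` defaults are arbitrary assumptions, and `indicators` is accepted only to match the call above; this sketch does not use it.

```python
SELL, HOLD, BUY = 0, 1, 2


def your_strategy(prices, indicators=None, horizon=1, threshold=0.01):
    """Label each time step by the forward return over `horizon` steps."""
    labels = []
    for i in range(len(prices) - horizon):
        forward_return = (prices[i + horizon] - prices[i]) / prices[i]
        if forward_return > threshold:
            labels.append(BUY)
        elif forward_return < -threshold:
            labels.append(SELL)
        else:
            labels.append(HOLD)
    return labels


print(your_strategy([100, 102, 101.5, 99, 99.5]))  # prints: [2, 1, 0, 1]
```

The last `horizon` time steps get no label because their future price is unknown, so the matching feature rows need to be dropped before training.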

Dependencies

Each example includes its own requirements.txt:

Sentiment Classification:

transformers
torch
datasets
scikit-learn

Text Generation:

transformers
peft (LoRA)
torch
modal (deployment)

Trade Classifier:

transformers
pandas
numpy
modal
ta (technical analysis)

Troubleshooting

Training Fails

Issue: Out of memory
Fix: Reduce batch size in config

Issue: CUDA not available
Fix: Train on CPU, or install a CUDA-enabled PyTorch build with a compatible NVIDIA driver

Inference Errors

Issue: Model not found
Fix: Check checkpoint path in inference script

Issue: Wrong input format
Fix: Validate input matches training data format

Deployment Issues

Issue: Modal authentication
Fix: Run modal token new to authenticate

Issue: Dependency conflicts
Fix: Use exact versions from requirements.txt

Resources

Scripts: All scripts are in scripts/ with execution permissions

Examples: Complete projects in examples/ directory

Templates: Scaffolding in templates/ for creating new projects

Documentation: Each example has detailed README.md

Supported Frameworks: PyTorch, Transformers, PEFT
Deployment Platforms: Modal, Local, FastAPI
Version: 1.0.0
