Portfolio Optimization

Overview

This skill provides guidance for implementing high-performance portfolio optimization algorithms using Python C extensions. It covers the workflow for creating C extensions that interface with NumPy arrays, proper verification strategies, and common pitfalls to avoid when optimizing numerical computations.

When to Apply This Skill

Apply this skill when:

Implementing portfolio risk calculations (variance, volatility, Sharpe ratio)
Optimizing matrix-vector operations for large asset portfolios
Creating C extensions for Python numerical code
Meeting performance requirements stated as speedup ratios (e.g., >= 1.2x over the baseline)
Working with covariance matrices and portfolio weights

Recommended Workflow

Phase 1: Codebase Understanding

Before writing any code:

Read all relevant source files completely - Understand the baseline implementation, data structures, and expected interfaces
Identify the mathematical operations - Common operations include:
Matrix-vector multiplication (covariance matrix times weights)
Dot products (weights times returns)
Square root operations (for volatility from variance)
Understand the test suite - Know what correctness tolerances are expected (e.g., 1e-10) and what performance benchmarks must be met
Document the input/output contracts - Array shapes, data types (typically float64), and return value specifications
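The operations listed above can be written down as a small NumPy reference before any C is written. This is a minimal sketch, not the baseline from any particular codebase: the function names and the two-asset data are assumptions chosen so the result can be checked by hand.

```python
import numpy as np


def portfolio_return(weights, expected_returns):
    """Expected portfolio return: dot product of weights and asset returns."""
    return float(np.dot(weights, expected_returns))


def portfolio_variance(weights, cov):
    """Variance w^T * Sigma * w: one matrix-vector product, then one dot product."""
    return float(np.dot(weights, cov @ weights))


def portfolio_volatility(weights, cov):
    """Volatility is the square root of the variance."""
    return float(np.sqrt(portfolio_variance(weights, cov)))


# Hypothetical two-asset example that is easy to verify manually:
# variance = 0.5^2 * 0.04 + 2 * 0.5 * 0.5 * 0.01 + 0.5^2 * 0.09 = 0.0375
weights = np.array([0.5, 0.5])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
expected_returns = np.array([0.08, 0.12])

print(portfolio_return(weights, expected_returns))
print(portfolio_variance(weights, cov))
print(portfolio_volatility(weights, cov))
```

Keeping a reference like this next to the optimized code gives every later correctness check a fixed point of comparison.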

Phase 2: Implementation Planning

Consider these factors before implementation:

Why C provides speedup:

Eliminates per-operation Python interpreter overhead inside hot loops
Enables direct memory access without bounds checking
Allows compiler optimizations (vectorization, loop unrolling)
Reduces temporary array allocations

Design decisions to make:

Whether to use NumPy C API for zero-copy array access
Memory layout assumptions (C-contiguous vs Fortran-contiguous)
Error handling strategy for type mismatches and dimension errors

Potential algorithmic optimizations:

Cache-friendly memory access patterns (row-major iteration for C-contiguous arrays)
SIMD vectorization opportunities
Minimizing Python-to-C data conversion overhead
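The loop structure a C kernel might use can be prototyped in plain Python first. The sketch below is only an illustration of the access pattern, assuming a flat row-major buffer; the function name is hypothetical and the Python version is of course not fast itself.

```python
import math


def variance_row_major(weights, cov_flat, n):
    """Compute w^T * Sigma * w over a flat row-major buffer.

    cov_flat[i * n + j] holds Sigma[i][j], which is how a C-contiguous
    float64 array is laid out in memory.
    """
    total = 0.0
    for i in range(n):
        row_start = i * n  # each row is contiguous, so j walks memory sequentially
        row_dot = 0.0
        for j in range(n):
            row_dot += cov_flat[row_start + j] * weights[j]
        total += weights[i] * row_dot  # fused: no temporary vector for Sigma * w
    return total


weights = [0.5, 0.5]
cov_flat = [0.04, 0.01,
            0.01, 0.09]
variance = variance_row_major(weights, cov_flat, 2)
volatility = math.sqrt(variance)
print(variance, volatility)
```

Fusing the matrix-vector product with the final dot product is one way to avoid the temporary array that a two-step NumPy expression allocates.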

Phase 3: C Extension Implementation

When implementing the C extension:

Include proper headers:

Python.h (must be included before any standard library headers)
numpy/arrayobject.h for NumPy array access

Initialize NumPy in the module init function:

Call import_array() before any other NumPy C API function; omitting it typically causes a segmentation fault on the first PyArray_* call

Use NumPy C API for array access:

PyArray_DATA(arr) for the data pointer
PyArray_DIM(arr, i) for the length of dimension i
PyArray_STRIDE(arr, i) for the stride of dimension i, in bytes
PyArray_IS_C_CONTIGUOUS(arr) to check the memory layout before indexing the buffer directly
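The quantities these accessors return can be inspected from Python before writing the C side. The sketch below maps each one to its Python-level attribute and shows why the stride check matters; the array contents and the helper name are illustrative assumptions.

```python
import numpy as np

cov = np.zeros((3, 3), dtype=np.float64)  # C-contiguous by default

# Python-level counterparts of the C accessors:
#   cov.shape[i]            ~ PyArray_DIM(arr, i)
#   cov.strides[i]          ~ PyArray_STRIDE(arr, i), measured in bytes
#   cov.flags.c_contiguous  ~ PyArray_IS_C_CONTIGUOUS(arr)
print(cov.shape, cov.strides, cov.flags.c_contiguous)

transposed = cov.T         # same buffer viewed in Fortran order
every_other = cov[:, ::2]  # strided view that skips columns
print(transposed.strides, transposed.flags.c_contiguous)
print(every_other.strides, every_other.flags.c_contiguous)


def element_offset(arr, i, j):
    """Byte offset of arr[i, j] from the data pointer, valid for any 2-D layout."""
    return i * arr.strides[0] + j * arr.strides[1]


print(element_offset(cov, 1, 2), element_offset(transposed, 1, 2))
```

C code that indexes `data[i * n + j]` implicitly assumes the first layout only, so it should either reject the other two or compute offsets from the strides.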

Implement robust error handling:

Validate array dimensions match expected shapes
Check data types (expect NPY_FLOAT64 for double precision)
Handle non-contiguous arrays (either reject or handle strides)
Set appropriate Python exceptions on error

Phase 4: Python Wrapper Implementation

Create a Python module that:

Imports the C extension module
Provides a clean interface matching the baseline API
Handles any necessary array preparation (ensuring contiguity)
Documents the interface clearly
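A wrapper along these lines might look like the sketch below. The module name `_portfolio_c` and the entry point `portfolio_risk` are hypothetical placeholders, not names taken from any real project; the NumPy fallback is an added convenience so the sketch runs without a compiled extension.

```python
import numpy as np

try:
    # Hypothetical compiled module name; substitute the real extension.
    import _portfolio_c as _backend
except ImportError:
    _backend = None


def _prepare(array, ndim):
    """Return a C-contiguous float64 array, copying only when necessary."""
    prepared = np.ascontiguousarray(array, dtype=np.float64)
    if prepared.ndim != ndim:
        raise ValueError(f"expected a {ndim}-D array, got {prepared.ndim}-D")
    return prepared


def portfolio_risk(weights, cov):
    """Portfolio volatility sqrt(w^T * Sigma * w)."""
    w = _prepare(weights, 1)
    c = _prepare(cov, 2)
    if c.shape != (w.shape[0], w.shape[0]):
        raise ValueError("cov must be square and match the length of weights")
    if _backend is not None:
        return _backend.portfolio_risk(w, c)  # assumed C entry point
    return float(np.sqrt(w @ c @ w))          # NumPy fallback


print(portfolio_risk([0.5, 0.5], [[0.04, 0.01], [0.01, 0.09]]))
```

Doing the contiguity and dtype conversion in Python keeps the C code simple, at the cost of a copy whenever the caller passes a strided or non-float64 array.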

Phase 5: Verification Strategy

Critical: Verify every change completely

After editing files, re-read them - Confirm edits were applied correctly, especially for multi-line changes

Test incrementally:

Build the C extension first and verify it compiles
Test individual functions before running full benchmarks
Use small test cases for correctness verification before scaling up

Correctness verification:

Compare outputs against baseline implementation
Use appropriate numerical tolerances (typically 1e-10 for double precision)
Test with known inputs where expected outputs can be calculated manually
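One way to automate the comparison against the baseline is sketched below. `candidate_variance` is a stand-in for the optimized function under test (here an `einsum` variant, purely for illustration), and the sizes, seeds, and relative tolerance are assumptions to adapt to the actual test suite.

```python
import numpy as np


def baseline_variance(weights, cov):
    """Reference implementation using NumPy."""
    return float(weights @ cov @ weights)


def candidate_variance(weights, cov):
    """Stand-in for the optimized function under test (hypothetical)."""
    return float(np.einsum("i,ij,j->", weights, cov, weights))


def random_case(n, seed):
    """Build a valid case: weights summing to 1 and a PSD covariance matrix."""
    rng = np.random.default_rng(seed)
    weights = rng.random(n)
    weights /= weights.sum()
    factors = rng.standard_normal((n, n))
    cov = factors @ factors.T / n  # A * A^T is positive semidefinite
    return weights, cov


def check_matches_baseline(sizes=(1, 10, 100), rtol=1e-10):
    """Raise AssertionError if any size disagrees beyond the tolerance."""
    for n in sizes:
        weights, cov = random_case(n, seed=n)
        np.testing.assert_allclose(
            candidate_variance(weights, cov),
            baseline_variance(weights, cov),
            rtol=rtol,
        )


check_matches_baseline()
```

Fixed seeds keep failures reproducible, and starting from small sizes makes a mismatch easier to trace to a specific index.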

Performance verification:

Run benchmarks with representative data sizes
Verify speedup meets requirements across different portfolio sizes
Test edge cases: small portfolios (n=1, n=10), large portfolios (n=5000+)
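A speedup ratio can be measured with the standard `timeit` module. The sketch below is a minimal harness under stated assumptions: the two functions being compared are illustrative stand-ins (a Python loop versus the built-in `sum`), and the repeat counts are arbitrary defaults.

```python
import timeit


def measure_speedup(baseline, candidate, args, repeat=5, number=100):
    """Return baseline_time / candidate_time using the best of several runs."""
    baseline_time = min(
        timeit.repeat(lambda: baseline(*args), repeat=repeat, number=number)
    )
    candidate_time = min(
        timeit.repeat(lambda: candidate(*args), repeat=repeat, number=number)
    )
    return baseline_time / candidate_time


# Illustrative stand-ins only: a Python loop versus the built-in sum.
def slow_total(values):
    total = 0.0
    for value in values:
        total += value
    return total


values = [float(i) for i in range(10_000)]
speedup = measure_speedup(slow_total, sum, (values,))
print(f"speedup: {speedup:.2f}x (target >= 1.2x)")
```

Taking the minimum over several repeats reduces noise from other processes. Running the same harness at each portfolio size shows whether call overhead erodes the gain for small inputs.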

Edge Cases to Handle

Ensure the implementation addresses:

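Empty portfolios (n=0) - Return a documented default or raise an error
Single-asset portfolios (n=1) - Degenerate case: the covariance matrix is 1x1 and the variance reduces to the squared weight times that asset's variance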
Dimension mismatches - Weights vector length vs covariance matrix dimensions
Invalid inputs:
Non-square covariance matrices
NaN or infinity values in inputs
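Negative computed variance (mathematically invalid; indicates a covariance matrix that is not positive semidefinite, or rounding error near zero)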
Memory considerations:
Non-contiguous NumPy arrays
Memory allocation failures in C code
Large portfolios that may stress memory
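Several of these cases can be pinned down as an executable specification in Python, which also documents what the C code is expected to reject. This is a sketch: the function names, the choice of `ValueError`, and the `negative_tolerance` threshold are all assumptions rather than requirements from any existing interface.

```python
import numpy as np


def validate_inputs(weights, cov):
    """Raise ValueError for inputs the optimized code should not process."""
    weights = np.asarray(weights, dtype=np.float64)
    cov = np.asarray(cov, dtype=np.float64)
    if weights.ndim != 1:
        raise ValueError("weights must be 1-D")
    if weights.size == 0:
        raise ValueError("portfolio is empty (n=0)")
    if cov.ndim != 2 or cov.shape[0] != cov.shape[1]:
        raise ValueError("covariance matrix must be square")
    if cov.shape[0] != weights.shape[0]:
        raise ValueError("weights length does not match covariance dimensions")
    if not (np.isfinite(weights).all() and np.isfinite(cov).all()):
        raise ValueError("inputs contain NaN or infinity")
    return weights, cov


def safe_volatility(weights, cov, negative_tolerance=1e-12):
    """Volatility with a guard against a negative computed variance."""
    weights, cov = validate_inputs(weights, cov)
    variance = float(weights @ cov @ weights)
    if variance < -negative_tolerance:
        raise ValueError("negative variance: covariance is not positive semidefinite")
    return float(np.sqrt(max(variance, 0.0)))  # clamp tiny negative rounding error


print(safe_volatility([1.0], [[0.04]]))  # single-asset case
```

Whether an empty portfolio should raise or return a default is a design decision; the sketch raises so the choice is explicit rather than accidental.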

Common Pitfalls to Avoid

Code Completeness

Never truncate code in edit operations - always provide complete implementations
Verify file contents after editing to confirm changes applied correctly
Document all design choices explicitly

Testing Approach

Avoid going directly from implementation to full benchmark testing
Test each function individually before integration testing
Do not rely solely on "tests pass" for validation - understand why they pass

C Extension Specific

Always check NumPy array types before accessing data
Handle reference counting properly: release every new reference with Py_DECREF on all exit paths, including error paths, to avoid memory leaks
Initialize NumPy API with import_array() in module init
Use PyErr_SetString() to set an exception on error, then return NULL so the exception propagates to Python
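Reference-count mistakes in an extension function can be smoke-tested from Python with `sys.getrefcount`. The sketch below uses a hypothetical helper and simulates a leak with a plain Python list, since no compiled function is available here; with a real extension, the function under test would be passed in place of `len`.

```python
import sys


def refcount_delta(func, obj, calls=1000):
    """Return how much obj's reference count changed after repeated calls."""
    before = sys.getrefcount(obj)
    for _ in range(calls):
        func(obj)
    return sys.getrefcount(obj) - before


payload = [0.25, 0.25, 0.5]

# A well-behaved function releases every reference it takes.
print(refcount_delta(len, payload))

# Simulated leak: each call keeps one extra reference alive,
# much like a missing Py_DECREF in C.
_leaked = []
print(refcount_delta(_leaked.append, payload))
```

A delta that grows with the number of calls suggests a missing decrement, while a negative delta suggests a missing increment, which tends to surface later as a crash rather than a leak.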

Performance Validation

Verify speedup is consistent across different input sizes
Profile before attempting further optimizations, so effort goes to the measured bottleneck
Consider the overhead of Python-to-C transitions for small inputs

Build and Test Commands

Typical workflow commands:

Build the C extension

python setup.py build_ext --inplace

Run correctness tests

python -c "from portfolio_optimized import *; # test calls"

Run benchmark

python benchmark.py

Run full test suite

pytest test_portfolio.py -v

Verification Checklist

Before considering the task complete:

All source files read and understood
C extension compiles without warnings
Individual functions tested for correctness
Numerical results match baseline within tolerance
Performance meets speedup requirements
Edge cases explicitly tested or handled
Error handling implemented for invalid inputs
File contents verified after all edits
No memory leaks in C code (proper reference counting)
