VectorBT Backtesting Expert Skill
Environment
-
Python with vectorbt, pandas, numpy, plotly
-
Data sources: OpenAlgo (Indian markets), DuckDB (direct database), yfinance (US/Global), CCXT (Crypto), custom providers
-
DuckDB support: supports both custom DuckDB and OpenAlgo Historify format
-
API keys loaded from single root .env via python-dotenv
- find_dotenv() — never hardcode keys
-
Technical indicators: TA-Lib (ALWAYS - never use VectorBT built-in indicators)
-
Specialty indicators: openalgo.ta for Supertrend, Donchian, Ichimoku, HMA, KAMA, ALMA, ZLEMA, VWMA
-
Signal cleaning: openalgo.ta for exrem, crossover, crossunder, flip
-
Fee model: Indian market standard (STT + statutory charges + Rs 20/order)
-
Benchmark: NIFTY 50 via OpenAlgo (NSE_INDEX ) by default
-
Charts: Plotly with template="plotly_dark"
-
Environment variables loaded from single .env at project root via find_dotenv() (walks up from script dir)
-
Scripts go in backtesting/{strategy_name}/ directories (created on-demand, not pre-created)
-
Never use icons/emojis in code or logger output
Critical Rules
-
ALWAYS use TA-Lib for ALL technical indicators (EMA, SMA, RSI, MACD, BBANDS, ATR, ADX, STDDEV, MOM). NEVER use vbt.MA.run() , vbt.RSI.run() , or any VectorBT built-in indicator.
-
Use OpenAlgo ta for indicators NOT in TA-Lib: Supertrend, Donchian, Ichimoku, HMA, KAMA, ALMA, ZLEMA, VWMA.
-
Use OpenAlgo ta for signal utilities: ta.exrem() , ta.crossover() , ta.crossunder() , ta.flip() . If openalgo.ta is not importable (standalone DuckDB), use inline exrem() fallback. See duckdb-data.
-
Always clean signals with ta.exrem() after generating raw buy/sell signals. Always .fillna(False) before exrem.
-
Market-specific fees: India (indian-market-costs), US (us-market-costs), Crypto (crypto-market-costs). Auto-select based on user's market.
-
Default benchmarks: India=NIFTY via OpenAlgo, US=S&P 500 (^GSPC ), Crypto=Bitcoin (BTC-USD ). See data-fetching Market Selection Guide.
-
Always produce a Strategy vs Benchmark comparison table after every backtest.
-
Always explain the backtest report in plain language so even normal traders understand risk and strength.
-
Plotly candlestick charts must use xaxis type="category" to avoid weekend gaps.
-
Whole shares: Always set min_size=1, size_granularity=1 for equities.
-
DuckDB data loading: When user provides a DuckDB path, load data directly using duckdb.connect() with read_only=True . Auto-detect format: OpenAlgo Historify (table market_data , epoch timestamps) vs custom (table ohlcv , date+time columns). See duckdb-data.
Modular Rule Files
Detailed reference for each topic is in rules/ :
Rule File Topic
data-fetching OpenAlgo (India), yfinance (US), CCXT (Crypto), custom providers, .env setup
simulation-modes from_signals, from_orders, from_holding, direction types
position-sizing Amount/Value/Percent/TargetPercent sizing
indicators-signals TA-Lib indicator reference, signal generation
openalgo-ta-helpers OpenAlgo ta: exrem, crossover, Supertrend, Donchian, Ichimoku, MAs
stop-loss-take-profit Fixed SL, TP, trailing stop
parameter-optimization Broadcasting and loop-based optimization
performance-analysis Stats, metrics, benchmark comparison, CAGR
plotting Candlestick (category x-axis), VectorBT plots, custom Plotly
indian-market-costs Indian market fee model by segment
us-market-costs US market fee model (stocks, options, futures)
crypto-market-costs Crypto fee model (spot, USDT-M, COIN-M futures)
futures-backtesting Lot sizes (SEBI revised Dec 2025), value sizing
long-short-trading Simultaneous long/short, direction comparison
duckdb-data DuckDB direct loading, Historify format, auto-detect, resampling, multi-symbol
csv-data-resampling Loading CSV, resampling with Indian market alignment
walk-forward Walk-forward analysis, WFE ratio
robustness-testing Monte Carlo, noise test, parameter sensitivity, delay test
pitfalls Common mistakes and checklist before going live
strategy-catalog Strategy reference with code snippets
quantstats-tearsheet QuantStats HTML reports, metrics, plots, Monte Carlo
Strategy Templates (in rules/assets/)
Production-ready scripts with realistic fees, NIFTY benchmark, comparison table, and plain-language report:
Template Path Description
EMA Crossover assets/ema_crossover/backtest.py
EMA 10/20 crossover
RSI assets/rsi/backtest.py
RSI(14) oversold/overbought
Donchian assets/donchian/backtest.py
Donchian channel breakout
Supertrend assets/supertrend/backtest.py
Supertrend with intraday sessions
MACD assets/macd/backtest.py
MACD signal-candle breakout
SDA2 assets/sda2/backtest.py
SDA2 trend following
Momentum assets/momentum/backtest.py
Double momentum (MOM + MOM-of-MOM)
Dual Momentum assets/dual_momentum/backtest.py
Quarterly ETF rotation
Buy & Hold assets/buy_hold/backtest.py
Static multi-asset allocation
RSI Accumulation assets/rsi_accumulation/backtest.py
Weekly RSI slab-wise accumulation
Walk-Forward assets/walk_forward/template.py
Walk-forward analysis template
Realistic Costs assets/realistic_costs/template.py
Transaction cost impact comparison
Quick Template: Standard Backtest Script
import os from datetime import datetime, timedelta from pathlib import Path
import numpy as np import pandas as pd import talib as tl import vectorbt as vbt from dotenv import find_dotenv, load_dotenv from openalgo import api, ta
--- Config ---
script_dir = Path(file).resolve().parent load_dotenv(find_dotenv(), override=False)
SYMBOL = "SBIN" EXCHANGE = "NSE" INTERVAL = "D" INIT_CASH = 1_000_000 FEES = 0.00111 # Indian delivery equity (STT + statutory) FIXED_FEES = 20 # Rs 20 per order ALLOCATION = 0.75 BENCHMARK_SYMBOL = "NIFTY" BENCHMARK_EXCHANGE = "NSE_INDEX"
--- Fetch Data ---
client = api( api_key=os.getenv("OPENALGO_API_KEY"), host=os.getenv("OPENALGO_HOST", "http://127.0.0.1:5000"), )
end_date = datetime.now().date() start_date = end_date - timedelta(days=365 * 3)
df = client.history( symbol=SYMBOL, exchange=EXCHANGE, interval=INTERVAL, start_date=start_date.strftime("%Y-%m-%d"), end_date=end_date.strftime("%Y-%m-%d"), ) if "timestamp" in df.columns: df["timestamp"] = pd.to_datetime(df["timestamp"]) df = df.set_index("timestamp") else: df.index = pd.to_datetime(df.index) df = df.sort_index() if df.index.tz is not None: df.index = df.index.tz_convert(None)
close = df["close"]
--- Strategy: EMA Crossover (TA-Lib) ---
ema_fast = pd.Series(tl.EMA(close.values, timeperiod=10), index=close.index) ema_slow = pd.Series(tl.EMA(close.values, timeperiod=20), index=close.index)
buy_raw = (ema_fast > ema_slow) & (ema_fast.shift(1) <= ema_slow.shift(1)) sell_raw = (ema_fast < ema_slow) & (ema_fast.shift(1) >= ema_slow.shift(1))
entries = ta.exrem(buy_raw.fillna(False), sell_raw.fillna(False)) exits = ta.exrem(sell_raw.fillna(False), buy_raw.fillna(False))
--- Backtest ---
pf = vbt.Portfolio.from_signals( close, entries, exits, init_cash=INIT_CASH, size=ALLOCATION, size_type="percent", fees=FEES, fixed_fees=FIXED_FEES, direction="longonly", min_size=1, size_granularity=1, freq="1D", )
--- Benchmark ---
df_bench = client.history( symbol=BENCHMARK_SYMBOL, exchange=BENCHMARK_EXCHANGE, interval=INTERVAL, start_date=start_date.strftime("%Y-%m-%d"), end_date=end_date.strftime("%Y-%m-%d"), ) if "timestamp" in df_bench.columns: df_bench["timestamp"] = pd.to_datetime(df_bench["timestamp"]) df_bench = df_bench.set_index("timestamp") else: df_bench.index = pd.to_datetime(df_bench.index) df_bench = df_bench.sort_index() if df_bench.index.tz is not None: df_bench.index = df_bench.index.tz_convert(None) bench_close = df_bench["close"].reindex(close.index).ffill().bfill() pf_bench = vbt.Portfolio.from_holding(bench_close, init_cash=INIT_CASH, fees=FEES, freq="1D")
--- Results ---
print(pf.stats())
--- Strategy vs Benchmark ---
comparison = pd.DataFrame({ "Strategy": [ f"{pf.total_return() * 100:.2f}%", f"{pf.sharpe_ratio():.2f}", f"{pf.sortino_ratio():.2f}", f"{pf.max_drawdown() * 100:.2f}%", f"{pf.trades.win_rate() * 100:.1f}%", f"{pf.trades.count()}", f"{pf.trades.profit_factor():.2f}", ], f"Benchmark ({BENCHMARK_SYMBOL})": [ f"{pf_bench.total_return() * 100:.2f}%", f"{pf_bench.sharpe_ratio():.2f}", f"{pf_bench.sortino_ratio():.2f}", f"{pf_bench.max_drawdown() * 100:.2f}%", "-", "-", "-", ], }, index=["Total Return", "Sharpe Ratio", "Sortino Ratio", "Max Drawdown", "Win Rate", "Total Trades", "Profit Factor"]) print(comparison.to_string())
--- Explain ---
print(f"* Total Return: {pf.total_return() * 100:.2f}% vs NIFTY {pf_bench.total_return() * 100:.2f}%") print(f"* Max Drawdown: {pf.max_drawdown() * 100:.2f}%") print(f" -> On Rs {INIT_CASH:,}, worst temporary loss = Rs {abs(pf.max_drawdown()) * INIT_CASH:,.0f}")
--- Plot ---
fig = pf.plot(subplots=['value', 'underwater', 'cum_returns'], template="plotly_dark") fig.show()
--- Export ---
pf.positions.records_readable.to_csv(script_dir / f"{SYMBOL}_trades.csv", index=False)
Quick Template: DuckDB Backtest Script
import datetime as dt from pathlib import Path
import duckdb import numpy as np import pandas as pd import talib as tl import vectorbt as vbt
try: from openalgo import ta exrem = ta.exrem except ImportError: def exrem(signal1, signal2): result = signal1.copy() active = False for i in range(len(signal1)): if active: result.iloc[i] = False if signal1.iloc[i] and not active: active = True if signal2.iloc[i]: active = False return result
--- Config ---
SYMBOL = "SBIN" DB_PATH = r"path/to/market_data.duckdb" INIT_CASH = 1_000_000 FEES = 0.000225 # Intraday equity FIXED_FEES = 20
--- Load from DuckDB ---
con = duckdb.connect(DB_PATH, read_only=True) df = con.execute(""" SELECT date, time, open, high, low, close, volume FROM ohlcv WHERE symbol = ? ORDER BY date, time """, [SYMBOL]).fetchdf() con.close()
df["datetime"] = pd.to_datetime(df["date"].astype(str) + " " + df["time"].astype(str)) df = df.set_index("datetime").sort_index() df = df.drop(columns=["date", "time"])
--- Resample to 5min ---
df_5m = df.resample("5min", origin="start_day", offset="9h15min", label="right", closed="right").agg({ "open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum" }).dropna() close = df_5m["close"]