spice-caching

Configure Spice.ai in-memory caching for SQL query results, search results, and embeddings. Use when setting up caching, tuning cache TTL/size/eviction, configuring stale-while-revalidate, custom cache keys, or cache-control headers.

Install skill "spice-caching" with this command: npx skills add spiceai/skills/spiceai-skills-spice-caching

Spice Caching

Configure in-memory caching for SQL query results, search results, and embeddings in the Spice runtime.

Overview

Spice caches results from SQL queries (/v1/sql), search (/v1/search), and embeddings requests. All three caches are enabled by default with a 1-second TTL and 128 MiB max size. Caching applies to HTTP and Arrow Flight APIs.

Configuration

Caching is configured under runtime.caching in spicepod.yaml:

version: v1
kind: Spicepod
name: app

runtime:
  caching:
    sql_results:
      enabled: true
      max_size: 1GiB # Default 128MiB
      item_ttl: 1m # Default 1s
      eviction_policy: lru # lru | tiny_lfu
      hashing_algorithm: xxh3
      cache_key_type: plan # plan | sql
      encoding: none # none | zstd
      stale_while_revalidate_ttl: 30s # Default 0s (disabled)
    search_results:
      enabled: true
      max_size: 1GiB
      item_ttl: 1m
      eviction_policy: lru
    embeddings:
      enabled: true
      max_size: 128MiB
      item_ttl: 1m

Common Parameters (All Cache Types)

| Parameter | Default | Description |
| --- | --- | --- |
| enabled | true | Enable/disable the cache |
| max_size | 128MiB | Maximum cache size |
| eviction_policy | lru | lru (Least Recently Used) or tiny_lfu (higher hit rate for skewed access) |
| item_ttl | 1s | Cache entry TTL (Time to Live) |
| hashing_algorithm | xxh3 | Hash for cache keys: xxh3, ahash, siphash, blake3, xxh32, xxh64, xxh128 |

SQL Results Extra Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| cache_key_type | plan | plan = logical plan (matches semantically equivalent queries); sql = raw SQL string (faster, exact match only) |
| encoding | none | none or zstd (compresses cached results, 50-90% reduction) |
| stale_while_revalidate_ttl | 0s | Serve stale entries while refreshing in the background. 0s = disabled |

Choosing Parameters

cache_key_type

  • plan (default): Matches semantically equivalent queries even with different SQL syntax. Requires query parsing overhead.
  • sql: Faster lookups, exact string match. Avoid with dynamic functions like NOW().
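
The exact-match behaviour of sql keys can be illustrated with a minimal Python sketch. This is not Spice's implementation: raw_sql_key is a hypothetical helper, and hashlib.blake2b merely stands in for whichever hashing_algorithm is configured. The assumption is that a sql key is derived from the query text byte for byte.

```python
import hashlib

def raw_sql_key(sql: str) -> str:
    """Hypothetical stand-in for a cache key derived from the raw SQL string."""
    # Assumption: the key is a hash of the exact query text, byte for byte.
    return hashlib.blake2b(sql.encode("utf-8"), digest_size=16).hexdigest()

a = raw_sql_key("SELECT id FROM users WHERE org_id = 1")
b = raw_sql_key("select id from users where org_id = 1")
c = raw_sql_key("SELECT id FROM users WHERE org_id = 1")

print(a == c)  # identical text maps to the same key
print(a == b)  # a difference in keyword case already yields a different key
```

Under cache_key_type: plan, queries like these would be expected to share an entry, since the key comes from the logical plan rather than the text.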

eviction_policy

  • lru (default): Good general-purpose policy.
  • tiny_lfu: Better hit rate when some queries are accessed much more frequently than others.
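
As a rough illustration of what lru eviction means, the toy cache below drops the least recently used entry once it is full. It is a sketch only: LruCache is a hypothetical class, it bounds the cache by item count rather than by bytes (Spice uses max_size), and it says nothing about how tiny_lfu is implemented.

```python
from collections import OrderedDict

class LruCache:
    """Toy LRU cache bounded by item count (an assumption; Spice bounds by bytes)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry

cache = LruCache(capacity=2)
cache.put("q1", "rows-1")
cache.put("q2", "rows-2")
cache.get("q1")            # q1 becomes the most recently used entry
cache.put("q3", "rows-3")  # the cache is full, so q2 is evicted
```

A frequently repeated query can still be pushed out of an LRU cache by a burst of one-off queries, which is the situation where a frequency-aware policy such as tiny_lfu tends to help.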

encoding

  • none (default): Zero compression overhead, uses more memory.
  • zstd: High compression (50-90% reduction) with fast decompression. Use for large result sets.
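
To get a feel for what a 50-90% reduction means for capacity, here is a back-of-the-envelope calculation in Python. The 4 MiB result size is a made-up figure, entries_that_fit is a hypothetical helper, and per-entry overhead is ignored, so treat the numbers as rough estimates.

```python
MIB = 1024 * 1024

def entries_that_fit(max_size_bytes: int, result_bytes: int, reduction_pct: int) -> int:
    """Estimate how many equally sized results fit for a given size reduction (0-99%)."""
    stored_bytes = result_bytes * (100 - reduction_pct) // 100
    return max_size_bytes // stored_bytes

max_size = 128 * MIB  # the default max_size
result = 4 * MIB      # hypothetical size of one cached result set

uncompressed = entries_that_fit(max_size, result, 0)  # encoding: none
zstd_low = entries_that_fit(max_size, result, 50)     # zstd at 50% reduction
zstd_high = entries_that_fit(max_size, result, 90)    # zstd at 90% reduction

print(uncompressed, zstd_low, zstd_high)  # prints: 32 64 320
```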

hashing_algorithm

  • xxh3 (default): Fastest general-purpose.
  • ahash / xxh64 / xxh128: Lower collision probability for many cached queries.
  • blake3: Cryptographic security required.
  • siphash: Protection against hash-flooding DoS attacks.

Stale-While-Revalidate

When stale_while_revalidate_ttl is set to a non-zero value:

  1. Cache entries are served normally until item_ttl expires.
  2. After item_ttl expires but before item_ttl + stale_while_revalidate_ttl, the stale entry is served immediately with STALE status.
  3. A background task refreshes the cache entry.
  4. After item_ttl + stale_while_revalidate_ttl, the entry is evicted.

Example:

runtime:
  caching:
    sql_results:
      enabled: true
      item_ttl: 10s
      stale_while_revalidate_ttl: 10s
      # Fresh for 10s → Stale (served while refreshing) for 10s → Evicted
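
The timeline above can be modelled as a small Python function. This is a sketch of the documented behaviour, not runtime code: entry_state and its labels are hypothetical, and treating an entry as expired exactly at the TTL boundary is an assumption.

```python
def entry_state(age_s: float, item_ttl_s: float, swr_ttl_s: float) -> str:
    """Classify a cache entry by its age in seconds."""
    if age_s < item_ttl_s:
        return "FRESH"    # served normally
    if age_s < item_ttl_s + swr_ttl_s:
        return "STALE"    # served immediately while a background refresh runs
    return "EVICTED"      # no longer served from cache

# item_ttl: 10s, stale_while_revalidate_ttl: 10s
for age in (5, 15, 25):
    print(age, entry_state(age, 10, 10))
```

With stale_while_revalidate_ttl left at 0s, the STALE window is empty and an entry goes straight from fresh to evicted.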

Conflict warning: When using refresh_mode: caching on a dataset, do not configure both runtime.caching.sql_results.stale_while_revalidate_ttl and acceleration.params.caching_stale_while_revalidate_ttl for the same dataset. Choose one approach.

Cache Control Headers

HTTP API

Use the standard Cache-Control header with /v1/sql and /v1/search:

| Directive | Description |
| --- | --- |
| no-cache | Skip cache for this request; cache the result for future requests |
| min-fresh=N | Require cached entry to remain fresh for at least N seconds |
| max-stale=N | Accept stale responses up to N seconds old |
| only-if-cached | Return only cached responses; error on cache miss |
| stale-if-error=N | Serve stale cache (up to N seconds) if fetching fresh data fails |

# Skip cache for this query
curl -H "cache-control: no-cache" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only accept fresh results (at least 30s remaining)
curl -H "cache-control: min-fresh=30" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Accept stale up to 60s
curl -H "cache-control: max-stale=60" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only return if cached
curl -H "cache-control: only-if-cached" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'
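
The same requests can be built from application code. The sketch below uses only Python's standard library; build_sql_request is a hypothetical helper, and the base URL simply mirrors the curl examples above. It constructs the request without sending it.

```python
import urllib.request

def build_sql_request(sql: str, cache_control: str,
                      base_url: str = "http://localhost:8090") -> urllib.request.Request:
    """Build a POST to /v1/sql carrying a Cache-Control directive."""
    return urllib.request.Request(
        url=f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),
        headers={"Cache-Control": cache_control},
        method="POST",
    )

req = build_sql_request("SELECT 1", "max-stale=60")
# To send it against a running runtime: urllib.request.urlopen(req).read()
```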

Spice CLI

spice sql --cache-control no-cache
spice sql --cache-control min-fresh=30
spice sql --cache-control max-stale=60
spice sql --cache-control only-if-cached
spice search --cache-control no-cache

Arrow FlightSQL

Set cache-control in request metadata:

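use tonic::IntoRequest; // brings into_request() into scope
let mut request = FlightDescriptor::new_cmd(sql_command_bytes).into_request();
// Metadata values are typed: parse the string into a MetadataValue.
request.metadata_mut().insert("cache-control", "no-cache".parse().unwrap());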

JDBC:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

Properties props = new Properties();
props.setProperty("cache-control", "no-cache");
Connection conn = DriverManager.getConnection("jdbc:arrow-flight-sql://localhost:50051", props);

Custom Cache Keys

Set the Spice-Cache-Key header to share cache entries across semantically equivalent but syntactically different queries. Valid keys: up to 128 alphanumeric characters plus - and _. Custom keys take precedence over cache_key_type.

# First query — cache MISS
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where org_id = 1;"

# Different query, same cache key — cache HIT
curl -XPOST http://localhost:8090/v1/sql \
  -H "spice-cache-key: users_spiceai" \
  -d "select * from users where split_part(email, '@', 2) = 'spice.ai';"

Warning: Ensure queries sharing a cache key are truly semantically equivalent. The runtime will return the cached result regardless of the actual query.
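
Because invalid keys are easy to produce when keys are generated dynamically, a client-side check can catch them early. The Python sketch below encodes the rule stated above (up to 128 alphanumeric characters plus - and _). is_valid_cache_key is a hypothetical helper, and rejecting empty keys and non-ASCII characters is an assumption.

```python
import re

# Up to 128 characters from A-Z, a-z, 0-9, "-" and "_" (at least one is assumed).
_CACHE_KEY_RE = re.compile(r"[A-Za-z0-9_-]{1,128}")

def is_valid_cache_key(key: str) -> bool:
    """Return True if the key satisfies the documented Spice-Cache-Key format."""
    return _CACHE_KEY_RE.fullmatch(key) is not None

print(is_valid_cache_key("users_spiceai"))  # True
print(is_valid_cache_key("users spiceai"))  # False: spaces are not allowed
```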

Response Headers

Responses include a header indicating cache status:

| Cache Type | Response Header |
| --- | --- |
| sql_results | Results-Cache-Status |
| search_results | Search-Results-Cache-Status |

| Status | Meaning |
| --- | --- |
| HIT | Served from cache |
| MISS | Cache checked, result not found |
| BYPASS | Cache bypassed (e.g., cache-control: no-cache) |
| STALE | Stale entry served while revalidating |
| (absent) | Cache did not apply (disabled or system table query) |
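
A client can read these headers to log or assert on cache behaviour. The following Python sketch maps a cache type to its header and returns the status, or None when the header is absent. cache_status is a hypothetical helper, and the headers are passed in as a plain mapping so the example runs without a server.

```python
from typing import Mapping, Optional

STATUS_HEADERS = {
    "sql_results": "Results-Cache-Status",
    "search_results": "Search-Results-Cache-Status",
}

def cache_status(headers: Mapping[str, str], cache_type: str) -> Optional[str]:
    """Return HIT, MISS, BYPASS or STALE, or None if the cache did not apply."""
    wanted = STATUS_HEADERS[cache_type].lower()
    for name, value in headers.items():
        if name.lower() == wanted:  # HTTP header names are case-insensitive
            return value.strip().upper()
    return None

print(cache_status({"results-cache-status": "HIT"}, "sql_results"))
print(cache_status({"content-type": "application/json"}, "sql_results"))
```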

Monitoring / Metrics

Cache metrics are exposed at the Prometheus-compatible metrics endpoint. Metric names are prefixed by cache type: results_*, search_results_*, embeddings_*.

| Metric | Type | Description |
| --- | --- | --- |
| *_cache_max_size_bytes | Gauge | Configured max cache size |
| *_cache_requests | Counter | Total cache lookups |
| *_cache_hits | Counter | Total cache hits |
| *_cache_items_count | Gauge | Current items in cache |
| *_cache_size_bytes | Gauge | Current cache size |
| *_cache_evictions | Counter | Total evictions |
| *_cache_hit_ratio | Gauge | Hit ratio (hits / total) |
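
The hit ratio can also be derived from the two counters, which is useful when computing it over a custom window. The Python sketch below parses a scrape in Prometheus text format and divides hits by requests. The metric names are assumed from the prefix pattern above (the exported names may differ, for example with a _total suffix), the sample scrape is invented, and the parser is simplified: it ignores timestamps and sums samples across label sets.

```python
def parse_metrics(text: str) -> dict:
    """Parse Prometheus text exposition into {metric_name: summed value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_part, _, value = line.rpartition(" ")
        name = name_part.split("{", 1)[0]  # drop any label set
        samples[name] = samples.get(name, 0.0) + float(value)
    return samples

def hit_ratio(samples: dict, prefix: str) -> float:
    """Compute hits / requests for one cache type; 0.0 when there were no lookups."""
    requests = samples.get(f"{prefix}_cache_requests", 0.0)
    hits = samples.get(f"{prefix}_cache_hits", 0.0)
    return hits / requests if requests else 0.0

# Invented sample scrape, for illustration only.
SAMPLE = """\
# TYPE results_cache_requests counter
results_cache_requests 200
results_cache_hits 150
search_results_cache_requests 0
"""

samples = parse_metrics(SAMPLE)
print(hit_ratio(samples, "results"))  # 0.75
```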

Common Recipes

High-throughput Dashboard (Maximize Hit Rate)

runtime:
  caching:
    sql_results:
      item_ttl: 30s
      max_size: 2GiB
      eviction_policy: tiny_lfu
      encoding: zstd
      stale_while_revalidate_ttl: 30s

Low-Latency API (Exact Queries, Fast Lookups)

runtime:
  caching:
    sql_results:
      item_ttl: 5s
      cache_key_type: sql
      hashing_algorithm: xxh3

Disable Caching Entirely

runtime:
  caching:
    sql_results:
      enabled: false
    search_results:
      enabled: false
    embeddings:
      enabled: false

Troubleshooting

| Issue | Solution |
| --- | --- |
| Always getting MISS | Check item_ttl is long enough; verify cache_key_type (plan matches equivalent queries, sql requires exact strings) |
| Cache filling up quickly | Increase max_size, enable zstd encoding, or reduce item_ttl |
| Stale data being served | Reduce item_ttl or stale_while_revalidate_ttl; use cache-control: no-cache for specific queries |
| Dynamic functions (NOW()) returning cached results | Switch to cache_key_type: plan or use cache-control: no-cache |
| SWR conflict error | Don't set both runtime.caching.sql_results.stale_while_revalidate_ttl and acceleration.params.caching_stale_while_revalidate_ttl for the same dataset |
