Clay Architecture Variants
Overview
Deployment architectures for Clay data enrichment at different scales. Clay's table-based enrichment model, credit billing, and webhook-driven workflow fit differently depending on volume, team size, and data freshness requirements.
Prerequisites
-
Clay account with API access
-
Clear understanding of data volume and freshness needs
-
Infrastructure for chosen architecture tier
Instructions
Step 1: Direct Integration (Simple)
Best for: Small teams, < 1K enrichments/day, ad-hoc usage.
Application -> Clay API -> Enriched Data -> Application DB
import requests
def enrich_lead(email: str) -> dict: response = requests.post( f"{CLAY_API}/tables/{TABLE_ID}/rows", json={"rows": [{"email": email}]}, headers={"Authorization": f"Bearer {API_KEY}"} ) # Poll for enrichment completion row_id = response.json()["row_ids"][0] return poll_enrichment(TABLE_ID, row_id)
Trade-offs: Simple but synchronous. Each enrichment blocks until complete (30-60s). No retry or buffering.
Step 2: Queue-Based Pipeline (Moderate)
Best for: Growing teams, 1K-50K enrichments/day, CRM integration.
CRM Webhook -> Queue (Redis/SQS) -> Worker -> Clay API | v Enriched Data -> CRM Update
from rq import Queue import redis
q = Queue(connection=redis.Redis())
def on_new_lead(lead: dict): q.enqueue(enrich_and_update, lead, job_timeout=120)
def enrich_and_update(lead: dict): # Batch multiple leads for efficiency enriched = clay_enrich_batch([lead]) crm_client.update_contact(lead["id"], enriched[0])
Trade-offs: Decouples enrichment from user flow. Handles retries and batching. Needs queue infrastructure.
Step 3: Event-Driven with Webhooks (Scale)
Best for: Enterprise, 50K+ enrichments/day, real-time data needs.
Data Sources -> Event Bus (Kafka) -> Clay Enrichment Service | Clay Webhooks -> Event Bus -> Downstream Services
Clay enrichment microservice
class ClayEnrichmentService: def init(self, kafka_producer, credit_budget): self.producer = kafka_producer self.budget = credit_budget
async def handle_event(self, event: dict):
if not self.budget.can_afford(1):
self.producer.send("dlq.clay", event)
return
result = await self.enrich(event)
self.producer.send("enriched.contacts", result)
self.budget.record(1)
Clay webhook receiver
@app.post('/clay-webhook') async def clay_webhook(request): payload = await request.json() # Clay sends enrichment results via webhook await kafka_producer.send("enriched.contacts", payload) return {"status": "ok"}
Trade-offs: Fully async, horizontally scalable. Requires event bus, monitoring, and DLQ handling.
Decision Matrix
Factor Direct Queue-Based Event-Driven
Volume < 1K/day 1K-50K/day 50K+/day
Latency Sync (30-60s) Async (minutes) Async (seconds)
Cost Control Manual Budget caps Credit circuit breaker
Complexity Low Medium High
Team Size 1-3 3-10 10+
Error Handling
Issue Cause Solution
Slow enrichment in request path Using direct integration at scale Move to queue-based
Lost enrichment results No webhook receiver Set up Clay webhooks
Credit overspend No budget enforcement Add credit circuit breaker
Stale data No re-enrichment schedule Add periodic refresh jobs
Examples
Basic usage: Apply clay architecture variants to a standard project setup with default configuration options.
Advanced scenario: Customize clay architecture variants for production environments with multiple constraints and team-specific requirements.
Resources
-
Clay API Docs
-
Clay Webhooks
Output
-
Configuration files or code changes applied to the project
-
Validation report confirming correct implementation
-
Summary of changes made and their rationale