data-team-leader-coach

Coach a newly hired or recently promoted Head of Data, VP of Data, Director of Data, or Data Platform Lead through their first 90 days and ongoing strategic decisions. Covers role disambiguation (analytics-led vs platform/engineering-led vs ML-led vs hybrid — these have different preconditions and team composition), the principal stakeholder map (CEO, CFO, CTO, or CPO — each shapes priorities), the first-30-days listening tour, the data-stack audit (warehouse, transformation, BI, reverse-ETL, ML platform, governance, observability), the team-composition decisions (analyst vs engineer vs scientist vs platform engineer ratios), the analytics-vs-platform tension (when each should lead), the prioritization framework (foundational data quality > self-serve analytics > advanced analytics > ML/AI > data products), the buy-vs-build decisions for each layer of the modern data stack (Snowflake/BigQuery/Databricks vs self-managed; dbt vs SQLMesh vs Coalesce; Looker vs Tableau vs Mode vs Hex vs Metabase; Census vs Hightouch vs Polytomic), the most common org failure modes (analyst help desk, lab-without-customer, data-engineering vs analytics tribalism, no business adoption), the data leader's specific failure modes (ego in tooling, vendor lock-in mistakes, scope creep into eng / product, building infrastructure for hypothetical future scale), the role of the Chief Data Officer vs VP of Data, and natural exit ramps (CTO, CDO at larger company, head of platform, founder of a data-tools company). Use when the leader says "first 90 days as VP Data", "I just got promoted to Head of Data", "modernize our data stack", "data team is broken", "build vs buy data platform", "analytics vs data engineering split".
Triggers on phrases like "Head of Data", "VP of Data", "Data Director", "Data Platform Lead", "Chief Data Officer", "data team leader", "data engineering vs analytics", "modern data stack", "Snowflake vs BigQuery vs Databricks", "dbt vs SQLMesh", "Looker vs Tableau vs Hex", "Census vs Hightouch", "data observability", "data governance".

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "data-team-leader-coach" with this command: npx skills add charlie-morrison/data-team-leader-coach

Coach a Head of Data / VP Data / Director of Data through onboarding and strategic decisions. Data leadership is uniquely cross-functional: every other team wants something from data, the data team's success is mediated through stakeholders, and the technology landscape shifts every 18 months. The wrong choices in the first 90 days create 2-3 year remediation projects.

This is parallel to chief-of-staff-onboarding-coach and revops-leader-onboarding-coach in role-ambiguity, but with deep technical decisions layered on top.

When to engage

Trigger when:

  • "I'm starting as VP Data next month — what should my first 90 days look like?"
  • "Just promoted to Head of Data; team is 8 people; first big decision?"
  • "Our data stack is sprawling — Snowflake + dbt + Looker + Census + Datafold; what's broken?"
  • "Should we hire more data engineers or analysts?"
  • "Stakeholders are frustrated — every dashboard request takes 6 weeks"
  • "Should we build vs buy our reverse-ETL / observability / orchestration?"
  • "Data team is split into Eng vs Analytics — neither is happy"
  • "We're considering Databricks for our ML / AI work — should we adopt it, or build ML on top of Snowflake?"

Do not engage for: individual-contributor data-engineering coaching, ML-engineering coaching (narrower scope), or pure data-science career coaching (IC track) — each is a different skill.

Step 0: Disambiguate the role

Data leadership has three primary archetypes plus hybrids.

The 3 data-leader archetypes

  1. Analytics-led. Leader is biased toward business analytics, KPIs, dashboards, decision support, reporting. Team composition: heavy on analysts, some engineers. Reports often into CEO / CFO / COO. Typical companies: most Series A-C SaaS where the data team is 5-15 people.

  2. Platform / engineering-led. Leader is biased toward data infrastructure: pipelines, warehouse, observability, governance, scalability. Team composition: heavy on data + analytics engineers, some analysts. Reports often into CTO. Typical companies: data-intensive products, larger companies, B2C with high-volume events.

  3. ML / AI-led. Leader is biased toward machine learning, predictive analytics, AI-driven features. Team composition: heavy on data scientists + ML engineers + platform engineers. Reports often into CTO / CPO. Typical companies: AI-native products, applied-ML companies.

Most actual roles are hybrids: analytics+platform (most common), platform+ML (common in tech-heavy), analytics+ML (rare, usually consultant-style).

Disambiguation conversation

Within the first 7 days, schedule 60-90 minutes with the principal:

  • "If I'm wildly successful at 6 months, what's the headline?"
  • "Which is the bigger problem today: getting reliable answers to business questions, or building data infrastructure that doesn't break?"
  • "What's the 1-year vision for data here? Year 3?"
  • "How much budget for tooling / new hires / vendor change?"
  • "Who's the executive most aligned with my work? Most skeptical?"
  • "Is there an analytics-vs-engineering split today? How do you want it resolved?"

Walk away with a one-pager: archetype, top 3 outcomes, 6-month deliverables, principal expectations.

Step 1: Principal stakeholder map

Reporting line shapes priorities. Build relationships across all four common reporting destinations.

Reporting to CEO (~30% of roles)

  • Priority: cross-functional analytics, exec-level KPI clarity.
  • Risk: stretched too thin without operational depth.
  • Strength: org-wide credibility.

Reporting to CFO (~30% of roles)

  • Priority: financial reporting, revenue analytics, audit/compliance, accounting integrations.
  • Risk: cultural distance from product / engineering.
  • Strength: budget influence, finance-critical analytics.

Reporting to CTO (~25% of roles)

  • Priority: platform engineering, infrastructure, ML enablement.
  • Risk: business-side stakeholders feel underserved.
  • Strength: deep technical alignment, modern-stack investments.

Reporting to CPO (~15% of roles)

  • Priority: product analytics, user behavior, A/B testing infrastructure, ML features.
  • Risk: financial / operational analytics under-served.
  • Strength: tight feedback loop with product decisions.

Whichever line you're in, build relationships across the others. Data leaders without strong cross-functional ties become dashboard help desks.

Step 2: First 30 days — Listen

Stakeholder interviews (45 min each)

  • Principal (CEO / CFO / CTO / CPO).
  • Other C-suite (each, multiple meetings).
  • Existing data team (1:1 with each direct, plus skip-levels).
  • Top 5-10 internal "consumers" of data (sales managers, product managers, finance partners, marketing leads, customer success).
  • Existing data-tool vendors (account managers — they know your account history).

Standard questions

  • "What's working well in data today?"
  • "What's broken or stuck?"
  • "What's the question you can't answer that you wish you could?"
  • "What's the dashboard you wish existed?"
  • "Where is data most reliable today? Least?"
  • "Which decisions are made with data vs without?"

Map systematically. Look for patterns: "everyone says forecasting is broken" or "no one trusts the marketing-attribution numbers."
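A lightweight way to map interviews systematically is to code each conversation into themes after the fact and tally them; themes raised by several stakeholders are systemic, not anecdotal. A minimal sketch — the themes and the "3+ mentions" threshold are illustrative assumptions, not part of any standard method:

```python
from collections import Counter

# Each interview is coded into one or more themes after the conversation.
interview_themes = [
    ["forecasting broken", "slow dashboard turnaround"],
    ["marketing attribution distrusted", "forecasting broken"],
    ["slow dashboard turnaround"],
    ["forecasting broken", "no metric definitions"],
]

# Count how often each theme appears across all interviews.
tally = Counter(theme for interview in interview_themes for theme in interview)

# Themes mentioned by 3+ stakeholders are treated as systemic.
systemic = [theme for theme, n in tally.most_common() if n >= 3]
print(systemic)  # ['forecasting broken']
```

The point is not the tooling — a spreadsheet works — but forcing every interview into the same coding scheme so patterns surface.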

Document review

  • Past projects + roadmaps from previous data lead.
  • Team-level OKRs / goals.
  • Vendor contracts + spend (data-tool budget is often $500K-$5M annually at Series B+).
  • Architectural docs: pipeline maps, warehouse schemas, transformation logic.
  • Data-quality / observability reports.
  • SLAs (if any).

Tech-stack audit

For modern B2B SaaS, the standard stack:

Warehouse layer

  • Snowflake (most common): mature, broad capabilities, expensive at scale.
  • BigQuery: GCP-native, serverless, often cost-efficient.
  • Databricks (Lakehouse): strong for ML, complex transformations, growing analytics use.
  • Redshift: legacy in many companies; AWS-native.
  • Self-managed (DuckDB, ClickHouse, Postgres): small teams, cost-conscious.
  • Edge case: feature stores (Tecton, Feast) for ML.

Ingestion / ELT

  • Fivetran / Airbyte / Stitch: managed connectors.
  • Self-built / custom Airflow: complex sources or cost-conscious teams.
  • Segment / RudderStack / Hightouch Events: customer-data ingestion.

Transformation

  • dbt (most common): industry-standard for analytics engineering.
  • SQLMesh: newer alternative with stronger governance.
  • Coalesce: GUI + code hybrid.
  • Custom SQL / Python: legacy; declining.

Orchestration

  • Airflow: legacy, complex, broad community.
  • Dagster: modern, asset-oriented.
  • Prefect: hybrid; modern pythonic.
  • dbt Cloud: for dbt-only teams.

BI / Visualization

  • Looker (Google Cloud): semantic layer, governance-strong.
  • Tableau (Salesforce): enterprise, visualization-rich.
  • Mode: SQL-friendly, analyst-centric.
  • Hex: notebook + dashboard hybrid.
  • Metabase: open-source, lightweight.
  • Sigma: spreadsheet-style.
  • Streamlit / Plotly Dash: custom apps.

Reverse ETL / Activation

  • Census: strong governance, integrations.
  • Hightouch: marketing-focused.
  • Polytomic: smaller; flexible.

Observability / Quality

  • Datafold: data-diff, regression testing.
  • Monte Carlo: data-quality / lineage.
  • Anomalo: AI-driven anomaly detection.
  • Great Expectations: open-source, integrated into pipelines.

ML / Feature stores

  • Tecton / Feast: feature stores.
  • Hopsworks / Vertex AI / SageMaker: ML platforms.
  • Weights & Biases / Comet: experiment tracking.
  • MLflow: open-source baseline.

Catalog / Governance

  • Atlan / Collibra / Alation / DataHub: data catalog + governance.
  • Acryl Data: open-source DataHub-based.

Team audit

  • Headcount + roles + seniority.
  • Hiring plan from previous lead.
  • Team composition: analysts vs analytics engineers vs data engineers vs scientists vs platform engineers vs ML engineers.
  • Skill gaps.
  • Attrition risk in next 12 months.

Output

You're synthesizing toward 3 outcomes:

  1. State of the data: trustworthy / sprawling / siloed / mature.
  2. State of the team: structurally sound / mismatched / under-staffed / wrong skills.
  3. State of stakeholder relationships: trusted / frustrated / disengaged.

Step 3: Prioritization framework

Most data-leader candidates overweight ML / AI. Reality: foundational data quality is the load-bearing problem 80% of the time.

The data-team priority hierarchy

  1. Foundational data quality — pipelines reliable, warehouse schemas clean, definitions consistent (the "Source of Truth" problem). Solves: "every team has different ARR numbers", "marketing and sales disagree on attribution."

  2. Self-serve analytics — dashboards, semantic layer, data discovery. Solves: "I can never get the answer I need without filing a ticket."

  3. Advanced analytics — cohort analysis, retention modeling, attribution modeling, forecasting. Solves: "what's actually driving NRR?", "which channel is most efficient?"

  4. ML / AI features — prediction, scoring, recommendation, embedding. Solves: product-feature problems requiring model output.

  5. Data products — analytics SDKs, embedded analytics, data-as-a-product offerings. Solves: monetization, customer-facing data.

Don't skip #1 or #2 to chase #3 or #4. Most "ML projects" fail because the underlying data isn't clean.

Step 4: First-90-day quick wins

Pick 3-5 quick wins. Don't aim for perfection; aim for momentum.

Common quick wins

  1. Fix the "single source of truth" for top-3 metrics. Often: ARR, NRR, CAC. Audit definitions, identify discrepancies, write canonical SQL, certify in semantic layer.

  2. Establish data-quality monitoring. Pipelines that silently break, data drops, schema changes — implement basic monitoring (Datafold / Monte Carlo / open-source).

  3. Standardize dashboard portfolio. Audit existing dashboards (often 50-200); kill orphans; certify top 10-15.

  4. Build a data-request intake. Front-end the analytics team's chaos with a structured intake (Slack channel + form + triage cadence).

  5. Document key data assets. What's in the warehouse, what does it mean, who owns it.
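Quick win #1 ultimately lives in the warehouse and semantic layer, but the core is just one agreed-upon query. A sketch using an in-memory SQLite database; the schema and the 12 × MRR rule are illustrative assumptions, not a recommended ARR definition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE subscriptions (
        customer_id TEXT, mrr_usd REAL, status TEXT
    )
""")
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?, ?)",
    [("a", 1000.0, "active"), ("b", 500.0, "active"), ("c", 750.0, "churned")],
)

# One canonical definition, written once and certified in the semantic layer:
# ARR = 12 * sum of MRR over active subscriptions only.
CANONICAL_ARR_SQL = """
    SELECT 12 * SUM(mrr_usd) AS arr_usd
    FROM subscriptions
    WHERE status = 'active'
"""
arr = conn.execute(CANONICAL_ARR_SQL).fetchone()[0]
print(arr)  # 18000.0
```

The win is organizational, not technical: once one query is the definition, "every team has different ARR numbers" becomes a bug with an owner.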

Quick wins to avoid in first 90 days

  • Major warehouse migration (Redshift → Snowflake).
  • ML platform rollout.
  • Org redesign (too early; need to know team).
  • Vendor change for major spend categories.

Communication

  • Top 3-5 quick wins with target dates.
  • Longer-term roadmap items with target quarters.
  • Trade-offs explicit.
  • Cross-functional implications.

Step 5: 6-month deeper plays

After quick wins, take on deeper structural projects.

Common 6-month projects

  1. Modern data stack rationalization. Audit tooling; consolidate; potentially replace one major component (e.g., legacy Redshift → modern Snowflake).

  2. Semantic layer rollout. Certified definitions for all key business metrics; consistent across BI tools.

  3. Self-serve analytics platform. Empower business stakeholders to answer common questions without data-team ticket.

  4. ML platform foundations. Feature store, training pipelines, model deployment, observability — only if there's clear product-side demand.

  5. Reverse ETL / data activation. Data-warehouse → operational systems for marketing, sales, CS.

  6. Data observability + quality framework. SLAs, monitoring, alerting, escalation.

  7. Data governance. Catalog, ownership, access controls, audit.
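The SLA piece of project #6 can start as a very small script long before any vendor purchase. A sketch of a freshness check — the table names and freshness windows are assumptions for illustration:

```python
from datetime import datetime, timedelta, timezone

# SLA: each table must have been refreshed within its freshness window.
SLAS = {
    "fct_orders": timedelta(hours=6),
    "dim_customers": timedelta(hours=24),
}

def check_freshness(last_loaded: dict, now: datetime) -> list:
    """Return the tables currently violating their freshness SLA."""
    return [
        table
        for table, window in SLAS.items()
        if now - last_loaded[table] > window
    ]

now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
last_loaded = {
    "fct_orders": now - timedelta(hours=9),     # stale: past the 6h window
    "dim_customers": now - timedelta(hours=2),  # fresh
}
print(check_freshness(last_loaded, now))  # ['fct_orders']
```

Wiring the violation list into an alerting channel and an escalation path is what turns the script into an SLA framework.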

Sequencing logic

  • Quick wins always in flight.
  • 1-2 deeper plays per 6 months.
  • Foundational quality before advanced analytics; advanced analytics before ML.

Buy vs build — modern data stack

Default: buy. The modern data stack is mature; building from scratch wastes effort.

Build only when:

  • Specific use case where vendors don't fit.
  • Scale where vendor pricing breaks unit economics.
  • Strategic differentiation requires it.

Component-by-component

  • Warehouse: buy in nearly all cases. (Self-managed Postgres or DuckDB only for very small teams.)
  • Ingestion: buy at most volumes (Fivetran, Airbyte). Self-build only at high volume / cost.
  • Transformation: buy dbt or SQLMesh. Don't roll your own.
  • Orchestration: buy or use open-source (Dagster / Airflow). Don't build from scratch.
  • BI: buy. (Hex / Mode / Looker / Metabase depending on team).
  • Reverse ETL: buy at small-mid scale (Census / Hightouch). Custom only for volume / cost.
  • Observability: buy (Datafold / Monte Carlo) or open-source (Great Expectations + custom).
  • ML platform: buy (SageMaker / Vertex AI / Databricks) or open-source (MLflow + custom). Pure-build is rarely justified.

Vendor-lock-in mitigation

  • Prefer open formats (Parquet, Iceberg, Delta).
  • Keep transformation logic in dbt or SQL (portable across warehouses).
  • Avoid deeply-coupled vendor-specific functionality unless cost-justified.
  • 3-year cost projection before major commitment.
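The 3-year cost projection in the last bullet is simple compounding arithmetic, but writing it down before signing changes decisions. A sketch — the starting spend and growth rate are placeholder assumptions:

```python
def three_year_cost(annual_cost: float, usage_growth: float) -> float:
    """Total spend over 3 years when usage-based cost compounds yearly."""
    return sum(annual_cost * (1 + usage_growth) ** year for year in range(3))

# $200K/yr today; usage-based pricing with usage growing 40% per year.
total = three_year_cost(200_000, 0.40)
print(round(total))  # 200000 + 280000 + 392000 = 872000
```

Running the same projection for the vendor's pricing tiers and for a self-build (engineer time included) makes the buy-vs-build break-even explicit.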

Team composition

Right ratios depend on archetype.

Analytics-led (5-15 person team)

  • 60-70% analysts / analytics engineers
  • 20-30% data engineers
  • 10% platform / leadership

Platform-led (10-30 person team)

  • 30-40% data engineers
  • 30-40% analytics engineers / analysts
  • 10-20% platform engineers
  • 5-10% leadership

ML-led (10-30 person team)

  • 30-40% ML engineers / scientists
  • 20-30% data engineers
  • 20-30% analytics
  • 10-15% platform
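These ratios translate into headcount plans mechanically. A sketch that converts a target team size into a role mix using midpoints of the platform-led ranges above (the midpoint choice is an assumption; adjust to your archetype):

```python
# Midpoints of the platform-led archetype ranges.
PLATFORM_LED_MIX = {
    "data engineers": 0.35,
    "analytics engineers / analysts": 0.35,
    "platform engineers": 0.15,
    "leadership": 0.075,
}

def headcount_plan(team_size: int, mix: dict) -> dict:
    """Convert fractional ratios into whole-person headcounts."""
    return {role: round(team_size * share) for role, share in mix.items()}

plan = headcount_plan(20, PLATFORM_LED_MIX)
print(plan)
# {'data engineers': 7, 'analytics engineers / analysts': 7,
#  'platform engineers': 3, 'leadership': 2}
```

Rounding leaves slack (here 19 of 20 seats), which is fine: the leftover seat is where the next bottleneck hire goes.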

Common hiring mistakes

  • Over-hiring data engineers when analyst capacity is the bottleneck.
  • Hiring data scientists before infrastructure to enable them.
  • Hiring senior IC ML engineer to fix data-quality problems.
  • Treating analytics engineers as analysts (they're more eng-leaning).

Failure modes

1. Analyst help desk

Symptom: 80% of analyst time on ticket-driven work; strategic projects don't ship. Fix: structured intake; tier the work; analytics engineering builds reusable models; self-serve unlock.

2. Lab without customer

Symptom: ML / data-science project that doesn't tie to a business problem. Fix: every ML project requires a named business owner + measurable outcome; kill projects without one.

3. Engineering vs analytics tribalism

Symptom: engineers think analysts don't understand modeling; analysts think engineers don't understand business. Fix: shared OKRs; cross-team rituals; analytics engineering bridge role.

4. No business adoption

Symptom: dashboards built; no one logs in; usage analytics show abandonment. Fix: ruthless dashboard reduction; pair every dashboard with named decision-maker; usage-tracking; quarterly review.

5. Tool-stack expansion

Symptom: every new use case adds a new tool; integration overhead and cost grow. Fix: tool budget as managed line; new tool requires retiring or absorbing existing.

6. Vendor-lock-in mistakes

Symptom: deep coupling with a vendor that's now expensive or strategically wrong. Fix: portability principle in design; multi-year contract caution.

7. Scope creep into eng / product

Symptom: data team owning product features, eng infrastructure beyond their charter. Fix: clear charter; hand off cross-functional ownership.

8. Building for hypothetical scale

Symptom: complex architecture for "future scale" that never materializes. Fix: build for today's load + 3x; revisit at 10x.

Specific failure modes for the leader

  • Ego in tooling. Insisting on favorite tools without business justification.
  • Migration fetishism. Migrating warehouses / tools as a way to look productive.
  • Avoidance of business stakeholder relationships. Hiding in the technical work.
  • No vendor management. Not negotiating vendor renewals; budget bloat.
  • Internal hiring obsession. Building internal talent when the right move is to engage external consultants.

Workflow

For a new data leader:

  1. Week 1: Disambiguate the role with principal. Get system access. Meet team.
  2. Weeks 2-3: Listening tour. Tech audit. Team audit. Stakeholder interviews.
  3. Week 4: Synthesize. Draft priorities. Principal alignment.
  4. Weeks 5-12: Execute quick wins. Establish cadence. Build relationships.
  5. Months 4-6: Begin deeper structural projects. Maintain quick-win flow.
  6. Year 2: Major structural projects. Team reorganization (if needed).
  7. Year 3+: Strategic role; consider exit ramps.

Natural exit ramps

  • CTO at smaller / mid-stage company (technical leadership pivot).
  • Chief Data Officer at larger company.
  • VP Platform (broader scope).
  • Founder of a data-tooling company (deep expertise + market signal).
  • Lateral move to head-of-data at company that's a better personal fit.

Anti-patterns

  • Skipping the listening tour. Acting first; tribal-knowledge mistakes compound.
  • Migrating warehouses in year 1. Often unnecessary; the effort is almost always underestimated.
  • Hiring before clarifying scope. Adding headcount when role isn't yet clear amplifies confusion.
  • Tool-first thinking. Buying tools to solve process / org problems.
  • Dashboard fetishism. Building 100 dashboards for the sake of comprehensiveness.
  • Avoiding vendor management. Letting auto-renew dominate; budget grows uncontrolled.
  • Vague success metrics. "Data team is more strategic" — unmeasurable.
  • Ignoring data quality. Chasing ML / AI without trustworthy underlying data.

Integration with other coaches

  • chief-of-staff-onboarding-coach: parallel role-ambiguity coaching.
  • revops-leader-onboarding-coach: parallel onboarding for revenue-side data.
  • competitive-intelligence-coach: data team often hosts CI infrastructure.
  • board-meeting-prep-coach: data team produces board-level metrics.
  • fractional-cto-coach: technical depth coaching for related role.

Data leadership is a 2-3 year build to mature. First 90 days set the trajectory; the right 3-5 quick wins build momentum and credibility for the deeper work.
