mlops-engineer

Expert in ML infrastructure, automation, and production ML systems.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "mlops-engineer" with this command: npx skills add anton-abyzov/specweave/anton-abyzov-specweave-mlops-engineer

MLOps Engineer

⚠️ Chunking Rule

A large MLOps platform can run to 1000+ lines of code. Generate ONE component per response, in this order:

  1. Experiment Tracking → 2. Model Registry → 3. Training Pipelines → 4. Deployment → 5. Monitoring

Core Capabilities

ML Pipelines

  • Kubeflow Pipelines: K8s-native ML workflows

  • Apache Airflow: DAG-based orchestration

  • Prefect: Modern dataflow automation

  • MLflow Projects: Reproducible ML runs

Model Registry

  • Model versioning and staging

  • Model metadata and lineage

  • Promotion workflows (dev → staging → prod)

  • A/B testing infrastructure
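As one illustration of the A/B testing item above, the sketch below routes a stable fraction of users to a candidate model by hashing the user ID. The function name `assign_variant`, the variant labels, and the 10% default share are assumptions made for this example, not part of the skill or of any registry API.

```python
import hashlib


def assign_variant(user_id: str, candidate_share: float = 0.1) -> str:
    """Route a stable fraction of users to the candidate model (hypothetical helper)."""
    # Hash the user ID to a number in [0, 1); the same user always lands in the same bucket.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    return "candidate" if bucket < candidate_share else "production"
```

Hashing rather than random sampling keeps each user on the same model version across requests, which makes per-variant metrics easier to compare.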

Deployment

  • Docker containerization

  • Kubernetes deployment (Seldon, KServe)

  • Serverless (AWS Lambda, GCP Functions)

  • Edge deployment (ONNX, TensorRT)

Monitoring

  • Model performance drift detection

  • Data quality monitoring

  • Inference latency tracking

  • Alerting and auto-retraining triggers
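Drift detection can take many forms. A minimal sketch, assuming a single numeric feature and the Population Stability Index (PSI) as the drift score, might look like the following. The function name, the bin count, and the smoothing floor are illustrative choices, and the commonly quoted PSI threshold of about 0.2 is a rule of thumb rather than a standard.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """Compare two samples of one feature; larger values mean a bigger shift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all expected values are equal

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range fall into the edge buckets.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty buckets.
        return [max(c / len(values), 1e-6) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

In use, `expected` would be a sample from training data and `actual` a recent window of production inputs; an alert or retraining trigger could fire when the score crosses a threshold chosen for the feature.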

CI/CD for ML

  • Automated testing (unit, integration, model)

  • Model validation gates

  • Automated retraining pipelines

  • GitOps for ML
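A model validation gate can be sketched as a pure function that a CI job calls before promotion. The metric names (`auc`, `p95_latency_ms`) and the thresholds below are assumptions for illustration; a real gate would use whatever metrics and limits the team has agreed on.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GateResult:
    passed: bool
    reasons: List[str] = field(default_factory=list)


def validation_gate(
    candidate: Dict[str, float],
    baseline: Dict[str, float],
    min_auc: float = 0.80,
    max_auc_drop: float = 0.01,
    max_latency_ms: float = 100.0,
) -> GateResult:
    """Return whether the candidate may be promoted, with reasons for any failure."""
    reasons: List[str] = []
    if candidate["auc"] < min_auc:
        reasons.append("auc below absolute minimum")
    if candidate["auc"] < baseline["auc"] - max_auc_drop:
        reasons.append("auc regressed against baseline")
    if candidate["p95_latency_ms"] > max_latency_ms:
        reasons.append("p95 latency above budget")
    return GateResult(passed=not reasons, reasons=reasons)
```

Returning the reasons alongside the pass/fail flag lets the pipeline log why a promotion was blocked instead of only failing the job.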

Best Practices

Kubeflow Pipeline Example

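from kfp import compiler, dsl

@dsl.component
def preprocess_data(input_path: str, output_path: str) -> str:
    # Data preprocessing logic goes here.
    # Return the processed-data location so downstream steps can consume it.
    return output_path

@dsl.component
def train_model(data_path: str, model_path: str):
    # Training logic goes here.
    pass

@dsl.pipeline(name="ml-training-pipeline")
def ml_pipeline(input_data: str):
    preprocess = preprocess_data(input_path=input_data, output_path="/data/processed")
    # preprocess.output is the component's single return value;
    # passing it also makes train_model run after preprocess_data.
    train = train_model(data_path=preprocess.output, model_path="/models")

# Compile the pipeline to a YAML spec that can be uploaded to Kubeflow Pipelines.
compiler.Compiler().compile(ml_pipeline, package_path="ml_pipeline.yaml")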

Model Registry with MLflow

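import mlflow

# Register model
# run_id is a placeholder: set it to the ID of the MLflow run that logged the model.
run_id = "<run-id>"
model_uri = f"runs:/{run_id}/model"
mlflow.register_model(model_uri, "fraud-detection-model")

# Transition to production
# Note: model stages are deprecated as of MLflow 2.9 in favor of model version aliases;
# this call applies to MLflow versions that still support stages.
client = mlflow.tracking.MlflowClient()
client.transition_model_version_stage(
    name="fraud-detection-model",
    version=3,
    stage="Production",
)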

Kubernetes Deployment (Seldon)

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: fraud-detector
spec:
  predictors:
    - name: default
      replicas: 3
      graph:
        name: model
        type: MODEL
        modelUri: s3://models/fraud-v3

DAG Patterns

Training DAG

data_ingestion → validation → preprocessing → training → evaluation → registration
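The training DAG above can be expressed as a dependency map and ordered with the standard library's `graphlib`. This is only a sketch of the ordering step: the names `TRAINING_DAG` and `execution_order` are made up for the example, and a real orchestrator (Airflow, Kubeflow, Prefect) would handle scheduling and retries itself.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each key maps a step to the set of steps it depends on.
TRAINING_DAG: Dict[str, Set[str]] = {
    "data_ingestion": set(),
    "validation": {"data_ingestion"},
    "preprocessing": {"validation"},
    "training": {"preprocessing"},
    "evaluation": {"training"},
    "registration": {"evaluation"},
}


def execution_order(dag: Dict[str, Set[str]]) -> List[str]:
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())
```

Because `TopologicalSorter` raises `CycleError` on circular dependencies, the same helper doubles as a cheap sanity check when steps are added to the map.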

Inference DAG

request → preprocessing → model_inference → postprocessing → response
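The inference path above is a straight chain, so it can be sketched as plain function composition. Everything specific here is hypothetical: the feature names, the hand-written scoring rule standing in for a real model, and the 0.5 flagging threshold.

```python
from typing import Dict, List, Union


def preprocess(request: Dict[str, float]) -> List[float]:
    # Put features in the fixed order the model expects (hypothetical feature names).
    return [request["amount"], request["account_age_days"]]


def model_inference(features: List[float]) -> float:
    # Stand-in for a real model: a hand-written score in [0, 1].
    amount, account_age_days = features
    return min(1.0, amount / 10_000) * (1.0 if account_age_days < 30 else 0.5)


def postprocess(score: float) -> Dict[str, Union[float, bool]]:
    # Turn the raw score into the response payload.
    return {"fraud_score": round(score, 3), "flagged": score >= 0.5}


def handle_request(request: Dict[str, float]) -> Dict[str, Union[float, bool]]:
    """request → preprocessing → model_inference → postprocessing → response"""
    return postprocess(model_inference(preprocess(request)))
```

Keeping each stage a separate function makes it straightforward to unit-test preprocessing and postprocessing without loading a model.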

Monitoring DAG

collect_metrics → detect_drift → alert_if_needed → trigger_retrain
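One way to read the monitoring DAG above is as four small functions run in sequence. The sketch below assumes accuracy on a labelled window as the monitored metric, a fixed tolerance for the drop, and in-memory lists standing in for an alerting system and a retraining queue; all of these are illustrative choices.

```python
from typing import List, Sequence


def collect_metrics(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return accuracy over a window of labelled predictions."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def detect_drift(accuracy: float, baseline: float, tolerance: float = 0.05) -> bool:
    # Drift here means accuracy fell more than `tolerance` below the baseline.
    return baseline - accuracy > tolerance


def alert_if_needed(drifted: bool, accuracy: float, alerts: List[str]) -> bool:
    if drifted:
        alerts.append(f"accuracy dropped to {accuracy:.2f}")
    return drifted


def trigger_retrain(alerted: bool, retrain_queue: List[str], model_name: str) -> None:
    if alerted:
        retrain_queue.append(model_name)


def run_monitoring_cycle(
    predictions: Sequence[int],
    labels: Sequence[int],
    baseline: float,
    alerts: List[str],
    retrain_queue: List[str],
    model_name: str = "fraud-detection-model",
) -> float:
    """collect_metrics → detect_drift → alert_if_needed → trigger_retrain"""
    accuracy = collect_metrics(predictions, labels)
    drifted = detect_drift(accuracy, baseline)
    alerted = alert_if_needed(drifted, accuracy, alerts)
    trigger_retrain(alerted, retrain_queue, model_name)
    return accuracy
```

In a scheduled job, `run_monitoring_cycle` would be called once per window, with the alert and queue lists replaced by calls to the team's real alerting and pipeline-trigger mechanisms.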

When to Use

  • Building ML training pipelines

  • Setting up model registry

  • Deploying models to production

  • ML monitoring and observability

  • CI/CD for machine learning

  • Infrastructure automation for ML

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
