Databricks Model Serving

Install skill "model-serving" with this command: npx skills add databricks-solutions/ai-dev-kit/databricks-solutions-ai-dev-kit-model-serving

Deploy MLflow models and AI agents to scalable REST API endpoints.

Quick Decision: What Are You Deploying?

| Model Type | Pattern | Reference |
|---|---|---|
| Traditional ML (sklearn, xgboost) | mlflow.sklearn.autolog() | 1-classical-ml.md |
| Custom Python model | mlflow.pyfunc.PythonModel | 2-custom-pyfunc.md |
| GenAI Agent (LangGraph, tool-calling) | ResponsesAgent | 3-genai-agents.md |

Prerequisites

  • DBR 16.1+ recommended (pre-installed GenAI packages)

  • Unity Catalog enabled workspace

  • Model Serving enabled

Reference Files

| Topic | File | When to Read |
|---|---|---|
| Classical ML | 1-classical-ml.md | sklearn, xgboost, autolog |
| Custom PyFunc | 2-custom-pyfunc.md | Custom preprocessing, signatures |
| GenAI Agents | 3-genai-agents.md | ResponsesAgent, LangGraph |
| Tools Integration | 4-tools-integration.md | UC Functions, Vector Search |
| Development & Testing | 5-development-testing.md | MCP workflow, iteration |
| Logging & Registration | 6-logging-registration.md | mlflow.pyfunc.log_model |
| Deployment | 7-deployment.md | Job-based async deployment |
| Querying Endpoints | 8-querying-endpoints.md | SDK, REST, MCP tools |
| Package Requirements | 9-package-requirements.md | DBR versions, pip |

Quick Start: Deploy a GenAI Agent

Step 1: Install Packages (in notebook or via MCP)

%pip install -U mlflow==3.6.0 databricks-langchain langgraph==0.3.4 databricks-agents pydantic
dbutils.library.restartPython()

Or via MCP:

execute_databricks_command(code="%pip install -U mlflow==3.6.0 databricks-langchain langgraph==0.3.4 databricks-agents pydantic")
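The install command above pins only some of its packages, while the Common Issues table recommends exact versions in pip_requirements when the model is logged. A minimal sketch of a check that flags unpinned entries before they reach pip_requirements; the helper name `unpinned` and the regular expression are illustrative assumptions, not part of any Databricks or MLflow API:

```python
import re

# Matches "name==version", optionally with extras such as "pkg[extra]==1.2.3".
_PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9,._-]+\])?==[^=\s]+$")


def unpinned(requirements):
    """Return the entries that do not pin an exact version with '=='."""
    return [r for r in requirements if not _PINNED.match(r.strip())]


# The package list from Step 1, split into individual requirement strings.
step1 = [
    "mlflow==3.6.0",
    "databricks-langchain",
    "langgraph==0.3.4",
    "databricks-agents",
    "pydantic",
]

print(unpinned(step1))
# → ['databricks-langchain', 'databricks-agents', 'pydantic']
```

Which versions to pin for the flagged packages depends on the cluster runtime; 9-package-requirements.md is the place to look.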

Step 2: Create Agent File

Create agent.py locally with ResponsesAgent pattern (see 3-genai-agents.md).

Step 3: Upload to Workspace

upload_folder(
    local_folder="./my_agent",
    workspace_folder="/Workspace/Users/you@company.com/my_agent"
)

Step 4: Test Agent

run_python_file_on_databricks(
    file_path="./my_agent/test_agent.py",
    cluster_id="<cluster_id>"
)

Step 5: Log Model

run_python_file_on_databricks(
    file_path="./my_agent/log_model.py",
    cluster_id="<cluster_id>"
)

Step 6: Deploy (Async via Job)

See 7-deployment.md for job-based deployment that doesn't time out.

Step 7: Query Endpoint

query_serving_endpoint(
    name="my-agent-endpoint",
    messages=[{"role": "user", "content": "Hello!"}]
)

Quick Start: Deploy a Classical ML Model

import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression

# Enable autolog with auto-registration
mlflow.sklearn.autolog(
    log_input_examples=True,
    registered_model_name="main.models.my_classifier"
)

# Train - model is logged and registered automatically
# (X_train and y_train are assumed to be defined earlier)
model = LogisticRegression()
model.fit(X_train, y_train)

Then deploy via UI or SDK. See 1-classical-ml.md.

MCP Tools

If MCP tools are not available, use the SDK/CLI examples in the reference files listed above.

Development & Testing

| Tool | Purpose |
|---|---|
| upload_folder | Upload agent files to workspace |
| run_python_file_on_databricks | Test agent, log model |
| execute_databricks_command | Install packages, quick tests |

Deployment

| Tool | Purpose |
|---|---|
| create_job | Create deployment job (one-time) |
| run_job_now | Kick off deployment (async) |
| get_run | Check deployment job status |

Querying

| Tool | Purpose |
|---|---|
| get_serving_endpoint_status | Check if endpoint is READY |
| query_serving_endpoint | Send requests to endpoint |
| list_serving_endpoints | List all endpoints |

Common Workflows

Check Endpoint Status After Deployment

get_serving_endpoint_status(name="my-agent-endpoint")

Returns:

{ "name": "my-agent-endpoint", "state": "READY", "served_entities": [...] }
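Since deployment can take around 15 minutes, the status check is usually repeated until the endpoint reports READY. A minimal polling sketch, assuming the status call returns a dict with a "state" key as in the sample response above; `wait_until_ready`, its parameters, and the default timeout are hypothetical, and `get_status` is a stand-in for whatever callable wraps get_serving_endpoint_status in your environment:

```python
import time


def wait_until_ready(get_status, name, timeout_s=1800, interval_s=30, sleep=time.sleep):
    """Poll get_status(name=...) until its "state" is READY or the timeout expires.

    get_status is a stand-in for get_serving_endpoint_status and is assumed
    to return a dict with a "state" key.
    """
    waited = 0
    while True:
        status = get_status(name=name)
        if status.get("state") == "READY":
            return status
        if waited >= timeout_s:
            raise TimeoutError(
                f"{name} not READY after {waited}s (last state: {status.get('state')})"
            )
        sleep(interval_s)
        waited += interval_s


# Demo with a fake status function that becomes READY on the third call.
_states = iter(["NOT_READY", "NOT_READY", "READY"])


def fake_status(name):
    return {"name": name, "state": next(_states), "served_entities": []}


result = wait_until_ready(fake_status, "my-agent-endpoint", sleep=lambda s: None)
print(result["state"])  # → READY
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in normal use the default `time.sleep` applies.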

Query a Chat/Agent Endpoint

query_serving_endpoint(
    name="my-agent-endpoint",
    messages=[
        {"role": "user", "content": "What is Databricks?"}
    ],
    max_tokens=500
)

Query a Traditional ML Endpoint

query_serving_endpoint(
    name="sklearn-classifier",
    dataframe_records=[
        {"age": 25, "income": 50000, "credit_score": 720}
    ]
)
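When the MCP tool is not available, the same query can be sent over REST. The sketch below only builds the request, assuming the standard invocations route (`/serving-endpoints/<name>/invocations`) and a bearer token; the workspace URL and token are placeholders, and `build_invocation_request` is a hypothetical helper name:

```python
import json
import urllib.request


def build_invocation_request(host, endpoint_name, token, records):
    """Build (but do not send) a POST request with a dataframe_records payload."""
    url = f"{host.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_invocation_request(
    host="https://example.cloud.databricks.com",  # placeholder workspace URL
    endpoint_name="sklearn-classifier",
    token="<token>",  # placeholder; read from a secret store in practice
    records=[{"age": 25, "income": 50000, "credit_score": 720}],
)
print(req.full_url)
# To send: urllib.request.urlopen(req), then json.load() the response.
```

See 8-querying-endpoints.md for the SDK and REST variants the skill itself documents.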

Common Issues

| Issue | Solution |
|---|---|
| Invalid output format | Use self.create_text_output_item(text, id), NOT raw dicts! |
| Endpoint NOT_READY | Deployment takes ~15 min. Use get_serving_endpoint_status to poll. |
| Package not found | Specify exact versions in pip_requirements when logging the model |
| Tool timeout | Use job-based deployment, not synchronous calls |
| Auth error on endpoint | Specify resources in log_model to enable automatic auth passthrough |
| Model not found | Check Unity Catalog path: catalog.schema.model_name |
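For the "Model not found" case, a quick check that the name has the three-level catalog.schema.model_name form can rule out a malformed path before looking further. A minimal sketch; `split_uc_model_name` is a hypothetical helper, and it deliberately ignores quoted identifiers that contain dots:

```python
def split_uc_model_name(full_name):
    """Split 'catalog.schema.model_name' into its three parts, or raise ValueError."""
    parts = full_name.split(".")
    if len(parts) != 3 or any(not p.strip() for p in parts):
        raise ValueError(f"expected catalog.schema.model_name, got {full_name!r}")
    return tuple(parts)


print(split_uc_model_name("main.models.my_classifier"))
# → ('main', 'models', 'my_classifier')
```

A name that passes this check can still be missing from the catalog; the check only catches two-level names, stray dots, and empty segments.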

Critical: ResponsesAgent Output Format

WRONG - raw dicts don't work:

return ResponsesAgentResponse(output=[{"role": "assistant", "content": "..."}])

CORRECT - use helper methods:

return ResponsesAgentResponse(
    output=[self.create_text_output_item(text="...", id="msg_1")]
)

Available helper methods:

  • self.create_text_output_item(text, id): text responses

  • self.create_function_call_item(id, call_id, name, arguments): tool calls

  • self.create_function_call_output_item(call_id, output): tool results
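A small pre-flight check can catch the raw-dict mistake in tests before the agent is logged. The sketch below assumes that items built by the helper methods carry a "type" field, as Responses-style output items do, and that the WRONG pattern above lacks one; `find_untyped_items` is a hypothetical helper, and the `typed` value is a hand-written stand-in for the assumed shape, not the output of an actual MLflow call:

```python
def find_untyped_items(output):
    """Return indexes of dict items that have no "type" field."""
    return [
        i
        for i, item in enumerate(output)
        if isinstance(item, dict) and "type" not in item
    ]


# The WRONG pattern from above: a raw chat-style dict.
wrong = [{"role": "assistant", "content": "..."}]

# Hand-written stand-in for the assumed shape of a helper-built text item.
typed = [
    {
        "type": "message",
        "id": "msg_1",
        "role": "assistant",
        "content": [{"type": "output_text", "text": "..."}],
    }
]

print(find_untyped_items(wrong))  # → [0]
print(find_untyped_items(typed))  # → []
```

This is only a heuristic for local tests; building items with the helper methods remains the documented route.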

Resources

  • Model Serving Documentation

  • MLflow 3 ResponsesAgent

  • Agent Framework
