# Databricks Deploy Integration

## Overview
Deploy Databricks jobs and pipelines using Databricks Asset Bundles (DABs). Asset Bundles provide infrastructure-as-code for deploying jobs, notebooks, DLT pipelines, and ML models across workspaces with proper environment isolation and CI/CD integration.
## Prerequisites

- Databricks CLI v0.200+ installed (`databricks` command)
- Workspace access with appropriate permissions
- Service principal for automated deployments
- `databricks.yml` bundle configuration
## Instructions

### Step 1: Initialize Asset Bundle

```bash
# Create a new bundle from a template
databricks bundle init
```
Or create `databricks.yml` manually:

```yaml
# databricks.yml
bundle:
  name: etl-pipeline

workspace:
  host: https://myworkspace.cloud.databricks.com

resources:
  jobs:
    daily_etl:
      name: daily-etl-${bundle.target}
      schedule:
        quartz_cron_expression: "0 0 6 * * ?"
        timezone_id: "America/New_York"
      tasks:
        - task_key: extract
          notebook_task:
            notebook_path: ./src/extract.py
          new_cluster:
            spark_version: "14.3.x-scala2.12"
            node_type_id: "i3.xlarge"
            num_workers: 2
        - task_key: transform
          depends_on:
            - task_key: extract
          notebook_task:
            notebook_path: ./src/transform.py

targets:
  development:
    default: true
    workspace:
      host: https://dev.cloud.databricks.com
  staging:
    workspace:
      host: https://staging.cloud.databricks.com
  production:
    workspace:
      host: https://prod.cloud.databricks.com
```
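The `${bundle.target}` reference in the job name is substituted per target at deploy time, so the same definition yields `daily-etl-development`, `daily-etl-staging`, and `daily-etl-production`. As a rough illustration of that substitution (the real CLI resolves a much richer variable namespace; `interpolate` here is a hypothetical helper, not part of the Databricks CLI):

```python
import re

def interpolate(template: str, variables: dict) -> str:
    """Replace ${dotted.path} references with values from `variables`."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: str(variables[m.group(1)]), template)

# One resource definition, one name per deployment target
for target in ("development", "staging", "production"):
    print(interpolate("daily-etl-${bundle.target}", {"bundle.target": target}))
```

This is why a single bundle can be deployed to every workspace without name collisions between environments.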
### Step 2: Deploy to an Environment

```bash
# Validate bundle configuration
databricks bundle validate -t production

# Deploy resources (create/update jobs, notebooks)
databricks bundle deploy -t staging

# Run a specific job
databricks bundle run -t staging daily_etl

# Destroy resources in an environment
databricks bundle destroy -t development
```
### Step 3: CI/CD Pipeline

```yaml
# .github/workflows/deploy.yml
name: Deploy Databricks Bundle
on:
  push:
    branches: [main]

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle validate -t staging
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
      - run: databricks bundle deploy -t staging
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}

  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: databricks/setup-cli@main
      - run: databricks bundle deploy -t production
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_PROD_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_PROD_TOKEN }}
```
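The workflow above authenticates with a personal access token. Since the prerequisites call for a service principal in automated deployments, a deploy step can instead use the CLI's OAuth machine-to-machine environment variables (the secret names below are placeholders you would define in your repository):

```yaml
      - run: databricks bundle deploy -t production
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_PROD_HOST }}
          DATABRICKS_CLIENT_ID: ${{ secrets.DATABRICKS_SP_CLIENT_ID }}
          DATABRICKS_CLIENT_SECRET: ${{ secrets.DATABRICKS_SP_CLIENT_SECRET }}
```

This avoids long-lived user tokens in CI and ties the deployment to the service principal's permissions.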
### Step 4: Verify Deployment

```bash
# List deployed jobs
databricks jobs list --output json | jq '.[] | select(.settings.name | contains("etl"))'

# Check recent runs
databricks jobs list-runs --job-id $JOB_ID --limit 5

# Get run output
databricks jobs get-run-output $RUN_ID
```
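If `jq` is not available, the same filter can be applied in a few lines of Python. The sample payload below is illustrative, with field names assumed from the Jobs API response shape (`job_id`, `settings.name`), not captured from a real workspace:

```python
import json

# Sample payload shaped like `databricks jobs list --output json` (trimmed)
sample = json.loads("""
[
  {"job_id": 101, "settings": {"name": "daily-etl-staging"}},
  {"job_id": 102, "settings": {"name": "ml-training"}}
]
""")

# Equivalent of: jq '.[] | select(.settings.name | contains("etl"))'
etl_jobs = [job for job in sample if "etl" in job["settings"]["name"]]
print([job["job_id"] for job in etl_jobs])  # → [101]
```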
## Error Handling

| Issue | Cause | Solution |
| --- | --- | --- |
| Bundle validation fails | Invalid YAML | Run `databricks bundle validate` locally |
| Permission denied | Missing workspace access | Check service principal permissions |
| Cluster start fails | Quota exceeded | Request a quota increase or use smaller node types |
| Job timeout | Long-running task | Set `timeout_seconds` in the job config |
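For the job-timeout case, a sketch of where `timeout_seconds` sits in the bundle's job definition (the values shown are examples, not recommendations):

```yaml
resources:
  jobs:
    daily_etl:
      timeout_seconds: 3600   # fail the whole run after one hour
      max_concurrent_runs: 1
      tasks:
        - task_key: extract
          timeout_seconds: 1800   # optional per-task timeout
```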
## Examples

Basic usage: apply the Databricks deploy integration to a standard project with the default configuration options.

Advanced scenario: customize the integration for production environments with multiple constraints and team-specific requirements.
## Resources

- Databricks Asset Bundles
- Databricks CLI
- CI/CD with Bundles
## Next Steps

For multi-environment setup, see `databricks-multi-env-setup`.
## Output

- Configuration files or code changes applied to the project
- Validation report confirming correct implementation
- Summary of changes made and their rationale