# Databricks Hello World

## Safety Notice

This listing is imported from the skills.sh public index metadata. Review the upstream SKILL.md and repository scripts before running anything.

Install the skill with:

```shell
npx skills add jeremylongshore/claude-code-plugins-plus-skills/jeremylongshore-claude-code-plugins-plus-skills-databricks-hello-world
```


## Overview

Create your first Databricks cluster and notebook to verify setup.

## Prerequisites

- Completed `databricks-install-auth` setup
- Valid API credentials configured
- Workspace access with cluster creation permissions
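Before creating resources, it can help to confirm that credentials are actually visible to the tooling. A minimal sketch, assuming the standard `DATABRICKS_HOST`/`DATABRICKS_TOKEN` environment-variable auth (a profile in `~/.databrickscfg` also works and would not need these variables):

```python
import os

REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")

def missing_auth_vars(env):
    """Return the names of required auth variables absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_auth_vars(os.environ)
if missing:
    print(f"Missing credentials: {', '.join(missing)}")
else:
    print("Environment auth variables are set.")
```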

## Instructions

### Step 1: Create a Cluster

Create a small single-node development cluster via the CLI:

```shell
databricks clusters create --json '{
  "cluster_name": "hello-world-cluster",
  "spark_version": "14.3.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "autotermination_minutes": 30,
  "num_workers": 0,
  "spark_conf": {
    "spark.databricks.cluster.profile": "singleNode",
    "spark.master": "local[*]"
  },
  "custom_tags": {
    "ResourceClass": "SingleNode"
  }
}'
```
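The same single-node configuration can also be built as a Python dict, which keeps the CLI JSON and any SDK-based scripts in sync. A sketch (the `single_node_cluster_config` helper name is ours, not part of the skill):

```python
import json

def single_node_cluster_config(name: str, autoterminate_minutes: int = 30) -> dict:
    """Build the single-node cluster payload used in Step 1."""
    return {
        "cluster_name": name,
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "autotermination_minutes": autoterminate_minutes,
        "num_workers": 0,
        "spark_conf": {
            "spark.databricks.cluster.profile": "singleNode",
            "spark.master": "local[*]",
        },
        "custom_tags": {"ResourceClass": "SingleNode"},
    }

# Serialize for `databricks clusters create --json`
print(json.dumps(single_node_cluster_config("hello-world-cluster"), indent=2))
```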

### Step 2: Create a Notebook

`hello_world.py` — upload as a notebook:

```python
import base64

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create notebook content
notebook_content = """# Databricks Hello World

# COMMAND ----------

# Simple DataFrame operations
data = [("Alice", 28), ("Bob", 35), ("Charlie", 42)]
df = spark.createDataFrame(data, ["name", "age"])
display(df)

# COMMAND ----------

# Delta Lake example
df.write.format("delta").mode("overwrite").save("/tmp/hello_world_delta")

# COMMAND ----------

# Read it back
df_read = spark.read.format("delta").load("/tmp/hello_world_delta")
display(df_read)

# COMMAND ----------

print("Hello from Databricks!")
"""

w.workspace.import_(
    path="/Users/your-email/hello_world",
    format="SOURCE",
    language="PYTHON",
    content=base64.b64encode(notebook_content.encode()).decode(),
    overwrite=True,
)
print("Notebook created!")
```

### Step 3: Run the Notebook

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import NotebookTask, SubmitTask

w = WorkspaceClient()

# Create a one-time run
run = w.jobs.submit(
    run_name="hello-world-run",
    tasks=[
        SubmitTask(
            task_key="hello",
            existing_cluster_id="your-cluster-id",
            notebook_task=NotebookTask(notebook_path="/Users/your-email/hello_world"),
        )
    ],
)

# Wait for completion
result = run.result()
print(f"Run completed with state: {result.state.result_state}")
```

### Step 4: Verify with CLI

```shell
# List clusters
databricks clusters list

# Get cluster status
databricks clusters get --cluster-id your-cluster-id

# List workspace contents
databricks workspace list /Users/your-email/

# Get run output
databricks runs get-output --run-id your-run-id
```
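To script this verification, the cluster list can be fetched as JSON and filtered in Python. A sketch with sample data, assuming the CLI's JSON output mode (`databricks clusters list --output json`); the `running_clusters` helper is ours:

```python
import json

def running_clusters(clusters: list) -> list:
    """Return the names of clusters whose state is RUNNING."""
    return [c["cluster_name"] for c in clusters if c.get("state") == "RUNNING"]

# Sample shaped like the CLI's JSON cluster list
sample = json.loads("""
[
  {"cluster_name": "hello-world-cluster", "state": "RUNNING"},
  {"cluster_name": "old-cluster", "state": "TERMINATED"}
]
""")
print(running_clusters(sample))
```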

## Output

- Development cluster created and running
- Hello world notebook created in workspace
- Successful notebook execution
- Delta table created at `/tmp/hello_world_delta`

## Error Handling

| Error | Cause | Solution |
| --- | --- | --- |
| Cluster quota exceeded | Workspace limits | Terminate unused clusters |
| Invalid node type | Wrong instance type | Check available node types |
| Notebook path exists | Duplicate path | Use `overwrite=True` |
| Cluster pending | Startup in progress | Wait for `RUNNING` state |
| Permission denied | Insufficient privileges | Request workspace admin access |
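For the "Cluster pending" case, a capped exponential backoff avoids hammering the API while waiting for the `RUNNING` state. A sketch of just the delay schedule (the actual poll would be `databricks clusters get` or the SDK equivalent; the `backoff_delays` helper is ours):

```python
def backoff_delays(base, cap, attempts):
    """Yield exponentially growing poll delays in seconds, capped at `cap`."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= 2

# Delay schedule for 5 polls starting at 5s, capped at 60s
print(list(backoff_delays(5, 60, 5)))  # [5, 10, 20, 40, 60]
```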

## Examples

### Interactive Cluster (Cost-Effective Dev)

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create a single-node cluster for development
cluster = w.clusters.create_and_wait(
    cluster_name="dev-cluster",
    spark_version="14.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
    num_workers=0,
    autotermination_minutes=30,
    spark_conf={
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
)
print(f"Cluster created: {cluster.cluster_id}")
```

### SQL Warehouse (Serverless)

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Create a SQL warehouse for queries
warehouse = w.warehouses.create_and_wait(
    name="hello-warehouse",
    cluster_size="2X-Small",
    auto_stop_mins=15,
    warehouse_type="PRO",
    enable_serverless_compute=True,
)
print(f"Warehouse created: {warehouse.id}")
```
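Once the warehouse is up, it can be smoke-tested with the SDK's statement-execution API. A sketch under stated assumptions: the `run_query` wrapper, `first_cell` helper, and `your-warehouse-id` placeholder are ours, and the call requires `databricks-sdk` plus configured credentials:

```python
def first_cell(data_array):
    """Return the first cell of a statement result's data_array, or None."""
    if data_array and data_array[0]:
        return data_array[0][0]
    return None

def run_query(warehouse_id: str):
    # Requires databricks-sdk and a configured workspace; not runnable offline.
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()
    resp = w.statement_execution.execute_statement(
        warehouse_id=warehouse_id,
        statement="SELECT 'Hello from SQL warehouse' AS greeting",
        wait_timeout="30s",
    )
    return first_cell(resp.result.data_array)
```

Calling `run_query("your-warehouse-id")` against a live workspace should return the greeting string from the warehouse.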

### Quick DataFrame Test

Run in a notebook or via Databricks Connect:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Create sample data: 1000 rows with an id and a derived value column
df = spark.range(1000).toDF("id")
df = df.withColumn("value", df.id * 2)

# Show results
df.show(5)
print(f"Row count: {df.count()}")
```

## Resources

- Databricks Quickstart
- Cluster Configuration
- Notebooks Guide
- Delta Lake Quickstart

## Next Steps

Proceed to `databricks-local-dev-loop` for local development setup.
