android-use

Control Android devices via ADB commands - tap, swipe, type, navigate apps

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "android-use" with this command: npx skills add shehbajdhillon/android-use/shehbajdhillon-android-use-android-use

Android Device Control Skill

This skill enables you to control Android devices connected via ADB (Android Debug Bridge). You act as both the reasoning and execution engine - reading the device's UI state directly and deciding what actions to take.

Prerequisites

  • Android device connected via USB with USB debugging enabled
  • ADB installed and accessible in PATH
  • Device authorized for debugging (accepted the "Allow USB debugging?" prompt)

Multi-Device Support

All scripts support the -s <serial> flag to target a specific device. This is essential when multiple devices are connected (e.g., a physical phone AND an emulator).

Identifying Devices

Run scripts/check-device.sh to see all connected devices:

Multiple devices connected (2):

  [PHYSICAL] 1A051FDF6007PA - Pixel 6
  [EMULATOR] emulator-5554 - sdk_gphone64_arm64

Use -s <serial> to specify which device to use.

Choosing the Right Device

When the user mentions:

  • "phone", "my phone", "physical device" → Use the [PHYSICAL] device
  • "emulator", "virtual device", "AVD" → Use the [EMULATOR] device
  • If unclear, ask the user which device they want to target

Using the Serial Flag

Once you identify the target device, pass -s <serial> to ALL subsequent scripts:

# Check specific device
scripts/check-device.sh -s 1A051FDF6007PA

# All actions on that device
scripts/get-screen.sh -s 1A051FDF6007PA
scripts/tap.sh -s 1A051FDF6007PA 540 960
scripts/launch-app.sh -s 1A051FDF6007PA chrome

Important: Be consistent - use the same serial for all commands in a session.

Core Workflow

When given a task, follow this perception-action loop:

  1. Check device connection - Run scripts/check-device.sh first
    • If multiple devices: identify target based on user intent or ask
    • Note the serial number for subsequent commands
  2. Get current screen state - Run scripts/get-screen.sh [-s serial] to dump UI hierarchy
  3. Analyze the XML - Read the accessibility tree to understand what's on screen
  4. Decide next action - Based on goal + current state, choose an action
  5. Execute action - Run the appropriate script with -s serial if needed
  6. Wait briefly - Allow UI to update (typically 500ms-1s)
  7. Repeat - Go back to step 2 until goal is achieved

Reading UI XML

The get-screen.sh script outputs Android's accessibility XML. Key attributes to look for:

<node index="0" text="Settings" resource-id="com.android.settings:id/title"
      class="android.widget.TextView" content-desc=""
      bounds="[42,234][1038,345]" clickable="true" />

Important attributes:

  • text - Visible text on the element
  • content-desc - Accessibility description (useful for icons)
  • resource-id - Unique identifier for the element
  • bounds - Screen coordinates as [left,top][right,bottom]
  • clickable - Whether element responds to taps
  • scrollable - Whether element can be scrolled
  • focused - Whether element has input focus

Calculating tap coordinates: From bounds="[left,top][right,bottom]", calculate center:

  • x = (left + right) / 2
  • y = (top + bottom) / 2

Example: bounds="[42,234][1038,345]" → tap at x=540, y=289

Available Scripts

All scripts are in the scripts/ directory. Run them via bash.

All scripts support -s <serial> to target a specific device.

Device Management

ScriptArgsDescription
check-device.sh[-s serial]List devices / verify connection
wake.sh[-s serial]Wake device and dismiss lock screen
screenshot.sh[-s serial]Capture screen image

Screen Reading

ScriptArgsDescription
get-screen.sh[-s serial]Dump UI accessibility tree

Input Actions

ScriptArgsDescription
tap.sh[-s serial] x yTap at coordinates
type-text.sh[-s serial] "text"Type text string
swipe.sh[-s serial] directionSwipe up/down/left/right
key.sh[-s serial] keynamePress key (home/back/enter/recent)

App Management

ScriptArgsDescription
launch-app.sh[-s serial] package_or_nameLaunch app by package or search by name
install-apk.sh[-s serial] path/to/file.apkInstall APK to device

Action Guidelines

When to tap

  • Target clickable elements
  • Always calculate center from bounds
  • Prefer elements with clickable="true"

When to type

  • After tapping a text input field
  • The field should have focused="true" or class="android.widget.EditText"
  • Clear existing text first if needed (select all + delete)

When to swipe

  • To scroll lists or pages
  • To navigate between screens (e.g., swipe left/right for tabs)
  • Directions: up (scroll down), down (scroll up), left, right

When to use keys

  • home - Return to home screen
  • back - Go back / close dialogs
  • enter - Submit forms / confirm
  • recent - Open recent apps

When to take screenshots

  • For visual debugging when XML doesn't capture enough info
  • To verify visual state (colors, images, etc.)
  • When the task requires visual confirmation

When to wake the device

  • Before starting any task (device may have gone to sleep)
  • If get-screen.sh returns empty or minimal XML
  • If actions don't seem to be working (screen may be off)
  • Note: Won't bypass PIN/pattern/password - user must unlock manually

Common Patterns

Opening an app

# By package name (fastest)
scripts/launch-app.sh com.android.chrome

# By app name (searches installed apps)
scripts/launch-app.sh "Chrome"

Tapping a button

  1. Get screen: scripts/get-screen.sh
  2. Find element with matching text/content-desc
  3. Calculate center from bounds
  4. Tap: scripts/tap.sh 540 289

Entering text in a field

  1. Tap the text field to focus it
  2. Wait for keyboard
  3. Type: scripts/type-text.sh "your text here"
  4. Press enter if needed: scripts/key.sh enter

Scrolling to find content

  1. Get screen to check if target is visible
  2. If not found, swipe: scripts/swipe.sh up
  3. Get screen again, repeat until found or reached end

Handling dialogs/popups

  • Look for elements with text like "OK", "Allow", "Accept", "Cancel"
  • Tap the appropriate button
  • Or press back to dismiss: scripts/key.sh back

Error Handling

No device connected

  • Check USB connection
  • Verify USB debugging is enabled
  • Run adb devices manually to troubleshoot

Element not found

  • The UI may have changed - get fresh screen dump
  • Try scrolling to find the element
  • Element might be in a different screen/state

Action didn't work

  • Wait longer between actions (UI might be slow)
  • Verify coordinates are correct
  • Check if a popup/dialog appeared

App not responding

  • Press home and reopen the app
  • Or force close and restart

Example Sessions

Single Device

User request: "Open Chrome and search for weather"

1. scripts/check-device.sh
   → Device connected: Pixel 6
   → Serial: 1A051FDF6007PA
   → Type: Physical

2. scripts/launch-app.sh com.android.chrome
   → Chrome launched

3. scripts/get-screen.sh
   → [Read XML, find search/URL bar]
   → Found: bounds="[0,141][1080,228]" resource-id="com.android.chrome:id/url_bar"
   → Center: x=540, y=184

4. scripts/tap.sh 540 184
   → Tapped URL bar

5. scripts/get-screen.sh
   → [Verify keyboard appeared and field is focused]

6. scripts/type-text.sh "weather"
   → Typed "weather"

7. scripts/key.sh enter
   → Pressed enter to search

8. scripts/get-screen.sh
   → [Verify search results loaded]
   → Task complete!

Multiple Devices

User request: "Open Settings on my phone" (with emulator also running)

1. scripts/check-device.sh
   → Multiple devices connected (2):
   →   [PHYSICAL] 1A051FDF6007PA - Pixel 6
   →   [EMULATOR] emulator-5554 - sdk_gphone64_arm64

   User said "my phone" → target the PHYSICAL device
   Serial to use: 1A051FDF6007PA

2. scripts/check-device.sh -s 1A051FDF6007PA
   → Device connected: Pixel 6
   → Serial: 1A051FDF6007PA
   → Type: Physical
   → Status: Ready

3. scripts/launch-app.sh -s 1A051FDF6007PA settings
   → Resolved 'settings' to package: com.android.settings
   → Launched: com.android.settings

4. scripts/get-screen.sh -s 1A051FDF6007PA
   → [Read XML, verify Settings app is open]
   → Task complete!

Tips

  • Be patient - Android UI can be slow, wait between actions
  • Read carefully - The XML tells you exactly what's on screen
  • Check your work - Get screen after each action to verify state
  • Use screenshots - When XML doesn't give enough context
  • Start simple - Break complex tasks into small steps
  • Multi-device - Always check for multiple devices first; ask user if target is unclear
  • Consistent serial - Once you pick a device, use -s <serial> on ALL commands

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Automation

android-use

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

openclaw-version-monitor

监控 OpenClaw GitHub 版本更新,获取最新版本发布说明,翻译成中文, 并推送到 Telegram 和 Feishu。用于:(1) 定时检查版本更新 (2) 推送版本更新通知 (3) 生成中文版发布说明

Archived SourceRecently Updated
Coding

ask-claude

Delegate a task to Claude Code CLI and immediately report the result back in chat. Supports persistent sessions with full context memory. Safe execution: no data exfiltration, no external calls, file operations confined to workspace. Use when the user asks to run Claude, delegate a coding task, continue a previous Claude session, or any task benefiting from Claude Code's tools (file editing, code analysis, bash, etc.).

Archived SourceRecently Updated
Coding

ai-dating

This skill enables dating and matchmaking workflows. Use it when a user asks to make friends, find a partner, run matchmaking, or provide dating preferences/profile updates. The skill should execute `dating-cli` commands to complete profile setup, task creation/update, match checking, contact reveal, and review.

Archived SourceRecently Updated