operate-android-devices-with-bochi

Bochi is a command line tool for AI agents to control Android devices via ADB. Use this skill when you need to interact with Android UI elements programmatically, such as tapping buttons, waiting for elements to appear, or automating Android device interactions. Supports CSS-like selectors with attribute assertions, AND/OR logic, descendant matching, and negation.

Install skill "operate-android-devices-with-bochi" with this command: npx skills add linmx0130/bochi/linmx0130-bochi-operate-android-devices-with-bochi

Bochi - Android Device Control for AI Agents

Features

  • Uses adb shell uiautomator dump to obtain UI layout information
  • Supports CSS-like element selectors with attribute assertions, AND/OR logic, descendant matching, and negation
  • Commands: waitFor, tap, inputText, longTap, doubleTap, scrollUp, scrollDown
  • Configurable timeout
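
When writing selectors, it helps to look at the hierarchy that uiautomator produces. The following is a minimal sketch, not part of bochi: the function name dump_ui and the local file name are illustrative, and it assumes the device allows writing to /sdcard.

```shell
# Hypothetical helper (not part of bochi): save the current UI hierarchy
# as XML so attribute values can be inspected when writing selectors.
# Usage: dump_ui [output-file]
dump_ui() {
    out="${1:-window_dump.xml}"
    # uiautomator writes the hierarchy to a file on the device...
    adb shell uiautomator dump /sdcard/window_dump.xml >/dev/null || return 1
    # ...which is then copied to the local machine.
    adb pull /sdcard/window_dump.xml "$out" >/dev/null || return 1
    printf '%s\n' "$out"
}
```

Calling dump_ui screen.xml and opening the file shows the text, resource-id, class, and other attributes available for matching.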

Installation

Install from crates.io

cargo install bochi

Build from source

git clone https://github.com/linmx0130/bochi.git && cd bochi
cargo build --release

The binary will be available at target/release/bochi.

Usage

bochi [OPTIONS] --selector <SELECTOR> --command <COMMAND>

Options:
  -h, --help  Print help

Common Parameters:
  -s, --serial <SERIAL>
  -e, --selector <SELECTOR>  Element selector. Supports CSS-like syntax:
                               - [attr=value] - attribute assertion
                               - [attr1=v1][attr2=v2] - AND of clauses
                               - sel1,sel2 - OR of selectors
                               - :has(cond) - has descendant matching cond
  -c, --command <COMMAND>
  -t, --timeout <TIMEOUT>    [default: 30]

Command-Specific Parameters:
      --text <TEXT>        Text content for inputText command
      --print-descendants  Print the XML of matched elements including their descendants (for waitFor command)
      --scroll-target <SELECTOR>  Target element selector for scrollUp/scrollDown commands

Commands

All commands run against the element matched by the selector. If no element is found within the specified timeout, an error is returned. If multiple elements match, the command runs against the first one.

  • waitFor: Wait for an element to appear
  • tap: Tap an element
  • inputText: Input text into an element
  • longTap: Long tap (1000ms) an element
  • doubleTap: Double tap an element
  • scrollUp: Scroll up until the target element is visible (requires --scroll-target)
  • scrollDown: Scroll down until the target element is visible (requires --scroll-target)
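
Because each invocation fails when its element is not found in time, commands can be chained with && so that a flow stops at the first failing step. The sketch below assumes a hypothetical login screen; the function name, resource IDs, and text values are placeholders rather than anything bochi defines.

```shell
# Hypothetical login flow; the selectors are placeholders for a real app.
# && stops the chain as soon as one bochi invocation fails.
# Usage: login <username> <password>
login() {
    user="$1"
    pass="$2"
    bochi -e '[resource-id=com.example:id/username]' -c inputText --text "$user" &&
    bochi -e '[resource-id=com.example:id/password]' -c inputText --text "$pass" &&
    bochi -e '[text="Sign in"]' -c tap &&
    bochi -e '[text^=Welcome]' -c waitFor -t 60
}
```

The function's exit status is that of the last step that ran, so a caller can test it like any other command.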

Selector Syntax

The selector syntax is inspired by CSS selectors:

Basic Attribute Assertion

Use square brackets to match elements by attribute:

# Match element with text="Submit"
bochi -e '[text="Submit"]' -c tap

# Match element with class="Button"
bochi -e '[class=Button]' -c tap

Attribute Operators

In addition to exact match (=), you can use:

  • ^= - starts with: [attr^=value] matches if attribute starts with value
  • $= - ends with: [attr$=value] matches if attribute ends with value
  • *= - contains: [attr*=value] matches if attribute contains value

# Match text starting with "Submit"
bochi -e '[text^=Submit]' -c tap

# Match text ending with "Button"
bochi -e '[text$=Button]' -c tap

# Match text containing "Search"
bochi -e '[text*=Search]' -c tap

# Combine operators
bochi -e '[class^=android.widget][text*=Save]' -c tap

AND Logic (Multiple Clauses)

Concatenating multiple square-bracket clauses means AND; an element must satisfy every clause:

# Match element with class="android.widget.Button" AND text="OK"
bochi -e '[class=android.widget.Button][text="OK"]' -c tap

# Match element with package="com.example" AND clickable="true"
bochi -e '[package=com.example][clickable=true]' -c tap

OR Logic (Comma-separated)

Use a comma (,) to express OR across multiple selectors:

# Match element with text="Cancel" OR text="Back"
bochi -e '[text=Cancel],[text=Back]' -c tap

# Match element with text="OK" OR text="Confirm"
bochi -e '[text=OK],[text=Confirm]' -c tap
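
When the set of acceptable labels varies, the comma-separated selector can be assembled in a script. This is a small sketch with a hypothetical helper named any_text; it double-quotes each label and assumes the labels contain no double quotes or backslashes.

```shell
# Hypothetical helper: build an OR selector that matches any of the given
# exact text labels. Labels containing " or \ would need escaping first.
# Usage: any_text <label>...
any_text() {
    sel=""
    for label in "$@"; do
        # Append a comma only when a previous clause already exists.
        sel="${sel:+$sel,}[text=\"$label\"]"
    done
    printf '%s' "$sel"
}
```

For example, bochi -e "$(any_text OK Confirm "Got it")" -c tap taps whichever of the three buttons is matched first.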

Descendant Matching (:has())

Use :has(cond) to select nodes which have a descendant matching the condition:

# Match a ScrollView element that contains an item with text="Item 1"
bochi -e '[class=android.widget.ScrollView]:has([text="Item 1"])' -c tap

# Match any element that has a descendant with text="Submit"
bochi -e ':has([text=Submit])' -c tap

Negation (:not())

Use :not(cond) to select nodes that do NOT match the condition:

# Match elements that are not clickable=false
bochi -e ':not([clickable=false])' -c tap

# Match elements with text containing "Confirm" but not clickable=false
bochi -e '[text*=Confirm]:not([clickable=false])' -c tap

# Match elements that do not have a descendant with text="Loading"
bochi -e ':not(:has([text=Loading]))' -c waitFor

Child Combinator (>)

Use > to select direct children:

# Match clickable elements that are direct children of a ScrollView
bochi -e '[class=android.widget.ScrollView]>[clickable=true]' -c tap

# Chain child combinators
bochi -e '[class=android.widget.ScrollView]>[class*=Item]>[text=Settings]' -c tap

Note: > only matches direct children, unlike :has() which matches any descendant.

Descendant Combinator (space)

Use a space to select any descendant (direct or indirect):

# Match clickable elements that are descendants of a ScrollView (any depth)
bochi -e '[class=android.widget.ScrollView] [clickable=true]' -c tap

# Match text anywhere within a specific container
bochi -e '[resource-id=com.example:id/container] [text="Submit"]' -c tap

# Chain descendant combinators
bochi -e '[class=android.widget.ScrollView] [class=android.widget.LinearLayout] [text="Item 1"]' -c tap

Note: Unlike >, the space combinator matches elements at any depth, not just direct children.

Complex Selectors

Combine all features for powerful selection:

# Match Button with text "OK" OR "Confirm"
bochi -e '[class=android.widget.Button][text=OK],[class=android.widget.Button][text=Confirm]' -c tap

Supported Attributes

  • text - The text content of the element
  • contentDescription (or content-desc, content_desc) - The content description
  • resourceId (or resource-id, resource_id) - The resource ID
  • class - The class name of the element
  • package - The package name
  • checkable, checked, clickable, enabled, focusable, focused
  • long-clickable (or long_clickable), password, scrollable, selected
  • bounds - The bounding rectangle
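
To find out which values these attributes actually take on a given screen, a saved uiautomator XML dump can be searched. The helper below is a sketch under two assumptions: the dump has already been saved to a local file, and attribute names are given as they appear in the XML (for example resource-id, content-desc, long-clickable). The name list_attr is illustrative.

```shell
# Hypothetical helper: list the distinct values of one attribute found in
# a uiautomator XML dump, e.g. to discover usable resource-id values.
# Usage: list_attr <attribute-name-as-written-in-the-XML> <dump-file>
list_attr() {
    # The leading space keeps "clickable" from matching "long-clickable".
    grep -o -e " $1=\"[^\"]*\"" "$2" |
        sed -e 's/^[^"]*"//' -e 's/"$//' |
        sort -u
}
```

Running list_attr resource-id window_dump.xml prints one resource ID per line, which can then be used in a [resource-id=...] clause.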

Quoting Values

Values can be quoted or unquoted:

  • [text=Submit] - unquoted
  • [text="Submit Button"] - double quotes (required for values with spaces)
  • [text='Submit'] - single quotes

Using Opposite Quote Types

You can include one type of quote inside the other without escaping:

  • [text="It's done"] - single quote inside double quotes
  • [text='Say "Hello"'] - double quotes inside single quotes

Escape Sequences

To include the same type of quote within quoted values, use backslash escaping:

  • [text="Say \"Hello\""] - escaped double quotes
  • [text='It\'s done'] - escaped single quote
  • [text="C:\\Windows"] - escaped backslash

Supported escape sequences:

  • \" - double quote
  • \' - single quote
  • \\ - backslash
  • Unknown sequences (e.g., \n) are preserved as-is
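
When a selector value comes from a variable, the escaping rules above can be applied in a script before the selector is built. A minimal sketch, with hypothetical helper names selector_quote and text_clause that are not part of bochi:

```shell
# Hypothetical helper (not part of bochi): wrap a value in double quotes,
# escaping backslashes first and then double quotes.
selector_quote() {
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '"%s"' "$escaped"
}

# Build an exact-match text clause for an arbitrary label.
text_clause() {
    printf '[text=%s]' "$(selector_quote "$1")"
}
```

For example, bochi -e "$(text_clause 'Say "Hello"')" -c tap passes the selector [text="Say \"Hello\""]. Backslashes are escaped before quotes so that the backslashes added for the quotes are not doubled.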

Examples

Wait for an element to appear

bochi -e '[text=Submit]' -c waitFor

Wait for an element and print its descendants

bochi -e '[class=android.widget.ScrollView]' -c waitFor --print-descendants

Tap an element

bochi -e '[contentDescription="Open Menu"]' -c tap

If multiple elements match the selector, the first one is tapped. To make the selection unambiguous, set a unique contentDescription or resource-id on the element in the app's code.

Input text into an element

bochi -e '[resource-id=com.example:id/username]' -c inputText --text "myusername"

If multiple elements match the selector, the first one receives the input. To make the selection unambiguous, set a unique contentDescription or resource-id on the element in the app's code.

Tap element with OR condition

bochi -e '[text=OK],[text=Confirm]' -c tap

Tap a container that has a specific list item

bochi -e '[class$=RecyclerView]:has([text="Settings"])' -c tap

Match text starting with a prefix

bochi -e '[text^=Loading]' -c waitFor

Match resource-id ending with a suffix

bochi -e '[resource-id$=submit_button]' -c tap

Match text containing a substring

bochi -e '[text*="Save Changes"]' -c tap

Select direct children

# Select clickable buttons directly under a toolbar
bochi -e '[class$=Toolbar]>[clickable=true]' -c tap

# Chain: Select the Settings item in a RecyclerView
bochi -e '[class$=RecyclerView]>[class$=LinearLayout]>[text=Settings]' -c tap

Use with specific device

bochi -s emulator-5554 -e '[resource-id=com.example:id/button]' -c tap
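
When several devices are attached, the serial for -s can be taken from the output of adb devices. The helpers below are a sketch; the names ready_devices and on_all_devices are illustrative, and the parsing assumes the usual output of adb devices: a header line followed by one serial and state per line.

```shell
# Hypothetical helper: print the serials of devices that adb reports in
# the "device" state (connected and authorized), one per line.
ready_devices() {
    adb devices | awk 'NR > 1 && $2 == "device" { print $1 }'
}

# Run the same bochi arguments against every ready device in turn.
# Usage: on_all_devices <bochi arguments...>
on_all_devices() {
    for serial in $(ready_devices); do
        bochi -s "$serial" "$@" || echo "failed on $serial" >&2
    done
}
```

For example, on_all_devices -e '[text=OK]' -c tap taps the OK element on each ready device and reports the serials where the command failed.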

Set custom timeout

bochi -e '[text=Loading]' -c waitFor -t 60

Scroll to an element

For scrollable containers like RecyclerView or ScrollView, use scrollUp or scrollDown to find an element:

# Scroll down in a RecyclerView to find an item
bochi -e '[class$=RecyclerView]' -c scrollDown --scroll-target '[text="Item 50"]'

# Scroll up to find an element at the top
bochi -e '[scrollable=true]' -c scrollUp --scroll-target '[text="Header"]'

The -e selector specifies the scrollable container, and --scroll-target specifies the element to scroll into view. The command will perform gradual swipes until the target element becomes visible or the timeout is reached.
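
Scrolling only brings the target into view, so a follow-up command is needed to act on it. A minimal sketch that combines the two steps; scroll_and_tap is a hypothetical helper name, not a bochi command:

```shell
# Hypothetical helper: scroll a container down until the target is visible,
# then tap the target. Both arguments are bochi selectors.
# Usage: scroll_and_tap <container-selector> <target-selector>
scroll_and_tap() {
    bochi -e "$1" -c scrollDown --scroll-target "$2" &&
    bochi -e "$2" -c tap
}
```

For example, scroll_and_tap '[class$=RecyclerView]' '[text="Item 50"]' taps the item only if the scroll step succeeded.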

Selecting a Button Within a Specific Container

When you need to interact with a button that appears multiple times on the screen (e.g., "Reset" buttons for different layout configurations), you can combine the :has() pseudo-class with the child combinator (>) to precisely target the button within a specific container.

# Click the "Reset" button within the Portrait Layout section
bochi -e ':has([text*=Portrait]) > [clickable=true]:has([text="Reset"])' -c tap

How it works:

  1. :has([text*=Portrait]) - Selects a container element that contains a descendant with text matching "Portrait" (e.g., the "Portrait Layout" card)
  2. > - The child combinator restricts the search to direct children of the container
  3. [clickable=true]:has([text="Reset"]) - Matches a clickable element that contains the text "Reset"

This pattern is useful when:

  • Multiple similar buttons exist on the same screen (e.g., "Edit" or "Reset" buttons for different settings categories)
  • You need to distinguish between buttons based on their container context
  • Elements don't have unique resource IDs but their parent containers have distinguishing text
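
The container-scoped pattern can be parameterized when the same screen has several such sections. This is a sketch with a hypothetical helper named button_in_section; it assumes quoted values are accepted with *= as they are with =, and that the arguments contain no double quotes or backslashes.

```shell
# Hypothetical helper: build the container-scoped selector described above.
# Usage: button_in_section <section-text-fragment> <button-text>
# Arguments are placed inside double quotes; escape " and \ beforehand.
button_in_section() {
    printf ':has([text*="%s"]) > [clickable=true]:has([text="%s"])' "$1" "$2"
}
```

For example, bochi -e "$(button_in_section Landscape Reset)" -c tap targets the "Reset" button of a section whose text contains "Landscape".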

Exit Codes

  • 0 - Success
  • 1 - Error (element not found, timeout, ADB error, etc.)
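
Since the exit code distinguishes only success from failure, a script can branch on it to handle optional UI such as a dialog that may or may not appear. A minimal sketch; tap_if_present is a hypothetical helper name and the default timeout of 5 is an arbitrary choice.

```shell
# Hypothetical helper: tap an element only if it shows up within a short
# timeout. Exit code 1 covers several causes (not found, timeout, ADB
# error), so any failure is treated here as "not present".
# Usage: tap_if_present <selector> [timeout]
tap_if_present() {
    if bochi -e "$1" -c tap -t "${2:-5}" 2>/dev/null; then
        echo "tapped"
    else
        echo "skipped"
    fi
}
```

Because an ADB error also yields exit code 1, this pattern cannot tell a missing element from a broken connection.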

Requirements

  • Android Debug Bridge (ADB) installed and in PATH
  • Android device connected and authorized for debugging
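
These requirements can be checked before a script starts issuing commands. The sketch below uses hypothetical helper names and assumes a single attached device; with several devices, adb get-state needs a serial.

```shell
# Hypothetical preflight check (not part of bochi).
# require_cmd fails when the named command is not available.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "missing: $1" >&2
        return 1
    }
}

# Returns 0 when adb and bochi are available and a device is ready.
preflight() {
    require_cmd adb || return 1
    require_cmd bochi || return 1
    # adb get-state prints "device" for a connected, authorized device.
    [ "$(adb get-state 2>/dev/null)" = "device" ] || {
        echo "no authorized device" >&2
        return 1
    }
}
```

A script can then start with preflight || exit 1.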

Tips for using bochi during development

  1. For accurate selection, resource-id is the best attribute to query when it is available.
  2. In Jetpack Compose, testTag can be exposed as resource-id by applying Modifier.semantics { testTagsAsResourceId = true } on the containers.
  3. Selecting elements by an accurate content description is also a good idea. Because accessibility tools also use content descriptions, giving elements unique, concise, and accurate descriptions benefits both automation tools like bochi and human users.
