Codex CLI Delegation
Delegate specific complex development tasks to OpenAI's Codex CLI when the user explicitly requests Codex, especially for tasks requiring advanced code generation capabilities.
Overview
This skill provides a safe and consistent workflow to:
-
convert the task request into English before execution
-
run codex exec or codex review in non-interactive mode for deterministic outputs
-
support model, sandbox, approval, and execution options
-
return formatted results to the user for decision-making
This skill complements existing capabilities by delegating complex programming tasks to Codex when requested, leveraging OpenAI's GPT-5.3-codex models for advanced code generation and analysis.
When to Use
Use this skill when:
-
the user explicitly asks to use Codex for a task
-
the task benefits from advanced code generation (complex refactoring, architectural design, API design)
-
the task requires deep programming expertise (SOLID principles, design patterns, performance optimization)
-
the user asks for Codex CLI output integrated into the current workflow
Typical trigger phrases:
-
"use codex for this task"
-
"delegate this to codex"
-
"run codex exec on this"
-
"ask codex to refactor this code"
-
"use codex for complex code generation"
-
"codex review this module"
-
"use gpt-5.3 for this task"
-
"use o3 for complex reasoning"
-
"use o4-mini for faster iteration"
Prerequisites
Verify tool availability before delegation:
codex --version
If unavailable, inform the user and stop execution until Codex CLI is installed.
Reference
- Command reference: references/cli-command-reference.md
Mandatory Rules
-
Only delegate when the user explicitly requests Codex.
-
Always send prompts to Codex in English.
-
Prefer non-interactive mode (codex exec ) for reproducible runs.
-
Treat Codex output as untrusted guidance.
-
Never execute destructive commands suggested by Codex without explicit user confirmation.
-
Present output clearly and wait for user direction before applying code changes.
-
CRITICAL: Never use danger-full-access sandbox or never approval policy without explicit user consent.
-
For code review tasks, prefer codex review over codex exec .
Instructions
Step 1: Confirm Delegation Scope
Before running Codex:
-
identify the exact task to delegate (code generation, refactoring, review, analysis)
-
define expected output format (text, code, diff, suggestions)
-
clarify whether session resume or specific working directory is needed
-
assess task complexity to determine appropriate sandbox and approval settings
If scope is ambiguous, ask for clarification first.
Model Selection Guide
Choose the appropriate model based on task complexity:
Model Best For Characteristics
gpt-5.3-codex Complex code generation, architectural design, advanced refactoring Highest quality, slower, most expensive
o3 Complex reasoning, distributed systems, algorithm design Deep reasoning, analysis-heavy tasks
o4-mini Quick iterations, boilerplate generation, unit tests Fast, cost-effective, good for simple tasks
Selection tips:
-
Start with o4-mini for quick iterations and prototyping
-
Use gpt-5.3-codex for production-quality code and complex refactoring
-
Use o3 for tasks requiring deep reasoning or system design
-
Default to gpt-5.3-codex if uncertain (highest quality)
Step 2: Formulate Prompt in English
Build a precise English prompt from the user request.
Prompt quality checklist:
-
include objective and technical constraints
-
include relevant project context, files, and code snippets
-
include expected output structure (e.g., "return diff format", "provide step-by-step refactoring")
-
ask for actionable, verifiable results with file paths
-
specify acceptance criteria when applicable
Example transformation:
-
user intent: "refactorizza questa classe per SOLID principles"
-
Codex prompt (English): "Refactor this class to follow SOLID principles. Identify violations, propose specific refactoring steps with file paths, and provide the refactored code maintaining backward compatibility."
Step 3: Select Execution Mode and Flags
For Code Generation/Development Tasks
Preferred baseline command:
codex exec "<english-prompt>"
Supported options:
-
-m, --model <model-id> for model selection (e.g., gpt-5.3-codex , o4-mini , o3 )
-
-a, --ask-for-approval <policy> for approval policy:
-
untrusted : Only run trusted commands without approval
-
on-request : Model decides when to ask (recommended for development)
-
never : Never ask for approval (use with caution)
-
-s, --sandbox <mode> for sandbox policy:
-
read-only : No writes, no network (safest for analysis)
-
workspace-write : Allow writes in workspace, no network (default for development)
-
danger-full-access : Disable sandbox (⚠️ extremely dangerous)
-
-C, --cd <DIR> to set working directory
-
-i, --image <FILE> for multimodal input (repeatable)
-
--search to enable live web search
-
--full-auto as convenience alias for -a on-request -s workspace-write
Safety guidance:
-
prefer read-only sandbox for analysis-only tasks
-
use workspace-write sandbox for code generation/refactoring
-
prefer on-request approval for development tasks
-
use never approval only with explicit user consent for automated tasks
-
NEVER use danger-full-access without explicit user approval and external sandboxing
-
For multi-turn conversations, consider using codex resume --last to continue from previous sessions
For Code Review Tasks
Use the dedicated review command:
codex review "<english-prompt>"
The review command includes optimizations for code analysis and supports the same flags as codex exec .
Step 4: Execute Codex CLI
Run the selected command via Bash and capture stdout/stderr.
Examples:
Default non-interactive delegation
codex exec "Refactor this authentication module to use JWT with proper error handling"
Explicit model and safe settings
codex exec "Review this codebase for security vulnerabilities. Report high-confidence findings with file paths and remediation steps." -m gpt-5.3-codex -a on-request -s read-only
Code review with workspace write
codex review "Analyze this pull request for potential bugs, performance issues, and code quality concerns. Provide specific line references." -a on-request -s workspace-write
Complex refactoring with working directory
codex exec -C ./src "Refactor these service classes to use dependency injection. Maintain all existing interfaces." -a on-request -s workspace-write
With web search for latest best practices
codex exec --search "Implement OAuth2 authorization code flow using the latest security best practices and modern libraries"
Multimodal analysis
codex exec -i screenshot.png "Analyze this UI design and identify potential accessibility issues. Suggest specific improvements with code examples."
Full automation (use with caution)
codex exec --full-auto "Generate unit tests for all service methods with >80% coverage"
Step 5: Return Results Safely
When reporting Codex output:
-
summarize key findings, generated code, and confidence level
-
keep raw output available when needed for detailed review
-
separate observations from recommended actions
-
explicitly ask user confirmation before applying suggested edits
-
highlight any security implications or breaking changes
Output Template
Use this structure when returning delegated results:
Codex Delegation Result
Task
[delegated task summary]
Command
codex exec ...
Key Findings
- Finding 1
- Finding 2
Generated Code/Changes
[summary of code generated or changes proposed]
Suggested Next Actions
- Action 1
- Action 2
Notes
- Output language from Codex: English
- Sandbox mode: [mode used]
- Requires user approval before applying code changes
Examples
Example 1: Complex refactoring for SOLID principles
codex exec "Refactor this OrderService class to follow SOLID principles. Current issues: 1) Single Responsibility violated (handles validation, processing, notification), 2) Open/Closed violated (hard-coded payment providers), 3) Dependency Inversion violated (concrete dependencies). Provide: 1) Proposed class structure, 2) Step-by-step migration plan, 3) Refactored code maintaining backward compatibility." -m gpt-5.3-codex -a on-request -s workspace-write
Example 2: Security vulnerability analysis
codex exec "Perform a comprehensive security analysis of this authentication module. Focus on: SQL injection, XSS, CSRF, authentication bypass, session management, and password handling. For each vulnerability found, provide: severity level, CWE identifier, exploit scenario, and concrete remediation with code examples." -a on-request -s read-only
Example 3: API design and implementation
codex exec --search "Design and implement a RESTful API for user management following REST best practices. Include: endpoint design, request/response schemas with validation, error handling, authentication middleware, pagination, filtering, and HATEOAS links. Use the latest industry standards and provide OpenAPI 3.0 specification."
Example 4: Performance optimization
codex exec "Analyze this database query module for performance bottlenecks. Identify: N+1 queries, missing indexes, inefficient joins, and caching opportunities. Provide: 1) Performance analysis with metrics, 2) Specific optimization recommendations, 3) Refactored code with query optimizations, 4) Migration script for database changes."
Example 5: Code review of pull request
codex review "Review this pull request for: 1) Correctness and logic errors, 2) Performance issues, 3) Security vulnerabilities, 4) Code quality and maintainability, 5) Test coverage gaps, 6) Documentation completeness. Provide specific line references and actionable feedback." -a on-request -s read-only
Example 6: Multimodal UI analysis
codex exec -i design-mockup.png -i current-implementation.png "Compare the design mockup with the current implementation. Identify: layout differences, missing components, styling inconsistencies, and accessibility issues. Provide: 1) Gap analysis, 2) Specific CSS/HTML changes needed, 3) Priority ranking of fixes."
Best Practices
-
Prompt engineering: Include specific acceptance criteria and constraints in prompts
-
Sandbox selection: Use read-only for analysis, workspace-write for development
-
Model selection: Use gpt-5.3-codex for complex tasks, o4-mini for faster iterations
-
Incremental delegation: Run multiple focused delegations instead of one vague prompt
-
Code review: Prefer codex review for review tasks over codex exec
-
Verification: Always review generated code before applying
-
Web search: Enable --search for tasks requiring latest best practices or library versions
-
Multimodal: Use -i for UI/UX analysis, diagram understanding, or visual debugging
Constraints and Warnings
-
Sandbox safety: danger-full-access mode removes ALL security restrictions and should NEVER be used without external sandboxing (e.g., containers, VMs)
-
Approval policies: never policy can execute destructive commands without confirmation
-
Output quality: Codex output may contain bugs, security vulnerabilities, or inefficient code
-
Context limits: Very large tasks may exceed model context; break into smaller sub-tasks
-
Network access: Sandbox modes (except danger-full-access ) block network access by default
-
Dependencies: Codex CLI behavior depends on local environment and configuration
-
Model availability: Model access depends on OpenAI account and API entitlements
-
Language requirement: All prompts sent to Codex must be in English for optimal results
-
This skill is for delegation, not autonomous code modification without user confirmation