# Repo JSON Generator
Convert Git repository code to structured JSON instructions for AI agents and automation tools.
This tool fetches code from Git repositories (GitHub, GitLab, Bitbucket, etc.) and generates structured JSON instructions that can be consumed by any AI agent or automation system for accurate code processing and updates.
## Version 3.0.0 - Modular Architecture (Latest)

- ✅ NEW: Modular codebase architecture for better maintainability
- ✅ `info` command now includes complete file content in the JSON output
- ✅ Unified `--no-instructions` parameter across all commands
- ✅ Consistent terminal output: always shows summary information
- ✅ Flexible file output: full formatted or pure JSON format
- ✅ `info` command supports `--filter` and `--exclude` parameters for file filtering
## 🏗️ Architecture Overview (v3.0.0)

### Modular Structure

The codebase has been restructured from a single monolithic script into a modular architecture:
```
scripts/
├── generator.py           # Main entry point (CLI router)
├── core/
│   ├── constants.py       # Shared constants and configuration
│   ├── temp_manager.py    # Cross-platform temp directory management
│   ├── circuit_breaker.py # Circuit breaker & retry mechanism
│   └── security.py        # Sensitive information protection
├── git/
│   └── repository.py      # Git repository operations
├── processors/
│   ├── file_processor.py  # File reading and filtering
│   └── instruction_gen.py # JSON instruction generation
└── output/
    └── streaming.py       # Streaming/chunked output
```
### Module Dependencies

```
core/        (no dependencies)
  ↓
git/         (depends on core)
  ↓
processors/  (depends on core, git)
  ↓
output/      (depends on processors)
  ↓
generator.py (depends on all modules)
```
### Benefits

- **Maintainability**: each module can be updated independently
- **Testability**: modules can be tested in isolation
- **Reusability**: core components can be reused in other projects
- **Readability**: smaller, focused files are easier to understand
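For illustration, the retry protection that `core/circuit_breaker.py` is named after can be sketched as a minimal circuit breaker. This is a hypothetical implementation, not the module's actual API; the class and parameter names are assumptions:

```python
import time

class CircuitBreaker:
    """Stop calling a failing operation after too many consecutive errors."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after   # seconds before the breaker half-opens
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: too many recent failures")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```

Wrapping `git clone` or network fetches in a breaker like this keeps transient Git failures from cascading into repeated hangs.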
## 🔄 AI Agent Integration Architecture

### Overview

This tool (generator) produces structured JSON from Git repositories that can be consumed by any AI agent or automation system:
```
┌─────────────────────┐          ┌──────────────────────┐
│   repo-json-        │   JSON   │     AI Agent /       │
│   generator         │ ───────> │  Automation System   │
│                     │   Data   │                      │
│  1. Fetch from      │          │  3. Process Code     │
│     Git Repo        │          │  4. Update Files     │
│  2. Generate JSON   │          │  5. Execute Actions  │
│     Instructions    │          │                      │
└─────────────────────┘          └──────────────────────┘
```
### Why Structured JSON?
| Aspect | Natural Language | Structured JSON |
|---|---|---|
| Accuracy | 70-80% | 90-95% |
| File Completeness | May miss files | Guaranteed by JSON structure |
| Control | Hard to verify | Easy to validate before processing |
| Batch Processing | Difficult | Built-in support |
| Best For | Simple queries | Full sync, large updates |
### Integration Workflow

#### Standard Flow
```
User Request
    ↓
"Generate JSON from Git repo" / "Convert code to JSON"
    ↓
Step 1: generator
  ├─ Clone repository from Git
  ├─ Read all code files
  ├─ Generate structured JSON instructions
  └─ Output: JSON data with file contents
    ↓
Step 2: Your AI Agent / System
  ├─ Receive JSON instructions
  ├─ Parse file list and contents
  ├─ Create/overwrite each file
  └─ Output: Updated file list for verification
    ↓
Complete! Code processed by AI agent
```
### Trigger Scenarios

#### Scenario 1: Direct Code Conversion

```
User says: "Convert repo to JSON" or "Generate code instructions"
    ↓
generator is triggered
    ↓
Generates structured JSON template
    ↓
JSON is passed to the AI agent for execution
```
#### Scenario 2: Large Codebase - Batch Processing

```
User says: "Convert entire project to JSON" or "Generate batch JSON"
    ↓
generator detects a large codebase (>50 files)
    ↓
Automatically splits into batches:
  ├─ Batch 1: Configuration files (*.json, *.yaml, *.toml)
  ├─ Batch 2: Frontend code (src/*.vue, src/*.js)
  └─ Batch 3: Backend code (api/*.py, models/*.py)
    ↓
Each batch is sent to the AI agent sequentially
```
#### Scenario 3: Incremental Update

```
User says: "只更新改动的文件" or "Sync only changed files"
    ↓
generator uses the sync command with a specific commit
  ├─ Get changed files from the commit
  └─ Generate JSON for only the modified files
    ↓
Send to the AI agent
```
## 🔄 Two-Skill Collaboration Architecture

### Overview

This skill (generator) works together with miaoda-app-builder in a two-step workflow:
```
┌─────────────────────┐          ┌──────────────────────┐
│   repo-json-        │   JSON   │    miaoda-app-       │
│   generator         │ ───────> │    builder           │
│                     │   Code   │                      │
│  1. Fetch from      │   Data   │  3. Update Code      │
│     Git             │          │     via Chat API     │
│  2. Generate JSON   │          │  4. Create/Overwrite │
│     Instructions    │          │     Files            │
└─────────────────────┘          └──────────────────────┘
```
### Why Two Skills?
| Aspect | Using Only miaoda-app-builder | Two-Skill Collaboration |
|---|---|---|
| Accuracy | 70-80% (natural language) | 90-95% (structured JSON) |
| File Completeness | May miss files | Guaranteed by JSON structure |
| Control | Hard to verify | Easy to validate before sync |
| Batch Processing | Difficult | Built-in support |
| Best For | Small edits, UI tweaks | Full sync, large updates |
### Collaboration Workflow

#### Standard Flow
```
User Request
    ↓
"Sync code from GitHub" / "Update with latest code"
    ↓
Step 1: generator
  ├─ Clone repository from Git
  ├─ Read all code files
  ├─ Generate structured JSON instructions
  └─ Output: JSON data with file contents
    ↓
Step 2: miaoda-app-builder
  ├─ Receive JSON instructions via chat
  ├─ Parse file list and contents
  ├─ Create/overwrite each file
  └─ Output: Updated file list for verification
    ↓
Complete! Code synced to the Miaoda platform
```
### Trigger Scenarios

#### Scenario 1: Direct Sync Command

```
User says: "用秒哒更新代码" or "Sync code from GitHub"
    ↓
generator is triggered
    ↓
Generates structured JSON template
    ↓
JSON is passed to miaoda-app-builder for execution
```
#### Scenario 2: Large Codebase - Batch Processing

```
User says: "同步整个项目代码" or "Update entire project"
    ↓
generator detects a large codebase (>50 files)
    ↓
Automatically splits into batches:
  ├─ Batch 1: Configuration files (*.json, *.yaml, *.toml)
  ├─ Batch 2: Frontend code (src/*.vue, src/*.js)
  └─ Batch 3: Backend code (api/*.py, models/*.py)
    ↓
Each batch is sent to miaoda-app-builder sequentially
```
#### Scenario 3: Incremental Update

```
User says: "只更新改动的文件" or "Sync only changed files"
    ↓
generator uses the sync command with a commit filter
  ├─ Find the specific commit with changed files
  └─ Generate JSON for only those files
    ↓
Send to miaoda-app-builder
```
## Structured JSON Template Format

### Standard Template

When generator processes code from a Git repository, it generates the following JSON structure:
```json
{
  "action": "CREATE_OR_UPDATE_FILES",
  "description": "Please create or update all files in the project according to the following JSON data",
  "source": {
    "repository": "https://github.com/username/repo",
    "branch": "main",
    "commit": "abc123def456"
  },
  "summary": {
    "files_changed": 3,
    "total_additions": 131,
    "total_deletions": 98,
    "files": [
      {
        "path": "src/file.ts",
        "status": "modified",
        "additions": 10,
        "deletions": 5
      }
    ]
  },
  "rules": [
    "1. MUST update ALL files in the files array - no exceptions",
    "2. MUST copy content EXACTLY as-is - no modifications, no reformatting, no 'improvements'",
    "3. MUST preserve ALL whitespace, newlines, indentation, and comments exactly",
    "4. MUST use EXACT file paths from path field - no path transformation",
    "5. If file exists: overwrite completely (NOT merge)",
    "6. If file does not exist: create it",
    "7. After updating: output complete list of all updated files (format: 1. /path/to/file)",
    "8. DO NOT ask for confirmation - just execute",
    "9. DO NOT use Git commands - file operations only",
    "10. DO NOT skip any file for any reason"
  ],
  "files": [
    {
      "path": "package.json",
      "status": "modified",
      "additions": 2,
      "deletions": 1,
      "changes": [
        { "type": "addition", "line": 5, "content": "..." }
      ],
      "action": "CREATE_OR_OVERWRITE",
      "content": "{\n  \"name\": \"my-app\",\n  ...\n}"
    },
    {
      "path": "src/App.vue",
      "status": "added",
      "additions": 50,
      "deletions": 0,
      "changes": [
        { "type": "addition", "line": 50, "content": "..." }
      ],
      "action": "CREATE_OR_OVERWRITE",
      "content": "<template>\n  ...\n</template>"
    }
  ]
}
```
### Template Components

| Field | Type | Description |
|---|---|---|
| `action` | String | Always `"CREATE_OR_UPDATE_FILES"` |
| `description` | String | Human-readable instruction |
| `source` | Object | Git source information for traceability |
| `source.repository` | String | Repository URL |
| `source.branch` | String | Git branch name |
| `source.commit` | String | Commit hash |
| `summary` | Object | Change statistics and file list |
| `summary.files_changed` | Number | Total number of changed files |
| `summary.total_additions` | Number | Total lines added |
| `summary.total_deletions` | Number | Total lines deleted |
| `summary.files` | Array | List of changed files with stats |
| `rules` | Array | Execution rules the AI agent must follow |
| `files` | Array | List of files to update |
| `files[].path` | String | Relative file path |
| `files[].status` | String | File status: `"added"`, `"modified"`, or `"deleted"` |
| `files[].additions` | Number | Lines added in this file |
| `files[].deletions` | Number | Lines deleted in this file |
| `files[].changes` | Array | Detailed diff information (optional) |
| `files[].action` | String | Always `"CREATE_OR_OVERWRITE"` |
| `files[].content` | String | Complete file content |
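Because one advantage of structured JSON is that it is easy to validate before processing, a consumer can sanity-check a template before executing it. A minimal sketch, using the field names from the template above (the function name is illustrative, not part of the tool):

```python
import json

REQUIRED_TOP_LEVEL = ("action", "source", "summary", "rules", "files")
REQUIRED_FILE_FIELDS = ("path", "status", "action", "content")

def validate_template(raw: str) -> list:
    """Return a list of problems found in a generated JSON template."""
    problems = []
    data = json.loads(raw)
    for key in REQUIRED_TOP_LEVEL:
        if key not in data:
            problems.append(f"missing top-level field: {key}")
    files = data.get("files", [])
    declared = data.get("summary", {}).get("files_changed")
    if declared is not None and declared != len(files):
        problems.append(f"summary says {declared} files, payload has {len(files)}")
    for i, f in enumerate(files):
        for field in REQUIRED_FILE_FIELDS:
            if field not in f:
                problems.append(f"files[{i}] missing field: {field}")
    return problems
```

An empty list means the template is structurally sound and safe to hand to a downstream agent.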
### Batch Template Example

For large projects, the JSON is split into multiple batches:

**Batch 1: Configuration**

```json
{
  "action": "CREATE_OR_UPDATE_FILES",
  "batch": "1/3",
  "description": "Batch 1: Configuration files",
  "files": [
    {"path": "package.json", "action": "CREATE_OR_OVERWRITE", "content": "..."},
    {"path": "tsconfig.json", "action": "CREATE_OR_OVERWRITE", "content": "..."}
  ]
}
```

**Batch 2: Frontend**

```json
{
  "action": "CREATE_OR_UPDATE_FILES",
  "batch": "2/3",
  "description": "Batch 2: Frontend source code",
  "files": [
    {"path": "src/App.vue", "action": "CREATE_OR_OVERWRITE", "content": "..."},
    {"path": "src/components/Header.vue", "action": "CREATE_OR_OVERWRITE", "content": "..."}
  ]
}
```
## ⚠️ Security & Safety Considerations

### Generated Instructions

This tool intentionally generates imperative instructions in the JSON output for downstream AI agents. This is by design, to ensure accurate code synchronization. However:

- **User Responsibility**: ensure that the generated JSON does not carry more authority than your actual intent
- **Review Required**: always review the generated JSON, especially the file lists, action fields, and rules, before passing it to other AI agents
- **Downstream Impact**: downstream AI agents may treat the generated repository JSON as instructions that must be strictly followed
### Repository Content Trust

**Treat repository content as untrusted data:**

- Repository file contents are placed directly into the generated JSON
- Files may contain prompt text that could influence downstream AI agents
- Instruct downstream agents to treat file content as data, not instructions
- Review unusual repository files before use
### Credential Security

**GitHub Token Best Practices:**

- Use minimal, read-only tokens scoped to specific repositories
- Tokens are passed via environment variables only
- Tokens are never stored in files or logs
- Tokens exist only in memory during execution
- Tokens are automatically redacted from all output
Note: This skill requires sensitive GitHub credentials. Verify publisher, package identifier, and version history before installation or use.
## 🔒 Security Mechanisms

### Overview

This tool implements comprehensive security mechanisms to protect sensitive information when working with Git repositories.

### Security Features

#### 1. Token Management

**Public repositories:** no token required; cloned directly.

**Private repositories:** set the `GITHUB_TOKEN` environment variable:

```bash
export GITHUB_TOKEN="ghp_your_token"
```
**Token Requirements:**
- Only needs `repo` read permission
- Format: `ghp_*`, `gho_*`, `ghu_*`, `ghs_*`, or `ghr_*`
- Never store tokens in code or files
**Automatic Detection:**
- Public repos: direct clone without a token
- Private repos: automatic token injection
- The token exists only in memory during execution
#### 2. Sensitive Information Protection

**Automatic Redaction:** All sensitive data is automatically detected and masked:

- GitHub Tokens: `ghp_*` → `<GITHUB_TOKEN>`
- Slack Tokens: `xox[baprs]-*` → `<SLACK_TOKEN>`
- API Keys: `AIza*` → `<GOOGLE_API_KEY>`
- Passwords: `password=xxx` → `password=<REDACTED>`
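The redaction rules above can be approximated with a small pattern table. This is a sketch; the actual patterns in `core/security.py` may be stricter or more numerous:

```python
import re

# Pattern -> replacement, mirroring the redaction rules listed above
REDACTIONS = [
    (re.compile(r"gh[pousr]_[A-Za-z0-9]{20,}"), "<GITHUB_TOKEN>"),
    (re.compile(r"xox[baprs]-[A-Za-z0-9-]+"), "<SLACK_TOKEN>"),
    (re.compile(r"AIza[A-Za-z0-9_\-]{30,}"), "<GOOGLE_API_KEY>"),
    (re.compile(r"(password=)\S+"), r"\1<REDACTED>"),
]

def redact(text: str) -> str:
    """Mask known secret formats before text reaches logs or JSON output."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Running every log line and error message through a function like this is what makes the "automatically redacted from all output" guarantee enforceable.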
**URL Credential Removal:**

```
Input:  https://x-access-token:ghp_abc123@github.com/user/repo.git
Output: https://github.com/user/repo.git
```

Applied to: JSON output, terminal summaries, logs, and error messages.
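Stripping embedded credentials from a clone URL can be done with the standard library alone. A minimal sketch of the behaviour shown above (the function name is illustrative):

```python
from urllib.parse import urlsplit, urlunsplit

def strip_url_credentials(url: str) -> str:
    """Remove the user:token@ portion from a URL, keeping everything else."""
    parts = urlsplit(url)
    if "@" not in parts.netloc:
        return url
    host = parts.netloc.rsplit("@", 1)[1]  # drop the userinfo portion
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```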
#### 3. Secure Git Operations

**Non-Interactive Mode:**

```bash
GIT_ASKPASS=echo        # Prevent password prompts
GIT_TERMINAL_PROMPT=0   # Disable terminal prompts
```

**Token Security:**
- ✅ Token passed via URL (not in command-line arguments)
- ✅ Not visible in the process list
- ✅ Exists only in memory during execution
#### 4. Temporary File Security

**Auto-Cleanup:**
- All temporary directories are deleted after execution
- No sensitive data remains on disk
- UUID-based unique names prevent conflicts

**Temporary Locations (Cross-Platform):**
| Platform | Location | Example |
|---|---|---|
| macOS/Linux | `/tmp/github-<uuid>/` | `/tmp/github-a1b2c3d4/` |
| Windows | `%TEMP%\github-<uuid>\` | `C:\Users\<user>\AppData\Local\Temp\github-a1b2c3d4\` |
**Cleanup Mechanisms:**
- Context manager (`with temp_directory()`)
- `atexit` handler for guaranteed cleanup
- Signal handlers (`SIGTERM`, `SIGINT`)
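The three cleanup layers above can be sketched roughly as follows. This is a hypothetical implementation; the real `core/temp_manager.py` may differ in names and details:

```python
import atexit
import shutil
import signal
import sys
import tempfile
import uuid
from contextlib import contextmanager
from pathlib import Path

_active_dirs = set()

def _cleanup_all(*_args):
    """Remove every temp directory still registered as active."""
    for path in list(_active_dirs):
        shutil.rmtree(path, ignore_errors=True)
        _active_dirs.discard(path)

# Layer 2 and 3: guaranteed cleanup on normal exit and on SIGTERM/SIGINT
atexit.register(_cleanup_all)
for sig in (signal.SIGTERM, signal.SIGINT):
    signal.signal(sig, lambda s, f: (_cleanup_all(), sys.exit(1)))

@contextmanager
def temp_directory(prefix="github-"):
    """Layer 1: create <tempdir>/github-<uuid>/ and remove it afterwards."""
    path = Path(tempfile.gettempdir()) / f"{prefix}{uuid.uuid4().hex[:8]}"
    path.mkdir(parents=True)
    _active_dirs.add(path)
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)
        _active_dirs.discard(path)
```

The context manager covers the normal path, while the `atexit` and signal handlers catch crashes and interruptions.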
#### 5. Secure Logging & Errors

All log output and error messages are automatically sanitized:

```python
# All sensitive data is automatically redacted before logging
logger.info("Cloning with token: %s", token)  # Logged as: <GITHUB_TOKEN>
```
### Security Best Practices

**✅ DO:**
- Use environment variables for tokens
- Use minimal permissions (read-only)
- Rotate tokens regularly
- Verify that output doesn't contain tokens
- Use `.gitignore` for sensitive files

**❌ DON'T:**
- Hardcode tokens in code
- Log token values
- Store tokens in plain-text files
- Commit tokens to Git
### Troubleshooting

#### Token Not Working

```bash
# 1. Verify the token is set
echo $GITHUB_TOKEN

# 2. Check permissions (GitHub → Settings → Developer settings)

# 3. Test manually
git ls-remote https://x-access-token:$GITHUB_TOKEN@github.com/user/repo.git
```

#### Permission Denied

```bash
# Ensure non-interactive mode
export GIT_TERMINAL_PROMPT=0
```
## 📁 Temporary Clone Locations

When this tool runs, it temporarily clones repositories in order to process code. All directories are automatically cleaned up after execution.

### Clone Paths (Cross-Platform)

The tool uses the system temporary directory via `tempfile.gettempdir()`:
| Platform | Location | Example |
|---|---|---|
| macOS/Linux | `/tmp/github-<uuid>/` | `/tmp/github-a1b2c3d4/` |
| Windows | `%TEMP%\github-<uuid>\` | `C:\Users\<user>\AppData\Local\Temp\github-a1b2c3d4\` |
### Quick Notes

- ✅ **Auto-cleanup**: all temporary directories are removed after the script finishes
- ✅ **Unique names**: UUID-based directory names prevent conflicts
- ✅ **Security**: no sensitive data remains on disk after execution
- ✅ **Guaranteed cleanup**: uses a context manager, an atexit handler, and signal handlers
### Cleanup Mechanisms

- **Context Manager**: `with temp_directory()` ensures cleanup on exit
- **Atexit Handler**: registered to clean up on normal program exit
- **Signal Handlers**: catches `SIGTERM` and `SIGINT` for cleanup on interruption
## How to Trigger Code Generation

### User Commands

When users say any of the following, trigger generator:
#### Chinese Commands

- "生成 JSON 指令" (Generate JSON instructions)
- "转换代码为 JSON" (Convert code to JSON)
- "从 Git 仓库生成指令" (Generate instructions from a Git repository)
- "批量生成代码指令" (Batch-generate code instructions)
- "导出代码到 JSON" (Export code to JSON)
- "查看 commit 信息" (View commit info)
- "获取代码变更" (Get code changes)
#### English Commands
- "Generate JSON instructions"
- "Convert code to JSON"
- "Generate from Git repository"
- "Batch generate code instructions"
- "Export code to JSON"
- "View commit info"
- "Get code changes"
### Execution Flow

#### Step 0: Read Documentation & Check Help

**IMPORTANT:** Before using this tool, AI agents MUST read the documentation and check the available options:

```bash
# Step 1: Read this SKILL.md file to understand the tool's capabilities
# Step 2: Check the command-line help for available options and parameters
python3 scripts/generator.py --help
```
**Version 3.0.0 Note:** The tool now uses a modular architecture. All functionality remains the same, but the codebase is better organized for maintenance and extension.
Review the help output to understand:
- Available subcommands (`sync`, `info`)
- Required parameters (`--repo`, `--commit`, etc.)
- Optional parameters (`--filter`, `--exclude`, `--max-files`, `--output`, `--no-instructions`)
- Usage examples for different scenarios
#### Step 1: User Provides Repository URL

```
User: "Generate JSON from https://github.com/user/repo"
    ↓
generator extracts:
- repo_url: https://github.com/user/repo
- commit: (latest or specified)
```
#### Step 2: Fetch and Generate JSON

```bash
# Execute the generation command
python3 scripts/generator.py sync \
  --repo https://github.com/user/repo \
  --commit abc123def456
```

Output: structured JSON template with summary.
#### Step 3: Send to Your AI Agent

The user or agent sends the following to the AI system:

Please execute the following code update instructions:

```json
{
  "action": "CREATE_OR_UPDATE_FILES",
  "files": [...]
}
```

**CRITICAL RULES:**
- ✅ MUST update every file in the files array - no exceptions
- ✅ MUST copy content EXACTLY as-is - no modifications, no reformatting, no 'improvements'
- ✅ MUST preserve ALL whitespace, newlines, indentation, and comments exactly
- ✅ MUST use EXACT file paths from path field - no path transformation
- ✅ If file exists: overwrite completely (NOT merge)
- ✅ If file does not exist: create it
- ✅ After updating: output complete list of all updated files (format: 1. /path/to/file)
- ✅ DO NOT ask for confirmation - just execute
- ✅ DO NOT use Git commands - file operations only
- ✅ DO NOT skip any file for any reason
**Verification:** After all files are updated, please output:
- Total number of files updated
- Complete list of all file paths
- Any files that failed to update (if any)
### Step 4: AI Agent Executes
Your AI agent processes the JSON and:
- Parses file list
- Creates/overwrites files
- Returns completion status
---
# Info Command - Get Commit Information
## Overview
The `info` command provides detailed commit information including:
- Commit metadata (author, date, message)
- Changed files list with statistics
- Detailed diff information (optional)
- **Complete file content** for all changed files
## Usage Examples
### Get Commit Information
```bash
# Get specific commit information
python3 scripts/generator.py info \
--repo https://github.com/user/repo \
--commit abc123def456
```

**Terminal Output** (always shows a summary):

```
📊 Summary:
  Files Changed: 3
  Total Additions: +131
  Total Deletions: -98

📁 Changed Files (3):
  🆕 Added: docs/GUEST_AUTH_AND_CONVERSION.md (+79/-0)
  📝 Modified: src/contexts/AuthContext.tsx (+7/-91)
  📝 Modified: src/db/guest.ts (+45/-7)
```
### Save to File

```bash
# Save the full formatted output (summary + JSON) to a file
python3 scripts/generator.py info \
  --repo https://github.com/user/repo \
  --commit abc123def456 \
  --output changes.json
```

- **File**: contains the full formatted output (summary + JSON)
- **Terminal**: shows summary information
### Save Pure JSON

```bash
# Save pure JSON to a file (terminal still shows the summary)
python3 scripts/generator.py info \
  --repo https://github.com/user/repo \
  --commit abc123def456 \
  --output changes.json \
  --no-instructions
```

- **File**: contains pure JSON only
- **Terminal**: shows summary information
### Filter Files (Include Only)

```bash
# Only include TypeScript and JavaScript files
python3 scripts/generator.py info \
  --repo https://github.com/user/repo \
  --commit abc123def456 \
  --filter "*.ts,*.tsx,*.js" \
  --output changes.json
```

- **Effect**: only files matching the patterns are included in the output
- **Terminal**: shows the filtered file count and list
### Exclude Files

```bash
# Exclude documentation and test files
python3 scripts/generator.py info \
  --repo https://github.com/user/repo \
  --commit abc123def456 \
  --exclude "*.md,*.txt,**/test/**,**/spec/**" \
  --output changes.json
```

- **Effect**: files matching the patterns are excluded from the output
- **Terminal**: shows a ⏭️ indicator for filtered-out files
### Combine Include and Exclude Filters

```bash
# Include Python files but exclude test files
python3 scripts/generator.py info \
  --repo https://github.com/user/repo \
  --commit abc123def456 \
  --filter "*.py" \
  --exclude "*.test.py,*.spec.py" \
  --output changes.json
```

- **Effect**: applies the include filter first, then the exclude filter
- **Use Case**: focus on source code while excluding tests, mocks, etc.
### JSON Structure

The `info` command generates a comprehensive JSON structure:

```json
{
  "action": "CREATE_OR_UPDATE_FILES",
  "description": "Please create or update all files...",
  "source": {
    "repository": "https://github.com/user/repo",
    "branch": "main",
    "commit": "abc123def456"
  },
  "summary": {
    "files_changed": 3,
    "total_additions": 131,
    "total_deletions": 98,
    "files": [
      {
        "path": "src/file.ts",
        "status": "modified",
        "additions": 10,
        "deletions": 5
      }
    ]
  },
  "rules": [...],
  "files": [
    {
      "path": "src/file.ts",
      "status": "modified",
      "additions": 10,
      "deletions": 5,
      "changes": [
        { "type": "deletion", "line": 10, "content": "old code" },
        { "type": "addition", "line": 10, "content": "new code" }
      ],
      "content": "// Complete file content here..."
    }
  ]
}
```
### Output Behavior Summary

| Scenario | Terminal | File |
|---|---|---|
| No `--output` | Shows summary | Not saved |
| With `--output` | Shows summary | Full formatted (summary + JSON) |
| `--output` + `--no-instructions` | Shows summary | Pure JSON only |

**Key Point:** The terminal always displays summary information in all scenarios.
### Filter Syntax

The `--filter` and `--exclude` parameters support several pattern formats:

| Pattern | Description | Example |
|---|---|---|
| `*.ext` | Match by extension | `*.py`, `*.js`, `*.md` |
| `path/*` | Match all files in a directory | `src/*`, `docs/*` |
| `path/*.ext` | Match a specific extension in a directory | `src/*.py`, `test/*.js` |
| `p1,p2,...` | Multiple patterns, comma-separated | `*.py,*.js,*.ts` |
**Filtering Priority:**
1. The include filter (`--filter`) is applied first
2. The exclude filter (`--exclude`) is applied second
3. If both are specified, a file must match the include pattern AND not match the exclude pattern
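The include-then-exclude priority can be reproduced with `fnmatch` from the standard library. A sketch, assuming simple glob patterns (the tool's actual matcher may handle `**` globs differently):

```python
from fnmatch import fnmatch

def apply_filters(paths, include=None, exclude=None):
    """Keep paths matching any include pattern, then drop any matching an exclude pattern."""
    inc = include.split(",") if include else None
    exc = exclude.split(",") if exclude else []
    kept = []
    for path in paths:
        if inc is not None and not any(fnmatch(path, p) for p in inc):
            continue  # include filter applied first
        if any(fnmatch(path, p) for p in exc):
            continue  # exclude filter applied second
        kept.append(path)
    return kept
```

For example, `apply_filters(files, include="*.py", exclude="*.test.py")` keeps Python sources while dropping test files, matching the combined-filter example above.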
## How to Trigger Code Sync

### User Commands

When users say any of the following, trigger generator:
#### Chinese Commands

- "用秒哒更新代码" (Update code with Miaoda)
- "同步 GitHub 代码" (Sync GitHub code)
- "从 GitHub 拉取代码" (Pull code from GitHub)
- "更新项目代码" (Update project code)
- "同步整个项目" (Sync the entire project)
- "只更新改动的文件" (Update only changed files)
- "批量同步代码" (Batch-sync code)
#### English Commands
- "Sync code from GitHub"
- "Update with latest code"
- "Pull code from GitHub"
- "Sync my repository"
- "Update entire project"
- "Sync only changed files"
- "Batch sync code"
### Execution Flow

#### Step 1: User Provides Repository URL

```
User: "用秒哒更新代码,仓库地址是 https://github.com/user/repo"
      (Update code with Miaoda; the repository URL is https://github.com/user/repo)
    ↓
generator extracts:
- repo_url: https://github.com/user/repo
- app_id: (from current context)
- context_id: (from current context)
```
#### Step 2: Fetch and Generate JSON

```bash
# Execute the generation command
python3 scripts/generator.py sync \
  --repo https://github.com/user/repo \
  --commit abc123def456
```

Output: structured JSON template.
#### Step 3: Send to miaoda-app-builder

The user or agent sends the following to the miaoda-app-builder chat:

Please execute the following code update instructions:

```json
{
  "action": "UPDATE_ALL_FILES",
  "files": [...]
}
```

**Important rules:**
- Update every file in the `files` array
- Match content exactly as specified in the `content` field
- Do not modify, alter, or skip any code
- Output a complete list of all updated files when done
### Step 4: miaoda-app-builder Executes

```bash
# miaoda-app-builder processes the instructions via the chat API
python3 scripts/miaoda_api.py chat \
  --text "<JSON instructions from Step 3>" \
  --app-id <app_id> \
  --context-id <context_id>
```
## Batch Processing Strategy

### Automatic Splitting

When the codebase exceeds the thresholds below, generator automatically suggests batch processing:

#### Split Criteria
| Condition | Action |
|---|---|
| Files > 50 | Recommend splitting |
| Total size > 5MB | Recommend splitting |
| Mixed file types | Split by category |
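The split criteria above can be expressed as a small decision helper. The thresholds come from the table; the function names are illustrative, not part of the tool:

```python
MAX_FILES = 50
MAX_TOTAL_BYTES = 5 * 1024 * 1024  # 5MB

def should_split(file_sizes):
    """file_sizes: dict of path -> size in bytes. True when batching is recommended."""
    return len(file_sizes) > MAX_FILES or sum(file_sizes.values()) > MAX_TOTAL_BYTES

def chunk(paths, batch_size=MAX_FILES):
    """Split an ordered file list into fixed-size batches."""
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]
```

In practice the tool groups by file category (config, frontend, backend) rather than plain slices, as the priority lists below show.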
### Recommended Batch Categories

**Priority 1: Configuration Files** (must be synced first)

```bash
python3 scripts/generator.py sync \
  --repo <repo_url> \
  --filter "*.json,*.yaml,*.yml,*.toml,*.env,package.json,requirements.txt" \
  --max-files 20 \
  --output batch1_config.json
```

**Priority 2: Frontend Code**

```bash
python3 scripts/generator.py sync \
  --repo <repo_url> \
  --filter "src/*.vue,src/*.js,src/*.jsx,src/*.ts,src/*.tsx,src/*.css,src/*.scss,src/*.html" \
  --max-files 30 \
  --output batch2_frontend.json
```

**Priority 3: Backend Code**

```bash
python3 scripts/generator.py sync \
  --repo <repo_url> \
  --filter "api/*.py,models/*.py,controllers/*.py,services/*.py,utils/*.py" \
  --max-files 30 \
  --output batch3_backend.json
```

**Priority 4: Documentation & Others**

```bash
python3 scripts/generator.py sync \
  --repo <repo_url> \
  --filter "*.md,*.txt,README*,docs/*" \
  --max-files 10 \
  --output batch4_docs.json
```
### Batch Execution Order

1. Send Batch 1 to the AI agent
2. Wait for completion and verify the file list
3. Send Batch 2 to the AI agent
4. Repeat until all batches are complete
5. Final verification: check that all files are synced
## Workflow Examples

### Example 1: Simple JSON Generation (Public Repository)

```bash
# Step 0: Check the help to understand the available options
python3 scripts/generator.py --help

# Step 1: Use generator to fetch the code
# (public repositories need no token; for private repos, set GITHUB_TOKEN first)
python3 scripts/generator.py sync \
  --repo https://github.com/username/my-project \
  --commit abc123def456

# Step 2: Copy the JSON output
# [Output contains the structured JSON template]

# Step 3: Send the JSON to your AI agent / automation tool
```
### Example 2: Large Project with Batches

```bash
# Step 0: Check the help first
python3 scripts/generator.py --help

# Batch 1: Configuration
python3 scripts/generator.py sync \
  --repo https://github.com/username/large-project \
  --filter "*.json,*.yaml,*.toml,*.env" \
  --max-files 20 \
  --output batch1.json

# Send to the AI agent
# Process batch1.json with your AI system

# Wait for completion, then Batch 2: Frontend
python3 scripts/generator.py sync \
  --repo https://github.com/username/large-project \
  --filter "src/*.vue,src/*.js,src/*.css" \
  --max-files 30 \
  --output batch2.json

# Send to the AI agent
# Process batch2.json with your AI system

# Batch 3: Backend
python3 scripts/generator.py sync \
  --repo https://github.com/username/large-project \
  --filter "api/*.py,models/*.py" \
  --max-files 30 \
  --output batch3.json

# Send to the AI agent
# Process batch3.json with your AI system
```
## AI Agent Integration Guide

### Message Format for AI Agents

When sending JSON to an AI agent, use this format:

Please execute the following code update instructions:

```json
{
  "action": "UPDATE_ALL_FILES",
  "files": [...]
}
```

**CRITICAL RULES:**
- ✅ You must update every file in the `files` array
- ✅ File content must match the `content` field exactly
- ✅ Do not modify, alter, optimize, or reformat any code
- ✅ Create file if it doesn't exist, completely overwrite if it exists
- ✅ Do not skip any file from the list
- ✅ Preserve the exact file structure and paths
- ✅ After updating, output a complete list of all updated files for verification
**Verification:** After all files are updated, please output:
- Total number of files updated
- Complete list of all file paths
- Any files that failed to update (if any)
## Verification After Processing
After AI agent completes:
1. **Check file count**: Compare with original JSON `files.length`
2. **Verify file list**: All paths should match
3. **Review application**: Test the updated code
4. **Test functionality**: Run key features
5. **Deploy if successful**: Follow your deployment process
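The first two verification steps can be automated. A minimal sketch comparing the generated JSON against the agent's reported file list (the function name is illustrative):

```python
import json

def verify_sync(template_path, reported_paths):
    """Compare the paths in the generated JSON with the agent-reported list."""
    with open(template_path) as fh:
        expected = {f["path"] for f in json.load(fh)["files"]}
    reported = set(reported_paths)
    return {
        "missing": sorted(expected - reported),    # files the agent never wrote
        "unexpected": sorted(reported - expected)  # files outside the template
    }
```

Both lists empty means the file count and file list checks pass, and you can move on to testing the application itself.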
---
# Error Handling & Optimization
## Common Issues & Solutions
| Issue | Cause | Solution |
|-------|-------|----------|
| Files missing after sync | AI skipped some files | Use JSON template with strict rules |
| Code modified/altered | AI tried to "improve" code | Emphasize "DO NOT MODIFY" in rules |
| Sync incomplete | Too many files at once | Use batch processing |
| Token limit exceeded | JSON too large | Split into smaller batches |
| Private repo access denied | Missing token | Provide GITHUB_TOKEN |
## Optimization Strategies
### 1. Prioritize Critical Files
```bash
# Sync config files first (affects entire app)
--filter "package.json,requirements.txt,*.yaml,*.toml"
```

### 2. Use Commit Hashes for Reproducibility

```bash
# Pin to a specific commit
--commit abc123def456
```

### 3. Exclude Unnecessary Files

```bash
# Only sync source code; skip docs and tests
--filter "src/*,api/*,models/*"
```

### 4. Parallel Batch Preparation (Advanced)

For independent batches, you can prepare all the JSON files first, then send them sequentially:

```bash
# Prepare all batches
python3 scripts/generator.py sync --repo <url> --filter "*.json" --output batch1.json
python3 scripts/generator.py sync --repo <url> --filter "src/*.vue" --output batch2.json
python3 scripts/generator.py sync --repo <url> --filter "api/*.py" --output batch3.json

# Send them to the AI agent one by one
# (wait for each batch to complete before sending the next)
```
## Limitations & Workarounds

### Current Constraints

- **File Limit**: fewer than 50 files per batch recommended (AI processing limits)
- **File Size**: individual files over 100KB may cause issues
- **Binary Files**: not supported (images, fonts, executables)
- **No Direct Upload**: must go through the miaoda-app-builder chat API
- **AI Accuracy**: ~90-95% with JSON instructions (vs. 70-80% with natural language)
### Best Practices

**✅ DO:**
- Use structured JSON templates (this skill's output)
- Batch large projects by file type
- Verify the file count after each sync
- Use specific commit hashes for reproducibility
- Sync configuration files first

**❌ DON'T:**
- Send more than 50 files in one batch
- Include binary files in a sync
- Skip the verification step
- Use natural language for code updates (use JSON instead)
- Modify JSON content before sending it to miaoda-app-builder