Ecosystem Discovery
Estimated Time: 10-30 minutes (depending on ecosystem size and GitHub search)
Prerequisites: A starting repo with real code (not empty scaffolding)
Output: ecosystem-map.md in the starting repo's .stackshift/ directory; .stackshift-batch-session.json in the starting repo directory for handoff
All path variables MUST be double-quoted in shell commands. This skill is single-session with no resume capability -- if interrupted, re-run from Step 1.
When to Use This Skill
Activate when:
- The user has one repo and wants to find everything it connects to
- A large-scale reverse-engineering project needs repo enumeration
- The user wants to map an entire platform before running batch analysis
- The dependency graph between multiple repos/services is unknown
Trigger Phrases:
- "Discover the ecosystem for this repo"
- "What other repos does this project depend on?"
- "Map all the related services"
- "Find all the repos in this platform"
- "What's connected to this service?"
Process
Step 1: Pre-flight
Verify the starting repo exists and detect basic characteristics:
    # Verify we're in a repo with code
    if [ ! -d ".git" ] && [ ! -f "package.json" ] && [ ! -f "go.mod" ] && [ ! -f "requirements.txt" ]; then
      echo "WARNING: This doesn't look like a code repository"
    fi

    # Detect if monorepo
    MONOREPO="false"
    if [ -f "pnpm-workspace.yaml" ] || [ -f "turbo.json" ] || [ -f "nx.json" ] || [ -f "lerna.json" ]; then
      MONOREPO="true"
    fi

    # Get repo name
    REPO_NAME=$(basename "$(pwd)")

    # Auto-discover GitHub org from git remote.
    # The capture group takes the first path segment after the host, which covers
    # both SSH (git@github.com:org/repo.git) and HTTPS (https://github.com/org/repo) remotes.
    REMOTE_URL=$(git remote get-url origin 2>/dev/null || echo "")
    GITHUB_ORG=""
    if [[ "$REMOTE_URL" =~ github\.com[:/]([^/]+)/ ]]; then
      GITHUB_ORG="${BASH_REMATCH[1]}"
      echo "Auto-detected GitHub org: $GITHUB_ORG"
    elif [[ "$REMOTE_URL" =~ gitlab\.com[:/]([^/]+)/ ]]; then
      GITHUB_ORG="${BASH_REMATCH[1]}"
      echo "Auto-detected GitLab group: $GITHUB_ORG"
    fi
Monorepo handling: If workspace config is detected:
- Resolve all workspace globs to actual package directories
- Mark every discovered package as CONFIRMED
- Still scan each package for outbound signals to find external dependencies
- The Mermaid graph shows intra-monorepo dependencies
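The glob-resolution step above can be sketched for a pnpm workspace. This is a minimal sketch, not the skill's actual implementation: the function name `list_workspace_packages` is hypothetical, and it assumes `pnpm-workspace.yaml` contains only a `packages:` list of simple one-level globs such as `packages/*`. Negated patterns are skipped and `**` globs are not handled.

```shell
# Hypothetical helper: print every directory matched by the globs listed in
# pnpm-workspace.yaml that also contains a package.json.
# Runs in a subshell (parentheses body) so the cd does not affect the caller.
list_workspace_packages() (
  cd "$1" || exit 1
  # Extract list items such as "- 'packages/*'" and strip the dash and quotes.
  sed -n "s/^[[:space:]]*-[[:space:]]*['\"]\{0,1\}\([^'\"]*\)['\"]\{0,1\}[[:space:]]*\$/\1/p" pnpm-workspace.yaml |
  while IFS= read -r pattern; do
    case "$pattern" in '!'*) continue ;; esac   # assumption: ignore negations
    # $pattern is intentionally unquoted so the shell expands the glob.
    for dir in $pattern; do
      if [ -f "$dir/package.json" ]; then
        printf '%s\n' "$dir"
      fi
    done
  done
)
```

Each printed directory would then be recorded as a CONFIRMED package before the outbound signal scan runs on it.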
Step 2: User Input
Show the auto-detected org and ask for confirmation:
I auto-detected the GitHub org from your git remote: {GITHUB_ORG}
Is this correct? (Y/n, or enter a different org)
If no org was detected:
I couldn't detect a GitHub org from the git remote. What GitHub org should I search? (optional, press enter to skip)
Ask about known repos:
Do you know of any related repos? (optional)
List paths or org/repo names, one per line:
- ~/git/auth-service
- ~/git/shared-libs
- myorg/inventory-api
- (or press enter to skip)
Mark user-provided repos as CONFIRMED confidence.
Step 3: Scan Starting Repo
Run all 10 signal categories on the starting repo. Follow scan-integration-signals.md for detailed instructions.
Signal categories:
- Scoped npm packages (@org/* in package.json)
- Docker Compose services (docker-compose*.yml)
- Environment variables (.env*, config files)
- API client calls (source code URLs, gRPC protos)
- Shared databases (connection strings, schema refs)
- CI/CD triggers (.github/workflows/*.yml)
- Workspace configs (pnpm-workspace.yaml, turbo.json, nx.json, lerna.json)
- Message queues/events (SQS, SNS, Kafka topic names)
- Infrastructure refs (terraform/, cloudformation/, k8s/)
- Import paths / go.mod / requirements.txt (language-specific deps)
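As an illustration of one category from the list above, scoped npm packages can be found with a plain text match. This is a rough sketch under stated assumptions: the helper name `scoped_packages` is hypothetical, and because it matches text rather than parsing JSON it will also report the repo's own `name` field if that is scoped, so callers may need to filter that out. The authoritative procedure is in scan-integration-signals.md.

```shell
# Hypothetical helper: list the scoped package names for one org that appear
# anywhere in a package.json.
# Usage: scoped_packages myorg path/to/package.json
scoped_packages() {
  grep -o "\"@$1/[^\"]*\"" "$2" | tr -d '"' | sort -u
}
```

For example, `scoped_packages "$GITHUB_ORG" package.json` would print one candidate name per line, ready to be added to the list of discovered names.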
CHECKPOINT -- Report to user before continuing:
Signal scan complete. Found {N} candidate names across {M} signal categories.
Top signals: {list top 3-5 discovered names with their categories}
Proceeding to scan user repos and search GitHub...
If zero signals found, skip to the "Standalone Repo" edge case (see present-ecosystem-map.md Error Cases).
Step 4: Scan User-Provided Repos
For each repo the user listed:
- Verify it exists (local path or clone from GitHub). If the path does not exist, warn the user and skip that repo.
- Run the same 10 signal categories
- Cross-reference signals with the starting repo to build connections
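The cross-referencing step above can be approximated by intersecting two lists of signal names. This sketch assumes each repo's scan results have been written to a text file with one name per line; the file layout and the helper name `shared_signals` are illustrative assumptions, not something the skill prescribes.

```shell
# Hypothetical helper: print the names that appear in both signal lists.
# -F fixed strings, -x whole-line match, -f read patterns from the first file.
shared_signals() {
  grep -Fxf "$1" "$2" | sort -u
}
```

A name that shows up in both lists (a queue, a database, a service hostname) is evidence of a connection between the two repos.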
Step 5: GitHub Search (if org provided)
Follow github-ecosystem-search.md for detailed instructions.
Search the GitHub org for repos matching discovered signal names:
- Package names (@org/shared-utils -> search for shared-utils repo)
- Service names from Docker Compose or env vars
- Repository naming patterns (same prefix, similar conventions)
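Turning signals into search terms, as in the list above, might look like the following. Both helpers are hypothetical, and the env var heuristic (strip a trailing `_URL` or `_HOST`, lowercase, replace underscores with hyphens) is an assumption about naming conventions rather than a rule from github-ecosystem-search.md.

```shell
# Hypothetical helper: strip the npm scope to get a candidate repo name.
repo_candidate() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: guess a service name from an environment variable name.
env_candidate() {
  printf '%s\n' "$1" | sed -e 's/_URL$//' -e 's/_HOST$//' | tr 'A-Z_' 'a-z-'
}
```

With these, `@myorg/shared-utils` yields `shared-utils` and `USER_SERVICE_URL` yields `user-service`.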
Error recovery: If a GitHub API call fails with a transient error (5xx, network timeout), retry up to 2 times with 10-second backoff. If all retries fail, skip GitHub search and note it in the ecosystem map. If rate-limited, skip GitHub search entirely and rely on local results.
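The retry rule above can be sketched as a small wrapper. This is an illustrative sketch only: `with_retry` and the `RETRY_DELAY` variable are hypothetical names, and the wrapper retries on any non-zero exit, so the caller is still responsible for telling transient failures apart from rate limiting (which should not be retried).

```shell
# Hypothetical helper: run a command, retrying up to 2 more times on failure.
# RETRY_DELAY is an assumed knob (default 10 seconds) so the wait can be
# shortened when testing.
with_retry() {
  attempt=0
  until "$@"; do
    attempt=$((attempt + 1))
    if [ "$attempt" -gt 2 ]; then
      return 1   # first try plus 2 retries all failed
    fi
    sleep "${RETRY_DELAY:-10}"
  done
}
```

A failing return value is the cue to skip GitHub search and record the reason in the ecosystem map.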
CHECKPOINT -- Report to user before continuing:
GitHub search complete. Found {N} matching repos ({X} exact name matches, {Y} code references).
Proceeding to local filesystem scan and merge...
If GitHub search was skipped, report:
GitHub search skipped ({reason}). Proceeding with local scan and signal analysis only.
Step 6: Local Filesystem Scan
Search common development directories for matching repos:
    # Common locations to check
    SEARCH_DIRS=(
      "$(dirname "$(pwd)")"  # Sibling directories
      "$HOME/git"
      "$HOME/code"
      "$HOME/src"
      "$HOME/projects"
      "$HOME/repos"
      "$HOME/dev"
      "$HOME/workspace"
    )

    # For each discovered package/service name, look for matching directories
    for name in "${DISCOVERED_NAMES[@]}"; do
      for dir in "${SEARCH_DIRS[@]}"; do
        if [ -d "$dir/$name" ]; then
          echo "FOUND: $dir/$name"
        fi
      done
    done
Step 7: Merge & Deduplicate
Follow merge-and-score.md for detailed instructions on deduplication, confidence scoring formula, and dependency graph construction.
Combine all discovery sources, deduplicate by repo identity, score confidence, and build the dependency graph.
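To make the deduplication idea concrete, here is a simplified sketch. It assumes candidates arrive as `name confidence` lines and treats the name alone as the repo identity, keeping the strongest confidence level seen for each name. The real identity rules and scoring formula live in merge-and-score.md and may differ; `dedupe_repos` is a hypothetical name.

```shell
# Hypothetical helper: read "name confidence" lines on stdin and keep, for
# each name, the entry with the strongest confidence level.
dedupe_repos() {
  awk '
    BEGIN { rank["LOW"] = 1; rank["MEDIUM"] = 2; rank["HIGH"] = 3; rank["CONFIRMED"] = 4 }
    { if (!($1 in best) || rank[$2] > rank[best[$1]]) best[$1] = $2 }
    END { for (name in best) print name, best[name] }
  ' | sort
}
```

A repo found both by GitHub search (HIGH) and in the user's list (CONFIRMED) therefore appears once, as CONFIRMED.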
Step 8: Present Ecosystem Map
Follow present-ecosystem-map.md for detailed instructions.
Generate ecosystem-map.md in .stackshift/ directory. Display the map to the user with a summary:
Found X repos (Y confirmed, Z high confidence, W medium, V low)
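The summary line above could be produced from the merged list as follows. This sketch assumes the same illustrative `name confidence` line format on stdin; `summarize_confidence` is a hypothetical helper, not part of present-ecosystem-map.md.

```shell
# Hypothetical helper: count repos per confidence level and print the summary.
summarize_confidence() {
  awk '
    { count[$2]++; total++ }
    END {
      printf "Found %d repos (%d confirmed, %d high confidence, %d medium, %d low)\n",
        total, count["CONFIRMED"], count["HIGH"], count["MEDIUM"], count["LOW"]
    }
  '
}
```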
Step 9: User Confirmation
Ask the user to review and adjust:
Does this ecosystem map look right?
Options:
A) Looks good -- proceed to handoff
B) Add repos -- I'll add more to the list
C) Remove repos -- Take some off the list
D) Rescan -- Run discovery again with adjustments
If the user adds repos, mark as CONFIRMED and re-merge. If the user removes repos, update the map and graph. If the user requests a rescan, return to Step 3 with adjustments.
Step 10: Handoff
Create .stackshift-batch-session.json in the starting repo directory:
    {
      "sessionId": "discover-{timestamp}",
      "startedAt": "{iso_date}",
      "batchRootDirectory": "{starting_repo_path}",
      "totalRepos": "{length of discoveredRepos array}",
      "batchSize": 5,
      "answers": {},
      "processedRepos": [],
      "discoveredRepos": [
        {
          "name": "{repo_name}",
          "path": "{local_path}",
          "confidence": "CONFIRMED|HIGH|MEDIUM|LOW",
          "signals": ["{signal1}", "{signal2}"]
        }
      ]
    }
totalRepos MUST equal the length of the discoveredRepos array (all confidence levels included).
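The invariant above can be checked mechanically before handoff. This is a minimal sketch that assumes `jq` is installed and that `totalRepos` was written as a JSON number once the placeholder was filled in; `check_total_repos` is a hypothetical name.

```shell
# Hypothetical check: succeeds only when totalRepos equals the number of
# entries in discoveredRepos. jq -e exits non-zero when the result is false.
check_total_repos() {
  jq -e '.totalRepos == (.discoveredRepos | length)' "$1" > /dev/null
}
```

For example, `check_total_repos ".stackshift-batch-session.json" || echo "totalRepos mismatch"` would flag a stale count after repos were added or removed in Step 9.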
Present next steps as model actions:
What would you like to do with these {X} repos?
A) Run /stackshift.batch on all repos
B) Run /stackshift.reimagine
C) Export ecosystem map only
D) Analyze a specific subset
On user choice:
- A) Verify .stackshift-batch-session.json exists in the starting repo directory. Instruct user to run /stackshift.batch.
- B) Note that reimagine needs reverse-engineering docs. Suggest running batch first (Gears 1-2 minimum), or proceed if docs exist.
- C) Confirm map is saved to .stackshift/ecosystem-map.md. Session file preserved for later.
- D) Let user pick repos, update batch session with selected subset, then proceed as A.
10 Signal Categories
| # | Signal Category | Where to Look | Example |
|---|-----------------|---------------|---------|
| 1 | Scoped npm packages | package.json dependencies | @myorg/shared-utils |
| 2 | Docker Compose services | docker-compose*.yml | depends_on: [user-api, redis] |
| 3 | Environment variables | .env*, config files | USER_SERVICE_URL, INVENTORY_API_HOST |
| 4 | API client calls | Source code imports/URLs | fetch('/api/v2/users'), gRPC protos |
| 5 | Shared databases | Connection strings, schema refs | Same DB name in multiple configs |
| 6 | CI/CD triggers | .github/workflows/*.yml | paths:, repository_dispatch, cross-repo triggers |
| 7 | Workspace configs | pnpm-workspace.yaml, turbo.json, nx.json, lerna.json | Monorepo package lists |
| 8 | Message queues/events | Source code, config | SQS queue names, SNS topics, Kafka topics |
| 9 | Infrastructure refs | terraform/, cloudformation/, k8s/ | Shared VPCs, service meshes, ALBs |
| 10 | Import paths / go.mod / requirements.txt | Language-specific dependency files | replace github.com/myorg/shared => ../shared |
For confidence scoring criteria and formulas, see merge-and-score.md.
Edge Cases
Monorepo as Starting Point
When workspace config is detected:
- All packages resolved from workspace globs are CONFIRMED automatically
- Still scan each package for outbound signals (external deps, APIs, databases)
- The ecosystem map shows both intra-monorepo and external dependencies
- The Mermaid graph uses subgraph to group monorepo packages together
- Handoff to batch can process each package as a separate "repo"
Standalone Repo (No Signals Found)
When signal scanning finds zero references to other repos, present options per present-ecosystem-map.md Error Cases. Do not treat this as a failure.
No GitHub Org Detected
Skip GitHub search entirely (Step 5 is skipped). Rely on local filesystem scan and signal analysis only. Report: "GitHub search skipped (no org detected). Results based on local scan only."
GitHub Search Rate Limited or Auth Failed
Fall back to local scan + signal analysis. Note in the ecosystem map: "GitHub search was skipped (rate limited / not authenticated)". For transient errors (5xx, network timeout), retry up to 2 times with 10-second backoff before falling back.
Large Ecosystem (20+ Repos)
- Mermaid graph: show only CONFIRMED + HIGH repos in the main diagram
- Group repos by domain using subgraph if clear clusters exist
- Offer to filter: "Found {N} repos. Analyze all, or filter to HIGH+ confidence?"
- Batch handoff should suggest a conservative batch size (3 at a time)
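The large-ecosystem rules above can be expressed as two small helpers. Both names are hypothetical, and the input format (`name confidence` lines on stdin) is an assumption made for illustration. The default of 5 mirrors the `batchSize` value in the session file template.

```shell
# Hypothetical helper: keep only CONFIRMED and HIGH entries from
# "name confidence" lines on stdin.
filter_high_plus() {
  awk '$2 == "CONFIRMED" || $2 == "HIGH"'
}

# Hypothetical helper: suggest a batch size of 3 for 20 or more repos,
# otherwise the default of 5.
suggest_batch_size() {
  if [ "$1" -ge 20 ]; then
    echo 3
  else
    echo 5
  fi
}
```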
Only LOW Confidence Repos
When all discovered repos (beyond the starting point) are LOW confidence, present review options per present-ecosystem-map.md Error Cases.
Mixed Local/Remote Repos
- Prefer local paths when available (faster to scan)
- Note GitHub-only repos as "remote only" in the ecosystem map
- Ask user: "Some repos are only on GitHub. Clone them locally for analysis?"