# ComfyUI DirectML AMD Skill 🎨

## Description
Skill for setting up and optimizing ComfyUI on AMD GPUs under Windows using DirectML. Provides guides, fixes, and ready-to-use configurations for local image, music, and video generation.
## When to Use This Skill

Use this skill when:

- You have an AMD GPU (RX 7000/6000/5000 series) and want to run ComfyUI
- You're getting errors with the `torch_directml` module
- You need to fix `model_patcher.py` for DirectML compatibility
- You want to know which models are compatible with your VRAM
- You're looking for AMD GPU performance benchmarks
## Installation

```
# 1. Install the skill
clawhub install comfyui-directml-amd

# 2. Navigate to the ComfyUI directory
cd C:\ComfyUI

# 3. Create a Python 3.12 environment (if not already done)
uv venv --python 3.12

# 4. Install DirectML
.venv\Scripts\python.exe -m pip install torch-directml --force-reinstall

# 5. Apply fixes (automated by this skill)
python apply-directml-fixes.py

# 6. Start ComfyUI
.venv\Scripts\python.exe main.py --directml --port 8188
```
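After step 4, you can sanity-check the environment before launching ComfyUI. A minimal probe, assuming only that the `torch-directml` wheel is installed (`device_count()` and `device_name()` are part of its public API):

```python
def directml_status() -> str:
    """Return a human-readable DirectML status string."""
    try:
        import torch_directml  # provided by the torch-directml wheel
    except ImportError:
        return "torch_directml not installed (wrong venv, or Python > 3.12?)"
    n = torch_directml.device_count()
    if n == 0:
        return "torch_directml installed, but no DirectML device found"
    return f"OK: {n} device(s), default = {torch_directml.device_name(0)}"

if __name__ == "__main__":
    print(directml_status())
```

If this prints the "not installed" line from inside `.venv`, re-run step 4; if it reports no device, check that the AMD driver is current.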
## Tools

### apply-directml-fixes.py

Automatically applies all necessary DirectML fixes:

- Patches `comfy\model_patcher.py` (2 locations)
- Creates backups of the original files
- Verifies functionality after patching
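The patch-and-backup flow can be sketched as follows; the replacement strings are placeholders for illustration, not the actual DirectML patch, which lives in `apply-directml-fixes.py`:

```python
import shutil
from pathlib import Path

def patch_file(path: Path, replacements: dict[str, str]) -> bool:
    """Back up `path` to `<path>.bak`, then apply literal string replacements.

    Returns True if the file changed, False if it was already patched.
    """
    text = path.read_text(encoding="utf-8")
    patched = text
    for old, new in replacements.items():
        patched = patched.replace(old, new)
    if patched == text:
        return False  # nothing to do: already patched or pattern not found
    # Backup first, so the original can always be restored.
    shutil.copy2(path, path.with_suffix(path.suffix + ".bak"))
    path.write_text(patched, encoding="utf-8")
    return True
```

Running it twice is safe: the second call finds nothing to replace and returns False without touching the backup.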
### download-models.ps1

Downloads recommended models:

- SDXL Turbo (6.6 GB)
- SDXL Base 1.0 (6.0 GB)
- Juggernaut XL v9 (6.6 GB)
- Optional: Flux.1, ACE-Step, Wan 2.1
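A quick way to catch truncated multi-gigabyte downloads is a size-floor check. The filename and threshold below are illustrative, not the exact artifacts the script fetches:

```python
from pathlib import Path

# Minimum plausible sizes in bytes (illustrative thresholds, not checksums).
EXPECTED_MIN_BYTES = {
    "sdxl_turbo.safetensors": 6 * 1024**3,  # ~6.6 GB model (assumed filename)
}

def looks_complete(path: Path, min_bytes: int) -> bool:
    """Cheap sanity check that a model download was not truncated."""
    return path.is_file() and path.stat().st_size >= min_bytes
```

This only detects gross truncation; for real integrity verification, compare the publisher's SHA-256 checksum.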
### benchmark.py

Tests performance with different models and settings:

- Measures generation time
- Monitors VRAM usage
- Compares against CUDA references
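The timing side of such a benchmark can be sketched like this, where `generate` stands in for whatever callable submits a workflow. The first run is discarded because DirectML compiles kernels lazily, so it is not representative:

```python
import statistics
import time

def benchmark(generate, runs: int = 3) -> dict:
    """Time `generate()` (a zero-arg callable) over several runs.

    The first call is a discarded warm-up (model load, shader/kernel compile).
    """
    generate()  # warm-up: DirectML compiles kernels on first use
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        generate()
        times.append(time.perf_counter() - t0)
    return {
        "mean_s": statistics.mean(times),
        "min_s": min(times),
        "max_s": max(times),
    }
```

Reporting min alongside mean matters here: DirectML run-to-run variance is noticeably higher than on CUDA.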
## Configuration

### Minimum Requirements

- GPU: AMD Radeon with 8 GB+ VRAM
- OS: Windows 10/11 64-bit
- Python: 3.12 (torch-directml has no wheels for 3.13+)
- VRAM: 8 GB minimum, 12 GB+ recommended
### Recommended Settings

```
# For 16 GB VRAM (RX 7900 GRE/XTX)
python main.py --directml --port 8188 --max-memory 16384

# For 12 GB VRAM (RX 7800/6800)
python main.py --directml --port 8188 --max-memory 12288 --use-split-cross-attention

# For 8 GB VRAM (RX 7600/6600)
python main.py --directml --port 8188 --max-memory 8192 --lowvram
```
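The `--max-memory` values above are simply the card's VRAM expressed in MiB:

```python
def max_memory_mib(vram_gb: int) -> int:
    """Convert VRAM in GiB to the MiB value passed to --max-memory."""
    return vram_gb * 1024
```

So 16 GB → 16384, 12 GB → 12288, 8 GB → 8192. If you share the GPU with other applications, passing a value slightly below the physical VRAM leaves headroom for the compositor.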
## Performance Benchmarks (RX 7900 GRE 16GB)
| Model | Resolution | Steps | Time | Quality |
|---|---|---|---|---|
| SDXL Turbo | 1024² | 1 | 30-60s | Good |
| SDXL Base | 1024² | 20 | 5-10 min | Very Good |
| Juggernaut XL | 1024² | 20 | 5-10 min | Excellent |
| Flux.1 schnell | 1024² | 4 | 3-5 min | Best |
Note: DirectML is ~2-3x slower than CUDA on a comparable NVIDIA GPU, but fully functional.
## Common Errors and Solutions

### `ModuleNotFoundError: torch_directml`

Solution: your Python is too new (torch-directml has no wheels for 3.13+) → create the .venv with Python 3.12.

### `SyntaxError` in model_patcher.py

Solution: the skill applies the fixes automatically, or run `apply-directml-fixes.py` manually.

### Port already in use

Solution: ComfyUI is already running → kill the process or use a different port.

### Out of Memory

Solution: reduce the resolution, use FP8 models, or add the `--lowvram` flag.

### ComfyUI crash during generation

Solution: DirectML can be unstable → restart the server and try fewer steps.
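For the port conflict in particular, you can probe 8188 before launching. A small sketch using only the standard library:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on a successful connection (port occupied).
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    if port_in_use(8188):
        print("Port 8188 is busy - stop the old ComfyUI or use --port 8189")
```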
## OpenClaw Integration

The skill is compatible with the OpenClaw comfy plugin:
```json
{
  "plugins": {
    "entries": {
      "comfy": {
        "enabled": true,
        "config": {
          "baseUrl": "http://127.0.0.1:8188",
          "image": {
            "workflowPath": "comfy-workflows/image-generation.json"
          }
        }
      }
    }
  }
}
```
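ComfyUI itself exposes an HTTP API on that `baseUrl`: a workflow in API format is submitted as `{"prompt": ...}` to the `/prompt` endpoint, which is what the plugin drives under the hood. A minimal client sketch (`build_payload` is a helper introduced here for illustration):

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8188"  # matches baseUrl in the plugin config

def build_payload(workflow: dict) -> bytes:
    """Wrap an API-format workflow in the envelope /prompt expects."""
    return json.dumps({"prompt": workflow}).encode("utf-8")

def queue_workflow(workflow: dict, base_url: str = BASE_URL) -> dict:
    """Submit a workflow to ComfyUI; the response includes a prompt id."""
    req = urllib.request.Request(
        f"{base_url}/prompt",
        data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Note that `/prompt` takes the *API* export of a workflow (enable "Dev mode" in ComfyUI and use "Save (API Format)"), not the regular graph JSON.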
## Author
Klepeto 🦞 (vilda)
Tested on: AMD Radeon RX 7900 GRE 16GB, Windows 11
## License
MIT-0 - Free to use, modify, and redistribute without attribution.
## Changelog

### 1.0.0 (2026-05-07)
- Initial release
- Complete AMD DirectML setup guide
- Automatic model_patcher.py fixes
- Benchmark scripts
- Model download scripts
- OpenClaw integration