content-memory

Content Memory Pipeline

Purpose: Take various content sources → convert to markdown → chunk them → refer to them for future use.

The pipeline is the core value. Folder layout, workspace sync, and integration with other context (e.g. Vesta 7) are secondary; those pieces will be part of context anyway.

Architecture

memory/ (project root): Content and chunks. No converted/ or chunked/ subfolders; chunks go directly in topic folders. Markdown lives in <folder>/markdown/.
workspace/ (optional): Source content for sync_and_chunk; copied to memory.
Convert to markdown/: PDF/DOCX → .md, written to <folder>/markdown/ for each folder.
Chunk: Reads from source/<domain>/ or memory/<domain>/, writes to memory/<domain>/<topic>/.

When to Activate

"Add content to memory", "refresh memory", "ingest for agent"
"Sync workspace to memory", "convert and chunk"

Step 1: Convert to Markdown

Convert non-.md files to markdown in a markdown/ subfolder per folder:

python scripts/convert_to_markdown.py --source <path> [--memory <domain>]
python scripts/convert_to_markdown.py --from source/CBE/domain_journeys_approach

Writes .md in <folder>/markdown/ (e.g. CB Domain/foo.pdf → CB Domain/markdown/foo.md)
Existing .md files are skipped
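
The path mapping described above can be sketched as follows. This is a minimal illustration, not the code in convert_to_markdown.py: the function name markdown_target is hypothetical, and the actual conversion step (for example via markitdown) is left out.

```python
from pathlib import Path
from typing import Optional


def markdown_target(source: Path) -> Optional[Path]:
    """Return the path the converted .md would be written to.

    Hypothetical helper: returns None for files that are already
    markdown, since those are skipped.
    """
    if source.suffix.lower() == ".md":
        return None
    # foo.pdf in <folder>/ maps to <folder>/markdown/foo.md
    return source.parent / "markdown" / (source.stem + ".md")


print(markdown_target(Path("CB Domain/foo.pdf")).as_posix())
# prints: CB Domain/markdown/foo.md
```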

Step 2: Chunk to Memory

Chunk markdown and write directly into memory topic folders:

python scripts/chunk_markdown.py --memory <domain> [--incremental]

Reads from: source/<domain>/**/*.md or memory/<domain>/**/*.md
Writes to: memory/<domain>/<topic>/ (no chunked/ subfolder)
--incremental: Only chunk new or modified files
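
One way to decide which files count as "new or modified" is to compare content hashes against a manifest from the previous run. The sketch below assumes that approach; how chunk_markdown.py actually detects changes is not stated here, and the manifest file and function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterable, List


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used as a change fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_chunk(files: Iterable[Path], manifest_path: Path) -> List[Path]:
    """Return files that are new or changed since the last run.

    Hypothetical manifest format: a JSON object mapping file path to digest.
    The manifest is rewritten with the current digests before returning.
    """
    seen = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    pending = []
    for path in files:
        digest = file_digest(path)
        if seen.get(str(path)) != digest:
            pending.append(path)
            seen[str(path)] = digest
    manifest_path.write_text(json.dumps(seen, indent=2))
    return pending
```

Hashing contents rather than comparing modification times avoids re-chunking files that were only touched or re-copied without changes.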

Step 3: Sync Workspace (Convert + Copy + Chunk)

One command for workspace content:

python scripts/sync_and_chunk.py --workspace <topic> --memory <domain> [--incremental]

Converts non-.md to <folder>/markdown/ (in workspace)
Copies workspace/<topic> → memory/<domain>/<topic>
Chunks to memory/<domain>/<topic>/
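
The copy step in the middle of this sequence could look like the sketch below. It assumes workspace/ and memory/ are plain directories and that an existing destination should be merged into; the function name copy_topic is hypothetical, not part of sync_and_chunk.py.

```python
import shutil
from pathlib import Path


def copy_topic(workspace_root: Path, memory_root: Path, topic: str, domain: str) -> Path:
    """Copy workspace/<topic> to memory/<domain>/<topic> and return the destination."""
    src = workspace_root / topic
    dst = memory_root / domain / topic
    # dirs_exist_ok (Python 3.8+) lets repeated syncs overwrite into an existing folder
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```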

Chunking Strategy

Slide decks ( ): One chunk per slide
Other docs (>200 lines): Split at # or ## boundaries
Small files (<200 lines): Single chunk
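
The size threshold and heading split can be sketched as below. This is an illustration under stated assumptions, not the logic in chunk_markdown.py: a file of exactly 200 lines is treated as small, slide-deck detection is omitted, and the names are hypothetical.

```python
import re
from typing import List

MAX_SINGLE_CHUNK_LINES = 200  # threshold from the strategy above
HEADING = re.compile(r"^#{1,2} ")  # matches "# " and "## " only, not "### "


def chunk_markdown(text: str) -> List[str]:
    """Return the whole text for small files, else split at # / ## headings."""
    lines = text.splitlines()
    if len(lines) <= MAX_SINGLE_CHUNK_LINES:
        return [text]
    chunks, current = [], []
    for line in lines:
        # A top-level heading closes the chunk collected so far
        if HEADING.match(line) and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

The sketch does not track fenced code blocks, so a line starting with # inside a fence would also start a new chunk.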

Each chunk includes source attribution.

Key Behaviors

Take content sources – PDF, PPTX, DOCX, etc. (or workspace content).
Convert to markdown – Non-.md files → .md in the folder's markdown/ subfolder.
Chunk – Split markdown into referable chunks (by slide, by heading, or whole file).
Refer for future use – Chunks live where agents/context can find them; source attribution in each chunk.
Incremental – Use --incremental to skip unchanged files.

Project-Specific Transformers

| Location | Scope |
| --- | --- |
| memory/<name>/transformers/ | Memory-specific |
| .content-memory/transformers/ | Workspace-level |

Each .py file exports EXTENSIONS and convert(path: Path) -> str.
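
A transformer module following this contract might look like the sketch below, which renders CSV files as markdown tables. The EXTENSIONS format (a list of lowercase extensions with a leading dot) is an assumption, as is the choice of CSV; only the two exported names come from the contract above.

```python
import csv
from pathlib import Path

# Assumed format: lowercase file extensions with a leading dot
EXTENSIONS = [".csv"]


def convert(path: Path) -> str:
    """Render a CSV file as a markdown table; an empty file yields an empty string."""
    with path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)
```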

Scripts

| Script | Purpose |
| --- | --- |
| convert_to_markdown.py | Convert to markdown/ |
| chunk_markdown.py | Chunk to memory |
| sync_and_chunk.py | Convert + copy + chunk (workspace) |

Run from workspace root. Set CONTENT_MEMORY_ROOT if needed.
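
A script could resolve its root along these lines. This is a guess at the semantics: it assumes CONTENT_MEMORY_ROOT, when set, overrides the current working directory as the root, and the function name resolve_root is hypothetical.

```python
import os
from pathlib import Path


def resolve_root() -> Path:
    """Return CONTENT_MEMORY_ROOT if set (assumed override), else the current directory."""
    override = os.environ.get("CONTENT_MEMORY_ROOT")
    if override:
        return Path(override).expanduser().resolve()
    return Path.cwd()
```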

Troubleshooting

| Issue | Fix |
| --- | --- |
| No markdown | Run convert; then chunk. Or sync workspace to memory first. |
| Missing markitdown | pip install "markitdown[all]" |
