Scientific Paper Creator
Write publication-quality scientific papers in LaTeX with a test-driven workflow: every section is compiled and verified before moving to the next. The paper is never in a broken state.
Bundled Resources
This skill includes ready-to-use scripts and templates. Paths are relative to the skill directory (find it via glob("**/KORTIX-paper-creator/") ):
Resource Path Purpose
LaTeX template assets/template.tex
Minimal IMRaD template -- copy to main.tex
BibTeX generator scripts/openalex_to_bibtex.py
Convert OpenAlex API JSON to .bib entries
Compiler scripts/compile.sh
pdflatex + bibtex pipeline with error reporting
Verifier scripts/verify.sh
TDD verification suite (12 checks)
The TDD Rule
After writing every section:
- WRITE section → sections/{name}.tex
- COMPILE: bash compile.sh paper/{slug}/main.tex
- VERIFY: bash verify.sh paper/{slug}/
- If FAIL → FIX → go to 2
- If PASS → move to next section
The paper must compile and pass verification at every step. Never batch errors. Never skip verification. This is non-negotiable.
Pipeline Overview
Phase 1: SCAFFOLD → Create project, copy template, initialize empty sections Verify: compiles to valid (empty) PDF ✓
Phase 2: CITE → Search literature (OpenAlex), build references.bib Verify: bibliography builds without errors ✓
Phase 3: WRITE → Per-section composition in writing order Verify: compile + verify after EACH section ✓
Phase 4: POLISH → Self-reflection pass, strict verification, final build Verify: verify.sh --strict passes with zero warnings ✓
Filesystem Architecture
paper/{paper-slug}/ ├── main.tex # Master file (\input sections) ├── references.bib # BibTeX bibliography ├── sections/ │ ├── abstract.tex │ ├── introduction.tex │ ├── related-work.tex │ ├── methods.tex │ ├── results.tex │ ├── discussion.tex │ └── conclusion.tex ├── figures/ # .pdf/.png figures (+ generation scripts) ├── data/ # Supporting data, CSVs (optional) └── build/ # Compilation output └── main.pdf
Phase 1: Scaffold
1a. Create project structure
SLUG="paper-slug-here" mkdir -p "paper/$SLUG"/{sections,figures,data,build}
1b. Copy and adapt the template
SKILL_DIR=$(find . -path "*/KORTIX-paper-creator/assets/template.tex" -exec dirname {} ; | head -1 | sed 's|/assets||') cp "$SKILL_DIR/assets/template.tex" "paper/$SLUG/main.tex"
Edit main.tex : set \title{} , \author{} , and adapt \documentclass if targeting a specific venue:
Venue Document class Notes
General / arXiv \documentclass[11pt,a4paper]{article}
Default in template
IEEE conference \documentclass[conference]{IEEEtran}
Remove geometry package
ACM conference \documentclass[sigconf]{acmart}
Remove geometry, use acmart bib style
Springer LNCS \documentclass{llncs}
Remove geometry, use splncs04 bib style
1c. Initialize section stubs
Create each section file with a section header and TODO marker:
for sec in abstract introduction related-work methods results discussion conclusion; do SECTION_TITLE=$(echo "$sec" | sed 's/-/ /g; s/\b(.)/\u\1/g') if [ "$sec" = "abstract" ]; then echo '\begin{abstract}' > "paper/$SLUG/sections/$sec.tex" echo "% TODO: Write abstract" >> "paper/$SLUG/sections/$sec.tex" echo '\end{abstract}' >> "paper/$SLUG/sections/$sec.tex" else echo "\section{$SECTION_TITLE}" > "paper/$SLUG/sections/$sec.tex" echo "\label{sec:$sec}" >> "paper/$SLUG/sections/$sec.tex" echo "% TODO: Write $sec" >> "paper/$SLUG/sections/$sec.tex" fi done
1d. Initialize empty bibliography
echo "% References for $SLUG" > "paper/$SLUG/references.bib" echo "% Generated by openalex_to_bibtex.py and manual entries" >> "paper/$SLUG/references.bib"
1e. VERIFY: First green state
bash "$SKILL_DIR/scripts/compile.sh" "paper/$SLUG/main.tex" bash "$SKILL_DIR/scripts/verify.sh" "paper/$SLUG/"
The empty paper must compile cleanly. This is your baseline. Every subsequent change maintains this green state.
Phase 2: Literature & Citations
2a. Search for papers
Load the openalex-paper-search skill for OpenAlex API reference. Search strategy:
Seminal/highly-cited papers
Recent work (last 2-3 years)
Review/survey papers
2b. Generate BibTeX entries
Pipe OpenAlex results through the converter:
curl -s "https://api.openalex.org/works?search=YOUR+TOPIC&per_page=20&mailto=agent@kortix.ai" |
python3 "$SKILL_DIR/scripts/openalex_to_bibtex.py" >> "paper/$SLUG/references.bib"
For non-OpenAlex sources (web pages, books, reports), add manual BibTeX entries:
@misc{authorYYYYtitle, title = {Page Title}, author = {Author Name}, year = {2024}, url = {https://example.com/page}, note = {Accessed: 2024-01-15} }
@book{authorYYYYbook, title = {Book Title}, author = {First Author and Second Author}, year = {2024}, publisher = {Publisher Name}, edition = {2nd} }
2c. VERIFY: Bibliography builds
bash "$SKILL_DIR/scripts/compile.sh" "paper/$SLUG/main.tex" bash "$SKILL_DIR/scripts/verify.sh" "paper/$SLUG/"
Check: zero BibTeX errors, all .bib entries parse correctly.
Phase 3: Per-Section Writing
Writing Order
Write sections in this order (each one builds on context from the previous):
-
Abstract (draft) -- establishes scope, will be revised last
-
Introduction -- context, problem, gap, contribution
-
Methods -- what you did and how
-
Results -- what you found
-
Discussion -- what it means, limitations
-
Related Work -- needs full paper context to position contribution
-
Conclusion -- recap, future work
-
Abstract (final) -- revise to match actual paper content
The Writing Loop (for each section)
- READ all previously written sections to build context
- WRITE the section following the guidance below
- COMPILE + VERIFY
- SELF-REFLECT: re-read critically
- Every factual claim has a \cite{}?
- No filler words or empty hedging?
- Logical flow between paragraphs?
- Consistent with other sections?
- Quantitative where possible?
- REVISE if needed → COMPILE + VERIFY again
- Section DONE → next
Section-Specific Guidance
Abstract (150-300 words)
Structure: Context (1-2 sentences) → Problem → Approach → Key results (quantitative!) → Implications.
\begin{abstract} [Context sentence setting the broad area.] [Problem: what gap or challenge exists.] [Approach: what this paper does -- "We propose/present/introduce..."] [Results: key quantitative findings -- "Our method achieves X% on Y, outperforming Z by N%."] [Implications: why it matters.] \end{abstract}
Rules: No citations in abstract. No undefined acronyms. Every number must appear in the actual results.
Introduction
Funnel structure: broad context → specific problem → gap in existing work → your contribution → paper outline.
\section{Introduction} \label{sec:introduction}
% Paragraph 1: Broad context (2-3 sentences) % Paragraph 2: Narrow to specific problem % Paragraph 3: What's been tried, what's missing (the gap) % Paragraph 4: Your contribution -- be specific and concrete % "Our contributions are as follows: % \begin{itemize} % \item We propose... % \item We demonstrate... % \item We release... % \end{itemize}" % Paragraph 5: Paper outline % "The remainder of this paper is organized as follows. \secref{sec:related-work} reviews..."
Rules: End with explicit contribution list. Every contribution claimed here must be backed by evidence in Results/Discussion.
Methods
Reproducible detail. Another researcher should be able to replicate your work from this section alone.
\section{Methods} \label{sec:methods}
% Define notation before using it: % "Let $\mathcal{D} = {(x_i, y_i)}_{i=1}^{N}$ denote the training set..."
% Algorithms as pseudocode: % \begin{algorithm}[t] % \caption{Algorithm Name} % \label{alg:name} % \begin{algorithmic}[1] % \REQUIRE Input description % \ENSURE Output description % \STATE Step 1 % \FOR{condition} % \STATE Step in loop % \ENDFOR % \RETURN result % \end{algorithmic} % \end{algorithm}
Rules: Define all notation. Be precise about dimensions, loss functions, optimization details. Use consistent math notation throughout (define \newcommand macros in main.tex preamble).
Results
Lead with the main finding. Then support with details.
\section{Results} \label{sec:results}
% Paragraph 1: Main result (the headline number) % "Our method achieves X% accuracy on Y benchmark, representing a Z% improvement % over the previous state of the art~\citep{baseline2024}."
% Table with main results: % \begin{table}[t] % \centering % \caption{Main results on [benchmark]. Bold = best, underline = second best.} % \label{tab:main-results} % \begin{tabular}{lcc} % \toprule % Method & Metric 1 & Metric 2 \ % \midrule % Baseline A~\citep{ref} & 82.3 & 91.1 \ % Baseline B~\citep{ref} & 84.7 & 92.4 \ % \textbf{Ours} & \textbf{87.2} & \textbf{94.1} \ % \bottomrule % \end{tabular} % \end{table}
% Subsequent paragraphs: ablation studies, analysis, per-dataset breakdown % Every figure and table MUST be referenced: "As shown in \figref{fig:x}..."
Rules: Every figure/table referenced in text at least once. Report statistical significance or confidence intervals where applicable. No cherry-picking -- report all metrics, including where you don't win.
Discussion
Interpretation, not repetition. Don't restate results -- explain what they mean.
\section{Discussion} \label{sec:discussion}
% Paragraph 1: Interpret main findings -- what do the results tell us? % Paragraph 2: Compare with prior work -- how do results relate to existing literature? % Paragraph 3: Limitations -- be honest and specific % "Our approach has several limitations. First, ... Second, ..." % Paragraph 4: Broader implications -- what does this enable? % Paragraph 5: Future work -- concrete directions, not vague hand-waving
Rules: Limitations must be genuine, not token gestures. Future work should be specific enough to be actionable.
Related Work
Written after Methods/Results/Discussion so you can precisely position your contribution against the literature.
\section{Related Work} \label{sec:related-work}
% Organize THEMATICALLY, not chronologically: % \subsection{Theme A} % "Several works have addressed ... \citep{a,b,c}. Specifically, A does X, B does Y. % In contrast, our approach differs by Z." % % \subsection{Theme B} % "Another line of work focuses on ... \citep{d,e,f}. % While these methods ..., they do not address ..., which is the focus of our work."
Rules: For each group of related work, explicitly state how your work differs. Don't just list papers -- synthesize and contrast. Every paper cited here should be in references.bib .
Conclusion
No new information. Brief recap of contribution and results, then future directions.
\section{Conclusion} \label{sec:conclusion}
% Paragraph 1: What we did and what we found (3-4 sentences max) % "In this paper, we presented [method] for [problem]. % Our approach [key insight]. Experiments on [benchmarks] demonstrate [main result]." % % Paragraph 2: Future directions (2-3 sentences) % "In future work, we plan to explore ... and extend our approach to ..."
Rules: Claims here must match what's in Results. Don't overclaim. 0.5-1 page max.
LaTeX Conventions
Enforce these throughout all sections:
Cross-references (defined as macros in template):
\figref{fig:architecture} % → "Figure 1" (via cleveref) \tabref{tab:main-results} % → "Table 2" \eqnref{eq:loss} % → "Equation 3" \secref{sec:methods} % → "Section 4"
Citations:
\citet{smith2024} % → "Smith et al. (2024)" (textual -- subject of sentence) \citep{smith2024} % → "(Smith et al., 2024)" (parenthetical) \citep{smith2024,jones2023} % → "(Smith et al., 2024; Jones et al., 2023)"
Use \citet when the author is the grammatical subject: "\citet{smith2024} showed that...". Use \citep for parenthetical: "Recent work has shown significant gains \citep{smith2024}."
Figures:
\begin{figure}[t] \centering \includegraphics[width=\linewidth]{figure-name} \caption{Descriptive caption. Best viewed in color.} \label{fig:figure-name} \end{figure}
Tables:
\begin{table}[t] \centering \caption{Caption ABOVE the table (convention).} \label{tab:table-name} \begin{tabular}{lcc} \toprule Column 1 & Column 2 & Column 3 \ \midrule Row 1 & data & data \ Row 2 & data & data \ \bottomrule \end{tabular} \end{table}
Math notation: Define reusable macros in the preamble of main.tex :
\newcommand{\loss}{\mathcal{L}} \newcommand{\model}{\mathcal{M}} \newcommand{\dataset}{\mathcal{D}} \newcommand{\R}{\mathbb{R}} \newcommand{\E}{\mathbb{E}} \newcommand{\argmin}{\operatorname{argmin}}
Phase 4: Polish & Final Verification
4a. Self-Reflection Pass
After all sections are written, re-read the entire paper and check:
-
Consistency: Same terminology throughout? Abstract claims match Results numbers?
-
Redundancy: Any paragraph repeated across sections? Cut ruthlessly.
-
Flow: Smooth transitions between sections? Each section starts by connecting to the previous?
-
Claims: Every claim in Intro/Abstract backed by evidence in Results?
-
Notation: Math notation consistent? All symbols defined before first use?
4b. Strict Verification
bash "$SKILL_DIR/scripts/verify.sh" "paper/$SLUG/" --strict
The --strict flag treats warnings as failures. All 12 checks must pass:
-
PDF exists and is non-trivial size
-
Zero LaTeX errors
-
Zero undefined \ref{} references
-
Zero undefined \cite{} citations
-
Zero BibTeX errors
-
Zero overfull boxes > 10pt
-
All referenced figure files exist
-
All sections have content (no empty stubs)
-
All .bib entries cited at least once (no dead refs)
-
Zero TODO/FIXME comments remaining
-
Abstract word count 100-300
-
All figures/tables referenced in text
4c. Final Build
bash "$SKILL_DIR/scripts/compile.sh" "paper/$SLUG/main.tex" --clean
The --clean flag removes auxiliary files, leaving only the PDF. Deliverable: paper/{slug}/build/main.pdf .
Figure Generation
When the paper needs figures, generate them as PDF or high-res PNG and save to figures/ . Keep the generation script alongside for reproducibility.
matplotlib (recommended for most plots)
import matplotlib.pyplot as plt import matplotlib matplotlib.rcParams.update({ 'font.size': 12, 'font.family': 'serif', 'figure.figsize': (6, 4), 'figure.dpi': 150, 'savefig.dpi': 300, 'savefig.bbox': 'tight', 'axes.grid': True, 'grid.alpha': 0.3, })
fig, ax = plt.subplots()
... plot data ...
ax.set_xlabel('X Label') ax.set_ylabel('Y Label') ax.legend() fig.savefig('paper/{slug}/figures/plot-name.pdf') plt.close()
Rules:
-
Always save as PDF for vector quality (PNG only for raster images like photos/heatmaps)
-
Font size >= 10pt (readable when scaled to column width)
-
Save the plotting script to figures/plot_name.py for reproducibility
-
Use consistent color palettes across all figures
-
Include axis labels, legends, and titles
Diagrams and Architecture Figures
For architecture/flow diagrams, either:
-
Generate with matplotlib/tikz and save as PDF
-
Use image-gen tool and save to figures/
-
Hand-draw in any tool and export as PDF/PNG
LaTeX Troubleshooting
Common errors and fixes:
Error Cause Fix
Missing $ inserted
Math symbol outside math mode Wrap in $...$
Undefined control sequence
Typo in command or missing package Check spelling, add \usepackage
File not found
Wrong path in \includegraphics
Check \graphicspath and filename
Citation undefined
Key mismatch between \cite and .bib
Check exact key spelling
Missing \begin{document}
Error in preamble Check for typos above \begin{document}
Overfull \hbox
Line too wide Rephrase, add \linebreak , or use \sloppy locally
Package clash
Two packages conflict Check compatibility, load order matters
When compile fails: read the first error in the log (subsequent errors often cascade from the first). Fix it, recompile. Don't try to fix multiple errors at once.
Scientific Writing Standards
Quick-reference checklist for quality prose:
-
Active voice: "We propose X" not "X is proposed"
-
Strong verbs: "achieves" not "is able to achieve", "reduces" not "leads to a reduction in"
-
Quantitative: "improves by 12%" not "significantly improves"
-
Define before use: Every acronym, symbol, and technical term defined on first use
-
Parallel structure: Lists and comparisons use the same grammatical form
-
No orphan pronouns: "This approach" not "This" alone at the start of a sentence
-
Hedging (appropriate): "suggests" / "indicates" for uncertain claims; "demonstrates" / "establishes" for proven ones
-
Person: "We" for authors' actions; passive voice sparingly for general facts; never "I" in multi-author papers
-
Tense: Present for established facts and paper structure ("Section 3 presents..."); past for what you did ("We trained the model on...")
-
Citations as nouns: "\citet{smith2024} showed..." (not "It was shown in \citep{smith2024}...")