# Caffe CIFAR-10 Build and Training

This skill provides procedural guidance for building the Caffe deep learning framework from source and training models on the CIFAR-10 dataset.
## When to Use This Skill

- Building Caffe from source on Ubuntu/Debian systems
- Training CIFAR-10 or similar image classification models with Caffe
- Configuring Caffe for CPU-only execution
- Troubleshooting Caffe build and dependency issues
## Critical Requirements Checklist

Before starting, identify ALL requirements from the task specification:

- Execution mode: CPU-only vs. GPU (affects solver configuration)
- Iteration count: the specific number of training iterations required
- Output files: where training logs and models should be saved
- Model checkpoints: which iteration's model file is expected
## Phase 1: Dependency Installation

### System Dependencies

Install required packages before attempting to build:

```shell
apt-get update && apt-get install -y \
    build-essential cmake git \
    libprotobuf-dev libleveldb-dev libsnappy-dev \
    libhdf5-serial-dev protobuf-compiler \
    libatlas-base-dev libgflags-dev libgoogle-glog-dev liblmdb-dev \
    libopencv-dev libboost-all-dev \
    python3-dev python3-numpy python3-pip
```

### Verification Step

Confirm critical libraries are installed:

```shell
dpkg -l | grep -E "libhdf5|libopencv|libboost"
```
## Phase 2: Caffe Source Acquisition

### Clone and Checkout

```shell
git clone https://github.com/BVLC/caffe.git
cd caffe
git checkout 1.0  # Note: the tag is "1.0", not "1.0.0"
```

### Common Mistake

The release tag is `1.0`, not `1.0.0`. Verify with `git tag -l` if uncertain.
## Phase 3: Makefile.config Configuration

### Create Configuration File

```shell
cp Makefile.config.example Makefile.config
```

### Essential Configuration Changes

Apply these modifications to `Makefile.config`:

CPU-Only Mode (if no GPU is available):

```make
CPU_ONLY := 1
```

OpenCV Version (for OpenCV 3.x or 4.x):

```make
OPENCV_VERSION := 3
```

Note: OpenCV 4 may require additional compatibility patches.

HDF5 Paths (Ubuntu-specific):

```make
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial
```

Python Configuration (Python 3):

```make
PYTHON_LIBRARIES := boost_python3 python3.8
PYTHON_INCLUDE := /usr/include/python3.8 \
		/usr/lib/python3/dist-packages/numpy/core/include
```

Adjust version numbers to match the installed Python version.

### Configuration Verification

After editing, verify no duplicate definitions exist (note the `-E` flag, which `grep` needs for the `|` alternation):

```shell
grep -nE "PYTHON_INCLUDE|PYTHON_LIB|CPU_ONLY" Makefile.config
```

Ensure each setting appears only once in uncommented form.
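The edits above can also be scripted non-interactively with `sed`. The sketch below runs against a minimal stand-in file so the substitution patterns can be checked in isolation; the patterns assume the commented-out defaults of the stock `Makefile.config.example`, so verify them against your actual file before relying on them.

```shell
# Sketch: apply the Makefile.config edits with sed, demonstrated on a
# stand-in file. In a real checkout, run the sed commands on the copy
# made from Makefile.config.example instead.
cd "$(mktemp -d)"
cat > Makefile.config <<'EOF'
# CPU_ONLY := 1
# OPENCV_VERSION := 3
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
EOF

sed -i 's|^# *CPU_ONLY := 1|CPU_ONLY := 1|' Makefile.config
sed -i 's|^# *OPENCV_VERSION := 3|OPENCV_VERSION := 3|' Makefile.config
sed -i 's|^INCLUDE_DIRS :=.*|& /usr/include/hdf5/serial|' Makefile.config
sed -i 's|^LIBRARY_DIRS :=.*|& /usr/lib/x86_64-linux-gnu/hdf5/serial|' Makefile.config

grep -E "CPU_ONLY|hdf5" Makefile.config
```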
## Phase 4: Building Caffe

### Memory-Aware Compilation

Avoid using all CPU cores on memory-constrained systems:

```shell
# For systems with limited RAM (< 8 GB)
make all -j2

# For systems with adequate RAM
make all -j$(nproc)
```
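A rough way to choose the parallelism level automatically is to divide available memory by an estimated per-job cost. The sketch below assumes roughly 2 GB per compile job, which is a working estimate, not an official figure:

```shell
# Sketch: derive a make -j value from available memory, assuming ~2 GB
# per compile job (an estimate, not a Caffe-documented number).
mem_gb=$(awk '/MemAvailable/ { print int($2 / 1024 / 1024) }' /proc/meminfo)
jobs=$(( mem_gb / 2 ))
[ "$jobs" -lt 1 ] && jobs=1                       # never go below one job
[ "$jobs" -gt "$(nproc)" ] && jobs=$(nproc)       # never exceed core count
echo "make all -j$jobs"
```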
### Build Failure Recovery

If the build fails or is killed (often due to memory exhaustion):

1. Clean the build:

   ```shell
   make clean
   ```

2. Rebuild with reduced parallelism:

   ```shell
   make all -j1
   ```

### Build Verification

Confirm the binary exists after the build:

```shell
ls -la .build_release/tools/caffe.bin
```

or, for CPU-only builds:

```shell
ls -la .build_release/tools/caffe
```
## Phase 5: Dataset Preparation

### Download CIFAR-10

```shell
./data/cifar10/get_cifar10.sh
```

### Convert to LMDB Format

```shell
./examples/cifar10/create_cifar10.sh
```

### Verification

Confirm the LMDB directories exist:

```shell
ls -la examples/cifar10/cifar10_train_lmdb
ls -la examples/cifar10/cifar10_test_lmdb
```
## Phase 6: Solver Configuration

### Modify Solver for Requirements

Edit `examples/cifar10/cifar10_quick_solver.prototxt`:

Set the iteration count:

```
max_iter: 500  # or as specified in the task
```

Set the execution mode:

```
solver_mode: CPU  # change from GPU if required
```

### Verification

```shell
grep -E "max_iter|solver_mode" examples/cifar10/cifar10_quick_solver.prototxt
```
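These two edits can likewise be applied with `sed`. The sketch below uses a stand-in solver file so the substitutions can be tested in isolation; in a real checkout, target `examples/cifar10/cifar10_quick_solver.prototxt` instead:

```shell
# Sketch: set max_iter and solver_mode with sed, demonstrated on a
# stand-in file mimicking the two relevant solver lines.
cd "$(mktemp -d)"
printf 'max_iter: 4000\nsolver_mode: GPU\n' > solver.prototxt

sed -i 's/^max_iter: .*/max_iter: 500/' solver.prototxt
sed -i 's/^solver_mode: .*/solver_mode: CPU/' solver.prototxt

grep -E "max_iter|solver_mode" solver.prototxt
```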
## Phase 7: Training Execution

### Run Training with Output Capture

```shell
./build/tools/caffe train \
    --solver=examples/cifar10/cifar10_quick_solver.prototxt \
    2>&1 | tee training_output.txt
```

### Alternative Binary Paths

Depending on the build configuration, the binary may be at:

- `.build_release/tools/caffe`
- `build/tools/caffe`
- `.build_release/tools/caffe.bin`
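A small helper can resolve whichever of these paths the build actually produced. This is a sketch: it simply reports the first executable candidate found, and `find_caffe` is a name introduced here, not part of Caffe.

```shell
# Sketch: pick the first existing caffe binary among the usual output paths.
find_caffe() {
  for p in .build_release/tools/caffe .build_release/tools/caffe.bin build/tools/caffe; do
    if [ -x "$p" ]; then
      echo "$p"
      return 0
    fi
  done
  return 1
}

CAFFE_BIN=$(find_caffe) || echo "caffe binary not found; build may have failed"
```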
## Phase 8: Verification

### Required Outputs Checklist

Caffe binary exists:

```shell
test -f .build_release/tools/caffe && echo "OK" || echo "MISSING"
```

Model file exists (iteration-specific):

```shell
ls -la examples/cifar10/cifar10_quick_iter_*.caffemodel
```

Training output captured:

```shell
test -f training_output.txt && echo "OK" || echo "MISSING"
```

Solver configured correctly:

```shell
grep "solver_mode: CPU" examples/cifar10/cifar10_quick_solver.prototxt
```
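The checklist above can be collapsed into one loop. This is a sketch with an assumed iteration count of 500; adjust the file list to the checkpoints and paths your task actually requires.

```shell
# Sketch: report OK/MISSING for each expected artifact in one pass.
check_outputs() {
  for f in "$@"; do
    if [ -e "$f" ]; then
      echo "OK      $f"
    else
      echo "MISSING $f"
    fi
  done
}

check_outputs .build_release/tools/caffe \
              examples/cifar10/cifar10_quick_iter_500.caffemodel \
              training_output.txt
```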
## Common Pitfalls

- **Premature Termination**: Never stop after `make clean` or other intermediate steps. Complete the full workflow: Dependencies -> Build -> Dataset -> Configure -> Train -> Verify.
- **Missing Solver Configuration**: The solver file must be modified for:
  - CPU vs. GPU execution mode
  - the specific iteration count required
- **Skipping Dataset Preparation**: Training will fail without LMDB data. Always run both:
  - `get_cifar10.sh` (download)
  - `create_cifar10.sh` (convert)
- **Build Parallelism Issues**: High parallelism (`-j$(nproc)`) can exhaust memory. Start with `-j2` on constrained systems.
- **Duplicate Configuration Entries**: Repeated edits to `Makefile.config` can create duplicate definitions. Always verify that each setting is defined exactly once.
- **Wrong Git Tag**: Use `1.0`, not `1.0.0`, for the stable release.
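The duplicate-configuration pitfall in particular lends itself to a quick automated scan. The sketch below runs against a deliberately duplicated stand-in file; point it at your real `Makefile.config` in practice.

```shell
# Sketch: flag any uncommented setting defined more than once.
cd "$(mktemp -d)"
printf 'CPU_ONLY := 1\nOPENCV_VERSION := 3\nCPU_ONLY := 1\n' > Makefile.config  # demo input

dupes=0
for key in CPU_ONLY OPENCV_VERSION PYTHON_INCLUDE PYTHON_LIB; do
  # Trailing space keeps PYTHON_LIB from also matching PYTHON_LIBRARIES.
  n=$(grep -c "^$key " Makefile.config || true)
  if [ "$n" -gt 1 ]; then
    echo "duplicate: $key ($n definitions)"
    dupes=$((dupes + 1))
  fi
done
echo "duplicates found: $dupes"
```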
## Decision Framework

When encountering issues:

- Build killed: reduce parallelism; run `make clean`, then rebuild with `-j1`
- Missing headers: check the HDF5 and OpenCV include paths in `Makefile.config`
- Python errors: verify the installed Python version matches the configuration
- Training fails immediately: check that dataset preparation completed
- Wrong output location: verify solver paths and output-file redirection