GrepAI Embeddings with LM Studio

This skill covers using LM Studio as the embedding provider for GrepAI, offering a user-friendly GUI for managing local models.

When to Use This Skill

Want local embeddings with a graphical interface
Already using LM Studio for other AI tasks
Prefer visual model management over CLI
Need to easily switch between models

What is LM Studio?

LM Studio is a desktop application for running local LLMs with:

🖥️ Graphical user interface
📦 Easy model downloading
🔌 OpenAI-compatible API
🔒 100% private, local processing

Prerequisites

Download LM Studio from lmstudio.ai
Install and launch the application
Download an embedding model

Installation

Step 1: Download LM Studio

Visit lmstudio.ai and download for your platform:

macOS (Intel or Apple Silicon)
Windows
Linux

Step 2: Launch and Download a Model

Open LM Studio
Go to the Search tab
Search for an embedding model:
  nomic-embed-text-v1.5
  bge-small-en-v1.5
  bge-large-en-v1.5
Click Download

Step 3: Start the Local Server

Go to the Local Server tab
Select your embedding model
Click Start Server
Note the endpoint (default: http://localhost:1234)

Configuration

Basic Configuration

    # .grepai/config.yaml
    embedder:
      provider: lmstudio
      model: nomic-embed-text-v1.5
      endpoint: http://localhost:1234

With Custom Port

    embedder:
      provider: lmstudio
      model: nomic-embed-text-v1.5
      endpoint: http://localhost:8080

With Explicit Dimensions

    embedder:
      provider: lmstudio
      model: nomic-embed-text-v1.5
      endpoint: http://localhost:1234
      dimensions: 768
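One way to confirm that the configured dimensions value matches what the model actually returns is to look at the length of a returned vector. The sketch below assumes the OpenAI-style response layout (data[0].embedding), which LM Studio's OpenAI-compatible API is expected to follow. The helper names and the sample response are illustrative and not part of GrepAI.

```python
def embedding_dimensions(response: dict) -> int:
    """Length of the first embedding vector in an OpenAI-style response."""
    return len(response["data"][0]["embedding"])


def dimensions_match(response: dict, configured: int) -> bool:
    """True when the returned vector length equals the configured value."""
    return embedding_dimensions(response) == configured


# Hand-written stand-in for a /v1/embeddings response (not real model output).
sample_response = {
    "object": "list",
    "data": [{"object": "embedding", "index": 0, "embedding": [0.0] * 768}],
    "model": "nomic-embed-text-v1.5",
}

print(dimensions_match(sample_response, 768))  # True
print(dimensions_match(sample_response, 384))  # False
```

If the check fails against a real response, the dimensions value in the config is the first thing to revisit.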

Available Models

nomic-embed-text-v1.5 (Recommended)

Property    Value
Dimensions  768
Size        ~260 MB
Quality     Excellent
Speed       Fast

    embedder:
      provider: lmstudio
      model: nomic-embed-text-v1.5

bge-small-en-v1.5

Property    Value
Dimensions  384
Size        ~130 MB
Quality     Good
Speed       Very fast

Best for: Smaller codebases, faster indexing.

    embedder:
      provider: lmstudio
      model: bge-small-en-v1.5
      dimensions: 384

bge-large-en-v1.5

Property    Value
Dimensions  1024
Size        ~1.3 GB
Quality     Very high
Speed       Slower

Best for: Maximum accuracy.

    embedder:
      provider: lmstudio
      model: bge-large-en-v1.5
      dimensions: 1024

Model Comparison

Model                  Dims  Size     Speed  Quality
bge-small-en-v1.5      384   ~130 MB  ⚡⚡⚡    ⭐⭐⭐
nomic-embed-text-v1.5  768   ~260 MB  ⚡⚡     ⭐⭐⭐⭐
bge-large-en-v1.5      1024  ~1.3 GB  ⚡      ⭐⭐⭐⭐⭐
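The dimension count also drives how much space the stored vectors take. The following is a rough back-of-the-envelope estimate, assuming each value is stored as a 4-byte float and ignoring any index overhead. GrepAI's actual on-disk format is not described here, and the chunk count of 10,000 is a made-up example.

```python
def estimate_vector_bytes(num_chunks: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw size of num_chunks vectors, assuming fixed-width values (float32 by default)."""
    return num_chunks * dimensions * bytes_per_value


# Dimension counts taken from the model sections above.
MODEL_DIMENSIONS = {
    "bge-small-en-v1.5": 384,
    "nomic-embed-text-v1.5": 768,
    "bge-large-en-v1.5": 1024,
}

for model, dims in MODEL_DIMENSIONS.items():
    mib = estimate_vector_bytes(10_000, dims) / (1024 * 1024)
    print(f"{model}: {mib:.1f} MiB for 10,000 chunks")
```

Under these assumptions the three models come out at roughly 14.6, 29.3, and 39.1 MiB respectively, so the larger models cost more storage as well as more compute.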

LM Studio Server Setup

Starting the Server

Open LM Studio
Navigate to the Local Server tab (left sidebar)
Select an embedding model from the dropdown
Configure settings:
  Port: 1234 (default)
  Enable Embedding Endpoint
Click Start Server

Server Status

Look for the green indicator showing the server is running.

Verifying the Server

    # Check server is responding
    curl http://localhost:1234/v1/models

    # Test embedding
    curl http://localhost:1234/v1/embeddings \
      -H "Content-Type: application/json" \
      -d '{ "model": "nomic-embed-text-v1.5", "input": "function authenticate(user)" }'
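The same embedding request can be issued from Python using only the standard library. This is a minimal sketch mirroring the curl call above. The function names are illustrative, and the response parsing assumes the OpenAI-style data[0].embedding layout.

```python
import json
import urllib.request


def build_embedding_request(endpoint: str, model: str, text: str) -> urllib.request.Request:
    """Build a POST request for <endpoint>/v1/embeddings without sending it."""
    url = endpoint.rstrip("/") + "/v1/embeddings"
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(endpoint: str, model: str, text: str, timeout: float = 30.0) -> list:
    """Send the request and return the embedding vector (needs a running server)."""
    request = build_embedding_request(endpoint, model, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumed OpenAI-style layout: {"data": [{"embedding": [...]}]}
    return payload["data"][0]["embedding"]
```

With the server running, len(embed("http://localhost:1234", "nomic-embed-text-v1.5", "function authenticate(user)")) should equal the model's dimension count.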

LM Studio Settings

Recommended Settings

In LM Studio's Local Server tab:

Setting         Recommended Value
Port            1234
Enable CORS     Yes
Context Length  Auto
GPU Layers      Max (for speed)

GPU Acceleration

LM Studio automatically uses:

macOS: Metal (Apple Silicon)
Windows/Linux: CUDA (NVIDIA)

Adjust GPU layers in settings for memory/speed balance.

Running LM Studio Headless

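For server environments, LM Studio ships a command-line tool, lms, that can load a model and start the server without the GUI:

    # Start server without GUI (check LM Studio docs for exact syntax;
    # the model key may differ, `lms ls` lists downloaded models)
    lms load nomic-embed-text-v1.5
    lms server start --port 1234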

Common Issues

❌ Problem: Connection refused
✅ Solution: Ensure the LM Studio server is running:

Open LM Studio
Go to Local Server tab
Click Start Server

❌ Problem: Model not found
✅ Solution:

Download the model in LM Studio's Search tab
Select it in the Local Server dropdown
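If the model still is not found, it can help to compare the model name in the config with the identifiers the server itself reports. The sketch below assumes /v1/models returns the OpenAI-style {"data": [{"id": ...}]} shape. The identifier LM Studio reports may differ from the name shown in the Search tab, so compare against the real output. Helper names and the sample response are illustrative.

```python
import json
import urllib.request


def model_ids(models_response: dict) -> list:
    """Extract model identifiers from an OpenAI-style /v1/models response."""
    return [entry["id"] for entry in models_response.get("data", [])]


def is_model_listed(models_response: dict, wanted: str) -> bool:
    """True when the exact identifier appears in the server's model list."""
    return wanted in model_ids(models_response)


def fetch_model_ids(endpoint: str = "http://localhost:1234", timeout: float = 5.0) -> list:
    """Query a running server for its model identifiers."""
    url = endpoint.rstrip("/") + "/v1/models"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return model_ids(json.load(response))


# Hand-written stand-in for a /v1/models response (not real server output).
sample = {"object": "list", "data": [{"id": "nomic-embed-text-v1.5", "object": "model"}]}
print(is_model_listed(sample, "nomic-embed-text-v1.5"))  # True
print(is_model_listed(sample, "bge-small-en-v1.5"))      # False
```

Calling fetch_model_ids() against the running server and copying the exact identifier into the config avoids mismatches caused by naming differences.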

❌ Problem: Slow embedding generation
✅ Solutions:

Enable GPU acceleration in LM Studio settings
Use a smaller model (bge-small-en-v1.5)
Close other GPU-intensive applications

❌ Problem: Port already in use
✅ Solution: Change the port in LM Studio's Local Server settings, then update the GrepAI endpoint to match:

    embedder:
      endpoint: http://localhost:8080  # Different port
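Before picking a new port, it can be useful to check whether something is already listening on it. The sketch below uses a plain TCP connection attempt from the standard library. The function name is illustrative, and a successful connection only shows that some process holds the port, not that it is LM Studio.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

For example, port_in_use(1234) returning True while LM Studio's server is stopped suggests another application holds the default port, and port_in_use(8080) returning False suggests 8080 is free to use.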

❌ Problem: LM Studio closes and server stops
✅ Solution: Keep LM Studio running in the background, or consider using Ollama, which runs as a system service.

LM Studio vs Ollama

Feature             LM Studio        Ollama
GUI                 ✅ Yes            ❌ CLI only
System service      ❌ App must run   ✅ Background service
Model management    ✅ Visual         ✅ CLI
Ease of use         ⭐⭐⭐⭐⭐            ⭐⭐⭐⭐
Server reliability  ⭐⭐⭐              ⭐⭐⭐⭐⭐

Recommendation: Use LM Studio if you prefer a GUI; use Ollama if you need an always-on background service.

Migrating from LM Studio to Ollama

If you need a more reliable background service:

Install Ollama:

    brew install ollama
    ollama serve &
    ollama pull nomic-embed-text

Update config:

    embedder:
      provider: ollama
      model: nomic-embed-text
      endpoint: http://localhost:11434

Re-index:

    rm .grepai/index.gob
    grepai watch
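Re-indexing matters because vectors produced by one embedding model are generally not comparable with vectors from another, even when the dimension count is the same. The sketch below is a hypothetical guard that lists the reasons an existing index should be rebuilt. The EmbedderConfig type and reindex_reasons function are invented for illustration and are not part of GrepAI, and a provider change is treated conservatively as a reason to rebuild.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbedderConfig:
    """Hypothetical snapshot of the embedder settings an index was built with."""
    provider: str
    model: str
    dimensions: int


def reindex_reasons(old: EmbedderConfig, new: EmbedderConfig) -> list:
    """List the setting changes that make the existing index unusable."""
    reasons = []
    if old.dimensions != new.dimensions:
        reasons.append(f"dimensions changed: {old.dimensions} -> {new.dimensions}")
    if old.model != new.model:
        reasons.append(f"model changed: {old.model} -> {new.model}")
    if old.provider != new.provider:
        reasons.append(f"provider changed: {old.provider} -> {new.provider}")
    return reasons


before = EmbedderConfig("lmstudio", "nomic-embed-text-v1.5", 768)
after = EmbedderConfig("ollama", "nomic-embed-text", 768)
for reason in reindex_reasons(before, after):
    print(reason)
```

For the migration above, the guard reports a model change and a provider change even though the dimension count stays at 768, which is why the old index is removed before running grepai watch.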

Best Practices

Keep LM Studio running: Server stops when app closes
Use recommended model: nomic-embed-text-v1.5 for best balance
Enable GPU: Faster embeddings with hardware acceleration
Check server before indexing: Ensure green status indicator
Consider Ollama for production: More reliable as background service
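The "check server before indexing" practice can be automated with a small preflight step. The following is a minimal sketch, assuming the server answers GET /v1/models with HTTP 200 once it is ready. The function names, retry counts, and delays are illustrative choices, not GrepAI or LM Studio behavior.

```python
import http.client
import time
import urllib.request
from typing import Callable


def lmstudio_is_up(endpoint: str = "http://localhost:1234", timeout: float = 2.0) -> bool:
    """Return True if GET <endpoint>/v1/models answers with HTTP 200."""
    url = endpoint.rstrip("/") + "/v1/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, timeouts, and refused connections are all OSError subclasses.
        return False


def wait_for_server(probe: Callable[[], bool], attempts: int = 5, delay: float = 1.0) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A script could call wait_for_server(lmstudio_is_up) and only start indexing when it returns True. Passing the probe as a callable keeps the retry logic independent of the endpoint and easy to test.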

Output Format

Successful LM Studio configuration:

✅ LM Studio Embedding Provider Configured

Provider: LM Studio
Model: nomic-embed-text-v1.5
Endpoint: http://localhost:1234
Dimensions: 768 (auto-detected)
Status: Connected

Note: Keep LM Studio running for embeddings to work.

Related Skills

grepai-search-basics

grepai-search-advanced

grepai-search-tips

grepai-trace-graph