LangChain4J Testing Strategies
Overview
LangChain4J testing requires specialized strategies due to the non-deterministic nature of LLM responses and the complexity of AI workflows. This skill provides comprehensive patterns for unit testing with mocks, integration testing with Testcontainers, and end-to-end testing for RAG systems, AI Services, and tool execution.
When to Use This Skill
Use this skill when:
-
Building AI-powered applications with LangChain4J
-
Writing unit tests for AI services and guardrails
-
Setting up integration tests with real LLM models
-
Creating mock-based tests for faster test execution
-
Using Testcontainers for isolated testing environments
-
Testing RAG (Retrieval-Augmented Generation) systems
-
Validating tool execution and function calling
-
Testing streaming responses and async operations
-
Setting up end-to-end tests for AI workflows
-
Implementing performance and load testing
Instructions
To test LangChain4J applications effectively, follow these key strategies:
- Start with Unit Testing
Use mock models for fast, isolated testing of business logic. See references/unit-testing.md for detailed examples.
// Example: Mock ChatModel for unit tests ChatModel mockModel = mock(ChatModel.class); when(mockModel.generate(any(String.class))) .thenReturn(Response.from(AiMessage.from("Mocked response")));
var service = AiServices.builder(AiService.class) .chatModel(mockModel) .build();
- Configure Testing Dependencies
Setup proper Maven/Gradle dependencies for testing. See references/testing-dependencies.md for complete configuration.
Key dependencies:
-
langchain4j-test
-
Testing utilities and guardrail assertions
-
testcontainers
-
Integration testing with containerized services
-
mockito
-
Mock external dependencies
-
assertj
-
Better assertions
- Implement Integration Tests
Test with real services using Testcontainers. See references/integration-testing.md for container setup examples.
@Testcontainers class OllamaIntegrationTest { @Container static GenericContainer<?> ollama = new GenericContainer<>( DockerImageName.parse("ollama/ollama:0.5.4") ).withExposedPorts(11434);
@Test
void shouldGenerateResponse() {
ChatModel model = OllamaChatModel.builder()
.baseUrl(ollama.getEndpoint())
.build();
String response = model.generate("Test query");
assertNotNull(response);
}
}
- Test Advanced Features
For streaming responses, memory management, and complex workflows, refer to references/advanced-testing.md .
- Apply Testing Workflows
Follow testing pyramid patterns and best practices from references/workflow-patterns.md .
-
70% Unit Tests: Fast, isolated business logic testing
-
20% Integration Tests: Real service interactions
-
10% End-to-End Tests: Complete user workflows
Examples
Basic Unit Test
@Test void shouldProcessQueryWithMock() { ChatModel mockModel = mock(ChatModel.class); when(mockModel.generate(any(String.class))) .thenReturn(Response.from(AiMessage.from("Test response")));
var service = AiServices.builder(AiService.class)
.chatModel(mockModel)
.build();
String result = service.chat("What is Java?");
assertEquals("Test response", result);
}
Integration Test with Testcontainers
@Testcontainers class RAGIntegrationTest { @Container static GenericContainer<?> ollama = new GenericContainer<>( DockerImageName.parse("ollama/ollama:0.5.4") );
@Test
void shouldCompleteRAGWorkflow() {
// Setup models and stores
var chatModel = OllamaChatModel.builder()
.baseUrl(ollama.getEndpoint())
.build();
var embeddingModel = OllamaEmbeddingModel.builder()
.baseUrl(ollama.getEndpoint())
.build();
var store = new InMemoryEmbeddingStore<>();
var retriever = EmbeddingStoreContentRetriever.builder()
.chatModel(chatModel)
.embeddingStore(store)
.embeddingModel(embeddingModel)
.build();
// Test complete workflow
var assistant = AiServices.builder(RagAssistant.class)
.chatLanguageModel(chatModel)
.contentRetriever(retriever)
.build();
String response = assistant.chat("What is Spring Boot?");
assertNotNull(response);
assertTrue(response.contains("Spring"));
}
}
Best Practices
Test Isolation
-
Each test must be independent
-
Use @BeforeEach and @AfterEach for setup/teardown
-
Avoid sharing state between tests
Mock External Dependencies
-
Never call real APIs in unit tests
-
Use mocks for ChatModel, EmbeddingModel, and external services
-
Test error handling scenarios
Performance Considerations
-
Unit tests should run in < 50ms
-
Integration tests should use container reuse
-
Include timeout assertions for slow operations
Quality Assertions
-
Test both success and error scenarios
-
Validate response coherence and relevance
-
Include edge case testing (empty inputs, large payloads)
Reference Documentation
For comprehensive testing guides and API references, see the included reference documents:
-
Testing Dependencies - Maven/Gradle configuration and setup
-
Unit Testing - Mock models, guardrails, and individual components
-
Integration Testing - Testcontainers and real service testing
-
Advanced Testing - Streaming, memory, and error handling
-
Workflow Patterns - Test pyramid and best practices
Common Patterns
Mock Strategy
// For fast unit tests ChatModel mockModel = mock(ChatModel.class); when(mockModel.generate(anyString())).thenReturn(Response.from(AiMessage.from("Mocked")));
// For specific responses when(mockModel.generate(eq("Hello"))).thenReturn(Response.from(AiMessage.from("Hi"))); when(mockModel.generate(contains("Java"))).thenReturn(Response.from(AiMessage.from("Java response")));
Test Configuration
// Use test-specific profiles @TestPropertySource(properties = { "langchain4j.ollama.base-url=http://localhost:11434" }) class TestConfig { // Test with isolated configuration }
Assertion Helpers
// Custom assertions for AI responses assertThat(response).isNotNull().isNotEmpty(); assertThat(response).containsAll(expectedKeywords); assertThat(response).doesNotContain("error");
Performance Requirements
-
Unit Tests: < 50ms per test
-
Integration Tests: Use container reuse for faster startup
-
Timeout Tests: Include @Timeout for external service calls
-
Memory Management: Test conversation window limits and cleanup
Security Considerations
-
Never use real API keys in tests
-
Mock external API calls completely
-
Test prompt injection detection
-
Validate output sanitization
Testing Pyramid Implementation
70% Unit Tests ├─ Business logic validation ├─ Guardrail testing ├─ Mock tool execution └─ Edge case handling
20% Integration Tests ├─ Testcontainers with Ollama ├─ Vector store testing ├─ RAG workflow validation └─ Performance benchmarking
10% End-to-End Tests ├─ Complete user journeys ├─ Real model interactions └─ Performance under load
Related Skills
-
spring-boot-test-patterns
-
unit-test-service-layer
-
unit-test-boundary-conditions
References
-
Testing Dependencies
-
Unit Testing
-
Integration Testing
-
Advanced Testing
-
Workflow Patterns
Constraints and Warnings
-
AI model responses are non-deterministic; tests should use mocks for reliability.
-
Real API calls in tests should be avoided to prevent costs and rate limiting issues.
-
Integration tests with Testcontainers require Docker to be available.
-
Memory management tests should verify proper cleanup between test runs.
-
Tool execution tests should validate both success and failure scenarios.
-
Streaming response tests require proper handling of partial data.
-
RAG tests need properly seeded embedding stores for consistent results.
-
Performance tests may have high variance due to LLM response times.
-
Always use test-specific configuration profiles to avoid affecting production data.
-
Mock-based tests cannot guarantee actual LLM behavior; supplement with integration tests.