# Welcome to Serapeum
A modular Python LLM framework for building intelligent applications
Serapeum provides clean, composable abstractions for working with Large Language Models. Built on a provider-agnostic architecture, it lets you focus on building applications rather than wrestling with API differences.
## Features

- **Modular Architecture**: Provider-agnostic core with pluggable integrations. Switch between Ollama, OpenAI, Azure, or any other provider without changing your application code.
- **Tool Calling Made Easy**: Create tools from Python functions or Pydantic models. Automatic JSON schema generation, validation, and execution.
- **Async-First Design**: Full support for both sync and async operations with streaming. Built for high-performance production applications.
- **Structured Outputs**: Force LLMs to return structured data using Pydantic models. Type-safe responses with automatic validation.
- **LLM Orchestration**: High-level orchestrators that manage conversation flow, tool execution, and prompt composition automatically.
- **Namespace Packages**: Clean package hierarchy using PEP 420 namespace packages. Install only what you need: `serapeum-core`, `serapeum-ollama`, etc.
## Quick Start

Install the core package and a provider (for example, `pip install serapeum-core serapeum-ollama`):
Create your first LLM application:
```python
import os

from serapeum.core.llms import Message, MessageRole, TextChunk
from serapeum.ollama import Ollama

# Initialize LLM
llm = Ollama(model="gpt-oss:20b", api_key=os.environ.get("OLLAMA_API_KEY"))

# Simple chat
messages = [
    Message(role=MessageRole.USER, chunks=[TextChunk(content="What is Python?")])
]
response = llm.chat(messages)
print(response.message.content)
```
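Serapeum's async-first design means the same call shapes are available as coroutines and token streams. The consumption pattern can be sketched in pure Python with a stand-in stream (no Serapeum APIs appear here, and real method names on the LLM classes may differ):

```python
import asyncio
from typing import AsyncIterator

async def fake_stream() -> AsyncIterator[str]:
    """Stand-in for a provider's streamed response chunks."""
    for chunk in ["Python ", "is ", "versatile."]:
        await asyncio.sleep(0)  # stand-in for network latency
        yield chunk

async def main() -> str:
    # Accumulate streamed chunks into the final message; a UI would
    # flush each chunk to the user as it arrives instead.
    parts = [chunk async for chunk in fake_stream()]
    return "".join(parts)

print(asyncio.run(main()))  # Python is versatile.
```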
Use tools with your LLM:
```python
import os

from serapeum.core.tools import CallableTool
from serapeum.ollama import Ollama

# Initialize LLM
llm = Ollama(model="gpt-oss:20b", api_key=os.environ.get("OLLAMA_API_KEY"))

# Create a simple tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is sunny and 72°F"

weather_tool = CallableTool.from_function(get_weather)

# Chat with tool calling
response = llm.generate_tool_calls(
    tools=[weather_tool],
    message="What's the weather in San Francisco?",
)
print(response.message.additional_kwargs["tool_calls"])
# [ToolCall(function=Function(name='get_weather', arguments={'city': 'San Francisco'}))]
```
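`CallableTool.from_function` derives the tool's JSON schema from the function's signature and docstring. A minimal stdlib-only sketch of that idea (the real implementation and the exact schema field names may differ):

```python
import inspect

# Map a few common Python annotations to JSON-schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_function(func) -> dict:
    """Build a JSON-schema-like tool description from a function signature."""
    sig = inspect.signature(func)
    properties = {
        name: {"type": _JSON_TYPES.get(param.annotation, "string")}
        for name, param in sig.parameters.items()
    }
    return {
        "name": func.__name__,
        "description": (func.__doc__ or "").strip(),
        "parameters": {
            "type": "object",
            "properties": properties,
            # Parameters without defaults are required.
            "required": [
                n for n, p in sig.parameters.items()
                if p.default is inspect.Parameter.empty
            ],
        },
    }

def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is sunny and 72°F"

print(schema_from_function(get_weather))
```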
Get structured outputs:
```python
import os

from pydantic import BaseModel
from serapeum.core.prompts import PromptTemplate
from serapeum.ollama import Ollama

class CityInfo(BaseModel):
    name: str
    country: str
    population: int
    famous_for: list[str]

# Create a prompt template
prompt = PromptTemplate(
    "Provide information about {city} in JSON format. "
    "Include: name, country, population, and famous_for (list of attractions)."
)

# Force structured output
llm_json = Ollama(
    model="llama3.1",
    api_key=os.environ.get("OLLAMA_API_KEY"),
    json_mode=True,
)
result = llm_json.parse(schema=CityInfo, prompt=prompt, city="Paris")

print(result.name)        # "Paris"
print(result.famous_for)  # ["Eiffel Tower", "Louvre Museum", ...]
```
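Under the hood, `parse` asks the model for JSON and validates the reply against the schema before returning a typed object. The validation step alone, sketched with the standard library (Serapeum uses Pydantic for this; `parse_city` is a hypothetical helper):

```python
import json
from dataclasses import dataclass, fields

@dataclass
class CityInfo:
    name: str
    country: str
    population: int
    famous_for: list

def parse_city(raw: str) -> CityInfo:
    """Validate a raw JSON string against the CityInfo fields."""
    data = json.loads(raw)
    missing = [f.name for f in fields(CityInfo) if f.name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return CityInfo(**{f.name: data[f.name] for f in fields(CityInfo)})

raw = '{"name": "Paris", "country": "France", "population": 2100000, "famous_for": ["Eiffel Tower"]}'
city = parse_city(raw)
print(city.name)  # Paris
```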
## Architecture Overview
Serapeum follows a layered architecture from base abstractions to high-level orchestration:
```mermaid
graph TB
    subgraph "Orchestration Layer"
        A[ToolOrchestratingLLM]
        B[TextCompletionLLM]
    end
    subgraph "LLM Layer"
        C[LLM]
        D[FunctionCallingLLM]
        E[StructuredOutputLLM]
    end
    subgraph "Base Layer"
        F[BaseLLM Protocol]
        G[Message Models]
        H[Response Types]
    end
    subgraph "Provider Layer"
        I[Ollama]
        J[OpenAI]
        K[Azure OpenAI]
    end
    subgraph "Tools Layer"
        L[CallableTool]
        M[BaseTool]
    end
    A --> C
    A --> L
    B --> C
    C --> F
    D --> F
    E --> C
    I --> D
    J --> D
    K --> D
    L --> M
```
**Key Layers:**

- **Base Layer**: Core protocols and data models that all providers implement
- **LLM Layer**: Prompt formatting, structured prediction, and tool-calling specialization
- **Tools Layer**: Tool interfaces with automatic schema generation
- **Orchestration Layer**: High-level components that compose prompts, LLMs, and toolsets
- **Provider Layer**: Concrete implementations (Ollama, OpenAI, Azure, etc.)
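The provider-agnostic split above rests on the base layer: providers implement a shared protocol, and everything higher up depends only on that interface. An illustrative sketch using `typing.Protocol` (the real `BaseLLM` protocol has more methods and richer message types; `EchoProvider` and `ask` are hypothetical):

```python
from typing import Protocol

class BaseLLM(Protocol):
    """Minimal stand-in for the base-layer chat interface."""
    def chat(self, messages: list[str]) -> str: ...

class EchoProvider:
    """Toy provider: any class with a matching chat() satisfies the protocol."""
    def chat(self, messages: list[str]) -> str:
        return messages[-1]

def ask(llm: BaseLLM, question: str) -> str:
    # Application code depends only on the protocol, not on a provider class,
    # so swapping providers never changes this function.
    return llm.chat([question])

print(ask(EchoProvider(), "hello"))  # hello
```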
## Why Serapeum?

### Provider Agnostic

Write your application once, switch providers anytime. The same code works with Ollama, OpenAI, Azure, or any other provider.

### Type Safe

Full type annotations with Pydantic integration. Catch errors at development time, not runtime.

### Production Ready

Async-first design with streaming support. Built for high-throughput production applications.

### Composable

Build complex workflows by composing simple, reusable components. Tools, prompts, and LLMs work together seamlessly.

### Well Documented

Comprehensive documentation with examples, architecture diagrams, and API references.
## Project Structure
The repository uses a monorepo structure with multiple packages:
```text
serapeum/
├── libs/
│   ├── core/              # serapeum-core: provider-agnostic core
│   └── providers/
│       ├── ollama/        # serapeum-ollama: Ollama integration
│       ├── openai/        # serapeum-openai: OpenAI integration
│       └── azure-openai/  # serapeum-azure-openai: Azure OpenAI
├── docs/                  # MkDocs documentation
├── examples/              # Usage examples and notebooks
└── prompts/               # Prompt templates
```
**Benefits:**

- **Unified development**: All packages share the same development environment
- **Consistent versioning**: Coordinated releases across packages
- **Shared tooling**: Single configuration for testing, linting, and documentation
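What makes this layout work is PEP 420: because no distribution ships a top-level `serapeum/__init__.py`, separately installed packages merge into one `serapeum.*` namespace. A self-contained demonstration of the mechanism, using a hypothetical `demo_ns` namespace built in a temporary directory:

```python
import os
import sys
import tempfile

# Two separate "distributions" each contribute a subpackage to the same
# namespace. Neither directory contains demo_ns/__init__.py, so PEP 420
# merges them at import time.
root = tempfile.mkdtemp()
for dist, sub in [("dist_a", "core"), ("dist_b", "ollama")]:
    pkg = os.path.join(root, dist, "demo_ns", sub)
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write(f"NAME = {sub!r}\n")
    sys.path.insert(0, os.path.join(root, dist))

# Both subpackages resolve even though they live in different directories.
from demo_ns import core, ollama

print(core.NAME, ollama.NAME)  # core ollama
```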
## Next Steps

- Get started with detailed installation instructions for different environments
- Understand the project structure and key components
- Dive deep into the architectural patterns and design decisions
- Explore the complete API documentation with examples
## Community & Support
- GitHub Repository: Serapieum-of-alex/Serapeum
- Issue Tracker: Report bugs or request features
- Changelog: View release history
- Contributing: Contribution guidelines
- License: GNU General Public License v3