# Architecture Diagrams
This section provides comprehensive Mermaid diagrams for the Serapeum package, covering architecture, class hierarchies, workflows, data flows, state machines, component interactions, deployment, and dependencies.
Refer to the Legend for styles and color coding, and the Integration Guide for how these diagrams relate to each other and to the codebase.
## Legend and Guides

- Legend
- Integration Guide
## 1. Overview (System Context & Components)

### System Context

```mermaid
%% Architecture Overview: System Context (C4-like)
%% Color/Style Legend is defined in legend.md
flowchart LR
    %% Styles
    classDef abstract fill:#f5f5f5,stroke:#666,stroke-dasharray: 5 3
    classDef concrete fill:#e3f2fd,stroke:#1565c0
    classDef interface fill:#fffde7,stroke:#f9a825
    classDef datamodel fill:#e8f5e9,stroke:#2e7d32
    classDef layerBase fill:#ede7f6,stroke:#5e35b1
    classDef layerCore fill:#e0f7fa,stroke:#00838f
    classDef layerHigh fill:#fff3e0,stroke:#ef6c00
    classDef external fill:#ffebee,stroke:#c62828

    subgraph L1[Layer 1: Base Models & Interfaces]
        direction TB
        AB(serapeum.core.base.llms.base.BaseLLM):::abstract
        DM1(serapeum.core.base.llms.types.Message):::datamodel
        DM2(ChatResponse):::datamodel
        DM3(CompletionResponse):::datamodel
        ML(MessageList):::datamodel
    end
    class L1 layerBase

    subgraph L2[Layer 2: Core Abstractions]
        direction TB
        LLM(serapeum.core.llms.base.LLM):::abstract
        PT(serapeum.core.prompts.base.PromptTemplate):::concrete
        CPT(ChatPromptTemplate):::concrete
        OP(serapeum.core.output_parsers):::interface
    end
    class L2 layerCore

    subgraph L3[Layer 3: High-level Orchestration]
        direction TB
        FCLLM(serapeum.core.llms.function_calling.FunctionCallingLLM):::concrete
        SLLM(serapeum.core.llms.structured_output_llm.StructuredOutputLLM):::concrete
        TO(serapeum.core.llms.ToolOrchestratingLLM):::concrete
        Tools(serapeum.core.tools.callable_tool.CallableTool):::concrete
        TModels(serapeum.core.tools.types.*):::concrete
        ACR(serapeum.core.chat.types.AgentChatResponse):::concrete
    end
    class L3 layerHigh

    subgraph L4[Layer 4: Concrete Implementations]
        direction TB
        OLL(serapeum.ollama.base.Ollama):::concrete
    end

    subgraph EXT[External Systems]
        direction TB
        U(User Application Code):::external
        OServer(Ollama Server API):::external
        Pyd(Pydantic Models):::external
    end
    class EXT external

    %% Relations
    U -->|uses| PT
    U -->|calls| LLM
    PT -->|formats to| ML
    ML -->|consumed by| AB
    LLM -->|produces| DM2
    LLM -->|produces| DM3
    LLM -->|parses via| OP
    FCLLM -->|extends| LLM
    SLLM -->|wraps| LLM
    TO -->|uses| Tools
    TO -->|uses| FCLLM
    Tools -->|emit| TModels
    FCLLM -->|returns| ACR
    OLL -->|extends| FCLLM
    OLL -->|HTTP| OServer
    SLLM -->|validates with| Pyd
    TO -->|validates with| Pyd
```
### Component Diagram

```mermaid
%% Architecture Overview: Component Diagram
flowchart TB
    classDef abstract fill:#f5f5f5,stroke:#666,stroke-dasharray: 5 3
    classDef concrete fill:#e3f2fd,stroke:#1565c0
    classDef interface fill:#fffde7,stroke:#f9a825
    classDef datamodel fill:#e8f5e9,stroke:#2e7d32
    classDef external fill:#ffebee,stroke:#c62828

    U[User App]:::external

    subgraph CBLLMs[Core.Base.LLMs]
        BaseLLM[BaseLLM metadata chat/complete stream async]:::abstract
        Models[Models Message MessageList ChatResponse CompletionResponse Metadata]:::datamodel
    end

    subgraph CLLM[Core.LLM]
        LLM[LLM predict apredict stream astream parse astream_parse]:::abstract
        FCLLM[FunctionCallingLLM generate_tool_calls invoke_callable stream astream]:::concrete
        SLLM[StructuredOutputLLM wrapper]:::concrete
    end

    subgraph CTools[Core.Tools]
        Tools[CallableTool ToolMetadata ToolOutput ToolCallArguments SyncAsyncConverter]:::concrete
    end

    subgraph CSTools[Core.StructuredTools]
        TLLM[ToolOrchestratingLLM]:::concrete
    end

    subgraph CPrompts[Core.Prompts]
        PT[PromptTemplate]:::concrete
        CPT[ChatPromptTemplate]:::concrete
    end

    subgraph CChat[Core.Chat]
        ACR[AgentChatResponse]:::concrete
    end

    subgraph LOllama[ollama]
        Ollama[Ollama chat complete stream astream structured]:::concrete
    end

    Server[Ollama Server HTTP]:::external
    Pyd[Pydantic Models]:::external

    %% interactions
    U --> PT
    U --> CPT
    PT --> LLM
    CPT --> LLM
    LLM --> Models
    FCLLM --> Tools
    TLLM --> FCLLM
    TLLM --> Tools
    SLLM --> LLM
    SLLM --> Pyd
    FCLLM --> ACR
    Ollama --> FCLLM
    Ollama -.-> Server
```
## 2. Class Diagrams

### LLM Hierarchy

```mermaid
%% Class Diagram: LLM Hierarchy
classDiagram
    class SerializableModel

    class BaseLLM {
        +metadata() Metadata
        +chat(messages, stream=false) ChatResponse | ChatResponseGen
        +complete(prompt, formatted=false, stream=false) CompletionResponse | CompletionResponseGen
        +achat(messages, stream=false) ChatResponse | ChatResponseAsyncGen
        +acomplete(prompt, formatted=false, stream=false) CompletionResponse | CompletionResponseAsyncGen
        +convert_chat_messages(messages) List[Any]
    }

    class LLM {
        +predict(prompt) str
        +stream(prompt) CompletionResponseGen
        +apredict(prompt) str
        +astream(prompt) CompletionResponseAsyncGen
        +parse(schema, prompt, llm_kwargs=None, **prompt_args) BaseModel
        +aparse(schema, prompt, llm_kwargs=None, **prompt_args) BaseModel
        +stream_parse(schema, prompt, llm_kwargs=None, **prompt_args) BaseModelGen
        +astream_parse(schema, prompt, llm_kwargs=None, **prompt_args) BaseModelAsyncGen
        +_get_prompt(prompt, prompt_args)
        +_get_messages(prompt, prompt_args)
        +_parse_output(output) Any
        +_extend_prompt(formatted_prompt) str
        +_extend_messages(messages) List[Message]
    }

    class FunctionCallingLLM {
        +generate_tool_calls(tools, user_msg, chat_history, verbose, allow_parallel_tool_calls, stream)
        +agenerate_tool_calls(tools, user_msg, chat_history, verbose, allow_parallel_tool_calls, stream)
        +invoke_callable(tools, ...)
        +ainvoke_callable(tools, ...)
        +get_tool_calls_from_response(response, error_on_no_tool_call)
    }

    class StructuredOutputLLM {
        +llm: LLM
        +output_cls: Type[BaseModel]
        +chat(messages, stream=false) ChatResponse | ChatResponseGen
        +achat(messages, stream=false) ChatResponse | ChatResponseAsyncGen
        +complete(prompt, stream=false) CompletionResponse | CompletionResponseGen
        +acomplete(prompt, stream=false) CompletionResponse | CompletionResponseAsyncGen
    }

    class Ollama {
        +chat(messages, stream=false) ChatResponse | ChatResponseGen
        +achat(messages, stream=false) ChatResponse | ChatResponseAsyncGen
        +complete(prompt, stream=false) CompletionResponse | CompletionResponseGen
        +acomplete(prompt, stream=false) CompletionResponse | CompletionResponseAsyncGen
        +parse(...)
        +aparse(...)
    }

    class Message
    class ChatResponse
    class CompletionResponse

    SerializableModel <|-- BaseLLM
    BaseLLM <|-- LLM
    LLM <|-- FunctionCallingLLM
    LLM <|-- StructuredOutputLLM
    FunctionCallingLLM <|-- Ollama

    BaseLLM ..> Message : uses
    LLM ..> Message : uses
    BaseLLM ..> ChatResponse : returns
    BaseLLM ..> CompletionResponse : returns
    StructuredOutputLLM ..> LLM : wraps
```
### Tool System

```mermaid
%% Class Diagram: Tool System
classDiagram
    class ToolMetadata {
        +get_schema()
        +tool_schema_str()
        +get_name()
        +to_openai_tool(skip_length_check=false)
    }

    class ToolOutput {
        +tool_name: str
        +content: str
        +chunks: List[Chunk]
        +is_error: bool
        +__str__()
    }

    class ToolCallArguments {
        +arguments: dict
    }

    class BaseTool {
        +metadata() ToolMetadata
        +call(input_values) ToolOutput
    }

    class AsyncBaseTool {
        +__call__(*args, **kwargs) ToolOutput
        +call(input_values) ToolOutput
        +acall(input_values) ToolOutput
    }

    class BaseToolAsyncAdapter {
        +tool: BaseTool
        +metadata() ToolMetadata
        +call(input_values)
        +acall(input_values)
    }

    class SyncAsyncConverter {
        +to_async(fn)
        +async_to_sync(fn)
    }

    class CallableTool {
        +from_function(func, name, description, return_direct, tool_schema, tool_metadata, default_arguments) CallableTool
        +from_model(output_cls) CallableTool
        +metadata() ToolMetadata
        +sync_func()
        +async_func()
        +input_func()
        +__call__(*args, **kwargs) ToolOutput
        +call(*args, **kwargs) ToolOutput
        +acall(*args, **kwargs) ToolOutput
    }

    BaseTool <|-- AsyncBaseTool
    AsyncBaseTool <|-- CallableTool
    AsyncBaseTool <|-- BaseToolAsyncAdapter

    CallableTool ..> ToolMetadata : uses
    CallableTool ..> SyncAsyncConverter : adapts
    BaseTool ..> ToolOutput : returns
    AsyncBaseTool ..> ToolOutput : returns
    CallableTool ..> ToolOutput : returns
    CallableTool ..> ToolCallArguments : parses
```
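The relationships in the diagram above can be illustrated with a minimal, self-contained sketch, not the actual Serapeum implementation: a tool wrapper derives its metadata from a plain function's signature and docstring, and every invocation is captured as a `ToolOutput`-like record (all class and field names here mirror the diagram but are illustrative).

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class ToolMetadata:
    """Name, description, and parameter names derived from a function."""
    name: str
    description: str
    parameters: Dict[str, str]


@dataclass
class ToolOutput:
    """Result of one tool invocation; errors are reported, not raised."""
    tool_name: str
    content: str
    is_error: bool = False


class CallableTool:
    """Minimal sketch of a tool built from a plain Python callable."""

    def __init__(self, func: Callable[..., Any], metadata: ToolMetadata):
        self._func = func
        self._metadata = metadata

    @classmethod
    def from_function(cls, func: Callable[..., Any],
                      name: Optional[str] = None,
                      description: Optional[str] = None) -> "CallableTool":
        # Derive metadata from the function itself, as the diagram suggests.
        sig = inspect.signature(func)
        params = {p.name: str(p.annotation) for p in sig.parameters.values()}
        meta = ToolMetadata(
            name=name or func.__name__,
            description=description or (func.__doc__ or "").strip(),
            parameters=params,
        )
        return cls(func, meta)

    @property
    def metadata(self) -> ToolMetadata:
        return self._metadata

    def call(self, **kwargs: Any) -> ToolOutput:
        try:
            result = self._func(**kwargs)
            return ToolOutput(self._metadata.name, str(result))
        except Exception as exc:
            return ToolOutput(self._metadata.name, str(exc), is_error=True)


def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


tool = CallableTool.from_function(add)
out = tool.call(a=2, b=3)
```

The key design point visible in both the diagram and the sketch is that a failing tool returns an error-flagged `ToolOutput` rather than raising, so an orchestrating LLM can inspect and recover from tool failures.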
### Prompt System

```mermaid
%% Class Diagram: Prompt System
classDiagram
    class BasePromptTemplate {
        +partial_format(**kwargs) BasePromptTemplate
        +format(llm=None, **kwargs) str
        +format_messages(llm=None, **kwargs) List[Message]
        +get_template(llm=None) Any
        -_map_template_vars(kwargs)
        -_map_function_vars(kwargs)
        -_map_all_vars(kwargs)
    }

    class PromptTemplate {
        +partial_format(**kwargs) PromptTemplate
        +format(llm=None, completion_to_prompt=None, **kwargs) str
        +format_messages(llm=None, **kwargs) List[Message]
        +get_template(llm=None) str
    }

    class ChatPromptTemplate {
        +from_messages(message_templates) ChatPromptTemplate
        +partial_format(**kwargs) ChatPromptTemplate
        +format(llm=None, messages_to_prompt=None, **kwargs) str
        +format_messages(llm=None, **kwargs) List[Message]
        +get_template(llm=None) List[Message]
    }

    class Message
    class MessageList

    BasePromptTemplate <|-- PromptTemplate
    BasePromptTemplate <|-- ChatPromptTemplate
    ChatPromptTemplate ..> Message : uses
    ChatPromptTemplate ..> MessageList : produces
    PromptTemplate ..> Message : produces via LLM
```
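The `partial_format`/`format` split shown above can be sketched in a few lines, assuming nothing beyond `str.format` semantics (this is an illustration of the pattern, not Serapeum's implementation): `partial_format` binds some variables and returns a new template, while `format` supplies the rest and produces the final string.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class PromptTemplate:
    """Minimal sketch: a template with deferred (partial) variable binding."""
    template: str
    partial_vars: Dict[str, Any] = field(default_factory=dict)

    def partial_format(self, **kwargs: Any) -> "PromptTemplate":
        # Bind some variables now; return a new template, leaving self intact.
        return PromptTemplate(self.template, {**self.partial_vars, **kwargs})

    def format(self, **kwargs: Any) -> str:
        # Late-supplied kwargs override earlier partial bindings.
        merged = {**self.partial_vars, **kwargs}
        return self.template.format(**merged)


qa = PromptTemplate("Answer as {persona}: {question}")
pirate = qa.partial_format(persona="a pirate")
prompt = pirate.format(question="Where is the treasure?")
```

Returning a new template from `partial_format` (rather than mutating in place) matches the diagram's return types and lets one base template be specialized several ways safely.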
## 3. Sequence Diagrams

### Simple LLM Prediction

```mermaid
%% Sequence Diagram: Simple LLM Prediction
sequenceDiagram
    autonumber
    actor User
    participant PT as PromptTemplate
    participant L as LLM
    participant O as Ollama
    participant OP as Output Parser

    User->>L: predict(prompt=PromptTemplate, **prompt_args)
    activate L
    L->>PT: format()/format_messages()
    PT-->>L: formatted prompt/messages
    L->>L: _extend_prompt/_extend_messages
    alt Completion path
        L->>O: complete(prompt)
        O-->>L: CompletionResponse
    else Chat path
        L->>O: chat(messages)
        O-->>L: ChatResponse
    end
    L->>OP: _parse_output(output)
    OP-->>L: parsed text
    L-->>User: Response (str)
    deactivate L
```
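The sequence above reduces to three steps: format the prompt, call the backend, parse the result. A minimal runnable sketch with a stub standing in for the Ollama backend (all names here are illustrative, not Serapeum's API):

```python
from typing import Callable


def predict(template: str, backend: Callable[[str], str],
            parse: Callable[[str], str] = str.strip, **prompt_args: str) -> str:
    """Sketch of the predict() sequence: format, call backend, parse."""
    formatted = template.format(**prompt_args)   # PromptTemplate.format()
    raw = backend(formatted)                     # complete()/chat() call
    return parse(raw)                            # _parse_output()


def fake_backend(prompt: str) -> str:
    # Stands in for an Ollama completion call.
    return f"  echo: {prompt}  "


answer = predict("Say hi to {name}", fake_backend, name="Ada")
```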
### Structured Output Generation

```mermaid
%% Sequence Diagram: Structured Output Generation
sequenceDiagram
    autonumber
    actor User
    participant L as LLM
    participant PT as PromptTemplate
    participant O as Ollama
    participant PJ as JSON Validator
    participant PM as Pydantic Model

    User->>L: parse(schema, prompt=PromptTemplate, llm_kwargs)
    activate L
    L->>PT: format()
    PT-->>L: formatted prompt
    L->>L: build JSON schema from schema
    note over L: Program creation for LLM
    L->>O: complete(prompt) or chat(messages)
    O-->>L: raw JSON string
    L->>PJ: validate JSON against schema
    PJ-->>L: validated JSON
    L->>PM: instantiate schema(**json)
    PM-->>L: model instance
    L-->>User: Pydantic model instance
    deactivate L
```
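The validate-then-instantiate tail of this sequence can be sketched with the standard library alone; a dataclass stands in for the Pydantic model, and the function name is illustrative:

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Album:
    """Stand-in for a Pydantic output model."""
    title: str
    year: int


def parse_structured(raw: str, schema=Album):
    """Sketch of parse(): decode raw LLM JSON, validate keys, instantiate."""
    data = json.loads(raw)                        # raw JSON string -> dict
    expected = {f.name for f in fields(schema)}
    missing = expected - data.keys()
    if missing:                                   # validation step
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Drop unknown keys, then instantiate the schema (schema(**json)).
    return schema(**{k: v for k, v in data.items() if k in expected})


album = parse_structured('{"title": "Kind of Blue", "year": 1959, "extra": 1}')
```

A real implementation would also validate field types against the generated JSON schema; the sketch only checks key presence to keep the shape of the sequence visible.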
### Function Calling / Tool Execution

```mermaid
%% Sequence Diagram: Function Calling / Tool Execution
sequenceDiagram
    autonumber
    actor User
    participant F as FunctionCallingLLM
    participant O as Ollama
    participant T as Tools
    participant A as AgentChatResponse

    User->>F: invoke_callable(tools, user_msg, chat_history, ...)
    activate F
    F->>F: _prepare_chat_with_tools()
    F->>O: chat(messages with tool schemas)
    O-->>F: ChatResponse (with tool_calls)
    F->>F: _validate_chat_with_tools_response()
    F->>F: get_tool_calls_from_response()
    par execute tools (maybe parallel)
        F->>T: call(ToolCallArguments)
        T-->>F: ToolOutput
    and more tools
        F->>T: call(...)
        T-->>F: ToolOutput
    end
    F->>A: build AgentChatResponse(tool_outputs, model_response)
    A-->>User: AgentChatResponse
    deactivate F
```
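The extract-and-dispatch step in the middle of this sequence, reading tool calls out of a chat response and routing each to a registered tool, can be sketched as follows (the response shape and function names are assumptions for illustration, not the package's types):

```python
from typing import Any, Callable, Dict, List


def dispatch_tool_calls(response: Dict[str, Any],
                        tools: Dict[str, Callable[..., Any]]) -> List[str]:
    """Sketch of get_tool_calls_from_response() followed by tool execution."""
    outputs: List[str] = []
    for call in response.get("tool_calls", []):
        fn = tools.get(call["name"])
        if fn is None:
            # Unknown tool: report the error instead of raising.
            outputs.append(f"error: unknown tool {call['name']}")
            continue
        outputs.append(str(fn(**call["arguments"])))
    return outputs


tools = {"add": lambda a, b: a + b, "upper": lambda s: s.upper()}
resp = {"tool_calls": [{"name": "add", "arguments": {"a": 1, "b": 2}},
                       {"name": "upper", "arguments": {"s": "hi"}}]}
results = dispatch_tool_calls(resp, tools)
```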
### Tool Orchestration (ToolOrchestratingLLM)

```mermaid
%% Sequence Diagram: Tool Orchestration (ToolOrchestratingLLM)
sequenceDiagram
    autonumber
    actor User
    participant TLLM as ToolOrchestratingLLM
    participant PT as PromptTemplate
    participant CT as CallableTool
    participant F as FunctionCallingLLM
    participant P as Pydantic Model

    User->>TLLM: call(**kwargs, llm_kwargs)
    activate TLLM
    TLLM->>PT: format()/format_messages()
    PT-->>TLLM: formatted prompt/messages
    TLLM->>CT: from_model(schema)
    CT-->>TLLM: tool instance
    TLLM->>F: invoke_callable([tool], user_msg, chat_history, ...)
    F-->>TLLM: AgentChatResponse with ToolOutputs
    TLLM->>P: validate and instantiate schema
    P-->>TLLM: model instance
    TLLM-->>User: Pydantic model instance
    deactivate TLLM

    opt Streaming
        User->>TLLM: call(stream=True) / acall(stream=True)
        TLLM-->>User: partial models (incremental)
        TLLM-->>User: final validated model
    end
```
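The orchestration sequence, expose a schema as a tool, let a function-calling LLM fill in its arguments, then validate by instantiating the schema, can be condensed into a runnable sketch; the stub below stands in for `FunctionCallingLLM.invoke_callable`, and all names are illustrative:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class City:
    """Stand-in for a Pydantic schema handed to the orchestrator."""
    name: str
    population: int


def orchestrate(schema, llm: Callable[[str], Dict[str, Any]]):
    """Sketch of the orchestration loop: schema -> tool -> LLM -> model."""
    # 1. Expose the schema as a tool (CallableTool.from_model in the diagram);
    #    here the tool is just the schema's name.
    tool_name = schema.__name__
    # 2. The function-calling LLM returns arguments for that tool.
    args = llm(tool_name)
    # 3. Validate by instantiating the schema with the returned arguments.
    return schema(**args)


def stub_llm(tool_name: str) -> Dict[str, Any]:
    # Stands in for invoke_callable(); a real LLM would produce these values.
    return {"name": "Alexandria", "population": 5200000}


city = orchestrate(City, stub_llm)
```

Step 3 is the safety net of the whole pattern: whatever the LLM returns, the caller only ever receives a successfully constructed schema instance.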
## 4. Data Flow Diagrams

### Message Processing Flow

```mermaid
%% Data Flow: Message Processing
flowchart LR
    classDef datamodel fill:#e8f5e9,stroke:#2e7d32
    classDef process fill:#e3f2fd,stroke:#1565c0

    S[String or Dict]:::datamodel
    M[Message]:::datamodel
    ML[MessageList]:::datamodel
    L[LLM]:::process
    CR[ChatResponse]:::datamodel
    CO[CompletionResponse]:::datamodel
    P[Parsed Output]:::process

    S --> M
    M --> ML
    ML --> L
    L --> CR
    L --> CO
    CR --> P
    CO --> P
```
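The left-hand side of this flow, coercing strings and dicts into `Message` objects before they reach the LLM, can be sketched like this (field names mirror the diagram; the coercion rules are an illustrative assumption):

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Message:
    role: str
    content: str


def to_message(item: Union[str, dict, "Message"]) -> Message:
    """Sketch of coercing heterogeneous input into a Message."""
    if isinstance(item, Message):
        return item
    if isinstance(item, str):
        # Bare strings are treated as user messages.
        return Message(role="user", content=item)
    return Message(role=item.get("role", "user"), content=item["content"])


messages = [to_message(x) for x in ["hi", {"role": "system", "content": "be brief"}]]
```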
### Streaming Data Flow

```mermaid
%% Data Flow: Streaming (Sync and Async)
flowchart TB
    classDef process fill:#e3f2fd,stroke:#1565c0
    classDef datamodel fill:#e8f5e9,stroke:#2e7d32

    U[User]:::process
    SC["chat(stream=True)"]:::process
    SSC["complete(stream=True)"]:::process
    G[Generator ChatResponseGen]:::datamodel
    CG[Generator CompletionResponseGen]:::datamodel
    ASP["achat(stream=True)"]:::process
    ASSP["acomplete(stream=True)"]:::process
    AG[AsyncGenerator ChatResponseAsyncGen]:::datamodel
    ACG[AsyncGenerator CompletionResponseAsyncGen]:::datamodel
    TP[stream_response_to_tokens]:::process
    T[Tokens]:::datamodel

    subgraph Sync_Path
        U --> SC --> G --> TP --> T
        U --> SSC --> CG --> TP --> T
    end

    subgraph Async_Path
        U --> ASP --> AG --> TP --> T
        U --> ASSP --> ACG --> TP --> T
    end
```
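The `stream_response_to_tokens` stage in both paths is essentially a generator-to-generator transform; a minimal sync sketch (the function body is an assumption, the async path follows the same shape with `async for`):

```python
from typing import Iterator, List


def stream_response_to_tokens(deltas: Iterator[str]) -> Iterator[str]:
    """Sketch: convert a streaming response into tokens as they arrive."""
    for delta in deltas:
        if delta:                 # skip empty keep-alive chunks
            yield delta


def fake_stream() -> Iterator[str]:
    # Stands in for chat(stream=True) yielding response deltas.
    yield from ["Hel", "", "lo ", "world"]


tokens: List[str] = list(stream_response_to_tokens(fake_stream()))
final = "".join(tokens)
```

Because the transform is itself a generator, tokens reach the user as soon as the backend emits them; nothing is buffered beyond the current delta.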
## 5. Component Interaction Diagram

```mermaid
%% Component Interaction Diagram
flowchart TB
    classDef process fill:#e3f2fd,stroke:#1565c0
    classDef datamodel fill:#e8f5e9,stroke:#2e7d32
    classDef external fill:#ffebee,stroke:#c62828

    PT[Prompt Templates]:::process
    L[LLM]:::process
    OP[Output Parsers]:::process
    F[FunctionCallingLLM]:::process
    T[Tools System]:::process
    ACR[AgentChatResponse]:::datamodel
    TO[ToolOrchestratingLLM]:::process
    CT[CallableTool]:::process
    P[Pydantic Models]:::external

    PT -- format --> L
    L -- uses --> OP
    F -- extends --> L
    F -- calls --> T
    T -- returns --> ACR
    F -- builds --> ACR
    TO -- wraps --> L
    TO -- creates --> CT
    CT -- validates --> P
    TO -- validates --> P
    TO -- delegates --> F
    L -- returns --> ACR
```
## 6. State Machine Diagrams

### Tool Execution Lifecycle

```mermaid
%% State Machine: Tool Execution Lifecycle
stateDiagram-v2
    [*] --> Created
    Created --> Prepared: validate metadata and inputs
    Prepared --> Executing: start call
    Executing --> Completed: success
    Executing --> Failed: exception or error
    Failed --> Completed: error handled (optional)
```
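The lifecycle above can be encoded as an explicit transition table; a minimal sketch with illustrative names (not the package's internals), which rejects any transition the diagram does not allow:

```python
from enum import Enum, auto


class ToolState(Enum):
    CREATED = auto()
    PREPARED = auto()
    EXECUTING = auto()
    COMPLETED = auto()
    FAILED = auto()


# Allowed transitions, mirroring the state diagram above.
TRANSITIONS = {
    ToolState.CREATED: {ToolState.PREPARED},
    ToolState.PREPARED: {ToolState.EXECUTING},
    ToolState.EXECUTING: {ToolState.COMPLETED, ToolState.FAILED},
    ToolState.FAILED: {ToolState.COMPLETED},   # error handled (optional)
    ToolState.COMPLETED: set(),                # terminal
}


def advance(state: ToolState, target: ToolState) -> ToolState:
    """Move to target if the diagram permits it, else raise."""
    if target not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state.name} -> {target.name}")
    return target
```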
### Streaming Response States

```mermaid
%% State Machine: Streaming Response States
stateDiagram-v2
    [*] --> Idle
    Idle --> Streaming: request started
    Streaming --> Buffering: receive deltas
    Buffering --> Parsing: chunk to tokens
    Parsing --> Streaming: more deltas
    Parsing --> Complete: end of stream
    Streaming --> Error: network or protocol error
    Error --> Complete: error handled
```
## 7. Deployment/Integration

```mermaid
%% Deployment / Integration Diagram
flowchart LR
    classDef external fill:#ffebee,stroke:#c62828
    classDef lib fill:#e3f2fd,stroke:#1565c0
    classDef process fill:#fff3e0,stroke:#ef6c00

    UA[User Application]:::external
    PY[Pydantic Models]:::external
    OL[Ollama Server API]:::external

    subgraph SER[Serapeum Library]
        direction TB
        CORE[Core Modules LLM Tools Prompts Chat Structured]:::lib
        OLL[Ollama Backend]:::lib
    end

    UA -->|install import| CORE
    OLL -->|extends| CORE
    OLL -.->|HTTP| OL
    CORE -->|validate| PY
    UA -->|provides models| PY
    UA -->|sync or async| CORE
```
## 8. Package Dependency Graph

```mermaid
%% Package Dependency Graph (Layered)
flowchart TB
    classDef layer1 fill:#ede7f6,stroke:#5e35b1
    classDef layer2 fill:#e0f7fa,stroke:#00838f
    classDef layer3 fill:#fff3e0,stroke:#ef6c00
    classDef layer4 fill:#e3f2fd,stroke:#1565c0
    classDef external fill:#ffebee,stroke:#c62828

    subgraph L1[Layer 1 Base Models and Utilities]
        direction TB
        L1A[core.base.llms.types]:::layer1
        L1B[core.base.llms.base]:::layer1
    end

    subgraph L2[Layer 2 Core Abstractions]
        direction TB
        L2A[core.llms.base]:::layer2
        L2B[core.prompts.base]:::layer2
        L2C[core.output_parsers]:::layer2
    end

    subgraph L3[Layer 3 High level Orchestration]
        direction TB
        L3A[core.llms.function_calling]:::layer3
        L3B[core.llms.structured_output_llm]:::layer3
        L3C[core.llms.orchestrators]:::layer3
        L3D[core.tools.callable_tool]:::layer3
        L3E[core.chat.types]:::layer3
    end

    subgraph L4[Layer 4 Concrete Implementations]
        direction TB
        L4A[ollama.base]:::layer4
    end

    PY[Pydantic]:::external
    OL[ollama]:::external
    AS[asyncio]:::external

    %% Layer dependencies
    L2A --> L1A
    L2A --> L1B
    L2B --> L1A
    L3A --> L2A
    L3A --> L2B
    L3B --> L2A
    L3B --> L2B
    L3C --> L3A
    L3C --> L3D
    L3D --> L2B
    L3E --> L1A
    L4A --> L3A

    %% External deps
    PY -.-> L1A
    PY -.-> L3B
    PY -.-> L3C
    OL -.-> L4A
    AS -.-> L3A
    AS -.-> L4A
```
See also: Overview/Codebase Map