LLM Classes Comparison#

This document provides a comprehensive comparison of the four main LLM classes in Serapeum's core library, explaining their purposes, relationships, and when to use each one.

Overview#

Serapeum provides four distinct LLM classes organized across two architectural layers:

┌─────────────────────────────────────────────────────────────┐
│  Orchestration Layer (High-level workflows)                 │
├─────────────────────────────────────────────────────────────┤
│  ToolOrchestratingLLM       │  TextCompletionLLM            │
│  (uses function calling)    │  (uses text parsing)          │
│  - Converts models to tools │  - Binds prompt+parser+LLM    │
│  - Executes tool calls      │  - Parses raw text output     │
│  - Returns Pydantic models  │  - Returns Pydantic models    │
└─────────────────────────────────────────────────────────────┘
                            │ uses
┌─────────────────────────────────────────────────────────────┐
│  LLM Layer (Core abstractions)                              │
├─────────────────────────────────────────────────────────────┤
│  FunctionCallingLLM         │  StructuredOutputLLM          │
│  (base for providers)       │  (wrapper for structured IO)  │
│  - Tool calling interface   │  - Forces Pydantic outputs    │
│  - Provider implementations │  - Wraps any LLM              │
│  - Abstract methods         │  - Format conversion          │
└─────────────────────────────────────────────────────────────┘

Detailed Comparison#

1. FunctionCallingLLM#

Location: libs/core/src/serapeum/core/llms/function_calling.py:21
Layer: LLM Layer (core abstraction)
Type: Base class for provider implementations

Purpose#

Provides the foundation for LLM providers that support function/tool calling. This is an abstract base class that concrete provider implementations (like Ollama, OpenAI) should inherit from.

Key Features#

  • Extends the base LLM class with tool-calling capabilities
  • Provides convenience methods for tool workflows:
      • generate_tool_calls(stream=False) - Chat with function calling (sync); pass stream=True to stream
      • agenerate_tool_calls(stream=False) - Chat with function calling (async); pass stream=True to stream
      • invoke_callable() - Predict and execute a tool (sync)
      • ainvoke_callable() - Predict and execute a tool (async)
      • get_tool_calls_from_response() - Extract tool calls from a response
  • Abstract method _prepare_chat_with_tools() that providers must implement
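The "predict and execute" flow behind methods such as invoke_callable() can be pictured with a small, dependency-free sketch. Everything in it (ToolCall, run_tool_calls, the registry dict) is hypothetical and is not Serapeum's API; it only illustrates how tool calls extracted from a response might be dispatched to Python callables.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolCall:
    """Hypothetical stand-in for a tool call extracted from an LLM response."""
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


def run_tool_calls(
    tool_calls: list[ToolCall],
    registry: dict[str, Callable[..., Any]],
) -> list[Any]:
    """Look up each requested tool by name and execute it with its arguments."""
    results = []
    for call in tool_calls:
        if call.name not in registry:
            raise KeyError(f"Model requested unknown tool: {call.name}")
        results.append(registry[call.name](**call.arguments))
    return results


def add(a: int, b: int) -> int:
    """Toy tool used only for this illustration."""
    return a + b


# Pretend the model answered with a single call to the `add` tool.
predicted = [ToolCall(name="add", arguments={"a": 2, "b": 3})]
outputs = run_tool_calls(predicted, {"add": add})
print(outputs)  # [5]
```

A real provider would build the list of tool calls from its own response format, which is the job the source assigns to get_tool_calls_from_response().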

When to Use#

  • You're implementing a new provider (e.g., OpenAI, Anthropic, Cohere)
  • You need the base functionality for tool/function calling
  • You're building low-level LLM integrations

Example#

from serapeum.core.llms import FunctionCallingLLM

class MyProviderLLM(FunctionCallingLLM):
    """Custom provider implementation."""

    def _prepare_chat_with_tools(self, tools, **kwargs):
        # Convert tools to provider-specific format
        tool_schemas = [tool.to_json_schema() for tool in tools]
        return {
            "messages": kwargs.get("chat_history", []),
            "tools": tool_schemas,
        }

    def get_tool_calls_from_response(self, response, **kwargs):
        # Extract tool calls from provider response
        return response.tool_calls

2. StructuredOutputLLM#

Location: libs/core/src/serapeum/core/llms/structured_output_llm.py:25
Layer: LLM Layer (wrapper)
Type: Wrapper class for structured outputs

Purpose#

Wraps an existing LLM to force all outputs into a specific Pydantic model format. Acts as an adapter that converts any LLM into a structured output generator.

Key Features#

  • Takes two inputs:
      • llm: Any LLM instance (base LLM, function-calling LLM, etc.)
      • output_cls: A Pydantic model class defining the output structure
  • Delegates to the underlying LLM's parse() method
  • Converts all responses to JSON representations of the output model
  • Maintains the same interface as the base LLM (chat(stream=...), etc.)
  • Supports streaming structured outputs
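The wrapper idea can be sketched without any dependencies. The names below (StructuredWrapper, fake_llm) are hypothetical, a stdlib dataclass stands in for a Pydantic model, and the wrapped "LLM" is just a function returning canned JSON. This is not how StructuredOutputLLM is implemented; it only shows the adapter pattern of delegating to an inner model and coercing the reply into a fixed structure.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PersonInfo:
    """Stdlib stand-in for a Pydantic output model."""
    name: str
    age: int


class StructuredWrapper:
    """Hypothetical adapter: wraps a text-in/text-out callable and an output class."""

    def __init__(self, llm: Callable[[str], str], output_cls: type) -> None:
        self.llm = llm
        self.output_cls = output_cls

    def chat(self, prompt: str) -> Any:
        raw_text = self.llm(prompt)      # delegate to the wrapped model
        data = json.loads(raw_text)      # expect a JSON object back
        return self.output_cls(**data)   # coerce into the target structure


def fake_llm(prompt: str) -> str:
    # A canned reply so the sketch runs offline.
    return '{"name": "Alice", "age": 30}'


person = StructuredWrapper(fake_llm, PersonInfo).chat("Tell me about Alice")
print(person)  # PersonInfo(name='Alice', age=30)
```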

When to Use#

  • You want to guarantee a specific output format from any LLM
  • You're wrapping an existing LLM to enforce schema compliance
  • You need structured outputs without manually handling parsing

Example#

```python function_calling
import os
from pydantic import BaseModel
from serapeum.ollama import Ollama
from serapeum.core.llms import StructuredOutputLLM, Message, TextChunk


class PersonInfo(BaseModel):
    name: str
    age: int
    occupation: str


# Wrap an LLM to always return PersonInfo
base_llm = Ollama(
    model="llama3.1",
    timeout=90
)
structured_llm = StructuredOutputLLM(
    llm=base_llm,
    output_cls=PersonInfo
)

# All responses will be PersonInfo instances
response = structured_llm.chat([
    Message(role="user", chunks=[TextChunk(content="Tell me about Alice, a 30-year-old engineer")])
])
print(response.raw)
# PersonInfo(name='Alice', age=30, occupation='Engineer')
```

---

### 3. ToolOrchestratingLLM

**Location**: `libs/core/src/serapeum/core/llms/orchestrators/tool_based.py:33`
**Layer**: Orchestration Layer (high-level)
**Type**: Orchestrator for function-calling workflows

#### Purpose
High-level orchestrator that converts Pydantic models or Python functions into tools, executes them via function-calling, and returns structured outputs. This is the recommended way to get structured outputs from function-calling models.

#### Key Features
- **Automatic tool creation**: Converts Pydantic models OR Python functions to `CallableTool` instances
- **Full orchestration**: Handles prompt formatting, LLM invocation, tool execution, and output parsing
- **Flexible inputs**:
  - `output_cls`: Either a Pydantic model or a callable function
  - `prompt`: Template string or `BasePromptTemplate`
  - `llm`: A `FunctionCallingLLM` instance
- **Advanced capabilities**:
  - Streaming support via `__call__(stream=True)` and `acall(stream=True)`
  - Parallel tool calls with `allow_parallel_tool_calls=True`
  - Custom tool selection with `tool_choice` parameter
- **Sync and async**: Both `__call__()` and `acall()` methods
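
To make "automatic tool creation" concrete, here is a minimal sketch of deriving a tool description from a plain Python function. The helper `function_to_tool_schema` and the `_JSON_TYPES` mapping are hypothetical, and the output shape is only JSON-Schema-like; how `CallableTool` actually builds its schema is not shown in this document.

```python
import inspect
from typing import Any, Callable

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def function_to_tool_schema(func: Callable[..., Any]) -> dict[str, Any]:
    """Build a JSON-Schema-like tool description from a function signature."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def calculate_sum(a: int, b: int) -> dict:
    """Calculate the sum of two numbers."""
    return {"result": a + b}


schema = function_to_tool_schema(calculate_sum)
print(schema["name"], schema["parameters"]["required"])  # calculate_sum ['a', 'b']
```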

#### When to Use
- **You want structured outputs from a function-calling model** (recommended approach)
- You're building applications that need reliable Pydantic outputs
- You want automatic tool creation from your data models
- You need streaming structured outputs
- You're using modern LLMs with function-calling support (GPT-4, Claude, Llama 3.1+)

#### Example with Pydantic Model

```python function_calling
import os
from pydantic import BaseModel
from serapeum.ollama import Ollama
from serapeum.core.llms import ToolOrchestratingLLM


class WeatherInfo(BaseModel):
    """Weather information for a location."""
    location: str
    temperature: float
    conditions: str

llm = Ollama(
    model="llama3.1",
)
# Create orchestrator
weather_extractor = ToolOrchestratingLLM(
    schema=WeatherInfo,
    prompt="Extract weather information from: {text}",
    llm=llm,
)

# Get structured output
result = weather_extractor(
    text="It's 72 degrees and sunny in San Francisco"
)
print(result)
# WeatherInfo(location='San Francisco', temperature=72.0, conditions='sunny')
```

Example with Function#

```python function_calling
import os
from serapeum.ollama import Ollama
from serapeum.core.llms import ToolOrchestratingLLM


def calculate_sum(a: int, b: int) -> dict:
    """Calculate the sum of two numbers."""
    return {"result": a + b}


llm = Ollama(model="llama3.1")

# Create orchestrator with function
calculator = ToolOrchestratingLLM(
    schema=calculate_sum,
    prompt="Calculate the sum of {x} and {y}",
    llm=llm,
)

result = calculator(x=5, y=3)
print(result)
# {'result': 8}
```

#### Example with Streaming

```python
import os
from pydantic import BaseModel
from serapeum.ollama import Ollama
from serapeum.core.llms import ToolOrchestratingLLM


class Story(BaseModel):
    title: str
    content: str
    genre: str

story_generator = ToolOrchestratingLLM(
    schema=Story,
    prompt="Generate a short {genre} story",
    llm=Ollama(model="qwen3.5:397b", api_key=os.environ.get("OLLAMA_API_KEY"), timeout=90),
)

# Stream partial results
for partial_story in story_generator(genre="sci-fi", stream=True):
    print(partial_story)  # Progressively complete Story objects
```

4. TextCompletionLLM#

Location: libs/core/src/serapeum/core/llms/orchestrators/text_completion_llm.py:14
Layer: Orchestration Layer (simpler alternative)
Type: Text-based structured output generator

Purpose#

Provides structured outputs by parsing raw text completions (without using function calling). This is useful for models that don't support function calling or when you prefer text-based parsing.

Key Features#

  • Simple pipeline: Binds prompt + output parser + LLM together
  • Text-based parsing: Uses PydanticParser to parse raw LLM output into Pydantic models
  • No function calling required: Works with any LLM (chat or completion models)
  • Explicit parsing: Uses output parsers to handle the conversion
  • Lightweight: Less overhead than function-calling approaches
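Text-based parsing can be illustrated with a small stdlib sketch. The parse_task helper is hypothetical and a dataclass stands in for a Pydantic model; it is not the PydanticParser implementation, only one plausible way to recover a JSON object from a raw completion that contains extra chatter.

```python
import json
from dataclasses import dataclass


@dataclass
class Task:
    """Stdlib stand-in for a Pydantic model."""
    title: str
    priority: int
    completed: bool


def parse_task(raw_text: str) -> Task:
    """Slice from the first '{' to the last '}' and build a Task from that JSON."""
    start = raw_text.find("{")
    end = raw_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in model output")
    return Task(**json.loads(raw_text[start : end + 1]))


# Models often wrap JSON in surrounding text, so the parser has to tolerate it.
completion = 'Sure! {"title": "Finish the report", "priority": 1, "completed": false} Done.'
task = parse_task(completion)
print(task)  # Task(title='Finish the report', priority=1, completed=False)
```

Because the whole pipeline is prompt, raw text, then parser, failures surface as ordinary parsing errors, which is part of what makes this approach transparent.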

When to Use#

  • Your LLM doesn't support function calling (older models, smaller models)
  • You prefer text-based parsing over function calling
  • You want explicit control over the parsing logic
  • You're working with completion-style models (non-chat)
  • You need a simpler, more transparent approach

Example#

import os
from pydantic import BaseModel
from serapeum.ollama import Ollama
from serapeum.core.output_parsers import PydanticParser
from serapeum.core.llms import TextCompletionLLM

class Task(BaseModel):
    title: str
    priority: int
    completed: bool

llm = Ollama(model="ministral-3:14b", api_key=os.environ.get("OLLAMA_API_KEY"), timeout=90)
# Create text completion LLM
task_extractor = TextCompletionLLM(
    output_parser=PydanticParser(output_cls=Task),
    prompt="Extract task information from: {text}.",
    llm=llm,
)

result = task_extractor(
    text="Finish the report - high priority, not done yet"
)
result
# Task(title='Finish the report', priority=1, completed=False)

Example with Just output_cls#

import os
from pydantic import BaseModel
from serapeum.ollama import Ollama
from serapeum.core.llms import TextCompletionLLM

class Product(BaseModel):
    name: str
    price: float

# Parser is auto-created from output_cls
product_extractor = TextCompletionLLM(
    output_cls=Product,  # Parser created automatically
    prompt="Extract product: {description}",
    llm=Ollama(model="qwen3.5:397b", api_key=os.environ.get("OLLAMA_API_KEY"), timeout=90),
)

result = product_extractor(description="iPhone 15 Pro - $999")
result
# Product(name='iPhone 15 Pro', price=999.0)

Comparison Matrix#

| Feature | FunctionCallingLLM | StructuredOutputLLM | ToolOrchestratingLLM | TextCompletionLLM |
|---|---|---|---|---|
| Layer | LLM | LLM | Orchestration | Orchestration |
| Type | Base class | Wrapper | Orchestrator | Pipeline |
| Requires Function Calling | N/A | No | Yes | No |
| Primary Use Case | Building providers | Enforcing output format | Structured outputs (recommended) | Text-based structured outputs |
| Input | N/A | LLM + output_cls | output_cls + prompt + LLM | prompt + parser + LLM |
| Output | ChatResponse | ChatResponse (with Pydantic in raw) | Pydantic model(s) | Pydantic model |
| Streaming Support | Yes | Yes | Yes | No |
| Parallel Tool Calls | N/A | No | Yes | No |
| Complexity | High (abstract) | Low | Medium | Low |
| Flexibility | High | Low | High | Medium |

Decision Tree: Which Class Should I Use?#

Are you implementing a new LLM provider?
├─ YES → Use FunctionCallingLLM (inherit from it)
└─ NO → Continue...

Do you need structured Pydantic outputs?
├─ NO → Use base LLM classes
└─ YES → Continue...

Does your LLM support function calling?
├─ NO → Use TextCompletionLLM
└─ YES → Continue...

Do you just want to wrap an existing LLM to enforce a format?
├─ YES → Use StructuredOutputLLM
└─ NO → Use ToolOrchestratingLLM (recommended for most use cases)
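
The same decision tree can be written as a small function, which makes the order of the checks explicit. The function name and its boolean parameters are hypothetical; the returned strings are simply the class names used in this document.

```python
def choose_llm_class(
    implementing_provider: bool,
    needs_structured_output: bool,
    supports_function_calling: bool,
    only_wrapping_existing_llm: bool,
) -> str:
    """Walk the decision tree top to bottom and return the suggested class name."""
    if implementing_provider:
        return "FunctionCallingLLM"
    if not needs_structured_output:
        return "base LLM"
    if not supports_function_calling:
        return "TextCompletionLLM"
    if only_wrapping_existing_llm:
        return "StructuredOutputLLM"
    return "ToolOrchestratingLLM"


# An application that needs Pydantic outputs from a function-calling model.
print(choose_llm_class(False, True, True, False))  # ToolOrchestratingLLM
```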

Best Practices#

For Application Developers#

  1. Default to ToolOrchestratingLLM for structured outputs with modern LLMs:
     • Most flexible and powerful
     • Handles tool creation automatically
     • Supports streaming and parallel calls

  2. Use TextCompletionLLM when:
     • Your model doesn't support function calling
     • You prefer explicit text parsing
     • You need simpler, more predictable behavior

  3. Use StructuredOutputLLM when:
     • You have an existing LLM instance you want to wrap
     • You just need to enforce an output format
     • You don't need tool orchestration features

For Framework Developers#

  1. Inherit from FunctionCallingLLM when building provider integrations:
     • Implement _prepare_chat_with_tools() for your provider's format
     • Implement get_tool_calls_from_response() to extract tool calls
     • Follow the async/streaming patterns from existing providers (e.g., Ollama)

  2. Compose higher-level abstractions using the orchestration layer:
     • Build on ToolOrchestratingLLM for complex workflows
     • Create domain-specific wrappers around TextCompletionLLM

Code References#

  • FunctionCallingLLM: libs/core/src/serapeum/core/llms/function_calling.py
  • StructuredOutputLLM: libs/core/src/serapeum/core/llms/structured_output_llm.py
  • ToolOrchestratingLLM: libs/core/src/serapeum/core/llms/orchestrators/tool_based.py
  • TextCompletionLLM: libs/core/src/serapeum/core/llms/orchestrators/text_completion_llm.py