Completion Adapters

Completion Adapters provide a standardized interface for integrating various AI completion providers (OpenAI, Ollama, Grok, Cohere, etc.) into the agent-swarm-kit system. These adapters handle the translation between the framework's internal message format and each provider's specific API requirements, including message transformation, tool call processing, and error handling.

For information about AI model performance monitoring, see Performance Monitoring. For details about agent-level completion configuration, see Schema Services.

The completion adapter system is built around the AdapterUtils class, which provides factory methods that return standardized completion functions. Each adapter method transforms the internal ICompletionArgs format into provider-specific requests and converts responses back to the framework's IModelMessage format.

Mermaid Diagram

The AdapterUtils class serves as the primary factory for creating provider-specific completion functions. Each method returns a TCompleteFn that conforms to the standard completion interface.

Component Type Purpose
AdapterUtils Class Main factory for completion adapters
TCompleteFn Type Standard completion function signature
TOOL_PROTOCOL_PROMPT Constant XML-based tool call instruction template
Adapter Singleton Default instance of AdapterUtils

The adapter system uses several configuration constants for reliability and performance:

Constant Value Purpose
EXECPOOL_SIZE 5 Maximum concurrent executions
EXECPOOL_WAIT 0 Delay between executions (ms)
RETRY_COUNT 5 Maximum retry attempts
RETRY_DELAY 5000 Delay between retries (ms)

The fromOpenAI method creates an adapter for OpenAI's chat completions API, supporting both GPT models and custom response formats.

Mermaid Diagram

The fromOllama method handles Ollama's local model API, including custom tool call protocol injection for models that don't natively support function calling.

Key features:

  • Injects TOOL_PROTOCOL_PROMPT for tool call support
  • Handles local model persistence with keep_alive: "24h"
  • Generates tool call IDs using randomString()
Provider Method Key Features
Grok fromGrok X.AI's Grok model API integration
Cortex fromCortex Local Cortex server with message deduplication
Cohere fromCohereClientV2 Cohere Command-R model support
LMStudio fromLMStudio OpenAI-compatible local server

All adapters perform similar input transformations to convert from the internal ICompletionArgs format:

Mermaid Diagram

Tool calls require special handling to convert between the internal format and provider-specific representations:

Mermaid Diagram

All adapters implement retry logic using the retry utility with configurable attempts and delays:

retry(
async (args: ICompletionArgs) => {
// Adapter implementation
},
RETRY_COUNT, // 5 attempts
RETRY_DELAY // 5000ms delay
)

Requests are managed through execpool to limit concurrency and prevent API rate limiting:

execpool(
retryFunction,
{
maxExec: EXECPOOL_SIZE, // 5 concurrent
delay: EXECPOOL_WAIT // 0ms delay
}
)

Some adapters (like Cortex) implement message deduplication to handle consecutive messages of the same role:

All adapters integrate with the Logger class for request tracking:

Logger.logClient(
clientId,
"AdapterUtils fromOpenAI completion",
JSON.stringify(rawMessages)
);

Completion adapters are registered and accessed through the completion schema services in the dependency injection system. The adapters are typically configured at the agent level through completion schema definitions.

Adapter executions are tracked by the performance monitoring system for metrics collection and analysis. See Performance Monitoring for details on how completion times and throughput are measured.