Integrate LLM services

Required Hotfix for Version 3.7.0

It has been determined that hotfix hotFix_ANS_370_004 is required for the LLM service to work properly on Module Suite version 3.7.0. Please ensure this hotfix is applied to your system before using the LLM service. Failure to apply this hotfix may result in unexpected behavior or errors when using LLM-related features.

To obtain and apply this hotfix, please contact AnswerModules support or refer to the official documentation for hotfix installation procedures.

Integrate Large Language Models in your workflow

Introduction

Large Language Models (LLMs) are revolutionizing the way organizations process and leverage information. These sophisticated AI models, trained on vast amounts of textual data, can understand, generate, and manipulate human-like text with remarkable accuracy. As businesses increasingly deal with enormous volumes of unstructured data, integrating LLMs into existing workflows has become a game-changer for enhancing productivity, improving decision-making processes, and unlocking new insights.

Module Suite plays a crucial role in seamlessly incorporating LLMs into your enterprise content management ecosystem. By bridging the gap between your organization's content repositories and cutting-edge AI capabilities, Module Suite empowers you to:

  1. Automate content classification: Leverage LLMs to automatically categorize and classify documents, making information retrieval faster and more accurate.

  2. Enhance search functionality: Utilize natural language processing to improve search results, allowing users to find relevant information using conversational queries.

  3. Generate intelligent summaries: Create concise summaries of lengthy documents, enabling quick understanding of key points without manual review.

  4. Streamline content creation: Assist users in drafting documents, emails, and reports by providing AI-powered suggestions and completions.

  5. Facilitate knowledge discovery: Uncover hidden patterns and relationships within your content, leading to valuable insights for decision-makers.

  6. Improve data extraction: Extract relevant information from unstructured documents, making it easier to populate structured databases or forms.

By integrating LLMs through Module Suite, organizations can harness the power of AI to transform their content management processes, leading to increased efficiency, reduced manual effort, and improved overall productivity.

LLM Integration Considerations

While LLMs offer significant benefits, it's important to consider factors such as data privacy, model selection, and fine-tuning requirements when integrating them into your workflow. Module Suite provides the necessary tools and interfaces to address these considerations effectively.

Architecture and Networking

Module Suite acts as a central hub for communication between Extended ECM (xECM), various LLM API services, and local resources.

Here's an overview of how the networking and communication work:

flowchart TD
    subgraph ECM["Extended ECM (xECM)"]
        MS[Module Suite]
    end

    API1[OpenAI API]
    API2[Azure AI API]
    API3[Ollama API]

    MS <--> |Internal APIs| ECM
    MS <--> |HTTP/REST| API1
    MS <--> |HTTP/REST| API2
    MS <--> |HTTP/REST| API3

    style ECM fill:#f9f,stroke:#333,stroke-width:2px
    style MS fill:#bbf,stroke:#333,stroke-width:2px
    style API1 fill:#bfb,stroke:#333,stroke-width:2px
    style API2 fill:#bfb,stroke:#333,stroke-width:2px
    style API3 fill:#bfb,stroke:#333,stroke-width:2px

Integration with xECM

Module Suite runs directly on xECM, providing seamless access to all xECM APIs. This tight integration allows for efficient data exchange and leveraging of xECM's content management capabilities.

LLM API Communication

Module Suite implements independent communication channels to various LLM API providers, which can be:

  • Public internet services (e.g., OpenAI)
  • VPN-accessible services (e.g., Azure AI)
  • On-premises solutions (e.g., LLAMA3 using Ollama to expose an API)

OpenAI-Compatible API Privileged Support

Module Suite features a rich API specifically designed for OpenAI-compatible API service providers, most commonly used with OpenAI and Azure. This allows for flexible integration with different LLM services while maintaining a consistent interface.

Local Embedding Indexes

To enhance performance and maintain data control, Module Suite allows administrators to configure and create local embedding indexes. These indexes are typically used for implementing Retrieval-Augmented Generation (RAG) systems. Key points include:

  • Based on an adapted version of the Lucene open-source indexing engine
  • Requires sending text chunks to the LLM API service provider for embedding computation
  • Does not store entire documents outside your organization
  • Provides full control over chunking policies and methodologies (see the sketch below)
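As a small illustration of what controlling the chunking policy looks like in practice, here is a minimal sketch that mirrors the indexing examples later in this guide (it assumes the pdf and docman Content Script services and a sample document path): PDF pages are used as natural chunks, while other documents are cut into fixed-size word windows.

// Minimal chunking sketch: one chunk per PDF page, or fixed word windows otherwise.
// The document path is only a sample; adjust it to your repository.
def document = docman.getNodeByPath(docman.getEnterpriseWS(), "Documents:MS_3_7_0_Manual.pdf")
def chunks

if (document.lastVersion.mimeType == "application/pdf") {
    // pdf.getTextForPages returns a map of page number -> page text
    chunks = pdf.getTextForPages(document).values()
} else {
    def chunkSize = 100   // Words per chunk: tune to your embedding model and content
    chunks = docman.getContentAsRawText(document)
                   .split("\\s")
                   .toList()
                   .collate(chunkSize)*.join(" ")
}

log.debug("Produced {} chunks", chunks.size())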

Permission Considerations

When implementing local embedding indexes, it's crucial to ensure that permissions are properly considered. This helps maintain data security and access control in line with your organization's policies. We will explore this in detail in the following sections.

Typical Communication Sequence

Below is a diagram illustrating a typical communication sequence when using Module Suite with xECM and an LLM API service to implement a RAG workflow:

sequenceDiagram
    participant User
    participant xECM
    participant Module Suite UI Widget
    participant Module Suite (Script Engine)
    participant LocalIndex
    participant LLMAPI

    User->>xECM: Request content
    xECM->>Module Suite UI Widget: Pass request
    Module Suite UI Widget->>Module Suite (Script Engine): Pass request and history
    Module Suite (Script Engine)->>LocalIndex: Query local index
    LocalIndex-->>Module Suite (Script Engine): Return relevant chunk coordinates
    Module Suite (Script Engine)->>xECM: Retrieve relevant chunks (context)
    xECM-->>Module Suite (Script Engine): Return relevant chunks
    Module Suite (Script Engine)->>LLMAPI: Send prompt with context
    LLMAPI-->>Module Suite (Script Engine): Return LLM response
    Module Suite (Script Engine)->>Module Suite UI Widget: Process and format response
    Module Suite UI Widget->>xECM: Render result
    xECM->>User: Display result

Service Provider Support in Module Suite

Module Suite offers extensive support for various LLM API providers, with a focus on OpenAI-compatible APIs and limited support for other providers. Below is a detailed breakdown of the supported features for each provider type.

OpenAI API Providers (OpenAI and Microsoft Azure AI)

Module Suite provides comprehensive support for OpenAI-compatible APIs, including those from OpenAI itself and Microsoft Azure AI. The following features are supported:

  1. Chat Completion
  2. Text Completion
  3. Function Invocation
  4. Vision (image analysis and processing)
  5. Text-to-Speech
  6. Speech-to-Text
  7. Assistant API
  8. Embeddings
  9. Fine-tuning
  10. Moderation

This wide range of supported features allows for versatile integration of AI capabilities into your Extended ECM workflows, enabling tasks from simple text generation to complex multimodal interactions.

Ollama API Support

For Ollama-based API providers, Module Suite currently offers limited but essential support:

  1. Embeddings
  2. Chat Completion

While more restricted than the OpenAI API support, these features still allow for crucial functionalities such as semantic search and conversational AI interactions using on-premises or self-hosted models.

graph TD
    A[Module Suite API Support] --> B[OpenAI APIs]
    A --> C[Ollama APIs]

    B --> D[Chat Completion]
    B --> E[Text Completion]
    B --> F[Function Invocation]
    B --> G[Vision]
    B --> H[Text-to-Speech]
    B --> I[Speech-to-Text]
    B --> J[Assistant API]
    B --> K[Embeddings]
    B --> L[Fine-tuning]
    B --> M[Moderation]

    C --> N[Embeddings]
    C --> O[Chat Completion]

    style B fill:#f9f,stroke:#333,stroke-width:2px
    style C fill:#bbf,stroke:#333,stroke-width:2px
    style N stroke:#333,stroke-width:2px
    style O stroke:#333,stroke-width:2px

API Support Evolution

The landscape of LLM APIs is rapidly evolving. Module Suite's API support is regularly updated to include new providers and features. Always refer to the latest documentation for the most up-to-date information on supported APIs and features.

Choosing the Right API Provider

When selecting an API provider for your Module Suite implementation, consider the following factors:

  • Feature requirements: Assess which AI capabilities are crucial for your use case.
  • Data privacy and compliance: Determine if you need to keep data on-premises or if cloud-based solutions are acceptable.
  • Performance needs: Evaluate the response times and throughput required for your applications.
  • Cost considerations: Compare pricing models of different providers, especially for high-volume usage.
  • Integration complexity: Consider the ease of integration with your existing infrastructure.

Components of the LLM Service Integration

Module Suite provides a comprehensive set of components to enable seamless integration with LLM services. These components work together to offer a robust and flexible AI-enhanced experience within the Extended ECM environment. Let's explore each of these components:

Content Script

Module Suite features a set of dedicated extension package services to support LLM API integration.

OpenAI Extension Package Service

The OpenAI service is a dedicated Content Script extension package service specifically designed for OpenAI-compatible API integrations. Key features include:

  • Multi-profile support for flexible configuration
  • Comprehensive implementation of OpenAI's API features
  • Optimized for use with OpenAI and Azure AI services

LLM Extension Package Service

The LLM service is a more general-purpose Content Script extension package service for integrating various LLM providers. Its features include:

  • Multi-profile configuration for supporting different LLM services
  • Currently focused on Ollama API support
  • Extensible architecture for future LLM provider integrations

Widgets

Smart Pages Widget (named CARL)

This Smart Pages widget brings AI-powered capabilities directly into the Extended ECM user interface. It:

  • Provides an interactive AI assistant interface within Smart Pages
  • Leverages the power of LLM models for various tasks
  • Enhances user productivity by offering AI-assisted functionalities

Beautiful WebForm Widget (named CARL)

This Beautiful WebForm Widget extends AI capabilities to WebForms, allowing for:

  • AI-enhanced form interactions
  • Intelligent form filling assistance
  • Dynamic content generation based on form context

Services

Content Script Service (named carl)

This Content Script Service acts as the backend engine for LLM-related functionalities, providing:

  • Integration between widgets and LLM services
  • Business logic for processing AI requests and responses
  • Customizable workflows for AI-assisted operations

Code Snippets

Content Script Snippets

Module Suite includes several Content Script snippets that:

  • Facilitate quick implementation of common LLM-related tasks
  • Provide reusable code for developers to extend AI functionalities
  • Demonstrate best practices for integrating AI capabilities into Content Scripts

CARL (Content Server Artificial intelligence Resource and Liaison)

CARL is a feature of Module Suite, introduced with version 3.5, that implements a Content Script co-pilot based on the integration with GPT family models.

CARL Integration in Content Script Editor

When enabled in the Base Configuration, CARL integrates directly into the Content Script editor, offering:

  • AI-assisted code completion and suggestions
  • Context-aware help and documentation
  • Intelligent debugging assistance

CARL Beta Status

CARL is currently in beta. As a beta feature, it may undergo changes and improvements in future releases. Users are encouraged to provide feedback to help shape its development.

graph TD
    A[Module Suite] --> B[OpenAI Service]
    A --> C[LLM Service]
    A --> D[LLM Integration Features]
    D --> E[Smart Pages Widget CARL]
    D --> F[Beautiful WebForm Widget CARL]
    D --> G[Content Script Service carl]
    A --> H[Content Script Snippets]
    A --> I[CARL Co-pilot Feature]
    I --> J[Content Script Editor Integration]

    B --> K[OpenAI/Azure AI]
    C --> L[Ollama]

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style D fill:#bbf,stroke:#333,stroke-width:2px
    style I fill:#fbb,stroke:#333,stroke-width:2px
    style B stroke:#333,stroke-width:2px
    style C stroke:#333,stroke-width:2px

Integration Use Cases

Module Suite offers various capabilities for integrating AI-powered functionalities into your Extended ECM environment. Let's explore common use cases and how to implement them.

Chat Completion

Chat completion allows you to create interactive, context-aware conversations with an AI assistant. This functionality is valuable for implementing:

  • Intelligent chatbots for user support
  • Virtual assistants for guided ECM operations
  • Interactive help systems within your ECM applications
  • Natural language interfaces for complex queries or tasks

Example: Basic Chat Interaction

Here's a simple example of how to implement a chat completion interaction:

def systemPreamble = """You are C.A.R.L. (Content Server Artificial intelligence Resource and Liaison), an LLM designed to help users working with OpenText Extended ECM"""
def defaultTemperature = 0.7
def model = "gpt-4"
def maxTokens = 2000

try {
    // Our services are implemented using fluent APIs and builders to simplify their usage
    // Use the auto-completion feature of the editor (CTRL+Space) to explore available configuration options
    def reqBuilder = openai.newChatCompletionRequestBuilder()
        .model(model)
        .temperature(defaultTemperature)
        .maxTokens(maxTokens)
        .n(1) // Request only one completion (most common use case)

    def req = reqBuilder.build()

    // Instruct the agent about its purpose and constraints using a system message
    req.addChatMessage("system", systemPreamble)

    // Add user request or message
    req.addChatMessage("user", "Does xECM support the concept of metadata?")

    // Submit the request and synchronously wait for the response
    def result = openai.createChatCompletion(req)

    // Extract and output the content of the first (and only) choice
    out << result.choices[0].message.content
} catch (Exception e) {
    out << "An error occurred: " + e.getMessage()
}

This code configures the request with specific parameters such as the model to use (GPT-4), temperature setting for response randomness, and maximum token limit. The system message defines C.A.R.L.'s role, followed by a user question about xECM's metadata support. The openai.newChatCompletionRequestBuilder() method initializes the request, which is then configured and built. The addChatMessage() method is used to add both system and user messages to the conversation. Finally, the createChatCompletion() method sends the request to the OpenAI API and retrieves the response.

This example showcases the ease of use of the OpenAI service in Module Suite, allowing developers to quickly implement AI-powered chat functionalities within their Extended ECM environment. The service handles the complexities of API interaction, allowing developers to focus on crafting effective prompts and integrating the responses into their applications.

Use Autocompletion

Remember to use the auto-completion feature of the editor (CTRL+Space) to explore available configuration options when working with the request builder.

The same interaction can also be implemented with the generic llm service against an Ollama-hosted model:

def systemPreamble = """You are C.A.R.L. (Content Server Artificial intelligence Resource and Liaison), an LLM designed to help users working with OpenText Extended ECM"""
def defaultTemperature = 0.7
def model = "llama3"
def maxTokens = 2000
def msgID = "MyMessage"

try {
    // Use the llm service with a LangChain-based builder for Ollama
    def reqBuilder = llm.newLangChainChatCompletionRequestBuilder("ollama")
        .model(model)
        .temperature(0.0)
        .logRequestsAndResponses(true)

    // Set a custom timeout for non-OpenAI services
    reqBuilder.builder.timeout(java.time.Duration.ofSeconds(180))

    def req = reqBuilder.build()

    // Add system message to define C.A.R.L.'s role
    req.addChatMessage("system", systemPreamble)

    // Add user query
    req.addChatMessage("user", "Does xECM support the concept of metadata?")

    // Submit the request and wait for the response
    def result = llm.createChatCompletion(req)

    // Output the AI-generated response
    out << result.choices[0].message.content
} catch (Exception e) {
    out << "An error occurred: " + e.getMessage()
}

This example illustrates the flexibility of Module Suite's AI integration.

Key points to note:

  • The llm service is used instead of a specific provider service, allowing for more generic implementations.
  • The newLangChainChatCompletionRequestBuilder method is used with "ollama" as the provider.
  • LangChain is utilized for integration with non-OpenAI services, offering additional configuration options like custom timeouts.
  • The overall structure of setting up the request, adding messages, and retrieving the response remains similar to the previous example.

By using the generic llm service, you can easily switch between different AI providers or models while maintaining a consistent implementation structure. This flexibility allows you to choose the most suitable AI backend for your specific use case or requirements.

Different models different results

Experiment with different models and providers to find the best balance of performance, cost, and capabilities for your ECM AI integration needs.

Chat Completion (continued)

Example: Streaming Chat Completion

Streaming chat completion allows for real-time delivery of AI-generated responses, enhancing the responsiveness of your AI-powered applications. This is particularly useful for:

  • Creating more interactive and dynamic user experiences
  • Implementing typing-like effects in chatbots
  • Handling long-form content generation without waiting for the entire response

Here's an example of how to implement streaming chat completion:

def systemPreamble = """You are C.A.R.L. (Content Server Artificial intelligence Resource and Liaison), an LLM designed to help users working with OpenText Extended ECM"""
def defaultTemperature = 0.7
def model = "gpt-4"
def maxTokens = 2000
def msgID = "MyMessage"

try {
    def reqBuilder = openai.newChatCompletionRequestBuilder()
        .model(model)
        .temperature(defaultTemperature)
        .n(1)

    def req = reqBuilder.build()
    req.addChatMessage("system", systemPreamble)
    req.addChatMessage("user", "Does xECM support the concept of metadata?")

    def iterator = 0
    result = openai.streamChatCompletion(req, { block ->
        // This closure is invoked with streaming chunks as they become available
        def map = [
            content: block.choices?[0]?.message?.content,
            role:    block.choices?[0]?.message?.role,
            finishReason: (block.choices?[0]?.message?.content != null) ? block.choices?[0]?.finishReason : 'stop'
        ]
        log.debug("GOT {}", map) // Log for debugging
        cache.put(msgID + "_" + iterator++, 500, map)
    })

    out << result.choices
} catch (Exception e) {
    out << "An error occurred: " + e.getMessage()
}

Key points about this streaming implementation:

The streamChatCompletion method is used instead of createChatCompletion, and a closure is provided as its second argument. This closure is invoked for each chunk of the response as it becomes available, and it implements a producer-consumer pattern:

  • The closure acts as the producer, storing each chunk in the memcache service.
  • The UI acts as the consumer, retrieving chunks from the memcache to display them in real time.

Each chunk is stored in the memcache under a unique key that combines the msgID and an incrementing iterator. The finishReason is tracked to determine when the response is complete.

This approach allows for more responsive AI interactions, as the UI can start displaying the response before it's fully generated. It's particularly useful for longer responses or when you want to create a more dynamic, "typing" effect in your AI interface.

Use debouncing

When implementing the UI for streaming responses, consider using techniques like debouncing to balance between real-time updates and performance.

Additional considerations: Producer-Consumer Pattern

The streaming chat completion implementation uses a producer-consumer pattern to handle real-time data flow. A sequence diagram illustrating this process is shown further below.

Default Consumer Service

A default implementation of the consumer service, named "carl", is provided as a Content Script Service.

You can find it at: Content Script Volume:CSTools:CARL:CSServices

This service helps manage the consumption of streamed chat completion responses, making it easier to implement the consumer side of the producer-consumer pattern in your applications.

Sequence Diagram

Here's a sequence diagram illustrating the producer-consumer process, including the default "carl" service:

sequenceDiagram
    participant U as UI
    participant S as Script
    participant A as AI Service
    participant C as Memcache

    U->>S: Initiate chat (msgID)
    S->>A: streamChatCompletion request
    activate A
    loop For each chunk
        A-->>S: Stream chunk
        S->>C: Store chunk (msgID_i)
        S->>S: Log chunk
    end
    deactivate A
    S-->>U: Completion notification

    loop Until all chunks received
        U->>C: Request chunk (msgID_i)
        C-->>U: Return chunk
        U->>U: Update display
    end

This diagram illustrates the following sequence:

  1. The UI initiates the chat, providing a unique msgID.
  2. The script sends a streaming chat completion request to the AI service.
  3. As the AI service generates the response:

    • It streams chunks of the response back to the script.
    • The script stores each chunk in the memcache with a unique key (msgID_i, where i is an incrementing counter).
    • The script also logs each chunk for debugging purposes.

  4. Once all chunks are received, the script notifies the UI that the completion is finished.
  5. The UI then repeatedly requests chunks from the memcache using the msgID and incrementing counter.
  6. As the UI receives each chunk, it updates the display, creating a real-time streaming effect.

This pattern allows for efficient handling of large responses and provides a smooth, responsive user experience. The memcache serves as a buffer between the AI service's output rate and the UI's consumption rate, ensuring that no data is lost and that the UI can process the response at its own pace.
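To make the consumer side more concrete, here is a minimal polling sketch of what a consumer script could do to drain the chunks produced by the streaming closure shown earlier. It is a sketch only: it assumes the cache service exposes a get(key) accessor mirroring the put(key, ttl, value) calls used by the producer, and the default "carl" service described above already provides a maintained implementation of this logic.

// Hypothetical consumer-side sketch: drain the chunks stored by the streaming closure.
// Assumes cache.get(key) returns the map stored via cache.put(key, ttl, map), or null if absent.
def msgID = "MyMessage"            // Must match the msgID used by the producer script
def index = 0
def attempts = 0
def finished = false
def answer = new StringBuilder()

while (!finished && attempts < 600) {
    def chunk = cache.get(msgID + "_" + index)
    if (chunk == null) {
        attempts++
        sleep(100)                 // Chunk not available yet: wait briefly and poll again
        continue
    }
    if (chunk.content != null) {
        answer << chunk.content    // Append the streamed text fragment
    }
    finished = (chunk.finishReason == 'stop')   // The producer marks the final chunk with 'stop'
    index++
}

out << answer.toString()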

Adapt it as needed

The actual implementation may vary depending on your specific UI needs. This diagram represents a general approach that can be adapted to various technical environments.

Benefits of Using the Default "carl" Service

When implementing streaming chat completion in your applications, consider using the provided "carl" service to streamline your development process and ensure robust handling of streamed responses.

Using the provided "carl" service offers several advantages:

  1. Simplified Implementation: You don't need to write your own consumer logic, reducing development time and potential errors.
  2. Consistency: The service ensures a consistent approach to consuming streamed responses across your applications.
  3. Optimization: The service may include optimizations for efficient retrieval and assembly of chunked responses.
  4. Maintenance: As part of Module Suite, the service will be maintained and updated, ensuring compatibility with future versions.

To use the "carl" service in your applications, you can call it from your UI code after initiating a streaming chat completion. The service will handle the retrieval and assembly of the chunked response, allowing you to focus on displaying the results to the user.

Function Calling

Function calling allows the AI to interact directly with Content Server, performing actions or retrieving information as needed. This powerful feature enables the AI to manipulate content and execute operations within the Extended ECM environment.

Example: Creating Folders Using AI

In this example, we'll demonstrate how to use function calling to create folders in Content Server based on natural language input.

def systemPreamble = """You are C.A.R.L. (Content Server Artificial intelligence Resource and Liaison), an LLM designed to help users working with OpenText Extended ECM"""
def defaultTemperature = 0.7
def model = "gpt-4"
def maxTokens = 2000
def msgID = "MyMessage"

try {
    def reqBuilder = openai.newChatCompletionRequestBuilder()
        .model(model)
        .temperature(defaultTemperature)
        .n(1)

    def req = reqBuilder.build()
    req.addChatMessage("system", systemPreamble)
    req.addChatMessage("user", "Create a folder for each month of the year in ${self.parent.ID}")

    // Define the function for creating a folder
    def func = openai.newChatFunctionBuilder()
        .name("createFolder")
        .description("Create a folder in the given space (identified by its ID)")
        .build()

    func.addStringParameter("folderName", "The name of the folder", true)
    func.addNumberParameter("parentID", "Parent Space Identifier", true)

    // Implement the function executor
    func.executor = { jsonArguments ->
        try {    
            def slurper = new JsonSlurper()
            def args = slurper.parseText(jsonArguments)
            def newNode = docman.createFolder(docman.getNode(args.parentID as Long), args.folderName)
            return "Created <a data-ampw='am-action' data-action='am_goTo' data-params='${newNode.ID}' data-toggle='click' href='#'>${newNode.name}</a>"
        } catch(e) {
            log.error("Unable to handle the request", e)
            return "Something went wrong"
        }
    }

    req.setFunctions([func])

    result = openai.createChatCompletion(req)
    out << result.choices[0].message
} catch (Exception e) {
    log.error("An error occurred ", e)
    out << "An error occurred: " + e.getMessage()
}

The same function can also be used with a streaming chat completion. The following variant streams the response while the function is being invoked:

def systemPreamble = """You are C.A.R.L. (Content Server Artificial intelligence Resource and Liaison), an LLM designed to help users working with OpenText Extended ECM"""
def defaultTemperature = 0.7
def model = "gpt-4"
def maxTokens = 2000
def msgID = "MyMessage"

try {
    def reqBuilder = openai.newChatCompletionRequestBuilder()
        .model(model)
        .temperature(defaultTemperature)
        .n(1)

    def req = reqBuilder.build()
    req.addChatMessage("system", systemPreamble)
    req.addChatMessage("user", "Create a folder for each month of the year in ${self.parent.ID}")

    // Define the function for creating a folder
    def func = openai.newChatFunctionBuilder()
        .name("createFolder")
        .description("Create a folder in the given space (identified by its ID)")
        .build()

    func.addStringParameter("folderName", "The name of the folder", true)
    func.addNumberParameter("parentID", "Parent Space Identifier", true)

    // Implement the function executor
    func.executor = { jsonArguments ->
        try {    
            def slurper = new JsonSlurper()
            def args = slurper.parseText(jsonArguments)
            def newNode = docman.createFolder(docman.getNode(args.parentID as Long), args.folderName)
            return "Created <a data-ampw='am-action' data-action='am_goTo' data-params='${newNode.ID}' data-toggle='click' href='#'>${newNode.name}</a>"
        } catch(e) {
            log.error("Unable to handle the request", e)
            return "Something went wrong"
        }
    }

    req.setFunctions([func])

    def iterator = 0
    result = openai.streamChatCompletion(req, { block->   // This is an asynchronous method, meaning that the script execution
                                                           // won't wait for the completion to terminate. The method accepts
                                                           // as its second parameter a closure that will be invoked with the streaming
                                                           // chunks as they become available

        def map = [
            content:block.choices?[0]?.message?.content,
            role:   block.choices?[0]?.message?.role,
            finishReason:(block.choices?[0]?.message?.content != null)?block.choices?[0]?.finishReason:'stop'
        ]
        log.debug("GOT {}", map) // Let's also print it into the log file for debugging
        cache.put(msgID+"_"+iterator++, 500, map) 

    })
    out << result*.content
} catch (Exception e) {
    log.error("An error occurred ", e)
    out << "An error occurred: " + e.getMessage()
}


This example demonstrates several key concepts:

  • Function Definition: We define a createFolder function using the newChatFunctionBuilder(). This function takes two parameters: folderName and parentID.

  • Function Implementation: The executor closure contains the actual implementation of the function. It uses the Content Server API (docman) to create a new folder.

  • AI Integration: The function is added to the chat completion request, allowing the AI to call it when necessary.

  • Natural Language Processing: The user's request to "Create a folder for each month of the year" is interpreted by the AI, which then calls the createFolder function multiple times to fulfill the request.

  • Error Handling: The implementation includes error handling to manage potential issues during folder creation.

  • Interactive Response: The function returns an HTML link that allows users to navigate directly to the newly created folder.

Benefits of Function Calling

Function calling in Module Suite offers several advantages:

  1. Direct Interaction: The AI can perform actions directly in Content Server, bridging the gap between natural language requests and system operations.

  2. Flexibility: You can define custom functions to extend the AI's capabilities, tailoring it to your specific ECM needs.

  3. Safety: By defining specific functions, you control what actions the AI can perform, ensuring security and preventing unintended operations.

  4. Complex Operations: You can implement complex workflows by combining multiple function calls based on user requests.

Error handling

When implementing functions, ensure proper error handling and logging to maintain system stability and aid in troubleshooting.

Functions everywhere

Consider implementing additional functions for common ECM tasks, such as searching for documents, updating metadata, or initiating workflows. This can greatly enhance the AI's utility within your Extended ECM environment.
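As an illustration, a hypothetical second function could let the model inspect a folder before deciding what to do. The sketch below reuses only the builder pattern from the example above and the childrenFast property used elsewhere in this guide; the function name listFolderContents and its wiring are our own example, not a built-in.

// Hypothetical companion function: let the model list the contents of a folder.
def listFunc = openai.newChatFunctionBuilder()
    .name("listFolderContents")
    .description("Return the names of the items contained in the given space (identified by its ID)")
    .build()

listFunc.addNumberParameter("parentID", "Parent Space Identifier", true)

listFunc.executor = { jsonArguments ->
    try {
        def args = new JsonSlurper().parseText(jsonArguments)
        def parent = docman.getNode(args.parentID as Long)
        // Only the child names are handed back to the model
        return parent.childrenFast*.name.join(", ")
    } catch (e) {
        log.error("Unable to list folder contents", e)
        return "Something went wrong"
    }
}

// Register both functions so the model can decide which one to call
req.setFunctions([func, listFunc])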

Document Assembly

Document assembly is a powerful use case that combines AI-generated content with document creation and manipulation within the Extended ECM system. This approach allows for the automatic generation of documents based on user requests, leveraging AI to create content and Module Suite's capabilities to assemble and store the document.

Example: Create a presentation letter in Word

def systemPreamble = "You are D.O.C.S. (Document Organizing and Creation System) an AI agent tasked to support users in creating documents.The content of the document MUST always be passed to the 'createDocument' function as valid X-HTML code (example: <b>DEMO</b>). Always wrap the content in a single 'div' element. Avoid the usage of HTML tag such as (BR). The font family for every text must be: font-family: Poppins, sans-serif tables must have a borders: 1px solid #fff;"

data = [:]
data.company = [name:"CreativeAnswer SA", address:"Via Penate 4, 6850 Mendrisio Switzerland", description:"""
At CreativeAnswer, we are a united marketing agency & software house blending creative brilliance with tech mastery.

We have developed a unique human-AI approach to help you create standout, cost-effective AI-driven marketing solutions with a competitive edge.

This isn't your typical AI tool. We don't settle for generic AI apps, nor do we just churn out text or images. Our creative and tech teams work in perfect sync, shaping ideas with proprietary AI, adapting and customizing them to your needs, and delivering the personal experience your customers want… effortlessly.

We bring you the most valuable approach to GenAI marketing, tailored exclusively for you. Just give it a try.

CreativeAnswer is a proud part of:
- AnswerModules Group, the award-winning Swiss tech company delivering personalized ECM software to 120+ enterprises, including 10 Global Fortune's 500
- Microsoft for Startups Founders Hub
"""]

data.user    = [name:"Patrick Vitali", role:"CTO"]

def defaultTemperature = 0.7
def model = "gpt-4o"
def maxTokens = 2000
def msgID = "MyMessage"
try{

    def reqBuilder  = openai.newChatCompletionRequestBuilder()
    .model(model)
    .temperature(defaultTemperature)
    .n(1) 

    def req = reqBuilder.build()

    func =  llm.newOpenAIChatFunctionBuilder().name("createDocument")
            .description("Given a title, creates a new document, matching user's requests with content generated by the AI agent").build()

    func.parameters = []
    func.addStringParameter("title",   "The document title", true)
    func.addStringParameter("content", "The document's content. The content must be a valid  X-HTML code (example: <b>DEMO</b>)", true)

    func.executor = { jsonArguments ->
        try{
            def slurper = new JsonSlurper()
            def args = slurper.parseText(jsonArguments)

            if(!args.title.endsWith('.docx')){
                args.title = "${args.title}.docx"
            }

            docTemplate = docman.getNodeByPath("CreativeAnswer:Marketing:Corporate Identity:Template.docx")
            newNode = docman.getNodeByName(docTemplate.parent, args.title)

            //Load contents from a Docx file for processing
            def doc = docx.loadWordDoc(docTemplate)

            //Creates a temporary resource
            def res = docman.getTempResource("out", "docx") 

            //Updates the custom-xml databinding based on the OpenDoPE standard. Since: 2.3.0
            //Notice the combined used of multiple Content Script services: docx, docman, html, cache
            def xml = ""
            if(newNode){
                //Update xml with custom values
                xml = """<doc>
                            <content>&lt;div&gt;${html.escapeXML(html.htmlToXhtml(args.content))}&lt;/div&gt;</content>
                            <docID>${newNode.ID}</docID>
                        </doc>"""
                doc.updateOpenDoPEBindings(xml, true, true)
                doc.save(newNode)
            }else{
                newNode = doc.save(docTemplate.parent, args.title)
                xml = """<doc>
                            <content>&lt;div&gt;${html.escapeXML(html.htmlToXhtml(args.content))}&lt;/div&gt;</content>
                            <docID>${newNode.ID}</docID>
                        </doc>"""
                doc.updateOpenDoPEBindings(xml, true, true)
                doc.save(newNode)
            }
            urlEditTemplate = "${url}?func=Edit.Edit&nodeid=${newNode.ID}&uiType=1&viewType=1&nexturl=${params.nextUrl}"

            return "The requested document has been created with title: ${args.title}. <br>You can finalize the draft using the link that follows: <a target='_blank' href='${urlEditTemplate}'>Edit</a>"

        }catch(e){
            log.error("Error ",e)
            return "Something went wrong "+e.message
        }
    }

    req.setFunctions([func])

    req.addChatMessage("system", //one out: system, assistant, user
                    systemPreamble) //We instruct the agent about its purpose and constraints using system messages at the beginning
    //of the chat
    req.addChatMessage("system", "When creating documents you should consider the following context, expressed in the form of a JSON structure: ${JsonOutput.toJson(data)}")

    //The user request   
    req.addChatMessage("user", "Create a presentation letter to be sent to John Doe, ACME's CMO") //Add user request or message

    def iterator = 0
    result = openai.streamChatCompletion(req, { block->
        def map = [
            content:block.choices?[0]?.message?.content,
            role:   block.choices?[0]?.message?.role,
            finishReason:(block.choices?[0]?.message?.content != null)?block.choices?[0]?.finishReason:'stop'
        ]
        log.debug("GOT {}", map) // Let's also print it into the log file for debugging
        cache.put(msgID+"_"+iterator++, 500, map) 

    })
    out << result*.content

} catch (Exception e) {
    out << "An error occurred: " + e.getMessage()
}


This example demonstrates how to create a presentation letter using AI-generated content and Module Suite's document manipulation features.

Key components of this implementation include:

  1. AI Content Generation: The AI is instructed to create document content based on user requests and provided context.

  2. Document Creation Function: A specialized function is made available to the AI for creating documents within the ECM system.

  3. Content Formatting: The AI is guided to provide content in a specific format (X-HTML) to ensure compatibility with the document creation process.

  4. Context Provision: Relevant data, such as company and user information, is provided to the AI to inform the content generation process.

  5. Document Template Usage: The implementation utilizes a predefined document template as a base for new documents.

  6. Dynamic Document Update: The system checks for existing documents with the given title, updating if found or creating new ones if not.

  7. OpenDoPE Standard: The implementation leverages the OpenDoPE standard for XML data binding, allowing for dynamic content insertion into documents.

  8. Streaming Response: The AI's response is streamed, enabling real-time updates and potentially faster response times for large documents.

Benefits of AI-Assisted Document Assembly

  1. Efficiency: Automates the process of creating standard documents, saving time and reducing manual effort.

  2. Consistency: Ensures that created documents follow a consistent structure and style.

  3. Contextual Relevance: By providing context to the AI, the generated content can be highly relevant and personalized.

  4. Flexibility: Can be adapted for various document types and use cases within the ECM system.

  5. Integration: Seamlessly combines AI capabilities with existing ECM features and document templates.

Additional considerations: Implementation details

When implementing AI-assisted document assembly:

  • Ensure that the AI is properly constrained to generate content in the required format and style.
  • Regularly validate the generated content to maintain quality and consistency.
  • Consider implementing error handling and logging to manage potential issues during document creation.
  • Design the user interface to provide clear feedback on the document creation process and results.

Note

The specific implementation details may vary depending on your ECM environment and chosen AI service. Consult your Module Suite documentation for precise integration steps.

Tip

Consider expanding this approach to other document types, such as reports, contracts, or proposals. You can create specialized templates and AI instructions for each document type to further streamline your document creation processes.

Embedding Index Generation

Embedding index generation is a crucial step in creating AI-powered search and retrieval systems within your ECM environment. This use case demonstrates how Module Suite can generate embedding indexes for documents, enabling advanced semantic search capabilities.

Example: Indexing a single document

The provided implementation showcases the generation of an embedding index for documents in the ECM system. A key feature of this approach is its flexibility in handling different document types and chunking methods.

    try {
        // Retrieve the document node from the Enterprise Workspace
        def document = docman.getNodeByPath(docman.getEnterpriseWS(), "Documents:MS_3_7_0_Manual.pdf")
        def docID = document.ID as String
        def lastVersion = document.lastVersion
        def list = null
        def sources = [:]

        // Prepare document metadata for indexing
        // We store metadata instead of raw text for efficiency:
        // 1. Document ID
        // 2. Document type (pdf or raw)
        // 3. Chunk dimension (e.g., page number or word count)
        // 4. Chunk identifier
        if (lastVersion.mimeType == "application/pdf") {
            // For PDF documents: Extract text page by page
            def content = pdf.getTextForPages(document)
            sources = content.collectEntries { entry -> 
                [("${docID}_pdf_1_${entry.key}"): entry.value] 
            }
        } else {
            // For non-PDF documents: Split content into word-based chunks
            def content = docman.getContentAsRawText(document)
            def chunkSize = 100 // Number of words per chunk (adjustable)
            def index = 0
            sources = content.split("\\s").toList().collate(chunkSize) *.join(" ").collectEntries {
                val -> [("${docID}_raw_${chunkSize}_${index++}"): val]
            }
        }

        // Process sources in batches of 10
        list = sources.entrySet().toList().findAll { it.value }.collate(10)
        def chunk = 0

        list.each { subEnties ->
            log.debug("Processing Chunk ${chunk} {}", subEnties *.value)
            try {
                // Generate embeddings for the current batch
                def req = llm.newOpenAIEmbeddingRequestBuilder()
                            .model("text-embedding-ada-002")
                            .input(subEnties *.value)
                            .build()
                def embeddings = llm.createEmbeddings(req).data

                // Build vectorial index using embeddings and metadata
                llm.newCSVectorialIndexManager().buildIndex(docID, embeddings, subEnties *.key)
            } catch (e) {
                // Log errors for individual chunks without stopping the entire process
                log.error("Unable to process chunk ${chunk}", e)
            }
            log.debug("End processing Chunk ${chunk++}")
        }
    } catch (e) {
        // Log any overall process errors
        log.error("An error occurred during index generation.", e)
    }

The same indexing flow can also target a locally hosted embedding model exposed through Ollama, using the generic llm service:

try {
    // Retrieve the document node by its ID
    def document = docman.getNode(2031317)
    def docID = document.ID as String
    def lastVersion = document.lastVersion
    def list = null
    def sources = [:]

    if (lastVersion.mimeType == "application/pdf") {
        // If the document is a PDF, extract text for each page and create embeddings
        def content = pdf.getTextForPages(document)
        sources = content.collectEntries { entry -> [("${docID}_pdf_1_${entry.key}"): entry.value] }
    } else {
        // If not a PDF, split the content into chunks and generate embeddings
        def content = docman.getContentAsRawText(document)
        def chunkSize = 100 // The number of words each chunk should be made of
        def index = 0
        sources = content.split("\\s").toList().collate(chunkSize) *.join(" ").collectEntries {
            val -> [("${docID}_raw_${chunkSize}_${index++}"): val]
        }

    }

    list = sources.entrySet().toList().findAll { it.value }.collate(10)
    def chunk = 0
    list.each { subEnties ->
        log.debug("Processing Chunk ${chunk} {}", subEnties *.value)
        try {
            // Create an embedding request for the chunk

            model = llm.newCSEmbeddingModelBuilder("ollama")
            .modelName("llama3")
            .logRequests(true)
            .logResponses(true)
            .build();

            def embeddings = []
            subEnties.each{
                embeddings << llm.newCSEmbedding( model.embed(it.value) )
            }

            // Build a vectorial index for the document using the embeddings
            llm.newCSVectorialIndexManager().buildIndex(docID, embeddings, subEnties*.key)
        } catch (e) {
            log.error("Unable to process chunk ${chunk}", e)
        }
        log.debug("End processing Chunk ${chunk++}")
    }
} catch (e) {
    log.error("An error occurred. ", e)
}

Key Features

  1. Document Type Flexibility: The system can handle various document types, with specific handling for PDF files and a general approach for other text-based documents.

  2. Adaptive Chunking: The chunking strategy adapts based on the document type:

    • For PDFs: text is extracted page by page.
    • For other documents: content is split into chunks based on a specified word count.

  3. Metadata-Rich Indexing: Instead of storing raw text, the index stores metadata that can be used to retrieve the original text:

    • Document ID
    • Document type (PDF or raw)
    • Chunk dimensions (e.g., page number for PDFs, word count for other documents)
    • Chunk identifier

  4. Configurable Chunk Size: For non-PDF documents, the chunk size (number of words per chunk) can be easily adjusted.

  5. Batch Processing: Embeddings are generated and indexed in batches to manage resource usage efficiently.

Additional Considerations: Implementation Highlights

This implementation demonstrates several important concepts:

  1. Document Retrieval: The system retrieves documents from the ECM using the document management API.
  2. Type-Specific Processing: Different processing logic is applied based on the document's MIME type.
  3. Flexible Text Extraction:

    • For PDFs, text is extracted page by page using a PDF processing service.
    • For other documents, raw text is extracted and split into word-based chunks.
  4. Metadata Generation: Each chunk is associated with a unique identifier that includes:

    • Document ID
    • Document type (PDF or raw)
    • Chunk size or page number
    • Chunk index
  5. Embedding Generation: The system uses an LLM service to generate embeddings for each chunk of text.

  6. Index Building: A vectorial index is built using the generated embeddings and associated metadata.

Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is a powerful technique that combines the strengths of large language models with the ability to access and utilize specific, up-to-date information from your Extended ECM system. This approach significantly enhances the accuracy and relevance of AI-generated responses by grounding them in your organization's actual content.

Example: Using all the documents in a folder as a Knowledge Base

In this use case, we demonstrate how to implement a RAG system within the Module Suite. The system performs the following key steps:

  1. Embeds the user's question using a specified embedding model.
  2. Searches a pre-built set of vector indexes of your ECM content for relevant information.
  3. Retrieves the actual text content of the most relevant chunks.
  4. Incorporates this retrieved context into the prompt for the large language model.
  5. Generates a response using the LLM, now informed by the relevant context.

An index for each document

This example assumes that an embedding index has already been created for each document, as shown in the previous section.

// Configuration parameters for the RAG system
def numberOfSnippetToConsider = 1      // Number of relevant snippets to use for context
def distanceSnippetThreshold = 0.8     // Cosine similarity threshold for relevant snippets
def embeddingModel = "text-embedding-ada-002"  // Model used for generating embeddings
def defaultTemperature = 0.8           // Temperature for LLM response generation
def model = "gpt-4"                    // LLM model to use for generating responses
def maxTokens = 2000                   // Maximum number of tokens in the LLM response
def msgID = "msgID"                    // Unique identifier for the message (used in streaming scenarios)

// Function to retrieve relevant context based on the user's question
def getContext = { String prompt ->
    // Get all document IDs in the KB folder (assuming documents are of subtype 144)
    def indexes = docman.getNodeByPath("KB").childrenFast.findAll{it.subtype == 144}.collect{it.ID as String}

    // Generate embedding for the user's prompt
    def req = openai.newEmbeddingRequestBuilder().model(embeddingModel).input([prompt]).build()
    def promptEmbedding = openai.createEmbeddings(req).data

    // Search the vector indexes for similar content
    def context = llm.newCSVectorialIndexManager().getIndexesSearcher(*indexes)
                    .query(promptEmbedding[0], 100, 100)
                    .findAll{it.score >= distanceSnippetThreshold}

    def textContext = ""
    if(context) {
        // Parse the index key to extract document information
        // Format: DOCID_identifier_chunkDimension_chunkNumber
        def (docID, extMode, dimension, chunkNum) = context[0].text.split("_")

        // Retrieve the actual text content based on the extraction mode
        switch(extMode) {
            case "raw":
                // For raw text, split into words and retrieve the specific chunk
                textContext = docman.getContentAsRawText(docman.getNodeFast(docID as Long))
                                .split("\\s")
                                .toList()
                                .collate(dimension as int)
                                *.join(" ")[chunkNum as int]
                break
            case "pdf":
                // For PDFs, retrieve the specific page(s)
                textContext = pdf.getTextForPages(docman.getNodeFast(docID as Long), 
                                                chunkNum as int, 
                                                (chunkNum as int) + (dimension as int))[(chunkNum as int)]
                break
        }
    }
    return textContext
}

// Main execution
def question = "Which is the biggest update on Module Suite 3.2.0 ?"

// Set up the chat completion request
def builder = openai.newChatCompletionRequestBuilder()
def req = builder.model(model)
                .user("u" + users.current.ID)
                .temperature(defaultTemperature)
                .n(1)
                .build()

// Add system message to define CARL's role
req.addChatMessage("system", "You're C.A.R.L. (Content Server Artificial intelligence Resource and Liaison) an LLM designed to help users working with OpenText Extended ECM")

// Retrieve relevant context for the question
def context = getContext(question)

// Add user message with or without context
if(context) {
    req.addChatMessage("user", """Given the following context: ${context}
Answer the user question: ${question}""")
} else {
    req.addChatMessage("user", question)
}

// Generate and output the response
result = openai.createChatCompletion(req)
out << result.choices[0].message.content

Key Components

  1. Embedding Generation: The system uses the OpenAI API to generate embeddings for the user's question.

  2. Vector Index Search: A custom CSVectorialIndexManager is used to search pre-built indexes of your ECM content.

  3. Context Retrieval: Based on the search results, the system retrieves the actual text content from your ECM documents, handling different document types (e.g., raw text, PDF) appropriately.

  4. LLM Integration: The retrieved context is incorporated into the prompt sent to the LLM (in this case, GPT-4), allowing it to generate more accurate and relevant responses.

Benefits

  1. Improved Accuracy: By grounding the LLM's responses in your actual ECM content, the system provides more accurate and relevant answers.

  2. Up-to-date Information: The system can access the latest information in your ECM, ensuring responses reflect the most current data.

  3. Customization: The RAG approach allows the AI to leverage your organization's specific knowledge and terminology.

  4. Reduced Hallucination: By providing relevant context, the likelihood of the LLM generating incorrect or fabricated information is significantly reduced.

Additional Considerations: Implementation Highlights

  • Index Management: Ensure that your vector indexes are kept up-to-date as your ECM content changes.

  • Performance Optimization: Consider caching frequently accessed content or embeddings to improve response times.

  • Privacy and Security: Be mindful of what content is being sent to external LLM services and ensure compliance with your organization's data policies.

This implementation demonstrates a basic RAG system that can be further customized and optimized based on your specific needs and use cases within the Extended ECM environment.
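For example, the question embedding itself is a good caching candidate: identical or repeated questions can reuse a previously computed vector instead of calling the embedding API again. Below is a minimal sketch along these lines; it reuses the embeddingModel variable from the RAG example and assumes the cache service exposes a get(key) accessor alongside the put(key, ttl, value) method used elsewhere in this guide, and that the embedding object can be stored by the cache service.

// Hypothetical caching wrapper around embedding generation for repeated questions.
def getPromptEmbedding = { String prompt ->
    def key = "emb_" + prompt.hashCode()    // Simple cache key derived from the prompt
    def cached = cache.get(key)             // Assumed accessor; verify against your cache service API
    if (cached != null) {
        return cached
    }
    def req = openai.newEmbeddingRequestBuilder()
                    .model(embeddingModel)
                    .input([prompt])
                    .build()
    def embedding = openai.createEmbeddings(req).data[0]
    cache.put(key, 3600, embedding)         // Keep the vector for an hour
    return embedding
}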

Close the loop

Consider implementing a feedback mechanism to continuously improve the relevance of retrieved contexts and the quality of generated responses.

Do not index just once

The effectiveness of the RAG system heavily depends on the quality and coverage of your vector indexes. Regular maintenance and updates of these indexes are crucial for optimal performance.
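Below is a minimal maintenance sketch along these lines, assuming the same KB folder layout used in the RAG example (documents of subtype 144 under a KB folder), PDF content only for brevity, and the OpenAI embedding and CS vectorial index APIs shown earlier. How often to run it, and whether buildIndex needs to be re-run for unchanged documents, depends on your implementation.

// Hypothetical periodic reindex pass over the KB folder used in the RAG example.
def kbDocs = docman.getNodeByPath("KB").childrenFast.findAll { it.subtype == 144 }

kbDocs.each { doc ->
    try {
        def docID = doc.ID as String
        // Page-based chunking for PDFs, as in the Embedding Index Generation example
        def sources = pdf.getTextForPages(doc).collectEntries { entry ->
            [("${docID}_pdf_1_${entry.key}"): entry.value]
        }
        sources.entrySet().toList().findAll { it.value }.collate(10).each { batch ->
            def req = llm.newOpenAIEmbeddingRequestBuilder()
                         .model("text-embedding-ada-002")
                         .input(batch*.value)
                         .build()
            def embeddings = llm.createEmbeddings(req).data
            // Assumes buildIndex refreshes the per-document index when re-run
            llm.newCSVectorialIndexManager().buildIndex(docID, embeddings, batch*.key)
        }
    } catch (e) {
        log.error("Unable to reindex document ${doc.ID}", e)
    }
}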

Configuration

The following sections describe the Base Configuration parameters, their default values, and their purpose.

CARL Service Configuration Overview

The following table outlines the configuration parameters for the CARL service:

| Parameter | Default Value | Description |
| --- | --- | --- |
| amcs.carl.enabled | false | Enable or disable CARL feature |
| amcs.carl.default_mode | chat | Default mode for CARL (deprecated, chat is the only available mode) |
| amcs.carl.kb_cs_override | empty | Override for CARL co-pilot's knowledge base (KB) |
| amcs.carl.default_temperature | 0.5D | CARL's request default temperature |
| amcs.carl.kb_cs_context | 2 | Number of CARL's KB entries to consider as context for each request |
| amcs.carl.kb_cs_distance | 0.75D | Threshold for cosine distance among CARL's KB entries |
| amcs.carl.default_lang | EN | CARL's default language |
| amcs.carl.default_emb_model | text-embedding-ada-002 | Default embedding model (used for updating the KB) |
| amcs.carl.kb_context_maxlen | 20000 | Maximum length of KB entries to be included in the request (in characters) |
| amcs.carl.default_maxtokens | 8000 | Default number of tokens to be used in completion requests |
| amcs.carl.auth_id | empty | Reserved for future use |
| amcs.carl.auth_secret | empty | The OpenAI API key for your organization |
| amcs.carl.auth_uri | empty | Reserved for future use |
| amcs.carl.api_uri | https://api.openai.com/v1/ | OpenAI API endpoint |
| amcs.carl.api_version | empty | Reserved for future use |

Deprecated Parameter

The amcs.carl.default_mode parameter is considered deprecated. It exists for historical reasons when LLMs first became popular and supported text completion. Currently, almost everything is done in the form of chat completion.

Knowledge Base Override

The amcs.carl.kb_cs_override parameter is optional and should only be used if you want to override the knowledge base of the CARL co-pilot. The KB of CARL (the Content Script co-pilot) has been generated based on the content of the Content Script Volume, specifically the Content Script Snippets.

The carl service features an API, carl.dumpCarlKB(), that allows you to regenerate this KB so that you can save it as a document on Content Server. Under Content Script Volume:CSTools:CARL:Utilities there is a script named "ExportCARLKB" that implements this concept. Use it when you have custom snippets that you want included in CARL's KB.

Cosine Distance Threshold

The amcs.carl.kb_cs_distance parameter sets the threshold for the cosine distance among CARL's KB entries. This value determines whether an entry is considered relevant and should be included in building a request context. A lower value will result in more strict matching, while a higher value will be more lenient.

Token Limit

The amcs.carl.default_maxtokens parameter sets the default number of tokens to be used in completion requests. Be aware that different AI models may have different maximum token limits. Ensure this value does not exceed the limit of your chosen model.

API Key Security

The amcs.carl.auth_secret parameter is used to store your OpenAI API key. Ensure that this value is kept secure and not exposed in any logs or public configurations. It's recommended to use secure methods for storing and managing API keys in your production environment.

API Endpoint

The amcs.carl.api_uri parameter is set to the default OpenAI API endpoint. If you're using a different provider or a custom endpoint, you'll need to update this value accordingly.

LLM Service Configuration Overview

The following table outlines the configuration parameters for the LLM service:

| Parameter | Default Value | Description |
| --- | --- | --- |
| amcs.llm.activeProfiles | default | Comma-separated list of active profiles |
| amcs.llm.provider.default | openai | The LLM API Service provider |
| amcs.llm.auth_id.default | empty | Used if the provider uses OAuth authentication |
| amcs.llm.auth_secret.default | empty | The provider's integration key (secret) |
| amcs.llm.auth_uri.default | empty | The URL for generating an authorization URL (for OAuth) |
| amcs.llm.api_uri.default | https://api.openai.com/v1/ | The API endpoint |
| amcs.llm.model_id.default | empty | The default model to be used (deprecated) |
| amcs.llm.temperature.default | 0.0 | The default request's temperature (deprecated) |
| amcs.llm.net_timeout.default | 3000 | The default network timeout (in milliseconds) |
| amcs.llm.net_log_enabled.default | false | Enable logging of network requests and responses |
| amcs.llm.index_store | empty | The storage path for embedding indexes |

Active Profiles

The amcs.llm.activeProfiles parameter supports multiple profile configurations. You can register a new profile by adding the proper "Custom variables" to the Base Configuration. This property allows you to enable existing profiles. Profiles that are not enabled are not considered.

Supported Providers

As of the time of writing, the supported LLM API Service providers (amcs.llm.provider.default) are:

  • openai
  • azure
  • ollama

Deprecated Parameters

The following parameters are marked as deprecated:

  • amcs.llm.model_id.default: the default model to be used
  • amcs.llm.temperature.default: the default request's temperature

Consider using alternative methods to specify these values in your requests.
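In practice, this means setting the model and temperature explicitly on each request through the builder, as the earlier examples already do:

// Set model and temperature per request instead of relying on the deprecated profile-wide defaults.
def req = openai.newChatCompletionRequestBuilder()
    .model("gpt-4")          // Explicit model for this request
    .temperature(0.2)        // Explicit temperature for this request
    .n(1)
    .build()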

API Key Security

The amcs.llm.auth_secret.default parameter is used to store your LLM provider's API key. Ensure that this value is kept secure and not exposed in any logs or public configurations. It's recommended to use secure methods for storing and managing API keys in your production environment.

Network Logging

The amcs.llm.net_log_enabled.default parameter enables logging of network requests and responses. This should be set to false in production environments to prevent potential exposure of sensitive information.

Embedding Index Storage

The amcs.llm.index_store parameter specifies the storage path for embedding indexes. In most cases, this should be a shared storage location accessible to all cluster nodes (e.g., EFS) to ensure consistency across the system (e.g. \\otadminBe01\EFS\llm_indexes\).

OAuth Authentication

If your LLM provider uses OAuth authentication, you'll need to set the amcs.llm.auth_id.default and amcs.llm.auth_uri.default parameters accordingly. These are used for generating and managing OAuth tokens.

Defining New LLM Service Profiles

To create a new profile for the LLM service, you need to add custom variables to the Base Configuration. The process involves replicating the existing configuration parameters but replacing the "default" suffix with the name of your new profile. This allows you to maintain multiple configurations for different LLM providers or use cases within the same system.

For example, to create an "azure" profile for Azure AI, you would add the following custom variables:

| Parameter | Value |
| --- | --- |
| amcs.llm.api_uri.azure | https://amai-east-us.openai.azure.com/ |
| amcs.llm.auth_secret.azure | 7a234aaaa666666d5a245245422cvbs23a |
| amcs.llm.provider.azure | azure |

Similarly, to create an "ollama_prod" profile for Ollama, you would add:

| Parameter | Value |
| --- | --- |
| amcs.llm.provider.ollama_prod | ollama |
| amcs.llm.api_uri.ollama_prod | http://15.201.113.243:11434 |

By defining these custom variables, you create distinct profiles that can be activated using the amcs.llm.activeProfiles parameter. This flexibility allows you to easily switch between different LLM providers or configurations without modifying the core settings. Remember to include all necessary parameters for each profile to ensure proper functionality.
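Once a profile is active, it can be referenced from Content Script. The following is a minimal sketch, assuming the profile name is what you pass to the llm builder factory methods (as with "ollama" in the earlier examples); the "ollama_prod" name matches the sample profile above.

// Hypothetical usage of the "ollama_prod" profile defined above.
// Assumes the llm builder factories accept the profile name, as with "ollama" earlier.
def reqBuilder = llm.newLangChainChatCompletionRequestBuilder("ollama_prod")
    .model("llama3")
    .temperature(0.0)

def req = reqBuilder.build()
req.addChatMessage("user", "Summarize the purpose of this workspace")

out << llm.createChatCompletion(req).choices[0].message.content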

Profile Naming

Choose clear and descriptive names for your profiles to easily identify their purpose or associated provider. For instance, use names like "azure_prod", "openai_dev", or "ollama_test" to indicate both the provider and the environment.

Secure Configuration

When defining profiles, especially those containing sensitive information like API keys, ensure that your configuration files and storage are properly secured and follow your organization's security best practices.