Your enterprise knowledge, always within reach
Multi-LLM strategies, virtual assistants, and orchestrated workflows - built to boost productivity and unlock instant access to what matters.

Efficient knowledge retrieval with AI Assistants and LLM-powered chatbots.
Everything you need to deploy AI assistants that scale - from LLM strategy to multi-agent orchestration.
Helping your enterprise adopt a multi-LLM strategy to optimize AI performance across diverse tasks by leveraging the unique strengths of different models.
Creating LLM-powered AI Assistants that seamlessly integrate with your enterprise data to deliver precise, context-aware, and secure responses.
Setting up multi-agent workflows where agents collaborate to resolve issues, automate business processes, and reach shared business goals.
See how AI & Virtual Assistants work
User queries are enriched with real-time search across documents, databases, and the internet, combined with LLM intelligence to deliver fast, accurate, and context-aware responses.
AI Assistants are LLM-powered chatbots designed to assist users by answering questions and providing insights. Designed for enterprise environments, AI Assistants seamlessly integrate with internal data sources—including databases, CRM systems, enterprise knowledge bases, and proprietary documentation—to provide precise, context-aware, and secure responses. By harnessing these internal resources, they enhance decision-making, optimize operations, and deliver tailored business intelligence.
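Conceptually, the enrichment step can be sketched as follows: retrieve the most relevant documents, then combine them with the user's question into a single LLM prompt. The document store, scoring function, and prompt template below are illustrative assumptions, not a specific product implementation.

```python
# Minimal sketch of query enrichment: naive keyword-overlap retrieval
# stands in for the vector search a production assistant would use.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query (toy scoring)."""
    q_terms = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Combine retrieved context with the user's question for the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are processed within 14 days of a return request.",
    "Our office is open Monday to Friday, 9am to 5pm.",
]
prompt = build_prompt("How long do refunds take?", docs)
```

In practice the retrieval step would query databases, document indexes, and web search, but the shape of the final prompt stays the same: context first, question last.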
How Global AI Orchestrator works
Global AI Orchestrator intelligently routes requests to specialized AI assistants across departments, combining full context awareness with expert knowledge to deliver accurate and efficient answers.
The Orchestrator streamlines access to different AI Assistants (LLM-based chatbots) by providing a single access point in the form of a seamless chat, along with context management. The broader LLM Hub architecture relies on a system that lets users seamlessly work with any of the chatbot assistants/agents.
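A toy sketch of the routing idea is shown below. Keyword matching stands in for the LLM-based intent classification a real orchestrator would use, and the assistant names and keywords are invented for illustration.

```python
# Hypothetical routing table: each departmental assistant is reachable
# through one entry point, and the orchestrator picks the best match.

ASSISTANTS = {
    "hr_assistant": {"vacation", "payroll", "benefits"},
    "it_assistant": {"laptop", "vpn", "password"},
    "sales_assistant": {"quote", "pricing", "discount"},
}

def route(query: str, default: str = "general_assistant") -> str:
    """Dispatch a query to the first assistant whose keywords match."""
    terms = set(query.lower().split())
    for name, keywords in ASSISTANTS.items():
        if terms & keywords:
            return name
    return default

target = route("How do I reset my vpn password?")
```

A production hub would also carry the conversation context across assistants, so a follow-up question lands at the same specialist without the user re-explaining anything.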
The impact you can expect
Real productivity gains, instant knowledge access, and decisions backed by data.

Productivity boost
AI assistants automate repetitive tasks, manage schedules, and handle routine queries — freeing up your team for high-value work while reducing operational costs.

Instant access to knowledge
On-demand answers, insights, and context from vast datasets or documents. No manual searching, no waiting.

Enhanced customer experience
Fast, personalized, 24/7 support across channels. Better satisfaction, lower costs.

Smarter decisions
Surface relevant data, generate recommendations, and explain key factors — helping your teams act faster and with confidence.
Learn how we help our customers tackle their challenges
Explore how we redefine industry standards through innovation.
Interested in our services?
Reach out for tailored solutions and expert guidance.
Learn more about our AI strategies
Exploring the impact of Generative AI
LLM comparison: Find the best fit for legacy system rewrites
Legacy systems often struggle with performance, are vulnerable to security issues, and are expensive to maintain. Despite these challenges, over 65% of enterprises still rely on them for critical operations.
At the same time, modernization is becoming a pressing business need, with the application modernization services market valued at $17.8 billion in 2023 and expected to grow at a CAGR of 16.7%.
This growth highlights a clear trend: businesses recognize the need to update outdated systems to keep pace with industry demands.
The journey toward modernization varies widely. While 75% of organizations have started modernization projects, only 18% have reached a state of continuous improvement.

Data source: https://www.redhat.com/en/resources/app-modernization-report
For many, the process remains challenging, with a staggering 74% of companies failing to complete their legacy modernization efforts. Security and efficiency are the primary drivers, with over half of surveyed companies citing these as key motivators.
Given these complexities, the question arises: Could Generative AI simplify and accelerate this process?
With the surging adoption rates of AI technology, it’s worth exploring if Generative AI has a role in rewriting legacy systems.
This article explores LLM comparison, evaluating GenAI tools' strengths, weaknesses, and potential risks. The decision to use them ultimately lies with you.
Here's what we'll discuss:
- Why Generative AI?
- The research methodology
- Generative AI tools: six contenders for LLM comparison
- OpenAI backed by ChatGPT-4o
- Claude-3-sonnet
- Claude-3-opus
- Claude-3-haiku
- Gemini 1.5 Flash
- Gemini 1.5 Pro
- Comparison summary
Why Generative AI?
Traditionally, updating outdated systems has been a labor-intensive and error-prone process. Generative AI offers a solution by automating code translation, ensuring consistency and efficiency. This accelerates the modernization of legacy systems and supports cross-platform development and refactoring.
As businesses aim to remain competitive, using Generative AI for code transformation is crucial, allowing them to fully use modern technologies while reducing manual rewrite risks.
Here are key reasons to consider its use:
- Uncovering dependencies and business logic - Generative AI can dissect legacy code to reveal dependencies and embedded business logic, ensuring essential functionalities are retained and improved in the updated system.
- Decreased development time and expenses - automation drastically reduces the time and resources required for system re-writing. Quicker development cycles and fewer human hours needed for coding and testing decrease the overall project cost.
- Consistency and accuracy - manual code translation is prone to human error. AI models ensure consistent and accurate code conversion, minimizing bugs and enhancing reliability.
- Optimized performance - Generative AI facilitates the creation of optimized code from the beginning, incorporating advanced algorithms that enhance efficiency and adaptability, often lacking in older systems.
The LLM comparison research methodology
Comparing different Generative AI models to each other can be tough, because it's hard to apply the same criteria to all available tools. Some are web-based, some are restricted to a specific IDE, some offer a "chat" feature, and others only suggest code.
As our goal was the rewriting of existing projects, we aimed to create an LLM comparison based on the following six main challenges when working with existing code:
- Analyzing project architecture - understanding the architecture is crucial for maintaining the system's integrity during re-writing. It ensures the new code aligns with the original design principles and system structure.
- Analyzing data flows - proper analysis of data flows is essential to ensure that data is processed correctly and efficiently in the re-written application. This helps maintain functionality and performance.
- Generating historical backlog - this involves querying the Generative AI to create Jira (or any other tracking system) tickets that could potentially be used to rebuild the system from scratch. The aim is to replicate the workflow of the initial project implementation. These "tickets" should include component descriptions and acceptance criteria.
- Converting code from one programming language to another - language conversion is often necessary to leverage modern technologies. Accurate translation preserves functionality and enables integration with contemporary systems.
- Generating new code - the ability to generate new code, such as test cases or additional features, is important for enhancing the application's capabilities and ensuring comprehensive testing.
- Privacy and security of a Generative AI tool - businesses are concerned about sharing their source codebase with the public internet. Therefore, work with Generative AI must occur in an isolated environment to protect sensitive data.
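To make the backlog-generation challenge concrete, here is a minimal sketch of the kind of prompt we might send. The field names and the architecture summary are illustrative assumptions, not the exact prompts used in our research.

```python
# Illustrative prompt builder for the "historical backlog" challenge:
# ask the model to emit Jira-style tickets for the components it finds
# in an architecture summary.

def backlog_prompt(architecture_summary: str) -> str:
    return (
        "You are rebuilding a legacy system from scratch.\n"
        "From the architecture below, produce Jira tickets as a JSON list, "
        "each with: title, component_description, acceptance_criteria.\n\n"
        f"Architecture:\n{architecture_summary}"
    )

prompt = backlog_prompt("OrderService (REST) -> OrderRepository -> SQL Server")
```

Asking for a fixed, machine-readable shape (here, a JSON list with named fields) is what makes the output importable into a tracking system rather than free-form prose.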
Source projects overview
To test the capabilities of Generative AI, we used two projects:
- Simple CRUD application - The project utilizes .Net Core as its framework, with Entity Framework Core serving as the ORM and SQL Server as the relational database. The target application is a backend system built with Java 17 and Spring Boot 3.
- Microservice-based application - The application is developed with .Net Core as its framework, Entity Framework Core as the ORM, and the Command Query Responsibility Segregation (CQRS) pattern for handling entity operations. The target system includes a microservice-based backend built with Java 17 and Spring Boot 3, alongside a frontend developed using the React framework.

Generative AI tools: six contenders for LLM comparison
In this article, we will compare six different Generative AI tools used in these example projects:
- OpenAI backed by ChatGPT-4o with a context of 128k tokens
- Claude-3-sonnet - context of 200k tokens
- Claude-3-opus - context of 200k tokens
- Claude-3-haiku - context of 200k tokens
- Gemini 1.5 Flash - context of 1M tokens
- Gemini 1.5 Pro - context of 2M tokens
OpenAI
OpenAI's ChatGPT-4o represents an advanced language model that showcases the leading edge of artificial intelligence technology. Known for its conversational prowess and ability to manage extensive contexts, it offers great potential for explaining and generating code.
- Analyzing project architecture
ChatGPT faces challenges in analyzing project architecture due to its abstract nature and the high-level understanding required. The model struggles with grasping the full context and intricacies of architectural design, as it lacks the ability to comprehend abstract concepts and relationships not explicitly defined in the code.
- Analyzing data flows
ChatGPT performs better at analyzing data flows within a program. It can effectively trace how data moves through a program by examining function calls, variable assignments, and other code structures. This task aligns well with ChatGPT's pattern recognition capabilities, making it a suitable application for the model.
- Generating historical backlog
When given a project architecture as input, OpenAI can generate high-level epics that capture the project's overall goals and objectives. However, it struggles to produce detailed user stories suitable for project management tools like Jira, often lacking the necessary detail and precision for effective use.
- Converting code from one programming language to another
ChatGPT performs reasonably well in converting code, such as from C# to Java Spring Boot, by mapping similar constructs and generating syntactically correct code. However, it encounters limitations when there is no direct mapping between frameworks, as it lacks the deep semantic understanding needed to translate unique framework-specific features.
- Generating new code
ChatGPT excels in generating new code, particularly for unit tests and integration tests. Given a piece of code and a prompt, it can generate tests that accurately verify the code's functionality, showcasing its strength in this area.
- Privacy and security of the Generative AI tool
OpenAI's ChatGPT, like many cloud-based AI services, typically operates over the internet. However, there are ways to use it in an isolated private environment without sharing code or sensitive data on the public internet. To achieve this, deployments such as Azure OpenAI can be used: a Microsoft service that makes OpenAI models accessible within Azure's secure cloud environment.
Best tip
Use Reinforcement Learning from Human Feedback (RLHF): If possible, use RLHF to fine-tune GPT-4. This involves providing feedback on the AI's outputs, which it can then use to improve future outputs. This can be particularly useful for complex tasks like code migration.
Overall
OpenAI's ChatGPT-4o is a mature and robust language model that provides substantial support to developers in complex scenarios. It excels in tasks like code conversion between programming languages, ensuring accurate translation while maintaining functionality.
- Possibilities 3/5
- Correctness 3/5
- Privacy 5/5
- Maturity 4/5
Overall score: 4/5
Claude-3-sonnet
Claude-3-Sonnet is a language model developed by Anthropic, designed to provide advanced natural language processing capabilities. Its architecture is optimized for maintaining context over extended interactions, offering a balance of intelligence and speed.
- Analyzing project architecture
Claude-3-Sonnet excels in analyzing and comprehending the architecture of existing projects. When presented with a codebase, it provides detailed insights into the project's structure, identifying components, modules, and their interdependencies. Claude-3-Sonnet offers a comprehensive breakdown of project architecture, including class hierarchies, design patterns, and architectural principles employed.
- Analyzing data flows
It struggles to grasp the full context and nuances of data flows, particularly in complex systems with sophisticated data transformations and conditional logic. This limitation can pose challenges when rewriting projects that heavily rely on intricate data flows or involve sophisticated data processing pipelines, necessitating manual intervention and verification by human developers.
- Generating historical backlog
Claude-3-Sonnet can provide high-level epics that cover main functions and components when prompted with a project's architecture. However, they lack detailed acceptance criteria and business requirements. While it may propose user stories to map to the epics, these stories will also lack the details needed to create backlog items. It can help capture some user goals without clear confirmation points for completion.
- Converting code from one programming language to another
Claude-3-Sonnet showcases impressive capabilities in converting code, such as translating C# code to Java Spring Boot applications. It effectively translates the logic and functionality of the original codebase into a new implementation, leveraging framework conventions and best practices. However, limitations arise when there is no direct mapping between frameworks, requiring additional manual adjustments and optimizations by developers.
- Generating new code
Claude-3-Sonnet demonstrates remarkable proficiency in generating new code, particularly in unit and integration tests. The AI tool can analyze existing codebases and automatically generate comprehensive test suites covering various scenarios and edge cases.
- Privacy and security of the Generative AI tool
Unfortunately, Anthropic's privacy policy is quite confusing. Before January 2024, they used clients’ data to train their models. The updated legal document ostensibly provides protections and transparency for Anthropic's commercial clients, but it’s recommended to consider the privacy of your data while using Claude.
Best tip
Be specific and detailed: provide the Generative AI with specific and detailed prompts to ensure it understands the task accurately. This includes clear descriptions of what needs to be rewritten, any constraints, and desired outcomes.
Overall
The model's ability to generate coherent and contextually relevant content makes it a valuable tool for developers and businesses seeking to enhance their AI-driven solutions. However, the model might have difficulty fully grasping intricate data flows, especially in systems with complex transformations and conditional logic.
- Possibilities 3/5
- Correctness 3/5
- Privacy 3/5
- Maturity 3/5
Overall score: 3/5
Claude-3-opus
Claude-3-Opus is another language model by Anthropic, designed for handling more extensive and complex interactions. This member of the Claude family focuses on delivering high-precision code generation and analysis.
- Analyzing project architecture
With its advanced natural language processing capabilities, it thoroughly examines the codebase, identifying various components, their relationships, and the overall structure. This analysis provides valuable insights into the project's design, enabling developers to understand the system's organization better and make decisions about potential refactoring or optimization efforts.
- Analyzing data flows
While Claude-3-Opus performs reasonably well in analyzing data flows within a project, it may lack the context necessary to fully comprehend all possible scenarios. However, compared to Claude-3-sonnet, it demonstrates improved capabilities in this area. By examining the flow of data through the application, it can identify potential bottlenecks, inefficiencies, or areas where data integrity might be compromised.
- Generating historical backlog
By providing the project architecture as an input prompt, it effectively creates high-level epics that encapsulate essential features and functionalities. One of its key strengths is generating detailed and precise acceptance criteria for each epic. However, it may struggle to create granular Jira user stories. Compared to other Claude models, Claude-3-Opus demonstrates superior performance in generating historical backlog based on project architecture.
- Converting code from one programming language to another
Claude-3-Opus shows promising capabilities in converting code from one programming language to another, particularly in converting C# code to Java Spring Boot, a popular Java framework for building web applications. However, it has limitations when there is no direct mapping between frameworks in different programming languages.
- Generating new code
The AI tool demonstrates proficiency in generating both unit tests and integration tests for existing codebases. By leveraging its understanding of the project's architecture and data flows, Claude-3-Opus generates comprehensive test suites, ensuring thorough coverage and improving the overall quality of the codebase.
- Privacy and security of the Generative AI tool
Like other Anthropic models, you need to consider the privacy of your data. For specific details about Anthropic's data privacy and security practices, it would be better to contact them directly.
Best tip
Break down the existing project into components and functionality that need to be recreated. Reducing input complexity minimizes the risk of errors in output.
Overall
Claude-3-Opus's strengths are analyzing project architecture and data flows, converting code between languages, and generating new code, which makes the development process easier and improves code quality. This tool empowers developers to quickly deliver high-quality software solutions.
- Possibilities 4/5
- Correctness 4/5
- Privacy 3/5
- Maturity 4/5
Overall score: 4/5
Claude-3-haiku
Claude-3-Haiku is part of Anthropic's suite of Generative AI models, declared as the fastest and most compact model in the Claude family for near-instant responsiveness. It excels in answering simple queries and requests with exceptional speed.
- Analyzing project architecture
Claude-3-Haiku struggles with analyzing project architecture. The model tends to generate overly general responses that closely resemble the input data, limiting its ability to provide meaningful insights into a project's overall structure and organization.
- Analyzing data flows
Similar to its limitations in project architecture analysis, Claude-3-Haiku fails to effectively group components based on their data flow relationships. This lack of precision makes it difficult to clearly understand how data moves throughout the system.
- Generating historical backlog
Claude-3-Haiku is unable to generate Jira user stories effectively. It struggles to produce user stories that meet the standard format and detail required for project management. Additionally, its performance generating high-level epics is unsatisfactory, lacking detailed acceptance criteria and business requirements. These limitations likely stem from its training data, which focused on short forms and concise prompts, restricting its ability to handle more extensive and detailed inputs.
- Converting code from one programming language to another
Claude-3-Haiku proved good at converting code between programming languages, demonstrating an impressive ability to accurately translate code snippets while preserving original functionality and structure.
- Generating new code
Claude-3-Haiku performs well in generating new code, comparable to other Claude-3 models. It can produce code snippets based on given requirements or specifications, providing a useful starting point for developers.
- Privacy and security of the Generative AI tool
Similar to other Anthropic models, you need to consider the privacy of your data, although according to official documentation, Claude 3 Haiku prioritizes enterprise-grade security and robustness. Also, keep in mind that security policies may vary for different Anthropic models.
Best tip
Be aware of Claude-3-Haiku's capabilities: Claude-3-Haiku is a natural language processing model trained on short-form content. It is not designed for complex tasks like converting a project from one programming language to another.
Overall
Its fast response time is a notable advantage, but its performance suffers when dealing with larger prompts and more intricate tasks. Other tools or manual analysis may prove more effective in analyzing project architecture and data flows. However, Claude-3-Haiku can be a valuable asset in a developer's toolkit for straightforward code conversion and generation tasks.
- Possibilities 2/5
- Correctness 2/5
- Privacy 3/5
- Maturity 2/5
Overall score: 2/5
Gemini 1.5 Flash
Gemini 1.5 Flash represents Google's commitment to advancing AI technology; it is designed to handle a wide range of natural language processing tasks, from text generation to complex data analysis. Google presents Gemini Flash as a lightweight, fast, and cost-efficient model featuring multimodal reasoning and a breakthrough long context window of up to one million tokens.
- Analyzing project architecture
Gemini Flash's performance in analyzing project architecture was found to be suboptimal. The AI tool struggled to provide concrete and actionable insights, often generating abstract and high-level observations instead.
- Analyzing data flows
It effectively identified and traced the flow of data between different components and modules, offering developers valuable insights into how information is processed and transformed throughout the system. This capability aids in understanding the existing codebase and identifying potential bottlenecks or inefficiencies. However, the effectiveness of data flow analysis may vary depending on the project's complexity and size.
- Generating historical backlog
Gemini Flash can synthesize meaningful epics that capture overarching goals and functionalities required for the project by analyzing architectural components, dependencies, and interactions within a software system. However, it may fall short of providing granular acceptance criteria and detailed business requirements. The generated epics often lack the precision and specificity needed for effective backlog management and task execution, and it struggles to generate Jira user stories.
- Converting code from one programming language to another
Gemini Flash showed promising results in converting code from one programming language to another, particularly when translating from C# to Java Spring Boot. It successfully mapped and transformed language-specific constructs, such as syntax, data types, and control structures. However, limitations exist, especially when dealing with frameworks or libraries that do not have direct equivalents in the target language.
- Generating new code
Gemini Flash excels in generating new code, including test cases and additional features, enhancing application reliability and functionality. It analyzed the existing codebase and generated test cases that cover various scenarios and edge cases.
- Privacy and security of the Generative AI tool
Google was one of the first in the industry to publish an AI/ML privacy commitment, which outlines its belief that customers should have the highest level of security and control over their data stored in the cloud. That commitment extends to Google Cloud Generative AI products. You can set up a Gemini AI model in Google Cloud and use an encrypted TLS connection over the internet to connect from your on-premises environment to Google Cloud.
Best tip
Use prompt engineering: Starting by providing necessary background information or context within the prompt helps the model understand the task's scope and nuances. It's beneficial to experiment with different phrasing and structures; refining prompts iteratively based on the quality of the outputs. Specifying any constraints or requirements directly in the prompt can further tailor the model's output to meet your needs.
Overall
By using its AI capabilities in data flow analysis, code translation, and test creation, developers can optimize their workflow and concentrate on strategic tasks. However, it is important to remember that Gemini Flash is optimized for high-speed processing, which makes it less effective for complex tasks.
- Possibilities 2/5
- Correctness 2/5
- Privacy 5/5
- Maturity 2/5
Overall score: 2/5
Gemini 1.5 Pro
Gemini 1.5 Pro is the largest and most capable model created by Google, designed for handling highly complex tasks. While it is the slowest among its counterparts, it offers significant capabilities. The model targets professionals and developers needing a reliable assistant for intricate tasks.
- Analyzing project architecture
Gemini Pro is highly effective in analyzing and understanding the architecture of existing programming projects, surpassing Gemini Flash in this area. It provides detailed insights into project structure and component relationships.
- Analyzing data flows
The model demonstrates proficiency in analyzing data flows, similar to its performance in project architecture analysis. It accurately traces and understands data movement throughout the codebase, identifying how information is processed and exchanged between modules.
- Generating historical backlog
By using project architecture as an input, it creates high-level epics that encapsulate main features and functionalities. While it may not generate specific Jira user stories, it excels at providing detailed acceptance criteria and precise details for each epic.
- Converting code from one programming language to another
The model shows impressive results in code conversion, particularly from C# to Java Spring Boot. It effectively maps and transforms syntax, data structures, and constructs between languages. However, limitations exist when there is no direct mapping between frameworks or libraries.
- Generating new code
Gemini Pro excels in generating new code, especially for unit and integration tests. It analyzes the existing codebase, understands functionality and requirements, and automatically generates comprehensive test cases.
- Privacy and security of the Generative AI tool
Similarly to other Gemini models, Gemini Pro is packed with advanced security and data governance features, making it ideal for organizations with strict data security requirements.
Best tip
Manage context: Gemini Pro incorporates previous prompts into its input when generating responses. This use of historical context can significantly influence the model's output and lead to different responses. Include only the necessary information in your input to avoid overwhelming the model with irrelevant details.
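The context-management tip above can be sketched as a simple trimming step: keep the system prompt, then only the most recent turns, so irrelevant history doesn't skew the model. The message format follows the common chat-message convention, and the turn limit is an arbitrary assumption.

```python
# Keep the system message plus the last few turns of conversation,
# dropping older history that no longer matters for the current request.

def trim_history(messages: list[dict], max_turns: int = 4) -> list[dict]:
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]

history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [{"role": "user", "content": f"turn {i}"} for i in range(10)]
trimmed = trim_history(history)
```

More elaborate schemes summarize the dropped turns instead of discarding them, but even this naive cutoff already keeps the input focused.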
Overall
Gemini Pro shows remarkable capabilities in areas such as project architecture analysis, data flow understanding, code conversion, and new code generation. However, there may be instances where the AI encounters challenges or limitations, especially with complex or highly specialized codebases. As such, while Gemini Pro offers significant advantages, developers should remain mindful of its current boundaries and use human expertise when necessary.
- Possibilities 4/5
- Correctness 3/5
- Privacy 5/5
- Maturity 3/5
Overall score: 4/5
LLM comparison summary
Embrace AI-driven approach to legacy code modernization
Generative AI offers practical support for rewriting legacy systems. While tools like GPT-4o and Claude-3-opus can’t fully automate the process, they excel in tasks like analyzing codebases and refining requirements. Combined with advanced platforms for data analysis and workflows, they help create a more efficient and precise redevelopment process.
This synergy allows developers to focus on essential tasks, reducing project timelines and improving outcomes.
Generative AI for connected cars: Solution-oriented chatbots for personalized user support
Generative AI is becoming a major player in automotive innovation. The market is already valued at USD 480.22 million in 2024, and it's expected to grow to USD 3,900.03 million by 2034, with a steady annual growth rate of 23.3%. Moreover, by 2025, the global automobile sector will invest $11.1 billion in cognitive and AI technologies. These numbers show how quickly the industry is picking up on this technology's potential.
GenAI is making its mark across various areas. From manufacturing optimization to autonomous driving, its impact is undeniable. Predictive maintenance systems identify issues early, AI-powered tools optimize vehicle development, and talking to in-car assistants is starting to feel like a scene out of a sci-fi movie.
Speaking of sci-fi, pop culture has always loved the idea of talking cars. There is K.I.T.T. (Knight Industries Two Thousand), of course, but also all Transformers and tons of cartoons, starting with Lightning McQueen. Is it just pure fiction? Not at all (except McQueen, for many reasons 😊)! Early attempts at smarter cars started with examples like a 2004 Honda offering voice-controlled navigation and Ford’s 2007 infotainment system. Fast forward to now, and we have a VW Golf with a GPT-based assistant that’s more conversational than ever.
But honestly, the most resourceful one is K.I.T.T. – it activates all onboard systems, diagnoses itself, and uses company resources (there is an episode when K.I.T.T. withdraws money from the company bank account using an ATM). In 1982, when the show first aired, it was just pure science fiction. But what about now? Is it more science or fiction? With Generative AI growing rapidly in automotive, we have to revisit that question.
Let’s break it down!
Prerequisites
Let’s assume we would like to create a solution-oriented chatbot connected with a car. By “solution-oriented,” I mean one that is really useful, able not only to change the attractive interior lighting but also to truly solve owners’ issues.
The idea is to use Generative AI, a large language model with its abilities in reasoning, problem-solving, and language processing.
Therefore, the first question is: where should the model live – in the cloud or in the car?
For the first option, you need a constant Internet connection (which is usually not guaranteed in cars). In contrast, the second option typically involves a smaller and less versatile model, and you still need a lot of resources (hardware, power) to run it. The truth lies, as usual, in between (cloud model if available, local one otherwise), but today we’ll focus on the cloud model only.
The next step is to consider the user-facing layer. The perfect one is integrated into the car, isn’t it? Well, in most cases, yes, but there are some drawbacks.
The first issue is user-oriented – if you want to interact with your car while outside of it, your mobile phone is probably the most convenient option (or a smartwatch, like Michael's in Knight Rider). Also, infotainment systems are comprehensively tested and usually heavily sealed into cars, so introducing such a bot there is very time-consuming. Therefore, the mobile phone is our choice.
We don’t want to focus on this application today, however. Depending on the target operating system, it probably should use speech-to-text recognition and text-to-speech generation and stream data both ways for a better user experience.
The core part is the chatbot backend – a regular application connecting the frontend and the LLM. It should be able to call external APIs and use two sources of knowledge – live car data and company-owned data sources.
Basics
Let’s gather the components. There is a customer-facing layer – the mobile application; then there is our main backend application, the LLM, of course, and some services to provide data and functionalities.
The diagram above is conceptual, of course. The backend is probably cloud-hosted, too, and cloud services linked to car services form the essence of the “connected cars” pattern.
The main concept for the application is “tool calling” – the LLM’s ability to call predefined functions with structured arguments. That’s why the backend is surrounded by different services. In a perfect world, those should be separate microservices designed for different use cases. However, this architecture is not scenario-based. There is no “if-else-if” ladder or anything like it. The LLM determines how to utilize the tools based on its own decision-making process.
The sample conversation schema might look like the one presented below.
As you can see, the chatbot service calls the LLM, and the LLM returns a command: “call function A.” Then, the service calls the function and returns the response to the LLM (not the user!).
This approach is very flexible as functions (a.k.a. tools) might execute actions and return useful data. Also, the LLM may decide to use a function based on another function result. In the case above, it can, for example, use one function to check the climate control system status and discover that it’s running in the “eco mode”. Then, it might decide to call the “set mode” function with the argument “max AC” to change the mode. After that, the LLM can return an answer to the user with a message like “It should be fixed now”.
To build such an application, all you need is to call the LLM like this (OpenAI GPT-4o example):
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "My AC is ineffective! Fix it!"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_ac_status",
        "description": "Returns the current status of the climate control system"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "set_ac_mode",
        "description": "Sets the specified mode for the climate control system",
        "parameters": {
          "type": "object",
          "properties": {
            "mode": {
              "type": "string",
              "description": "Desired mode",
              "enum": ["ECO", "NORMAL", "MAX AC"]
            }
          },
          "required": ["mode"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
As you can see, we don’t need a rigid response schema here – the assumption is that the LLM is able to interpret any reasonable tool response.
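The loop described earlier (backend calls the LLM, the LLM requests a function, the result goes back to the LLM, not the user) can be sketched in Python. The handler names and the in-car API stubs below are illustrative assumptions, not a real vehicle API:

```python
import json

# Illustrative in-car API stubs -- in a real system these would call
# the connected-car services, not return hardcoded values.
def get_ac_status():
    return {"mode": "ECO", "target_temp_c": 21}

def set_ac_mode(mode):
    return {"ok": True, "mode": mode}

# Maps the tool names the LLM knows about to local handlers.
TOOL_HANDLERS = {
    "get_ac_status": lambda args: get_ac_status(),
    "set_ac_mode": lambda args: set_ac_mode(args["mode"]),
}

def handle_tool_calls(tool_calls, messages):
    """Execute each tool call the LLM requested and feed the result
    back into the conversation as a 'tool' message (not to the user)."""
    for call in tool_calls:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"] or "{}")
        result = TOOL_HANDLERS[name](args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages

# Example: the LLM decided to check the AC status first.
messages = [{"role": "user", "content": "My AC is ineffective! Fix it!"}]
tool_calls = [{"id": "call_1",
               "function": {"name": "get_ac_status", "arguments": "{}"}}]
messages = handle_tool_calls(tool_calls, messages)
print(messages[-1]["content"])  # the ECO status goes back to the LLM
```

In production, the backend would loop: send the updated message list back to the LLM and repeat until the model returns plain text for the user instead of another tool call.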
Dive
The subtitle should probably be “deep dive,” but honestly, we’re just scratching the surface today. Nevertheless, let’s dig a little deeper.
So far, we have the user-facing application and the backend service. Now, let’s make it useful.
The AC example mentioned above is perfectly valid, but how can it be achieved? Let’s say there is an API for interacting with the AC in the car. That’s typical for all PHEVs and EVs, and available for some HEVs too – you can already turn on your AC remotely via the mobile app. However, the real value lies in the connected car itself.
There is no IP address of the car hardcoded in the application. Usually, there is a digital twin in the cloud (a cloud service that represents the car). The application calls the twin, and the twin notifies the vehicle. There should also be some pub/sub queue in between to handle connectivity disruptions. The security layer is also extremely important. We don’t want anybody to so much as blast the radio at max volume during a quiet night ride, let alone turn off the lights or engage the brakes.
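A minimal sketch of the twin-plus-queue pattern, with an in-memory queue standing in for the real broker (MQTT, a cloud message bus); all names and payloads here are hypothetical:

```python
import json
import queue
import time
import uuid

# In-memory queue standing in for the pub/sub broker sitting between
# the digital twin and the vehicle.
command_queue = queue.Queue()

def send_command_to_twin(vin, command, payload):
    """The backend never talks to the car directly: it publishes a
    command addressed to the vehicle's digital twin. The broker buffers
    the message, so a temporary connectivity gap does not lose it."""
    message = {
        "id": str(uuid.uuid4()),
        "vin": vin,
        "command": command,
        "payload": payload,
        "issued_at": time.time(),
    }
    command_queue.put(json.dumps(message))
    return message["id"]

def vehicle_poll():
    """Runs on the vehicle side: picks up buffered commands whenever
    connectivity is available."""
    raw = command_queue.get_nowait()
    return json.loads(raw)

cmd_id = send_command_to_twin("VIN-DEMO-123", "set_ac_mode", {"mode": "MAX AC"})
received = vehicle_poll()
print(received["command"])  # -> set_ac_mode
```

A real deployment would add authentication, message signing, and per-command authorization on top of this – the security layer mentioned above.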
Which brings us to the list of possible actions.
Let’s assume all systems in the car are somehow connected, maybe using a common bus or a more modern Ethernet-like network. Still, some actuators, such as brakes, should be isolated from the system.
So, there is no “brake API” to stop a car. However, it may be beneficial for mechanics to execute some "dangerous" actions programmatically, e.g., to increase the pressure in the braking system without actually pressing the pedal. If this is the case, such functionalities should be accessible exclusively through a local connection without the need for digital twin integration. Therefore, we can assume there are two systems in the car – local and cloud-integrated, no matter the nature of the isolation (physical, network, or software). Let’s focus on the connected car aspect.
I believe the system should be able to change the vehicle settings, even if there is a risk that the driver could be surprised by an unrequested change in the steering feel while taking a turn. This way, the chatbot can be genuinely useful and reduce support load by adjusting car settings based on the user’s preferences. To prevent misuse, we can instruct the chatbot via prompt engineering to confirm each change with the user before execution and, of course, implement best-in-class security for all components. We can also allow certain operations only while the car is parked.
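The confirm-first and parked-only rules should not live only in the prompt – a small guard layer in the backend can enforce them regardless of what the LLM decides. A sketch, with assumed setting names and vehicle-state fields:

```python
# Hypothetical guard layer: every settings change requested by the LLM
# passes through here before the actual vehicle API is called.
SENSITIVE = {"active_steering", "driver_assist"}

def vehicle_is_parked(state):
    """Assumed vehicle-state shape: gear selector plus current speed."""
    return state.get("gear") == "P" and state.get("speed_kmh", 0) == 0

def authorize_change(setting, state, user_confirmed):
    """Return True only if the change is safe to execute right now."""
    if not user_confirmed:
        return False  # the chatbot must ask the user first
    if setting in SENSITIVE and not vehicle_is_parked(state):
        return False  # steering feel etc.: only while parked
    return True

state = {"gear": "D", "speed_kmh": 50}
print(authorize_change("active_steering", state, True))  # False: car is moving
print(authorize_change("wipers", state, True))           # True: harmless change
```

The point of the design is defense in depth: even if a cleverly phrased prompt talks the model into skipping the confirmation, the backend still refuses.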
Which brings us back to the list of possible actions.
For the sake of this article, let’s assume the chatbot can change various car settings. Examples include:
- Climate control settings
- Driver assistance sensitivity and specific function toggles
- Navigation system settings, like route type or other function toggles
- 360° camera system settings, like brightness adjustment
- Sound system settings, like the equalizer
- Wiper settings
- Notification settings
- Active steering system settings
This list is not complete, and the best thing is – it doesn’t need to be, as adding new functions (a tool definition plus API availability) can be part of a future OTA (over-the-air) system update.
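Keeping the tool definitions as plain data is what makes extending the list cheap: a later update only appends an entry (plus the backing API), without touching the chatbot core. A sketch, with hypothetical setting names:

```python
# Tool definitions kept as data, in the same schema as the GPT-4o
# request shown earlier. Shipping a new capability means appending an
# entry here -- the LLM picks it up on the next request.
TOOLS = [
    {"type": "function",
     "function": {
         "name": "set_wiper_mode",
         "description": "Sets the wiper mode",
         "parameters": {
             "type": "object",
             "properties": {
                 "mode": {"type": "string", "enum": ["OFF", "AUTO", "FAST"]}
             },
             "required": ["mode"],
         },
     }},
]

def register_tool(name, description, parameters):
    """Append a new tool definition at runtime or during an update."""
    TOOLS.append({"type": "function",
                  "function": {"name": name,
                               "description": description,
                               "parameters": parameters}})

# A later update adds equalizer control without changing any dispatch logic:
register_tool("set_equalizer_preset",
              "Applies a sound system equalizer preset",
              {"type": "object",
               "properties": {"preset": {"type": "string",
                                         "enum": ["FLAT", "ROCK", "JAZZ"]}},
               "required": ["preset"]})
print(len(TOOLS))  # -> 2
```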
What about reading real-time data? Should we connect to the car directly and read the status? Let’s leave this option for another article 😉 and focus on communication via the cloud.
There are two possibilities.
We can provide multiple tools that get data per source or component (a reminder – the LLM decides to call for data, which triggers an API call, and the LLM then processes the response). Alternatively, we could implement a single tool, “get vehicle data,” that collects and merges all data available from all sources.
For the latter approach, there is a further question – do we really need a tool at all? Maybe we should inject the current state into each conversation, since having it is probably beneficial for solving most cases anyway?
Let me give the standard consultant reply to those questions.
It depends.
More data in the context means longer response times and a higher bill. Also, some cases don’t need the entire context, or any of it at all. On the other hand, if you let the LLM decide which part of the context is necessary (which function to call), that also affects time and cost.
The next part is the “cost” of collecting data. Some data sources might be slow, or consume valuable in-car resources to compute data that may not even be needed.
My advice would be to start collecting all data in the background during session creation and attach it to the session progressively as it becomes ready. Additionally, give the LLM the capability to wait for data that has not yet arrived. This should be implemented as a function, to handle cases where the user requests a data-related action but the necessary data is not yet available.
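The background-collection idea might look like the sketch below: all fetches start at session creation, each source attaches its result to the context as soon as it is ready, and a wait-style tool lets the LLM block briefly for data that hasn’t arrived yet. Source names and payloads are invented for illustration:

```python
import threading

vehicle_context = {}  # filled progressively as sources respond
events = {}           # one readiness flag per data source

def start_collection(sources):
    """Kick off all data fetches at session creation; each source
    attaches its result to the context the moment it is ready."""
    for name, fetch in sources.items():
        events[name] = threading.Event()

        def worker(name=name, fetch=fetch):
            vehicle_context[name] = fetch()
            events[name].set()

        threading.Thread(target=worker, daemon=True).start()

def wait_for_data(source, timeout=5.0):
    """Exposed to the LLM as a tool: block until the requested data
    source has been attached to the session, or report it as missing."""
    if events[source].wait(timeout):
        return vehicle_context[source]
    return {"error": f"{source} not available yet"}

start_collection({"ac_status": lambda: {"mode": "ECO"},
                  "tire_pressure": lambda: {"front_left_bar": 2.4}})
print(wait_for_data("ac_status"))
```

In a real backend, the lambdas would be calls through the digital twin, and slow sources would simply set their event later without blocking session start.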
The second part is splitting the data. You can use a much smaller LLM to decide which context data is necessary to answer the current question and remove unnecessary information to keep it concise.
The last part is company data. All LLMs are trained on thousands of manuals available on the Internet, but that general knowledge is usually not sufficient – your chatbot should have the exact car manual at its fingertips. Therefore, you need to provide it. You could execute the RAG procedure and add some content to the context for each prompt, but my advice would be to hide it behind a function, too. Then, the LLM may decide when to call it to get extra information if needed.
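Hiding the manual behind a function could look like the sketch below, with naive keyword scoring standing in for a real embedding-based RAG retrieval; the manual excerpts are invented:

```python
# Toy stand-in for the RAG retrieval step: in production this would be
# an embedding search over the indexed car manual, not keyword counting.
MANUAL_CHUNKS = [
    "To restart the communication module go to Settings > Connectivity > Cellular.",
    "The climate control ECO mode limits compressor power to save energy.",
]

def search_manual(query, top_k=1):
    """Exposed to the LLM as a tool, so it pulls manual excerpts only
    when it decides extra knowledge is needed."""
    scored = [(sum(w in chunk.lower() for w in query.lower().split()), chunk)
              for chunk in MANUAL_CHUNKS]
    scored.sort(reverse=True)
    return [chunk for score, chunk in scored[:top_k] if score > 0]

print(search_manual("how to restart the communication module"))
```

The benefit over always-on RAG injection is the same trade-off discussed above: the context stays small for questions that don’t need the manual, at the price of one extra tool round-trip when they do.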
Use cases of Generative AI for connected cars
OK, we have all the components in place – the frontend, the backend, the model, and the toolbox. Can it be useful? Absolutely!
In the examples below, I will skip the part where the LLM commands the backend to perform an action, and the backend calls an API. I will also omit the process of the API response traveling back through the backend to the LLM. Additionally, I will assume that no new data is added to the context, so the LLM must decide which data it needs. Besides that, make sure that your context contains some system prompts to better explain all possibilities to the assistant.
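As a concrete (and purely illustrative) example of such a system prompt seeding the context – the exact wording below is an assumption, not a production prompt:

```python
# Illustrative system prompt: it encodes the confirm-first rule, the
# parked-only restriction, and the manual-search tool discussed above.
SYSTEM_PROMPT = (
    "You are an in-car assistant. You can read vehicle data and change "
    "vehicle settings through the provided tools. Always confirm a "
    "settings change with the user before executing it. Some changes "
    "are only allowed while the car is parked. When unsure about a "
    "procedure, consult the car manual via the manual search tool "
    "instead of guessing."
)

# Every session starts with the system prompt as the first message.
messages = [{"role": "system", "content": SYSTEM_PROMPT}]
messages.append({"role": "user",
                 "content": "I think my Lane Keeping Assist activates too frequently."})
print(len(messages))  # -> 2
```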
Here are some examples.
User: I think my Lane Keeping Assist activates too frequently. Can you check it?
Assistant: [get driver assistance settings]
Assistant: [diagnose sensors]
Assistant: Everything looks good, but I can see that the Lane Keeping Assist mode is set to “fast”. Do you want me to lower the setting to “medium”?
User: Yes, please.
Assistant: [change LKA settings]
Assistant: It should be better now.
In the example above, the user has a problem with one of the driver assistance components. The system proposes and, after confirmation, executes settings changes, so the driver doesn’t need to find a proper setting in the on-board computer.
Check another case.
User: Whenever I’m parking the car in my garage, I can barely see anything in my rear camera. There is not much light there, but maybe you can help me in some other way?
Assistant: [get camera settings]
Assistant: I can increase the brightness of the camera, but then it might be too bright for other places. What do you think?
User: Yes, please do that.
Assistant: [change settings]
Assistant: Here you are; let me know if you want me to roll back the change.
Usually, there is no brightness setting available to the user. To adjust it, the user would need to schedule a service appointment, spend some time there, and pay for the visit. The assistant can do it in a few seconds, for free.
User: There are so many notifications in the car, so I can’t hear the important ones!
Assistant: [check last notifications]
Assistant: If you find some of my notifications unnecessary, I can mute some of them. What do you think?
User: Yes, please turn off the speeding avoidance warning.
Assistant: I can’t turn it off due to legal requirements, but I can replace the sound with vibration feedback on the accelerator pedal. Do you want me to do that?
User: Sounds great!
Assistant: [adjust settings]
Assistant: It’s done. I hope you’ll be less distracted.
In this case, the assistant solves the issue of annoying notifications. Note the alternative solution it suggests and its awareness of the legal aspect – both achievable with proper function descriptions.
User: It seems my car can’t connect to the Internet.
Assistant: [check the connection] -> no connection to the car from the cloud service
Assistant: It seems your car is disconnected. Can you please check the cellular network signal icon on the infotainment main screen?
User: I can see the signal is good.
Assistant: [check last logs]
Assistant: I can see there were some issues with your connection before. Please try to open the settings in the car infotainment and restart the communication module. I’ll guide you through the process.
User: OK, what should I do?
Assistant: [check user manual]
Assistant: Go to “Settings”, then “Connectivity”, then “Cellular” and then click “Restart module”. Confirm when asked.
User: OK
Assistant: [check the connection]
Assistant: I can see the connection is restored. Can you confirm that? If not, I can schedule a service meeting for you.
User: No need; it works now!
In the last example, the assistant guides the user using logs and the manual, after checking real-time conditions. Without the LLM, the user would need to visit the dealer or at least contact online support.
Final words
With modern technology, we can bring some science fiction concepts to real life. Maybe warp speed is not possible yet, but at least a car itself can give us solutions to problems that until recently could only be solved in much more expensive ways.
The beauty of it is – it’s really simple. Of course, there is a user-facing application to be built, security to be ensured, and a bunch of functions to be implemented, but with modern connected cars, those elements are largely in place anyway. The novel part is a surprisingly simple application that uses the LLM as the brainpower of the system.
As you can see, the “Attention Is All You Need” paper that started the LLM revolution has allowed humanity to bring to life concepts present in our culture for decades. On the other hand, would this article have ever been written if its authors hadn’t watched K.I.T.T. in their childhood? We will never know.

