Blog

Thinking out loud

Where we share the insights, questions, and observations that shape our approach.

Software development

Apache Kafka fundamentals

Nowadays, we have plenty of unique architectural solutions. But all of them have one thing in common – every single decision should be done after a solid understanding of the business case as well as the communication structure in a company. It is strictly connected with famous Conway’s Law:

“Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.”

In this article, we go deeper into the Event-Driven style, and we discover when we should implement such solutions. This is when Kafka comes to play.

The basic definition taken from the Apache Kafka site states that this is an open-source distributed event streaming platform . But what exactly does it mean? We explain the basic concepts of Apache Kafka, how to use the platform, and when we may need it.

Apache Kafka is all about events

To understand what the event streaming platform is, we need to have a prior understanding of an event itself. There are different ways of how the services can interact with each other – they can use Commands, Events, or Queries. So, what is the difference between them?

Command – we can call it a message in which we expect something to be done - like in the army when the commander gives an order to soldiers. In computer science, we are making requests to other services to perform some action, which causes a system state change. The crucial part is that they are synchronous, and we expect that something will happen in the future. It is the most common and natural method for communication between services. On the other hand, you do not really know if your expectation will be fulfilled by the service. Sometimes we create commands, and we do not expect any response (it is not needed for the caller.)
Event – the best definition of an event is a fact. It is a representation of the change which happened in the service (domain). It is essential that there is no expectation of any future action. We can treat an event as a notification of state change. Events are immutable. In other words - it is everything necessary for the business. This is also a single source of truth, so events need to precisely describe what happened in the system.
Query – in comparison to the others, the query is only returning a response without any modifications in the system state. A good example of how it works can be an SQL query.

Below there is a small summary which compares all the above-mentioned ways of interaction:

Now we know what the event is in comparison to other interaction styles. But what is the advantage of using events? To understand why event-driven solutions are better than synchronous request-response calls, we have to learn a bit about software architecture history.

The figure describes a difference between a system that has old monolith architecture and a system with new modern microservice architecture.

The left side of the figure presents an API communication between two monoliths. In this case, communication is straightforward and easy. There is a different problem though such monolith solutions are very complex and hard to maintain.

The question is, what happens if we want to use, instead of two big services, a few thousands of small microservices . How complex will it be? The directed graph on the right side is showing how quickly the number of calls in the system can grow, and with it, the number of shared resources. We can have a situation when we need to use data from one microservice in many places. That produces new challenges regarding communication.

What about communication style?

In both cases, we are using a request-response style of communication (figure below), and we need to know how to use API provided by the server from the caller perspective. There must be some kind of protocol to exchange messages between services.

So how to reduce the complexity and make an integration between services easier? To answer this – look at the figure below.

In this case, interactions between event producers and consumers are driven by events only. This pattern supports loose coupling between services, and what is more important for us, the event producer does not need to be aware of the event consumer state. It is the essence of the pattern. From the producer's perspective, we do not need to know who or how to use data from the topic.

Of course, as usual, everything is relative. It is not like the event-driven style is always the best. It depends on the use case. For instance, when operations should be done synchronously, then it is natural to use the request-response style. In situations like user authentication, reporting AB tests, or integration with third-party services, it is better to use a synchronous style. When the loose coupling is a need, then it is better to go with an event-driven approach. In larger systems, we are mixing styles to achieve a business goal.

The name of Kafka has its origins in the word Kafkaesque which means according to the Cambridge dictionary something extremely unpleasant, frightening, and confusing, and similar to situations described in the novels of Franz Kafka.

The communication mess in the modern enterprise was a factor to invent such a tool. To understand why - we need to take a closer look at modern enterprise systems.

The modern enterprise systems contain more than just services. They usually have a data warehouse, AI and ML analytics, search engines, and much more. The format of data and the place where data is stored are various – sometimes a part of the data is stored in RDBMS, a part in NoSQL, and other in file bucket or transferred via a queue. They can have different formats and extensions like XML, JSON, and so on. Data management is the key to every successful enterprise. That is why we should care about it. Tim O’Reilly once said:

„We are entering a new world in which data may be more important than software.”

In this case, having a good solution for processing crucial data streams across an enterprise is a must to be successful in business. But as we all know, it is not always so easy.

How to tame the beast?

For this complex enterprise data flow scenario, people invented many tools/methods. All to make this enterprise data distribution possible. Unfortunately, as usual, to use them, we have to make some tradeoffs. Here we have a list of them:

Database replication, Mirroring, and Log Shipping - used to increase the performance of an application (scaling) and backup/recovery.

ETL – Extract, Transform, Load - used to copy data from different sources for analytics/reports.

Messaging systems - provide asynchronous communication between systems.

As you can see, we have a lot of problems that we need to take care of to provide correct data flow across an enterprise organization. That is why Apache Kafka was invented. One more time we have to go to the definition of Apache Kafka. It is called a distributed event streaming platform. Now we know what the event is and how event-driven style looks like. So as you probably can guess, event streaming, in our case, means capturing, storing, manipulating, processing, reacting, and routing event streams in real-time. It is based on three main capabilities – publishing/subscribing, storing, and processing. These three capabilities make this tool very successful.

Publishing/Subscribing provides an ability to read/write to streams of events and even more – you can continuously import/export data from different sources/systems.
Storing is also very important here. It solves the abovementioned problems in messaging. You can store streams of events for as long as you want without being afraid that something will be gone.
Processing allows us to process streams in real-time or use history to process them.

But wait! There is one more word to explain – distributed. Kafka system internally consists of servers and clients. It uses a high-performance TCP Protocol to provide reliable communication between them. Kafka runs as a cluster on one or multiple servers which can be easily deployed in the cloud or on-prem in single or multiple regions. There are also Kafka Connect servers used for integration with other data sources and other Kafka Clusters. Clients that can be implemented in many programming languages have a special role to read/write and process event streams. The whole ecosystem of Kafka is distributed and of course like every distributed system has a lot of challenges regarding node failures, data loss, and coordination.

What are the basic elements of Apache Kafka?

To understand how Apache Kafka works let first explain the basic elements of the Kafka ecosystem.

Firstly, we should take a look at the event. It has a key, value, timestamp, and optional metadata headers. A key is used not only for identification, but it is used also for routing and aggregation operations for events with the same key.

As you can see in the figure below - if the message has no key attached, then data is sent using a round-robin algorithm. The situation is different when the event has a key attached. Then the events always go to the partition which holds this key. It makes sense from the performance perspective. We usually use ids to get information about objects, and in that case, it is faster to get it from the same broker than to look for it on many brokers.

The value, as you can guess, stores the essence of the event. It contains information about the business change that happened in the system.

There are different types of events:

Unkeyed Event – event in which there is no need to use a key. It describes a single fact of what happened in the system. It could be used for metric purposes.
Entity Event – the most important one. It describes the state of the business object at a given point in time. It must have a unique key, which usually is related to the id of the business object. They are playing the main role in event-driven architectures.
Keyed Event – an event with a key but not related to any business entity. The key is used for aggregation and partitioning.

Topics –storage for events. The analogy to a folder in a filesystem, where the topic is like a folder that organizes what is inside. An example name of the topic, which keeps all orders events in the e-commerce system can be “ orders” . Unlike in other messaging systems, the events stay on the topic after reading. It makes it very powerful and fault-tolerant. It also solves a problem when the consumer will process something with an error and would like to process it again. Topics can always have zero, single, and multiple producers and subscribers.

They are divided into smaller parts called partitions. A partition can be described as a “commit log”. Messages can be appended to the log and can be read only in the order from the beginning to the end. Partitions are designed to provide redundancy and scalability. The most important fact is that partitions can be hosted on different servers (brokers), and that gives a very powerful way to scale topics horizontally.

Producer – client application responsible for the creation of new events on Kafka Topic. The producer is responsible for choosing the topic partition. By default, as we mentioned earlier round-robin is used when we do not provide any key. There is also a way of creating custom business mapping rules to assign a partition to the message.

Consumer – client application responsible for reading and processing events from Kafka. All events are being read by a consumer in the order in which they were produced. Each consumer also can subscribe to more than one topic. Each message on the partition has a unique integer identifier ( offset ) generated by Apache Kafka which is increased when a new message arrives. It is used by the consumer to know from where to start reading new messages. To sum up the topic, partition and offset are used to precisely localize the message in the Apache Kafka system. Managing an offset is the main responsibility for each consumer.

The concept of consumers is easy. But what about the scaling? What if we have many consumers, but we would like to read the message only once? That is why the concept of consumer group was designed. The idea here is when consumer belongs to the same group, it will have some subset of partitions assigned to read a message. That helps to avoid the situation of duplicated reads. In the figure below, there is an example of how we can scale data consumption from the topic. When a consumer is making time-consuming operations, we can connect other consumers to the group, which helps to process faster all new events on the consumer level. We have to be careful though when we have a too-small number of partitions, we would not be able to scale it up. It means if we have more consumers than partitions, they are idle.

But you can ask – what will happen when we add a new consumer to the existing and running group? The process of switching ownership from one consumer to another is called “rebalance.” It is a small break from receiving messages for the whole group. The idea of choosing which partition goes to which consumer is based on the coordinator election problem.

Broker – is responsible for receiving and storing produced events on disk, and it allows consumers to fetch messages by a topic, partition, and offset. Brokers are usually located in many places and joined in a cluster . See the figure below.

Like in every distributed system, when we use brokers we need to have some coordination. Brokers, as you can see, can be run on different servers (also it is possible to run many on a single server). It provides additional complexity. Each broker contains information about partitions that it owns. To be secure, Apache Kafka introduced a dedicated replication for partitions in case of failures or maintenance. The information about how many replicas do we need for a topic can be set for every topic separately. It gives a lot of flexibility. In the figure below, the basic configuration of replication is shown. The replication is based on the leader-follower approach.

Everything is great! We have found all advantages of using Kafka in comparison to more traditional approaches. Now it is time to say something when to use it.

When to use Apache Kafka?

Apache Kafka provides a lot of use cases. It is widely used in many companies, like Uber, Netflix, Activision, Spotify, Slack, Pinterest, Coursera, LinkedIn, etc. We can use it as a:

Messaging system – it can be a good alternative to the existing messaging systems. It has a lot of flexibility in configuration, better throughput, and low end-to-end latency.

Website Activity tracking – it was the original use case for Kafka. Activity tracking on the website generates a high volume of data that we have to process. Kafka provides real-time processing for event-streams, which can be sometimes crucial for the business.

Figure 13 presents a simple use case for web tracking. The web application has a button that generates an event after each click. It is used for real-time analytics. Clients' events that are gathered on TOPIC 1. Partitioning is using user-id so client 1 events (user-id = 0) are stored in partition 0 and client 2 (user-id = 1) are stored in partition 1. The record is appended and offset is incremented on a topic. Now, a subscriber can read a message, and present new data on a dashboard or even use older offset to show some statistics.

Log aggregation – it can be used as an alternative to existing log aggregation solutions. It gives a cleaner way of organizing logs in form of the event streams and what is more, gives a very easy and flexible way to gather logs from many different sources. Comparing to other tools is very fast, durable, and has low end-to-end latency.
Stream processing – is a very flexible way of processing data using data pipelines. Many users are aggregating, enriching, and transforming data into new topics. It is a very quick and convenient way to process all data in real-time.
Event sourcing – is a system design in which immutable events are stored as a single source of truth about the system. A typical use case for event sourcing can be found in bank systems when we are loading the history of transactions. The transaction is represented by an immutable event which contains all data describing what exactly happened in our account.
Commit log – it can be used as an external commit-log for distributed systems. It has a lot of mechanisms that are useful in this use case (like log-compaction, replication, etc.)

Summary

Apache Kafka is a powerful tool used by leading tech enterprises. It offers a lot of use cases, so if we want to use a reliable and durable tool for our data, we should consider Kafka. It provides a loose coupling between producers and subscribers, making our enterprise architecture clean and open to changes. We hope you enjoyed this basic introduction to Apache Kafka and you will try to dig deeper into how it works after this article.

Looking for guidance on implementing Kafka or other event-driven solutions?

Get in touch with us to discuss how we can help.

Sources:

kafka.apache.org/intro
confluent.io/blog/journey-to-event-driven-part-1-why-event-first-thinking-changes-everything/
hackernoon.com/by-2020-50-of-managed-apis-projected-to-be-event-driven-88f7041ea6d8
ably.io/blog/the-realtime-api-family/
confluent.io/blog/changing-face-etl/
infoq.com/articles/democratizing-stream-processing-kafka-ksql-part2/
cqrs.nu/Faq
medium.com/analytics-vidhya/apache-kafka-use-cases-e2e52b892fe1
confluent.io/blog/transactions-apache-kafka/
martinfowler.com/articles/201701-event-driven.html
pluralsight.com/courses/apache-kafka-getting-started#
jaceklaskowski.gitbooks.io/apache-kafka/content/kafka-brokers.html

Bellemare, Adam. Building event-driven microservices: leveraging distributed large-scale data . O'Reilly Media, 2020.

Narkhede, Neha, et al. Kafka: the Definitive Guide: Real-Time Data and Stream Processing at Scale . O'Reilly Media, 2017.

Stopford, Ben. Designing Event-Driven Systems, Concepts and Patterns for Streaming Services with Apache Kafka , O'Reilly Media, 2018.

written by

Tomasz Trojanowski

Automotive

Software development

How to build an Android companion app to control a car with AAOS via Wi-Fi

In this article, we will explore how to create an application that controls HVAC functions and retrieves images from cameras in a vehicle equipped with Android Automotive OS (AAOS) 14.

The phone must be connected to the car's Wi-Fi, and communication between the Head Unit and the phone is required. The Android companion app will utilize the HTTP protocol for this purpose.

In AAOS 14, the Vehicle Hardware Abstraction Layer (VHAL) will create an HTTP server to handle our commands. This functionality is discussed in detail in the article " Exploring the Architecture of Automotive Electronics: Domain vs. Zone ".

Creating the mobile application

To develop the mobile application, we'll use Android Studio. Start by selecting File -> New Project -> Phone and Tablet -> Empty Activity from the menu. This will create a basic Android project structure.

Next, you need to create the Android companion app layout, as shown in the provided screenshot.

Below is the XML code for the example layout:

<?xml version="1.0" encoding="utf-8"?>  <LinearLayout xmlns:android="http://schemas.android.com/apk/res/android" xmlns:app="http://schemas.android.com/apk/res-auto" android:id="@+id/view" android:layout_width="fill_parent" android:layout_height="fill_parent" android:orientation="vertical"> <LinearLayout android:layout_width="match_parent" android:layout_height="match_parent" android:layout_weight="1" android:orientation="vertical"> <Button android:id="@+id/evs" android:layout_width="match_parent" android:layout_height="wrap_content" android:text="EVS ON" /> <LinearLayout android:layout_width="match_parent" android:layout_height="wrap_content" android:orientation="horizontal"> <TextView android:id="@+id/temperatureText" android:layout_width="wrap_content" android:layout_height="wrap_content" android:layout_marginStart="20dp" android:layout_marginTop="8dp" android:layout_marginEnd="20dp" android:text="16.0" android:textSize="60sp" /> <LinearLayout android:layout_width="match_parent" android:layout_height="wrap_content" android:orientation="vertical"> <Button android:id="@+id/tempUp" android:layout_width="match_parent" android:layout_height="wrap_content" android:text="Temperature UP" /> <Button android:id="@+id/tempDown" android:layout_width="match_parent" android:layout_height="wrap_content" android:text="Temperature Down" /> </LinearLayout> </LinearLayout> <Button android:id="@+id/getPhoto" android:layout_width="match_parent" android:layout_height="wrap_content" android:text="GET PHOTO" /> <ImageView android:id="@+id/evsImage" android:layout_width="match_parent" android:layout_height="wrap_content" app:srcCompat="@drawable/grapeup_logo" /> </LinearLayout> <View android:layout_width="fill_parent" android:layout_height="1dp" android:background="@android:color/darker_gray" /> </LinearLayout>

Adding functionality to the buttons

After setting up the layout, the next step is to connect actions to the buttons. Here's how you can do it in your MainActivity :

Button tempUpButton = findViewById(R.id.tempUp); tempUpButton.setOnClickListener(new View.OnClickListener() { @Override public void onClick(View v) { tempUpClicked(); } }); Button tempDownButton = findViewById(R.id.tempDown); tempDownButton.setOnClickListener(new View.OnClickListener() { @Override public void onClick(View v) { tempDownClicked(); } }); Button evsButton = findViewById(R.id.evs); evsButton.setOnClickListener(new View.OnClickListener() { @Override public void onClick(View v) { evsClicked(); } }); Button getPhotoButton = findViewById(R.id.getPhoto); getPhotoButton.setOnClickListener(new View.OnClickListener() { @Override public void onClick(View v) { Log.w("GrapeUpController", "getPhotoButton clicked"); new DownloadImageTask((ImageView) findViewById(R.id.evsImage)) .execute("http://192.168.1.53:8081/"); } });

Downloading and displaying an image

To retrieve an image from the car’s camera, we use the DownloadImageTask class, which downloads a JPEG image in the background and displays it:

private class DownloadImageTask extends AsyncTask<String, Void, Bitmap> { ImageView bmImage; public DownloadImageTask(ImageView bmImage) { this.bmImage = bmImage; } @Override protected Bitmap doInBackground(String... urls) { String urldisplay = urls[0]; Bitmap mIcon11 = null; try { Log.w("GrapeUpController", "doInBackground: " + urldisplay); InputStream in = new java.net.URL(urldisplay).openStream(); mIcon11 = BitmapFactory.decodeStream(in); } catch (Exception e) { Log.e("Error", e.getMessage()); e.printStackTrace(); } return mIcon11; } @Override protected void onPostExecute(Bitmap result) { bmImage.setImageBitmap(result); } }

Adjusting the temperature

To change the car’s temperature, you can implement a function like this:

private void tempUpClicked() { mTemperature += 0.5f; new Thread(new Runnable() { @Override public void run() { doInBackground("http://192.168.1.53:8080/set_temp/" + String.format(Locale.US, "%.01f", mTemperature)); } }).start(); updateTemperature(); }

Endpoint overview

In the above examples, we used two endpoints: http://192.168.1.53:8080/ and http://192.168.1.53:8081/.

The first endpoint corresponds to the AAOS 14 and the server implemented in the VHAL , which handles commands for controlling car functions.

The second endpoint is the server implemented in the EVS Driver application. It retrieves images from the car’s camera and sends them as an HTTP response.

For more information on EVS setup in AAOS, you can refer to the articles " Android AAOS 14 - Surround View Parking Camera: How to Configure and Launch EVS (Exterior View System) " and " Android AAOS 14 - EVS network camera. "

EVS driver photo provider

In our example, the EVS Driver application is responsible for providing the photo from the car's camera. This application is located in the packages/services/Car/cpp/evs/sampleDriver/aidl/src directory. We will create a new thread within this application that runs an HTTP server. The server will handle requests for images using the v4l2 (Video4Linux2) interface.

Each HTTP request will initialize v4l2, set the image format to JPEG, and specify the resolution. After capturing the image, the data will be sent as a response, and the v4l2 stream will be stopped. Below is an example code snippet that demonstrates this process:

#include <errno.h> #include <fcntl.h> #include <linux/videodev2.h> #include <stdint.h> #include <stdio.h> #include <string.h> #include <sys/ioctl.h> #include <sys/mman.h> #include <unistd.h> #include "cpp-httplib/httplib.h" #include <utils/Log.h> #include <android-base/logging.h> uint8_t *buffer; size_t bufferLength; int fd; static int xioctl(int fd, int request, void *arg) { int r; do r = ioctl(fd, request, arg); while (-1 == r && EINTR == errno); if (r == -1) { ALOGE("xioctl error: %d, %s", errno, strerror(errno)); } return r; } int print_caps(int fd) { struct v4l2_capability caps = {}; if (-1 == xioctl(fd, VIDIOC_QUERYCAP, &caps)) { ALOGE("Querying Capabilities"); return 1; } ALOGI("Driver Caps:\n" " Driver: \"%s\"\n" " Card: \"%s\"\n" " Bus: \"%s\"\n" " Version: %d.%d\n" " Capabilities: %08x\n", caps.driver, caps.card, caps.bus_info, (caps.version >> 16) & 0xff, (caps.version >> 24) & 0xff, caps.capabilities); v4l2_format format; format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; format.fmt.pix.pixelformat = V4L2_PIX_FMT_MJPEG; format.fmt.pix.width = 1280; format.fmt.pix.height = 720; LOG(INFO) << __FILE__ << ":" << __LINE__ << " Requesting format: " << ((char*)&format.fmt.pix.pixelformat)[0] << ((char*)&format.fmt.pix.pixelformat)[1] << ((char*)&format.fmt.pix.pixelformat)[2] << ((char*)&format.fmt.pix.pixelformat)[3] << "(" << std::hex << std::setw(8) << format.fmt.pix.pixelformat << ")"; if (ioctl(fd, VIDIOC_S_FMT, &format) < 0) { LOG(ERROR) << __FILE__ << ":" << __LINE__ << " VIDIOC_S_FMT failed " << strerror(errno); } format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; if (ioctl(fd, VIDIOC_G_FMT, &format) == 0) { LOG(INFO) << "Current output format: " << "fmt=0x" << std::hex << format.fmt.pix.pixelformat << ", " << std::dec << format.fmt.pix.width << " x " << format.fmt.pix.height << ", pitch=" << format.fmt.pix.bytesperline; if (format.fmt.pix.pixelformat == V4L2_PIX_FMT_MJPEG) { ALOGI("V4L2_PIX_FMT_MJPEG detected"); } if (format.fmt.pix.pixelformat == V4L2_PIX_FMT_YUYV) { ALOGI("V4L2_PIX_FMT_YUYV detected"); } } else { LOG(ERROR) << "VIDIOC_G_FMT failed"; } return 0; } int init_mmap(int fd) { struct v4l2_requestbuffers req{}; req.count = 1; req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; req.memory = V4L2_MEMORY_MMAP; if (-1 == xioctl(fd, VIDIOC_REQBUFS, &req)) { perror("Requesting Buffer"); return 1; } struct v4l2_buffer buf{}; buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; buf.memory = V4L2_MEMORY_MMAP; buf.index = 0; if (-1 == xioctl(fd, VIDIOC_QUERYBUF, &buf)) { perror("Querying Buffer"); return 1; } buffer = (uint8_t *)mmap(NULL, buf.length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, buf.m.offset); bufferLength = buf.length; ALOGI("Length: %d\nAddress: %p\n", buf.length, buffer); ALOGI("Image Length: %d\n", buf.bytesused); return 0; } size_t capture_image(int fd) { struct v4l2_buffer buf{}; buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; buf.memory = V4L2_MEMORY_MMAP; buf.index = 0; if (-1 == xioctl(fd, VIDIOC_QBUF, &buf)) { perror("Query Buffer"); return 0; } if (-1 == xioctl(fd, VIDIOC_STREAMON, &buf.type)) { perror("Start Capture"); return 0; } fd_set fds; FD_ZERO(&fds); FD_SET(fd, &fds); struct timeval tv{}; tv.tv_sec = 2; int r = select(fd + 1, &fds, NULL, NULL, &tv); if (-1 == r) { perror("Waiting for Frame"); return 0; } if (-1 == xioctl(fd, VIDIOC_DQBUF, &buf)) { perror("Retrieving Frame"); return 0; } return buf.bytesused; } bool initGetPhoto() { fd = open("/dev/video0", O_RDWR); if (fd == -1) { perror("Opening video device"); return false; } if (print_caps(fd)) return false; if (init_mmap(fd)) return false; return true; } bool closeGetPhoto() { int type = V4L2_BUF_TYPE_VIDEO_CAPTURE; if (ioctl(fd, VIDIOC_STREAMOFF, &type) == -1) { perror("VIDIOC_STREAMOFF"); } // Tell the L4V2 driver to release our streaming buffers v4l2_requestbuffers bufrequest; bufrequest.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; bufrequest.memory = V4L2_MEMORY_MMAP; bufrequest.count = 0; ioctl(fd, VIDIOC_REQBUFS, &bufrequest); close(fd); return true; } void getPhotoTask() { ALOGI("getPhotoTask starting "); ALOGI("HTTPServer starting "); httplib::Server svr; svr.Get("/", [](const httplib::Request &, httplib::Response &res) { ALOGI("HTTPServer New request /"); bool result = initGetPhoto(); ALOGI("initGetPhoto %b", result); size_t imgSize = capture_image(fd); ALOGI("capture_image %zu", imgSize); closeGetPhoto(); res.set_content((char *)buffer, imgSize, "image/jpeg"); }); ALOGI("HTTPServer listen"); svr.listen("0.0.0.0", 8081); }

How the code works

1. Initialization : The initGetPhoto() function opens the video device (/dev/video0) and sets up the necessary format and memory mappings for capturing images using the v4l2 interface.

2. Image Capture : The capture_image() function captures an image from the video stream. It uses select() to wait for the frame and then dequeues the buffer containing the image.

3. HTTP Server : The getPhotoTask() function starts an HTTP server using the cpp-httplib library. When a request is received, the server initializes the camera, captures an image, and sends it as a JPEG response.

4. Cleanup : After capturing the image and sending it, the closeGetPhoto() function stops the video stream, releases the buffers, and closes the video device.

This setup ensures that each image is captured on demand, allowing the application to control when the camera is active and minimizing unnecessary resource usage.

Conclusion

In this article, we walked through the process of creating an Android companion app that allows users to control HVAC functions and retrieve images from a car's camera system using a simple HTTP interface. The application was developed in Android Studio, where we designed a user-friendly interface and implemented functionality to adjust the vehicle's temperature and capture images remotely. On the server side, we extended the EVS Driver by incorporating a custom thread to handle HTTP requests and capture images using v4l2, providing a basic yet effective solution for remote vehicle interaction.

This project serves as a conceptual demonstration of integrating smartphone-based controls with automotive systems, but it’s important to recognize that there is significant potential for improvement and expansion. For instance, enhancing the data handling layer to provide more robust error checking, utilizing the HTTP/2 protocol for faster and more efficient communication, and creating a more seamless integration with the EVS Driver could greatly improve the performance and reliability of the system.

In its current form, this solution offers a foundational approach that could be expanded into a more sophisticated application, capable of supporting a wider range of automotive functions and delivering a more polished user experience. Future developments could also explore more advanced security features, improved data formats, and tighter integration with the broader ecosystem of Android Automotive OS to fully leverage the capabilities of modern vehicles.

written by

Michał Jaskurzyński

Software development

How to build real-time notification service using Server-Sent Events (SSE)

Most of the communication on the Internet comes directly from the clients to the servers. The client usually sends a request, and the server responds to that request. It is known as a client-server model, and it works well in most cases. However, there are some scenarios in which the server needs to send messages to the clients. In such cases, we have a couple of options: we can use short and long polling, webhooks, websockets, or event streaming platforms like Kafka. However, there is another technology, not popularized enough, which in many cases, is just perfect for the job. This technology is the Server-Sent Events (SSE) standard.

Learn more about services provided by Grape Up

You are at Grape Up blog, where our experts share their expertise gathered in projects delivered for top enterprises. See how we work.

Enabling the automotive industry to build software-defined vehicles
Empowering insurers to create insurance telematics platforms
Providing AI & advanced analytics consulting

What are Server-Sent Events?

SSE definition states that it is an http standard that allows a web application to handle a unidirectional event stream and receive updates whenever the server emits data. In simple terms, it is a mechanism for unidirectional event streaming.

Browsers support

It is currently supported by all major browsers except Internet Explorer.

Message format

The events are just a stream of UTF-8 encoded text data in a format defined by the Specification. The important aspect here is that the format defines the fields that the SSE message should have, but it does not mandate a specific type for the payload, leaving the freedom of choice to the users.

{
"id": "message id <optional>",
"event": "event type <optional>",
"data": "event data –plain text, JSON, XML… <mandatory>"
}

SSE Implementation

For the SSE to work, the server needs to tell the client that the response’s content-type is text/eventstream . Next, the server receives a regular HTTP request, and leaves the HTTP connection open until no more events are left or until the timeout occurs. If the timeout occurs before the client receives all the events it expects, it can use the built-in reconnection mechanism to reestablish the connection.

Simple endpoint (Flux):

The simplest implementation of the SSE endpoint in Spring can be achieved by:

Specifying the produced media type as text/event-stream,
Returning Flux type, which is a reactive representation of a stream of events in Java.

@GetMapping(path = "/stream-flux", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> streamFlux() {
return Flux.interval(Duration.ofSeconds(1))
.map(sequence -> "Flux - " + LocalTime.now().toString());

ServerSentEvent class:

Spring introduced support for SSE specification in version 4.2 together with a ServerSentEvent class. The benefit here is that we can skip the text/event-stream media type explicit specification, as well as we can add metadata such as id or event type.

@GetMapping("/sse-flux-2")
public Flux<ServerSentEvent> sseFlux2() {
return Flux.interval(Duration.ofSeconds(1))
.map(sequence -> ServerSentEvent.builder()
.id(String.valueOf(sequence))
.event("EVENT_TYPE")
.data("SSE - " + LocalTime.now().toString())
.build());
}

SseEmitter class:

However, the full power of SSE comes with the SseEmitter class. It allows for asynchronous processing and publishing of the events from other threads. What is more, it is possible to store the reference to SseEmitter and retrieve it on subsequent client calls. This provides a huge potential for building powerful notification scenarios.

@GetMapping("/sse-emitter")
public SseEmitter sseEmitter() {
SseEmitter emitter = new SseEmitter();
Executors.newSingleThreadExecutor().execute(() -> {
try {
for (int i = 0; true; i++) {
SseEmitter.SseEventBuilder event = SseEmitter.event()
.id(String.valueOf(i))
.name("SSE_EMITTER_EVENT")
.data("SSE EMITTER - " + LocalTime.now().toString());
emitter.send(event);
Thread.sleep(1000);
}
} catch (Exception ex) {
emitter.completeWithError(ex);
}
});
return emitter;
}

Client example:

Here is a basic SSE client example written in Javascript. It simply defines an EventSource and subscribes to the message event stream in two different ways.

// Declare an EventSource
const eventSource = new EventSource('http://some.url');
// Handler for events without an event type specified
eventSource.onmessage = (e) => {
// Do something - event data etc will be in e.data
};
// Handler for events of type 'eventType' only
eventSource.addEventListener('eventType', (e) => {
// Do something - event data will be in e.data,
// message will be of type 'eventType'
});

SSE vs. Websockets

When it comes to SSE, it is often compared to Websockets due to usage similarities between both of the technologies.

Both are capable of pushing data to the client,
Websockets are bidirectional – SSE unidirectional,
In practice, everything that can be done with SSE, and can also be achieved with Websockets,
SSE can be easier,
SSE is transported over a simple HTTP connection,
Websockets require full duplex-connection and servers to handle the protocol,
Some enterprise firewalls with packet inspection have trouble dealing with Websockets – for SSE that’s not the case,
SSE has a variety of features that Websockets lack by design, e.g., automatic reconnection, event ids,
Only Websockets can send both binary and UTF-8 data, SSE is limited to UTF-8,
SSE suffers from a limitation to the maximum number of open connections (6 per browser + domain). The issue was marked as Won’t fix in Chrome and Firefox.

Use Cases:

Notification Service Example:

A controller providing a subscribe to events and a publish events endpoints.

@Slf4j
@RestController
@RequestMapping("/events")
@RequiredArgsConstructor
public class EventController {
public static final String MEMBER_ID_HEADER = "MemberId";

private final EmitterService emitterService;
private final NotificationService notificationService;

@GetMapping
public SseEmitter subscribeToEvents(@RequestHeader(name = MEMBER_ID_HEADER) String memberId) {
log.debug("Subscribing member with id {}", memberId);
return emitterService.createEmitter(memberId);
}

@PostMapping
@ResponseStatus(HttpStatus.ACCEPTED)
public void publishEvent(@RequestHeader(name = MEMBER_ID_HEADER) String memberId, @RequestBody EventDto event) {
log.debug("Publishing event {} for member with id {}", event, memberId);
notificationService.sendNotification(memberId, event);
}
}

A service for sending the events:

@Service
@Primary
@AllArgsConstructor
@Slf4j
public class SseNotificationService implements NotificationService {

private final EmitterRepository emitterRepository;
private final EventMapper eventMapper;

@Override
public void sendNotification(String memberId, EventDto event) {
if (event == null) {
log.debug("No server event to send to device.");
return;
}
doSendNotification(memberId, event);
}

private void doSendNotification(String memberId, EventDto event) {
emitterRepository.get(memberId).ifPresentOrElse(sseEmitter -> {
try {
log.debug("Sending event: {} for member: {}", event, memberId);
sseEmitter.send(eventMapper.toSseEventBuilder(event));
} catch (IOException | IllegalStateException e) {
log.debug("Error while sending event: {} for member: {} - exception: {}", event, memberId, e);
emitterRepository.remove(memberId);
}
}, () -> log.debug("No emitter for member {}", memberId));
}
}

To sum up, Server-Sent Events standard is a great technology when it comes to a unidirectional stream of data and often can save us a lot of trouble compared to more complex approaches such as Websockets or distributed streaming platforms.

A full notification service example implemented with the use of Server-Sent Events can be found on my github: https://github.com/mkapiczy/server-sent-events

If you're looking to build a scalable, real-time notification system or need expert guidance on modern software solutions, Grape Up can help . Our engineering teams help enterprises design, develop, and optimize their software infrastructure.

Get in touch to discuss your project and see how we can support your business.

Sources:

https://www.baeldung.com/spring-server-sent-events
https://www.w3.org/TR/eventsource/
https://stackoverflow.com/questions/5195452/websockets-vs-server-sent-events-eventsource
https://www.telerik.com/blogs/websockets-vs-server-sent-events
https://simonprickett.dev/a-look-at-server-sent-events/

written by

Grape up Expert

Automotive

Software development

Generative AI for connected cars: Solution-oriented chatbots for personalized user support

Generative AI is becoming a major player in automotive innovation. The market is already valued at USD 480.22 million in 2024 , and it’s expected to grow to USD 3,900.03 million by 2034, with a steady annual growth rate of 23.3%. Moreover, by 2025, the global automobile sector will invest $11.1 billion in cognitive and AI technologies. These numbers show how quickly the industry is picking up on this technology’s potential.

GenAI is making its mark across various areas. From manufacturing optimization to autonomous driving, its impact is undeniable. Predictive maintenance systems identify issues early, AI-powered tools optimize vehicle development, and talking to in-car assistants is starting to feel like a scene out of a sci-fi movie.

Speaking of sci-fi, pop culture has always loved the idea of talking cars. There is K.I.T.T. (Knight Industries Two Thousand), of course, but also all Transformers and tons of cartoons, starting with Lightning McQueen. Is it just pure fiction? Not at all (except McQueen, for many reasons 😊)! Early attempts at smarter cars started with examples like a 2004 Honda offering voice-controlled navigation and Ford’s 2007 infotainment system. Fast forward to now, and we have a VW Golf with a GPT-based assistant that’s more conversational than ever.

But honestly, the most resourceful one is K.I.T.T. – it activates all onboard systems, diagnoses itself, and uses company resources (there is an episode when K.I.T.T. withdraws money from the company bank account using an ATM). In 1982, when the show first aired, it was just pure science fiction. But what about now? Is it more science or fiction? With Generative AI growing rapidly in automotive, we have to revisit that question.

Let’s break it down!

Prerequisites

Let’s assume we would like to create a solution-oriented chatbot connected with a car. By “solution-oriented,” I mean one that is really useful, able not only to change the attractive interior lighting but also to truly solve owners’ issues.

The idea is to use Generative AI, a large language model with its abilities in reasoning, problem-solving, and language processing.

Therefore, the first question is – where should the model be planted – in the cloud or a car?

For the first option, you need a constant Internet connection (which is usually not guaranteed in cars). In contrast, the second option typically involves a smaller and less versatile model, and you still need a lot of resources (hardware, power) to run it. The truth lies, as usual, in between (cloud model if available, local one otherwise), but today we’ll focus on the cloud model only.

The next step is to consider the user-facing layer. The perfect one is integrated into the car, isn’t it? Well, in most cases, yes, but there are some drawbacks.

The first issue is user-oriented – if you want to interact with your car when being outside of it, your mobile phone is probably the most convenient option (or a smartwatch, like Michael from Knight Rider). Also, infotainment systems are comprehensively tested and usually heavily sealed into cars, so introducing such a bot is very time-consuming. Therefore, the mobile phone is our choice.

We don’t want to focus on this application today, however. Depending on the target operating system, it probably should use speech-to-text recognition and text-to-speech generation and stream data both ways for a better user experience.

The core part is the chatbot backend – a regular application connecting the frontend and the LLM. It should be able to call external APIs and use two sources of knowledge – live car data and company-owned data sources.

Basics

Let’s gather the components. There is a customer-facing layer – the mobile application; then there is our main backend application, the LLM, of course, and some services to provide data and functionalities.

The diagram above is conceptual, of course. The backend is probably cloud-hosted, too, and cloud services linked to car services form the essence of the “connected cars” pattern.

The main concept for the application is “tool calling” – the LLM ability to call predefined functions with structuralized arguments. That’s why the backend is surrounded by different services. In a perfect world, those should be separated microservices designed for different use cases. However, this architecture is not scenario-based. There is no “if-else-if” ladder or so. The LLM determines how to utilize the tools based on its own decision-making process.

The sample conversation schema might look like the one presented below.

As you can see, the chatbot service calls the LLM, and the LLM returns command “call function A.” Then, the service calls the function and returns the response to the LLM (not the user!).

This approach is very flexible as functions (a.k.a. tools) might execute actions and return useful data. Also, the LLM may decide to use a function based on another function result. In the case above, it can, for example, use one function to check the climate control system status and discover that it’s running in the “eco mode”. Then, it might decide to call the “set mode” function with the argument “max AC” to change the mode. After that, the LLM can return an answer to the user with a message like “It should be fixed now”.

To build such an application, all you need to call the LLM like that (OpenAI GPT-4o example):

{ "model": "gpt-4o", "messages": [ { "role": "user", "content": "My AC is ineffective! Fix it!" } ], "tools": [ { "type": "function", "function": { "name": "get AC status", "description": "Return current status of the climate control system" } }, { "type": "function", "function": { "name": "set AC mode", "description": "Sets up the specified mode for the climate control system", "parameters": { "type": "object", "properties": { "mode": { "type": "string", "description": "Desired mode", "enum": ["ECO", “NORMAL”, "MAX AC"] } }, "required": ["mode"] } } } ], "tool_choice": "auto" }

As you can see, the response schema does not bother us here – the assumption is that the LLM is able to understand any reasonable response.

Dive

The subtitle should be a “deep dive”, but honestly, we’re just scratching the surface today. Nevertheless, let’s focus a little bit more.

So far, we have the user-facing application and the backend service. Now, let’s make it useful.

The AC example mentioned above is perfectly valid, but how can it be achieved? Let’s say there is an API for interaction with the AC in the car. It’s typical for all PHEVs and EVs and available for some HEVs, too, when you can turn on your AC remotely via the mobile app. However, the real value lies in the connected car

There is no IP address of the car hardcoded in the application. Usually, there is a digital twin in the cloud (a cloud service that represents the car). The application calls the twin, and the twin notifies the vehicle. There should also be some pub/sub queue in between to handle connectivity tier disruptions. Also, the security layer is extremely important. We don’t want anybody even to play the radio at max volume during a quiet night ride, not to mention turning off the lights or engaging breaks.

Which brings us to the list of possible actions.

Let’s assume all systems in the car are somehow connected, maybe using a common bus or a more modern ethernet-like network. Still, some executors, such as brakes, should be isolated from the system.

So, there is no “brake API” to stop a car. However, it may be beneficial for mechanics to execute some "dangerous" actions programmatically, e.g., to increase the pressure in the braking system without actually pressing the pedal. If this is the case, such functionalities should be accessible exclusively through a local connection without the need for digital twin integration. Therefore, we can assume there are two systems in the car – local and cloud-integrated, no matter the nature of the isolation (physical, network, or software). Let’s focus on the connected car aspect.

I believe the system should be able to change the vehicle settings, even if there is a risk that the driver could be surprised by an unauthorized change in the steering feel while taking a turn. This way, the chatbot might be useful and reduce support load by adjusting car settings based on the user's preferences. To avoid misusage, we can instruct the chatbot by prompt engineering to confirm each change with the user before execution and, of course, implement best-in-class security for all components. We can also allow certain operations only if the car is parked.

Which brings us back to the list of possible actions.

For the sake of this article, let’s assume the chatbot can change various car settings. Examples include:

Climate control settings
Driver assistant sensitivity and specific functions toggles
Navigation System settings, like route type or other functions toggles
360 camera system settings, like brightness adjustment
Sound system settings like equalizer
Wiper settings
Notifications settings
Active steering system settings

This list is not complete, and the best thing is – it doesn’t need to be, as adding new functions (tool definition + API availability) might be a part of the future system OVA update.

What about reading real-time data? Should we connect to the car directly and read the status? Let’s leave this option for another article 😉 and focus on communication via the cloud.

There are two possibilities.

We can provide more tools to get data per source/component (a reminder – LLM decides to call for data, which then triggers an API call, and the LLM processes the received response). Alternatively, we could implement a single tool, “get vehicle data,” that collects and merges all data available from all data sources.

For the latter approach, two ways are available – do we really need a tool? Maybe we should inject the current state into each conversation, as it’s probably beneficial to have the current state anyway to solve all cases?

Let me give the standard consultant reply to those questions.

It depends.

More data in the context means extended response time and a higher bill. Also, some cases don’t need the entire context or don’t need it at all. On the other hand, if you let the LLM decide which part of the context is necessary (which function to call), it will also affect the time and cost.

The next part is the “cost” of collecting data. Some data sources might be slow or consume valuable in-car resources to calculate (maybe not appropriate) data.

My advice would be to start collecting all data in the background during session creation and attach it to the session progressively when ready. Additionally, provide the LLM with the capability to wait for data if it has not yet been supplied. This should be implemented as a function to handle instances where the user requests data-related action, but the necessary data is not yet available

The second part is splitting the data. You can use a much smaller LLM to decide which context data is necessary to answer the current question and remove unnecessary information to keep it concise.

The last part is company data. All LLM models are trained based on thousands of manuals available over the Internet, but this is usually not sufficient. Your chatbot should have the car manual at its fingertips, so the general knowledge is not enough. Therefore, you need to provide it. You can execute the RAG procedure and add some content to the context for each prompt, but my advice would be to hide it behind a function, too. Then, the LLM may decide when to call it to get some extra information if needed.

Use cases of Generative AI for connected cars

OK, we have all the components in place – there is the frontend, the backend, the model, and the toolbox. Can it be useful? As you would expect!

In the examples below, I will skip the part where the LLM commands the backend to perform an action, and the backend calls an API. I will also omit the process of the API response traveling back through the backend to the LLM. Additionally, I will assume that no new data is added to the context, so the LLM must decide which data it needs. Besides that, make sure that your context contains some system prompts to better explain all possibilities to the assistant.

Here are some examples.

User: I think my Lane Keeping Assist activates too frequently. Can you check it?
Assistant: [get driver assistance settings]
Assistant: [diagnose sensors]
Assistant: Everything looks good, but I can see that the Lane Keeping Assist mode is set to “fast”. Do you want me to lower the setting to “medium”?
User: Yes, please.
Assistant: [change LKA settings]
Assistant: It should be better now.

In the example above, the user has a problem with one of the driver assistance components. The system proposes and, after confirmation, executes settings changes, so the driver doesn’t need to find a proper setting in the on-board computer.

Check another case.

User: Whenever I’m parking the car in my garage, I can barely see anything in my rear camera. There is not much light there, but maybe you help me in some other way?
Assistant: [get camera settings]
Assistant: I can increase the brightness of the camera, but then it might be too bright for other places. What do you think?
User: Yes, please do that.
Assistant: [change settings]
Assistant: Here you are; let me know if you want me to roll back the change.

Usually, there is no brightness setting available for the user. To adjust it, the user would need to schedule a service department meeting, spend some time, and pay for the facility. The assistant may do it in a few seconds for free.

User: There are so many notifications in the car, so I can’t hear the important ones!
Assistant: [check last notifications]
Assistant: If you find some of my notifications unnecessary, I can mute some of them. What do you think?
User: Yes, please turn off the speeding avoidance warning.
Assistant: I can’t turn it off due to legal requirements, but I can replace the sound with vibration. feedback on the acceleration pedal. Do you want me to do that?
User: Sounds great!
Assistant: [adjust settings]
Assistant: It’s done. I hope you’ll be less distracted.

In this case, the assistant solves the issue of annoying notifications. Note the alternative solution suggestion and knowledge about the legal aspect. We can achieve it with proper function descriptions.

User: It seems my car can’t connect to the Internet.
Assistant: [check the connection] -> no connection to the car from the cloud service
Assistant: It seems your car is disconnected. Can you please check the cellular network signal icon on the infotainment main screen?
User: I can see the signal is good.
Assistant: [check last logs]
Assistant: I can see there were some issues with your connection before. Please try to open the settings in the car infotainment and restart the communication module. I’ll guide you through the process.
User: OK, what should I do?
Assistant: [check user manual]
Assistant: Go to “Settings”, then “Connectivity”, then “Cellular” and then click “Restart module”. Confirm when asked.
User: OK
Assistant: [check the connection]
Assistant: I can see the connection is restored. Can you confirm that? If not, I can schedule a service meeting for you.
User: No need; it works now!

In the last example, the Assistant guides the user using logs, the manual, and after checking real-time conditions. Without the LLM, the user would need to visit the dealer or at least use online support assistance.

Final words

With modern technology, we can bring some science fiction concepts to real life. Maybe WARP speed is not possible yet, but at least a car itself can give us solutions to problems recently solved only in much more expensive ways.

The beauty of it is – it’s really simple. Of course, there is a user-facing application to be made, the security to be ensured, and a bunch of functions to be implemented, but with modern connected cars, those elements are in place anyway. The tricky, novel part is a very simple application that uses the LLM as the brain power of the system.

As you can see the “ Attention is all you need ” paper that started the LLM revolution has allowed humanity to bring to life concepts present in our culture for decades. On the other hand, would this article have been ever written if its authors hadn’t watched the K.I.T.T. in their childhood? We will never know.

written by

Damian Petrecki

Legacy modernization

Software development

LLM comparison: Find the best fit for legacy system rewrites

Legacy systems often struggle with performance, are vulnerable to security issues, and are expensive to maintain. Despite these challenges, over 65% of enterprises still rely on them for critical operations.

At the same time, modernization is becoming a pressing business need, with the application modernization services market valued at $17.8 billion in 2023 and expected to grow at a CAGR of 16.7%.

This growth highlights a clear trend: businesses recognize the need to update outdated systems to keep pace with industry demands.

The journey toward modernization varies widely. While 75% of organizations have started modernization projects, only 18% have reached a state of continuous improvement.

Data source: https://www.redhat.com/en/resources/app-modernization-report

For many, the process remains challenging, with a staggering 74% of companies failing to complete their legacy modernization efforts. Security and efficiency are the primary drivers, with over half of surveyed companies citing these as key motivators.

Given these complexities, the question arises: Could Generative AI simplify and accelerate this process?

With the surging adoption rates of AI technology, it’s worth exploring if Generative AI has a role in rewriting legacy systems.

This article explores LLM comparison, evaluating GenAI tools' strengths, weaknesses, and potential risks. The decision to use them ultimately lies with you.

Here's what we'll discuss:

Why Generative AI?
The research methodology
Generative AI tools: six contenders for LLM comparison
- OpenAI backed by ChatGPT-4o
- Claude-3-sonnet
- Claude-3-opus
- Claude-3-haiku
- Gemini 1.5 Flash
- Gemini 1.5 Pro
Comparison summary

Why Generative AI?

Traditionally, updating outdated systems has been a labor-intensive and error-prone process. Generative AI offers a solution by automating code translation, ensuring consistency and efficiency. This accelerates the modernization of legacy systems and supports cross-platform development and refactoring.

As businesses aim to remain competitive, using Generative AI for code transformation is crucial, allowing them to fully use modern technologies while reducing manual rewrite risks.

Here are key reasons to consider its use:

Uncovering dependencies and business logic - Generative AI can dissect legacy code to reveal dependencies and embedded business logic, ensuring essential functionalities are retained and improved in the updated system.

Decreased development time and expenses - automation drastically reduces the time and resources required for system re-writing. Quicker development cycles and fewer human hours needed for coding and testing decrease the overall project cost.

Consistency and accuracy - manual code translation is prone to human error. AI models ensure consistent and accurate code conversion, minimizing bugs and enhancing reliability.

Optimized performance - Generative AI facilitates the creation of optimized code from the beginning, incorporating advanced algorithms that enhance efficiency and adaptability, often lacking in older systems.

The LLM comparison research methodology

It could be tough to compare different Generative AI models to each other. It’s hard to find the same criteria for available tools. Some are web-based, some are restricted to a specific IDE, some offer a “chat” feature, and others only propose a code.

As our goal was the re-writing of existing projects , we aimed to create an LLM comparison based on the following six main challenges while working with existing code:

Analyzing project architecture - understanding the architecture is crucial for maintaining the system's integrity during re-writing. It ensures the new code aligns with the original design principles and system structure.

Analyzing data flows - proper analysis of data flows is essential to ensure that data is processed correctly and efficiently in the re-written application. This helps maintain functionality and performance.

Generating historical b acklog - this involves querying the Generative AI to create Jira (or any other tracking system) tickets that could potentially be used to rebuild the system from scratch. The aim is to replicate the workflow of the initial project implementation. These "tickets" should include component descriptions and acceptance criteria.

Converting code from one programming language to another - language conversion is often necessary to leverage modern technologies. Accurate translation preserves functionality and enables integration with contemporary systems.

Generating new code - the ability to generate new code, such as test cases or additional features, is important for enhancing the application's capabilities and ensuring comprehensive testing.

Privacy and security of a Generative AI tool - businesses are concerned about sharing their source codebase with the public internet. Therefore, work with Generative AI must occur in an isolated environment to protect sensitive data.

Source projects overview

To test the capabilities of Generative AI, we used two projects:

Simple CRUD application - The project utilizes .Net Core as its framework, with Entity Framework Core serving as the ORM and SQL Server as the relational database. The target application is a backend system built with Java 17 and Spring Boot 3.

Microservice-based application - The application is developed with .Net Core as its framework, Entity Framework Core as the ORM, and the Command Query Responsibility Segregation (CQRS) pattern for handling entity operations. The target system includes a microservice-based backend built with Java 17 and Spring Boot 3, alongside a frontend developed using the React framework

Generative AI tools: six contenders for LLM comparison

In this article, we will compare six different Generative AI tools used in these example projects:

OpenAI backed by ChatGPT-4o with a context of 128k tokens
Claude-3-sonnet - context of 200k tokens
Claude-3-opus - context of 200k tokens
Claude-3-haiku - context of 200k tokens
Gemini 1.5 Flash - context of 1M tokens
Gemini 1.5 Pro - context of 2M tokens

OpenAI

OpenAI's ChatGPT-4o represents an advanced language model that showcases the leading edge of artificial intelligence technology. Known for its conversational prowess and ability to manage extensive contexts, it offers great potential for explaining and generating code.

Analyzing project architecture

ChatGPT faces challenges in analyzing project architecture due to its abstract nature and the high-level understanding required. The model struggles with grasping the full context and intricacies of architectural design, as it lacks the ability to comprehend abstract concepts and relationships not explicitly defined in the code.

Analyzing data flows

ChatGPT performs better at analyzing data flows within a program. It can effectively trace how data moves through a program by examining function calls, variable assignments, and other code structures. This task aligns well with ChatGPT's pattern recognition capabilities, making it a suitable application for the model.

Generating historical backlog

When given a project architecture as input, OpenAI can generate high-level epics that capture the project's overall goals and objectives. However, it struggles to produce detailed user stories suitable for project management tools like Jira, often lacking the necessary detail and precision for effective use.

Converting code from one programming language to another

ChatGPT performs reasonably well in converting code, such as from C# to Java Spring Boot, by mapping similar constructs and generating syntactically correct code. However, it encounters limitations when there is no direct mapping between frameworks, as it lacks the deep semantic understanding needed to translate unique framework-specific features.

Generating new code

ChatGPT excels in generating new code, particularly for unit tests and integration tests. Given a piece of code and a prompt, it can generate tests that accurately verify the code's functionality, showcasing its strength in this area.

Privacy and security of the Generative AI tool

OpenAI's ChatGPT, like many cloud-based AI services, typically operates over the internet. However, there are solutions to using it in an isolated private environment without sharing code or sensitive data on the public internet. To achieve this, on-premise deployments such as Azure OpenAI can be used, a service offered by Microsoft where OpenAI models can be accessed within Azure's secure cloud environment.

Best tip

Use Reinforcement Learning from Human Feedback (RLHF): If possible, use RLHF to fine-tune GPT-4. This involves providing feedback on the AI's outputs, which it can then use to improve future outputs. This can be particularly useful for complex tasks like code migration.

Overall

OpenAI's ChatGPT-4o is a mature and robust language model that provides substantial support to developers in complex scenarios. It excels in tasks like code conversion between programming languages, ensuring accurate translation while maintaining functionality.

Possibilities 3/5
Correctness 3/5
Privacy 5/5
Maturity 4/5

Overall score: 4/5

Claude-3-sonnet

Claude-3-Sonnet is a language model developed by Anthropic, designed to provide advanced natural language processing capabilities. Its architecture is optimized for maintaining context over extended interactions, offering a balance of intelligence and speed.

Analyzing project architecture

Claude-3-Sonnet excels in analyzing and comprehending the architecture of existing projects. When presented with a codebase, it provides detailed insights into the project's structure, identifying components, modules, and their interdependencies. Claude-3-Sonnet offers a comprehensive breakdown of project architecture, including class hierarchies, design patterns, and architectural principles employed.

Analyzing data flows

It struggles to grasp the full context and nuances of data flows, particularly in complex systems with sophisticated data transformations and conditional logic. This limitation can pose challenges when rewriting projects that heavily rely on intricate data flows or involve sophisticated data processing pipelines, necessitating manual intervention and verification by human developers.

Generating historical backlog

Claude-3-Sonnet can provide high-level epics that cover main functions and components when prompted with a project's architecture. However, they lack detailed acceptance criteria and business requirements. While it may propose user stories to map to the epics, these stories will also lack the details needed to create backlog items. It can help capture some user goals without clear confirmation points for completion.

Converting code from one programming language to another

Claude-3-Sonnet showcases impressive capabilities in converting code, such as translating C# code to Java Spring Boot applications. It effectively translates the logic and functionality of the original codebase into a new implementation, leveraging framework conventions and best practices. However, limitations arise when there is no direct mapping between frameworks, requiring additional manual adjustments and optimizations by developers.

Generating new code

Claude-3-Sonnet demonstrates remarkable proficiency in generating new code, particularly in unit and integration tests. The AI tool can analyze existing codebases and automatically generate comprehensive test suites covering various scenarios and edge cases.

Privacy and security of the Generative AI tool

Unfortunately, Anthropic's privacy policy is quite confusing. Before January 2024, they used clients’ data to train their models. The updated legal document ostensibly provides protections and transparency for Anthropic's commercial clients, but it’s recommended to consider the privacy of your data while using Claude.

Best tip

Be specific and detailed : provide the GenerativeAI with specific and detailed prompts to ensure it understands the task accurately. This includes clear descriptions of what needs to be rewritten, any constraints, and desired outcomes.

Overall

The model's ability to generate coherent and contextually relevant content makes it a valuable tool for developers and businesses seeking to enhance their AI-driven solutions. However, the model might have difficulty fully grasping intricate data flows, especially in systems with complex transformations and conditional logic.

Possibilities 3/5
Correctness 3/5
Privacy 3/5
Maturity 3/5

Overall score: 3/5

Claude-3-opus

Claude-3-Opus is another language model by Anthropic, designed for handling more extensive and complex interactions. This version of Claude models focuses on delivering high-quality code generation and analysis with high precision.

Analyzing project architecture

With its advanced natural language processing capabilities, it thoroughly examines the codebase, identifying various components, their relationships, and the overall structure. This analysis provides valuable insights into the project's design, enabling developers to understand the system's organization better and make decisions about potential refactoring or optimization efforts.

Analyzing data flows

While Claude-3-Opus performs reasonably well in analyzing data flows within a project, it may lack the context necessary to fully comprehend all possible scenarios. However, compared to Claude-3-sonnet, it demonstrates improved capabilities in this area. By examining the flow of data through the application, it can identify potential bottlenecks, inefficiencies, or areas where data integrity might be compromised.

Generating historical backlog

By providing the project architecture as an input prompt, it effectively creates high-level epics that encapsulate essential features and functionalities. One of its key strengths is generating detailed and precise acceptance criteria for each epic. However, it may struggle to create granular Jira user stories. Compared to other Claude models, Claude-3-Opus demonstrates superior performance in generating historical backlog based on project architecture.

Converting code from one programming language to another

Claude-3-Opus shows promising capabilities in converting code from one programming language to another, particularly in converting C# code to Java Spring Boot, a popular Java framework for building web applications. However, it has limitations when there is no direct mapping between frameworks in different programming languages.

Generating new code

The AI tool demonstrates proficiency in generating both unit tests and integration tests for existing codebases. By leveraging its understanding of the project's architecture and data flows, Claude-3-Opus generates comprehensive test suites, ensuring thorough coverage and improving the overall quality of the codebase.

Privacy and security of the Generative AI tool

Like other Anthropic models, you need to consider the privacy of your data. For specific details about Anthropic's data privacy and security practices, it would be better to contact them directly.

Best tip

Break down the existing project into components and functionality that need to be recreated. Reducing input complexity minimizes the risk of errors in output.

Overall

Claude-3-Opus's strengths are analyzing project architecture and data flows, converting code between languages, and generating new code, which makes the development process easier and improves code quality. This tool empowers developers to quickly deliver high-quality software solutions.

Possibilities 4/5
Correctness 4/5
Privacy 3/5
Maturity 4/5

Overall score: 4/5

Claude-3-haiku

Claude-3-Haiku is part of Anthropic's suite of Generative AI models, declared as the fastest and most compact model in the Claude family for near-instant responsiveness. It excels in answering simple queries and requests with exceptional speed.

Analyzing project architecture

Claude-3-Haiku struggles with analyzing project architecture. The model tends to generate overly general responses that closely resemble the input data, limiting its ability to provide meaningful insights into a project's overall structure and organization.

Analyzing data flows

Similar to its limitations in project architecture analysis, Claude-3-Haiku fails to effectively group components based on their data flow relationships. This lack of precision makes it difficult to clearly understand how data moves throughout the system.

Generating historical backlog

Claude-3-Haiku is unable to generate Jira user stories effectively. It struggles to produce user stories that meet the standard format and detail required for project management. Additionally, its performance generating high-level epics is unsatisfactory, lacking detailed acceptance criteria and business requirements. These limitations likely stem from its training data, which focused on short forms and concise prompts, restricting its ability to handle more extensive and detailed inputs.

Converting code from one programming language to another

Claude-3-Haiku proved good at converting code between programming languages, demonstrating an impressive ability to accurately translate code snippets while preserving original functionality and structure.

Generating new code

Claude-3-Haiku performs well in generating new code, comparable to other Claude-3 models. It can produce code snippets based on given requirements or specifications, providing a useful starting point for developers.

Privacy and security of the Generative AI tool

Similar to other Anthropic models, you need to consider the privacy of your data, although according to official documentation, Claude 3 Haiku prioritizes enterprise-grade security and robustness. Also, keep in mind that security policies may vary for different Anthropic models.

Best tip

Be aware of Claude-3-haiku capabilities : Claude-3-haiku is a natural language processing model trained on short form. It is not designed for complex tasks like converting a project from one programming language to another.

Overall

Its fast response time is a notable advantage, but its performance suffers when dealing with larger prompts and more intricate tasks. Other tools or manual analysis may prove more effective in analyzing project architecture and data flows. However, Claude-3-Haiku can be a valuable asset in a developer's toolkit for straightforward code conversion and generation tasks.

Possibilities 2/5
Correctness 2/5
Privacy 3/5
Maturity 2/5

Overall score: 2/5

Gemini 1.5 Flash

Gemini 1.5 Flash represents Google's commitment to advancing AI technology; it is designed to handle a wide range of natural language processing tasks, from text generation to complex data analysis. Google presents Gemini Flash as a lightweight, fast, and cost-efficient model featuring multimodal reasoning and a breakthrough long context window of up to one million tokens.

Analyzing project architecture

Gemini Flash's performance in analyzing project architecture was found to be suboptimal. The AI tool struggled to provide concrete and actionable insights, often generating abstract and high-level observations instead.

Analyzing data flows

It effectively identified and traced the flow of data between different components and modules, offering developers valuable insights into how information is processed and transformed throughout the system. This capability aids in understanding the existing codebase and identifying potential bottlenecks or inefficiencies. However, the effectiveness of data flow analysis may vary depending on the project's complexity and size.

Generating historical backlog

Gemini Flash can synthesize meaningful epics that capture overarching goals and functionalities required for the project by analyzing architectural components, dependencies, and interactions within a software system. However, it may fall short of providing granular acceptance criteria and detailed business requirements. The generated epics often lack the precision and specificity needed for effective backlog management and task execution, and it struggles to generate Jira user stories.

Converting code from one programming language to another

Gemini Flash showed promising results in converting code from one programming language to another, particularly when translating from C# to Java Spring Boot. It successfully mapped and transformed language-specific constructs, such as syntax, data types, and control structures. However, limitations exist, especially when dealing with frameworks or libraries that do not have direct equivalents in the target language.

Generating new code

Gemini Flash excels in generating new code, including test cases and additional features, enhancing application reliability and functionality. It analyzed the existing codebase and generated test cases that cover various scenarios and edge cases.

Privacy and security of the Generative AI tool

Google was one of the first in the industry to publish an AI/ML privacy commitment , which outlines our belief that customers should have the highest level of security and control over their data stored in the cloud. That commitment extends to Google Cloud Generative AI products. You can set up a Gemini AI model in Google Cloud and use an encrypted TLS connection over the internet to connect from your on-premises environment to Google Cloud.

Best tip

Use prompt engineering: Starting by providing necessary background information or context within the prompt helps the model understand the task's scope and nuances. It's beneficial to experiment with different phrasing and structures; refining prompts iteratively based on the quality of the outputs. Specifying any constraints or requirements directly in the prompt can further tailor the model's output to meet your needs.

Overall

By using its AI capabilities in data flow analysis, code translation, and test creation, developers can optimize their workflow and concentrate on strategic tasks. However, it is important to remember that Gemini Flash is optimized for high-speed processing, which makes it less effective for complex tasks.

Possibilities 2/5
Correctness 2/5
Privacy 5/5
Maturity 2/5

Overall score: 2/5

Gemini 1.5 Pro

Gemini 1.5 Pro is the largest and most capable model created by Google, designed for handling highly complex tasks. While it is the slowest among its counterparts, it offers significant capabilities. The model targets professionals and developers needing a reliable assistant for intricate tasks.

Analyzing project architecture

Gemini Pro is highly effective in analyzing and understanding the architecture of existing programming projects, surpassing Gemini Flash in this area. It provides detailed insights into project structure and component relationships.

Analyzing data flows

The model demonstrates proficiency in analyzing data flows, similar to its performance in project architecture analysis. It accurately traces and understands data movement throughout the codebase, identifying how information is processed and exchanged between modules.

Generating historical backlog

By using project architecture as an input, it creates high-level epics that encapsulate main features and functionalities. While it may not generate specific Jira user stories, it excels at providing detailed acceptance criteria and precise details for each epic.

Converting code from one programming language to another

The model shows impressive results in code conversion, particularly from C# to Java Spring Boot. It effectively maps and transforms syntax, data structures, and constructs between languages. However, limitations exist when there is no direct mapping between frameworks or libraries.

Generating now code

Gemini Pro excels in generating new code, especially for unit and integration tests. It analyzes the existing codebase, understands functionality and requirements, and automatically generates comprehensive test cases.

Privacy and security of the Generative AI tool

Similarly to other Gemini models, Gemini Pro is packed with advanced security and data governance features, making it ideal for organizations with strict data security requirements.

Best tip

Manage context: Gemini Pro incorporates previous prompts into its input when generating responses. This use of historical context can significantly influence the model's output and lead to different responses. Include only the necessary information in your input to avoid overwhelming the model with irrelevant details.

Overall

Gemini Pro shows remarkable capabilities in areas such as project architecture analysis, data flow understanding, code conversion, and new code generation. However, there may be instances where the AI encounters challenges or limitations, especially with complex or highly specialized codebases. As such, while Gemini Pro offers significant advantages, developers should remain mindful of its current boundaries and use human expertise when necessary.

Possibilities 4/5
Correctness 3/5
Privacy 5/5
Maturity 3/5

Overall score: 4/5

LLM comparison summary

Embrace AI-driven approach to legacy code modernization

Generative AI offers practical support for rewriting legacy systems. While tools like GPT-4o and Claude-3-opus can’t fully automate the process, they excel in tasks like analyzing codebases and refining requirements. Combined with advanced platforms for data analysis and workflows, they help create a more efficient and precise redevelopment process.

This synergy allows developers to focus on essential tasks, reducing project timelines and improving outcomes.

written by

Viktar Reut

Finance

Software development

Transition towards data-driven organization in the insurance industry: Comparison of data streaming platforms

Insurance has always been an industry that relied heavily on data. But these days, it is even more so than in the past. The constant increase of data sources like wearables, cars, home sensors, and the amount of data they generate presents a new challenge. The struggle is in connecting to all that data, processing and understanding it to make data-driven decisions .

And the scale is tremendous. Last year the total amount of data created and consumed in the world was 59 zettabytes, which is the equivalent of 59 trillion gigabytes. The predictions are that by 2025 the amount will reach 175 zettabytes.

On the other hand, we’ve got customers who want to consume insurance products similarly to how they consume services from e-tailers like Amazon.

The key to meeting the customer expectations lies in the ability to process the data in near real-time and streamline operations to ensure that customers get the products they need when they want them. And this is where the data streaming platforms come to help.

Traditional data landscape

In the traditional landscape businesses often struggled with siloed data or data that was in various incompatible formats. Some of the commonly used solutions that should be mentioned here are:

Big Data systems like Cassandra that let users store a very large amount of data.
Document databases such as Elasticsearch that provide a rich interactive query model.
And relational databases like Oracle and PostgreSQL

That means there were databases with good query mechanisms, Big Data systems capable of handling huge volumes of data, and messaging systems for near-real-time message processing.

But there was no single solution that could handle it all, so the need for a new type of solution became apparent. One that would be capable of processing massive volumes of data in real-time , processing the data from a specific time window while being able to scale out and handle ordered messages.

Data streaming platforms- pros & cons and when should they be used

Data streaming is a continuous stream of data that can be processed, stored, analyzed, and acted upon as it's generated in real-time. Data streams are generated by all types of sources, in various formats and volumes.

But what benefits does deploying data streaming platforms bring exactly?

First of all, they can process the data in real-time.
Data in the stream is an ordered, replayable, and fault-tolerant sequence of immutable records.
In comparison to regular databases, scaling does not require complex synchronization of data access.
Because the producers and consumers are loosely coupled with each other and act independently, it’s easy to add new consumers or scale down.
Resiliency because of the replayability of stream and the decoupling of consumers and producers.

But there are also some downsides:

Tools like Kafka (specifically event streaming platforms) lack features like message prioritization which means data can’t be processed in a different order based on its importance.
Error handling is not easy and it’s necessary to prepare a strategy for it. Examples of those strategies are fail fast, ignore the message, or send to dead letter queue.
Retry logic doesn’t come out of the box.
Schema policy is necessary. Despite being loosely coupled, producers and consumers are still coupled by schema contract. Without this policy in place, it’s really difficult to maintain the working system and handle updates. Data streaming platforms compared to traditional databases require additional tools to query the data in the stream, and it won't be so efficient as querying a database.

Having covered the advantages and disadvantages of streaming technology, it’s important to consider when implementing a streaming platform is a valid decision and when other solutions might be a better choice.

In what cases data streaming platforms can be used:

Whenever there is a need to process data in real-time, i.e., feeding data to Machine Learning and AI systems.
When it’s necessary to perform log analysis, check sensor and data metrics.
For fraud detection and telemetry.
To do low latency messaging or event sourcing.

When data streaming platforms are not the ideal solution:

The volume of events or messages is low, i.e., several thousand a day.
When there is a need for random access to query the data for specific records.
When it’s mostly historical data that is used for reporting and visualization.
For using large payloads like big pictures, videos, or documents, or in general binary large objects.

Example architecture deployed on AWS

On the left-hand side, there are integrations points with vehicles. The way how they are integrated may vary depending on OEM or make and model. However, despite the protocol they use in the end, they will deliver data to our platform. The stream can receive the data in various formats, in this case, depending on the car manufacturer. The data is processed and then sent to the normalized events. From where it can be sent using a firehose to AWS S3 storage for future needs, i.e., historical data analysis or feeding Machine Learning models . After normalization, it is also sent to the telemetry stack, where the vehicle location and information about acceleration, braking, and cornering speed is extracted and then made available to clients through an API.

Tool comparison

There are many tools available that support data streaming. This comparison is divided into three categories- ease of use, stream processing, and ordering & schema registry and will focus on Apache Kafka as the most popular tool currently in use and RocketMQ and Apache Pulsar as more niche but capable alternatives.

It is important to note that these tools are open-source, so having a qualified and experienced team is necessary to perform implementation and maintenance.

Ease of use

It is worth noticing that commonly used tools have the biggest communities of experts. That leads to constant development, and it becomes easier for businesses to find talent with the right skills and experience. Kafka has the largest community as Rocket and Pulsar are less popular.
The tools are comprised of several services. One of them is usually a management tool that can significantly improve user experience. It is built in for Pulsar and Rocket but unfortunately, Kafka is missing it.
Kafka has built-in connectors that help integrate data sources in an easy and quick way.
Pulsar also has an integration mechanism that can connect to different data sources, but Rocket has none.
The number of client libraries has to do with the popularity of the tool. And the more libraries there are, the easier the tool is to use. Kafka is widely used, and so it has many client libraries. Rocket and Pulsar are less popular, so the number of libraries available is much smaller.
It’s possible to use these tools as a managed service. In that scenario, Kafka has the best support as it is offered by all major public cloud providers- AWS, GCP, and Azure. Rocket is offered by Alibaba Cloud, Pulsar by several niche companies.
Requirement for extra services for the tools to work. Kafka requires ZooKeeper, Rocket doesn’t require any additional services and Pulsar requires both Zookeeper and BooKKeeper to manage additionally.

Stream processing

Kafka is a leader in this category as it has Kafka Streams. It is a built-in library that simplifies client applications implementation and gives developers a lot of flexibility. Rocket, on the other hand, has no built-in libraries, which means there is nothing to simplify the implementation and it does require a lot of custom work. Pulsar has Pulsar Functions which is a built-in function and can be helpful, but it’s basic and limited.

Ordering & schema registry

Message ordering is a crucial feature. Especially when there is a need to use services that are processing information based on transactions. Kafka offers just a single way of message ordering, and it’s through the use of keys. The keys are in messages that are assigned to a specific partition, and within the partition, the order is maintained.

Pulsar works similarly, either within partition with the use of keys or per producer in SinglePartition mode when the key is not provided.

RocketMQ works in a different way, as it ensures that the messages are always ordered. So if a use case requires that 100% of the messages are ordered then this is the tool that should be considered.

Schema registry is mainly used to validate and version the messages.

That’s an important aspect, as with asynchronous messaging, the common problem is that the message content is different from what the client app is expecting, and this can cause the apps to break.

Kafka has multiple implementations of schema registry thanks to its popularity and being hosted by major cloud providers. Rocket is building its schema registry, but it is not known when it will be ready. Pulsar does have its own schema registry, and it works like the one in Kafka.

Things to be aware of when implementing data streaming platform

Duplicates. Duplicates can’t be avoided, they will happen at some point due to problems with things like network availability. That’s why exactly-once delivery is a useful feature that ensures messages are delivered only once.
However, there are some issues with that. Firstly, a few of the out-of-the-box tools support exactly-once delivery and it needs to be set up before starting streaming. Secondly, exactly-once delivery can significantly slow down the stream. And lastly, end-user apps should recognize the messages they receive so that they don’t process duplicates.
“Black Fridays”. These are scenarios with a sudden increase in the volume of data to process. And to handle these spikes in data volume, it is necessary to plan the infrastructure capacity beforehand. Some of the tools that have auto-scaling natively will handle those out of the box, like Kinesis from AWS. But others that are custom built will crash without proper tuning.
Popular deployment strategies are also a thing to consider. Unfortunately, deploying data streaming platforms is not a straightforward operation, the popular deployment strategies like blue/green or canary deployment won’t work.
Messages should always be treated as a structured entity. As the stream will accept everything, that we put in it, it is necessary to determine right from the start what kind of data will be processed. Otherwise, the end user applications will eventually crash if they receive messages in an unexpected format.

Best practices while deploying data streaming platforms

Schema management. This links directly with the previous point about treating the messages as a structured entity. Schema promotes common data model and ensures backward/forward compatibility.
Data retention. This is about setting limits on how long the data is stored in the data stream. Storing the data too long and constantly adding new data to the stream will eventually cause that you run out of resources.
Capacity planning and autoscaling are directly connected to the “Black Fridays” scenario. During the setup, it is necessary to pay close attention to the capacity planning to make sure the environment will cope with sudden spikes in data volume. However, it’s also a good practice to plan for failure scenarios where autoscaling kicks in due to some other issue in the system and spins out of control.
If the customer data geo-location is important to the specific use case from the regulatory perspective, then it is important to set up separate streams for different locations and make sure they are handled by local data centers.
When it comes to disaster recovery, it is always wise to be prepared for unexpected downtime, and it’s easier to manage if there is the right toolset set up.

It used to be that people were responsible for the production of most data, but in the digital era, the exponential growth of IoT has caused the scales to shift, and now machine and sensor data is the majority. That data can help businesses build innovative products, services and make informed decisions.

To unlock the value in data, companies need to have a complex strategy in place. One of the key elements in that strategy is the ability to process data in real-time so choosing the tool for the streaming platform is extremely important.

The ability to process data as it arrives is becoming essential in the insurance industry. Streaming platforms help companies handle large data volumes efficiently, improving operations and customer service. Choosing the right tools and approach can make a big difference in performance and reliability.

written by

Daniel Bryła

Automotive

Data platforms

From data to decisions: The role of data platforms in automotive

Connected, autonomous, and electric cars are changing the automotive industry. Yet, the massive amount of data they generate often remains siloed across different systems, making management and collaboration challenging.

This article examines how data platforms unify information, connecting teams across departments - from engineering to customer support - to analyze trends, address operational challenges, and refine strategies for success.

How are data platforms transforming the automotive industry?

Data platforms resolve fragmentation issues by consolidating data from various sources into a unified system. This structure not only improves data accessibility within departments but also enables secure collaboration with trusted external partners

The impact of this approach is clear: improved safety through fewer accidents, better performance thanks to real-time analytics, and quicker development of features supporting solutions such as advanced driver assistance systems and personalized in-car experiences .

As the demand for effective data solutions accelerates, the global automotive data management market , valued at $1.58 billion in 2021, is projected to grow by 20.3% annually through 2030. This rapid development underscores how essential platforms are for addressing the increasing complexity of modern automotive operations, making them vital tools for staying competitive and meeting customer expectations.

Defining data platforms in automotive

Combined with a structured data architecture that defines how information is ingested, stored, and delivered, the platform acts as the operational backbone that transforms this architecture into a functional system. By removing duplications, cutting down on storage expenses, and making it easier to manage data , the platform helps OEMs spend less time on technical hassles and more time gaining meaningful insights that drive their business forward.

In an industry where data flows through multiple departments, this centralized approach ensures that knowledge is not only easily available but also readily applicable to innovative solutions.

Data platforms as the engine for data-driven insights

Unlike standalone systems that only store or display information, automotive data platforms support the processing and integration of information, making it analysis-ready.

Here's a closer look at how it works:

Data ingestion

Automotive platforms handle a variety of inputs, categorized into real-time and batch-processed data. Real-time information, such as CAN bus telemetry, GPS tracking, and ADAS sensor outputs, supports immediate diagnostics and safety decisions.

Batch processing, on the other hand, involves data that is collected over time and processed collectively at scheduled intervals. Examples include maintenance records, dealership transactions, and even unstructured feedback logs.

Many platforms offer hybrid processing to meet specific operational and analytical needs.

Moreover, there are some unique methods used in the automotive industry to gather data, including:

Over-the-air (OTA) updates: remotely deliver software or firmware updates to vehicles to improve performance, fix bugs, or add features without requiring a service visit.

Vehicle-to-Everything (V2X) communication: capture real-time data on traffic, infrastructure, and environmental conditions.

These industry-focused techniques enable companies to obtain data critical for operational and strategic insights.

Data processing and storage

Processing involves cleansing for reliability, normalizing for consistency, and transforming data to meet specific requirements, such as diagnostics or performance analytics. These steps ensure the information is accurate and tailored for its intended use.

The processed information is stored in centralized repositories: data warehouses for structured records (e.g., transactions) and data lakes for semi-structured or unstructured inputs (e.g., raw sensor data or feedback logs). Centralized storage allows quick, flexible access for teams across the organization.

Fundamental principles for a modern data platform

- Scalability and simplicity: Easily expandable to accommodate growing data needs.

- Flexibility and cost-efficiency: Adaptable to evolving requirements without high overhead costs.

- Real-time decision-making: Providing immediate access to critical information.

- Unified data access: Breaking down silos for a complete organizational view.

Data platforms in automotive: Key applications for efficiency and revenue

Many companies recognize the importance of data, but only a few use it effectively to gain meaningful insights about their business and customers. Better use of information can help your company drive more informed decisions about products and operations. Consider this:

-> Is data being used to improve the customer experience in tangible ways?

-> Are your teams focused on creating new solutions, or are they spending too much time preparing and organizing data?

Data platforms serve as the foundation for specific use cases:

Customer services and new revenue opportunities

Data on vehicle usage and driver behavior supports personalized services and drives innovative business models. Examples include:

Maintenance reminders : Platforms analyze usage data to alert drivers about upcoming service needs.

Third-party partnerships : For example, insurers can access driving behavior data through controlled platforms and offer tailored policies like pay-as-you-drive.

Infotainment : Secure data-sharing capabilities allow developers to design custom infotainment systems and other features, creating new revenue opportunities for companies.

Operational efficiency

Let’s look at where else the platforms are used to solve real-world challenges. It’s all about turning raw information into revenue-growing results.

In predictive maintenance , access to consistent sensor data helps identify patterns, reduce vehicle downtime, prevent unexpected breakdowns, and ensure proactive safety measures.

Ford’s data platform illustrates how unifying data from over 4,600 sources - including dealership feedback, repair records, and warranty services - can drive new business models. By centralizing diverse inputs, Ford demonstrates the potential for predictive insights to address customer needs and refine operational strategies.

In supply chain management , integrating data from manufacturing systems and inventory tools supports precise resource allocation and production scheduling.

Volkswagen 's collaboration with AWS and Siemens on the Industrial Cloud is a clear example of how data platforms optimize these operations. By connecting data from global plants and suppliers, Volkswagen has achieved more precise production scheduling and management.

Product development benefits from data unification that equips engineers with the visibility they need to resolve performance challenges faster, ensuring continuous improvement in vehicle designs. This integrated approach ensures better collaboration across teams. Aggregated data highlights frequent problems in vehicle components, while customer feedback guides the creation of features aligned with market demands, driving higher-quality outcomes and user satisfaction.

Fleet management also sees significant improvements through the use of data platforms. Real-time information collected from vehicles allows for improved route planning while reducing fuel consumption and delivery times. Additionally, vehicle usage data helps optimize fleet operations by preventing overuse and extending vehicle lifespans.

Regulatory compliance

Another key advantage of centralizing data is easier compliance with regulations such as GDPR and the EU Data Act. A unified system simplifies managing access, tracking usage, and securely sharing information. It also supports meeting safety and environmental standards by providing quick access to the data required for audits and reporting.

What’s next for automotive data platforms

While some data platforms' capabilities are already in place, the following represent emerging trends and transformational predictions that will define the future:

AI-powered personalization

Platforms are evolving to deliver even more sophisticated personalization. In the future, they’ll integrate data from multiple sources - vehicles, mobile apps, and smart home devices - to create a unified profile for each driver. This will enable predictive services, like suggesting vehicle configurations for specific trips or dynamically adjusting settings based on the driver’s schedule and habits.

Connected ecosystems

Future platforms may process data from smart cities, energy grids, and public transport systems, creating a holistic view for better decision-making. For example, they could optimize fleet operations by aligning vehicle usage with real-time energy availability and urban traffic flow predictions, expanding opportunities for sustainability and efficiency.

Real-time data processing

The next generation of platforms will handle larger volumes of information with greater speed, supporting developments like autonomous systems and advanced simulations. By combining historical data with real-world inputs, they will improve predictive capabilities; for instance, refining AI algorithms for better safety outcomes or optimizing fleet routes to reduce emissions and costs.

Enhanced cybersecurity

Looking ahead, data platforms will incorporate more advanced security measures, such as decentralized systems like blockchain to safeguard data integrity. They will also provide proactive threat detection, using AI to identify and mitigate risks before breaches occur. This will be critical as vehicles and ecosystems become increasingly connected.

These advancements will not only address current challenges but also redefine how vehicles interact with their environment, improving functionality, safety, and sustainability.

Ready to change your automotive data strategy?

As the industry evolves with connectivity, autonomy, and electrification, the need for dependable and flexible systems grows.

Need a secure, scalable platform designed for automotive requirements?

Whether you're creating one from scratch or improving an existing system, we can help provide solutions that improve operational efficiency and create new revenue opportunities.

‍

written by

Adam Kozłowski

written by

Marcin Wiśniewski

Software development

Spring AI framework overview – introduction to AI world for Java developers

In this article, we explain the fundamentals of integrating various AI models and employing different AI-related techniques within the Spring framework. We provide an overview of the capabilities of Spring AI and discuss how to utilize the various supported AI models and tools effectively.

Understanding Spring AI - Basic concepts

Traditionally, libraries for AI integration have primarily been written in Python, making knowledge of this language essential for their use. Additionally, their integration in applications written in other languages implies the writing of a boilerplate code to communicate with the libraries. Today, Spring AI makes it easier for Java developers to enable AI in Java-based applications.

Spring AI aims to provide a unified abstraction layer for integrating various AI LLM types and techniques (e.g., ETL, embeddings, vector databases) into Spring applications. It supports multiple AI model providers, such as OpenAI, Google Vertex AI, and Azure Vector Store, through standardized interfaces that simplify their integration by abstracting away low-level details. This is achieved by offering concrete implementations tailored to each specific AI provider.

Generating data: Integration with AI models

Spring AI API supports all main types of AI models, such as chat, image, audio, and embeddings. The API for the model is consistent across all model types. It consists of the following main components:

1) Model interfaces that provide similar methods for all AI model providers. Each model type has its own specific interface, such as ChatModel for chat AI models and ImageModel for image AI models. Spring AI provides its own implementation of each interface for every supported AI model provider.

2) Input prompt/request class that is used by the AI model (via model interface) providing user input (usually text) instructions, along with options for tuning the model’s behavior.

3) Response for output data produced by the model. Depending on the model type, it contains generated text, image, or audio (for Chat Image and Audio models correspondingly) or more specific data like floating-point arrays in the case of Embedding models.

All AI model interfaces are standard Spring beans that can be injected using auto-configuration or defined in Spring Boot configuration classes.

Chat models

The chat LLMs gnerate text in response to the user’s prompts. Spring AI has the following main API for interaction with this type of model.

The ChatModel interface allows sending a String prompt to a specific chat AI model service. For each supported AI chat model provider in Spring AI, there is a dedicated implementation of this interface.

The prompt class contains a list of text messages (queries, typically a user input) and a ChatOptions object. The ChatOptions interface is common for all the supported AI models. Additionally, every model implementation has its own specific options for class implementation.

The ChatResponse class encapsulates the output of the AI chat model, including a list of generated data and relevant metadata.

Furthermore, the chat model API has a ChatClient class , which is responsible for the entire interaction with the AI model. It encapsulates the ChatModel, enabling users to build and send prompts to the model and retrieve responses from it. ChatClient has multiple options for transforming the output of the AI model, which includes converting raw text response into a custom Java object or fetching it as a Flux-based stream.

Putting all these components together, let’s give an example code of Spring service class interacting with OpenAI chat API:

// OpenAI model implementation is available via auto configuration // when ‘org.springframework.ai:spring-ai-openai-spring-boot-starter' // is added as a dependency@Configurationpublic class ChatConfig{ // Defining chat client bean with OpenAI model@Bean ChatClient chatClient(ChatModel chatModel) {returnChatClient.builder(chatModel) .defaultSystem("Default system text") .defaultOptions( OpenAiChatOptions.builder() .withMaxTokens(123) .withModel("gpt-4-o") .build() ).build(); } }@Servicepublic class ChatService{ private finalChatClient chatClient; ...publicList<String> getResponses(String userInput) { varprompt = new Prompt( userInput, // Specifying options of concrete AI model optionsOpenAiChatOptions.builder() .withTemperature(0.4) .build() ); varresults = chatClient.prompt(prompt) .call() .chatResponse() .getResults();returnresults.stream() .map(chatResult -> chatResult.getOutput().getContent()) .toList(); } }

Image and Audio models

Image and Audio AI model APIs are similar to the chat model API; however, the framework does not provide a ChatClient equivalent for them.

For image models the main classes are represented by:

ImagePrompt that contains text query and ImageOptions
ImageModel for abstraction of the concrete AI model
ImageResponse containing a list of the ImageGeneration objects as a result of ImageModel invocation.

Below is the example Spring service class for generating images:

@Servicepublic class ImageGenerationService{// OpenAI model implementation is used for ImageModel via autoconfiguration // when ‘org.springframework.ai:spring-ai-openai-spring-boot-starter’ is // added as a dependencyprivate finalImageModel imageModel; ... publicList<Image> generateImages(String request) { varimagePrompt = new ImagePrompt(// Image description and prompt weightnew ImageMessage(request, 0.8f), // Specifying options of a concrete AI modelOpenAiImageOptions.builder() .withQuality("hd") .withStyle("natural") .withHeight(2048) .withWidth(2048) .withN(4) .build() ); varresults = imageModel .call(imagePrompt) .getResults(); returnresults.stream() .map(ImageGeneration::getOutput) .toList(); } }

When it comes to audio models there are two types of them supported by Spring AI: Transcription and Text-to-Speech.

The text-to-speech model is represented by the SpeechModel interface. It uses text query input to generate audio byte data with attached metadata.

In transcription models , there isn't a specific general abstract interface. Instead, each model is represented by a set of concrete implementations (as per different AI model providers). This set of implementations adheres to a generic "Model" interface, which serves as the root interface for all types of AI models.

Embedding models

1. The concept of embeddings

Let’s outline the theoretical concept of embeddings for a better understanding of how the embeddings API in Spring AI functions and what its purpose is.

Embeddings are numeric vectors created through deep learning by AI models. Each component of the vector corresponds to a certain property or feature of the data. This allows to define the similarities between data (like text, image or video) using mathematical operations on those vectors.

Just like 2D or 3D vectors represent a point on a plane or in a 3D space, the embedding vector represents a point in an N-dimensional space. The closer points (vectors) are to each other or, in other words, the shorter the distance between them is, the more similar the data they represent is. Mathematically the distance between vectors v1 and v2 may be defined as: sqrt(abs(v1 - v2)).

Consider the following simple example with living beings (e.g., their text description) as data and their features:

Is Animal (boolean) Size (range of 0…1) Is Domestic (boolean) Cat 1 0,1 1 Horse 1 0,7 1 Tree 0 1,0 0

In terms of the features above, the objects might be represented as the following vectors: “cat” -> [1, 0.1, 1] , “horse” -> [1, 0.7, 1] , “tree” -> [0, 1.0, 0]

For the most similar animals from our example, e.g. cat and horse, the distance between the corresponding vectors is sqrt(abs([1, 0.1, 1] - [1, 0.7, 1])) = 0,6 While comparing the most distinct objects, that is cat and tree gives us: sqrt(abs([1, 0.1, 1] - [0, 1.0, 0])) = 1,68

2. Embedding model API

The Embeddings API is similar to the previously described AI models such as ChatModel or ImageModel.

The Document class is used for input data. The class represents an abstraction that contains document identifier, content (e.g., image, sound, text, etc), metadata, and the embedding vector associated with the content.

The EmbeddingModel interface is used for communication with the AI model to generate embeddings. For each AI embedding model provider, there is a concrete implementation of that interface. This ensures smooth switching between various models or embedding techniques.

The EmbeddingResponse class contains a list of generated embedding vectors.

Storing data: Vector databases

Vector databases are specifically designed to efficiently handle data in vector format. Vectors are commonly used for AI processing. Examples include vector representations of words or text segments used in chat models, as well as image pixel information or embeddings.

Spring AI has a set of interfaces and classes that allow it to interact with vector databases of various database vendors. The primary interface of this API is the VectorStore , which is designed to search for similar documents using a specific similarity query known as SearchRequest.

It also has methods for adding and removing the Document objects. When adding to the VectorStore, the embeddings for documents are typically created by the VectorStore implementation using an EmbeddingMode l. The resulting embedding vector is assigned to the documents before they are stored in the underlying vector database.

Below is an example of how we can retrieve and store the embeddings using the input documents using the Azure AI Vector Store.

@Configurationpublic class VectorStoreConfig { ... @Bean publicVectorStore vectorStore(EmbeddingModel embeddingModel) {var searchIndexClient = ... //get azure search index client return new AzureVectorStore( searchIndexClient, embeddingModel, true,// Metadata fields to be used for the similarity search // Considering documents that are going to be stored in vector store // represent books/book descriptionsList.of(MetadataField.date("yearPublished"), MetadataField.text("genre"), MetadataField.text("author"), MetadataField.int32("readerRating"), MetadataField.int32("numberOfMainCharacters"))); } } @Servicepublic class EmbeddingService{private finalVectorStore vectorStore; ...public voidsave(List<Document> documents) {// The implementation of VectorStore uses EmbeddingModel to get embedding vector // for each document, sets it to the document object and then stores itvectorStore.add(documents); }publicList<Document> findSimilar(String query, double similarityLimit, Filter.Expression filter) {returnvectorStore.similaritySearch( SearchRequest.query(query)// used for embedding similarity search // only having equal or higher similarity.withSimilarityThreshold(similarityLimit)// search only documents matching filter criteria.withFilterExpression(filter) .withTopK(10)// max number of results); } publicList<Document> findSimilarGoodFantasyBook(String query) { vargoodFantasyFilterBuilder = new FilterExpressionBuilder(); vargoodFantasyCriteria = goodFantasyFilterBuilder.and( goodFantasyFilterBuilder.eq("genre", "fantasy"), goodFantasyFilterBuilder.gte("readerRating", 9) ).build(); returnfindSimilar(query, 0.9, goodFantasyCriteria); } }

Preparing data: ETL pipelines

The ETL, which stands for “Extract, Transform, Load” is a process of transforming raw input data (or documents) to make it applicable or more efficient for the further processing by AI models. As the name suggests, the ETL consists of three main stages: extracting the raw data from various data sources, transforming data into a structured format, and storing the structured data in the database.

In Spring AI the data used for ETL in every stage is represented by the Document class mentioned earlier. Here are the Spring AI components representing each stage in ETL pipeline:

DocumentReader – used for data extraction, implements Supplier<List<Document>>
DocumentTransformer – used for transformation, implements Function<List<Document>, List<Document>>
DocumentWriter – used for data storage, implements Consumer<List<Document>>

The DocumentReader interface has a separate implementation for each particular document type, e.g., JsonReader, TextReader, PagePdfDocumentReader, etc. Readers are temporary objects and are usually created in a place where we need to retrieve the input data, just like, e.g., InputStream objects. It is also worth mentioning that all the classes are designed to get their input data as a Resource object in their constructor parameter. And, while Resource is abstract and flexible enough to support various data sources, such an approach limits the reader class capabilities as it implies conversion of any other data sources like Stream to the Resource object.

The DocumentTransformer has the following implementations:

TokenTextSplitter – splits document into chunks using CL100K_BASE encoding, used for preparing input context data of AI model to fit the text into the model’s context window.

KeywordMetadataEnricher – uses a generative AI model for getting the keywords from the document and embeds them into the document’s metadata.

SummaryMetadataEnricher – enriches the Document object with its summary generated by a generative AI model.

ContentFormatTransformer – applies a specified ContentFormatter to each document to unify the format of the documents.

These transformers cover some of the most popular use cases of data transformation. However, if some specific behavior is required, we’ll have to provide a custom DocumentTransformer.

When it comes to the DocumentWriter , there are two main implementations: VectorStore, mentioned earlier, and FileDocumentWriter, which writes the documents into a single file. For real-world development scenarios the VectorStore seems the most suitable option. FileDocumentWriter is more suitable for simple or demo software where we don't want or need a vector database.

With all the information provided above, here is a clear example of what a simple ETL pipeline looks like when written using Spring AI:

public voidsaveTransformedData() {// Get resource e.g. using InputStreamResourceResource textFileResource = ... TextReader textReader = new TextReader(textFileResource);// Assume tokenTextSplitter instance in created as bean in configuration // Note that the read() and split() methods return List<Document> objectsvectorStore.write(tokenTextSplitter.split(textReader.read())); }

It is worth mentioning that the ETL API uses List<Document> only to transfer data between readers, transformers, and writers. This may limit their usage when the input document set is large, as it requires the loading of all the documents in memory at once.

Converting data: Structured output

While the output of AI models is usually raw data like text, image, or sound, in some cases, we may benefit from structuring that data. Particularly when the response includes a description of an object with features or properties that suggest an implicit structure within the output.

Spring AI offers a Structured Output API designed for chat models to transform raw text output into structured objects or collections. This API operates in two main steps: first, it provides the AI model with formatting instructions for the input data, and second, it converts the model's output (which is already formatted according to these instructions) into a specific object type. Both the formatting instructions and the output conversion are handled by implementations of the StructuredOutputConverter interface.

There are three converters available in Spring AI:

BeanOutputConverter<T> – instructs AI model to produce JSON output using JSON Schema of a specified class and converts it to the instances of the that class as an output.

MapOutputConverter – instructs AI model to produce JSON output and parses it into a Map<String, Object> object

ListOutputConverter – retrieves comma separated items form AI model creating a List<String> output

Below is an example code for generating a book info object using BeanOutputConverter:

public record BookInfo(String title, String author, int yearWritten, int readersRating) { } @Servicepublic class BookService{private final ChatClient chatClient; // Created in configuration of BeanOutputConverter<BookInfo> typeBookInfo> typeprivate final StructuredOutputConverter<BookInfo> bookInfoConverter; ...public final BookInfofindBook() {returnchatClient.prompt() .user(promptSpec -> promptSpec .text("Generate description of the best " + "fantasy book written by {author}.") .param("author", "John R. R. Tolkien")) .call() .entity(bookInfoConverter); } }

Production Readiness

To evaluate the production readiness of the Spring AI framework, let’s focus on the aspects that have an impact on its stability and maintainability.

Spring AI is a new framework. The project was started back in 2023. The first publicly available version, the 0.8.0 one, was released in February 2024. There were 6 versions released in total (including pre-release ones) during this period of time.

It’s an official framework of Spring Projects, so the community developing it should be comparable to other frameworks, like Spring JPA. If the framework development continues, it’s expected that the community will provide support on the same level as for other Spring-related frameworks.

The latest version, 1.0.0-M4, published in November, is still a release candidate/milestone. The development velocity, however, is quite good. Framework is being actively developed: according to the GitHub statistics, the commit rate is 5.2 commits per day, and the PR rate is 3.5 PRs per day. We may see it by comparing it to some older, well-developed frameworks, such as Spring Data JPA, which has 1 commit per day and 0.3 PR per day accordingly.

When it comes to bug fixing, there are about 80 bugs in total, with 85% of them closed on their official GitHub page. Since the project is quite new, these numbers may not be as representable as in other older Spring projects. For example, Spring Data JPA has almost 800 bugs with about 90% fixed.

Conclusion

Overall, the Spring AI framework looks very promising. It might become a game changer for AI-powered Java applications because of its integration with Spring Boot framework and the fact that it covers the vast majority of modern AI model providers and AI-related tools, wrapping them into abstract, generic, easy-to-use interfaces.

written by

Andrii Biehunov

Software development

Ragas and Langfuse integration - quick guide and overview

With large language models (LLMs) being used in a variety of applications today, it has become essential to monitor and evaluate their responses to ensure accuracy and quality. Effective evaluation helps improve the model's performance and provides deeper insights into its strengths and weaknesses. This article demonstrates how embeddings and LLM services can be used to perform end-to-end evaluations of an LLM's performance and send the resulting metrics as traces to Langfuse for monitoring.

This integrated workflow allows you to evaluate models against predefined metrics such as response relevance and correctness and visualize these metrics in Langfuse, making your models more transparent and traceable. This approach improves performance monitoring while simplifying troubleshooting and optimization by turning complex evaluations into actionable insights.

I will walk you through the setup, show you code examples, and discuss how you can scale and improve your AI applications with this combination of tools.

To summarize, we will explore the role of Ragas in evaluating the LLM model and how Langfuse provides an efficient way to monitor and track AI metrics.

Important : For this article, Ragas in version 0.1.21 and Python 3.12 were used.
If you would like to migrate to version 0.2.+ follow, then up the latest release documentation.

1. What is Ragas, and what is Langfuse?

1.1 What is Ragas?

So, what’s this all about? You might be wondering: "Do we really need to evaluate what a super-smart language model spits out? Isn’t it already supposed to be smart?" Well, yes, but here’s the deal: while LLMs are impressive, they aren’t perfect. Sometimes, they give great responses, and other times… not so much. We all know that with great power comes great responsibility. That’s where Ragas steps in.

Think of Ragas as your model’s personal coach . It keeps track of how well the model is performing, making sure it’s not just throwing out fancy-sounding answers but giving responses that are helpful, relevant, and accurate. The main goal? To measure and track your model's performance, just like giving it a score - without the hassle of traditional tests.

1.2 Why bother evaluating?

Imagine your model as a kid in a school. It might answer every question, but sometimes it just rambles, says something random, or gives you that “I don’t know” look in response to a tricky question. Ragas makes sure that your LLM isn’t just trying to answer everything for the sake of it. It evaluates the quality of each response, helping you figure out where the model is nailing it and where it might need a little more practice.

In other words, Ragas provides a comprehensive evaluation by allowing developers to use various metrics to measure LLM performance across different criteria , from relevance to factual accuracy. Moreover, it offers customizable metrics, enabling developers to tailor the evaluation to suit specific real-world applications.

1.3 What is Langfuse, and how can I benefit from it?

Langfuse is a powerful tool that allows you to monitor and trace the performance of your language models in real-time. It focuses on capturing metrics and traces, offering insights into your models' performance. With Langfuse, you can track metrics such as relevance, correctness, or any custom evaluation metric generated by tools like Ragas and visualize them to better understand your model's behavior.

In addition to tracing and metrics, Langfuse also offers options for prompt management and fine-tuning (non-self-hosted versions), enabling you to track how different prompts impact performance and adjust accordingly. However, in this article, I will focus on how tracing and metrics can help you gain better insights into your model’s real-world performance.

2. Combining Ragas and Langfuse

2.1 Real-life setup

Before diving into the technical analysis, let me provide a real-life example of how Ragas and Langfuse work together in an integrated system. This practical scenario will help clarify the value of this combination and how it applies in real-world applications, offering a clearer perspective before we jump into the code.

Imagine using this setup in a customer service chatbot , where every user interaction is processed by an LLM. Ragas evaluates the answers generated based on various metrics, such as correctness and relevance, while Langfuse tracks these metrics in real-time. This kind of integration helps improve chatbot performance, ensuring high-quality responses while also providing real-time feedback to developers.

In my current setup, the backend service handles all the interactions with the chatbot. Whenever a user sends a message, the backend processes the input and forwards it to the LLM to generate a response. Depending on the complexity of the question, the LLM may invoke external tools or services to gather additional context before formulating its answer. Once the LLM returns the answer, the Ragas framework evaluates the quality of the response.

After the evaluation, the backend service takes the scores generated by Ragas and sends them to Langfuse. Langfuse tracks and visualizes these metrics, enabling real-time monitoring of the model's performance, which helps identify improvement areas and ensures that the LLM maintains an elevated level of accuracy and quality during conversations.

This architecture ensures a continuous feedback loop between the chatbot, the LLM, and Ragas while providing insight into performance metrics via Langfuse for further optimization.

2.2 Ragas setup

Here’s where the magic happens. No great journey is complete without a smooth, well-designed API. In this setup, the API expects to receive the essential elements: question, context, expected contexts, answer, and expected answer. But why is it structured this way? Let me explain.

The question in our API is the input query you want the LLM to respond to, such as “What is the capital of France?” It's the primary element that triggers the model's reasoning process. The model uses this question to generate a relevant response based on its training data or any additional context provided.

The answer is the output generated by the LLM, which should directly respond to the question. For example, if the question is “What is the capital of France?” the answer would be “The capital of France is Paris.” This is the model's attempt to provide useful information based on the input question.

The expected answer represents the ideal response. It serves as a reference point to evaluate whether the model’s generated answer was correct. So, if the model outputs "Paris," and the expected answer was also "Paris," the evaluation would score this as a correct response. It's like the answer key for a test.

Context is where things get more interesting. It's the additional information the model can use to craft its answer. Imagine asking the question, “What were Albert Einstein’s contributions to science?” Here, the model might pull context from an external document or reference text about Einstein’s life and work. Context gives the model a broader foundation to answer questions that need more background knowledge.

Finally, the expected context is the reference material we expect the model to use. In our Einstein example, this could be a biographical document outlining his theory of relativity. We use the expected context to compare and see if the model is basing its answers on the correct information.

After outlining the core elements of the API, it’s important to understand how Retrieval-Augmented Generation (RAG) enhances the language model’s ability to handle complex queries. RAG combines the strength of pre-trained language models with external knowledge retrieval systems. When the LLM encounters specialized or niche queries, it fetches relevant data or documents from external sources, adding depth and context to its responses. The more complex the query, the more critical it is to provide detailed context that can guide the LLM to retrieve relevant information. In my example, I used a simplified context, which the LLM managed without needing external tools for additional support.

In this Ragas setup, the evaluation is divided into two categories of metrics: those that require ground truth and those where ground truth is optional . These distinctions shape how the LLM’s performance is evaluated.

Metrics that require ground truth depend on having a predefined correct answer or expected context to compare against. For example, metrics like answer correctness and context recall evaluate whether the model’s output closely matches the known, correct information. This type of metric is essential when accuracy is paramount, such as in customer support or fact-based queries. If the model is asked, "What is the capital of France?" and it responds with "Paris," the evaluation compares this to the expected answer, ensuring correctness.

On the other hand, metrics where ground truth is optional - like answer relevancy or faithfulness - don’t rely on direct comparison to a correct answer. These metrics assess the quality and coherence of the model's response based on the context provided, which is valuable in open-ended conversations where there might not be a single correct answer. Instead, the evaluation focuses on whether the model’s response is relevant and coherent within the context it was given.

This distinction between ground truth and non-ground truth metrics impacts evaluation by offering flexibility depending on the use case. In scenarios where precision is critical, ground truth metrics ensure the model is tested against known facts. Meanwhile, non-ground truth metrics allow for assessing the model’s ability to generate meaningful and coherent responses in situations where a definitive answer may not be expected. This flexibility is vital in real-world applications, where not all interactions require perfect accuracy but still demand high-quality, relevant outputs.

And now, the implementation part:

from typing import Optional from fastapi import FastAPI from pydantic import BaseModel from src.service.ragas_service import RagasEvaluator class QueryData(BaseModel): question: Optional[str] = None contexts: Optional[list[str]] = None expected_contexts: Optional[list[str]] = None answer: Optional[str] = None expected_answer: Optional[str] = None class EvaluationAPI: def __init__(self, app: FastAPI): self.app = app self.add_routes() def add_routes(self): @self.app.post("/api/ragas/evaluate_content/") async def evaluate_answer(data: QueryData): evaluator = RagasEvaluator() result = evaluator.process_data( question=data.question, contexts=data.contexts, expected_contexts=data.expected_contexts, answer=data.answer, expected_answer=data.expected_answer, ) return result

Now, let’s talk about configuration. In this setup, embeddings are used to calculate certain metrics in Ragas that require a vector representation of text, such as measuring similarity and relevancy between the model’s response and the expected answer or context. These embeddings provide a way to quantify the relationship between text inputs for evaluation purposes.

The LLM endpoint is where the model generates its responses. It’s accessed to retrieve the actual output from the model, which Ragas then evaluates. Some metrics in Ragas depend on the output generated by the model, while others rely on vectorized representations from embeddings to perform accurate comparisons.

import json import logging from typing import Any, Optional import requests from datasets import Dataset from langchain_openai.chat_models import AzureChatOpenAI from langchain_openai.embeddings import AzureOpenAIEmbeddings from ragas import evaluate from ragas.metrics import ( answer_correctness, answer_relevancy, answer_similarity, context_entity_recall, context_precision, context_recall, faithfulness, ) from ragas.metrics.critique import coherence, conciseness, correctness, harmfulness, maliciousness from src.config.config import Config logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) class RagasEvaluator: azure_model: AzureChatOpenAI azure_embeddings: AzureOpenAIEmbeddings def __init__(self) -> None: config = Config() self.azure_model = AzureChatOpenAI( openai_api_key=config.api_key, openai_api_version=config.api_version, azure_endpoint=config.api_endpoint, azure_deployment=config.deployment_name, model=config.embedding_model_name, validate_base_url=False, ) self.azure_embeddings = AzureOpenAIEmbeddings( openai_api_key=config.api_key, openai_api_version=config.api_version, azure_endpoint=config.api_endpoint, azure_deployment=config.embedding_model_name, )

The logic in the code is structured to separate the evaluation process into different metrics, which allows flexibility in measuring specific aspects of the LLM’s responses based on the needs of the scenario. Ground truth metrics come into play when the LLM’s output needs to be compared against a known, correct answer or context. For instance, metrics like answer correctness or context recall check if the model’s response aligns with what was expected. The run_individual_evaluations function manages these evaluations by verifying if both the expected answer and context are available for comparison.

On the other hand, non-ground truth metrics are used when there isn’t a specific correct answer to compare against. These metrics, such as faithfulness and answer relevancy, assess the overall quality and relevance of the LLM’s output. The collect_non_ground_metrics and run_non_ground_evaluation functions manage this type of evaluation by examining characteristics like coherence, conciseness, or harmfulness without needing a predefined answer. This split ensures that the model’s performance can be evaluated comprehensively in various situations.

def process_data( self, question: Optional[str] = None, contexts: Optional[list[str]] = None, expected_contexts: Optional[list[str]] = None, answer: Optional[str] = None, expected_answer: Optional[str] = None, ) -> Optional[dict[str, Any]]: results: dict[str, Any] = {} non_ground_metrics: list[Any] = [] # Run individual evaluations that require specific ground_truth results.update(self.run_individual_evaluations(question, contexts, answer, expected_answer, expected_contexts)) # Collect and run non_ground evaluations non_ground_metrics.extend(self.collect_non_ground_metrics(contexts, question, answer)) results.update(self.run_non_ground_evaluation(question, contexts, answer, non_ground_metrics)) return {"metrics": results} if results else None def run_individual_evaluations( self, question: Optional[str], contexts: Optional[list[str]], answer: Optional[str], expected_answer: Optional[str], expected_contexts: Optional[list[str]], ) -> dict[str, Any]: logger.info("Running individual evaluations with question: %s, expected_answer: %s", question, expected_answer) results: dict[str, Any] = {} # answer_correctness, answer_similarity if expected_answer and answer: logger.info("Evaluating answer correctness and similarity") results.update( self.evaluate_with_metrics( metrics=[answer_correctness, answer_similarity], question=question, contexts=contexts, answer=answer, ground_truth=expected_answer, ) ) # expected_context if question and expected_contexts and contexts: logger.info("Evaluating context precision") results.update( self.evaluate_with_metrics( metrics=[context_precision], question=question, contexts=contexts, answer=answer, ground_truth=self.merge_ground_truth(expected_contexts), ) ) # context_recall if expected_answer and contexts: logger.info("Evaluating context recall") results.update( self.evaluate_with_metrics( metrics=[context_recall], question=question, contexts=contexts, answer=answer, ground_truth=expected_answer, ) ) # context_entity_recall if expected_contexts and contexts: logger.info("Evaluating context entity recall") results.update( self.evaluate_with_metrics( metrics=[context_entity_recall], question=question, contexts=contexts, answer=answer, ground_truth=self.merge_ground_truth(expected_contexts), ) ) return results def collect_non_ground_metrics( self, context: Optional[list[str]], question: Optional[str], answer: Optional[str] ) -> list[Any]: logger.info("Collecting non-ground metrics") non_ground_metrics: list[Any] = [] if context and answer: non_ground_metrics.append(faithfulness) else: logger.info("faithfulness metric could not be used due to missing context or answer.") if question and answer: non_ground_metrics.append(answer_relevancy) else: logger.info("answer_relevancy metric could not be used due to missing question or answer.") if answer: non_ground_metrics.extend([harmfulness, maliciousness, conciseness, correctness, coherence]) else: logger.info("aspect_critique metric could not be used due to missing answer.") return non_ground_metrics def run_non_ground_evaluation( self, question: Optional[str], contexts: Optional[list[str]], answer: Optional[str], non_ground_metrics: list[Any], ) -> dict[str, Any]: logger.info("Running non-ground evaluations with metrics: %s", non_ground_metrics) if non_ground_metrics: return self.evaluate_with_metrics( metrics=non_ground_metrics, question=question, contexts=contexts, answer=answer, ground_truth="", # Empty as non_ground metrics do not require specific ground_truth ) return {} @staticmethod def merge_ground_truth(ground_truth: Optional[list[str]]) -> str: if isinstance(ground_truth, list): return " ".join(ground_truth) return ground_truth or "" class RagasEvaluator: azure_model: AzureChatOpenAI azure_embeddings: AzureOpenAIEmbeddings langfuse_url: str langfuse_public_key: str langfuse_secret_key: str def __init__(self) -> None: config = Config() self.azure_model = AzureChatOpenAI( openai_api_key=config.api_key, openai_api_version=config.api_version, azure_endpoint=config.api_endpoint, azure_deployment=config.deployment_name, model=config.embedding_model_name, validate_base_url=False, ) self.azure_embeddings = AzureOpenAIEmbeddings( openai_api_key=config.api_key, openai_api_version=config.api_version, azure_endpoint=config.api_endpoint, azure_deployment=config.embedding_model_name, )

2.3 Langfuse setup

To use Langfuse locally, you'll need to create both an organization and a project in your self-hosted instance after launching via Docker Compose. These steps are necessary to generate the public and secret keys required for integrating with your service. The keys will be used for authentication in your API requests to Langfuse's endpoints, allowing you to trace and monitor evaluation scores in real-time. The official documentation provides detailed instructions on how to get started with a local deployment using Docker Compose, which can be found here .

The integration is straightforward: you simply use the keys in the API requests to Langfuse’s endpoints, enabling real-time performance tracking of your LLM evaluations.

Let me present integration with Langfuse:

class RagasEvaluator: # previous code from above langfuse_url: str langfuse_public_key: str langfuse_secret_key: str def __init__(self) -> None: # previous code from above self.langfuse_url = "http://localhost:3000" self.langfuse_public_key = "xxx" self.langfuse_secret_key = "yyy"def send_scores_to_langfuse(self, trace_id: str, scores: dict[str, Any]) -> None: """ Sends evaluation scores to Langfuse via the /api/public/scores endpoint. """ url = f"{self.langfuse_url}/api/public/scores" auth_string = f"{self.langfuse_public_key}:{self.langfuse_secret_key}" auth_bytes = base64.b64encode(auth_string.encode('utf-8')).decode('utf-8') headers = { "Content-Type": "application/json", "Authorization": f"Basic {auth_bytes}" } # Iterate over scores and send each one for score_name, score_value in scores.items(): payload = { "traceId": trace_id, "name": score_name, "value": score_value, } logger.info("Sending score to Langfuse: %s", payload) response = requests.post(url, headers=headers, data=json.dumps(payload))

And the last part is to invoke that function in process_data. Simply just add:

if results: trace_id = "generated-trace-id" self.send_scores_to_langfuse(trace_id, results)

3. Test and results

Let's use the URL endpoint below to start the evaluation process:

http://0.0.0.0:3001/api/ragas/evaluate_content/

Here is a sample of the input data:

{ "question": "Did Gomez know about the slaughter of the Fire Mages?", "answer": "Gomez, the leader of the Old Camp, feigned ignorance about the slaughter of the Fire Mages. Despite being responsible for ordering their deaths to tighten his grip on the Old Camp, Gomez pretended to be unaware to avoid unrest among his followers and to protect his leadership position.", "expected_answer": "Gomez knew about the slaughter of the Fire Mages, as he ordered it to consolidate his power within the colony. However, he chose to pretend that he had no knowledge of it to avoid blame and maintain control over the Old Camp.", "contexts": [ "{\"Gomez feared the growing influence of the Fire Mages, believing they posed a threat to his control over the Old Camp. To secure his leadership, he ordered the slaughter of the Fire Mages, though he later denied any involvement.\"}", "{\"The Fire Mages were instrumental in maintaining the barrier that kept the colony isolated. Gomez, in his pursuit of power, saw them as an obstacle and thus decided to eliminate them, despite knowing their critical role.\"}", "{\"Gomez's decision to kill the Fire Mages was driven by a desire to centralize his authority. He manipulated the events to make it appear as though he was unaware of the massacre, thus distancing himself from the consequences.\"}" ], "expected_context": "Gomez ordered the slaughter of the Fire Mages to solidify his control over the Old Camp. However, he later denied any involvement to distance himself from the brutal event and avoid blame from his followers." }

And here is the result presented in Langfuse

Results: {'answer_correctness': 0.8177382234142327, 'answer_similarity': 0.9632605859646228, 'context_recall': 1.0, 'faithfulness': 0.8333333333333334, 'answer_relevancy': 0.9483433866761223, 'harmfulness': 0.0, 'maliciousness': 0.0, 'conciseness': 1.0, 'correctness': 1.0, 'coherence': 1.0}

As you can see, it is as simple as that.

4. Summary

In summary, I have built an evaluation system that leverages Ragas to assess LLM performance through various metrics. At the same time, Langfuse tracks and monitors these evaluations in real-time, providing actionable insights. This setup can be seamlessly integrated into CI/CD pipelines for continuous testing and evaluation of the LLM during development, ensuring consistent performance.

Additionally, the code can be adapted for more complex LLM workflows where external context retrieval systems are integrated. By combining this with real-time tracking in Langfuse, developers gain a robust toolset for optimizing LLM outputs in dynamic applications. This setup not only supports live evaluations but also facilitates iterative improvement of the model through immediate feedback on its performance.

However, every rose has its thorn. The main drawbacks of using Ragas include the costs and time associated with the separate API calls required for each evaluation. This can lead to inefficiencies, especially in larger applications with many requests. Ragas can be implemented asynchronously to improve performance, allowing evaluations to occur concurrently without blocking other processes. This reduces latency and makes more efficient use of resources.

Another challenge lies in the rapid pace of development in the Ragas framework. As new versions and updates are frequently released, staying up to date with the latest changes can require significant effort. Developers need to continuously adapt their implementation to ensure compatibility with the newest releases, which can introduce additional maintenance overhead.

written by

Bartłomiej Kuciński

Legacy modernization

Software development

How to migrate on-premise databases to AWS RDS with AWS DMS: Our guide

Migrating an on-premise MS SQL Server database to AWS RDS, especially for high-stakes applications handling sensitive information, can be challenging yet rewarding. This guide walks through the rationale for moving to the cloud, the key steps, the challenges you may face, and the potential benefits and risks.

Why Cloud?

When undertaking such a significant project, you might wonder why we would change something that was working well. Why shift from a proven on-premise setup to the cloud? It's a valid question. The rise in the popularity of cloud technology is no coincidence, and AWS offers several advantages that make the move worthwhile for us.

First, AWS's global reach and availability play a crucial role in choosing it. AWS operates in multiple regions and availability zones worldwide, allowing applications to deploy closer to users, reducing latency, and ensuring higher availability. In case of any issues at one data center, AWS's ability to automatically switch to another ensures minimal downtime - a critical factor, especially for our production environment.

Another significant reason for choosing AWS is the fully managed nature of AWS RDS . In an on-premise setup, you are often responsible for everything from provisioning to scaling, patching, and backing up the database. With AWS, these responsibilities are lifted. AWS takes care of backups, software patching, and even scaling based on demand, allowing the team to focus more on application development and less on infrastructure management.

Cost is another compelling factor. AWS's pay-as-you-go model eliminates the need to over-provision hardware, as is often done on-premise to handle peak loads. By paying only for resources used, particularly in development and testing environments, expenses are significantly reduced. Resources can be scaled up or down as needed, especially beneficial during periods of lower activity.

Source: https://www.peoplehr.com/blog/2015/06/12/saas-vs-on-premise-hr-systems-pros-cons-hidden-costs/

The challenges and potential difficulties

Migrating a database from on-premise to AWS RDS isn’t a simple task, especially when dealing with multiple environments like dev, UAT, staging, preprod, and production. Here are some of the possible issues that could arise during the process:

Complexity of the migration process : Migrating on-premise databases to AWS RDS involves several moving parts, from initial planning to execution. The challenge is not just about moving the data but ensuring that all dependencies, configurations, and connections between the database and applications remain intact. This requires a deep understanding of our infrastructure and careful planning to avoid disrupting production systems.

The complexity could increase with the need to replicate different environments - each with its unique configurations - without introducing inconsistencies. For example, the development environment might allow more flexibility, but production requires tight controls for security and reliability.

Data consistency and minimal downtime : Ensuring data consistency while minimizing downtime for a production environment might be one of the toughest aspects. For a business that operates continuously, even a few minutes of downtime could affect customers and operations. Although AWS DMS (Database Migration Service) supports live data replication to help mitigate downtime, careful timing of the migration might be necessary to avoid conflicts or data loss. Inconsistent data, even for a brief period, could lead to application failures or incorrect reports.

Additionally, setting up the initial full load of data followed by ongoing change data capture (CDC) could present a challenge. Close migration monitoring might be essential to ensure no changes are missed while data is being transferred.

Handling legacy systems : Some existing systems might not be fully compatible with cloud-native features, requiring certain services to be rewritten to work in synchronous or asynchronous manners to avoid potential timeout issues within an organization’s applications.

Security and compliance considerations : Security is a major concern throughout the migration process, especially when moving sensitive business data to the cloud. AWS offers robust security tools, but it’s necessary to ensure that everything is correctly configured to avoid potential vulnerabilities. This included setting up IAM roles, policies, and firewalls and managing infrastructure with relevant tools. Additionally, a secure connection between on-premise and cloud databases would likely be crucial to safeguard data migration using AWS DMS.

Managing the learning curve : For a team relatively new to AWS, the learning curve can be steep. AWS offers a vast array of services and features, each with its own set of best practices, pricing models, and configuration options. Learning to use services like RDS, DMS, IAM, and CloudWatch effectively could require time and experimentation with various configurations to optimize performance.

Coordination across teams : Migrating such a critical part of the infrastructure requires coordination across multiple teams - development, operations, security, and management. Each team has its priorities and concerns, making smooth communication and alignment of goals a potential challenge to ensure a unified approach.

What can be gained by migrating on-premise databases to AWS RDS

This journey isn’t fast or easy. So, is it worth it? Absolutely! The migration to AWS RDS provides significant benefits for database management. With the ability to scale databases up or down based on demand, performance is optimized, and over-provisioning resources is avoided. AWS RDS automates manual backups and database maintenance, allowing teams to focus on more strategic tasks. Additionally, the pay-as-you-go model helps manage and optimize costs more efficiently.

Risks and concerns

AWS is helpful and can make your work easier. However, it's important to be aware of the potential risks:

Vendor lock-in : Once you’re deep into AWS services, moving away can be difficult due to the reliance on AWS-specific technologies and configurations.

Security misconfigurations : While AWS provides strong security tools, a misconfiguration can expose sensitive data. It’s crucial to ensure access controls, encryption, and monitoring are set up correctly.

Unexpected costs : While AWS’s pricing can be cost-effective, it’s easy to incur unexpected costs, especially if you don’t properly monitor your resource usage or optimize your infrastructure.

Conclusion

Migrating on-premise databases to AWS RDS using AWS DMS is a learning experience. The cloud offers incredible opportunities for scalability, flexibility, and innovation, but it also requires a solid understanding of best practices to fully benefit from it. For organizations considering a similar migration, the key is to approach it with careful planning, particularly around data consistency, downtime minimization, and security.

For those just starting with AWS, don't be intimidated - AWS provides extensive documentation, and the community is always there to help. By embracing the cloud, we open the door to a more agile, scalable, and resilient future.

written by

Bartosz Szlapa

Manufacturing

EU Data Act

EU Data regulations decoded: Expert solutions for IoT compliance and growth

IoT manufacturers are continuously advancing the potential of connected devices. By 2025, the global expansion of IoT is projected to generate nearly 80 zettabytes of data annually (1), highlighting the immense scale and complexity of managing this volume.

However, with innovation comes the challenge of navigating Europe’s regulatory landscape .

Three key EU data regulations – the Data Governance Act (DGA) (2), the EU Data Act (3), and the General Data Protection Regulation (GDPR) (4) – outline how businesses must handle, share, and protect both personal and non-personal data.

This article explains how these regulations work together and how IoT manufacturers can comply while opening new business opportunities within this legal framework.

Explaining the EU Data Act

The EU Data Act, set to be fully implemented in 2025, seeks to ensure fairness and transparency in the data economy. It gives users and businesses the right to access and control data generated by IoT devices , promoting innovation and fair competition.

User control over data : The EU Data Act allows users (and businesses) to authorize the sharing of their device-generated data with third-party service providers. This requires IoT manufacturers to build systems that enable users to easily request and manage access to their data.
Mandatory data sharing : In certain cases, IoT manufacturers will be required to share data with other businesses when authorized by the user. For example, third-party service providers may need access to this data. In B2B scenarios, manufacturers can request reasonable compensation for providing the data.

This regulation is particularly relevant in industries like automotive and smart cities, where multiple stakeholders rely on shared data. A connected car manufacturer, for instance, must ensure users can authorize access to their vehicle data for services like maintenance or insurance.

Introduction to the Data Governance Act

The DGA, effective since September 2023, is all about creating a trustworthy, neutral data-sharing system. It focuses on two key areas: data intermediation services and data altruism .

Access to public sector data: The DGA allows businesses to reuse data from public sector bodies, such as healthcare, transportation, and environmental data. This provides access to high-quality data that can be used to develop new products, services, and innovations.

Example: A company developing AI-based healthcare solutions can use anonymized public health data to create more accurate models or treatments.

Data intermediation services : Intermediaries are neutral third parties that help exchange data between IoT manufacturers and other data users (like third-party service providers) under B2B, C2B, and data cooperative models.

The idea emerged as an alternative to big tech platforms monopolizing data-sharing. The goal? To provide a secure and transparent space where personal and non-personal data can be shared safely.

Example: A smart home manufacturer might team up with a data intermediary to help users share energy data with utility companies or researchers looking into energy efficiency.

Manufacturers cannot act as intermediaries directly, but they can partner with or establish separate entities to manage data exchanges. If they create these intermediaries, the entities must function independently from the core business. This separation ensures data is handled fairly and transparently without commercial bias.

The goal is to build trust - intermediaries are only there to facilitate secure, neutral connections between data holders and users without using the data for their own benefit.

Data altruism : This is all about voluntary data sharing for the public good. Think research or environmental projects. IoT manufacturers can give users the option to donate their data, opening the door to collaborations with research bodies or public organizations.

The DGA's core focus is building user trust by ensuring data transparency, security, and fairness, whether through neutral intermediaries or data shared for a greater cause.

Key GDPR rules every business should know

The GDPR, in effect since 2018, sets strict rules for how businesses collect, store, and process personal data, including data from IoT devices.

User consent and transparency : IoT manufacturers must obtain explicit user consent before collecting or processing personal data, such as health data from wearable devices or location data from connected cars. Transparency about how this data is used is also required.
Data security and privacy : Manufacturers must implement robust security measures to protect personal data and adhere to data minimization principles - only collecting what’s necessary. Additionally, they must uphold user rights, such as providing access to their data, supporting data portability, and allowing users to request erasure (the right to be forgotten).

For example, wearable device manufacturers need to ensure the security of personal data and offer users the ability to request the deletion of their data if they no longer wish for it to be stored.

How the DGA, EU Data Act, and GDPR work together

These three EU data regulations create a well-rounded framework for managing both personal and non-personal data in the IoT space.

The DGA : The Data Governance Act creates neutral, secure data-sharing ecosystems, promoting transparency and fairness when multiple parties exchange data through trusted intermediaries.

The EU Data Act : This regulation complements the DGA by giving users control over the data generated by their devices, allowing them to request that it be shared with third-party service providers. In certain B2B cases, the data holder may request fair compensation for providing access to the data.

The GDPR : The GDPR adds strong protections for personal data. When personal information is involved, it ensures that users’ privacy and rights are respected.

Example:

Imagine a smart agriculture company that manufactures sensors to monitor soil and weather conditions.

Under the DGA, the company can work with neutral intermediaries to securely share aggregated environmental data with researchers studying climate change, maintaining transparency and fairness in the exchange.

At the same time, the EU Data Act allows farmers who use these sensors to maintain control over their data and request that it be shared with third-party services like equipment manufacturers or crop analytics firms. In certain B2B cases, the smart agriculture company can ask for fair compensation for sharing aggregated data insights.

If personal data is involved - such as specific information about a farm or farmer - the GDPR governs how this data is processed and shared, requiring user consent and protecting the farmer’s privacy throughout the process.

How IoT manufacturers adapt to EU data regulations

Implement robust data protection measures: Secure personal data with strong encryption, access controls, and anonymization. Obtain explicit user consent, ensure compliance with access and erasure requests, and support data portability. Processes for timely responses to data requests and identity verification are crucial.

Build systems for data access and sharing: Create mechanisms for users to easily share or revoke access to their data and establish clear frameworks for data sharing with third parties, including compensation rules where appropriate. Ensure these practices align with competition laws.

Partner with or create independent data intermediaries: Collaborate with neutral data intermediaries to handle data exchanges between parties securely and without bias or create an independent entity within your organization to fulfill this role, following the EU Data Governance Act’s guidelines.

Adopt privacy-by-design principles : Integrate privacy and security measures into the design phase of your products and services. This means designing IoT devices and platforms with built-in security and privacy features, such as anonymization, data minimization, and encryption, from the outset rather than adding these measures later.

Focus on data interoperability and standardization: Adopt standardized data formats to ensure that your IoT devices and platforms can communicate and exchange data seamlessly with other systems. This not only helps with regulatory compliance but also avoids vendor lock-in and enhances competitiveness by allowing your products to integrate more easily with third-party services.

The role of an IT enabler in navigating EU data regulatory landscape

Given today’s complex regulatory landscape, IoT manufacturers need a technology partner to stay compliant and create business opportunities. An IT enabler provides the tools, expertise, and infrastructure to help companies meet legal and compliance EU data regulations requirements efficiently. Here are the key areas where you’ll need support:

Regulatory compliance : Navigating complex frameworks requires a deep understanding of these regulations to ensure legal obligations are met. An IT enabler helps interpret laws, builds compliance-focused solutions, and keeps your business up to date with evolving regulations.

Technology solutions : To comply with privacy laws, businesses must implement secure data handling, processing, and sharing systems. Your IT partner offers scalable technology solutions to manage and protect personal and non-personal data.

Data exchanges : IoT manufacturers must enable secure, compliant data exchanges with external partners, including neutral data intermediaries and third-party services. An IT enabler designs and implements systems to facilitate these data exchanges while also ensuring transparency and fairness.

Operational simplicity : Compliance with regulations should not burden your core operations. An IT partner simplifies regulatory processes through automation, effective governance, and streamlined workflows.

Ongoing maintenance and updates : Once solutions are built and implemented, they require ongoing maintenance to comply with new laws and standards. A software development consultancy provides long-term support and regular updates to ensure your systems evolve alongside regulatory changes.

Customizable solutions : Every IoT manufacturer has unique business needs, and regulatory compliance often depends on industry-specific nuances. An sofwtare development consulting partner can develop custom-built solutions that not only meet legal standards but also align with your specific operational and business goals.

Integration with existing systems : Rather than replacing your entire IT infrastructure, an IT enabler integrates new compliance solutions with your existing systems, ensuring a smooth transition with minimal disruption.

At Grape Up , we provide the solutions, expertise, and long-term support to help you navigate these challenges and stay ahead in the regulatory landscape.

Need guidance on complex EU data regulations? We offer expert consulting to guide you.

Looking for secure data-sharing platforms? Our products ensure safe exchanges with third parties while keeping your business compliant.

Whether it’s managing compliance, data security, or third-party integrations, we provide the tools and expertise to support your needs.

.......................

Source:

https://www.researchgate.net/figure/nternet-of-Things-IoT-connected-devices-from-2015-to-2025-in-billions_fig1_325645304#:~:text=1%2C%20By%20the%20year%202025,of%2079%20zettabytes%20%5B12%5D%20.
https://digital-strategy.ec.europa.eu/en/policies/data-governance-act
https://digital-strategy.ec.europa.eu/en/policies/data-act
https://gdpr-info.eu/

written by

Adam Kozłowski

written by

Marcin Wiśniewski

Automotive

Software development

Android AAOS 14 - EVS network camera

The automotive industry has been rapidly evolving with technological advancements that enhance the driving experience and safety. Among these innovations, the Android Automotive Operating System (AAOS) has stood out, offering a versatile and customizable platform for car manufacturers.

The Exterior View System (EVS) is a comprehensive camera-based system designed to provide drivers with real-time visual monitoring of their vehicle's surroundings. It typically includes multiple cameras positioned around the vehicle to eliminate blind spots and enhance situational awareness, significantly aiding in maneuvers like parking and lane changes. By integrating with advanced driver assistance systems, EVS contributes to increased safety and convenience for drivers.

For more detailed information about EVS and its configuration, we highly recommend reading our article "Android AAOS 14 - Surround View Parking Camera: How to Configure and Launch EVS (Exterior View System)." This foundational article provides essential insights and instructions that we will build upon in this guide.

The latest Android Automotive Operating System , AAOS 14, presents new possibilities, but it does not natively support Ethernet cameras. In this article, we describe our implementation of an Ethernet camera integration with the Exterior View System (EVS) on Android.

Our approach involves connecting a USB camera to a Windows laptop and streaming the video using the Real-time Transport Protocol (RTP). By employing the powerful FFmpeg software, the video stream will be broadcast and described in an SDP (Session Description Protocol) file, accessible via an HTTP server. On the Android side, we'll utilize the FFmpeg library to receive and decode the video stream, effectively bringing the camera feed into the AAOS 14 environment.

This article provides a step-by-step guide on how we achieved this integration of the EVS network camera, offering insights and practical instructions for those looking to implement a similar solution. The following diagram provides an overview of the entire process:

Building FFmpeg Library for Android

To enable RTP camera streaming on Android, the first step is to build the FFmpeg library for the platform. This section describes the process in detail, using the ffmpeg-android-maker project. Follow these steps to successfully build and integrate the FFmpeg library with the Android EVS (Exterior View System) Driver.

Step 1: Install Android SDK

First, install the Android SDK. For Ubuntu/Debian systems, you can use the following commands:

sudo apt update && sudo apt install android-sdk

The SDK should be installed in /usr/lib/android-sdk .

Step 2: Install NDK

Download the Android NDK (Native Development Kit) from the official website:

https://developer.android.com/ndk/downloads

After downloading, extract the NDK to your desired location.

Step 3: Build FFmpeg

Clone the ffmpeg-android-maker repository and navigate to its directory:

git clone https://github.com/Javernaut/ffmpeg-android-maker.gitcd ffmpeg-android-maker

Set the environment variables to point to the SDK and NDK:

export ANDROID_SDK_HOME=/usr/lib/android-sdk export ANDROID_NDK_HOME=/path/to/ndk/

Run the build script:

./ffmpeg-android-maker.sh

This script will download FFmpeg source code and dependencies, and compile FFmpeg for various Android architectures.

Step 4: Copy Library Files to EVS Driver

After the build process is complete, copy the .so library files from build/ffmpeg/ to the EVS Driver directory in your Android project:

cp build/ffmpeg/*.so /path/to/android/project/packages/services/Car/cpp/evs/sampleDriver/aidl/

Step 5: Add Libraries to EVS Driver Build Files

Edit the Android.bp file in the aidl directory to include the prebuilt FFmpeg libraries:

cc_prebuilt_library_shared { name: "rtp-libavcodec", vendor: true, srcs: ["libavcodec.so"], strip: { none: true, }, check_elf_files: false, } cc_prebuilt_library { name: "rtp-libavformat", vendor: true, srcs: ["libavformat.so"], strip: { none: true, }, check_elf_files: false, } cc_prebuilt_library { name: "rtp-libavutil", vendor: true, srcs: ["libavutil.so"], strip: { none: true, }, check_elf_files: false, } cc_prebuilt_library_shared { name: "rtp-libswscale", vendor: true, srcs: ["libswscale.so"], strip: { none: true, }, check_elf_files: false, }

Add prebuilt libraries to EVS Driver app:

cc_binary { name: "android.hardware.automotive.evs-default", defaults: ["android.hardware.graphics.common-ndk_static"], vendor: true, relative_install_path: "hw", srcs: [ ":libgui_frame_event_aidl", "src/*.cpp" ], shared_libs: [ "rtp-libavcodec", "rtp-libavformat", "rtp-libavutil", "rtp-libswscale", "android.hardware.graphics.bufferqueue@1.0", "android.hardware.graphics.bufferqueue@2.0", android.hidl.token@1.0-utils, ....] }

By following these steps, you will have successfully built the FFmpeg library for Android and integrated it into the EVS Driver.

EVS Driver RTP Camera Implementation

In this chapter, we will demonstrate how to quickly implement RTP support for the EVS (Exterior View System) driver in Android AAOS 14. This implementation is for demonstration purposes only. For production use, the implementation should be optimized, adapted to specific requirements, and all possible configurations and edge cases should be thoroughly tested. Here, we will focus solely on displaying the video stream from RTP.

The main files responsible for capturing and decoding video from USB cameras are implemented in the EvsV4lCamera and VideoCapture classes. To handle RTP, we will copy these classes and rename them to EvsRTPCamera and RTPCapture . RTP handling will be implemented in RTPCapture . We need to implement four main functions:

bool open(const char* deviceName, const int32_t width = 0, const int32_t height = 0); void close(); bool startStream(std::function<void(RTPCapture*, imageBuffer*, void*)> callback = nullptr); void stopStream();

We will use the official example from the FFmpeg library, https://github.com/FFmpeg/FFmpeg/blob/master/doc/examples/demux_decode.c, which decodes the specified video stream into RGBA buffers. After adapting the example, the RTPCapture.cpp file will look like this:

#include "RTPCapture.h" #include <android-base/logging.h> #include <errno.h> #include <error.h> #include <fcntl.h> #include <memory.h> #include <stdio.h> #include <stdlib.h> #include <sys/ioctl.h> #include <sys/mman.h> #include <unistd.h> #include <cassert> #include <iomanip> #include <stdio.h> #include <stdlib.h> #include <iostream> #include <fstream> #include <sstream> static AVFormatContext *fmt_ctx = NULL; static AVCodecContext *video_dec_ctx = NULL, *audio_dec_ctx; static int width, height; static enum AVPixelFormat pix_fmt; static enum AVPixelFormat out_pix_fmt = AV_PIX_FMT_RGBA; static AVStream *video_stream = NULL, *audio_stream = NULL; static struct SwsContext *resize; static const char *src_filename = NULL; static uint8_t *video_dst_data[4] = {NULL}; static int video_dst_linesize[4]; static int video_dst_bufsize; static int video_stream_idx = -1, audio_stream_idx = -1; static AVFrame *frame = NULL; static AVFrame *frame2 = NULL; static AVPacket *pkt = NULL; static int video_frame_count = 0; int RTPCapture::output_video_frame(AVFrame *frame) { LOG(INFO) << "Video_frame: " << video_frame_count++ << " ,scale height: " << sws_scale(resize, frame->data, frame->linesize, 0, height, video_dst_data, video_dst_linesize); if (mCallback) { imageBuffer buf; buf.index = video_frame_count; buf.length = video_dst_bufsize; mCallback(this, &buf, video_dst_data[0]); } return 0; } int RTPCapture::decode_packet(AVCodecContext *dec, const AVPacket *pkt) { int ret = 0; ret = avcodec_send_packet(dec, pkt); if (ret < 0) { return ret; } // get all the available frames from the decoder while (ret >= 0) { ret = avcodec_receive_frame(dec, frame); if (ret < 0) { if (ret == AVERROR_EOF || ret == AVERROR(EAGAIN)) { return 0; } return ret; } // write the frame data to output file if (dec->codec->type == AVMEDIA_TYPE_VIDEO) { ret = output_video_frame(frame); } av_frame_unref(frame); if (ret < 0) return ret; } return 0; } int RTPCapture::open_codec_context(int *stream_idx, AVCodecContext **dec_ctx, AVFormatContext *fmt_ctx, enum AVMediaType type) { int ret, stream_index; AVStream *st; const AVCodec *dec = NULL; ret = av_find_best_stream(fmt_ctx, type, -1, -1, NULL, 0); if (ret < 0) { fprintf(stderr, "Could not find %s stream in input file '%s'\n", av_get_media_type_string(type), src_filename); return ret; } else { stream_index = ret; st = fmt_ctx->streams[stream_index]; /* find decoder for the stream */ dec = avcodec_find_decoder(st->codecpar->codec_id); if (!dec) { fprintf(stderr, "Failed to find %s codec\n", av_get_media_type_string(type)); return AVERROR(EINVAL); } /* Allocate a codec context for the decoder */ *dec_ctx = avcodec_alloc_context3(dec); if (!*dec_ctx) { fprintf(stderr, "Failed to allocate the %s codec context\n", av_get_media_type_string(type)); return AVERROR(ENOMEM); } /* Copy codec parameters from input stream to output codec context */ if ((ret = avcodec_parameters_to_context(*dec_ctx, st->codecpar)) < 0) { fprintf(stderr, "Failed to copy %s codec parameters to decoder context\n", av_get_media_type_string(type)); return ret; } av_opt_set((*dec_ctx)->priv_data, "preset", "ultrafast", 0); av_opt_set((*dec_ctx)->priv_data, "tune", "zerolatency", 0); /* Init the decoders */ if ((ret = avcodec_open2(*dec_ctx, dec, NULL)) < 0) { fprintf(stderr, "Failed to open %s codec\n", av_get_media_type_string(type)); return ret; } *stream_idx = stream_index; } return 0; } bool RTPCapture::open(const char* /*deviceName*/, const int32_t /*width*/, const int32_t /*height*/) { LOG(INFO) << "RTPCapture::open"; int ret = 0; avformat_network_init(); mFormat = V4L2_PIX_FMT_YUV420; mWidth = 1920; mHeight = 1080; mStride = 0; /* open input file, and allocate format context */ if (avformat_open_input(&fmt_ctx, "http://192.168.1.59/stream.sdp", NULL, NULL) < 0) { LOG(ERROR) << "Could not open network stream"; return false; } LOG(INFO) << "Input opened"; isOpened = true; /* retrieve stream information */ if (avformat_find_stream_info(fmt_ctx, NULL) < 0) { LOG(ERROR) << "Could not find stream information"; return false; } LOG(INFO) << "Stream info found"; if (open_codec_context(&video_stream_idx, &video_dec_ctx, fmt_ctx, AVMEDIA_TYPE_VIDEO) >= 0) { video_stream = fmt_ctx->streams[video_stream_idx]; /* allocate image where the decoded image will be put */ width = video_dec_ctx->width; height = video_dec_ctx->height; pix_fmt = video_dec_ctx->sw_pix_fmt; resize = sws_getContext(width, height, AV_PIX_FMT_YUVJ422P, width, height, out_pix_fmt, SWS_BICUBIC, NULL, NULL, NULL); LOG(ERROR) << "RTPCapture::open pix_fmt: " << video_dec_ctx->pix_fmt << ", sw_pix_fmt: " << video_dec_ctx->sw_pix_fmt << ", my_fmt: " << pix_fmt; ret = av_image_alloc(video_dst_data, video_dst_linesize, width, height, out_pix_fmt, 1); if (ret < 0) { LOG(ERROR) << "Could not allocate raw video buffer"; return false; } video_dst_bufsize = ret; } av_dump_format(fmt_ctx, 0, src_filename, 0); if (!audio_stream && !video_stream) { LOG(ERROR) << "Could not find audio or video stream in the input, aborting"; ret = 1; return false; } frame = av_frame_alloc(); if (!frame) { LOG(ERROR) << "Could not allocate frame"; ret = AVERROR(ENOMEM); return false; } frame2 = av_frame_alloc(); pkt = av_packet_alloc(); if (!pkt) { LOG(ERROR) << "Could not allocate packet"; ret = AVERROR(ENOMEM); return false; } return true; } void RTPCapture::close() { LOG(DEBUG) << __FUNCTION__; } bool RTPCapture::startStream(std::function<void(RTPCapture*, imageBuffer*, void*)> callback) { LOG(INFO) << "startStream"; if(!isOpen()) { LOG(ERROR) << "startStream failed. Stream not opened"; return false; } stop_thread_1 = false; mCallback = callback; mCaptureThread = std::thread([this]() { collectFrames(); }); return true; } void RTPCapture::stopStream() { LOG(INFO) << "stopStream"; stop_thread_1 = true; mCaptureThread.join(); mCallback = nullptr; } bool RTPCapture::returnFrame(int i) { LOG(INFO) << "returnFrame" << i; return true; } void RTPCapture::collectFrames() { int ret = 0; LOG(INFO) << "Reading frames"; /* read frames from the file */ while (av_read_frame(fmt_ctx, pkt) >= 0) { if (stop_thread_1) { return; } if (pkt->stream_index == video_stream_idx) { ret = decode_packet(video_dec_ctx, pkt); } av_packet_unref(pkt); if (ret < 0) break; } } int RTPCapture::setParameter(v4l2_control&) { LOG(INFO) << "RTPCapture::setParameter"; return 0; } int RTPCapture::getParameter(v4l2_control&) { LOG(INFO) << "RTPCapture::getParameter"; return 0; } std::set<uint32_t> RTPCapture::enumerateCameraControls() { LOG(INFO) << "RTPCapture::enumerateCameraControls"; std::set<uint32_t> ctrlIDs; return std::move(ctrlIDs); } void* RTPCapture::getLatestData() { LOG(INFO) << "RTPCapture::getLatestData"; return nullptr; } bool RTPCapture::isFrameReady() { LOG(INFO) << "RTPCapture::isFrameReady"; return true; } void RTPCapture::markFrameConsumed(int i) { LOG(INFO) << "RTPCapture::markFrameConsumed frame: " << i; } bool RTPCapture::isOpen() { LOG(INFO) << "RTPCapture::isOpen"; return isOpened; }

Next, we need to modify EvsRTPCamera to use our RTPCapture class instead of VideoCapture . In EvsRTPCamera.h , add:

#include "RTPCapture.h"

And replace:

VideoCapture mVideo = {};

with:

RTPCapture mVideo = {};

In EvsRTPCamera.cpp , we also need to make changes. In the forwardFrame(imageBuffer* pV4lBuff, void* pData) function, replace:

mFillBufferFromVideo(bufferDesc, (uint8_t*)targetPixels, pData, mVideo.getStride());

with:

memcpy(targetPixels, pData, pV4lBuff->length);

This is because the VideoCapture class provides a buffer from the camera in various YUYV pixel formats. The mFillBufferFromVideo function is responsible for converting the pixel format to RGBA. In our case, RTPCapture already provides an RGBA buffer. This is done in the

int RTPCapture::output_video_frame(AVFrame *frame) function using sws_scale from the FFmpeg library.

Now we need to ensure that our RTP camera is recognized by the system. The EvsEnumerator class and its enumerateCameras function are responsible for detecting cameras. This function adds all video files from the /dev/ directory.

To add our RTP camera, we will append the following code at the end of the enumerateCameras function:

if (addCaptureDevice("rtp1")) { ++captureCount; }

This will add a camera with the ID "rtp1" to the list of detected cameras, making it visible to the system.

The final step is to modify the EvsEnumerator: :openCamera function to direct the camera with the ID "rtp1" to the RTP implementation. Normally, when opening a USB camera, an instance of the EvsV4lCamera class is created:

pActiveCamera = EvsV4lCamera::Create(id.data());

In our example, we will hardcode the ID check and create the appropriate object:

if (id == "rtp1") { pActiveCamera = EvsRTPCamera::Create(id.data()); } else { pActiveCamera = EvsV4lCamera::Create(id.data()); }

With this implementation, our camera should start working. Now we need to build the EVS Driver application and push it to the device along with the FFmpeg libraries:

mmma packages/services/Car/cpp/evs/sampleDriver/ adb push out/target/product/rpi4/vendor/bin/hw/android.hardware.automotive.evs-default /vendor/bin/hw/

Launching the RTP Camera

To stream video from your camera, you need to install FFmpeg ( https://www.ffmpeg.org/download.html#build-windows ) and an HTTP server on the computer that will be streaming the video.

Start FFmpeg (example on Windows):

ffmpeg -f dshow -video_size 1280x720 -i video="USB Camera" -c copy -f rtp rtp://192.168.1.53:8554

where:

-video_size is video resolution
"USB Camera" is the name of the camera as it appears in the Device Manager

"-c copy" means that individual frames from the camera (in JPEG format) will be copied to the RTP stream without changes. Otherwise, FFmpeg would need to decode and re-encode the image, introducing unnecessary delays.
"rtp://192.168.1.53:8554": 192.168.1.53 is the IP address of our Android device. You should adjust this accordingly. Port 8554 can be left as the default.

After starting FFmpeg, you should see output similar to this on the console:

Here, we see the input, output, and SDP sections. In the input section, the codec is JPEG, which is what we need. The pixel format is yuvj422p, with a resolution of 1920x1080 at 30 fps. The stream parameters in the output section should match.

Next, save the SDP section to a file named stream.sdp on the HTTP server. Our EVS Driver application needs to fetch this file, which describes the stream.

In our example, the Android device should access this file at: http://192.168.1.59/stream.sdp

The exact content of the file should be:

v=0 o=- 0 0 IN IP4 127.0.0.1 s=No Name c=IN IP4 192.168.1.53 t=0 0 a=tool:libavformat 61.1.100 m=video 8554 RTP/AVP 26

Now, restart the EVS Driver application on the Android device:

killall android.hardware.automotive.evs-default

Then, configure the EVS app to use the camera "rtp1". For detailed instructions on how to configure and launch the EVS (Exterior View System), refer to the article "Android AAOS 14 - Surround View Parking Camera: How to Configure and Launch EVS (Exterior View System)".

Performance Testing

In this chapter, we will measure and compare the latency of the video stream from a camera connected via USB and RTP.

How Did We Measure Latency?

Setup Timer: Displayed a timer on the computer screen showing time with millisecond precision.
Camera Capture: Pointed the EVS camera at this screen so that the timer was also visible on the Android device screen.
Snapshot Comparison: Took photos of both screens simultaneously. The time displayed on the Android device was delayed compared to the computer screen. The difference in time between the computer and the Android device represents the camera's latency.

This latency is composed of several factors:

Camera Latency: The time the camera takes to capture the image from the sensor and encode it into the appropriate format.
Transmission Time: The time taken to transmit the data via USB or RTP.
Decoding and Display: The time to decode the video stream and display the image on the screen.

Latency Comparison

Below are the photos showing the latency:

USB Camera

RTP Camera

From these measurements, we found that the average latency for a camera connected via USB to the Android device is 200ms , while the latency for the camera connected via RTP is 150ms . This result is quite surprising.

The reasons behind these results are:

The EVS implementation on Android captures video from the USB camera in YUV and similar formats, whereas FFmpeg streams RTP video in JPEG format.
The USB camera used has a higher latency in generating YUV images compared to JPEG. Additionally, the frame rate is much lower. For a resolution of 1280x720, the YUV format only supports 10 fps, whereas JPEG supports the full 30 fps.

All camera modes can be checked using the command:

ffmpeg -f dshow -list_options true -i video="USB Camera"

Conclusion

This article has taken you through the comprehensive process of integrating an RTP camera into the Android EVS (Exterior View System) framework, highlighting the detailed steps involved in both the implementation and the performance evaluation.

We began our journey by developing new classes, EvsRTPCamera and RTPCapture , which were specifically designed to handle RTP streams using FFmpeg. This adaptation allowed us to process and stream real-time video effectively. To ensure our system recognized the RTP camera, we made critical adjustments to the EvsEnumerator class. By customizing the enumerateCameras and openCamera functions, we ensured that our RTP camera was correctly instantiated and recognized by the system.

Next, we focused on building and deploying the EVS Driver application, including the necessary FFmpeg libraries, to our target Android device. This step was crucial for validating our implementation in a real-world environment. We also conducted a detailed performance evaluation to measure and compare the latency of video feeds from USB and RTP cameras. Using a timer displayed on a computer screen, we captured the timer with the EVS camera and compared the time shown on both the computer and Android screens. This method allowed us to accurately determine the latency introduced by each camera setup.

Our performance tests revealed that the RTP camera had an average latency of 150ms, while the USB camera had a latency of 200ms. This result was unexpected but highly informative. The lower latency of the RTP camera was largely due to the use of the JPEG format, which our particular USB camera handled less efficiently due to its slower YUV processing. This significant finding underscores the RTP camera's suitability for applications requiring real-time video performance, such as automotive surround view parking systems, where quick response times are essential for safety and user experience.

written by

Michał Jaskurzyński

Legacy modernization

Modernizing legacy applications with generative AI: Lessons from R&D Projects

As digital transformation accelerates, modernizing legacy applications has become essential for businesses to stay competitive. The application modernization market size, valued at USD 21.32 billion in 2023 , is projected to reach USD 74.63 billion by 2031 (1), reflecting the growing importance of updating outdated systems.

With 94% of business executives viewing AI as key to future success and 76% increasing their investments in Generative AI due to its proven value (2), it's clear that AI is becoming a critical driver of innovation. One key area where AI is making a significant impact is application modernization - an essential step for businesses aiming to improve scalability, performance, and efficiency.

Based on two projects conducted by our R&D team , we've seen firsthand how Generative AI can streamline the process of rewriting legacy systems.

Let’s start by discussing the importance of rewriting legacy systems and how GenAI-driven solutions are transforming this process.

Why re-write applications?

In the rapidly evolving software development landscape, keeping applications up-to-date with the latest programming languages and technologies is crucial. Rewriting applications to new languages and frameworks can significantly enhance performance, security, and maintainability. However, this process is often labor-intensive and prone to human error.

Generative AI offers a transformative approach to code translation by:

leveraging advanced machine learning models to automate the rewriting process
ensuring consistency and efficiency
accelerating modernization of legacy systems
facilitating cross-platform development and code refactoring

As businesses strive to stay competitive, adopting Generative AI for code translation becomes increasingly important. It enables them to harness the full potential of modern technologies while minimizing risks associated with manual rewrites.

Legacy systems, often built on outdated technologies, pose significant challenges in terms of maintenance and scalability. Modernizing legacy applications with Generative AI provides a viable solution for rewriting these systems into modern programming languages, thereby extending their lifespan and improving their integration with contemporary software ecosystems.

This automated approach not only preserves core functionality but also enhances performance and security, making it easier for organizations to adapt to changing technological landscapes without the need for extensive manual intervention.

Why Generative AI?

Generative AI offers a powerful solution for rewriting applications, providing several key benefits that streamline the modernization process.

Modernizing legacy applications with Generative AI proves especially beneficial in this context for the following reasons:

Identifying relationships and business rules: Generative AI can analyze legacy code to uncover complex dependencies and embedded business rules, ensuring critical functionalities are preserved and enhanced in the new system.
Enhanced accuracy: Automating tasks like code analysis and documentation, Generative AI reduces human errors and ensures precise translation of legacy functionalities, resulting in a more reliable application.
Reduced development time and cost: Automation significantly cuts down the time and resources needed for rewriting systems. Faster development cycles and fewer human hours required for coding and testing lower the overall project cost.
Improved security: Generative AI aids in implementing advanced security measures in the new system, reducing the risk of threats and identifying vulnerabilities, which is crucial for modern applications.
Performance optimization: Generative AI enables the creation of optimized code from the start, integrating advanced algorithms that improve efficiency and adaptability, often missing in older systems.

By leveraging Generative AI, organizations can achieve a smooth transition to modern system architectures, ensuring substantial returns in performance, scalability, and maintenance costs.

In this article, we will explore:

the use of Generative AI for rewriting a simple CRUD application
the use of Generative AI for rewriting a microservice-based application
the challenges associated with using Generative AI

For these case studies, we used OpenAI's ChatGPT-4 with a context of 32k tokens to automate the rewriting process, demonstrating its advanced capabilities in understanding and generating code across different application architectures.

We'll also present the benefits of using a data analytics platform designed by Grape Up's experts. The platform utilizes Generative AI and neural graphs to enhance its data analysis capabilities, particularly in data integration, analytics, visualization, and insights automation.

Project 1: Simple CRUD application

The source CRUD project was used as an example of a simple CRUD application - one written utilizing .Net Core as a framework, Entity Framework Core for the ORM, and SQL Server for a relational database. The target project containes a backend application created using Java 17 and Spring Boot 3.

Steps taken to conclude the project

Rewriting a simple CRUD application using Generative AI involves a series of methodical steps to ensure a smooth transition from the old codebase to the new one. Below are the key actions undertaken during this process:

initial architecture and data flow investigation - conducting a thorough analysis of the existing application's architecture and data flow.
generating target application skeleton - creating the initial skeleton of the new application in the target language and framework.
converting components - translating individual components from the original codebase to the new environment, ensuring that all CRUD operations were accurately replicated.
generating tests - creating automated tests for the backend to ensure functionality and reliability.

Throughout each step, some manual intervention by developers was required to address code errors, compilation issues, and other problems encountered after using OpenAI's tools.

Initial architecture and data flows’ investigation

The first stage in rewriting a simple CRUD application using Generative AI is to conduct a thorough investigation of the existing architecture and data flow. This foundational step is crucial for understanding the current system's structure, dependencies, and business logic.

This involved:

codebase analysis
data flow mapping – from user inputs to database operations and back
dependency identification
business logic extraction – documenting the core business logic embedded within the application

While OpenAI's ChatGPT-4 is powerful, it has some limitations when dealing with large inputs or generating comprehensive explanations of entire projects. For example:

OpenAI couldn’t read files directly from the file system
Inputting several project files at once often resulted in unclear or overly general outputs

However, OpenAI excels at explaining large pieces of code or individual components. This capability aids in understanding the responsibilities of different components and their data flows. Despite this, developers had to conduct detailed investigations and analyses manually to ensure a complete and accurate understanding of the existing system.

This is the point at which we used our data analytics platform. In comparison to OpenAI, it focuses on data analysis. It's especially useful for analyzing data flows and project architecture, particularly thanks to its ability to process and visualize complex datasets. While it does not directly analyze source code, it can provide valuable insights into how data moves through a system and how different components interact.

Moreover, the platform excels at visualizing and analyzing data flows within your application. This can help identify inefficiencies, bottlenecks, and opportunities for optimization in the architecture.

Generating target application skeleton

As with OpenAI's inability to analyze the entire project, the attempt to generate the skeleton of the target application was also unsuccessful, so the developer had to manually create it. To facilitate this, Spring Initializr was used with the following configuration:

Java: 17
Spring Boot: 3.2.2
Gradle: 8.5

Attempts to query OpenAI for the necessary Spring dependencies faced challenges due to significant differences between dependencies for C# and Java projects. Consequently, all required dependencies were added manually.

Additionally, the project included a database setup. While OpenAI provided a series of steps for adding database configuration to a Spring Boot application, these steps needed to be verified and implemented manually.

Converting components

After setting up the backend, the next step involved converting all project files - Controllers, Services, and Data Access layers - from C# to Java Spring Boot using OpenAI.

The AI proved effective in converting endpoints and data access layers, producing accurate translations with only minor errors, such as misspelled function names or calls to non-existent functions.

In cases where non-existent functions were generated, OpenAI was able to create the function bodies based on prompts describing their intended functionality. Additionally, OpenAI efficiently generated documentation for classes and functions.

However, it faced challenges when converting components with extensive framework-specific code. Due to differences between frameworks in various languages, the AI sometimes lost context and produced unusable code.

Overall, OpenAI excelled at:

converting data access components
generating REST APIs

However, it struggled with:

service-layer components
framework-specific code where direct mapping between programming languages was not possible

Despite these limitations, OpenAI significantly accelerated the conversion process, although manual intervention was required to address specific issues and ensure high-quality code.

Generating tests

Generating tests for the new code is a crucial step in ensuring the reliability and correctness of the rewritten application. This involves creating both unit tests and integration tests to validate individual components and their interactions within the system.

To create a new test, the entire component code was passed to OpenAI with the query: "Write Spring Boot test class for selected code."

OpenAI performed well at generating both integration tests and unit tests; however, there were some distinctions:

For unit tests , OpenAI generated a new test for each if-clause in the method under test by default.
For integration tests , only happy-path scenarios were generated with the given query.
Error scenarios could also be generated by OpenAI, but these required more manual fixes due to a higher number of code issues.

If the test name is self-descriptive, OpenAI was able to generate unit tests with a lower number of errors.

Project 2: Microservice-based application

As an example of a microservice-based application, we used the Source microservice project - an application built using .Net Core as the framework, Entity Framework Core for the ORM, and a Command Query Responsibility Segregation (CQRS) approach for managing and querying entities. RabbitMQ was used to implement the CQRS approach and EventStore to store events and entity objects. Each microservice could be built using Docker, with docker-compose managing the dependencies between microservices and running them together.

The target project includes:

a microservice-based backend application created with Java 17 and Spring Boot 3
a frontend application using the React framework
Docker support for each microservice
docker-compose to run all microservices at once

Project stages

Similarly to the CRUD application rewriting project, converting a microservice-based application using Generative AI requires a series of steps to ensure a seamless transition from the old codebase to the new one. Below are the key steps undertaken during this process:

initial architecture and data flows’ investigation - conducting a thorough analysis of the existing application's architecture and data flow.
rewriting backend microservices - selecting an appropriate framework for implementing CQRS in Java, setting up a microservice skeleton, and translating the core business logic from the original language to Java Spring Boot.
generating a new frontend application - developing a new frontend application using React to communicate with the backend microservices via REST APIs.
generating tests for the frontend application - creating unit tests and integration tests to validate its functionality and interactions with the backend.
containerizing new applications - generating Docker files for each microservice and a docker-compose file to manage the deployment and orchestration of the entire application stack.

Throughout each step, developers were required to intervene manually to address code errors, compilation issues, and other problems encountered after using OpenAI's tools. This approach ensured that the new application retains the functionality and reliability of the original system while leveraging modern technologies and best practices.

Initial architecture and data flows’ investigation

The first step in converting a microservice-based application using Generative AI is to conduct a thorough investigation of the existing architecture and data flows. This foundational step is crucial for understanding:

the system’s structure
its dependencies
interactions between microservices

Challenges with OpenAI
Similar to the process for a simple CRUD application, at the time, OpenAI struggled with larger inputs and failed to generate a comprehensive explanation of the entire project. Attempts to describe the project or its data flows were unsuccessful because inputting several project files at once often resulted in unclear and overly general outputs.

OpenAI’s strengths
Despite these limitations, OpenAI proved effective in explaining large pieces of code or individual components. This capability helped in understanding:

the responsibilities of different components
their respective data flows

Developers can create a comprehensive blueprint for the new application by thoroughly investigating the initial architecture and data flows. This step ensures that all critical aspects of the existing system are understood and accounted for, paving the way for a successful transition to a modern microservice-based architecture using Generative AI.

Again, our data analytics platform was used in project architecture analysis. By identifying integration points between different application components, the platform helps ensure that the new application maintains necessary connections and data exchanges.

It can also provide a comprehensive view of your current architecture, highlighting interactions between different modules and services. This aids in planning the new architecture for efficiency and scalability. Furthermore, the platform's analytics capabilities support identifying potential risks in the rewriting process.

Rewriting backend microservices

Rewriting the backend of a microservice-based application involves several intricate steps, especially when working with specific architectural patterns like CQRS (Command Query Responsibility Segregation) and event sourcing . The source C# project uses the CQRS approach, implemented with frameworks such as NServiceBus and Aggregates , which facilitate message handling and event sourcing in the .NET ecosystem.

Challenges with OpenAI
Unfortunately, OpenAI struggled with converting framework-specific logic from C# to Java. When asked to convert components using NServiceBus, OpenAI responded:

"The provided C# code is using NServiceBus, a service bus for .NET, to handle messages. In Java Spring Boot, we don't have an exact equivalent of NServiceBus, but here's how you might convert the given C# code to Java Spring Boot..."

However, the generated code did not adequately cover the CQRS approach or event-sourcing mechanisms.

Choosing Axon framework
Due to these limitations, developers needed to investigate suitable Java frameworks. After thorough research, the Axon Framework was selected, as it offers comprehensive support for:

domain-driven design
CQRS
event sourcing

Moreover, Axon provides out-of-the-box solutions for message brokering and event handling and has a Spring Boot integration library , making it a popular choice for building Java microservices based on CQRS.

Converting microservices
Each microservice from the source project could be converted to Java Spring Boot using a systematic approach, similar to converting a simple CRUD application. The process included:

analyzing the data flow within each microservice to understand interactions and dependencies
using Spring Initializr to create the initial skeleton for each microservice
translating the core business logic, API endpoints, and data access layers from C# to Java
creating unit and integration tests to validate each microservice’s functionality
setting up the event sourcing mechanism and CQRS using the Axon Framework, including configuring Axon components and repositories for event sourcing

Manual Intervention
Due to the lack of direct mapping between the source project's CQRS framework and the Axon Framework, manual intervention was necessary. Developers had to implement framework-specific logic manually to ensure the new system retained the original's functionality and reliability.

Generating a new frontend application

The source project included a frontend component written using aspnetcore-https and aspnetcore-react libraries, allowing for the development of frontend components in both C# and React.

However, OpenAI struggled to convert this mixed codebase into a React-only application due to the extensive use of C#.

Consequently, it proved faster and more efficient to generate a new frontend application from scratch, leveraging the existing REST endpoints on the backend.

Similar to the process for a simple CRUD application, when prompted with “Generate React application which is calling a given endpoint” , OpenAI provided a series of steps to create a React application from a template and offered sample code for the frontend.

OpenAI successfully generated React components for each endpoint
The CSS files from the source project were reusable in the new frontend to maintain the same styling of the web application.
However, the overall structure and architecture of the frontend application remained the developer's responsibility.

Despite its capabilities, OpenAI-generated components often exhibited issues such as:

mixing up code from different React versions, leading to code failures.
infinite rendering loops.

Additionally, there were challenges related to CORS policy and web security:

OpenAI could not resolve CORS issues autonomously but provided explanations and possible steps for configuring CORS policies on both the backend and frontend
It was unable to configure web security correctly.
Moreover, since web security involves configurations on the frontend and multiple backend services, OpenAI could only suggest common patterns and approaches for handling these cases, which ultimately required manual intervention.

Generating tests for the frontend application

Once the frontend components were completed, the next task was to generate tests for these components. OpenAI proved to be quite effective in this area. When provided with the component code, OpenAI could generate simple unit tests using the Jest library.

OpenAI was also capable of generating integration tests for the frontend application, which are crucial for verifying that different components work together as expected and that the application interacts correctly with backend services.

However, some manual intervention was required to fix issues in the generated test code. The common problems encountered included:

mixing up code from different React versions, leading to code failures.
dependencies management conflicts, such as mixing up code from different test libraries.

Containerizing new application

The source application contained Dockerfiles that built images for C# applications. OpenAI successfully converted these Dockerfiles to a new approach using Java 17 , Spring Boot , and Gradle build tools by responding to the query:

"Could you convert selected code to run the same application but written in Java 17 Spring Boot with Gradle and Docker?"

Some manual updates, however, were needed to fix the actual jar name and file paths.

Once the React frontend application was implemented, OpenAI was able to generate a Dockerfile by responding to the query:

"How to dockerize a React application?"

Still, manual fixes were required to:

replace paths to files and folders
correct mistakes that emerged when generating multi-staged Dockerfiles , requiring further adjustments

While OpenAI was effective in converting individual Dockerfiles, it struggled with writing docker-compose files due to a lack of context regarding all services and their dependencies.

For instance, some microservices depend on database services, and OpenAI could not fully understand these relationships. As a result, the docker-compose file required significant manual intervention.

Conclusion

Modern tools like OpenAI's ChatGPT can significantly enhance software development productivity by automating various aspects of code writing and problem-solving. Leveraging large language models, such as OpenAI over ChatGPT can help generate large pieces of code, solve problems, and streamline certain tasks.

However, for complex projects based on microservices and specialized frameworks, developers still need to do considerable work manually, particularly in areas related to architecture, framework selection, and framework-specific code writing.

What Generative AI is good at:

converting pieces of code from one language to another - Generative AI excels at translating individual code snippets between different programming languages, making it easier to migrate specific functionalities.
generating large pieces of new code from scratch - OpenAI can generate substantial portions of new code, providing a solid foundation for further development.
generating unit and integration tests - OpenAI is proficient in creating unit tests and integration tests, which are essential for validating the application's functionality and reliability.
describing what code does - Generative AI can effectively explain the purpose and functionality of given code snippets, aiding in understanding and documentation.
investigating code issues and proposing possible solutions - Generative AI can quickly analyze code issues and suggest potential fixes, speeding up the debugging process.
containerizing application - OpenAI can create Dockerfiles for containerizing applications, facilitating consistent deployment environments.

At the time of project implementation, Generative AI still had several limitations .

OpenAI struggled to provide comprehensive descriptions of an application's overall architecture and data flow, which are crucial for understanding complex systems.
It also had difficulty identifying equivalent frameworks when migrating applications, requiring developers to conduct manual research.
Setting up the foundational structure for microservices and configuring databases were tasks that still required significant developer intervention.
Additionally, OpenAI struggled with managing dependencies, configuring web security (including CORS policies), and establishing a proper project structure, often needing manual adjustments to ensure functionality.

Benefits of using the data analytics platform:

data flow visualization: It provides detailed visualizations of data movement within applications, helping to map out critical pathways and dependencies that need attention during re-writing.
architectural insights : The platform offers a comprehensive analysis of system architecture, identifying interactions between components to aid in designing an efficient new structure.
integration mapping: It highlights integration points with other systems or components, ensuring that necessary integrations are maintained in the re-written application.
risk assessment: The platform's analytics capabilities help identify potential risks in the transition process, allowing for proactive management and mitigation.

By leveraging GenerativeAI’s strengths and addressing its limitations through manual intervention, developers can achieve a more efficient and accurate transition to modern programming languages and technologies. This hybrid approach to modernizing legacy applications with Generative AI currently ensures that the new application retains the functionality and reliability of the original system while benefiting from the advancements in modern software development practices.

It's worth remembering that Generative AI technologies are rapidly advancing, with improvements in processing capabilities. As Generative AI becomes more powerful, it is increasingly able to understand and manage complex project architectures and data flows. This evolution suggests that in the future, it will play a pivotal role in rewriting projects.

Do you need support in modernizing your legacy systems with expert-driven solutions?

.................

Sources:

https://www.verifiedmarketresearch.com/product/application-modernization-market/
https://www2.deloitte.com/content/dam/Deloitte/us/Documents/deloitte-analytics/us-ai-institute-state-of-ai-fifth-edition.pdf

written by

Viktar Reut

Automotive

EU Data Act

Building EU-compliant connected car software under the EU Data Act

The EU Data Act is about to change the rules of the game for many industries, and automotive OEMs are no exception. With new regulations aimed at making data generated by connected vehicles more accessible to consumers and third parties, OEMs are experiencing a major shift. So, what does this mean for the automotive space?

First, it means rethinking how data is managed, shared, and protected . OEMs must now meet new requirements for data portability, security, and privacy, using software compliant with the EU Data Act.

This guide will walk you through how they can prepare to not just survive but thrive under the new regulations.

The EU Data Act deadlines OEMs can’t miss

- Chapter II (B2B and B2C data sharing) has a deadline of September 2025.
- Article 3 (accessibility by design) has a deadline of September 2026.
- Chapter IV (contractual terms between businesses) has a deadline of September 2027.

Compliance requirements for automotive OEMs

The EU Data Act establishes specific obligations for automotive OEMs to ensure secure, transparent, and fair data sharing with both consumers (B2C) and third-party businesses (B2B). The following key provisions outline the requirements that OEMs must fulfill to comply with the Act.

B2C obligations

Data accessibility for users:
- Connected products, such as vehicles, must be built in a way that makes data generated by their use accessible in a structured, machine-readable format. This requirement applies from the manufacturing stage, meaning the design process must incorporate data accessibility features.
User control over data:
- Users should have the ability to control how their data is used, including the right to share it with third parties of their choice. This requires OEMs to implement systems that allow consumers to grant and revoke access to their data seamlessly.
Transparency in data practices:
- OEMs are required to provide clear and transparent information about the nature and volume of collected data and the way to access it.
- When a user requests to make data available to a third party, the OEM must inform them about:

a) The identity of the third party

b) The purpose of data use

c) The type of data that will be shared

d) The right of the user to withdraw consent for the third party to access the data

B2B obligations

1. Fair access to data:

OEMs must ensure that data generated by connected products is accessible to third parties at the user’s request under fair, reasonable, and non-discriminatory conditions.
This means that data sharing cannot be restricted to certain partners or proprietary platforms; it must be available to a broad range of businesses, including independent repair shops, insurers, and fleet managers.

2. Compliance with security and privacy regulations:

While sharing non-personal data, OEMs must still comply with relevant data security and privacy regulations. This means that data must be protected from unauthorized access and that any data-sharing agreements are in line with the EU Data Act and GDPR.

3. Protection of trade secrets

OEMs have a right and obligation to protect their trade secrets and should only disclose them when necessary to meet the agreed purpose. This means identifying protected data, agreeing on confidentiality measures with third parties, and suspending data sharing if these measures are not properly followed or if sharing would cause significant economic harm.

Understanding the specific obligations is only the first step for automotive OEMs. Based on this information, they can build software compliant with the EU Data Act. To navigate these new requirements effectively, OEMs need to adopt an approach that not only meets regulatory demands but also strengthens their competitive edge.

Thriving under the EU Data Act: smart investments and privacy-first strategies

Automotive OEMs must take a strategic approach to both their software and operational frameworks, balancing compliance requirements with innovation and customer trust. The key is to prioritize solutions that improve data accessibility and governance while minimizing costs. This starts with redesigning connected products and services to align with the Act’s data-sharing mandates and creating solutions to handle data requests efficiently.

Another critical focus is balancing privacy concerns with data-sharing obligations . OEMs must handle non-personal data responsibly under the EU Data Act while managing personal data in accordance with GDPR. This includes providing transparency about data usage and giving customers control over their data.

To achieve this balance, OEMs should identify which data needs to be shared with third parties and integrate privacy considerations across all stages of product development and data management. Transparent communication about data use, supported by clear policies and customer controls, helps to reinforce this trust.

Opportunities under the EU Data Act

The EU Data Act presents compliance challenges, but it also offers significant opportunities for OEMs that are prepared to adapt. By meeting the Act’s requirements for fair data sharing, OEMs can expand their services and build new partnerships. While direct monetization from data access fees is limited, there are numerous opportunities to leverage shared data to develop new value-added services and improve operational efficiency.

Next steps for automotive OEMs

To move to implementation, OEMs must take targeted actions that address the compliance requirements outlined earlier. These steps lay the groundwork for integrating broader strategies and turning compliance efforts into opportunities for operational improvement and future growth.

Integrate data accessibility into vehicle design

Start integrating data accessibility into vehicle design now to comply by 2026. This involves adapting both front and back-end components of products and services to enable secure and seamless data access and transfer according to the EU Data Act.

Provide user and third-party access to generated data

Introduce easy-to-use mechanisms that let users request access to their data or share it with chosen third parties. Access control should be straightforward, involving simple user identification and making data accessible to authorized users upon request. Develop dedicated data-sharing solutions, applications, or portals that enable third parties to request access to data with user consent.

Implement trade secret protection measures

OEMs should protect their trade secrets by identifying which vehicle data is commercially sensitive. Implement measures like data encryption and access controls to safeguard this information when sharing data. Clearly communicate your approach to protecting trade secrets without disclosing the sensitive information itself.

Implement transparent and secure data handling

Provide clear information to users about what data is collected, how it is used, and with whom it is shared. Transparent data practices help build trust and align with users' data rights under the EU Data Act.

Remember about the non-personal data that is being collected, too. Apply appropriate measures to preserve data quality and prevent its unauthorized access, transfer, or use.

Enable data interoperability and portability

The Act sets out essential requirements to facilitate the interoperability of data and data-sharing mechanisms, with a strong emphasis on data portability. OEMs need to make their data systems compatible with third-party services, allowing data to be easily transferred between platforms.

For example, if a car owner wants to switch from an OEM-provided app to a third-party app for vehicle diagnostics, OEMs must not create technical, contractual, or organizational barriers that would make this switch difficult.

Prepare the data

Choose a data format that fulfills the EU Data Act’s requirement for data to be shared in a “commonly used and machine-readable format.” This approach supports data accessibility and usability across different platforms and services.

Moving forward with confidence

The EU Data Act is bringing new obligations but also offering valuable opportunities. Navigating these changes may seem challenging, but with the right approach, they can become a catalyst for growth.

‍

written by

Adam Kozłowski

written by

Marcin Wiśniewski

Automotive

Software development

AAOS 14 - Surround view parking camera: How to configure and launch exterior view system

EVS - park mode

The Android Automotive Operating System (AAOS) 14 introduces significant advancements, including a Surround View Parking Camera system. This feature, part of the Exterior View System (EVS), provides a comprehensive 360-degree view around the vehicle, enhancing parking safety and ease. This article will guide you through the process of configuring and launching the EVS on AAOS 14 .

Structure of the EVS system in Android 14

The Exterior View System (EVS) in Android 14 is a sophisticated integration designed to enhance driver awareness and safety through multiple external camera feeds. This system is composed of three primary components: the EVS Driver application, the Manager application, and the EVS App. Each component plays a crucial role in capturing, managing, and displaying the images necessary for a comprehensive view of the vehicle's surroundings.

EVS driver application

The EVS Driver application serves as the cornerstone of the EVS system, responsible for capturing images from the vehicle's cameras. These images are delivered as RGBA image buffers, which are essential for further processing and display. Typically, the Driver application is provided by the vehicle manufacturer, tailored to ensure compatibility with the specific hardware and camera setup of the vehicle.

To aid developers, Android 14 includes a sample implementation of the Driver application that utilizes the Linux V4L2 (Video for Linux 2) subsystem. This example demonstrates how to capture images from USB-connected cameras, offering a practical reference for creating compatible Driver applications. The sample implementation is located in the Android source code at packages/services/Car/cpp/evs/sampleDriver .

Manager application

The Manager application acts as the intermediary between the Driver application and the EVS App. Its primary responsibilities include managing the connected cameras and displays within the system.

Key Tasks :

Camera Management : Controls and coordinates the various cameras connected to the vehicle.
Display Management : Manages the display units, ensuring the correct images are shown based on the input from the Driver application.
Communication : Facilitates communication between the Driver application and the EVS App, ensuring a smooth data flow and integration.

EVS app

The EVS App is the central component of the EVS system, responsible for assembling the images from the various cameras and displaying them on the vehicle's screen. This application adapts the displayed content based on the vehicle's gear selection, providing relevant visual information to the driver.

For instance, when the vehicle is in reverse gear (VehicleGear::GEAR_REVERSE), the EVS App displays the rear camera feed to assist with reversing maneuvers. When the vehicle is in park gear (VehicleGear::GEAR_PARK), the app showcases a 360-degree view by stitching images from four cameras, offering a comprehensive overview of the vehicle’s surroundings. In other gear positions, the EVS App stops displaying images and remains in the background, ready to activate when the gear changes again.

The EVS App achieves this dynamic functionality by subscribing to signals from the Vehicle Hardware Abstraction Layer (VHAL), specifically the VehicleProperty::GEAR_SELECTION . This allows the app to adjust the displayed content in real-time based on the current gear of the vehicle.

Communication interface

Communication between the Driver application, Manager application, and EVS App is facilitated through the IEvsEnumerator HAL interface. This interface plays a crucial role in the EVS system, ensuring that image data is captured, managed, and displayed accurately. The IEvsEnumerator interface is defined in the Android source code at hardware/interfaces/automotive/evs/1.0/IEvsEnumerator.hal .

EVS subsystem update

Evs source code is located in: packages/services/Car/cpp/evs. Please make sure you use the latest sources because there were some bugs in the later version that cause Evs to not work.

cd packages/services/Car/cpp/evs git checkout main git pull mm adb push out/target/product/rpi4/vendor/bin/hw/android.hardware.automotive.evs-default /vendor/bin/hw/ adb push out/target/product/rpi4/system/bin/evs_app /system/bin/

EVS driver configuration

To begin, we need to configure the EVS Driver. The configuration file is located at /vendor/etc/automotive/evs/evs_configuration_override.xml .

Here is an example of its content:

In this configuration, two cameras are defined: /dev/video0 (rear) and /dev/video2 (front). Both cameras have one stream defined with a resolution of 1280 x 720, a frame rate of 30, and an RGBA format.

Additionally, there is one display defined with a resolution of 1280 x 800, a frame rate of 30, and an RGBA format.

Configuration details

The configuration file starts by specifying the number of cameras available to the EVS system. This is done within the <system> tag, where the <num_cameras> tag sets the number of cameras to 2.

Each camera device is defined within the <camera> tag. For example, the rear camera ( /dev/video0 ) is defined with various capabilities such as brightness, contrast, auto white balance, and more. These capabilities are listed under the <supported_controls> tag. Similarly, the front camera ( /dev/video2 ) is defined with the same set of controls.

Both cameras also have their supported stream configurations listed under the <stream> tag. These configurations specify the resolution, format, and frame rate of the video streams.

The display device is defined under the <display> tag. The display configuration includes supported input stream configurations, specifying the resolution, format, and frame rate.

EVS driver operation

When the EVS Driver starts, it reads this configuration file to understand the available cameras and display settings. It then sends this configuration information to the Manager application. The EVS Driver will wait for requests to open and read from the cameras, operating according to the defined configurations.

EVS app configuration

Configuring the EVS App is more complex. We need to determine how the images from individual cameras will be combined to create a 360-degree view. In the repository, the file packages/services/Car/cpp/evs/apps/default/res/config.json.readme contains a description of the configuration sections:

{ "car" : { // This section describes the geometry of the car "width" : 76.7, // The width of the car body "wheelBase" : 117.9, // The distance between the front and rear axles "frontExtent" : 44.7, // The extent of the car body ahead of the front axle "rearExtent" : 40 // The extent of the car body behind the rear axle }, "displays" : [ // This configures the dimensions of the surround view display { // The first display will be used as the default display "displayPort" : 1, // Display port number, the target display is connected to "frontRange" : 100, // How far to render the view in front of the front bumper "rearRange" : 100 // How far the view extends behind the rear bumper } ], "graphic" : { // This maps the car texture into the projected view space "frontPixel" : 23, // The pixel row in CarFromTop.png at which the front bumper appears "rearPixel" : 223 // The pixel row in CarFromTop.png at which the back bumper ends }, "cameras" : [ // This describes the cameras potentially available on the car { "cameraId" : "/dev/video32", // Camera ID exposed by EVS HAL "function" : "reverse,park", // Set of modes to which this camera contributes "x" : 0.0, // Optical center distance right of vehicle center "y" : -40.0, // Optical center distance forward of rear axle "z" : 48, // Optical center distance above ground "yaw" : 180, // Optical axis degrees to the left of straight ahead "pitch" : -30, // Optical axis degrees above the horizon "roll" : 0, // Rotation degrees around the optical axis "hfov" : 125, // Horizontal field of view in degrees "vfov" : 103, // Vertical field of view in degrees "hflip" : true, // Flip the view horizontally "vflip" : true // Flip the view vertically } ] }

The EVS app configuration file is crucial for setting up the system for a specific car. Although the inclusion of comments makes this example an invalid JSON, it serves to illustrate the expected format of the configuration file. Additionally, the system requires an image named CarFromTop.png to represent the car.

In the configuration, units of length are arbitrary but must remain consistent throughout the file. In this example, units of length are in inches.

The coordinate system is right-handed: X represents the right direction, Y is forward, and Z is up, with the origin located at the center of the rear axle at ground level. Angle units are in degrees, with yaw measured from the front of the car, positive to the left (positive Z rotation). Pitch is measured from the horizon, positive upwards (positive X rotation), and roll is always assumed to be zero. Please keep in mind that, unit of angles are in degrees, but they are converted to radians during configuration reading. So, if you want to change it in EVS App source code, use radians.

This setup allows the EVS app to accurately interpret and render the camera images for the surround view parking system.

The configuration file for the EVS App is located at /vendor/etc/automotive/evs/config_override.json . Below is an example configuration with two cameras, front and rear, corresponding to our driver setup:

{ "car": { "width": 76.7, "wheelBase": 117.9, "frontExtent": 44.7, "rearExtent": 40 }, "displays": [ { "_comment": "Display0", "displayPort": 0, "frontRange": 100, "rearRange": 100 } ], "graphic": { "frontPixel": -20, "rearPixel": 260 }, "cameras": [ { "cameraId": "/dev/video0", "function": "reverse,park", "x": 0.0, "y": 20.0, "z": 48, "yaw": 180, "pitch": -10, "roll": 0, "hfov": 115, "vfov": 80, "hflip": false, "vflip": false }, { "cameraId": "/dev/video2", "function": "front,park", "x": 0.0, "y": 100.0, "z": 48, "yaw": 0, "pitch": -10, "roll": 0, "hfov": 115, "vfov": 80, "hflip": false, "vflip": false } ] }

Running EVS

Make sure all apps are running:

ps -A | grep evs automotive_evs 3722 1 11007600 6716 binder_thread_read 0 S evsmanagerd graphics 3723 1 11362488 30868 binder_thread_read 0 S android.hardware.automotive.evs-default automotive_evs 3736 1 11068388 9116 futex_wait 0 S evs_app

To simulate reverse gear you can call:

evs_app --test --gear reverse

And park:

evs_app --test --gear park

EVS app should be displayed on the screen.

Troubleshooting

When configuring and launching the EVS (Exterior View System) for the Surround View Parking Camera in Android AAOS 14, you may encounter several issues.

To debug that, you can use logs from EVS system:

logcat EvsDriver:D EvsApp:D evsmanagerd:D *:S

Multiple USB cameras - image freeze

During the initialization of the EVS system, we encountered an issue with the image feed from two USB cameras. While the feed from one camera displayed smoothly, the feed from the second camera either did not appear at all or froze after displaying a few frames.

We discovered that the problem lay in the USB communication between the camera and the V4L2 uvcvideo driver. During the connection negotiation, the camera reserved all available USB bandwidth. To prevent this, the uvcvideo driver needs to be configured with the parameter quirks=128 . This setting allows the driver to allocate the USB bandwidth based on the actual resolution and frame rate of the camera.

To implement this solution, the parameter should be set in the bootloader, within the kernel command line, for example:

console=ttyS0,115200 no_console_suspend root=/dev/ram0 rootwait androidboot.hardware=rpi4 androidboot.selinux=permissive uvcvideo.quirks=128

After applying this setting, the image feed from both cameras should display smoothly, resolving the freezing issue.

Green frame around camera image

In the current implementation of the EVS system, the camera image is surrounded by a green frame, as illustrated in the following image:

To eliminate this green frame, you need to modify the implementation of the EVS Driver. Specifically, you should edit the GlWrapper.cpp file located at cpp/evs/sampleDriver/aidl/src/ .

In the void GlWrapper::renderImageToScreen() function, change the following lines:

-0.8, 0.8, 0.0f, // left top in window space 0.8, 0.8, 0.0f, // right top -0.8, -0.8, 0.0f, // left bottom 0.8, -0.8, 0.0f // right bottom

-1.0, 1.0, 0.0f, // left top in window space 1.0, 1.0, 0.0f, // right top -1.0, -1.0, 0.0f, // left bottom 1.0, -1.0, 0.0f // right bottom

After making this change, rebuild the EVS Driver and deploy it to your device. The camera image should now be displayed full screen without the green frame.

Conclusion

In this article, we delved into the intricacies of configuring and launching the EVS (Exterior View System) for the Surround View Parking Camera in Android AAOS 14. We explored the critical components that make up the EVS system: the EVS Driver, EVS Manager, and EVS App, detailing their roles and interactions.

The EVS Driver is responsible for providing image buffers from the vehicle's cameras, leveraging a sample implementation using the Linux V4L2 subsystem to handle USB-connected cameras. The EVS Manager acts as an intermediary, managing camera and display resources and facilitating communication between the EVS Driver and the EVS App. Finally, the EVS App compiles the images from various cameras, displaying a cohesive 360-degree view around the vehicle based on the gear selection and other signals from the Vehicle HAL.

Configuring the EVS system involves setting up the EVS Driver through a comprehensive XML configuration file, defining camera and display parameters. Additionally, the EVS App configuration, outlined in a JSON file, ensures the correct mapping and stitching of camera images to provide an accurate surround view.

By understanding and implementing these configurations, developers can harness the full potential of the Android AAOS 14 platform to enhance vehicle safety and driver assistance through an effective Surround View Parking Camera system. This comprehensive setup not only improves the parking experience but also sets a foundation for future advancements in automotive technology.

written by

Michał Jaskurzyński

Stay updated with our newsletter

Subscribe for fresh insights and industry analysis.