
How to enable data-driven innovation for the mobility insurance

Grape up Expert
November 24, 2025
•
5 min read

 Digitalization has changed the way we shop, work, learn and take care of our health or travel. Cars are no longer used just to get from A to B. They are jam-packed with technology that connects us to the world, enhances safety, prevents breakdowns, and even provides entertainment. With the rise of the Internet of Things and artificial intelligence, a vehicle is no longer understood solely in terms of its performance and sleek design. It has become software on wheels, a gateway to new worlds - not just physical, but also virtual. And if the nature of insurance itself is changing, then the company offering insurance must keep up with these changes as well. Insurance needs digital innovation, as much as any other market area.


These days customers are looking for customization, personalization, and an understanding of their needs on an almost organic level. Data and advanced analytics allow us to satisfy these needs effectively. Thanks to them, it is possible to fine-tune the offer, not so much for a specific group as for a particular person - their habits, daily schedule, interests, health restrictions, or aesthetic preferences. And in the case described here - a person's driving style and commuting patterns.

If you think about it, the insurer has the perfect tool in their hands. If they can tap into the potential of the software-defined vehicle and equip it with the right applications, the chance of inaccurate insurance risk estimates drops sharply. Data gives a factual, not imaginary, picture of a driver's style and behavior on the road.

While in the traditional insurance model pricing is static and data is collected offline and not aligned with the driver's actual preferences, new technologies such as the cloud, the IoT, and AI allow these limitations to be lifted.

With them, an offering is created that competes in the marketplace, generates new revenue streams within the company, and builds customer loyalty.

Data-driven innovation - easier said than done. Or maybe not?

The transformation of a vehicle from a traditionally understood mechanical device into a "smartphone on four wheels," as Akio Toyoda once said about modern vehicles, takes time and will not happen overnight. But it is advancing year by year, and the new models released by the major manufacturers show that the process is well underway.

The so-called software-defined vehicle that we are developing with our clients at Grape Up is a vehicle that moves through an ecosystem of numerous variables, accessed by different players and technologies.
Clearly, one such player can be - and should be - the insurer, whose products have been tied to the automotive market since 1897, when a certain Gilbert J. Loomis, a resident of Dayton, Ohio, first purchased an automotive liability insurance policy.

However, for insurance companies to play an integral role in the use of vehicle-generated data, the driver must receive a precisely functioning and secure service from which they will derive real benefits. Without building specific technical competencies and software-defined vehicle knowledge, the insurer cannot achieve these goals.

Only by creating this type of business unit from scratch in-house, or by partnering with software companies, will they be able to compete with insurtech startups such as Lemonade, which build their businesses from the ground up based on AI and data analytics.

The right technology partner will take care of:

  •  data security;
  •  selection of cloud and IoT technologies;
  •  the reliability and scalability of the proposed solutions.

During this time, the insurer can focus on what they do best - developing insurance competencies and tweaking their offers.

How to choose the right technology partner?

Just as customers are looking for insurance that accommodates their driving and lifestyle, an insurance company should select a technology partner that has more than just technical skills to offer. After all, changing the model in which a traditional insurance company operates does not boil down to creating a digital sales channel on the Internet and launching a modern website. We are talking about a completely different scale of operations requiring the insurance company to be embedded in a completely new, rapidly developing environment.

Therefore, they need a partner who naturally navigates the software-defined vehicle ecosystem, understands its specifics, and has experience in working with the automotive industry. The partner should also be knowledgeable about the specifics of the P&C insurance market and the challenges faced by the insurance client.

It is only at the intersection of these three areas: technology, automotive, and insurance, that competencies are built to effectively compete against modern insurtechs.

Like in the Japanese philosophy of ikigai, which explains how to find one's sense of purpose and give meaning to one's work, both companies can build valuable, useful solutions for users. They will bring satisfaction not only to customers but also to the insurance company, which will open a new revenue channel and meet the needs of the market.

How AI is transforming automotive and car insurance

The car insurance industry is experiencing a real revolution today. Insurers are targeting their offers more and more carefully using AI and machine learning. Such innovations significantly enhance business efficiency, reduce the risk of accidents and their consequences, and enable adaptation to modern realities.

Changes are needed today

Approximately $25 billion is "frozen" with insurers annually due to problems such as fraud, claims adjustment, delays in service garages, etc. Customers, for their part, are not always happy with the insurance amounts they receive and the fact that they often have to accept undervalued rates. The reason for this is that, due to limited data, it is difficult to accurately identify the culprit of the incident. It is also often the case that compensation is based on rates lower than the actual value of the damage.

Insurers today need to be aware of the ecosystem in which they operate. Clients are becoming more demanding and, according to an IBM Institute for Business Value (IBV) study, 50 percent of them prefer tailor-made products based on individual quotes. The very model of cooperation between businesses is also changing, as relations between insurance providers and car manufacturers are growing tighter. All of this is linked to the fact that cars are becoming increasingly autonomous, which makes it possible to monitor traffic incidents and driver behavior more closely and to manage risk. Estimates suggest there will be as many as one trillion connected devices by 2025, and by 2030 an increasing percentage of vehicles will have automated features (ADAS).

No wonder there's an increasing buzz about changes in the car insurance industry. And these are changes based on technology. The use of  artificial intelligence , machine learning, and  advanced data analytics in the cloud will allow for seamless adaptation to market expectations.

 CASE STUDY

 SARA Assicurazioni and Automobile Club Italia are already encouraging drivers to install ADAS systems in exchange for a 20% discount on their insurance premiums. Indeed, it has been demonstrated that such systems can slash the rate of liability claims for personal injury by 4-25% and by 7-22% for property damage.

Why is this so important for insurers who want to face the reality?

Artificial intelligence-based pricing models significantly reduce the time needed to introduce new offerings and to make optimal decisions. They also lower the risk of mispricing.

The new  AI-based insurance reality is happening as we speak. The digital-first companies like Lemonade, with their high flexibility in responding to market changes, are showing customers what solutions are feasible.  In doing so, they put pressure on those companies that still hesitate to test new models.

Areas of change in car insurance due to AI

Artificial intelligence and related technologies are having a huge impact on many aspects of  the insurance industry : quoting, underwriting, distribution, risk and claims management, and more.

Changes in insurance distribution

Artificial intelligence algorithms smoothly create risk profiles so that the time required to purchase a policy is reduced to minutes. Smart contracts based on blockchain instantly authenticate payments from an online account. At the same time, contract processing and payment verification is also vastly streamlined, reducing insurers' client acquisition cost.

Advanced risk assessment and reliable pricing

Traditionally, insurance premiums are determined using the "cost-plus" method. This includes an actuarial assessment of the risk premium, a component for direct and indirect costs, and a margin. Yet it has quite a few drawbacks.

One of them is the inability to easily account for non-technical price determinants, as well as the inability to react quickly to shifting market conditions.
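The cost-plus method described above can be sketched in a few lines. This is a minimal illustration, not an actuarial model: the class and function names are hypothetical, the margin is assumed to be applied as a percentage on top of total cost, and all amounts are made up.

```python
from dataclasses import dataclass


@dataclass
class CostPlusInputs:
    expected_claim_cost: float  # actuarial risk premium (expected annual loss)
    direct_costs: float         # e.g. acquisition and claims handling
    indirect_costs: float       # e.g. overhead allocated to the policy
    margin_rate: float          # assumed: margin as a fraction of total cost


def cost_plus_premium(inputs: CostPlusInputs) -> float:
    """Premium = (risk premium + direct costs + indirect costs) * (1 + margin)."""
    total_cost = (
        inputs.expected_claim_cost + inputs.direct_costs + inputs.indirect_costs
    )
    return round(total_cost * (1.0 + inputs.margin_rate), 2)


# Illustrative numbers only.
example = CostPlusInputs(
    expected_claim_cost=400.0,
    direct_costs=80.0,
    indirect_costs=20.0,
    margin_rate=0.10,
)
print(cost_plus_premium(example))
```

Nothing in this calculation looks at how a particular driver actually behaves or how the market is moving, which is the drawback the text points to.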

How is risk calculated? For car insurance companies, the assessment refers to accidents, road crashes, breakdowns, theft, and fatalities.

These days, all these aspects can be controlled by leveraging AI, coupled with IoT data that provides real-time insights. Customized pricing of policies, for instance, can take into account GPS device data on a vehicle's location, speed, and distance traveled. This way, you can see whether the vehicle spends most of its time in the driveway or if, conversely, it frequently travels on highways, particularly at excessive speeds.

In addition, insurance companies can use a host of other  sensor and camera data, as well as reports and documents from previous claims. Having all this information gathered, algorithms are able to reliably determine risk profiles.
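To make the idea of usage-based pricing concrete, here is a small sketch that turns trip records into a multiplier for a base premium. The `Trip` fields, the weights, and the 1,000 km baseline are all hypothetical; a real insurer would calibrate such a rule, or replace it with a trained model, on its own claims data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trip:
    distance_km: float
    max_speed_kmh: float
    speed_limit_kmh: float


def usage_factor(trips: List[Trip], baseline_km: float = 1000.0) -> float:
    """Return a multiplier for a base premium from one month of trips.

    Hypothetical rule: mileage moves the factor between 0.8 and 1.2,
    and the share of trips with speeding adds up to 0.3 on top.
    """
    if not trips:
        return 0.8  # the car stayed in the driveway all month
    total_km = sum(t.distance_km for t in trips)
    mileage_part = 0.8 + 0.4 * min(total_km / baseline_km, 1.0)
    speeding_trips = sum(1 for t in trips if t.max_speed_kmh > t.speed_limit_kmh)
    speeding_part = 0.3 * speeding_trips / len(trips)
    return round(mileage_part + speeding_part, 3)


month = [
    Trip(distance_km=250, max_speed_kmh=110, speed_limit_kmh=120),
    Trip(distance_km=250, max_speed_kmh=150, speed_limit_kmh=120),
]
print(usage_factor(month))
```

A vehicle that rarely moves ends up near the lower bound, while frequent highway trips at excessive speed push the factor up.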

 CASE STUDY

 Ant Financial, a Chinese company that offers an ecosystem of merged digital products and services, specializes in creating highly detailed customer profiles. Their technology is based on artificial intelligence algorithms that assign car insurance points to each customer, similarly to credit scoring. They take into account such detailed factors as lifestyle and habits. Based on this, the app shows an individual score, assigning a product that matches the specific policyholder.

An in-depth analysis of claims

The cooperation between an insurance company and its client is based on the premise that both parties seek to avoid potential losses. Unfortunately, accidents, breakdowns, or thefts sometimes occur and a claims process must be started. Artificial intelligence, integrated IoT data, and telematics come in handy irrespective of the type of claim being handled.

  •  These technologies are suitable for, among other things, automatically generating not only damage information but also repair cost estimates.
  •  Machine learning techniques can estimate the average cost of claims for various client segments.
  •  Sending real-time alerts, in turn, enables the implementation of predictive maintenance.
  •  Once an image has been uploaded, an extensive database of parts and prices can be created.

The drivers themselves gain control as they can carry out the process of registering the damage from A to Z:  take a photo, upload it to the insurer's platform and get an instant quote for the repair costs. From now on, they are no longer reliant on workshop quotes, which were often highly overestimated in line with the principle: "the insurer will pay anyway".

Fraud prevention

Auto insurers lose 29 billion dollars annually to fraud. Fraudsters try to scam a company out of insurance money based on illegally orchestrated events. How to prevent this? The answer is AI.

 Analyzed data retrieved from cameras and sensors can reconstruct the details of a car accident with high precision. So, having an accident timeline generated by artificial intelligence facilitates accident investigation and claims management.

 CASE STUDY

 An advanced AI-based incident reconstruction has been tested lately on 200,000 vehicles as part of a collaboration between Israel's Project Nexar and a Japanese insurance company.

Assistance in the event of accidents

According to data from the OECD, car accident fatalities could be reduced by 44 percent if emergency medical services had access to real-time information about the injuries of involved parties.
Still, real-time assistance has great potential not only for public services but also in the context of auto insurance.

By leveraging AI here, insurers can provide drivers with quick and semi-automated responses during collisions and accidents. For example, a chatbot can instruct the driver on how to behave, how to call for help, or how to help fellow passengers. All this is essential in the context of saving lives. At the same time, it is a way of reducing the consequences of an accident.

Transparent decision making (client perspective)

New technologies offer solutions to many problems not only for insurers but also for clients. The latter often complain about discrimination and unfair, from their point of view, calculations of policies and compensation.

"Smart automated gatekeepers" are superior in multiple ways to the imperfect solutions of traditional models. This is because, based on a number of reliable parameters, they facilitate the creation of more authoritative and personalized pricing policies. Data-rich and automated risk and damage assessments pay off for consumers because they have decision-making power based on how their actions affect insurance coverage.

The opportunities and future of AI in car insurance

McKinsey's analysis says that across functions and use cases AI investments are worth $1.1 trillion in potential annual value for the insurance industry.

The direction of change is set by two factors: first, increasingly connected and software-equipped vehicles with more sensors; second, the changing analytical skills of insurers. Data-rich vehicles will make repair cost estimates and, consequently, claims payments more reliable and consistent in real time. And when it comes to planning offers and understanding the client, AI is an enabler of change for personalized, real-time service (24/7 virtual assistance) and for creating flexible policies. All signs indicate that such "abstract" parameters as education or earnings will cease to play a major role in this regard.

Tech impacting insurtech

As can be inferred from the diagram above, the greater the  impact of a given technology on an insurance company's business , the longer the time required for its implementation. Therefore, it is vital to consider the future on a macro scale, by planning the strategy not for 2 years, but for 10.

 The decisions you make today have a bearing on improving operational efficiency, minimizing costs, and opening up to individual client needs, which are becoming more and more coupled with digital technologies.

How more connected vehicles on the road will impact the insurance industry

By 2023, there will be over 350 million connected cars on the road. What can the insurance industry do about it? Quite a bit, it turns out, as automotive companies introducing the latest technological advances are enabling new ways to monitor driver behavior. This matters greatly for creating offers, and not only that: at stake is insurers' position and competitiveness in the field of motor insurance.

The automotive and car insurance industries are changing

The automotive market is already experiencing changes driven by innovative technologies. More often than not, these are based on the  software-defined vehicle (SDV) trend.

If the vehicle is equipped with embedded connectivity, it is able to provide very detailed vehicle and driver behavior data, such as:

● sudden acceleration or braking,
● taking sharp turns,
● peak activity times (nighttime drivers are more vulnerable),
● average speed and acceleration,
● performing dangerous maneuvers.
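As an illustration of how the first item on this list could be derived from raw telemetry, the sketch below counts sharp speed changes in evenly sampled speed readings. The function name, the one-second sampling interval, and the 3 m/s² threshold are assumptions for the example, not values from any specific telematics product.

```python
from typing import List, Tuple


def harsh_events(
    speeds_kmh: List[float],
    dt_s: float = 1.0,
    threshold_ms2: float = 3.0,
) -> Tuple[int, int]:
    """Count (harsh_accelerations, harsh_brakings) between consecutive samples.

    speeds_kmh holds speed readings taken every dt_s seconds. Each
    sample-to-sample change beyond the threshold is counted once.
    """
    accelerations = 0
    brakings = 0
    for previous, current in zip(speeds_kmh, speeds_kmh[1:]):
        # Convert the km/h difference to m/s, then divide by elapsed time.
        acceleration = (current - previous) / 3.6 / dt_s
        if acceleration >= threshold_ms2:
            accelerations += 1
        elif acceleration <= -threshold_ms2:
            brakings += 1
    return accelerations, brakings


print(harsh_events([50, 52, 65, 66, 50, 49]))
```

In this sample trace, the jump from 52 to 65 km/h and the drop from 66 to 50 km/h within one second each exceed the threshold.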

BBI & UBI and ADAS

Behavior-based insurance (BBI, pay-how-you-drive) and usage-based insurance (UBI, pay-as-you-drive) are the future of car insurance programs. Meanwhile, as vehicles become smarter, more connected, and automated, insurers evaluate not only the driver's behavior but also the car they are driving. This evaluation takes into account, among other things, the number of advanced driver assistance systems (ADAS) that affect the safety of the vehicle's occupants.

Autonomous vehicles

Deloitte analysts also note that self-driving cars (AVs), which are an interesting novelty now but will in time be a standard on par with human-driven vehicles, are likely to force fundamental changes in insurers' product ranges, as well as in risk assessment, pricing, and business models.

Connected cars

Change is already happening, and it will become even more pronounced in the years ahead. IoT Analytics predicts that by 2025, the total  number of IoT devices worldwide will exceed 27 billion. Plus,  experts predict that there will be 7.2 billion active smartphones and more than 400 million connected vehicles on the road during the same period.

This all clearly shows that we are in an entirely different reality than we were just a few or a dozen years ago.  Car insurers need to understand this if they want to maintain their foothold.

Telematics technologies are an obvious step into the future of the insurance industry

Insurance companies have been offering usage-based and behavior-based products for years based on data from either additional devices or mobile apps. This is a fast-growing product area, since the UBI market is predicted to be worth more than $105 billion in 2027, up 23.61% annually.

 The best position in this arena is attained by businesses that started investing in telematics technology early and now can take pride in well-developed telematics products.

We are talking about brands such as  State Farm®, Nationwide, Allstate, and Progressive. Yet at the same time, companies that deemed telematics a passing trend and therefore didn't invest in it lost a very large amount of market share. The result? Now they have to catch up and race to keep up with the competition.

TSPs understand the potential of connected vehicle data

Insuring companies are not the only ones who recognize the importance of implementing their telematics-based solutions.  Telematics services providers understand that value as well, so they invest in building out new capabilities of their products.

One insurer now catching up is GEICO, the second-largest auto insurer in the U.S. (right after Progressive). As Ajit Jain, vice chairman of Insurance Operations at Berkshire Hathaway, puts it: "GEICO had clearly missed the business and were late in terms of appreciating the value of telematics. They have woken up to the fact that telematics plays a big role in matching rate to risk. They have a number of initiatives, and, hopefully, they will see the light of day before, not too long, and that'll allow them to catch up with their competitors, in terms of the issue of matching rate to risk."

Telematics companies see potential in partnering with the insurance industry

Insurance companies are not the only ones who recognize the importance of implementing new data-driven technology solutions. The relationship is two-way, as telematics industry representatives, in turn,  are willing to invest in collaboration with insurers and put the customer from this market sector first.

For example, Cambridge Mobile Telematics (CMT), the world's largest telematics provider, has recently announced the expansion of its proprietary DriveWell® telematics platform to networked vehicles. Their flagship software has previously collected sensor data from millions of IoT devices, including smartphones, tags, in-car cameras, third-party devices, etc. From now on, that scope continues to expand by specifically including connected vehicles to create a unified view of driver and vehicle behavioral risk.

This synergy of all acquired data is mainly dedicated to customers in the auto insurance industry, who gain insight into what is happening on the road and behind the wheel. As Hari Balakrishnan, CTO and founder of CMT, explains: "There is a wave of innovative IoT data sources coming that will be critical to understanding driving risk and lowering crash rates. CMT fuses these disparate data sources to produce a unified view of driving."

Current UBI solutions can be flawed

Existing methods of data collection for insurers also rely on modern technologies, but these can be unreliable. All three methods have their drawbacks: devices plugged into the On-Board Diagnostics (OBD) port, smartphone apps, and tags stuck to the windshield.

The first method provides precise driver behavior data, read directly from the engine control module (ECM). Its weakness? OBD-II devices are limited to the data found in the ECM, while data from other vehicle components remains inaccessible.

In this respect, mobile apps are certainly better, providing insurers with a simple way to launch their own telematics-based program. In addition, data is collected every time the user drives the vehicle. The disadvantage, however, is that the software does not connect directly to the vehicle's systems. The data points are therefore subject to a margin of error, and automatic drive recognition sometimes fails, for example by scoring journeys made as a passenger in another car.

Bluetooth-based tags, the last solution described here, are installed on the vehicle's windshield or rear window. Like mobile apps, the tags have no direct connection to the vehicle's systems and are therefore prone to errors.

The conclusions are obvious

Thus, there is a lot to suggest that an insurer looking for truly reliable technology should opt for embedded telematics, that is, connected car data. This is what enables dynamic and, above all, unconditional data collection to reliably assess the risk associated with individual clients.

 The data sent by connected cars is more accurate, more detailed, and in much larger quantities compared to other solutions. And this allows  insurance companies to better understand customers and their behavior and, based on this information, offer products that are better suited to their needs, as well as more profitable.

Industry insiders don't need much convincing about the advantages of telematics and connected cars over other driver data collection solutions. Data from cars connected to the network is instantly obtainable. Of course, you can enrich it and give it context by using information from smartphones, but in most cases, it is not even necessary. So why invest in something unreliable, which by definition has vulnerabilities and does not meet 100 percent of your needs, when you can opt for a more comprehensive technology that offers more features right from the start?

Considerable importance of connected car data for the insurance industry

Connected car data is the subsequent step in building the ultimate telematics-based products. It is acquired without the need to install additional components. All it takes is a vehicle user's consent to use the data, and then the insurance company obtains the data directly from the OEM.

3 steps to building products based on telematics data for the insurance industry

The information obtained from UBI vehicles can be used successfully, and all stakeholders benefit: insurers, as they gain a better understanding of their customers and can better assess risk; OEMs, as it allows them to monetize the data; and finally consumers, who receive a better, more personalized offer this way. J.D. Power points out that 83% of policyholders who had a positive claims experience renewed their policies, compared to only 10% of those who gave negative reviews.

In addition, such reliable data serves not only to improve the profitability of an insurance portfolio, but also to improve road safety. Insurers can offer incentives that will encourage their customers to continuously improve their driving style and increase their care for themselves and other road users.

Even now, market leaders who understand the value of investing in innovation are offering their customers the opportunity to share data from connected cars for UBI/BBI purposes. One example is the State Farm® brand, which offers discounts based on driving behavior. The driver's on-the-road behavior (sharp braking, rapid acceleration, swift turns) and driving mileage are automatically sent to the data manager after each trip, so drivers need to enable data sharing and location services on their saved vehicle. This information is used to update the Drive Safe & Save discount each time the policy is renewed. The safer you drive, the more you can save.

Likewise, Ford Motor Company is increasingly shifting toward using driver data in UBI programs based on connected vehicles. To that end, the automotive giant has partnered with Arity, a mobility and analytics brand. Their joint project is expected to give drivers more control over how much they pay for their car insurance. Drivers can voluntarily share their driving data from activated Ford vehicles with Arity's centralized telematics platform, and it will then be delivered to insurers via Arity's API, Drivesight®. The resulting risk index can be used to price auto insurance by any participating insurer.

Currently, connected cars are only one option, as many insurance companies are still using, for example, mobile applications in parallel. However, we can already see that the trend of using CC data is present on the market and the number of companies offering such an option to their clients will grow. This is something to be reckoned with.

Significant benefits

For insurers, the benefits are tangible. According to Swiss Re,  with 20,000 claims handled per year, the average savings after implementing the above technologies     amounted to 10-30 USD per claim    .

Telematics also helps to curb so-called claims inflation. Increasingly advanced vehicles are equipped with complex components, which can be costly to replace. Fortunately, today's insurer has the ability to create its own strategy based on the changing cost of spare parts and damage history for major car models.  This enables them to develop new pricing that includes inflated compensation costs.

The sooner, the better

Leveraging data and analytics based on artificial intelligence is guaranteed to drive growth. Expanded sources of information  improve the customer experience and help streamline operational processes. The benefits are thus evident across the entire value chain.  We can confidently say that never before in history has technology been so intertwined with the insurance industry.

That's why all insurance companies should start working on incorporating connected car data into their programs now. The sooner they do, the better positioned they will be when such vehicles become mainstream on the road. After all,  the share of new vehicles with built-in connectivity     will reach 96% in 2030    .

That's what Evangelos Avramakis, Head Digital Ecosystems R&D, Swiss Re Institute Research & Engagement, advises insurance companies to do: "Starting small then scaling fast might be a good strategy (...) There is so much you can do with data. But you need to take a different approach, depending on whether you want to improve claims processing or create new products." Similarly, this is what Nelson Tham, eAdmin Expert Asia, P&C Business Management, thinks about implementations: "Whenever an SME thinks about digitalization, it intimidates them. But it need not be the case if we start small. They can begin by reviewing their internal processes, see how data flows, turn that into structured data, then analyze this data for more meaningful insights."

How should the insurance industry approach the subject?

Insurers should start by answering key questions, such as: Where will connected car data deliver the most value for our business? What internal capabilities do we have and need? Do we have the required infrastructure, processes, and skills to leverage connected car data? What investments in technology are necessary to deliver on our goals?

Lastly, they need to consider whether they can better and faster achieve those goals by building required capabilities in-house or working with partners.

A good business and technology partner for the insurance industry is fundamental

Using  connected car data is not that straightforward. It requires know-how and the right technology background, as well as finding the right partner to collaborate with.

 A well-matched partner will help change the current operating model, by combining automotive and technology competencies and at the same time understanding the specifics of the insurance industry. Some processes simply have to be carried out in a comprehensive and holistic way.

At Grape Up, we help implement new approaches to an existing strategy. Operating at the intersection of automotive and insurance, we specialize in the technologies of tomorrow. Contact us if you want to boost your business performance.

Leveraging AI to improve VIN recognition - how to accelerate and automate operations in the insurance industry

Here we share our approach to automatic Vehicle Identification Number (VIN) detection and recognition using Deep Neural Networks. Our solution is robust in many aspects such as accuracy, generalization, and speed, and can be integrated into many areas in the insurance and automotive sectors.

Our goal is to provide a solution that lets a user take a picture with a mobile app and read the VIN present in the image. For all the similarities and common features shared with other OCR applications, the differences are colossal.

Our objective is to create a reliable solution and to do so we jumped directly into analysis of the real domain images.

VINs are located in many places on a car and its parts. The most readable are those printed on side doors and windshields. Here we focus on VINs from windshields.

OCR doesn’t seem to be rocket science now, does it? Well, after some initial attempts, we realized we’re not able to use any available commercial tools with success, and the problem was much harder than we had thought.

How do you like this example of KerasOCR?

Apart from many details, like the fact that VINs don’t contain the characters ‘I’, ‘O’, ‘Q’, we have to deal with very specific distortions, proportions, and fonts.
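The character constraint mentioned above can be turned into a cheap sanity check on any recognized string. Below is a minimal sketch, assuming the standard 17-character VIN; the names `VIN_ALPHABET` and `is_plausible_vin` are ours for illustration and are not part of the solution described here.

```python
import string

# VINs use digits and uppercase Latin letters, except I, O and Q.
VIN_ALPHABET = set(string.digits + string.ascii_uppercase) - set("IOQ")
VIN_LENGTH = 17  # a typical VIN contains 17 characters


def is_plausible_vin(text: str) -> bool:
    """Return True if `text` has the right length and only allowed characters."""
    return len(text) == VIN_LENGTH and all(ch in VIN_ALPHABET for ch in text)
```

A check like this cannot tell whether a VIN was read correctly, but it rejects outputs that cannot be a VIN at all.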

Initial approach

How can we approach the problem? The most straightforward answer is to divide the system into two components:

  • VIN detection: cropping the characters from the big image
  • VIN recognition: recognizing cropped characters

In an ideal world, images like this:

will be processed this way:

Now that we have an intuition of what the problem looks like, we can start solving it. Needless to say, there is no ready-made “VIN reading” task available on the internet, so we need to design every component of our solution from scratch. Let’s introduce the most important stages we’ve created, namely:

  • VIN detection
  • VIN recognition
  • Training data generation
  • Pipeline

VIN detection

Our VIN detection solution is based on two ideas:

  • Encouraging users to take a photo with VIN in the center of the picture - we make that easier by showing the bounding box.
  • Using Character Region Awareness for Text Detection (CRAFT) - a neural network to mark the VIN precisely and make the system less error-prone.

CRAFT

The CRAFT architecture predicts a text area in the image by simultaneously predicting, for each pixel, the probability that it is the center of some character and the probability that it is the center of the space between adjacent characters. For the details, we refer to the original paper.
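To make the two score maps more concrete, here is a deliberately simplified sketch of how they could be turned into a single text box. It assumes one text line per image and plain nested lists as score maps; the function name and the threshold value are hypothetical, and the post-processing described in the original paper is more elaborate than this.

```python
from typing import List, Optional, Tuple

ScoreMap = List[List[float]]  # rows of per-pixel probabilities in [0, 1]


def text_bounding_box(
    region: ScoreMap,
    affinity: ScoreMap,
    threshold: float = 0.5,  # hypothetical value, to be tuned on real data
) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) of pixels marked as text, or None.

    A pixel counts as text if either score exceeds the threshold, so that
    characters (region score) are linked by the gaps between them (affinity).
    """
    xs, ys = [], []
    for y, (region_row, affinity_row) in enumerate(zip(region, affinity)):
        for x, (r, a) in enumerate(zip(region_row, affinity_row)):
            if max(r, a) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```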

The image below illustrates the operation of the network:

Before actual recognition, it sounded like a good idea to simplify the input image so that it contains all the needed information and no redundant pixels. Therefore, we wanted to crop the characters’ area from the rest of the background.

We intended to encourage a user to take a photo with a good VIN size, angle, and perspective.

Our goal was to be prepared to read VINs from any source, e.g. side doors. After many tests, we think the best idea is to send the area from the bounding box seen by users and then try to cut it more precisely using VIN detection. Therefore, our VIN detector can be interpreted more like a VIN refiner.

It would be remiss not to note that CRAFT works exceptionally well here. Some say every minute spent working with it is pure joy.

Once the text is cropped, we need to map it to an axis-aligned rectangle. This involves dozens of design decisions, such as the affine transform, the resampling method, the shape of the target rectangle, etc.
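As an illustration of one such decision, the sketch below maps a parallelogram-shaped text area to an upright rectangle with an affine transform and nearest-neighbour resampling. It is a minimal pure-Python example under our own assumptions (three known corners, a grayscale image stored as nested lists); the function name `rectify` is hypothetical and this is not necessarily the transform used in the described system.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Image = List[List[int]]  # grayscale image as rows of pixel values


def rectify(image: Image, top_left: Point, top_right: Point,
            bottom_left: Point, out_w: int, out_h: int) -> Image:
    """Map a parallelogram in `image` to an upright out_w x out_h rectangle.

    Uses inverse mapping: for each output pixel, find its source position
    under the affine transform and copy the nearest source pixel.
    """
    (x0, y0), (x1, y1), (x2, y2) = top_left, top_right, bottom_left
    height, width = len(image), len(image[0])
    out = []
    for v in range(out_h):
        row = []
        for u in range(out_w):
            s, t = u / max(out_w - 1, 1), v / max(out_h - 1, 1)
            x = x0 + s * (x1 - x0) + t * (x2 - x0)
            y = y0 + s * (y1 - y0) + t * (y2 - y0)
            # Nearest-neighbour resampling, clamped to the image borders.
            xi = min(max(int(round(x)), 0), width - 1)
            yi = min(max(int(round(y)), 0), height - 1)
            row.append(image[yi][xi])
        out.append(row)
    return out
```

Nearest-neighbour sampling keeps the example short; smoother interpolation is usually a better fit for text that will be fed to a recognizer.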

Having ideally cropped characters makes recognition easier. But it doesn’t mean that our task is completed.

VIN recognition

Accurate recognition is a winning condition for this project. First, we want to focus on the images that are easy to recognize – without too much noise, blur, or distortions.

Sequential models

The SOTA models tend to be sequential models with the ability to recognize the entire sequences of characters (words, in popular benchmarks) without individual character annotations. It is indeed a very efficient approach but it ignores the fact that collecting character bounding boxes for synthetic images isn’t that expensive.

As a result, what is supposedly the most important advantage of sequential models loses its value for us. There are other advantages, but are they worth navigating all the traps that come with them?

First of all, training an attention-based model is very hard in this case because of


As you can see, the target characters we want to recognize depend on history. Training such a model would be possible only with a massive training dataset or careful tuning, so we skipped it.

As an alternative, we can use Connectionist Temporal Classification (CTC) models, which, in contrast, predict labels independently of each other.
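To show what independent per-frame predictions mean in practice, here is a minimal sketch of greedy CTC decoding: take the best class in every frame, merge consecutive repeats, and drop the blank symbol. The class layout (blank at index 0, followed by the VIN alphabet) is our assumption for this example, not a detail taken from the described model.

```python
from typing import List, Sequence

BLANK = 0  # index reserved for the CTC blank symbol (an assumption of this sketch)
ALPHABET = "0123456789ABCDEFGHJKLMNPRSTUVWXYZ"  # VIN characters, no I, O, Q


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]]) -> str:
    """Decode per-frame class scores: argmax, merge repeats, drop blanks."""
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    decoded = []
    previous = BLANK
    for label in best_path:
        if label != BLANK and label != previous:
            decoded.append(ALPHABET[label - 1])  # class 0 is the blank
        previous = label
    return "".join(decoded)
```

The blank symbol is what makes doubled characters readable: two identical characters in a row are kept only if a blank frame separates them.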

More importantly, we didn’t stop at this approach. We utilized one more algorithm with different characteristics and behavior.

YOLO

You Only Look Once (YOLO) is a very efficient architecture commonly used for fast and accurate object detection and recognition. Treating a character as an object and recognizing it after detection is an approach definitely worth trying in this project. We don’t have the problem of dependence on history, and there are some interesting tweaks that allow even more precise recognition in our case. Last but not least, we gain more control over the system, as much of the responsibility is moved away from the neural network.

However, VIN recognition requires a specific YOLO design. We used YOLO v2 because the latest architecture patterns are more complex in areas that do not fully address our problem.

  • We use a 960 x 32 px input (so images cropped by CRAFT are usually resized to meet this condition). Then we divide the input into 30 grid cells (each of size 32 x 32 px),
  • For each grid cell, we run predictions in predefined anchor boxes,
  • We use anchor boxes of 8 different widths but height always remains the same and is equal to 100% of the image height.
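The grid layout above can be sketched in a few lines. The input size, cell size, and number of anchors come from the text; the concrete anchor widths and the choice to centre anchors in their cell are made up for illustration.

```python
from typing import List, Tuple

INPUT_W, INPUT_H = 960, 32   # network input size from the text
CELL = 32                    # each grid cell is 32 x 32 px
NUM_CELLS = INPUT_W // CELL  # 30 cells in a single row

# Eight anchor widths in pixels. These values are made up for illustration.
ANCHOR_WIDTHS = [8, 12, 16, 20, 24, 28, 32, 40]

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def anchor_boxes(cell_index: int) -> List[Box]:
    """Return the anchor boxes centred in the given grid cell.

    Every anchor spans the full image height; only the width varies.
    """
    if not 0 <= cell_index < NUM_CELLS:
        raise ValueError("cell index out of range")
    center_x = cell_index * CELL + CELL / 2
    return [
        (center_x - w / 2, 0.0, center_x + w / 2, float(INPUT_H))
        for w in ANCHOR_WIDTHS
    ]
```

Fixing the anchor height to the full image height turns detection into a one-dimensional problem along the text line, which fits a single row of characters.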

As the results came in, our approach proved to be effective in recognizing individual characters from the VIN.

Metrics

Appropriate metrics are crucial in machine learning-based solutions, as they drive your decisions and project dynamics. Fortunately, we think simple accuracy meets the demands of a precise system, so we can skip further research in this area.

We just need to remember one fact: a typical VIN contains 17 characters, and missing just one of them is enough to make the whole prediction wrong. At every stage of work, we measure the Character Error Rate (CER) to understand the progress better. A CER of 5% (5% of characters wrong) may result in accuracy lower than 75%.
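A quick back-of-the-envelope calculation shows how strongly character errors compound over 17 positions. The sketch below assumes that errors on different characters are independent, which is a simplification; real errors tend to cluster on difficult images.

```python
VIN_LENGTH = 17


def expected_vin_accuracy(cer: float, length: int = VIN_LENGTH) -> float:
    """Probability that all `length` characters are right, given error rate `cer`.

    Assumes errors on different characters are independent.
    """
    return (1.0 - cer) ** length


for cer in (0.05, 0.02, 0.01):
    print(f"CER {cer:.0%} -> whole-VIN accuracy {expected_vin_accuracy(cer):.1%}")
```

Under this assumption, a 5% CER corresponds to roughly 42% whole-VIN accuracy, and even a 2% CER stays near 71%.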

About the models tuning

It's easy to notice that all OCR benchmark solutions have a much bigger effective capacity than our task requires, while at the same time being too general. That alone emphasizes the danger of overfitting and directs our focus to generalization ability.

It is important to distinguish hyperparameter tuning from architectural design. Apart from ensuring that the information flow through the network extracts the correct features, we do not dive into extended hyperparameter tuning.

Training data generation

We skipped one important topic: the training data.

We often support our models with artificial data with reasonable success, but this time the gain is huge. Cropped synthesized texts are so similar to the real images that we believe we can base our models on them and only fine-tune them carefully with real data.

Data generation is a laborious, tricky job. Some say your model is only as good as your data. It feels like carving: any mistake can break your material. Worse, you may spot it only after the training.

We have some pretty handy tools in our arsenal, but they are, again, too general. Therefore, we had to introduce some modifications.

Actually, we were forced to generate more than 2M images. Obviously, there is neither a point in nor a possibility of using all of them. Training datasets are often crafted to resemble real VINs in a very iterative process, day after day, font after font. Modeling a single General Motors font took us at least a few attempts.

But finally, we got there. No more T’s as 1’s, V’s as U’s, and Z’s as 2’s!

We utilized many tools. All have advantages and weaknesses and we are very demanding. We need to satisfy a few conditions:

  • We need a good variance in backgrounds. It’s rather hard to collect a satisfying number of windshield backgrounds, so we’d like to reuse those we have. At the same time, we don’t want to overfit to them, so we want some different sources. Artificial backgrounds may not be realistic enough, so we want to use some real images from outside our domain,
  • Fonts, perhaps the most important ingredient in our combination, have to resemble the creative fonts used for VINs (who made them!?) and cannot interfere with each other. At the same time, the number of car manufacturers is much higher than our collector’s impulses, so we have to be open to unknown shapes.
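The text side of such a generator is the easy part and can be sketched as follows: draw a random string from the VIN alphabet and lay the characters out left to right, which gives per-character boxes for free. Glyph widths and spacing here are random placeholders; in a real generator they would come from the rendered font, and the function name is hypothetical.

```python
import random
from typing import List, Tuple

ALPHABET = "0123456789ABCDEFGHJKLMNPRSTUVWXYZ"  # no I, O, Q
CharBox = Tuple[str, int, int]  # (character, left, right) in pixels


def synth_vin_layout(rng: random.Random, height: int = 32) -> Tuple[str, List[CharBox]]:
    """Draw a random 17-character string and lay it out left to right.

    Widths and gaps are random placeholders; a real generator would take
    them from the rendered font.
    """
    text = "".join(rng.choice(ALPHABET) for _ in range(17))
    boxes, x = [], rng.randint(0, 8)
    for ch in text:
        width = rng.randint(height // 2, height)  # hypothetical glyph width
        boxes.append((ch, x, x + width))
        x += width + rng.randint(1, 6)            # hypothetical spacing
    return text, boxes
```

Passing an explicit `random.Random` instance makes each generated sample reproducible from its seed, which helps when a bad batch has to be traced back after training.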

The images below are examples of VIN data generation for recognizers:

Putting everything together

It’s the art of AI to connect so many components into a working pipeline and not mess it up.
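One way to keep such a pipeline manageable is to treat every stage as a replaceable function. The sketch below wires the stages mentioned in this post in order; all parameter names are hypothetical, and the fallback to the guide-box crop when the detector finds nothing is our assumption, not a confirmed detail of the system.

```python
from typing import Callable, Optional, TypeVar

Image = TypeVar("Image")


def read_vin(
    photo: Image,
    crop_to_guide: Callable[[Image], Image],     # area of the on-screen bounding box
    refine: Callable[[Image], Optional[Image]],  # detector used as a "refiner"
    recognize: Callable[[Image], str],           # character recognizer
    is_valid: Callable[[str], bool],             # final sanity checks
) -> Optional[str]:
    """Run the stages in order and return a VIN, or None to request a retake."""
    guided = crop_to_guide(photo)
    refined = refine(guided)
    # Fall back to the guide-box crop when the detector finds nothing.
    text = recognize(refined if refined is not None else guided)
    return text if is_valid(text) else None
```

Because the stages are injected, each one can be tested or swapped on its own, for example when comparing two recognizers on the same crops.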

Moreover, we have a lot of traps here. Mind these images:

VIN labels often consist of separate strings or two rows, with logos and bar codes near the caption.

90% of end-to-end accuracy provided by our VIN reader

Running in under one second on a mid-range CPU alone, our solution achieves over 90% end-to-end accuracy.

This result depends on the problem definition and the test dataset. For example, we have to decide what to do with images that are impossible for a human to read. Nevertheless, regardless of the dataset, we approached human-level performance, which is a typical reference level in Deep Learning projects.

We also managed to develop a mobile offline version of our system with similar inference accuracy but a bit slower processing time.

App intelligence

While working on tools designed for business, we can’t forget about the real use-case flow. With the above pipeline, we have no resistance at all to photos that are impossible to read, even though we would like to have it. Such situations often happen due to:

  • incorrect camera focus,
  • light flashes,
  • dirty surfaces,
  • damaged VIN plate.

Usually, we can prevent these situations by asking users to change the angle or retake a photo, before we send it to the further processing engines.
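As an example of what such a pre-check could look like, the sketch below flags photos that are likely blurry or overexposed before they reach the recognition engine. Both measures (variance of a Laplacian for sharpness, share of near-white pixels for glare) are common generic heuristics that we chose for illustration; they are not confirmed to be the ones used in the described app, and all thresholds are placeholders.

```python
from typing import List

Gray = List[List[int]]  # grayscale image, values 0..255


def laplacian_variance(img: Gray) -> float:
    """Variance of a 4-neighbour Laplacian; low values suggest a blurry image."""
    values = []
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            values.append(
                img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
                - 4 * img[y][x]
            )
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def overexposed_fraction(img: Gray, limit: int = 250) -> float:
    """Share of near-white pixels, a rough indicator of light flashes."""
    pixels = [p for row in img for p in row]
    return sum(p >= limit for p in pixels) / len(pixels)


def should_retake(img: Gray, min_sharpness: float = 100.0, max_glare: float = 0.2) -> bool:
    """Thresholds are placeholders and would need tuning on real photos."""
    return laplacian_variance(img) < min_sharpness or overexposed_fraction(img) > max_glare
```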

However, the classification of these distortions is a pretty complex task! Nevertheless, we implemented a bunch of heuristics and classifiers that allow us to ensure that the VIN, if recognized, is correct. For the details, you have to wait for the next post.
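One well-known check of this kind is the VIN check digit. The sketch below implements the standard weighted-sum algorithm for the ninth position of a 17-character VIN. Note the assumptions: the check digit is mandatory for vehicles sold in North America, VINs from other markets may not carry a valid one, and we do not know whether this check is among the heuristics used in the described system.

```python
# Transliteration of characters to numbers used by the check-digit algorithm.
_VALUES = {str(d): d for d in range(10)}
_VALUES.update(zip("ABCDEFGH", range(1, 9)))
_VALUES.update(zip("JKLMN", range(1, 6)))
_VALUES.update({"P": 7, "R": 9})
_VALUES.update(zip("STUVWXYZ", range(2, 10)))

# Position weights; the check digit itself (position 9) has weight 0.
_WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]


def has_valid_check_digit(vin: str) -> bool:
    """Check position 9 of a 17-character VIN against the weighted sum mod 11."""
    if len(vin) != 17 or any(ch not in _VALUES for ch in vin):
        return False
    remainder = sum(_VALUES[ch] * w for ch, w in zip(vin, _WEIGHTS)) % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected
```

A single misread character usually changes the weighted sum, so for VINs that carry a check digit this catches many recognition errors at almost no cost.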

Last but not least, we’d like to mention that, as usual, there are a lot of additional components built around our VIN Reader. Apart from a mobile application with offline on-device recognition, we’ve implemented a remote backend, pipelines, tools for tagging, semi-supervised labeling, synthesizers, and more.

https://youtu.be/oACNXmlUgtY

© Grape Up 2025