How AI is transforming automotive and car insurance

Grape up Expert
April 22, 2022
•
5 min read

The car insurance industry is undergoing a real revolution. Insurers are targeting their offers more and more precisely using AI and machine learning. Such innovations significantly enhance business efficiency, reduce the risk of accidents and their consequences, and enable adaptation to modern realities.

Changes are needed today

Approximately $25 billion is "frozen" with insurers annually due to problems such as fraud, claims adjustment, and delays in service garages. Customers, for their part, are not always happy with the payouts they receive and often have to accept undervalued rates. One reason is that limited data makes it difficult to identify accurately who caused an incident. Compensation is also often based on rates lower than the actual value of the damage.

Insurers today need to be aware of the ecosystem in which they operate. Clients are becoming more demanding and, according to an IBM Institute for Business Value (IBV) study, 50 percent of them prefer tailor-made products based on individual quotes. The very model of cooperation between businesses is also changing, as relations between insurance providers and car manufacturers are growing tighter. All of this is linked to the fact that cars are becoming increasingly autonomous, allowing them to more closely monitor traffic incidents and driver behavior as well as manage risk. Estimates suggest there will be as many as one trillion connected devices by 2025, and by 2030 there will be an increasing percentage of vehicles with automated features (ADAS).

No wonder there's an increasing buzz about changes in the car insurance industry. And these are changes based on technology. The use of artificial intelligence, machine learning, and advanced data analytics in the cloud will allow for seamless adaptation to market expectations.

 CASE STUDY

 SARA Assicurazioni and Automobile Club Italia are already encouraging drivers to install ADAS systems in exchange for a 20% discount on their insurance premiums. Indeed, it has been demonstrated that such systems can slash the rate of liability claims for personal injury by 4-25% and by 7-22% for property damage.

Why is this so important for insurers who want to face the reality?

Artificial intelligence-based pricing models significantly reduce the time needed to introduce new offerings and to make optimal decisions. They also lower the risk of mispricing.

The new AI-based insurance reality is happening as we speak. Digital-first companies like Lemonade, with their high flexibility in responding to market changes, are showing customers what solutions are feasible. In doing so, they put pressure on those companies that still hesitate to test new models.

Areas of change in car insurance due to AI

Artificial intelligence and related technologies are having a huge impact on many aspects of the insurance industry: quoting, underwriting, distribution, risk and claims management, and more.

Changes in insurance distribution

Artificial intelligence algorithms smoothly create risk profiles so that the time required to purchase a policy is reduced to minutes. Smart contracts based on blockchain instantly authenticate payments from an online account. At the same time, contract processing and payment verification are also vastly streamlined, reducing insurers' client acquisition cost.

Advanced risk assessment and reliable pricing

Traditionally, insurance premiums are determined using the "cost-plus" method. This includes an actuarial assessment of the risk premium, a component for direct and indirect costs, and a margin. Yet it has quite a few drawbacks.

One of them is the inability to easily account for non-technical price determinants, as well as the inability to react quickly to shifting market conditions.
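The cost-plus structure described above can be written down in a few lines. This is only an illustrative sketch: the field names and the sample figures are assumptions, not values from any insurer.

```python
from dataclasses import dataclass


@dataclass
class CostPlusInputs:
    # All names and figures here are hypothetical, for illustration only.
    risk_premium: float    # actuarial estimate of the expected claims cost
    direct_costs: float    # e.g. acquisition and claims handling
    indirect_costs: float  # e.g. overheads allocated to the policy
    margin_rate: float     # margin as a fraction of the total cost


def cost_plus_premium(inputs: CostPlusInputs) -> float:
    """Add up the cost components and put a margin on top."""
    total_cost = inputs.risk_premium + inputs.direct_costs + inputs.indirect_costs
    return round(total_cost * (1 + inputs.margin_rate), 2)


quote = cost_plus_premium(CostPlusInputs(400.0, 60.0, 40.0, 0.10))
print(quote)
```

Nothing in this formula reacts to market conditions or to how a given customer actually drives, which is the drawback the text points out.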

How is risk calculated? For car insurance companies, the assessment refers to accidents, road crashes, breakdowns, theft, and fatalities.

These days, all these aspects can be controlled by leveraging AI, coupled with IoT data that provides real-time insights. Customized pricing of policies, for instance, can take into account GPS device data on a vehicle’s location, speed, and distance traveled. This way, you can see whether the vehicle spends most of its time in the driveway or if, conversely, it frequently travels on highways, particularly at excessive speeds.

In addition, insurance companies can use a host of other sensor and camera data, as well as reports and documents from previous claims. Having all this information gathered, algorithms are able to reliably determine risk profiles.

 CASE STUDY

 Ant Financial, a Chinese company that offers an ecosystem of merged digital products and services, specializes in creating highly detailed customer profiles. Their technology is based on artificial intelligence algorithms that assign car insurance points to each customer, similarly to credit scoring. They take into account such detailed factors as lifestyle and habits. Based on this, the app shows an individual score, assigning a product that matches the specific policyholder.

An in-depth analysis of claims

The cooperation between an insurance company and its client is based on the premise that both parties seek to avoid potential losses. Unfortunately, sometimes accidents, breakdowns or thefts occur and a claims process must be implemented. Artificial intelligence, integrated IoT data, and telematics come in handy irrespective of the type of claims we are handling.

  •  These technologies are suitable for, among other things, automatically generating not only damage information but also repair cost estimates.
  •  Machine learning techniques can estimate the average cost of claims for various client segments.
  •  Sending real-time alerts, in turn, enables the implementation of predictive maintenance.
  •  Once an image has been uploaded, an extensive database of parts and prices can be created.
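As a point of reference for the claims-cost estimate mentioned in the list above, the simplest baseline is a per-segment average, which any machine learning model would need to beat. The sketch below is that baseline only, not a learning algorithm, and the segment labels and amounts are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_claim_cost_by_segment(claims):
    """Group (segment, cost) pairs and return the mean cost per segment."""
    by_segment = defaultdict(list)
    for segment, cost in claims:
        by_segment[segment].append(cost)
    return {segment: mean(costs) for segment, costs in by_segment.items()}


# Hypothetical claims history: (client segment, claim cost).
claims = [
    ("urban_young", 1200.0),
    ("urban_young", 1800.0),
    ("rural_senior", 900.0),
]
averages = average_claim_cost_by_segment(claims)
print(averages)
```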

The drivers themselves gain control as they can carry out the process of registering the damage from A to Z: take a photo, upload it to the insurer's platform and get an instant quote for the repair costs. From now on, they are no longer reliant on workshop quotes, which were often highly overestimated in line with the principle: "the insurer will pay anyway".

Fraud prevention

Auto insurers lose 29 billion dollars a year to fraud. Fraudsters try to scam a company out of insurance money based on illegally orchestrated events. How to prevent this? The answer is AI.

 Analyzed data retrieved from cameras and sensors can reconstruct the details of a car accident with high precision. So, having an accident timeline generated by artificial intelligence facilitates accident investigation and claims management.

 CASE STUDY

 An advanced AI-based incident reconstruction has been tested lately on 200,000 vehicles as part of a collaboration between Israel's Project Nexar and a Japanese insurance company.

Assistance in the event of accidents

According to data from the OECD, car accident fatalities could be reduced by 44 percent if emergency medical services had access to real-time information about the injuries of involved parties.
Still, real-time assistance has great potential not only for public services but also in the context of auto insurance.

By leveraging AI for this, insurers can provide drivers with quick and semi-automated responses during collisions and accidents. For example, a chatbot can instruct the driver on how to behave, how to call for help, or how to help fellow passengers. All this is essential in the context of saving lives. At the same time, it is a way of reducing the consequences of an accident.

Transparent decision making (client perspective)

New technologies offer solutions to many problems not only for insurers but also for clients. The latter often complain about discrimination and unfair, from their point of view, calculations of policies and compensation.

"Smart automated gatekeepers" are superior in multiple ways to the imperfect solutions of traditional models. This is because, based on a number of reliable parameters, they facilitate the creation of more authoritative and personalized pricing policies. Data-rich and automated risk and damage assessments pay off for consumers because they have decision-making power based on how their actions affect insurance coverage.

The opportunities and future of AI in car insurance

McKinsey's analysis says that across functions and use cases AI investments are worth $1.1 trillion in potential annual value for the insurance industry.

The direction of change is set by two factors: first, increasingly connected, software-equipped vehicles with more sensors; second, the evolving analytical skills of insurers. Data-rich vehicles should make repair cost estimates, and consequently claims payments, more reliable and consistent in real time. When it comes to planning offers and understanding the client, AI enables personalized, real-time service (24/7 virtual assistance) and flexible policies. All signs indicate that such "abstract" parameters as education or earnings will cease to play a major role in this regard.

Figure: Tech impacting insurtech

As can be inferred from the diagram above, the greater the impact of a given technology on an insurance company's business, the longer the time required for its implementation. Therefore, it is vital to consider the future on a macro scale, by planning the strategy not for 2 years, but for 10.

 The decisions you make today have a bearing on improving operational efficiency, minimizing costs, and opening up to individual client needs, which are becoming more and more coupled with digital technologies.

Check related articles

Finance

Building telematics-based insurance products of the future

Thanks to advancements in connected car technologies and the accessibility of personal mobile devices, insurers can roll out telematics-based services at scale. Utilizing telematics data opens the door to improving customer experience, unlocking new revenue streams, and increasing market competitiveness.

After reading this article, you will know:

  •  What is telematics
  •  How insurers build PAYD & PHYD products
  •  Why real-time crash detection is important
  •  How to identify stolen vehicles
  •  If it’s possible to streamline roadside assistance
  •  What role telematics plays in predictive maintenance

Telemetry - the early days

Obtaining vehicle data isn’t a new concept that materialized with the evolution of cloud and connectivity technologies. Known as telemetry, it has been possible for a long time, but it was accessible only to manufacturers or specialized parties because establishing a connection with the car was no easy feat. For example, Formula 1 racing teams first started using it in the late 1980s, and all they could manage was very short bursts of data when the car was passing close to the pits. The data was also far less diverse and complex than what is available today, because cars were simpler and had fewer onboard sensors that could gather and communicate data.

What is telematics?

At the very basic level, it’s a way of connecting to the vehicle data remotely. More specifically, telematics is a connection mechanism between machines (M2M) enabled by telecommunication advances. Telematics understood in the insurance context is even more specific and means connecting to the data generated by both the vehicle itself and the driver.

At first, when telematics-based products started gaining popularity, they required drivers to use additional devices like black boxes that needed to be installed in the car, sometimes by a qualified technician. These devices were either installed on the dashboard, under the bonnet, or plugged into the OBD-II connector. The black boxes were fairly simple devices that consisted of a GPS receiver, a motion sensor, and a SIM card plus some basic software. They gathered rudimentary information about:

  •  the time-of-day customers drive
  •  the speed on different sorts of roads
  •  sharp braking and acceleration
  •  total mileage
  •  the total number of journeys

In the meantime, mobile apps have mostly replaced black boxes, as it didn’t take long for smartphones to become sophisticated enough to render them largely redundant. Black boxes are still offered by insurers as an alternative for customers who refuse to install apps that access their location, or who need one because they don’t have a sufficiently advanced mobile device. However, these days most cars that roll off the assembly line have built-in connectivity, so the telematics function is embedded in the vehicle from the very beginning. As an example, 90% of Ford passenger cars starting from 2020 are connected. This means there is no longer any need for additional devices. The car can now share all the data that black boxes or apps gathered, plus a lot of detailed data about the vehicle’s state from the large number of sensors it has on board. More technologically advanced cars, like Teslas, can send up to 5 gigabytes of data every day.

Telematics-based insurance products and services

By employing new technologies, insurers can be closer to their customers, understand them better and take a more proactive approach to maintaining the relationship. Telematics is the key technology that allows for this type of stance in the auto insurance area. Insurers can leverage telematics to build numerous products and services, but it is important to remember that the regulations can differ from state to state and from country to country.

So, the solutions depicted in this article should serve only as an example of how the technology can be used.

Usage-based products

Usage-based products are probably the most widespread in this category as they have been around for some time and offer the most tangible benefit to customers - cost savings.

The market value for these products is currently estimated at 20 billion USD, and it is projected to reach 67 billion USD in the next 5 years. This is a good indicator that there is a growing demand in the market, especially from millennials and Gen Z, who expect the services & products they buy to be tailored to them and not based on a generic quote.

Currently, the two main categories of usage-based insurance are Pay-how-you-drive (PHYD) and Pay-as-you-drive (PAYD) products. The first one is based on the assumption that drivers should be rewarded for how they drive. So, when building a PHYD offering, insurers need data on when & where their customers drive, the speed on different roads, how they accelerate and brake, and how they enter corners. Feeding that data to Machine Learning algorithms makes it possible to assess whether customers are safe drivers who obey the law and to reward them with a discount on their premium. The customer benefits are clear, but the insurer benefits as well. By enabling their customers to use PHYD products, the insurers can:

  •  correct risk miscalculations,
  •  enhance price accuracy,
  •  attract favorable risks,
  •  retain profitable accounts,
  •  reduce claim costs,
  •  enable lower premiums.
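To make the PHYD idea concrete, here is a minimal rule-based sketch of a driving score that feeds a premium discount. It is a stand-in for the Machine Learning models mentioned above, not a description of them: the trip fields, penalty weights, and discount bands are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical penalty weights; a real product would calibrate these on claims data.
EVENT_PENALTY = 2.0     # per harsh event per 100 km
SPEEDING_PENALTY = 1.0  # per minute of speeding per 100 km
NIGHT_PENALTY = 10.0    # applied to the share of distance driven at night


@dataclass
class TripSummary:
    distance_km: float
    harsh_brakes: int
    harsh_accelerations: int
    speeding_seconds: int
    night_driving: bool


def driving_score(trips):
    """Return a score from 0 to 100; higher means safer driving."""
    total_km = sum(t.distance_km for t in trips)
    if total_km == 0:
        return 100.0
    events = sum(t.harsh_brakes + t.harsh_accelerations for t in trips)
    speeding_min = sum(t.speeding_seconds for t in trips) / 60
    night_km = sum(t.distance_km for t in trips if t.night_driving)
    penalty = (
        EVENT_PENALTY * events / total_km * 100
        + SPEEDING_PENALTY * speeding_min / total_km * 100
        + NIGHT_PENALTY * night_km / total_km
    )
    return max(0.0, min(100.0, 100.0 - penalty))


def premium_discount(score):
    """Map a score to a discount rate (bands are illustrative)."""
    if score >= 90:
        return 0.15
    if score >= 75:
        return 0.05
    return 0.0


safe = [TripSummary(100.0, 0, 0, 0, False)]
risky = [TripSummary(50.0, 10, 10, 600, True)]
print(premium_discount(driving_score(safe)), premium_discount(driving_score(risky)))
```

Normalizing by distance keeps the score comparable between customers who drive a lot and those who drive rarely.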

The second category is the PAYD model in which the customers pay only for what they actually drive plus a low monthly rate. In this scenario, the insurers only need to monitor the miles driven and then multiply the amount by a fixed mile fee (a few cents usually). This type of solution is perfect for irregular drivers, and it was also a choice for many during COVID. It can increase insurance affordability, reduce uninsured driving, and provide consumer savings. It makes premiums more accurately reflect the claim costs of each individual motorist and rewards motorists who reduce their accident risk. Additionally, it can be a great alternative to PHYD products for customers who are not comfortable with gathering multiple data points about their driving behavior.
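The PAYD calculation described above is simple enough to show directly. In this sketch the monthly base rate and the per-mile fee are made-up defaults, not figures from any actual product.

```python
def payd_monthly_premium(miles_driven, base_rate=30.0, per_mile_fee=0.06):
    """Low fixed monthly rate plus a fixed fee for every mile driven.

    base_rate and per_mile_fee are hypothetical values for illustration.
    """
    if miles_driven < 0:
        raise ValueError("miles_driven must not be negative")
    return round(base_rate + miles_driven * per_mile_fee, 2)


print(payd_monthly_premium(250))  # an occasional driver
print(payd_monthly_premium(0))    # the car stayed in the driveway all month
```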

Real-time crash detection

This solution allows insurers to be closer to their customers and to react to events in real time. It is a part of a larger trend in which the evolution of technology enables the shift from a mode of operations where the insurer is largely invisible to their customers (unless something happens) to a new model where the company is there to support and help the customers, and, if possible, even to predict and prevent losses before they occur.

By analyzing the vehicle data and driver behavior, it is possible to detect accidents as they happen. Through monitoring the vehicle location, speed, and sensor data (in this case, motion sensor) and setting up alerts, insurers can be the first to know that there has been an accident. However, detecting the actual accident requires filtering out random shocks and vibrations, such as speed bumps, rough roads, potholes, parking on the kerb, and doors or the boot lid being slammed.
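One simple way to filter out such false positives is to combine the motion sensor reading with the speed before and after the shock. The following is a rough sketch under assumed thresholds; the `Sample` structure and all threshold values are hypothetical, and production systems use far richer signals.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    speed_kmh: float
    accel_g: tuple  # (x, y, z) acceleration in g, gravity already removed


def looks_like_crash(window, shock_g=4.0, min_speed_kmh=20.0, max_speed_after_kmh=5.0):
    """Flag a window of samples as a likely crash.

    A strong shock alone is not enough: the vehicle must have been moving
    before the shock and must have come to a near stop afterwards.
    """
    if not window:
        return False
    magnitudes = [math.sqrt(sum(a * a for a in s.accel_g)) for s in window]
    peak_index = max(range(len(window)), key=magnitudes.__getitem__)
    if magnitudes[peak_index] < shock_g:
        return False  # e.g. a speed bump or a pothole
    speed_before = max(s.speed_kmh for s in window[: peak_index + 1])
    speed_after = window[-1].speed_kmh
    return speed_before >= min_speed_kmh and speed_after <= max_speed_after_kmh


door_slam = [Sample(0.0, (0, 0, 0)), Sample(0.0, (5.0, 0, 0)), Sample(0.0, (0, 0, 0))]
speed_bump = [Sample(30.0, (0, 0, 0)), Sample(30.0, (0, 0, 1.5)), Sample(28.0, (0, 0, 0))]
crash = [Sample(60.0, (0, 0, 0)), Sample(55.0, (8.0, 1.0, 0)),
         Sample(10.0, (1.0, 0, 0)), Sample(0.0, (0, 0, 0))]
print(looks_like_crash(door_slam), looks_like_crash(speed_bump), looks_like_crash(crash))
```

The door slam is rejected because the car was parked, and the speed bump because the shock is too weak; only the third window passes both checks.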

Early detection allows insurers to take a proactive approach: contact the driver and coordinate emergency services and roadside assistance. Using the data from the crash, they can also start the first notice of loss process and reconstruct the accident’s timeline. If more parties are involved in the incident, the crash data can be used to determine who is responsible in ambiguous situations.

Stolen Vehicle Alerts

The big advantage of telematics-based products and services is that they are beneficial to both sides, and this is easy to demonstrate. One example is enabling stolen vehicle alerts. By gathering data about customer behavior, insurers can build driver profiles that allow them to set up alerts that are triggered by unusual or suspicious behavior.

For instance, let’s assume a customer typically drives their car between 7am and 5pm on weekdays and then goes on various medium distance trips during the weekend. So an unexpected, high-speed journey at 3am on Wednesday can seem suspicious and trigger an alert. Of course, there can be unforeseen events that force customer behavior like that, but then the policyholder can be contacted to verify whether that’s them using the car and if there’s been an emergency. However, if the verification fails, then authorities can be notified and informed of the vehicle’s position in real-time to help recover the vehicle once it’s been confirmed as stolen.
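The scenario above can be sketched as a very small behavioral profile: count how often the customer starts a trip in each weekday and hour slot, then flag a fast trip in a slot that has never been seen. The function names, the speed threshold, and the sample history are assumptions for illustration; a real profile would also cover routes and distances.

```python
from collections import Counter
from datetime import datetime


def build_profile(trip_starts):
    """Count past trips per (weekday, hour) slot."""
    return Counter((t.weekday(), t.hour) for t in trip_starts)


def is_suspicious(profile, trip_start, max_speed_kmh, min_seen=1, speed_limit_kmh=130.0):
    """Flag a trip that is both unusually timed and unusually fast.

    min_seen and speed_limit_kmh are hypothetical thresholds.
    """
    seen = profile[(trip_start.weekday(), trip_start.hour)]
    return seen < min_seen and max_speed_kmh > speed_limit_kmh


# Hypothetical history: a 7:30 start on five consecutive weekdays.
history = [datetime(2022, 4, day, 7, 30) for day in (18, 19, 20, 21, 22)]
profile = build_profile(history)

night_trip = datetime(2022, 4, 27, 3, 0)     # 3am on a Wednesday
morning_trip = datetime(2022, 4, 27, 7, 15)  # usual commute slot
print(is_suspicious(profile, night_trip, 150.0), is_suspicious(profile, morning_trip, 150.0))
```

A flagged trip would only start the verification step described above, not an automatic theft report.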

For fleet owners, geo-fencing rules can be established to enhance fleet security. Many of the businesses with fleets operate during specific working hours. At night the company vehicles are parked in designated lots. So, if there is a situation when a vehicle leaves the specific area during the hours it shouldn’t, an automated alert can be triggered. The fleet manager can be then contacted to verify whether the car is being used by the company or if it’s leaving the property unauthorized. If necessary, authorities can be notified about the theft, and the vehicle location can be tracked to enable swift recovery.
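A geo-fencing rule of this kind boils down to two checks: is the vehicle outside the permitted area, and is it outside working hours? The sketch below uses a circular fence and the haversine formula for distance; the depot coordinates, radius, and working hours are hypothetical.

```python
import math
from datetime import time


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def geofence_alert(position, at, depot, radius_km=0.5,
                   work_start=time(6, 0), work_end=time(20, 0)):
    """Alert when the vehicle is outside the fence outside working hours."""
    outside_hours = not (work_start <= at < work_end)
    outside_fence = haversine_km(*position, *depot) > radius_km
    return outside_hours and outside_fence


depot = (52.2297, 21.0122)    # hypothetical parking lot
far_away = (52.3297, 21.0122)  # about 11 km north of the depot
print(geofence_alert(far_away, time(23, 0), depot))
```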

Roadside assistance

Vehicle roadside assistance is a service that helps the driver in case of a breakdown: auto service professionals carry out minor mechanical and electrical repairs and adjustments in an attempt to make the vehicle drivable again. A single roadside assistance company in the US reports receiving 1.76 million calls for help a year, which translates to nearly 5,000 calls every day. Clearly, any automation and expediting of the processes can have a significant impact on the effectiveness of operations and the customer experience.

By employing modern technologies like telematics, insurers can streamline the process from the moment the driver notifies the insurer of a breakdown. The company can start a full process aimed at resolving the issue as fast as possible in the least stressful way. Using vehicle location, a tow truck can be dispatched without the need for the customer to try and pinpoint their location. And the insurer can then proceed to locate and book the nearest available replacement vehicle. Furthermore, using the telematics data, an initial assessment of damage can be performed in order to expedite the repair. As an example, the data may indicate that the vehicle has been overheating for several miles before it stopped and that can be useful information for the garage that will try to fix the car.

Predictive maintenance

There are two types of servicing: reactive and proactive. While reactive requires managing a failure after it occurs, the various proactive maintenance approaches allow for some level of planning to address that failure ahead of time. Proactive maintenance enables better vehicle uptime and higher utilization, especially for fleet owners. Telematics is helping to further improve maintenance practices and optimize uptime on the path to predictive maintenance models.

This type of service is best suited for more modern vehicles where the telematics feature is embedded and there is a multitude of different sensors monitoring the vehicle’s health. However, a more basic level of predictive maintenance is achievable with plug-in telematics dongles and devices able to read fault codes.

Using that data, insurers can remind policyholders about things like oil and brake pad changes, which will have an impact on both road safety and vehicle longevity. They can also send alerts about issues like low tire pressure to encourage drivers to refill the tires with air on their own rather than wait for a puncture and require roadside assistance.
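Reminders like these can start as plain threshold rules over the data the vehicle reports. The sketch below assumes a simplified status record; the field names, service interval, and limits are illustrative and would differ per vehicle model.

```python
from dataclasses import dataclass, field


@dataclass
class VehicleStatus:
    # Hypothetical fields; real telematics payloads vary by manufacturer.
    km_since_oil_change: float
    brake_pad_mm: float
    tire_pressure_bar: dict
    fault_codes: list = field(default_factory=list)


def maintenance_reminders(status, oil_interval_km=15000, min_pad_mm=3.0, min_pressure_bar=2.0):
    """Turn raw vehicle status into a list of reminders for the policyholder."""
    reminders = []
    if status.km_since_oil_change >= oil_interval_km:
        reminders.append("oil change due")
    if status.brake_pad_mm <= min_pad_mm:
        reminders.append("brake pads worn")
    for wheel, pressure in sorted(status.tire_pressure_bar.items()):
        if pressure < min_pressure_bar:
            reminders.append(f"low tire pressure: {wheel}")
    for code in status.fault_codes:
        reminders.append(f"fault code reported: {code}")
    return reminders


status = VehicleStatus(
    km_since_oil_change=16000,
    brake_pad_mm=5.0,
    tire_pressure_bar={"front_left": 1.7, "front_right": 2.3},
    fault_codes=["P0128"],
)
reminders = maintenance_reminders(status)
print(reminders)
```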

Simple preventive maintenance can ultimately save the driver a lot of stress, as well as money and time spent on repairs, because it prevents more severe issues with the car. For fleet owners, it means increased uptime and better utilization of the vehicles, which in turn lead to lower costs and higher profit.

Building Telematics-based Insurance Products - Summing Up

Aside from offering policyholders benefits like fairer, lower rates, streamlined claims resolution, and better roadside assistance, telematics technology is a goldmine of data for the insurers. They get a better understanding of driver behavior and associated risk and can adjust the premiums accordingly. In the event of an accident, an adjuster can find out which part of the car was damaged, how severe the impact was, and how likely it is that passengers suffered injuries. Finally, insurance companies can benefit from reduced administration costs by being able to resolve the claim faster and more efficiently.

Finance

How to enable data-driven innovation for the mobility insurance

 Digitalization has changed the way we shop, work, learn and take care of our health or travel. Cars are no longer used just to get from A to B. They are jam-packed with technology that connects us to the world, enhances safety, prevents breakdowns, and even provides entertainment. With the rise of the Internet of Things and artificial intelligence, a vehicle is no longer understood solely in terms of its performance and sleek design. It has become software on wheels, a gateway to new worlds - not just physical, but also virtual. And if the nature of insurance itself is changing, then the company offering insurance must keep up with these changes as well. Insurance needs digital innovation, as much as any other market area.


These days customers are looking for customization, personalization, and understanding their needs on an almost organic level. Data and advanced analytics allow us to effectively satisfy these needs. Thanks to them, it is possible to fine-tune the offer, not so much for a specific group, but for a particular person - their habits, daily schedule, interests, health restrictions, or aesthetic preferences. And in the case described by us - a person's driving style and commuting patterns.

If you think about it, the insurer has the perfect tool in their hands. If they can tap into the potential of the  software-defined vehicle and equip it with the right applications, there will be nearly zero chance of inaccurate insurance risk estimates. Data doesn't lie and shows a factual, not imaginary picture of a driver's driving style and behavior on the road.

While in the traditional insurance model pricing is static and data is collected offline and not aligned with the driver's actual preferences, new technologies such as the  cloud, the IoT, and AI allow for these limitations to be effectively lifted.

With them, an offering is created that competes in the marketplace, generates new revenue streams within the company, and builds customer loyalty.

Data-driven innovation - easier said than done. Or maybe not?

The transformation of a vehicle from a traditionally understood mechanical device into a "smartphone on four wheels," as Akio Toyoda once described modern vehicles, takes time and will not happen overnight. But it is progressing year by year, and the new car models released by the big corporations show that the process is well underway.

The so-called software-defined vehicle that we are developing with our clients at Grape Up is a vehicle that moves through an ecosystem of numerous variables, accessed by different players and technologies.
Clearly, one such player can be - and should be - the insurer, whose products have been tied to the automotive market ever since 1897, when a certain Gilbert J. Loomis, a resident of Dayton, Ohio, first purchased an automotive liability insurance policy.

However, for insurance companies to play an integral role in the use of vehicle-generated data, the driver must receive a precisely functioning and secure service from which they will derive real benefits. Without building specific technical competencies and software-defined vehicle knowledge, the insurer cannot achieve these goals.

Only by creating this type of business unit in-house from scratch, or by partnering with software companies, will they be able to compete with insurtech startups such as Lemonade, which build their businesses from the ground up based on AI and data analytics.

The right technology partner will take care of:

  •  data security;
  •  selection of cloud and IoT technologies;
  •  the reliability and scalability of the proposed solutions.

During this time, the insurer can focus on what they do best - developing insurance competencies and tweaking their offers.

How to choose the right technology partner?

Just as customers are looking for insurance that accommodates their driving and lifestyle, an insurance company should select a technology partner that has more than just technical skills to offer. After all, changing the model in which a traditional insurance company operates does not boil down to creating a digital sales channel on the Internet and launching a modern website. We are talking about a completely different scale of operations requiring the insurance company to be embedded in a completely new, rapidly developing environment.

Therefore they need a partner who naturally navigates the software-defined vehicle ecosystem, understands its specifics, and has experience in working with the automotive industry. Besides, it should be someone  knowledgeable about the specifics of the P&C insurance market and the challenges faced by the insurance client.

It is only at the intersection of these three areas: technology, automotive, and insurance, that competencies are built to effectively compete against modern insurtechs.

Like in the Japanese philosophy of ikigai, which explains how to find one's sense of purpose and give meaning to one's work, both companies can build valuable, useful solutions for users. They will bring satisfaction not only to customers but also to the insurance company, which will open a new revenue channel and meet the needs of the market.

Software development
AI

Leveraging AI to improve VIN recognition - how to accelerate and automate operations in the insurance industry

Here we share our approach to automatic Vehicle Identification Number (VIN) detection and recognition using Deep Neural Networks. Our solution is robust in many aspects such as accuracy, generalization, and speed, and can be integrated into many areas in the insurance and automotive sectors.

Our goal is to provide a solution allowing us to take a picture using a mobile app and read the VIN that is present in the image. With all the similarities to any other OCR application and common features, the differences are colossal.

Our objective is to create a reliable solution and to do so we jumped directly into analysis of the real domain images.

VINs are located in many places on a car and its parts. The most readable are those printed on side doors and windshields. Here we focus on VINs from windshields.

OCR doesn’t seem to be rocket science now, does it? Well, after some initial attempts, we realized we’re not able to use any available commercial tools with success, and the problem was much harder than we had thought.

How do you like this example of KerasOCR?

Aside from details such as the fact that VINs don’t contain the characters ‘I’, ‘O’, and ‘Q’, we have to deal with very specific distortions, proportions, and fonts.
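The restricted alphabet is useful as a cheap sanity check on whatever the recognizer returns. The sketch below is not part of the solution described in this article: it assumes the standard 17-character VIN length, and the mapping of look-alike characters to digits is a heuristic we made up for illustration.

```python
import re

# 17 characters, digits and capital letters except I, O and Q.
VIN_PATTERN = re.compile(r"[A-HJ-NPR-Z0-9]{17}")

# Hypothetical fix-ups for letters that cannot occur in a VIN
# but are easily confused with digits by an OCR model.
LOOKALIKES = str.maketrans({"I": "1", "O": "0", "Q": "0"})


def normalize_ocr_output(text):
    """Upper-case the text and replace impossible letters with look-alike digits."""
    return text.strip().upper().translate(LOOKALIKES)


def is_plausible_vin(text):
    """Check length and alphabet only; this does not verify any check digit."""
    return VIN_PATTERN.fullmatch(text.strip().upper()) is not None


raw = "1M8GDM9AXKPO42788"  # OCR read the digit 0 as the letter O
print(is_plausible_vin(raw), is_plausible_vin(normalize_ocr_output(raw)))
```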

Initial approach

How can we approach the problem? The most straightforward answer is to divide the system into two components:

  • VIN detection: cropping the characters from the big image
  • VIN recognition: recognizing cropped characters

In the ideal world images like that:

Will be processed this way:

Now that we have an intuition of what the problem looks like, we can start solving it. Needless to say, there is no “VIN reading” task available on the internet, therefore we need to design every component of our solution from scratch. Let’s introduce the most important stages we’ve created, namely:

  • VIN detection
  • VIN recognition
  • Training data generation
  • Pipeline

VIN detection

Our VIN detection solution is based on two ideas:

  • Encouraging users to take a photo with VIN in the center of the picture - we make that easier by showing the bounding box.
  • Using Character Region Awareness for Text Detection (CRAFT) - a neural network to mark VIN precisely and be less error-prone.

CRAFT

The CRAFT architecture predicts a text area in the image by simultaneously predicting the probability that a given pixel is the center of some character and the probability that a given pixel is the center of the space between adjacent characters. For the details, we refer to the original paper.
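To give an intuition of how two such probability maps become a text box, here is a heavily simplified sketch. It is not the post-processing from the CRAFT paper or from our pipeline: it assumes a single line of text, uses plain nested lists as score maps, and the thresholds are arbitrary.

```python
def text_bounding_box(region, affinity, region_thr=0.4, affinity_thr=0.4):
    """Return (x_min, y_min, x_max, y_max) covering all likely text pixels.

    region[y][x]   - probability that the pixel is a character center
    affinity[y][x] - probability that the pixel lies between adjacent characters
    A pixel counts as text if either score passes its (assumed) threshold.
    """
    xs, ys = [], []
    for y, (region_row, affinity_row) in enumerate(zip(region, affinity)):
        for x, (r, a) in enumerate(zip(region_row, affinity_row)):
            if r >= region_thr or a >= affinity_thr:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


# Toy 3x5 score maps: two characters with a gap between them.
region = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.1, 0.8, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
affinity = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.7, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
box = text_bounding_box(region, affinity)
print(box)
```

The affinity map is what links the two characters into one box: without it, the low-scoring pixel between them would split the text.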

The image below illustrates the operation of the network:

Before the actual recognition, it sounded like a good idea to simplify the input image so that it contains all the needed information and no redundant pixels. Therefore, we wanted to crop the characters’ area from the rest of the background.

We intended to encourage a user to take a photo with a good VIN size, angle, and perspective.

Our goal was to be prepared to read VINs from any source, e.g. side doors. After many tests, we think the best idea is to send the area from the bounding box seen by users and then try to cut it more precisely using VIN detection. Therefore, our VIN detector can be interpreted more like a VIN refiner.

It would be remiss of us not to note that CRAFT works exceptionally well. Some say every precious minute communing with it is pure joy.

Once the text is cropped, we need to map it to an axis-aligned rectangle. There are dozens of design decisions to make here, such as the choice of affine transform, the resampling method, the shape of the target rectangle, resampling for text recognition, etc.
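As an illustration of the mapping step, the sketch below computes an affine transform from three corners of a tilted text box to the corners of an upright rectangle. It is one possible approach under stated assumptions, not necessarily the one used in the project: the helper names and the sample corner coordinates are hypothetical, and the 960 x 32 px target size is borrowed from the recognizer input described later in this post.

```python
def affine_from_points(src, dst):
    """Return the 2x3 affine matrix that maps three src points onto three dst points."""
    (x0, y0), (x1, y1), (x2, y2) = src
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")
    rows = []
    for k in (0, 1):  # k = 0 solves the x' row, k = 1 solves the y' row
        u0, u1, u2 = dst[0][k], dst[1][k], dst[2][k]
        # Cramer's rule for the system  a*x + b*y + c = u  at the three points.
        a = (u0 * (y1 - y2) - y0 * (u1 - u2) + (u1 * y2 - u2 * y1)) / det
        b = (x0 * (u1 - u2) - u0 * (x1 - x2) + (x1 * u2 - x2 * u1)) / det
        c = (x0 * (y1 * u2 - y2 * u1)
             - y0 * (x1 * u2 - x2 * u1)
             + u0 * (x1 * y2 - x2 * y1)) / det
        rows.append((a, b, c))
    return rows


def apply_affine(matrix, point):
    """Map a single (x, y) point through the 2x3 affine matrix."""
    x, y = point
    return tuple(a * x + b * y + c for a, b, c in matrix)


# Hypothetical tilted box: top-left, top-right, bottom-left corners.
source_corners = [(10.0, 20.0), (110.0, 30.0), (8.0, 40.0)]
target_corners = [(0.0, 0.0), (960.0, 0.0), (0.0, 32.0)]

matrix = affine_from_points(source_corners, target_corners)
print(apply_affine(matrix, (108.0, 50.0)))  # bottom-right corner, approximately (960.0, 32.0)
```

An affine transform handles rotation, scaling, and shear. A photo taken at a strong angle would need a perspective transform instead, which requires four point pairs.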

Having ideally cropped characters makes recognition easier. But it doesn’t mean that our task is completed.

VIN recognition

Accurate recognition is a winning condition for this project. First, we want to focus on the images that are easy to recognize – without too much noise, blur, or distortions.

Sequential models

The SOTA models tend to be sequential models with the ability to recognize the entire sequences of characters (words, in popular benchmarks) without individual character annotations. It is indeed a very efficient approach but it ignores the fact that collecting character bounding boxes for synthetic images isn’t that expensive.

As a result, we devalued what is supposedly the most important advantage of sequential models. There are more, but are they worth watching out for all the traps that come with them?

First of all, training an attention-based model is very hard in this case because of


As you can see, the target characters we want to recognize are dependent on history. It could be possible only with a massive training dataset or careful tuning, but we omitted it.

As an alternative, we can use Connectionist Temporal Classification (CTC) models, which, in contrast, predict labels independently of each other.
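The independence of CTC predictions is easiest to see in the standard greedy decoding step: each frame picks its best symbol on its own, and only afterwards are repeats collapsed and blanks dropped. The sketch below is a generic illustration, not the project's code; the three-symbol alphabet and the frame scores are made up.

```python
BLANK = "-"  # the special CTC "no character" symbol


def ctc_greedy_decode(frame_scores, alphabet):
    """Decode per-frame scores into text with the CTC collapse rule."""
    # Step 1: every frame chooses its best symbol independently of the others.
    best = [
        alphabet[max(range(len(scores)), key=scores.__getitem__)]
        for scores in frame_scores
    ]
    # Step 2: merge consecutive repeats, then drop blanks.
    decoded = []
    previous = None
    for symbol in best:
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


alphabet = "-1A"  # toy alphabet: blank, '1', 'A'
frame_scores = [
    [0.1, 0.8, 0.1],  # '1'
    [0.1, 0.7, 0.2],  # '1' again (same character, merged)
    [0.9, 0.05, 0.05],  # blank separates two real '1's
    [0.2, 0.7, 0.1],  # '1'
    [0.1, 0.1, 0.8],  # 'A'
]

print(ctc_greedy_decode(frame_scores, alphabet))  # 11A
```

The blank symbol matters for VINs in particular, because repeated characters such as "11" or "00" are common and would otherwise be merged into one.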

More importantly, we didn’t stop at this approach. We utilized one more algorithm with different characteristics and behavior.

YOLO

You Only Look Once is a very efficient architecture commonly used for fast and accurate object detection and recognition. Treating a character as an object and recognizing it after detection seems to be an approach definitely worth trying in this project. We don’t have the history-dependence problem here, and there are some interesting tweaks that allow even more precise recognition in our case. Last but not least, we gain more control over the system, as much of the responsibility is moved away from the neural network.

However, the VIN recognition requires some specific design of YOLO. We used YOLO v2 because the latest architecture patterns are more complex in areas that do not fully address our problem.

  • We use 960 x 32 px input (so images cropped by CRAFT are usually resized to meet this condition). Then we divide the input into 30 grid cells (each of size 32 x 32 px),
  • For each grid cell, we run predictions in predefined anchor boxes,
  • We use anchor boxes of 8 different widths but height always remains the same and is equal to 100% of the image height.
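The layout described in the list above can be sketched in a few lines. The input size, the cell size, and the number of anchor widths come from the text; the concrete widths in `ANCHOR_WIDTHS` are invented for illustration, and centering each anchor on its cell is a simplification of how YOLO v2 actually predicts box offsets.

```python
INPUT_WIDTH = 960   # px, from the text
INPUT_HEIGHT = 32   # px, from the text
CELL_SIZE = 32      # px, from the text

# Eight anchor widths in px. These particular values are hypothetical.
ANCHOR_WIDTHS = [8, 12, 16, 20, 24, 28, 32, 40]


def grid_cells():
    """Return the (left, right) x-range of every grid cell along the VIN."""
    count = INPUT_WIDTH // CELL_SIZE
    return [(i * CELL_SIZE, (i + 1) * CELL_SIZE) for i in range(count)]


def anchor_boxes():
    """Return (left, top, right, bottom) for every anchor in every cell."""
    boxes = []
    for left, right in grid_cells():
        center_x = (left + right) / 2
        for width in ANCHOR_WIDTHS:
            # Height is always 100% of the image height.
            boxes.append((center_x - width / 2, 0, center_x + width / 2, INPUT_HEIGHT))
    return boxes


print(len(grid_cells()))    # 30
print(len(anchor_boxes()))  # 240
```

Because every anchor spans the full image height, the network only has to decide where a character starts and ends horizontally, which fits a single line of text well.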

As the results came in, our approach proved to be effective in recognizing individual characters from the VIN.

Metrics

Appropriate metrics are crucial in machine learning-based solutions, as they drive your decisions and project dynamics. Fortunately, we think simple accuracy fulfills the demands of a precise system, so we can skip the research in this area.

We just need to remember one fact: a typical VIN contains 17 characters, and it’s enough to miss one of them to classify the prediction as wrong. At any point of the work, we measure the Character Error Rate (CER) to understand the development better. A CER at the level of 5% (5% of wrong characters) may result in accuracy lower than 75%.
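The relationship between the two metrics can be checked with a short calculation. The sketch below computes CER with the usual edit-distance definition and whole-VIN accuracy as exact match; the function names and the sample predictions are assumptions. The last helper shows what would happen if character errors were independent, which is an idealization: a 5% CER would then leave only about 42% of 17-character VINs fully correct, well below the 75% mentioned above.

```python
def edit_distance(predicted, target):
    """Levenshtein distance: insertions, deletions, and substitutions."""
    previous = list(range(len(target) + 1))
    for i, p_char in enumerate(predicted, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (p_char != t_char),  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(predictions, targets):
    """Total edit distance divided by the total number of target characters."""
    errors = sum(edit_distance(p, t) for p, t in zip(predictions, targets))
    return errors / sum(len(t) for t in targets)


def exact_match_accuracy(predictions, targets):
    """Share of VINs in which every character is correct."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


def expected_vin_accuracy(cer, length=17):
    """Accuracy if each character failed independently with probability `cer`."""
    return (1 - cer) ** length


targets = ["1M8GDM9AXKP042788", "1M8GDM9AXKP042788"]
predictions = ["1M8GDM9AXKP042788", "1M8GDM9AXKP04278B"]  # one wrong character

print(round(character_error_rate(predictions, targets), 4))  # 0.0294
print(exact_match_accuracy(predictions, targets))            # 0.5
print(round(expected_vin_accuracy(0.05), 3))                 # 0.418
```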

About the models tuning

It's easy to notice that all OCR benchmark solutions have an effective capacity that far exceeds the complexity of our task, while being too general at the same time. That itself emphasizes the danger of overfitting and directs our focus to generalization ability.

It is important to distinguish hyperparameter tuning from architectural design. Apart from ensuring that the information flow through the network extracts the correct features, we do not dive into extended hyperparameter tuning.

Training data generation

We skipped one important topic: the training data.

Often, we support our models with artificial data with reasonable success, but this time the profit is huge. Cropped synthesized texts are so similar to the real images that we suppose we can base our models on them and only fine-tune them carefully with real data.

Data generation is a laborious, tricky job. Some say your model is as good as your data. It feels like carving: any mistake can break your material. Worse, you may spot it only after the training.

We have some pretty handy tools in our arsenal but they are, again, too general. Therefore, we had to introduce some modifications.

Actually, we were forced to generate more than 2M images. Obviously, there is no point nor possibility of using all of them. Training datasets are often crafted to resemble the real VINs in a very iterative process, day after day, font after font. Modeling a single General Motors font took us at least a few attempts.

But finally, we got there. No more T’s as 1’s, V’s as U’s, and Z’s as 2’s!

We utilized many tools. All have advantages and weaknesses and we are very demanding. We need to satisfy a few conditions:

  • We need a good variance in backgrounds. It’s rather hard to collect a satisfying number of windshield backgrounds, so we’d like to be able to reuse those that we have. At the same time, we don’t want to overfit to them, so we want to have some different sources. Artificial backgrounds may not be realistic enough, so we want to use some real images from outside our domain,
  • Fonts, perhaps the most important ingredient in our combination, have to resemble the creative VIN fonts (who made them!?) and cannot interfere with each other. At the same time, the number of car manufacturers is much higher than our collector’s impulses, so we have to be open to unknown shapes.
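As a rough illustration of the first step of such a generator, the sketch below samples what each synthetic image should contain: a random 17-character label without ‘I’, ‘O’, ‘Q’, plus a font and a background drawn from pools. It does not render anything, and it is not the tooling used in the project. The function names, font names, and background names are hypothetical, and real VINs have more internal structure than uniformly random characters.

```python
import random

# Letters and digits allowed in a VIN: no 'I', 'O', or 'Q'.
VIN_ALPHABET = "ABCDEFGHJKLMNPRSTUVWXYZ0123456789"
VIN_LENGTH = 17


def random_vin_label(rng):
    """Sample a VIN-shaped string to use as the text of a synthetic image."""
    return "".join(rng.choice(VIN_ALPHABET) for _ in range(VIN_LENGTH))


def sample_specs(count, fonts, backgrounds, seed=0):
    """Describe `count` synthetic images: label, font, and background for each."""
    rng = random.Random(seed)  # seeded so a dataset can be regenerated exactly
    return [
        {
            "text": random_vin_label(rng),
            "font": rng.choice(fonts),
            "background": rng.choice(backgrounds),
        }
        for _ in range(count)
    ]


# Hypothetical asset pools: in-domain windshields plus out-of-domain photos.
fonts = ["manufacturer_font_a", "manufacturer_font_b"]
backgrounds = ["windshield_01", "windshield_02", "street_photo_17"]

for spec in sample_specs(3, fonts, backgrounds):
    print(spec)
```

Keeping the sampling separate from the rendering makes it cheap to rebalance fonts and backgrounds between iterations without touching the drawing code.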

The images below are examples of VIN data generation for recognizers:

Putting everything together

It’s the art of AI to connect so many components into a working pipeline and not mess it up.

Moreover, we have a lot of traps here. Mind these images:

VIN labels often consist of separated strings or two rows, with logos and bar codes present near the caption.

90% of end-to-end accuracy provided by our VIN reader

Running in under one second on a mid-quality CPU alone, our solution achieves over 90% end-to-end accuracy.

This result depends on the problem definition and the test dataset. For example, we have to decide what to do with images that are impossible for a human to read. Nevertheless, regardless of the dataset, we approached human-level performance, which is a typical reference level in Deep Learning projects.

We also managed to develop a mobile offline version of our system with similar inference accuracy but a bit slower processing time.

App intelligence

While working on tools designed for business, we can’t forget about the real use-case flow. With the above pipeline, we have no resistance to photos that are impossible to read, even though we would like to. Such situations often happen due to:

  • incorrect camera focus,
  • light flashes,
  • dirty surfaces,
  • damaged VIN plate.

Usually, we can prevent these situations by asking users to change the angle or retake a photo, before we send it to the further processing engines.

However, the classification of these distortions is a pretty complex task! Nevertheless, we implemented a bunch of heuristics and classifiers that allow us to ensure that VIN, if recognized, is correct. For the details, you have to wait for the next post.
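The post does not say which heuristics are used, so the following is only a sketch of one check that could plausibly belong to such a set: rejecting outputs that cannot be a VIN at all. The format test relies on facts stated earlier (17 characters, no ‘I’, ‘O’, ‘Q’). The check-digit test follows the weighted-sum rule used for the ninth character of North American VINs; assuming that it applies to the VINs being read is our assumption, since vehicles from other markets may not carry a valid check digit.

```python
import re

# 17 characters, letters and digits only, without 'I', 'O', 'Q'.
VIN_PATTERN = re.compile(r"[A-HJ-NPR-Z0-9]{17}")

# Numeric value of each character in the check-digit calculation.
TRANSLITERATION = {str(digit): digit for digit in range(10)}
TRANSLITERATION.update(zip("ABCDEFGH", range(1, 9)))
TRANSLITERATION.update(zip("JKLMN", range(1, 6)))
TRANSLITERATION.update({"P": 7, "R": 9})
TRANSLITERATION.update(zip("STUVWXYZ", range(2, 10)))

# Position weights; the ninth position (the check digit itself) has weight 0.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def has_valid_format(vin):
    """True if the string has the length and alphabet of a VIN."""
    return VIN_PATTERN.fullmatch(vin) is not None


def has_valid_check_digit(vin):
    """True if the ninth character matches the weighted sum modulo 11."""
    if not has_valid_format(vin):
        return False
    total = sum(TRANSLITERATION[char] * weight for char, weight in zip(vin, WEIGHTS))
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected


print(has_valid_check_digit("1M8GDM9AXKP042788"))  # True
print(has_valid_check_digit("1M8GDM9AXKP04278B"))  # False: one character misread
print(has_valid_format("1M8GDM9AXKP04278O"))       # False: 'O' never appears in a VIN
```

A check like this cannot confirm that a reading is right, but it can cheaply flag many readings that are certainly wrong and trigger a request to retake the photo.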

Last but not least, we’d like to mention that, as usual, there are a lot of additional components built around our VIN Reader. Apart from a mobile application with offline on-device recognition, we’ve implemented a remote backend, pipelines, tools for tagging, semi-supervised labeling, synthesizers, and more.

https://youtu.be/oACNXmlUgtY
