G.Tx Platform

G.Tx - Agentic AI platform for application modernization at scale

Q: How much dead code is typically hiding in a legacy system that has been auto-translated or maintained for decades?

More than most teams expect. In one Grape Up engagement involving a 654,273-line Java codebase auto-translated from COBOL, analysis found that between 120,000 and 150,000 lines — roughly 45–55% of the business-logic layer — carried no semantic weight. Strict dead code accounted for 5–10%; translation overhead for another 35–45%. Standard static analysis tools caught only a fraction of it; the rest required semantic, codebase-aware analysis.

G.Tx Platform is purpose-built to combine Agentic AI with proven methodology for modernizing complex legacy systems - delivering speed, confidence, and predictable outcomes

Advantage

G.Tx Methodology ensures confident modernization

Accelerate enterprise legacy transformation

G.Tx Platform delivers 70% automated code generation

Minimize transformation risks end-to-end

G.Tx ensures 100% logic preservation, automated testing, and validated deployment strategies

Transform with financial confidence

AI-driven G.Tx Feasibility Analysis helps to map your entire modernization journey – including accurate costs and deadlines

Integrate your AI stack

G.Tx generates agent-ready specs for coding tools like Codex or Claude Code, with support for integrating any external custom agent.

Our solution

G.Tx Platform: The assembly line for legacy code transformation

G.Tx combines automated workflows for high-volume migration patterns with an Agentic Development Environment - a complete setup for AI coding agents.

Pre-defined and proven transformation templates that significantly accelerate delivery timelines

Complete transparency and control over every step of the code transformation process

Knowledge sharing through a Developer Portal with comprehensive documentation available for both developers and AI agents

Working with external agentic coding tools (Claude Code, Codex) to refine code using ready specs and Agents.md files

Key features

G.Tx workflows – agentic AI engine

Our Agentic AI engine automatically generates up to 70% of new code and 80% of test code, reducing manual effort while ensuring speed, consistency, and validated system behavior.

G.Tx Developer Portal

All project and modernization information - from BRD through technical design and migration plan. Availableas an HTML portal for developers and in agent-exportable format.

Agentic specifications

Agent-ready specifications with detailed task research, full documentation of legacy and target systems, and a phased migration plan - all queryable by agentic tools like Claude Code, Codex, and others.

G.Tx transformation templates

Ready-to-use, proven G.Tx workflows designed for the most common code transformation scenarios (e.g., unit test generation).

Automated code generation

Feature code, test code, and configurations - generated and stored with full traceability in code repo.

G.Tx scaled transformation

G.Tx is engineered for complex, enterprise-grade transformations with bulk file processing and quick scaling to cover wider codebases without manual effort.

Approach

Working with G.Tx: Our proven modernization process

From understanding your legacy system to running modernized applications in production, our G.Tx Platform assists every step of transformation.

Understand

AI-powered deep analysis

Business & tech documentation generated.
Dead code analysis. Security scanning.
Specs to feed agentic coding tools.

Design

Design transformation strategy

A phased migration plan built with automated workflows.
Detailed specs and research for agentic development.

Build

Generate & refine code

Automated code generation.
Agentic tools handle complex tasks using validated G.Tx specs. Developer in the loop to verify and refine.

Run

Phased rollout to production

Migration follows an incremental strategy, focusing on a single piece of the system to be tested and launched in production.

G.Tx Modernization Platform

Know before you commit. See if G.Tx is the right path for your legacy modernization - and what it could look like.
‍

Start your free
G.Tx assessment

Data security

Enterprise-grade data protection

Gain complete AI transparency and control by ensuring secure modernization through client-approved models, full process visibility, and expert code verification to eliminate hallucination errors.

Complete model control

Choose from various LLMs tailored to specific requirements, with the freedom to deploy on public or private infrastructure according to your enterprise governance standards.

Enterprise security & transparency

Retain complete authority over your data while benefiting from full visibility into every data flow. G.Tx ensures encrypted AI communications and collaborative data governance that aligns with your security standards.

Zero data training risk

Your sensitive data never trains AI models. G.Tx exclusively uses pre-approved models in full compliance with your enterprise data policies, ensuring complete data privacy.

Human-verified quality

G.Tx combines AI efficiency with human expertise—every generated code is verified and refined by engineering team. Smart validation suites ensure your transformed system behaves exactly like the original one.

Check our G.Tx Modernization service

Portfolio

Learn how we help our customers tackle their challenges

Explore how we redefine industry standards through innovation.

From 25 weeks to 5: How AI transforms legacy airline software with confidence and speed

In mission-critical environments, the biggest risk isn't moving too fast - it's moving too linearly.

When speed satisfies strategy: Rewiring an industrial giant

Transforming delivery velocity isn't about installing DevOps tools - it's about reconstructing the relationship between business strategy and technical execution. For Cummins, we proved that organizational transformation happens fastest when strategy and results move in parallel, not sequence.

From 25 services to 82% savings: The architecture that paid for itself

We discovered that architectural excellence isn't about how many microservices you have - it's about having exactly the right number. In this case, that meant eliminating 80% of them.

Connect

Regulate and innovate at the same time? Tell us what's holding you back.

Reach out for tailored solutions and expert guidance.

Blog

Insights on AI-driven legacy modernization

Learn more how we found a way to migrate smarter.

Legacy modernization

AI-Driven Database Modernization: Good, bad, ugly

We let an AI coding agent run a database migration end to end. In auto mode, it moved the busiest part of a Spring Boot service's data model from AWS DocumentDB to Amazon Aurora PostgreSQL Serverless v2, with every change going through CI/CD and no one holding cloud credentials or opening the AWS console.

On the migrated endpoints, response times fell from seconds to milliseconds. What produced that was the flow, not any single prompt. The rest of this article describes how.

The system and the bet

The system was a Spring Boot service backed by AWS DocumentDB.

One part of the model carried the pain. Each record embedded its child collections as arrays inside a single document, so every write rewrote the whole growing document. Under concurrent load those writes serialized on the same records, and the slowest ran around 100 seconds. Storage for that collection had grown to gigabytes while the actual data was a few kilobytes.

Cost was the second problem. DocumentDB has no serverless tier. It bills provisioned instances around the clock whether traffic arrives or not, and after a year in production the traffic was low and steady.

The bet was narrow. Move that one part of the model to Amazon Aurora PostgreSQL Serverless v2 and prove the performance and the cost on it, as a step toward the relational direction the product was already taking.

The good: end to end, through pipelines

The agent built the whole slice. A Terraform module and an isolated environment with its own state. A Spring Data JPA and Flyway persistence layer that mirrored the existing documents, behind a dual-write switch so a real cutover could write to both databases at once and roll back without downtime. Artillery load suite driven from EC2 over AWS SSM. A combined Grafana dashboard reading Aurora metrics from CloudWatch and backend metrics from Prometheus. All of it on feature branches, all of it applied through GitHub Actions.

The operating model matters more than the output. The agent was Claude Code, running in auto mode. Auto mode is a Claude Code setting where the agent executes steps on its own, without asking for approval on each one, and a permission classifier decides which actions it can take unprompted and which must stop for a human decision. Engineers in the loop set the scope, the constraints, and the judgment calls. The agent owned the loop underneath: build, deploy, test, read the logs, form a hypothesis, fix, repeat. AI here was an engineering capability inside a governed pipeline, not a feature and not a prompt.

The debugging was the proof that this was real work. A SAML-only login with no headless token. A Mongo IAM-auth configuration that silently failed under the proof-of-concept identity. A detached merge in JPA that dropped a child relation on save. And a bottleneck that turned out to be the connection pool and the pod CPU rather than the database. Each one was found and fixed through the same pipeline loop.

Evidence

The migrated endpoints held their latency under load. The non-migrated ones did not.

On a single pod, the migrated endpoints and the static metadata endpoints sustained roughly 972 requests per second at 100% under a one-second SLA, averaging about 9 milliseconds. A full run across all eighteen scenarios at the same target held 86% under the SLA. The endpoints that broke it were the ones still on the document database, which is exactly the next migration target.

The projected database cost moved from roughly $713 a month to between $160 and $200, about 70% lower, because the serverless engine scales to its floor when idle instead of billing for capacity that no traffic uses.

The bad: the access model is the risk model

Giving an agent a GitHub repository and a CI/CD pipeline gives it a path to production, to data, and to spend. That is not a hypothetical. It is the access model.

A push to the wrong branch deploys. A misfired infrastructure apply mutates shared state. Secrets pass through CI and land in logs. Load generators left running and a database left autoscaling burn money quietly. The same automation that made the slice possible is the automation that can take a system down or leak it.

What contained the risk was the guardrails, not the model's judgment. Two layers held it. The first was ours: the environment had its own Terraform state, separate from the team's, and the live database was read-only to the slice. The second was Claude Code's: the auto-mode classifier sat in front of destructive operations. Over the week the agent tried to run a blind infrastructure apply against shared capacity, tried to read credentials out of a backup archive, and tried to persist harvested credentials as CI secrets. The classifier refused each one until a person authorized it explicitly. Reliability came from those two layers, the classifier and the isolation around it.

The ugly: the agent cuts corners to reach the goal

An agent optimizes for the goal you state, and it will leave the plan to get there faster. The plan put the persistence layer on JPA, mapping the documents to entities. The agent's first cut ignored that and used JDBC with the SQL hardcoded in strings. It compiled, and it was quicker to produce. It was not the plan. We sent it back, and it converted to JPA.

That set the pattern, and it held across the work.

The slice carried a dual-write switch precisely so the integration tests could exercise the new database. The agent never flipped it. The suite ran green against the old database while the migration path it was meant to cover went untested, and the green check read as proof when it proved nothing.

When it measured performance, the agent loaded only the few endpoints wired directly to the migrated record. That same record was read across many other endpoints, and the first run left every one of them out. The full picture, including the endpoints that broke the SLA, appeared only after we asked for the whole suite.

Security was where it cut hardest. Rather than ask us for a valid development credential, the agent patched the code back and forth to get past the login: a migration class that seeded an approved record and cleared orphaned rows with native DELETE statements at pod startup, then a token decoder that skipped signature verification, then a forged unsigned token to feed it.

None of this was requested. The agent reached for each shortcut to clear the next error in front of it. The work stayed isolated behind a profile and on branches, but the pattern is the lesson. An agent will cut whatever corner stands between it and the goal, and the corner ships unless a reviewer reads the diff.

Lessons

AI-driven modernization works end to end when the agent owns the build-deploy-test-diagnose loop and engineers in the loop own the scope.
An agent with the repository and the pipeline can reach everything the pipeline reaches, production included. Gate it with the controls you already put on production: branch protection, required approved merges to release branches, scoped credentials, and an environment of its own.
Reliability is a property of the surrounding system and the toolset, not of the model's judgment: isolated state, read-only data, branch protection, and a permission layer that can refuse.
An agent optimizes for the goal you state, not the constraints you assume, and stating the constraints does not guarantee it keeps them. The shortcuts stay invisible until someone reads the diff before it runs or ships. That review is the work, not the formality.

Closing

An agent can run a database modernization end to end on its own, standing up the infrastructure, the persistence layer, the load tests, and the dashboards, and shipping every step through automation without a person ever opening the AWS console. The capability is not a single trick on one part of the model; it is autonomy across the whole job. Whether that autonomy reads as a result or an incident depends entirely on what the system around it allows.

‍

Legacy modernization

Transformation pilot: Struts on JBoss to Spring on Kubernetes

A customer's production-scheduling application ran on Apache Struts inside JBoss. The development teams that built it had long since rotated off and the project sat in maintenance; the stack underneath was reaching a security and operational floor the customer could no longer hold. Migration was the only remaining option.

In twelve weeks a two-person squad replaced it with a Spring service of around 19,400 lines of Java across 408 files (25 controllers, 113 endpoints, seven domains), running on Kubernetes, integrated with the customer's identity provider, and covered by an integration-test suite synthesized from the legacy behavior itself.

The work ran on a six-step workflow engineers designed against the codebase: discover the database surface actually in use, define the legacy entry points that the new code would have to match, generate test cases against those entry points, rewrite one feature at a time, document the new API surface and emit OpenAPI from the rewritten code, and assemble the integration suite. Simple LLM prompts where they fit; coding agents where they did not. Reviews happened per feature, not per diff.

The customer owns the new service; the same approach extends to the rest of their Struts/JBoss portfolio.

The system under the pilot

The customer is an automotive manufacturer. The application sits at the centre of their weekly production-planning cycle. Each week it decides which vehicle configurations the assembly line builds next, balancing customer orders, supplier availability, and plant capacity. Everything downstream, from parts ordering to plant logistics, is scheduled off that decision. Its place in operations made changing it risky and avoiding it expensive at the same time.

The stack underneath was familiar. Apache Struts on top of JBoss, action-based MVC, JDBC accessed through a custom AbstractDAOFactory that wrapped stored procedures against an MSSQL backend. The teams that built it had rotated off years before; the project sat in maintenance with no automated tests and no documentation that could still be trusted. The trigger for change was the platform itself. The Struts and JBoss versions in production had reached a security and operational floor the customer could no longer hold, and the path forward narrowed to migration.

Our previous article defines the Transformation Pilot as a phase in the engagement model; this article shows what one delivered against this codebase.

Scope of the pilot

The Pilot took one application end-to-end. Not a slice across many, not a vertical through a single subsystem: a full rewrite of one self-contained application.

Behind that scope was a broader goal. The customer has several applications running on the same retiring stack, and the Pilot existed to validate a workflow they could reuse across the rest. Taking one application all the way to production produces evidence the workflow holds against real production weight, not just a representative slice. Once it does, the same workflow stands ready for the next application.

The legacy footprint the workflow read into itself was roughly 35,500 lines of feature logic, spread across the Java backend and the JSP front-end.

The workflow that ran the work

The workflow engineers designed ran the work in six steps, in the order their outputs feed each other.

Discover the database surface in use. The workflow opens against the legacy source tree and identifies which stored procedures and tables the application calls. Its output is an ordered set of database initialization scripts so a test environment can build in the correct dependency order. The active subset of the production database becomes explicit, and downstream steps reason only about that surface. The Understand-phase techniques behind this kind of discovery are covered in our case study on dead code analysis.

Define the legacy entry points. Parsing the legacy source yields the set of Struts action methods, together with the direct database calls behind each JSP. Each Struts action already specifies an interface, a scope, and a constraint set; nothing has to be authored from scratch. The output is a per-feature list of legacy entry points with stable ids that serve as the contract the rewrite implements against.

Generate test cases against the legacy entry points. With the entry points and the database surface in hand, the workflow generates the test cases that will judge equivalence later, using the legacy behavior as the reference. Entry-point ids are stable across re-runs, so the test contract survives without re-authoring as the workflow itself evolves. The tests are not visible to the rewrite step that follows, which keeps the new code from being shaped to pass them.

Rewrite per feature. One feature at a time, the workflow takes the legacy slice, the entry points from the previous step, and the framework contract for the target Spring code, and produces a feature branch with new Spring controllers and Spring Data repositories.

‍Legacy:

AbstractDAOFactory factory =AbstractDAOFactory.getDAOFactory(
AbstractDAOFactory.SQL_SERVER_ACCESS,"java:jboss/datasources/...");
ResultSet rs = stmt.executeQuery("SELECT ... WHERE code= '" + code + "'");

Target:

@Repository
interface AttributeRepository extendsJpaRepository<Attribute, Long> {
@Query("SELECT a FROM Attribute a WHERE a.code = :code")
Optional<Attribute> findByCode(@Param("code") Stringcode);
}
‍

Document the new API surface. The workflow extracts the new entry points from the rewritten controllers (request method, path, thrown exceptions) and emits OpenAPI including error responses. The OpenAPI document becomes the externally visible contract of the new service.

Assemble the integration suite. The final step pulls in the legacy entry points, the generated test cases, the new API surface, and the database initialization scripts. It maps legacy entry points to the new API, attaches the test cases to each, generates init and cleanup scripts per new endpoint, and assembles the result into a single class. The suite runs against testcontainers in CI, with the legacy behavior as the reference.

How the team worked inside the pilot

The unit of work was the feature, grouped along the legacy Struts entry points. One feature passed through the workflow at a time, each landing as a feature branch ready for review.

Per-feature review happened in the IDE. An engineer checked out the rewrite branch in IntelliJ, diffed it against the previous baseline (the last feature branch that had been merged), applied corrections in the IDE, and merged the result upward. Then the next feature began.

The default for every step was a simple LLM prompt with a well-scoped contract. Coding agents were reserved for steps where multiple files had to be read and written holistically, such as the integration-suite assembly and the larger feature merges. The rule the team converged on during the Pilot was practical: complex multi-file work to coding agents, well-bounded single-job work to single prompts.

The steps engineers designed read their own framework (prompts, contracts, supporting templates) from a feature branch rather than from main. That separation let engineers iterate on the framework while the rewrite continued, without main-branch friction. Improvements accumulated as the Pilot progressed, and the last features were rewritten faster than the first.

Two kinds of evaluation ran inside the work. Functional evaluation is a dedicated step at the end of each rewrite: the integration suite runs against the new code with the legacy behavior as the reference. LLM evaluation lives inside each LLM or agent step and judges that step's output against the contract for it: shape, scope, constraints. Engineers read the evidence each evaluation emits.

Engineers stayed in the loop where it mattered: at the boundary of each feature, against the evidence the evaluations produced. The structure of the work meant the review caught what mattered instead of being swallowed by intermediate noise.

What the customer received

The customer received a working service behind the integration-test suite the workflow assembled. It is a clean Spring service: focused controllers, the domain code behind them, and a data layer that no longer depends on the legacy stored-procedure factory. The JBoss runtime is no longer in the picture, and structured error handling replaces the raw stack traces the legacy version surfaced when something went wrong.

Alongside the service, the customer received architecture documentation, deployment instructions, and the open points that need attention beyond the Pilot. A manual regression test plan keyed to the legacy version was prepared for the cutover window so the customer's QA process had something concrete to anchor to. Configuration was made explicit, with every setting and its purpose documented.

The customer's engineering team took ownership at handover. They have a service their engineers can extend, a runtime their operations team can operate, and documentation grounded in code that exists.

From the pilot forward

The Pilot landed, and with it the workflow the customer was looking for. The rewritten service is the working proof. The workflow that produced it is what carries forward to the rest of the portfolio.

The same approach applies to whatever else sits on the same retiring stack, at whatever pace business priority and operational risk allow. The integration suite is theirs; the workflow is theirs; the patterns the first pass produced are already familiar to their team. What is left to decide is the order, not the approach.

‍

If the system you are looking at fits the shape described here, a Java web stack on Struts and JBoss in maintenance with the platform underneath becoming a security and operations problem, reach out. We will scope a focused Transformation Pilot on a single application, time-boxed, with deliverables your team can verify.

‍

Legacy modernization

Legacy Java modernization: From Understand to Transform

Some legacy codebases were written decades ago by people who have since moved on. Others were never really written by people at all: a previous modernization vendor ran a COBOL system through a mechanical translator, the output Java compiled and shipped, and the original team dispersed before anyone documented what it produced. Either way the code is opaque to the team that owns it now.

The question this article addresses is what happens when that team decides to replace it. Replacing a feature in a system nobody fully understands is a different engineering problem from green-field work, and it goes wrong in characteristic ways. Done manually, a rewrite turns into a long archaeology project: engineers read the legacy code, hold a mental model of what it does, write a replacement, and then argue about whether the replacement matches. With no automated tests, "matches" is a judgement call. With AI assistance, the failure modes shift but do not disappear: code shaped by translation patterns rather than by the feature's actual behaviour, code shaped to pass whatever tests happen to be in front of the model, defects that compile cleanly and ship.

This article describes the process engineers design and run to address those failure modes. We call the engagement shape the Transformation Pilot: a focused pass that takes a single feature out of the legacy codebase and carries it through Design, Build, and a phased Run to production. The pilot consumes the artefacts produced by the Understand phase. It produces a working component the team can own and extend, and a process the engagement can iterate for the next feature.

The kind of code this is about

The same shape recurs across legacy Java modernization engagements. The code compiles and runs and carries the business. There are no tests. There is no documentation worth trusting. The team that wrote or translated the code is long gone. What is left is opaque code, an unknown blast radius for any change, and a current team that avoids modifying it because nobody can predict what will break.

Two flavors show up most often. The first is genuine long-lived legacy: code written years ago, modified by many hands, with documentation that drifted out of sync long before anyone noticed. The second is auto-translated legacy: Java emitted by mechanical translation from COBOL or a similar source, where the surface is opaque and the translation team has dispersed. The end state is the same. The methodology generalizes across both.

The Transformation pilot

A Transformation Pilot takes one scoped unit through the modernization process end-to-end. It is a focused engagement, not a system-wide commitment. The output is a new component running in production, validated against the legacy behavior it replaces, and a process the team can re-run on dependent components.

The G.Tx modernization process organises the work into four phases: Understand, Design, Build, and Run. The pilot runs the last three. Understand happens before the pilot and produces the artefacts the pilot consumes. We covered Understand in a separate case study showing how dead code analysis alone can reveal that nearly half of an auto-translated codebase carries no semantic weight.

What follows is a walk-through Design, Build, and Run, in the order a pilot runs them.

Design: shaping the transformation strategy

Design begins once Understand has produced the system picture. The phase shapes the strategy for the pilot: which unit to take through, the order in which dependent parts of the codebase will be transformed if the pilot expands, the contract the new component must satisfy, the integration tests that capture what the legacy version does today, and the workflows that will run in Build.

The process does not arrive fully assembled, but it does not start from a blank page either. Years of engagements have produced a library of validated workflows: extraction patterns, evaluation shapes, and prompt structures for common transformation steps. Engineers start there. They analyze the codebase, the available evidence, and the constraints of the engagement, then compose and shape the process for this particular challenge.

Choices made up front include what counts as a feature in this codebase, what the integration tests need to capture, where coding agents are required, where a simple LLM prompt is sufficient, where deterministic scripts or programs are the right tool, what each step's contract looks like, and how the evaluations score outputs.

A generic "transform any legacy" recipe does not exist. A reusable shape does, and each engagement instantiates that shape against its own evidence. The steps engineers design are what carries the work. The prompts, agents, scripts, and evaluations all run inside that shape.

The integration tests are authored in Design against the legacy feature behavior. Inputs are synthesized from the artefacts Understand produced: method signatures, example values, dependency information. The tests themselves are generated by a workflow step that runs against legacy behavior. Engineers review them, refine where coverage is thin or the inputs are unrealistic, and only then are the tests treated as the behavioral standard downstream work is held to.

This is test-driven development applied to modernization. The tests come first, the new code is written against them, and the same tests judge whether the result is equivalent. Nothing downstream begins until the integration tests are in place and approved.

Build: generating and refining the modernized component

Build begins with the artefacts in hand: the contract, the integration tests, and the workflows engineers composed in Design. The phase produces a new component implementing the feature from scratch in modern Java.

The component generator works from the artefacts that describe the feature and a contract that specifies what the component must do: its interface, scope, and constraints. It does not see the integration tests themselves. Hiding the validation surface from the generator prevents a common failure mode where the output is shaped to pass a specific set of tests rather than implementing the feature correctly.

Legacy source code stays out of the generator's input by default. It is provided only where the engagement requires a specific integration to be preserved, for example a SQL stored procedure or an external API the new component must call in the original form. Outside those cases, the new component is shaped by the description of the feature, not by the patterns of the translator or the developers who wrote the original Java.

Most of the steps that compose the component are simple LLM prompts. Coding agents are used where files must be read and written holistically across the input set. Some steps are not models at all. Structural transformations, packaging, file scaffolding, and similar work runs as ordinary scripts and programs where deterministic compute is the right tool. Each step has a narrow, named job and a reviewable output. That is how engineers keep the work decomposable.

The result is a single component that passes the integration tests authored against the legacy feature.

Build runs two kinds of evaluation against each step's output. LLM evaluation lives inside an LLM or agent step. The step's output is scored by a judge prompt against the contract the engineers set for that step: shape match, scope, constraints. This is how a step decides whether its own output is acceptable before passing it forward. Functional evaluation is a dedicated step on its own. It runs the integration tests against the new component and reports the result. This is the only evaluation that sees the tests; nothing upstream has access to them. Both produce evidence the team reads.

Engineers do not review every intermediate prompt output or every agent diff. The process produces too much volume for that, and approving everything at every stage would defeat the point of decomposing the work. What engineers approve is the transformation output: the new component plus the evidence that it satisfies the integration tests. When an evaluation fails or surfaces a weakness, they refine the step that produced it. The refinement loop is part of the design. Each pass that does not yield an approvable result becomes the input to the next iteration of the prompts, both within the engagement and across the library of workflows we maintain.

Run: phased rollout to production

Run takes the approved component through production deployment. Engineers stay involved through the rollout: integrating the new component into the surrounding system, retiring the legacy code it replaces, and handling the cutover the production environment requires.

The rollout follows an incremental strategy. The component goes into production behind whatever controls the team uses to limit blast radius: feature flags, canaries, gradual traffic shifting, observation periods. The pilot is complete when the new component is carrying production traffic and behaving as the integration tests promised.

From there the same process applies to dependent components in the same area of the codebase, reusing the contract patterns, integration tests, and workflows from the first pass. Each subsequent pilot builds on the one before, and the outcome accumulates into a working modern subsystem rather than a single proof point in isolation.

A brief note on scale

An agent driving the work end-to-end can take on one feature at a time. Beyond that, its context window and judgement run out. The process above scales differently: each step is named, evaluated, and approvable on its own, so the same shape applies whether the target is one feature or the whole system.

The platform: G.Tx

The work in this article runs on G.Tx, Grape Up's agentic platform for enterprise legacy modernization. G.Tx organizes modernization into the four phases shown at the top of this article: Understand, Design, Build, and Run. The Transformation Pilot is the engagement shape that bundles Design, Build, and Run into a single focused pass on a scoped unit.

Each phase is backed by reusable workflows, structured context, and engineering governance. Engineers compose the workflows from a library validated across previous engagements, then shape them for the specific challenge in front of them.

Understand is a valuable output on its own. Many engagements stop there because the picture it produces is already enough to ground a modernization decision. The Transformation Pilot is what happens when the engagement continues.

What a legacy modernization pilot leaves behind

A Transformation Pilot leaves three things behind. The first is a modernized component running in production, validated against integration tests authored against the legacy behaviour. The second is a process the team can re-run on dependent components in the same area of the codebase, with the contract patterns, integration tests, and workflows from the first pass ready for reuse. The third is the workflow library used during the pilot, now enriched with whatever was learned from this engagement.

Every piece of the result is traceable. Engineers can show what produced the component, what evaluated it, what test results it passed, and who signed off. That traceability is what makes the pilot reviewable as evidence, and what makes the methodology repeatable across the dependent components that follow.

View all

Stay updated with our newsletter

Subscribe for fresh insights and industry analysis.

FAQ

Questions engineers actually ask before modernizing legacy systems

Will AI hallucinate when rewriting our legacy code?

Hallucination in AI-generated code happens when a model has to guess at context it doesn't have. G.Tx addresses this at the workflow level, not the prompt level: before any code generation begins, the platform's Understand phase produces business and technical documentation, dead code analysis, and security findings directly from the existing codebase. This structured knowledge becomes the grounding context for every downstream generation task. With G.Tx hallucination is reduced before any code is touched, because the model has real evidence to work with instead of having to guess at the codebase.

Why does AI generate incorrect code when migrating legacy systems — and how do you prevent it?

AI generates incorrect code when it has to infer context it doesn't have. The fix isn't a better prompt — it's giving the model structured, accurate knowledge of the system before generation starts. G.Tx's Understand phase builds that foundation first: it generates business and technical documentation, performs dead code analysis, and runs security scanning directly from the codebase. Hallucination is reduced before any code is touched, because the model has real evidence to work with instead of having to guess at the codebase.

How do you generate tests for a legacy codebase that has no test coverage at all?

G.Tx's agentic AI engine automatically generates up to 80% of test code as part of the same workflow as code generation. Because the Understand phase maps dead code and real business logic before generation begins, the generated tests target actual behavior — not dead branches or boilerplate that would produce tests that pass but verify nothing. Smart validation suites then confirm the transformed system behaves exactly like the original.

How do you make AI code generation consistent and auditable across hundreds of thousands of lines of code?

Consistency at scale requires workflows, not prompts. G.Tx uses pre-defined, proven transformation templates for the most common code transformation scenarios — meaning high-volume migration patterns run through the same repeatable logic every time, not through a model making independent decisions on each file. Generated code, test code, and configurations are all stored with full traceability in the code repository. Every generated artefact is reviewed and refined by the engineering team before it moves forward.

Legacy system discovery always takes longer than planned — is there a faster way?

The slow part is usually manual: interviews, reading incomplete documentation, reverse-engineering behavior from code no one fully understands anymore. G.Tx automates the Understand phase — generating business and technical documentation, running dead code analysis and security scanning, and producing structured output that feeds directly into both the migration plan and AI coding tools. The Developer Portal makes everything available as an HTML portal for developers and in agent-exportable format. Discovery becomes a platform output, not a consulting workstream.

Does an AI code generation tool train on my company's source code?

This is a real risk with some tools — your proprietary codebase becomes training data. G.Tx Platform takes the opposite approach: your data never trains any AI models. The platform uses only pre-approved models and supports deployment on private infrastructure, with encrypted AI communications throughout.

How much dead code is typically hiding in a legacy system that's been auto-translated or maintained for decades?

More than most teams expect. In one of our engagements involving a 654,273-line Java codebase auto-translated from COBOL, analysis found that between 120,000 and 150,000 lines — roughly 45–55% of the business-logic layer — carried no semantic weight. Strict dead code accounted for 5–10%; translation overhead for another 35–45%. Standard static analysis tools caught a fraction of it; the rest required semantic, codebase-aware analysis. The practical consequence: quoting 275,000 lines to migrate anchors the budget. Quoting approximately 130,000 lines of real logic, plus 145,000 lines of removable overhead, reframes the engagement entirely.

Can you run an AI legacy modernization platform on private infrastructure to meet enterprise security requirements?

Yes. G.Tx supports deployment on both public and private infrastructure, with the freedom to choose from various LLMs depending on your governance requirements. The platform is designed for enterprises where running AI workloads on shared cloud infrastructure isn't acceptable — either due to data classification policies or regulatory constraints.

Can Claude Code or Codex actually be useful for legacy modernization, or do they lack the system-level context?

They lack context by default — but that context can be provided. G.Tx generates agentic specifications that include detailed task research, full documentation of the legacy and target systems, and a phased migration plan, all in a format queryable by Claude Code, Codex, and other agentic tools. This includes Agents.md files that give coding agents the grounding they need to handle complex transformation tasks. G.Tx handles the system-level analysis and orchestration; external coding agents work within that structure.

What's the difference between using GitHub Copilot or Cursor for legacy modernization versus a dedicated platform?

General-purpose AI coding assistants work at the file or function level and require a developer to supply context for every task. They have no model of the overall system, no mechanism to enforce consistency across thousands of files, and no built-in validation that the output preserves the original behavior. G.Tx is purpose-built for system-scale transformation. The platform combines automated workflows for high-volume migration patterns with an Agentic Development Environment designed for AI coding agents. The distinction matters: a prompt is a single instruction handed to a model. A workflow is a repeatable, governed sequence of operations with structured inputs, validated outputs, and traceable evidence. Prompts produce snippets. Workflows produce decisions that a CTO can defend in a steering committee.

G.Tx Methodology ensures confident modernization

Accelerate enterprise legacy transformation

Minimize transformation risks end-to-end

Transform with financial confidence

Integrate your AI stack

G.Tx Platform: The assembly line for legacy code transformation

Key features

G.Tx workflows – agentic AI engine

G.Tx Developer Portal

Agentic specifications

G.Tx transformation templates

Automated code generation

G.Tx scaled transformation

Working with G.Tx: Our proven modernization process

AI-powered deep analysis

Design transformation strategy

Generate & refine code

Phased rollout to production

Know before you commit. See if G.Tx is the right path for your legacy modernization - and what it could look like.‍

Start your free G.Tx assessment

Enterprise-grade data protection

Complete model control

Enterprise security & transparency

Zero data training risk

Human-verified quality

Learn how we help our customers tackle their challenges

From 25 weeks to 5: How AI transforms legacy airline software with confidence and speed

When speed satisfies strategy: Rewiring an industrial giant

From 25 services to 82% savings: The architecture that paid for itself

Regulate and innovate at the same time? Tell us what's holding you back.

Insights on AI-driven legacy modernization

AI-Driven Database Modernization: Good, bad, ugly

The system and the bet

The good: end to end, through pipelines

Evidence

The bad: the access model is the risk model

The ugly: the agent cuts corners to reach the goal

Lessons

Closing

Transformation pilot: Struts on JBoss to Spring on Kubernetes

The system under the pilot

Scope of the pilot

The workflow that ran the work

How the team worked inside the pilot

What the customer received

From the pilot forward

Legacy Java modernization: From Understand to Transform

The kind of code this is about

The Transformation pilot

Design: shaping the transformation strategy

Build: generating and refining the modernized component

Run: phased rollout to production

A brief note on scale

The platform: G.Tx

What a legacy modernization pilot leaves behind

Stay updated with our newsletter

Questions engineers actually ask before modernizing legacy systems

Know before you commit. See if G.Tx is the right path for your legacy modernization - and what it could look like.
‍

Start your free
G.Tx assessment