Written by Dr. Matthias Rüdiger | Head of AI/ML
Generative AI promises transformative potential for enterprises across industries. However, many GenAI initiatives struggle to move beyond proof of concept and deliver reliable value in production environments. Especially in regulated industries such as pharmaceuticals, organizations must ensure that AI systems produce consistent, traceable, and compliant results.
This article explores practical lessons learned from real-world GenAI implementations. It highlights why structured data, robust architecture, and carefully designed workflows are essential for successful AI projects and how organizations can bridge the gap between experimental prototypes and scalable, production-ready GenAI solutions.
Euphoria was widespread when generative AI models such as ChatGPT first demonstrated their impressive capabilities. Companies across all industries launched GenAI initiatives, and the pharmaceutical industry was no exception. However, many of these projects are failing to meet initial expectations. A typical pattern: an LLM is fed hundreds of documents and delivers impressive initial results, which, on closer inspection, prove to be inconsistent or even grossly incorrect and irreproducible. There is often a significant gap between proof of concept and productive use. The reasons are complex, but one pattern emerges: too much focus was placed on the language models themselves and too little on the supporting software infrastructure, perhaps driven by the hope that large language models would largely eliminate this typically more time-consuming part of the job.
This challenge is particularly evident in the regulated pharmaceutical industry. Here, the focus lies on traceable and GxP-compliant systems with reproducible results, not on demos that look good but are essentially unreliable. The key question is therefore: What distinguishes successful GenAI initiatives from unsuccessful ones?
Our experience from several GenAI implementations in the pharmaceutical sector shows that the success of a project is largely determined before an LLM is even used. While LLMs impress with their linguistic flexibility and now offer extensive problem-solving capabilities, they require a carefully prepared foundation to deliver accurate, reliable results.
The challenge begins with the data itself. In pharmaceutical companies, information is typically stored in unstructured, heterogeneous documents: PDFs, Word documents with variable formatting, Excel spreadsheets, often distributed across various content management systems. However, an LLM can only handle this heterogeneity to a limited extent. The task is therefore to create structured, semantically enriched information from the data pool that enables the compilation of suitable information packages for clearly defined tasks.
Case 1: Automatic processing of patent litigation cases
Pharmaceutical companies face the challenge of keeping track of patent litigation cases involving more than a thousand documents per case over long periods of time. New court documents, expert opinions, and correspondence must be continuously integrated into existing cases. This is a time-consuming, error-prone manual process.
A common misconception in GenAI projects is that LLMs can process such large volumes of documents as if by magic. However, this approach fails on several counts. First, large volumes of documents quickly exceed the context windows of even powerful models. And second, being confronted with irrelevant information causes the model to lose focus and become more prone to hallucinations.
The solution lies in intelligent pre-sorting and classification. Before the LLM comes into play, the documents are sorted by type, relevance, thematic context, and interrelationships. This allows the tasks of the LLM to be clearly defined and the context to be reduced to the relevant documents. In this case, the tasks included extracting procedural statuses, assessing submissions in the dispute, and identifying relevant evidence for the parties' positions.
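The pre-sorting step can be sketched in a few lines of Python. This is a minimal illustration, not the actual system: the document types, keyword lists, and function names are hypothetical, and the keyword heuristic stands in for what would in practice be a trained classifier.

```python
from dataclasses import dataclass

# Hypothetical document types found in a patent litigation case.
# Keyword lists are illustrative; a production system would use a trained classifier.
TYPE_KEYWORDS = {
    "court_filing": ["plaintiff", "defendant", "motion", "ruling"],
    "expert_opinion": ["expert", "opinion", "analysis"],
    "correspondence": ["dear", "sincerely", "regards"],
}

@dataclass
class Document:
    doc_id: str
    text: str
    doc_type: str = "unknown"

def classify(doc: Document) -> Document:
    """Assign a document type based on keyword hits (illustrative only)."""
    scores = {
        dtype: sum(kw in doc.text.lower() for kw in kws)
        for dtype, kws in TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    doc.doc_type = best if scores[best] > 0 else "unknown"
    return doc

def select_context(docs, task_types, limit=5):
    """Reduce the LLM context to documents relevant for one clearly defined task,
    instead of handing the model the entire case file."""
    relevant = [d for d in docs if d.doc_type in task_types]
    return relevant[:limit]
```

The key design point is the last function: the LLM never sees the whole pile, only the slice of documents whose type matches the task at hand, which keeps the context window small and the model focused.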
The result: unstructured piles of files were transformed into searchable, semantically indexed patent cases, providing patent attorneys with a quick overview. Processing by the LLM only works reliably if there is meaningful pre-structuring.
Case 2: Test documentation - From documents to executable tests
The creation of test documentation in regulated IT projects is one of the most time-consuming activities in the validation environment. Test plans and test cases must be derived from user requirements specifications (URS), functional specifications (FS), design specifications (DS), and technical specifications (TS). This task is not only time-consuming but also prone to errors: traceability links must be maintained, and ensuring test coverage and consistency across hundreds of test cases is virtually impossible.
The traditional approach is document-driven: validation engineers read specification documents, interpret requirements, and formulate test cases from them; usually in Word or Excel. The connection between the requirement and the test case remains implicit or is tracked in separate traceability matrices. Changes in specifications require manual review of all affected tests.
Here, too, it is tempting to simply use an LLM, possibly in the form of an agent, to check the documents and generate new test cases. However, specification documents have a complex structure, requirements are formulated with varying degrees of detail, and dependencies between requirements are often only implicit.
The solution builds on a pipeline that turns unstructured specification documents into structured data: different document types are normalized, requirements are extracted, and their relationships are semantically modeled. This foundation is key for using LLMs effectively.
Once the data is prepared, an LLM can automatically generate structured test cases directly from it. It analyzes functional descriptions and suggests test steps. The system can also identify requirements without sufficient test coverage, and the automatic linking between requirements and test cases creates seamless traceability.
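The shift from implicit links in Word or Excel to explicit, machine-readable traceability can be sketched as follows. This is a simplified data model under assumed names (`Requirement`, `TestCase`, the `URS-`/`TC-` ID scheme); the real pipeline is more elaborate, but the principle is the same: requirements and test cases become structured objects with explicit links.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Requirement:
    req_id: str        # e.g. "URS-001"
    description: str

@dataclass(frozen=True)
class TestCase:
    tc_id: str
    covers: tuple      # requirement IDs this test traces back to
    steps: tuple       # ordered, machine-readable test steps

def uncovered_requirements(requirements, test_cases):
    """Return requirements without any linked test case (a coverage gap)."""
    covered = {rid for tc in test_cases for rid in tc.covers}
    return [r for r in requirements if r.req_id not in covered]

def traceability_matrix(requirements, test_cases):
    """Explicit requirement -> test-case mapping, derived from the data
    itself rather than maintained in a separate document."""
    return {
        r.req_id: [tc.tc_id for tc in test_cases if r.req_id in tc.covers]
        for r in requirements
    }
```

Because coverage gaps and the traceability matrix are computed from the data, they stay consistent automatically when specifications change, instead of requiring manual review of all affected tests.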
Critical for use in regulated environments: The generated test cases undergo a structured review process by validation engineers (human-in-the-loop) before they are released for qualification. This combination of AI-supported generation and human expertise ensures both efficiency and GxP compliance.
The result: a paradigm shift from document-driven to data-driven testing. The structured, machine-readable test cases form the basis not only for GxP-compliant documentation, but also for future test automation. Test cases thus become structured objects that can be used for both manual qualification and automated test execution.
Case 3: Regulatory intelligence with AI support
Regulatory affairs departments must track regulatory changes worldwide, assess their relevance for different products and markets, and proactively prepare regulatory response measures. The information comes from various sources: government websites, trade publications, newsletters, databases. Formats, languages, and levels of detail vary considerably.
In theory, an LLM could process all these sources and assess their relevance. In practice, however, it lacks the product-specific and regional context necessary for reliable assessments.
The solution first requires comprehensive data harmonization and the creation of a knowledge graph. Various data are normalized and converted into standardized vocabularies, with ontologies forming the semantic backbone: active ingredients, indications, dosage forms, regulatory categories, and their relationships to each other must be mapped in a structured knowledge network.
The graph acts as a semantic compass for the LLM. It provides the necessary context to view regulatory changes not in isolation, but in relation to the company's portfolio. Instead of handing over the entire data set to the LLM, the knowledge graph dynamically compiles the relevant context for each query: the model receives exactly the information it needs to make accurate assessments. It is only through this structuring that the LLM can play to its actual strengths.
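The "semantic compass" idea can be illustrated with a toy knowledge graph. All names here (`KnowledgeGraph`, the example triples) are hypothetical, and a production system would use a proper graph database and ontology tooling; the sketch only shows the core mechanism, namely compiling a bounded, query-specific context instead of handing the LLM the entire data set.

```python
from collections import defaultdict, deque

class KnowledgeGraph:
    """Minimal triple store: (subject, relation, object) edges."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add(self, subj, rel, obj):
        self.edges[subj].append((rel, obj))

    def context_for(self, start, max_hops=2):
        """Collect facts within max_hops of a node via breadth-first search.
        The returned facts form the compact, query-specific context
        passed to the LLM alongside a regulatory change."""
        facts, seen = [], {start}
        queue = deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_hops:
                continue
            for rel, obj in self.edges.get(node, []):
                facts.append((node, rel, obj))
                if obj not in seen:
                    seen.add(obj)
                    queue.append((obj, depth + 1))
        return facts
```

The hop limit is the crucial knob: it bounds the context so the model receives only the portfolio knowledge relevant to the query, which is exactly what keeps its assessments focused.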
The result: the LLM's multi-source integration made it possible to synthesize information across sources, identify connections that were previously scattered, and create automatic summaries and evaluations. This precision is achievable only through structuring and semantic enrichment.
These three cases illustrate overarching success patterns for GenAI projects, particularly in regulated environments.
With the EU AI Act and expected updates to regulatory frameworks such as Annex 11/22, the question of validated, GxP-compliant AI systems will become even more pressing in the coming years. Companies that lay the right foundations now will have a significant head start.
Successful GenAI projects have one thing in common: the language model is an essential part of the solution, but only one part. Most of the work consists of problem structuring and understanding, data preparation, quality control, and integration. Competitive advantages from generative AI therefore come not from access to powerful models, but from excellence in data architecture, application engineering, and domain-specific orchestration.
This insight makes GenAI projects plannable, controllable, and successful. Those who understand that the LLM is only one component, not the solution itself, can draw up realistic project plans, manage risks, and create sustainable added value.
At INCONSULT, we combine longstanding experience in Computer Systems Validation and GxP compliance with deep AI/ML competence. This combination enables us to develop GenAI solutions that not only work technically, but are designed to meet regulatory requirements from the outset. Get in touch with us – we will show you how to leverage GenAI potential in your organization in a structured and compliant way.
For deeper insights into successful GenAI implementations in regulated environments, connect with Dr. Matthias Rüdiger, Head of AI/ML at INCONSULT, on LinkedIn.
For marketing, sales, or collaboration inquiries, please contact our team at marketing@inconsult-online.de.