Designing an Agentic AI System for a Regulated Industry
An introduction to the basic concepts
Dear reader,
I am currently learning how to design agentic AI systems. This article is a brainstorm, a draft. Once you have read it, I would like to hear your opinion: What did you understand? What is unclear? What would you do differently? Let’s discuss!
The use case was chosen arbitrarily.
The Use Case
Starting Point
An investment firm publishes various materials every year: factsheets, quarterly reports, monthly letters, sustainability reports, disclosures, webinars, and FAQs. Every publication goes through the same process: pulling data, writing text, compliance review, approval, and translation if necessary. The “voice” must remain authentic; any deviation from the firm’s style is immediately noticed by clients.
Today, portfolio managers spend days on every publication. During time-critical events, such as a sharp market decline or a market upheaval, transparent communication to investors must be sent out within hours—exactly the moment when the team is most constrained.
The Vision from the Portfolio Manager’s Perspective
“As a portfolio manager, I specify which publication I need: factsheet, quarterly report, monthly report, drawdown communication. I provide the time frame and the language. Then, I start the process.
Within minutes, I have the correct figures: returns, risk metrics, composition, ESG metrics. From the right systems, for the right period, consistently verified. I don’t have to query four data sources myself and put them into a table.
Based on these figures, a draft text is created. In our tone: sober, honest, explanatory. Restrained in good quarters, transparent and self-critical in difficult ones. Never promotional. Every number in the text has a traceable source.
If the report needs to be published in another language, it is translated—not literally, but with an understanding of regulatory nuances.
Before I see the report, it has already been checked against regulatory requirements: SFDR, BaFin, FNG. No performance promises, correct phrasing, complete risk disclosures. This review is rule-based and reproducible.
What I see then: the finished report. Beside it, the complete review with every rule applied and every result. And the source references for every number and every formulation decision. I no longer review my own text; I review the work of a system that I can judge from the outside.
How I know it turned out well: Every number is traceable to its data source. Every formulation is justified. The regulatory review is fully documented. The text sounds like us, not like an AI. And I finally have time for the tasks that require human judgment.
The time I would have needed for manual creation is drastically reduced. I can not only review the text but also how the text was assembled.”
Before I go into how an agentic AI system would have to be built to make this vision a reality, here are a few basic terms.
Basic Terms: Workflows and Patterns
The following is based on this source: https://www.anthropic.com/engineering/building-effective-agents
Anthropic distinguishes between Workflows and Agents. In a workflow, LLMs are orchestrated by predefined code paths; the sequence is fixed. In an agent, the LLM dynamically controls its own process, deciding for itself which step to take next.
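The difference can be made concrete with a minimal sketch. It is an illustration only, not code from the Anthropic article: the step functions `fetch_figures` and `write_draft` are hypothetical stand-ins for LLM-backed steps, and `choose_next` stands in for the model's decision about what to do next.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-ins for LLM-backed steps; a real system would call a model.
def fetch_figures(state: str) -> str:
    return f"figures({state})"

def write_draft(state: str) -> str:
    return f"draft({state})"

STEPS: Dict[str, Callable[[str], str]] = {
    "fetch_figures": fetch_figures,
    "write_draft": write_draft,
}

def run_workflow(request: str) -> str:
    """Workflow: the code fixes the order of the steps."""
    return write_draft(fetch_figures(request))

def run_agent(
    request: str,
    choose_next: Callable[[str], Optional[str]],
    max_steps: int = 5,
) -> str:
    """Agent: `choose_next` (the model) picks each step and decides when to stop."""
    state = request
    for _ in range(max_steps):
        step_name = choose_next(state)
        if step_name is None:  # the model decides it is done
            break
        state = STEPS[step_name](state)
    return state
```

In `run_workflow`, the sequence can be read directly from the code. In `run_agent`, the sequence only exists at runtime, because it depends on what `choose_next` returns.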
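Anthropic describes several patterns for agentic systems; six of them are introduced below. We use five of them in this use case. One, the Autonomous Agent, we deliberately do not. A further pattern from the same source, Orchestrator-Workers, is not used either and is addressed in the architectural decisions.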
Building Block: The Augmented LLM
Source: Anthropic, Building Effective Agents, Dec 2024
An LLM with access to external tools: data queries, API calls, memory. The fundamental building block of every agentic system.
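A minimal sketch of this building block, assuming a single tool round-trip. Everything here is hypothetical: `fake_llm` stands in for a real model call, `get_return` for a real data query, and the `CALL <tool> <argument>` convention is invented for illustration (real model APIs have their own tool-calling formats).

```python
from typing import Callable, Dict

# Hypothetical tool; a real one would query a portfolio database or API.
def get_return(fund: str) -> str:
    returns = {"Fund A": "4.2%"}  # made-up sample data
    return returns.get(fund, "unknown")

TOOLS: Dict[str, Callable[[str], str]] = {"get_return": get_return}

def fake_llm(prompt: str) -> str:
    """Stand-in for a model call: asks for a tool first, then answers."""
    if "TOOL_RESULT:" not in prompt:
        return "CALL get_return Fund A"
    result = prompt.split("TOOL_RESULT:", 1)[1].strip()
    return f"Fund A returned {result} in the period."

def augmented_llm(question: str, llm: Callable[[str], str] = fake_llm) -> str:
    """The model requests a tool, the code runs it, the model then answers."""
    reply = llm(question)
    if reply.startswith("CALL "):
        _, tool_name, argument = reply.split(" ", 2)
        tool_result = TOOLS[tool_name](argument)
        reply = llm(f"{question}\nTOOL_RESULT: {tool_result}")
    return reply
```

The point of the sketch is the division of labor: the model decides which tool it needs, but the tool itself is ordinary code that runs outside the model.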
Prompt Chaining
Source: Anthropic, Building Effective Agents, Dec 2024
A fixed chain of LLM calls. Each step receives the output of the previous one. Between steps, a gate can check whether the result is allowed to be processed further. Ideal for tasks that can be cleanly broken down into fixed sub-steps.
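A minimal sketch of a chain with gates. The two steps and the gate are hypothetical placeholders (in a real system each step would be an LLM call), and the figures are made-up sample values.

```python
from typing import Callable, List, Optional, Tuple

Step = Callable[[str], str]
Gate = Optional[Callable[[str], bool]]

# Hypothetical steps; each would be an LLM call in a real system.
def extract_figures(request: str) -> str:
    return "return=4.2%; volatility=9.1%"  # made-up sample figures

def write_paragraph(figures: str) -> str:
    return f"In the period, the fund reported: {figures}."

def has_figures(text: str) -> bool:
    """Gate: only continue if the previous step produced figures."""
    return "return=" in text

def run_chain(request: str, steps: List[Tuple[Step, Gate]]) -> str:
    """Run the steps in fixed order; stop as soon as a gate rejects an output."""
    output = request
    for step, gate in steps:
        output = step(output)
        if gate is not None and not gate(output):
            raise ValueError(f"gate failed after step {step.__name__}")
    return output

report = run_chain(
    "Q3 factsheet",
    [(extract_figures, has_figures), (write_paragraph, None)],
)
```

The gate is plain code, so a failed check stops the chain in a predictable place instead of letting a faulty intermediate result flow into the next step.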
Routing
Source: Anthropic, Building Effective Agents, Dec 2024
A router decides which path is taken. Different inputs lead to different processing chains.
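A minimal sketch of routing, assuming the router is a plain lookup on the publication type. The handler functions are hypothetical placeholders for complete processing chains; in other settings the routing decision could also be made by an LLM classifier.

```python
from typing import Callable, Dict

# Hypothetical processing chains, one per publication type.
def factsheet_chain(request: str) -> str:
    return f"factsheet chain handles: {request}"

def quarterly_report_chain(request: str) -> str:
    return f"quarterly report chain handles: {request}"

def drawdown_chain(request: str) -> str:
    return f"drawdown communication chain handles: {request}"

ROUTES: Dict[str, Callable[[str], str]] = {
    "factsheet": factsheet_chain,
    "quarterly_report": quarterly_report_chain,
    "drawdown_communication": drawdown_chain,
}

def route(publication_type: str, request: str) -> str:
    """Pick the processing chain for the given publication type."""
    handler = ROUTES.get(publication_type)
    if handler is None:
        # Unknown inputs fail loudly instead of falling into a default path.
        raise ValueError(f"no route for publication type: {publication_type!r}")
    return handler(request)
```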
Parallelization
Source: Anthropic, Building Effective Agents, Dec 2024
Several independent calls run simultaneously. The results are merged at the end. Useful when the sub-tasks do not require a specific sequence.
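A minimal sketch of parallel, independent calls using Python's standard `concurrent.futures` module. The four fetch functions are hypothetical stand-ins for real data sources, and their return values are made-up sample data.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical data sources; real ones would call databases or APIs.
def fetch_performance(period: str) -> dict:
    return {"period": period, "return": "4.2%"}

def fetch_risk(period: str) -> dict:
    return {"period": period, "volatility": "9.1%"}

def fetch_composition(period: str) -> dict:
    return {"period": period, "top_holding": "Example Corp"}

def fetch_esg(period: str) -> dict:
    return {"period": period, "esg_score": "A"}

SOURCES: Dict[str, Callable[[str], dict]] = {
    "performance": fetch_performance,
    "risk": fetch_risk,
    "composition": fetch_composition,
    "esg": fetch_esg,
}

def fetch_all(period: str) -> Dict[str, dict]:
    """Query all sources at the same time and merge the results by source name."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {name: pool.submit(fn, period) for name, fn in SOURCES.items()}
        # result() waits for each call and re-raises any exception from it.
        return {name: future.result() for name, future in futures.items()}
```

Because the calls do not depend on each other, the total waiting time is roughly that of the slowest source rather than the sum of all four.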
Evaluator-Optimizer
Source: Anthropic, Building Effective Agents, Dec 2024
A generator creates a solution. An evaluator checks it. If rejected, the solution goes back to the generator with feedback. The loop repeats until the solution is accepted or a maximum number of attempts is reached.
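A minimal sketch of this loop. `stub_generate` and `stub_evaluate` are hypothetical placeholders for the generator and the evaluator; the evaluator returns feedback on rejection and `None` on acceptance, which is a convention chosen for this illustration.

```python
from typing import Callable, Optional, Tuple

def stub_generate(task: str, feedback: Optional[str]) -> str:
    """Hypothetical generator: the first draft is flawed, the revision is not."""
    if feedback is None:
        return f"{task}: the fund will surely outperform."
    return f"{task}: the fund returned 4.2% in the period."

def stub_evaluate(draft: str) -> Optional[str]:
    """Hypothetical evaluator: returns feedback, or None if the draft is accepted."""
    if "surely" in draft:
        return "Remove the performance promise ('surely')."
    return None

def evaluator_optimizer(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Loop until the evaluator accepts or the attempt limit is reached."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        draft = generate(task, feedback)
        feedback = evaluate(draft)
        if feedback is None:
            return draft, attempt
    raise RuntimeError(f"no accepted draft after {max_attempts} attempts")

draft, attempts = evaluator_optimizer("Q3 report", stub_generate, stub_evaluate)
```

The attempt limit matters: without it, a generator that never satisfies the evaluator would loop forever.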
Autonomous Agent (deliberately not used in this use case)
Source: Anthropic, Building Effective Agents, Dec 2024
The LLM controls its own process. It decides which action to take, interprets feedback, and decides whether to continue or stop. Humans intervene only when necessary. Why this pattern is not used in this use case is explained in the section “Why no autonomous agents?”.
The Architecture Draft
Why a Workflow and not an Agent?
This use case is not a creative process, as would be the case when writing a fictional text. The reports must meet specific regulatory requirements.
The LLM must not decide how the process runs in this use case. It must not suddenly insert an additional analysis, skip a step, or choose a different structure. The process must be predictable, auditable, and reproducible. This use case is a workflow.
The Graph
The use case combines four patterns: Prompt Chaining as the main pattern, Routing in the Guardian, Parallelization in the DataAgent, and Evaluator-Optimizer as the loop between Guardian and Writer.
Use Case Architecture
Architectural Decisions in the Draft
Prompt Chaining as the main pattern. The sequence of the process is fixed: first get the numbers, then formulate the text, then translate if necessary, then check compliance, then approve. The compliance check (Step 4) logically cannot come before the text exists (Step 2). Each step is simpler than the overall task.
Deterministic Guardian instead of LLM-Evaluator. An LLM as an evaluator would not be reproducible: the same input could produce different results in two calls. That would be hard to defend in a regulatory audit (e.g., by BaFin). The Guardian should always reach the same verdict for identical input. Therefore, the Guardian is not an LLM; it is rule-based routing with a fixed rule set per publication type.
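A minimal sketch of what such a deterministic Guardian could look like. The two rules, their IDs, and the rule sets per publication type are invented for illustration; real rules would have to come from compliance and be far more complete.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    check: Callable[[str], bool]  # True means the text passes the rule

@dataclass(frozen=True)
class Finding:
    rule_id: str
    passed: bool

# Hypothetical rules; not actual SFDR, BaFin, or FNG requirements.
NO_PROMISE = Rule(
    "R1",
    "No performance promises",
    lambda text: re.search(r"\b(guarantee[sd]?|will outperform)\b", text, re.I) is None,
)
RISK_NOTE = Rule(
    "R2",
    "Risk disclosure present",
    lambda text: "past performance" in text.lower(),
)

# Routing: a fixed rule set per publication type.
RULE_SETS: Dict[str, List[Rule]] = {
    "factsheet": [NO_PROMISE, RISK_NOTE],
    "monthly_letter": [NO_PROMISE],
}

def guardian(publication_type: str, text: str) -> List[Finding]:
    """Apply every rule for this publication type and record each result."""
    rules = RULE_SETS[publication_type]
    return [Finding(rule.rule_id, rule.check(text)) for rule in rules]
```

Because the checks are pure functions of the text, calling `guardian` twice with the same input yields the same list of findings, and the list itself doubles as the documented review with every rule applied and every result.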
Parallelization in the DataAgent. The DataAgent calls four independent data sources in parallel: Performance, ESG, Composition, and Risk. This is sectioning within a node. The graph remains linear.
No Orchestrator-Worker, no Autonomous Agent. Orchestrator-Worker is used when sub-tasks are unpredictable. Here, all sub-tasks are known. An Autonomous Agent would be counterproductive: freedom of decision is exactly what you want to avoid in a regulated environment.
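Taken together, the decisions above could be wired up roughly as follows. This is a sketch under assumptions, not a finished design: all four stub functions are hypothetical, the figures are made up, and the escalation to a human after `max_revisions` is one possible way to end the loop.

```python
from typing import Callable, Dict, List, Optional

def run_publication_workflow(
    request: str,
    fetch_data: Callable[[str], Dict[str, str]],
    write: Callable[[Dict[str, str], List[str]], str],
    translate: Callable[[str, str], str],
    guardian: Callable[[str], List[str]],
    target_language: Optional[str] = None,
    max_revisions: int = 3,
) -> dict:
    """Fixed sequence: data, text, optional translation, compliance check."""
    if max_revisions < 1:
        raise ValueError("max_revisions must be at least 1")
    figures = fetch_data(request)                # step 1: get the numbers
    findings: List[str] = []
    for _ in range(max_revisions):
        draft = write(figures, findings)         # step 2: formulate the text
        if target_language:
            draft = translate(draft, target_language)  # step 3: translate
        findings = guardian(draft)               # step 4: check compliance
        if not findings:
            # Step 5, the approval itself, stays with the portfolio manager.
            return {"status": "ready_for_approval", "draft": draft, "figures": figures}
    return {"status": "escalated_to_human", "draft": draft, "findings": findings}

# Hypothetical stubs so the sketch can be run end to end.
def stub_fetch(request: str) -> Dict[str, str]:
    return {"return": "4.2%"}  # made-up sample figure

def stub_write(figures: Dict[str, str], findings: List[str]) -> str:
    if findings:  # revise using the Guardian's feedback
        return f"The fund returned {figures['return']}."
    return f"The fund returned {figures['return']} and will outperform."

def stub_translate(text: str, language: str) -> str:
    return f"[{language}] {text}"

def stub_guardian(text: str) -> List[str]:
    return ["R1: performance promise"] if "will outperform" in text else []

result = run_publication_workflow(
    "Q3 factsheet", stub_fetch, stub_write, stub_translate, stub_guardian
)
```

The order of the steps lives in the code, not in a model. The only places where an LLM would act are inside `write` and `translate`, which matches the split between formulation and decision.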
Why no autonomous agents?
The current discussion often revolves around autonomous AI agents—agents that control their own processes, make their own decisions, and select their own tools. This sounds like progress. In regulated domains, however, it is problematic.
Behavioral drift. An autonomous agent can change its behavior over time without its code changing: a model update with different training data and different weights is enough. The paper AI Agents Under EU Law (Nannini et al., 2026) states: High-risk agent systems with untraceable behavioral drift currently cannot meet the essential requirements of the AI Act.
Contamination between agents. When agents communicate with each other in natural language, a coupling channel is created that is not visible in any single-agent test. The coupling is measurable but invisible to classical evaluation. In a compliance context, invisible coupling is unacceptable.
No provable reproducibility. Same input, same autonomous agent, two different days: potentially two different results. For a BaFin review, for an FNG audit, for an SFDR disclosure, this is unacceptable.
Workflow instead of agent is not a step backward. It is an architectural decision that follows from the domain. The question is not: “How much autonomy can I give my agent?” The question is: “How much autonomy does the domain allow?” In compliance reporting, the answer is: none for decisions. Formulation, yes. Decision, no.
Why an AI system is still worth it for this use case
The previous sections describe limitations. One might ask: Why use AI at all?
The data work is mechanical but laborious. Querying four data sources, merging numbers, checking consistency, and pouring them into a template. A human does this today. It takes hours. A DataAgent does it in a fraction of the time.
The text work is creative but within narrow limits. A quarterly report is not a novel. The structure is fixed. The tone is fixed. What varies are the numbers and the interpretation. The portfolio manager no longer corrects the entire text; they correct the parts that require judgment.
The review becomes more thorough. Today, the portfolio manager reviews their own text. With this workflow, the Guardian reviews it first. Deterministically. Completely. The portfolio manager then reviews the work of a system they can judge from the outside.
Response time during crises drops drastically. Sharp price decline on a Monday morning? The portfolio manager has a reviewed draft in front of them within a short time. In a crisis, that is the difference between reacting and being driven by events.
The creation time is drastically reduced because the AI takes over the mechanical parts and the portfolio manager focuses on what only they can do: judge, decide, and take responsibility.
Next Steps
This article has not yet fully addressed or solved governance.
It also leaves open which LLMs to use for which step and where they are hosted.
Furthermore, it has not yet been explained how to pull data, build context, establish memory, and codify learning loops.
All of this will be covered in follow-up articles.
Conclusion
In any case, I have learned a lot while working on this article. How about you?
For me, the whole debate about agentic AI systems has been completely demystified.
I have realized that building AI systems is a combination of understanding the user, the goal, the desired outcome, the data, the context, the business workflow, and the way you build workflows and agents with AI.
Building agentic AI systems, by the way, has a lot in common with software engineering and software architecture. An LLM is part of a pattern, and a pattern strongly resembles a function in software code.
In my opinion, the disciplines are moving closer together. The boundaries between technology and business can hardly be drawn anymore; it is ONE construct.
This understanding is important to be able to form an idea of how the organization and the way of working will change, because only when I understand what I am building and how, can I decide who I need for it and why.
I look forward to your comments!









Very well analyzed and clearly explained! 👏
Good requirements management has always led to better results:
Where exactly do we start: in which environment, at which point in the process chain, and to what extent?
Who are the stakeholders, and what are their clearly articulated expectations and objectives?
Who are the teams responsible for implementing this task?
Clarification of what the scope is for each team, with clear communication rules enforced by a project lead.
Only once this has been clarified and communicated to everyone simultaneously—only then do we move on to the teams that must implement it. Then the new architecture is set up, and the necessary packages are put together in this order.
With agent-based processes, it has become even more important not to neglect any of these points.
And this list can be expanded as needed, depending on the size and scope of the overall package.
Hi Bianca, I agree with your thought process on how you decide this is a legitimate AI use case. I especially like the part where you explain why autonomous agency doesn’t work for this use case.
This example walkthrough is one of the most honest attempts at design thinking: structured, restrained, and clear in just the right way, taking the readers along. (One of my professors used to teach with examples back in the day; this reminded me of that!)