{"id":88901,"date":"2026-06-09T14:59:54","date_gmt":"2026-06-09T12:59:54","guid":{"rendered":"https:\/\/webraketen.space\/?post_type=glossar&#038;p=88901"},"modified":"2026-06-09T17:29:00","modified_gmt":"2026-06-09T15:29:00","slug":"ki-agent","status":"publish","type":"glossar","link":"https:\/\/webraketen.space\/en\/glossar\/ki-agent\/","title":{"rendered":"AI Agents \u2013 Definition, Function, and Practical Applications"},"content":{"rendered":"<p class=\"wp-block-paragraph\">An AI agent is a software system that independently solves a task over multiple steps: it plans the approach, calls on tools such as databases, software, or APIs, evaluates intermediate results, and then autonomously decides on the next step. Unlike an AI chatbot, which returns one response to one input, an agent keeps acting independently until it has reached its goal.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">AI agents handle tasks for which the solution path is not predetermined. Examples include competitive research, anomaly detection in machine data, and cross-system queries.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The current understanding of the term took hold around 2023, when Large Language Models (LLMs) became the controlling component in such systems.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The theoretical foundation dates back to <a href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_Intelligence:_A_Modern_Approach\" target=\"_blank\" data-type=\"link\" data-id=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_Intelligence:_A_Modern_Approach\" rel=\"noreferrer noopener\">Russell &amp; Norvig (1995)<\/a>: according to their definition, an agent is an entity that perceives its environment and acts upon it. 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In business processes, an agent is used where a task requires multiple steps with decisions in between and the path is not deterministic.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A common misconception concerns the term <strong>Agentic AI<\/strong>. It is not a synonym for AI agent; it denotes the property of autonomous, goal-oriented action. The AI agent is the artifact, and Agentic AI is the property it realizes. Likewise, the LLM is not the agent but its cognitive core, and this core can be swapped for another model.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The term is currently overused, primarily in product and marketing communications. Gartner calls the phenomenon \u201cagent-washing\u201d: the renaming of existing chatbots, RPA bots, and assistants without actual agentic capabilities.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How to recognize a real AI agent<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The most robust dividing line against agent-washing is a four-step criterion:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>A true agent plans the next step,<\/li>\n\n\n\n<li>executes it via a tool,<\/li>\n\n\n\n<li>observes the result,<\/li>\n\n\n\n<li>and adjusts its approach based on that observation.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">If one of these steps is missing, the system is usually a chatbot with API integration or a pre-wired workflow (automation).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The vendor landscape shows the scale of the problem: Gartner identified around 130 vendors positioning themselves as agentic AI solutions. 
However, Gartner considers most of these offerings to be relabeled assistants or chatbots without multi-step autonomy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Anyone who wants to verify that a system is a real AI agent can look for three visible indicators:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Dynamic tool selection:<\/strong> The system decides at runtime which tool or data source it calls. There is no pre-configured path. If every step is visible in a flow editor, it is a workflow.<\/li>\n\n\n\n<li><strong>Intermediate observation influences the next step:<\/strong> Given the same input but different intermediate results, the system performs different subsequent steps. If this behavior cannot be reproduced, the system is not an agent.<\/li>\n\n\n\n<li><strong>Persistent memory across sessions:<\/strong> The agent can access previous interactions or intermediate states. A stateless single-turn bot cannot do that.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How do I build an AI agent?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">An agent is not built from a single tool but from multiple layers working together:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>the language model as the cognitive core,<\/li>\n\n\n\n<li>a framework that controls the loop,<\/li>\n\n\n\n<li>an integration with the tools the agent works with,<\/li>\n\n\n\n<li>and a layer that records every step.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Distinguishing these layers is a faster way to navigate the tool landscape than working through vendor lists.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The first decision concerns only one of these layers, namely the framework (layer 2): do you write the agent yourself in code, or assemble it on a visual interface (Low\/No Code)? 
The two main paths diverge at this fork.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Coding it yourself<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Those who follow the code path work with an orchestration framework. Its task is to control the agent loop: it coordinates planning, tool calls, state, and resumption after an interruption.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Which framework is right depends on the project's focus.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.langchain.com\/langgraph\" data-type=\"link\" data-id=\"https:\/\/www.langchain.com\/langgraph\" target=\"_blank\" rel=\"noreferrer noopener\">LangGraph<\/a> is considered the most common choice in 2026 for production-grade, multi-step agents because it models the process as a graph with state persistence and rollback, so it remains reliable even if a step aborts mid-execution.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If the goal is rather to quickly set up a role-based prototype with multiple agents, <a href=\"https:\/\/crewai.com\/\" data-type=\"link\" data-id=\"https:\/\/crewai.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">CrewAI<\/a> is the more direct route.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If the agent mainly searches large knowledge bases, <a href=\"https:\/\/www.llamaindex.ai\/\" data-type=\"link\" data-id=\"https:\/\/www.llamaindex.ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">LlamaIndex<\/a> plays to its strength, because retrieval is at its center.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And teams that work in Python with strict typing can fall back on <a href=\"https:\/\/pydantic.dev\/\" data-type=\"link\" data-id=\"https:\/\/pydantic.dev\/\" target=\"_blank\" rel=\"noreferrer noopener\">Pydantic AI<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In addition, the major model providers offer their own SDKs. 
Examples are the <a href=\"https:\/\/developers.openai.com\/api\/docs\/guides\/agents\" data-type=\"link\" data-id=\"https:\/\/developers.openai.com\/api\/docs\/guides\/agents\" target=\"_blank\" rel=\"noreferrer noopener\">OpenAI Agents SDK<\/a>, the <a href=\"https:\/\/code.claude.com\/docs\/en\/agent-sdk\/overview\" data-type=\"link\" data-id=\"https:\/\/code.claude.com\/docs\/en\/agent-sdk\/overview\" target=\"_blank\" rel=\"noreferrer noopener\">Claude Agent SDK from Anthropic<\/a>, and <a href=\"https:\/\/docs.cloud.google.com\/gemini-enterprise-agent-platform\/build\/adk?hl=de\" target=\"_blank\" rel=\"noreferrer noopener\">Google ADK<\/a>. These are particularly worthwhile if you are already at home in one of these ecosystems.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Code solutions offer maximum flexibility but currently also require a high level of technical skill.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">No-\/Low-Code<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">With no-code or low-code solutions, AI agents can be assembled on web interfaces with little programming effort. Which toolkit is suitable often depends on the task itself.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If the agent is just one step in a larger process automation (for example, reading an email, extracting data from it, updating a database, and finally sending a notification), then <a href=\"https:\/\/n8n.io\/\" data-type=\"link\" data-id=\"https:\/\/n8n.io\/\" target=\"_blank\" rel=\"noopener\">n8n<\/a> is the obvious choice as a workflow automation platform.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">n8n comes with loops and error handling built in, and the agent integrates seamlessly into the rest of the workflow. 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/flowiseai.com\/\" data-type=\"link\" data-id=\"https:\/\/flowiseai.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Flowise<\/a> and <a href=\"https:\/\/www.langflow.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Langflow<\/a>, by contrast, are open workbenches built specifically for LLM pipelines.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Langflow directly integrates the aforementioned LangGraph and is therefore also suitable for stateful loops.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/dify.ai\/\" data-type=\"link\" data-id=\"https:\/\/dify.ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">Dify<\/a> goes a step further and bundles the builder, knowledge base, API layer, and logging into a single product.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Anyone who mainly has to connect many different services is well served by <a href=\"https:\/\/www.make.com\/en\" data-type=\"link\" data-id=\"https:\/\/www.make.com\/en\" target=\"_blank\" rel=\"noreferrer noopener\">Make<\/a> or <a href=\"https:\/\/zapier.com\/\" data-type=\"link\" data-id=\"https:\/\/zapier.com\/\" target=\"_blank\" rel=\"noopener\">Zapier<\/a>, because connector coverage is widest there.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The major providers have by now also entered this area, for example OpenAI with the <a href=\"https:\/\/developers.openai.com\/api\/docs\/guides\/agent-builder\" data-type=\"link\" data-id=\"https:\/\/developers.openai.com\/api\/docs\/guides\/agent-builder\" target=\"_blank\" rel=\"noopener\">OpenAI Agent Builder<\/a>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Two layers on top<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Two further layers cut across this decision: they are needed regardless of whether you have opted for code or no-code.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The first is the connection between the agent and the tool. 
For an agent to use a tool, it needs a defined interface to it, and the Model Context Protocol (MCP) provides a standardized one.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Anthropic unveiled it in 2024 and handed it over to the Linux Foundation at the end of 2025; by now, all major vendors support it, and leading frameworks have made it the standard for tool calls.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">MCP handles the connection from agent to tool, while the related A2A protocol (Google, 2025) takes care of agent-to-agent connections in multi-agent systems. In practice, this primarily means one thing: before writing your own integration, check whether a ready-made MCP server already exists for the desired tool.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The second layer, which is relevant in both cases, is observability: observing what the agent is actually doing. It is necessary because agents fail silently. A hallucinated response is returned without an error message, and traditional monitoring of latency and error rates will not notice it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Specialized tools close this gap by recording every single step: every LLM call, every tool call, every planning decision.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Which one fits again depends on the environment: <a href=\"https:\/\/www.langchain.com\/langsmith\/observability\" data-type=\"link\" data-id=\"https:\/\/www.langchain.com\/langsmith\/observability\" target=\"_blank\" rel=\"noreferrer noopener\">LangSmith<\/a> is tightly integrated with LangChain and LangGraph, <a href=\"https:\/\/langfuse.com\/\" target=\"_blank\" rel=\"noopener\">Langfuse<\/a> is the open-source and framework-independent variant, and <a href=\"https:\/\/arize.com\/phoenix\/\" target=\"_blank\" rel=\"noopener\">Arize Phoenix<\/a> offers the greatest depth in evaluation. 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This layer is not an afterthought but the practical answer to two problems that appear later in this entry: the failure modes, for example when <a href=\"https:\/\/webraketen.space\/en\/glossar\/token-ki-llom\/\" data-internallinksmanager029f6b8e52c=\"17\" title=\"Token (AI\/LLM)\">token<\/a> consumption is not measured during the test phase, and the logging obligation of the EU AI Act, under which every tool call and every planning step must be traceable.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How an AI agent works<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The architecture is fundamentally different from that of a chatbot, which takes a prompt and returns a response.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">An agent runs in a loop in which the LLM is not the executor but the controller. It generates structured instructions that other components interpret: database queries, API calls, code execution. This separation has a cost consequence: agents must be budgeted differently from chat applications, because the LLM is called multiple times per task, often dozens of times.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">The Agent Loop<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">A typical agent goes through five stages, each with a characteristic failure mode:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Perception:<\/strong> Inputs and context are read. Failure mode: incomplete or noisy context that distorts subsequent decisions.<\/li>\n\n\n\n<li><strong>Planning:<\/strong> The LLM breaks down the goal into sub-steps. Failure mode: hallucinated steps or infinite loops without an exit criterion.<\/li>\n\n\n\n<li><strong>Execution:<\/strong> Tools are called (API, database, code). Failure mode: faulty tool signatures or unhandled API errors.<\/li>\n\n\n\n<li><strong>Reflection:<\/strong> The result is checked against the goal. 
Failure mode: overcorrection, where the agent discards working intermediate results.<\/li>\n\n\n\n<li><strong>Memory:<\/strong> Intermediate results are saved. Failure mode: context rot, where the context becomes so long that relevant information gets lost.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image alignwide size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"640\" src=\"https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-loop-2-1024x640.jpg\" alt=\"A circular flowchart titled The Agent Loop shows five steps: perception, planning, execution, reflection, and memory. Each step is numbered and described and connected by arrows, with a note in the center that the LLM controls the flow.\" class=\"wp-image-88916\" srcset=\"https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-loop-2-1024x640.jpg 1024w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-loop-2-300x188.jpg 300w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-loop-2-768x480.jpg 768w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-loop-2-1536x961.jpg 1536w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-loop-2-18x12.jpg 18w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-loop-2.jpg 1586w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">The ReAct Pattern<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The current de facto standard for interleaving reasoning and action is <a href=\"https:\/\/research.google\/blog\/react-synergizing-reasoning-and-acting-in-language-models\/\" target=\"_blank\" data-type=\"link\" data-id=\"https:\/\/research.google\/blog\/react-synergizing-reasoning-and-acting-in-language-models\/\" rel=\"noreferrer noopener\">ReAct (Reasoning + Acting)<\/a>. Shunyu Yao and co-authors presented the pattern in 2022 at Princeton and Google Research. 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The agent reasons about one step, calls a tool, observes the result, and reasons again. It repeats this cycle until it reaches a final state.<\/p>\n\n\n\n<figure class=\"wp-block-image alignwide size-large\"><img decoding=\"async\" width=\"1024\" height=\"640\" src=\"https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-react-1024x640.jpg\" alt=\"A diagram compares the AI loops &quot;Reason Only&quot;, &quot;Act Only&quot;, and &quot;ReAct&quot; and shows how language models (LM) interact with environments (Env) through actions, observations, and reasoning traces. Arrows illustrate the data flow.\" class=\"wp-image-88917\" srcset=\"https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-react-1024x640.jpg 1024w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-react-300x188.jpg 300w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-react-768x480.jpg 768w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-react-1536x961.jpg 1536w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-react-18x12.jpg 18w, https:\/\/webraketen.space\/wp-content\/uploads\/2026\/06\/ki-agent-react.jpg 1586w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">In the ALFWorld and WebShop benchmarks, ReAct outperformed previous imitation and reinforcement learning methods by 34 and 10 percentage points in absolute success rate, respectively, with only one to two in-context examples.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">ReAct is worthwhile when the environment is uncertain or noisy. For tasks without tools, ReAct only adds overhead; in that case, a pure Chain-of-Thought prompt is simpler.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For deterministic workflows, where each step is known in advance, a plan-and-execute approach is the better fit. 
It plans once and then executes, rather than reasoning anew between actions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Agent types and when to use each<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The academic taxonomy according to Russell &amp; Norvig distinguishes four basic types, sorted by increasing generality. It works well as a selection grid:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Simple Reflex Agents<\/strong> react exclusively to the current input. They are particularly suitable when the task is stateless and the next step can be derived directly from the input.<\/li>\n\n\n\n<li><strong>Model-Based Reflex Agents<\/strong> maintain an internal state of the environment. They become necessary once previous observations influence the next decision.<\/li>\n\n\n\n<li><strong>Goal-Based Agents<\/strong> evaluate actions by whether they bring the agent closer to a defined goal. This is the standard type for most current LLM-based agents in business processes.<\/li>\n\n\n\n<li><strong>Utility-Based Agents<\/strong> weigh multiple goals of differing importance. They are especially relevant for trade-offs, such as cost versus speed versus accuracy.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Each of these types can be extended into a learning agent that learns from results and adapts its behavior.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Related to this is the question of single-agent versus multi-agent systems (MAS).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A single agent is sufficient as long as the task can be solved as one coherent sequence. Only when it breaks down into clearly separable subtasks with their own specialized domains (such as research, analysis, and reporting) does an orchestrator agent that coordinates specialized worker agents become useful. A MAS consumes many times more tokens (see below). 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Rule of thumb:<\/strong> only split when the subtasks require different domains, tool sets, or context windows, not because a single agent occasionally makes mistakes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">AI Agent or Chatbot, RPA, Workflow? What to use when<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The choice of the right tool depends on the task structure, not on the available tool stack.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>RPA \/ Classic Workflow:<\/strong> The path is deterministic and known in advance. Each step can be defined as a rule or a click path. Example: transferring invoice data from PDF to SAP.<\/li>\n\n\n\n<li><strong>Chatbot:<\/strong> One question leads to one answer. No multi-step execution, no external actions. Example: answering FAQs about the product portal.<\/li>\n\n\n\n<li><strong>AI Agent:<\/strong> The task requires several steps whose order depends on intermediate results. Example: identifying ten competitors, analyzing their pricing pages, and creating a comparison report, repeated weekly with changing competitors.<\/li>\n\n\n\n<li><strong>Assistant \/ Copilot:<\/strong> Reactive and assistive, without autonomous multi-step execution. The human remains in the loop and provides approval after each step. 
Gartner explicitly ranks assistants below agents.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">When an AI agent is worthwhile<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The deployment threshold can be determined by three criteria:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>The task has high path diversity (deterministic tasks belong in RPA),<\/li>\n\n\n\n<li>it is recurring (one-off tasks do not justify development costs),<\/li>\n\n\n\n<li>and it has a clearly verifiable outcome (otherwise it is not possible to measure whether the agent is working).<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">The cost magnitudes, which many glossaries omit, are the second part of the decision. Agentic systems consume 5 to 30 times more tokens per task than a single chat interaction:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Simple Tool-Calling Agents:<\/strong> 5,000\u201315,000 tokens per task.<\/li>\n\n\n\n<li><strong>Complex Multi-Agent Systems:<\/strong> 200,000 to over 1,000,000 tokens per task.<\/li>\n\n\n\n<li><strong>Unconstrained Software Engineering Agent:<\/strong> According to a study by the Stevens Institute of Technology, approximately $5\u20138 per resolved issue. 
This value applies to an agent without budget or step limits; with limits in place, it is lower.<\/li>\n\n\n\n<li><strong>Enterprise-level inference costs:<\/strong> According to ICONIQ Capital (<a href=\"https:\/\/cdn.prod.website-files.com\/65d0d38fc4ec8ce8a8921654\/6979532decf89bd3df2163b0_ICONIQ_Analytics_Insights_2026_State_of_AI_Bi-Annual_Snapshot.pdf\" target=\"_blank\" data-type=\"link\" data-id=\"https:\/\/cdn.prod.website-files.com\/65d0d38fc4ec8ce8a8921654\/6979532decf89bd3df2163b0_ICONIQ_Analytics_Insights_2026_State_of_AI_Bi-Annual_Snapshot.pdf\" rel=\"noreferrer noopener\">State of AI, January 2026, survey of approximately 300 software executives<\/a>), inference costs at scaling-stage AI companies amount to approximately 23 percent of revenue; per $1 million in product revenue, that is about $230,000 in pure model costs.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">One architectural lever for multi-agent systems is the hierarchical distribution of model classes: frontier models only for the orchestrator, budget models for the workers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In a benchmark for processing financial documents (SEC filings), this setup achieved approximately 97.7 percent of the accuracy of the full-cost configuration at roughly 60.9 percent of the cost. Whether these values transfer to other domains has not been shown. The lever itself (small models for routine subtasks) is considered more broadly applicable.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Example fields of application<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Predictive Maintenance:<\/strong> An agent reads sensor data, correlates it with maintenance histories, and suggests a time for intervention. 
According to Databricks, Lippert uses agent-assisted analytics to anticipate maintenance needs in manufacturing and logistics.<\/li>\n\n\n\n<li><strong>Competitive Monitoring:<\/strong> An agent researches defined competitors weekly, analyzes pricing pages, and creates a comparison report. This is a classic case of high path diversity, because the specifics differ each week.<\/li>\n\n\n\n<li><strong>Document Review:<\/strong> An orchestrator agent distributes the review of a contract or report to specialized workers. One worker extracts relevant clauses, one checks them against internal guidelines, and one summarizes. Here, the task breaks down into specialized subtasks, which makes it a typical candidate for a multi-agent system.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">When AI agents fail in reality<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Gartner predicts that by the end of 2027, over 40 percent of agentic AI projects will be abandoned, primarily due to escalating costs, unclear business value, or insufficient risk controls.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The recurring failure patterns can be recognized in advance:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Context Rot:<\/strong> In long loops, the agent pulls in more and more context, and relevant information gets lost in the noise. Early indicator: rising latency and declining hit quality as loop length increases during the test phase.<\/li>\n\n\n\n<li><strong>Quadratic Token Growth:<\/strong> Each new loop step processes the entire context so far, so cumulative cost and latency grow roughly with the square of the number of steps rather than linearly. Early indicator: token consumption per task is not measured in test runs.<\/li>\n\n\n\n<li><strong>Missing Checkpointing:<\/strong> If a step fails, the whole run has to start over. 
Early indicator: the architecture diagram shows no persistence layer for intermediate states.<\/li>\n\n\n\n<li><strong>Governance Gaps:<\/strong> Agents can perform actions without permissions, <a href=\"https:\/\/webraketen.space\/en\/seo-audit-agentur\/\" data-internallinksmanager029f6b8e52c=\"3\" title=\"SEO Audit Agency\">audit<\/a> logs, and escalation paths having been defined. Early indicator: nobody can say what the agent is and is not allowed to do, or who is liable if it acts anyway.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">What the EU AI Act requires for AI agents<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The EU AI Act entered into force on August 1, 2024, and is being applied in stages. The last major phase, originally scheduled for August 2, 2026, concerns the full enforceability of requirements for high-risk AI systems under Annex III. A political agreement in May 2026 (Digital Omnibus Package) provides for a postponement of approximately 16 months for Annex III systems; the formal decision was still pending at the time of publication. Fines for violations can reach up to \u20ac35 million or 7 percent of global annual turnover.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">According to Annex III, high-risk applications include, among others, AI systems used in human resources (candidate selection, performance evaluations), credit scoring, and critical infrastructure. 
An AI agent that, for example, pre-sorts incoming job applications or pre-screens credit applications thus falls into the high-risk category, regardless of whether a human ultimately makes the final decision.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Specific requirements apply to such systems and must be built into the agent architecture:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Risk Management System:<\/strong> Documented identification, assessment, and mitigation of risks throughout the entire lifecycle.<\/li>\n\n\n\n<li><strong>Data Governance:<\/strong> Requirements for training and input data (relevance, representativeness, accuracy, completeness).<\/li>\n\n\n\n<li><strong>Technical Documentation:<\/strong> A complete description of the design, functionality, performance metrics, limitations, and intended use, available before the product is placed on the market.<\/li>\n\n\n\n<li><strong>Logging:<\/strong> Automatic recording of inputs, outputs, and decisions. For agents, this means that every tool call, every planning step, and every reflection must be auditable.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Anyone planning to deploy an agent that may fall into a high-risk category must treat logging, conformity assessment, and documentation as architectural requirements from the outset, not as a compliance layer added after the fact.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Frequently Asked Questions<\/h2>\n\n\n<div id=\"rank-math-faq\" class=\"rank-math-block\">\n<div class=\"rank-math-list\">\n<div id=\"faq-question-1781017114372\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>Can a company develop AI agents on its own, or does it need external help?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>Prototypes can be built internally with standard frameworks if data access and LLM infrastructure are available. 
Production operation, especially with multiple coordinated agents, governance, audit logging, and cost control, typically requires specialized knowledge that most companies have not yet built up. The high abandonment rates primarily affect projects that underestimate the leap from prototype to production.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1781017141995\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>How many tokens does an AI agent use per task?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>Simple tool-calling agents use 5,000\u201315,000 tokens per task, while complex multi-agent systems use 200,000 to over 1,000,000. Compared to a single chat interaction, that is 5 to 30 times more. Token consumption per task should be measured during the testing phase, not just in production.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1781017151743\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>Is an AI agent the same as Agentic AI?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>No. The AI agent is the concrete software system; agentic AI refers to the underlying property of acting autonomously and purposefully. An agent is an artifact, agentic AI a conceptual property. Vendors who use the two terms synonymously signal fuzzy terminology.<\/p>\n\n<\/div>\n<\/div>\n<div id=\"faq-question-1781017165856\" class=\"rank-math-list-item\">\n<h3 class=\"rank-math-question\"><strong>When is an AI agent considered a high-risk system under the EU AI Act?<\/strong><\/h3>\n<div class=\"rank-math-answer\">\n\n<p>As soon as it is used in one of the areas of application listed in Annex III, including personnel decisions, creditworthiness checks, and critical infrastructure. 
Full enforcement of the high-risk requirements was scheduled for August 2, 2026. A political agreement reached in May 2026 provides for a postponement of approximately 16 months, but the formal decision is still pending. The classification depends on the intended use, not on the technical complexity of the agent.<\/p>\n\n<\/div>\n<\/div>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>An AI agent is a software system that independently solves a task over multiple steps: it plans the approach, calls on tools such as databases, software, or APIs, evaluates the intermediate results, and then autonomously decides on the next step. Unlike an AI chatbot, which returns a response to an input, an agent acts independently until it has reached its goal [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","class_list":["post-88901","glossar","type-glossar","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/webraketen.space\/en\/wp-json\/wp\/v2\/glossar\/88901","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/webraketen.space\/en\/wp-json\/wp\/v2\/glossar"}],"about":[{"href":"https:\/\/webraketen.space\/en\/wp-json\/wp\/v2\/types\/glossar"}],"version-history":[{"count":6,"href":"https:\/\/webraketen.space\/en\/wp-json\/wp\/v2\/glossar\/88901\/revisions"}],"predecessor-version":[{"id":88927,"href":"https:\/\/webraketen.space\/en\/wp-json\/wp\/v2\/glossar\/88901\/revisions\/88927"}],"wp:attachment":[{"href":"https:\/\/webraketen.space\/en\/wp-json\/wp\/v2\/media?parent=88901"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}