Gartner: GPT-5 is here, but the infrastructure to support true agentic AI isn’t (yet)

Taryn Plumb 2025-08-14 21:35:00



Here’s an analogy: Freeways didn’t exist in the U.S. until after 1956, when they were envisioned by President Dwight D. Eisenhower’s administration; yet super-fast, powerful cars from Porsche, BMW, Jaguar, Ferrari and others had been around for decades.

You could say AI is at that same pivot point: While models are becoming increasingly capable, performant and refined, the critical infrastructure they need to bring about true, real-world innovation has yet to be fully built out.

“All we have done is create some very good engines for a car, and we are getting super excited, as if we have this fully functional highway system in place,” Arun Chandrasekaran, Gartner distinguished VP analyst, told VentureBeat.

This is leading to a plateauing, of sorts, in model capabilities, as seen in OpenAI’s GPT-5: While an important step forward, it features only faint glimmers of truly agentic AI.


“It is a very capable model, it is a very versatile model, it has made some very good progress in specific domains,” said Chandrasekaran. “But my view is it’s more of an incremental progress, rather than a radical progress or a radical advancement, given all of the high expectations OpenAI has set in the past.”

GPT-5 improves in three key areas

To be clear, OpenAI has made strides with GPT-5, according to Gartner, including in coding tasks and multimodal capabilities.

Chandrasekaran pointed out that OpenAI has pivoted to make GPT-5 “very good” at coding, clearly sensing gen AI’s enormous opportunity in enterprise software engineering and taking aim at competitor Anthropic’s leadership in that area.

Meanwhile, GPT-5’s progress in modalities beyond text, particularly in speech and images, provides new integration opportunities for enterprises, Chandrasekaran noted.

GPT-5 also advances AI agent and orchestration design, if subtly, thanks to improved tool use; the model can call third-party APIs and tools and perform parallel tool calling (handling multiple tasks simultaneously). However, this means enterprise systems must have the capacity to handle concurrent API requests in a single session, Chandrasekaran points out.
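
To make that concurrency requirement concrete, here is a minimal sketch of the server-side pattern, assuming Python with asyncio and two hypothetical tools (lookup_order and check_inventory): when one model turn emits several tool calls, the host executes them in parallel rather than one at a time.

    import asyncio

    # Hypothetical tool implementations; real ones would hit a CRM, ERP, etc.
    async def lookup_order(order_id: str) -> dict:
        await asyncio.sleep(0.2)  # simulate backend API latency
        return {"order_id": order_id, "status": "shipped"}

    async def check_inventory(sku: str) -> dict:
        await asyncio.sleep(0.2)
        return {"sku": sku, "in_stock": 42}

    TOOLS = {"lookup_order": lookup_order, "check_inventory": check_inventory}

    async def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
        # Execute every tool call from a single model turn concurrently.
        tasks = [TOOLS[c["name"]](**c["arguments"]) for c in tool_calls]
        return await asyncio.gather(*tasks)

    # Two calls a model might emit in one parallel-tool-calling turn.
    calls = [
        {"name": "lookup_order", "arguments": {"order_id": "A-1001"}},
        {"name": "check_inventory", "arguments": {"sku": "SKU-7"}},
    ]
    print(asyncio.run(run_tool_calls(calls)))  # both finish in ~0.2s, not ~0.4s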

Multistep planning in GPT-5 allows more business logic to reside within the model itself, reducing the need for external workflow engines, and its larger context windows (8K for free users, 32K for Plus at $20 per month and 128K for Pro at $200 per month) can “reshape enterprise AI architecture patterns,” he said.

This means that applications that previously relied on complex retrieval-augmented generation (RAG) pipelines to work around context limits can now pass much larger datasets directly to the models and simplify some workflows. But this doesn’t mean RAG is irrelevant; “retrieving only the most relevant data is still faster and more cost-effective than always sending massive inputs,” Chandrasekaran pointed out.

Gartner sees a shift to a hybrid approach with less stringent retrieval, with devs using GPT-5 to handle “larger, messier contexts” while improving efficiency.
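
One way to read that hybrid pattern in code is to route small corpora straight into the context window and fall back to retrieval only when the input would blow the budget. The sketch below is illustrative, not Gartner’s or OpenAI’s method: the 4-characters-per-token estimate is a crude heuristic, and retrieve_top_k is a stand-in keyword scorer where a production pipeline would use embeddings or a search index.

    CONTEXT_BUDGET_TOKENS = 100_000  # stay safely under the context window

    def estimate_tokens(text: str) -> int:
        return len(text) // 4  # rough heuristic; use a real tokenizer in practice

    def retrieve_top_k(query: str, documents: list[str], k: int) -> list[str]:
        # Stand-in scorer: rank documents by word overlap with the query.
        terms = set(query.lower().split())
        ranked = sorted(documents,
                        key=lambda d: -len(terms & set(d.lower().split())))
        return ranked[:k]

    def build_context(query: str, documents: list[str]) -> str:
        corpus = "\n\n".join(documents)
        if estimate_tokens(corpus) <= CONTEXT_BUDGET_TOKENS:
            return corpus  # small enough: skip the RAG pipeline entirely
        # Too large: retrieval is still faster and cheaper than massive inputs.
        return "\n\n".join(retrieve_top_k(query, documents, k=10))

    docs = ["refund policy: 30 days", "shipping: 5-7 business days"]
    print(build_context("What is the refund window?", docs))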

On the cost front, GPT-5 “considerably” reduces API usage fees; top-level costs are $1.25 per 1 million input tokens and $10 per 1 million output tokens, making it comparable to models like Gemini 2.5, but seriously undercutting Claude Opus. However, GPT-5’s input/output price ratio is higher than earlier models’, which AI leaders should take into account when considering GPT-5 for high-token-usage scenarios, Chandrasekaran advised.
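
That asymmetry is easy to quantify. A quick back-of-the-envelope check, using the listed prices and invented token counts:

    INPUT_PRICE = 1.25 / 1_000_000    # dollars per input token
    OUTPUT_PRICE = 10.00 / 1_000_000  # dollars per output token (8x input)

    def request_cost(input_tokens: int, output_tokens: int) -> float:
        return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

    # Input-heavy call (e.g., long-context analysis): 90K in, 2K out.
    print(f"${request_cost(90_000, 2_000):.4f}")  # $0.1325
    # Output-heavy call (e.g., long report generation): 2K in, 90K out.
    print(f"${request_cost(2_000, 90_000):.4f}")  # $0.9025 -- output dominates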

Bye-bye previous GPT versions (sorta)

Ultimately, GPT-5 is designed to eventually replace GPT-4o and the o-series (they were initially sunset, then some were reintroduced by OpenAI due to user dissent). Three model sizes (pro, mini, nano) will allow architects to tier services based on cost and latency needs; simple queries can be handled by smaller models and complex tasks by the full model, Gartner notes.
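
A minimal sketch of what that tiering might look like in practice; the routing rule (a keyword check plus prompt length) and the model identifiers are illustrative stand-ins, not a recommended production policy:

    def pick_model(prompt: str) -> str:
        # Complex reasoning goes to the full model.
        if any(w in prompt.lower() for w in ("analyze", "plan", "prove")):
            return "gpt-5-pro"
        # Short lookups go to the cheapest, lowest-latency tier.
        if len(prompt) < 200:
            return "gpt-5-nano"
        # Everything else lands on the middle tier.
        return "gpt-5-mini"

    print(pick_model("What is our refund policy?"))                    # gpt-5-nano
    print(pick_model("Analyze Q3 churn and plan remediation steps."))  # gpt-5-pro

As noted below, GPT-5 also ships inbuilt dynamic routing, so an explicit router like this is only needed when teams want the policy under their own control.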

However, differences in output formats, memory and function-calling behaviors may require code review and adjustment, and because GPT-5 may render some previous workarounds obsolete, devs should audit their prompt templates and system instructions.

By eventually sunsetting previous versions, “I think what OpenAI is trying to do is abstract that level of complexity away from the user,” said Chandrasekaran. “Frequently enough we’re not the best people to make those decisions, and sometimes we may even make erroneous decisions, I would argue.”

Another factor behind the phase-outs: “We all know that OpenAI has a capacity problem,” he said, and thus has forged partnerships with Microsoft, Oracle (Project Stargate), Google and others to provision compute capacity. Running multiple generations of models would require multiple generations of infrastructure, creating new cost implications and physical constraints.

New risks, advice for adopting GPT-5

OpenAI claims it reduced hallucination rates by up to 65% in GPT-5 compared to previous models; this can help reduce compliance risks and make the model more suitable for enterprise use cases, and its chain-of-thought (CoT) explanations support auditability and regulatory alignment, Gartner notes.

At the same time, these lower hallucination rates, as well as GPT-5’s advanced reasoning and multimodal processing, could amplify misuse such as advanced scam and phishing generation. Analysts advise that critical workflows remain under human review, even if with less frequent sampling.

The firm also advises that enterprise leaders:

• Pilot and benchmark GPT-5 in mission-critical use cases, running side-by-side evaluations against other models to determine differences in accuracy, speed and user experience (a minimal harness is sketched after this list).
• Monitor practices like vibe coding that risk data exposure, defects or guardrail failures.
• Revise governance policies and guidelines to address new model behaviors, expanded context windows and safe completions, and calibrate oversight mechanisms.
• Experiment with tool integrations, reasoning parameters, caching and model sizing to optimize performance, and use inbuilt dynamic routing to determine the right model for the right task.
• Audit and upgrade plans for GPT-5’s expanded capabilities. This includes validating API quotas, audit trails and multimodal data pipelines to support new features and increased throughput. Rigorous integration testing is also important.
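
The side-by-side evaluation in the first recommendation can start as small as the harness below; call_model is a hypothetical wrapper around whichever provider SDKs you use, and the substring-match scorer is a placeholder for a real task-specific check.

    import time

    def call_model(model: str, prompt: str) -> str:
        raise NotImplementedError("wire up your provider SDK here")

    def score(answer: str, expected: str) -> bool:
        return expected.lower() in answer.lower()  # naive; replace per task

    def evaluate(models: list[str], cases: list[tuple[str, str]]) -> None:
        for model in models:
            hits, start = 0, time.perf_counter()
            for prompt, expected in cases:
                hits += score(call_model(model, prompt), expected)
            elapsed = time.perf_counter() - start
            print(f"{model}: {hits}/{len(cases)} correct, {elapsed:.1f}s total")

    # Model names are illustrative; supply your own labeled (prompt, expected) cases.
    # evaluate(["gpt-5", "claude-opus", "gemini-2.5"], labeled_cases)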

Agents don’t just need more compute; they need infrastructure

No doubt, agentic AI is a “super hot topic today,” Chandrasekaran noted, and is one of the top areas for investment in Gartner’s 2025 Hype Cycle for Gen AI. At the same time, the technology has hit Gartner’s “Peak of Inflated Expectations,” meaning it has experienced widespread publicity due to early success stories, in turn building unrealistic expectations.

This trend is typically followed by what Gartner calls the “Trough of Disillusionment,” when interest, excitement and investment cool off as experiments and implementations fail to deliver (remember: there have been two notable AI winters since the 1980s).

“A lot of vendors are hyping products beyond what products are capable of,” said Chandrasekaran. “It’s almost like they’re positioning them as being production-ready, enterprise-ready and are going to deliver business value in a really short span of time.”

However, in reality, the chasm between product quality and expectations is wide, he noted. Gartner isn’t seeing enterprise-wide agentic deployments; those it is seeing are in “small, narrow pockets” and specific domains like software engineering or procurement.

“But even those workflows are not fully autonomous; they are often either human-driven or semi-autonomous in nature,” Chandrasekaran explained.

One of the key culprits is the lack of infrastructure; agents require access to a wide set of enterprise tools and must have the capability to communicate with data stores and SaaS apps. At the same time, there must be adequate identity and access management systems in place to control agent behavior and access, along with oversight of the types of data they can access (nothing personally identifiable or sensitive), he noted.
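
What that identity-and-access layer might look like at its simplest: each agent identity carries an explicit allow-list of tools and data classifications, and anything not granted is denied. All names below are illustrative.

    # Per-agent policy: which tools it may call and which data classes it may see.
    AGENT_POLICY = {
        "procurement-agent": {
            "tools": {"search_catalog", "create_po"},
            "data": {"public", "internal"},  # no PII, nothing restricted
        },
    }

    def authorize(agent: str, tool: str, data_class: str) -> bool:
        policy = AGENT_POLICY.get(agent)
        if policy is None:
            return False  # unknown agents are denied by default
        return tool in policy["tools"] and data_class in policy["data"]

    assert authorize("procurement-agent", "create_po", "internal")
    assert not authorize("procurement-agent", "create_po", "pii")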

Lastly, enterprises must be confident that the information the agents are producing is trustworthy, meaning it’s free of bias and doesn’t contain hallucinations or false information.

To get there, vendors must collaborate and adopt more open standards for agent-to-enterprise and agent-to-agent tool dialog, he advised.

“While agents or the underlying technologies may be making progress, this orchestration, governance and data layer is still waiting to be built out for agents to thrive,” said Chandrasekaran. “That’s where we see a lot of friction today.”

Yes, the industry is making progress with AI reasoning, but still struggles to get AI to understand how the physical world works. AI mostly operates in a digital world; it doesn’t have strong interfaces to the physical world, although improvements are being made in spatial robotics.

But, “we are very, very, very, very early stage for those kinds of environments,” said Chandrasekaran.

To truly make significant strides requires a “revolution” in model architecture or reasoning. “You cannot be on the current curve and just expect more data, more compute, and hope to get to AGI,” he said.

That’s evident in the much-anticipated GPT-5 rollout: The ultimate goal that OpenAI defined for itself was AGI, but “it’s really apparent that we are nowhere close to that,” said Chandrasekaran. Ultimately, “we’re still very, very far away from AGI.”
