
Despite the grand pronouncements at the dawn of 2025 by OpenAI's leader, Sam Altman, about the revolutionary potential of GPT-5 and AI agents, reality has unfolded quite differently. Anticipation ran high for a more capable large language model and for agents able to carry out complex tasks beyond simple queries, but the autonomous systems deployed so far have largely fallen short. Businesses have been eager to integrate AI agents, with a significant share planning deployment by year-end, yet practical experience across industries consistently highlights operational deficiencies: unreliability, errors that accumulate over sequential operations, and susceptibility to security vulnerabilities. The result is far removed from the seamless workforce integration initially envisioned, casting a shadow over the immediate impact of even advanced models like GPT-5.
Sam Altman, the chief executive of OpenAI, initiated 2025 with an optimistic forecast, suggesting that the year would witness the profound influence of GPT-5 and the debut of sophisticated AI agents in the professional sphere. His vision included these agents significantly augmenting corporate productivity. Fast forward eight months, and this prediction warrants considerable re-evaluation. A May 2025 analysis by PwC revealed widespread corporate interest, with half of all surveyed enterprises intending to deploy AI agents within the year, and a substantial 88% of executives expressing a desire to boost their artificial intelligence budgets specifically for agentic AI initiatives. However, the enthusiasm from the business world stands in stark contrast to the agents' actual performance.
The consensus from early adopters and technology observers has been overwhelmingly critical. Prominent publications have described AI agents as 'glitchy,' 'inconsistent,' and even 'clueless Internet newbies.' Many reports suggest that the practical outcomes simply 'don't live up to the hype' and fail to match the 'buzzwords' surrounding them, with some even labelling them 'the new vaporware' due to excessive overpromising. A particularly telling May 2025 study from Carnegie Mellon University underscored these concerns, finding that Google's Gemini 2.5 Pro, considered a top performer, failed at real-world office tasks 70% of the time. OpenAI's own offering, powered by GPT-4o, exhibited an even higher failure rate, exceeding 90%.
While the advent of GPT-5 might offer some marginal improvements, experts are skeptical it will fundamentally resolve the deep-seated issues plaguing AI agents. The core problem, as articulated by AI agent engineers, lies in the compounding nature of errors: the more sequential tasks an agent undertakes, the more pronounced its performance degradation becomes. This inherent flaw leaves multi-step, complex AI agents susceptible to 'hallucinations' and severe operational failures, including one alarming incident in which a Replit AI agent deleted a customer's entire database. Such events have even spurred the creation of AI agent insurance startups and prompted major retailers like Walmart to deploy 'super agents' to manage the erratic behaviour of their AI counterparts. Gartner, in a June 2025 report, ominously predicted that over 40% of current agentic AI projects would be cancelled within two years, attributing this high attrition to projects being 'driven by hype and misapplied,' which blinds organizations to the true costs and intricacies of large-scale AI agent deployment.
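The compounding-error problem can be illustrated with a back-of-the-envelope calculation. The sketch below assumes each step of an agent's task succeeds independently with a fixed probability, which is a simplification (real agent errors are often correlated and context-dependent), but it shows why even high per-step reliability collapses over long task chains:

```python
# Illustrative sketch: if an agent completes each individual step with
# probability p, and steps fail independently, then the chance that an
# n-step task finishes without a single error is p ** n.
def chain_success(p: float, n: int) -> float:
    """Probability that all n sequential steps succeed."""
    return p ** n

for n in (1, 5, 10, 20):
    print(f"{n:>2} steps at 95% per-step reliability: "
          f"{chain_success(0.95, n):.0%}")
```

Under this toy model, an agent that is 95% reliable on any single step completes a 20-step task barely a third of the time, which is consistent with the high multi-step failure rates reported in the Carnegie Mellon study.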
Furthermore, even if GPT-5 significantly enhances the reliability of OpenAI’s agents, broader constraints are emerging from both corporations and regulatory bodies that will limit their utility. For example, Amazon, despite its public support for AI agents, has already restricted their ability to browse or make purchases on its platform. While this offers Amazon greater control over its customer experience and advertisement delivery, it simultaneously curtails a vast array of potential agent functionalities. There are also significant security concerns, with researchers identifying vulnerabilities where embedded data in images could trick AI agents into revealing sensitive information, like credit card details. These escalating challenges, encompassing both performance limitations and external restrictions, indicate that the future of AI agents may not align with the transformative visions initially set forth, instead highlighting a need for more cautious and practical expectations.
