Chatbots Behaving Badly™

VCs Back Off, Apple Calls BS - Is This the End of the AI Hype?

By Markus Brinsa  |  August 12, 2025


When Apple quietly dropped a bombshell research paper in June titled “The Illusion of Thinking,” it sent ripples through the AI community. Here was tech’s most secretive giant openly questioning whether today’s AI can truly reason – after years of feverish hype claiming it could. The timing was striking: by 2025, investors had poured billions into AI, betting on ever-smarter machines. Yet Apple’s study painted a sobering picture: despite all the money and progress, the latest AI “reasoning” models struggle with problems even a child could solve. It was a splash of cold water on an industry drunk on its own hype. How did we get here, and are investors really pulling back from AI now that its shortcomings are harder to ignore?

Apple’s Reality Check: “The Illusion of Thinking”

For much of 2023–2024, tech leaders touted AI’s imminent leap to human-level intelligence. One well-known CEO even insisted digital superintelligence was just around the corner, while another boldly claimed AI would surpass human capabilities within 2–3 years. In contrast, Apple’s new paper documents several weaknesses in so-called “large reasoning models” (LRMs) – those advanced AI systems that attempt to articulate a step-by-step thinking process. The Apple researchers showed that these models often lack genuine reasoning skills, stumbling over even simple puzzles. It was as if a bucket of ice water had been dumped on the roaring fire of AI enthusiasm. The message: slow down, folks – these machines aren’t as smart as we think, at least not in the way we think we think.

So, what exactly did Apple’s team find? Using cleverly designed puzzle tests, they pushed state-of-the-art reasoning AIs to their limits – and beyond a certain point, those AIs hit a wall. The models might do fine with trivial problems and even improve on moderately hard ones – but as soon as the complexity dial turned up high, their performance collapsed completely. In fact, one of the most striking observations was that after a threshold, the AI essentially gives up: its “reasoning effort” (the length and detail of its thought process) actually declines as the problem gets harder, even when it has plenty of computational power and tokens left to use. It’s like watching a student who, when faced with a really hard question, just shrugs and writes nothing at all.

The researchers identified three regimes of performance. At low complexity, standard large language models without any special reasoning prompts actually outperformed the fancy “thinking” versions – because the reasoning models tended to overthink easy tasks and sometimes second-guessed themselves into wrong answers. At medium complexity, the reasoning-augmented models did better, justifying all those chain-of-thought techniques on tasks of moderate difficulty. But at high complexity, both types of model failed spectacularly – a “complete accuracy collapse,” as the paper puts it. No matter how much extra step-by-step thinking the AI did, it couldn’t solve the hardest puzzles once the complexity passed a certain point. The title of the paper, “The Illusion of Thinking,” says it all: these systems create the illusion of reasoning by spitting out logical-sounding steps, but fundamentally, “they’re still just very sophisticated pattern matchers” rather than true reasoners.

Apple’s findings hit a nerve. Here was quantitative proof for what many AI engineers had quietly suspected: today’s large language models (LLMs), even those gussied up as reasoning wizards, don’t really understand problems – they imitate reasoning until it gets too hard, then they peter out. The study’s controlled experiments (which included classic puzzles like the Tower of Hanoi) avoid the usual pitfalls – for example, they ruled out training data leaks that plague standard math benchmarks. This made the conclusion harder to ignore. AI critic Gary Marcus pointed out that some quibbles with the paper – like whether the failures were due to the model’s token limits (running out of output length) – miss the forest for the trees: if the model can’t even articulate a solution to a simple puzzle without running out of steam, how on earth will it handle genuinely complex real-world problems? It’s a fair question. As Marcus noted, even if token limits are one issue, the more fundamental issue is the faulty reasoning – the fact that the AI doesn’t know how to find a correct solution once the straightforward path isn’t obvious.

Not everyone agrees on how much this all matters. Some AI researchers responded that puzzles like Tower of Hanoi might not reflect “real” tasks AI needs to do – and that models could be trained differently to handle such systematic challenges. Others countered that Apple’s results underscore a core limitation: these models don’t generalize reasoning well. They can recite an algorithm (the solution was likely in their training data somewhere), but they struggle to apply it step-by-step when required to actually execute on a long sequence of logical moves. In one illustrative example, when asked to solve a 10-disk Tower of Hanoi, the AI quickly “remembered” the formula (2^n – 1 moves) and realized that meant 1,023 moves – and basically threw its hands up, saying in effect, “Listing a thousand moves is impossible, let me try a clever shortcut…,” which of course failed. The AI knew the answer in theory, but wouldn’t grind through the steps – a very non-human type of laziness! This suggests a built-in scaling problem: beyond a certain complexity, the model changes strategy (from brute-force reasoning to hopeless shortcut-seeking), and that’s when it breaks.
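To make the scale of that refusal concrete, here is a minimal Python sketch of the classic recursive Tower of Hanoi algorithm the model “knew” but would not execute. This is my own illustration, not code from Apple’s paper, and the function name and peg labels are hypothetical; for 10 disks it enumerates exactly 2^10 − 1 = 1,023 moves, a long but entirely mechanical sequence.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the full list of moves that solves an n-disk Tower of Hanoi."""
    if n == 0:
        return []
    # Park the n-1 smaller disks on the spare peg, move the largest disk,
    # then bring the smaller disks back on top of it.
    return (hanoi_moves(n - 1, source, spare, target)
            + [(source, target)]
            + hanoi_moves(n - 1, spare, target, source))

moves = hanoi_moves(10)
print(len(moves))   # 1023, i.e. 2**10 - 1
print(moves[:3])    # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

A dozen lines of deterministic code do what the model balked at: writing out every step instead of hunting for a shortcut.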

For Apple to publish such a candid critique is remarkable in itself.

This is a company known for guarded PR, not tipping its hand. Yet here Apple is, spotlighting the failings of large AI models – including, presumably, models similar to those of rivals like OpenAI and Google. It’s worth noting that one of the paper’s co-authors is Samy Bengio, a respected AI scientist Apple hired after he left Google. Apple has clearly been quietly building an internal AI research unit, and this paper offers a rare glimpse into its thinking. Cynics might wonder if Apple has an ulterior motive: By tempering the hype around AI, perhaps Apple is justifying why it hasn’t rushed out a ChatGPT competitor of its own. (Siri, after all, is nowhere near GPT-4 in capability, and Apple has been publicly quiet while others loudly trumpet their AI breakthroughs.) But more likely, Apple is doing the industry a favor by raising a flag: Hey, let’s be realistic about what today’s AI can do. In the long run, that kind of realism could save everyone time, money, and disappointment.

Billions In, But What’s the Return? The Investor Landscape

The title of the Forbes article that spurred this discussion encapsulates the irony perfectly: “Despite Billions in Investment, AI Reasoning Models Are Falling Short.” Indeed, the past couple of years have seen an unprecedented surge of money into AI. In the first half of 2025 alone, venture capitalists plowed over $100 billion into AI startups – almost as much as in all of 2024. By some estimates, a staggering two-thirds of all venture funding in the U.S. this year has been going to AI ventures. It’s as if AI became the only game in town. In the second quarter of 2025, VC deal-making hit a nine-year low in sheer number of deals, yet almost half of all venture dollars for that quarter – about $50 billion – went to AI companies. Fewer startups were getting funded overall, but if you had an AI angle, your odds were a lot better. The average deal size ballooned as investors shoveled big checks into anything with a hint of AI magic.

And we’re not just talking about early rounds for scrappy startups – we’re talking mega-deals that made jaws drop. OpenAI, the poster child of the AI boom, secured a $40 billion funding round in 2025 – the largest such deal on record. Scale AI, a data platform company, got $14.3 billion from Meta in a single swoop. Anthropic, another AI lab, raised $3.5 billion. Even relatively lesser-known players started joining the billionaire club: companies like Infinite Reality raised $3B, Anduril Industries (which blends AI and defense) got $2B, and others like Safe Superintelligence, Thinking Machines, and Groq snagged between $1 billion and $2 billion each. These numbers are unheard of in startup land – we’re used to talking about millions, not tens of billions. It truly felt like an AI gold rush, with investors scrambling not to miss the next big strike.

But here’s the rub: for all the money going in, not a lot has been coming out. By mid-2025, the market was starting to show signs of the very thing investors fear most: a bubble. According to financial data, there were 281 VC-backed “exits” in the AI sector in the first half of 2025 (an exit meaning the investors sell their stake, usually via an acquisition or IPO). That might sound like good news – hey, exits mean returns, right? The catch is that most of these were underwhelming. In total, only about $36 billion in cash-out value was realized from those exits, which is dramatically lower than the $104+ billion that went in during the same timeframe. That’s not sustainable math for venture capital. Many of those exits were quiet “lower-value acquisitions” – basically fire sales or modest acqui-hires rather than grand-slam IPOs. The IPO market for AI firms has been nearly frozen (the few exceptions, like an AI-powered insurance firm’s IPO valued at $2.3B, are outliers). So, investors are, in aggregate, left holding a lot of expensive stakes in AI companies with no clear path to profit.

It’s no wonder, then, that anxiety has crept into the investor community. In late July 2025, CNBC and others started reporting that some of the most bullish tech backers were quietly pumping the brakes. The frenzy of blank-check funding was being replaced with hushed questions: “Wait, will any of this make money anytime soon?” Financial analysts have begun openly warning of an “AI bubble.” One prominent economist, Torsten Slok of Apollo Global Management, went so far as to say the market conditions around AI in 2025 look even more extreme than the dot-com bubble of the late 90s. He pointed to the fact that the stock market’s top tech companies (the ones heavily invested in AI) have sky-high valuations that defy traditional metrics – reminiscent of the euphoria before the dot-com crash. Slok’s warning: if the broader economy wobbles or the AI promises don’t pan out fast enough, we could be in for a hard correction.

In fact, a correction may already be underway. Data from startup funding trackers shows that the median valuations for early-stage AI startups have begun to fall from their 2024 peaks. At the seed stage, for example, median valuations and check sizes in early 2025 dipped noticeably compared to the year before. By Q1 2025, a typical AI seed round valuation was around $14M (with $5M invested) – still high, but a “step down” from the giddy highs of 2024. More telling: at the next stage (Series A), valuations were holding high in name (around $42M median), but the actual dollars investors were willing to put in dropped sharply (from ~$19M late last year to ~$16M). In other words, many startups are clinging to lofty valuations on paper, but investors are cutting smaller checks – a sign of growing conservatism. Ronen Solomon, CEO of a firm that tracks these trends, remarked that “we are witnessing the bursting of the AI bubble, as the market begins to digest the hype… and translate it into more realistic valuations”. The party isn’t over, but the lights have been turned up, and people are sobering up.

It’s a nuanced picture. On one hand, funding for AI is still at historic highs in absolute terms – 2025’s quarterly investment levels are above 2022 or 2021 levels, for instance. The difference is in how that money is being allocated and the mood behind it. We’ve shifted from a stage where venture capital was spraying money at any startup with “AI” on a slide deck, to a stage where VCs are making fewer bets and demanding more evidence. A report on Q2 2025 global VC trends noted that the number of deals worldwide plunged (hitting the lowest since 2018), even though total money was still huge. This “fewer but bigger” dealmaking suggests investors are consolidating their bets around companies they perceive as winners, rather than funding a broad swath of experiments. In AI, that means a handful of well-known players and well-capitalized newcomers are hogging the lion’s share of capital, while smaller or more speculative AI projects struggle to get a foot in the door. Selectivity is the new mantra.

Geographically, the dynamic has some wrinkles, too. The United States has led the charge in AI investment, but it’s also where the retreat is most evident in raw numbers – U.S. venture funding overall contracted in early 2025, and exits dropped dramatically year-on-year. Europe, by comparison, saw venture funding dip slightly less severely (down to €27.5B in H1 2025 from €32.7B the year prior). European deal counts fell more than total funding, implying that, like in the U.S., European VCs concentrated money into fewer, larger deals. Culturally, Europe has been a bit more cautious on the AI craze – partly due to a strong regulatory push (the EU’s draft AI Act, for instance, reflects a healthy skepticism about unfettered AI deployment). That might have prevented some of the more fantastical investments from ever taking root in Europe. Even so, Europe had its AI hype moments too (witness the $600M raised by France’s Mistral AI in 2024). But in 2025, absent such headline-grabbing one-offs, the European AI investment scene also cooled to a more modest temperature. The pause in early-stage deals was worldwide – in France, for example, the number of seed financings in the first half of 2025 was barely one-third of the year-ago volume. Clearly, investors across both the US and EU are stepping back to some degree, recalibrating their risk appetite.

From Hype to Reality: A New Phase for AI

So, are venture capitalists and private equity investors really pulling back from AI? Yes and no. They’re not abandoning the field by any means – the consensus is that AI will be as transformative as the internet or electricity, so completely jumping ship would be foolish. But they are getting choosier and harder-nosed. The wild exuberance of 2024 has been replaced with pointed questions about ROI and technical feasibility. Investors are no longer impressed just because a startup says, “We’ve got a big language model that can do X.” They want to know: Can it solve a real problem better, faster, cheaper than anything else? They’ve been spooked by the fact that reportedly over 80% of AI projects in corporations fail to deliver on their promises. That failure rate is about twice that of regular IT projects – a sobering statistic that boards and CIOs have surely noted. It means a lot of POCs (proofs of concept) never turn into deployed solutions, often because the AI wasn’t as competent as hoped, or it was too hard to integrate, or employees didn’t trust it. This is the kind of cold reality that cuts through hype.

For AI startup founders, especially those in the US and Europe now feeling the funding pinch, the message is to adapt. If in 2024 the strategy was “pitch the dream” and cash in before anyone asks too many questions, in 2025 the strategy has to be “prove you can actually do it (and preferably, make money doing it).” We’re seeing a shift from vision to traction. In practical terms, that might mean focusing on narrower applications where AI clearly adds value, rather than broad claims of artificial general intelligence on the horizon. It might mean emphasizing how you’ll overcome the “illusion of thinking” problem – e.g., maybe your AI product uses a hybrid approach with symbolic logic or human-in-the-loop oversight to ensure reliability, thereby avoiding the pitfall Apple identified. The bar is higher now to get that check, but that’s not necessarily a bad thing. It could nudge the whole industry into a healthier place, where claims are grounded and the tech is vetted.

Importantly, being more realistic about AI’s current limitations doesn’t equate to doom and gloom about AI’s future. We can acknowledge that today’s models often fake reasoning and still be optimistic that tomorrow’s models will do better. In fact, the path forward likely involves precisely the kind of research Apple and others are doing: identifying where the flaws are and innovating our way around them. Google DeepMind, for instance, has been exploring combinations of neural networks with explicit reasoning modules. They’ve experimented with letting language models use external tools like calculators or code execution to handle tasks that require precision or lengthy logic. Researchers are also looking at techniques to improve the chain-of-thought process itself – for example, using reinforcement learning to teach an AI when to “think” and when to act, so it doesn’t waste time on unnecessary steps. Others suggest that integrating symbolic AI (good old-fashioned logic rules) or graph-based reasoning could help overcome the brittleness seen in purely neural approaches. It’s very possible that the next generation of AI systems – say, a future version of Google DeepMind’s Gemini – will blend these ideas: the raw pattern recognition power that current LLMs excel at, plus new architectures for planning and reasoning. DeepMind’s track record (like mastering Go with AlphaGo or solving protein folding with AlphaFold) shows that adding domain-specific strategies to brute-force AI can yield astonishing results. So a fusion of LLMs with more structured problem-solving might be just around the corner.
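As a rough illustration of that tool-use idea (the routing convention and function names below are hypothetical, not any vendor’s actual API), a hybrid system can detect sub-tasks that demand exact computation and hand them to deterministic code, reserving the language model for the open-ended parts it is genuinely good at:

```python
import ast
import operator

# Whitelisted operators for the deterministic "calculator" tool.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg}

def calculator(expression):
    """Safely evaluate an arithmetic expression: exact, unlike token-by-token 'reasoning'."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp):
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp):
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expression, mode="eval"))

def answer(question, llm=None):
    """Hypothetical router: send precise math to the tool, everything else to the language model."""
    if question.strip().startswith("calc:"):
        return str(calculator(question.split("calc:", 1)[1]))
    return llm(question) if llm else "(would call the language model here)"

print(answer("calc: 2**10 - 1"))   # 1023, computed rather than 'remembered'
```

The design choice is the point: the pattern matcher proposes, a verifiable tool executes, which is one way around the brittleness Apple documented, at least for the parts that can be checked.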

While big U.S. players forge ahead, the European AI scene might carve a different niche grounded in trustworthy AI. Europe’s emphasis on privacy, ethics, and safety could lead to innovations that make AI more reliable – something businesses and regulators on both continents would cheer. And interestingly, Apple – an American company – often shares some of these European sensibilities (privacy by design, etc.). Apple’s cautionary paper could be seen as aligning with a more careful, human-centric approach to AI that many European thinkers advocate. In that sense, there’s a convergence of thought: whether in California’s Cupertino or Brussels’ EU headquarters, thought leaders are saying “we need to level-set on what AI can really do, and work from there.”

Striking a Balance Between Hype and Reality

The current moment in AI can be described as a balancing act between hype and reality. On one hand, we have marvelous breakthroughs – just think about it: a few years ago, no machine could write a coherent article or pass a medical licensing exam, and now they can. On the other hand, the limitations are glaring – the same machine that writes an essay can’t be trusted to do simple reasoning without sometimes going off the rails. Being pro-AI today means holding these two truths simultaneously. It means being excited about AI’s potential to augment human capabilities, while also being vigilant about its very real pitfalls.

Apple’s frank assessment helps cut through the noise. It reminds executives and investors that if they believed the most breathless promises (like AI imminently becoming an all-seeing, all-knowing oracle), they need to recalibrate. A business leader reading “The Illusion of Thinking” might wisely conclude that deploying AI in their company will require human oversight and incremental expectations – not because AI is bad, but because it’s not magic. A venture capitalist reading it might still invest in the next big AI startup, but perhaps with terms that factor in longer development timelines or higher technical risk, rather than assuming a quick unicorn exit. This tempering of expectations is healthy. Recall that in the dot-com era, after the bubble burst, we didn’t abandon the internet – we abandoned the illusions around the internet. What remained was a more realistic growth curve that ultimately gave us the Web 2.0 and mobile revolutions. AI could follow a similar trajectory: a hype bubble deflates, a lot of speculative fluff falls away, and what’s left is an industry that actually delivers on its promises over time.

It’s telling that Hessie Jones, the author of the Forbes piece, provocatively asked if “AI reasoning” is an oxymoron. That captures what many in the AI field have been wrestling with. We see AI systems do impressively “smart” things one moment and then faceplant the next. Does that mean AI can’t reason, full stop? Not necessarily – it might mean we haven’t yet discovered the right approach. After all, human reasoning is still not fully understood; it’s a bit rich to expect our first-generation fake brains to nail it. But acknowledging that current AI reasoning falls short is the first step to improving it. Skepticism, in this context, isn’t anti-AI – it’s what spurs the AI community to work harder and smarter.

One might wonder how developers and companies struggling to get funding feel about all this.

Imagine being an AI entrepreneur in 2025. Last year, VCs were throwing money at you; this year, they’re asking you to prove yourself ten different ways. Frustrating? Sure. But it also weeds out the pretenders. The startups that survive this phase will be those with solid fundamentals – perhaps a clever new algorithm, or a proprietary data advantage, or a product that fills a real market need despite not being perfect AI. In a way, the belt-tightening separates the wheat from the chaff. If you can address the hard questions (“What happens when your AI doesn’t know the answer?” or “How do you prevent it from making stuff up?”), you’ll find investors who are still eager to back you. If you gloss over those, you’ll find doors closing. And that’s how it should be. As one AI researcher quipped, the “illusion of thinking” is fine for a demo, but not for a deployed product that people rely on – so developers must turn the illusion into reality or risk losing the trust of both investors and users.

Ultimately, this moment in the AI industry feels like a coming-of-age. The teenage years of reckless growth and self-confidence are ending, and a more mature phase is beginning. The narrative is no longer just about how awesome AI will be, but also about what it takes to make AI truly awesome – the sweat, the science, the careful investment and testing. We’re likely to see slightly slower, but more sustainable progress. Executives will still integrate AI into their strategies, but perhaps with a chief AI officer whose job is to constantly evaluate its outputs and manage its risks. Investors will still fund the next OpenAI or DeepMind, but maybe they’ll insist on seeing a working prototype and some paying customers first. And regulators – especially in the EU – will push for guardrails that ensure AI is used responsibly, which could prevent some of the hype-driven missteps (and also reassure the public, which is important for long-term adoption).

In the end, the illusion in Apple’s paper title is a gift. It’s a reminder not to fall for the illusion ourselves. There’s real intelligence in these machines, but it’s not human-like reasoning yet, and pretending otherwise helps nobody. By stripping away the illusion, we can focus on the real thinking that needs to be done – by us, the humans – to build AI that lives up to its incredible potential. As we invest the next billions in AI, armed with a clearer vision, we can hopefully ensure that those dollars yield technology that is not just impressive in a demo, but truly transformative in the world. And that’s no illusion.

About the Author

Markus Brinsa is the Founder and CEO of SEIKOURI Inc., an international strategy consulting firm specializing in early-stage innovation discovery and AI Matchmaking. He is also the creator of Chatbots Behaving Badly, a platform and podcast that investigates the real-world failures, risks, and ethical challenges of artificial intelligence. With over 15 years of experience bridging technology, business strategy, and market expansion in the U.S. and Europe, Markus works with executives, investors, and developers to turn AI’s potential into sustainable, real-world impact.

©2025 Copyright by Markus Brinsa | Chatbots Behaving Badly™