If you believe the labels, your phone writes novels, your fridge co-authors recipes, and your vacuum has become a philosopher. “With AI” is the new “gluten-free”—slapped on anything that won’t immediately prove it wrong. The result is a tech culture where people nod solemnly at products that are “AI-powered,” even when what’s powering them is a 2007 if-else statement and a mood board.
But here’s the twist: there is a real, regulated definition of AI now. And once you see it, you’ll understand why half of what’s sold as “AI” is really just better branding with a fog machine.
The EU AI Act gave the world a clean line: an AI system is a machine-based system that operates with some degree of autonomy and infers, from the inputs it receives, how to generate outputs such as predictions, content, recommendations, or decisions. The keyword is “infers.” If software doesn’t use a model to infer, and instead just follows fixed rules, it may be fancy automation—but it isn’t AI under the law. That matters for compliance, claims, and credibility.
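To see the difference in the smallest possible terms, here is a toy sketch (every name in it is hypothetical): the first function is fixed rules dressed up as intelligence; the second actually infers an output from learned weights.

```python
import math

def rule_based_reply(message: str) -> str:
    # Fixed rules: no model, no inference. Automation, not AI under the Act's definition.
    text = message.lower()
    if "invoice" in text:
        return "Thanks, I'll take a look at the invoice."
    if "meeting" in text:
        return "Sounds good, see you there."
    return "Got it."

# A deliberately tiny "model": the weights were fitted to data somewhere else,
# and the output is inferred from the input rather than looked up in a rule table.
LEARNED_WEIGHTS = {"invoice": 2.1, "urgent": 1.7, "meeting": -0.4}  # toy values
BIAS = -1.0

def inferred_priority(message: str) -> float:
    # Inference: combine learned weights with features of the input.
    text = message.lower()
    score = BIAS + sum(w for token, w in LEARNED_WEIGHTS.items() if token in text)
    return 1 / (1 + math.exp(-score))  # probability the message is high priority

print(rule_based_reply("Urgent: the invoice is attached"))             # rule lookup
print(round(inferred_priority("Urgent: the invoice is attached"), 3))  # ~0.94, inferred
```

Both could ship under a “with AI” badge; only the second would survive the definition.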
Regulators noticed the “AI” sticker creep, too. In the U.S., the FTC launched Operation AI Comply and has started going after companies for deceptive “AI” claims—think “robot lawyer” fantasies and tools that promise the moon without proof. Translation: if you’re going to say “with AI,” you need evidence that a model is doing real work and doing it well.
Yes—just a different species than today’s large language models. When Apple launched Siri in 2011, it stitched together speech recognition and intent parsing with structured pipelines. That’s classic AI of the symbolic/statistical era, not the generative Transformers of 2023+. It was still AI, and it evolved over time, but it wasn’t “ChatGPT in your pocket.” Even the speech stack came from Nuance back then, before Apple rebuilt huge pieces in-house.
Siri’s real lesson is durability: “AI” is a moving target. What we call AI tends to be whatever still surprises us. Once it becomes mundane, we quietly relabel it “software” and move on.
Marketers say “integrated AI,” but the real architecture in 2025 is hybrid: do what you can on the device, then hand off the heavy lifting to the cloud.
Apple’s Apple Intelligence tries to run tasks locally first. When a request requires more horsepower or a larger model, it offloads to Private Cloud Compute, Apple-run servers built on Apple silicon that process your request, promise not to retain your data, and expose their infrastructure for independent inspection by the security community. Device support is gated by silicon and memory; the iPhone 15 Pro/Pro Max and the whole iPhone 16 lineup are in, while older and non-Pro 15 models are out. That’s “integrated,” but not “everything is on-device all the time.”
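The routing logic is Apple’s, and the code below is not its API; it’s a minimal sketch of the local-first, cloud-fallback pattern that paragraph describes, with made-up function names and a made-up budget.

```python
from dataclasses import dataclass

ON_DEVICE_WORD_BUDGET = 2_000  # assumed capacity of the small local model


@dataclass
class Request:
    prompt: str
    needs_large_model: bool = False  # e.g. long-form generation, image synthesis


def run_on_device(req: Request) -> str:
    # Stand-in for the compact local model (summaries, rewrites, photo cleanup).
    return f"[on-device] handled: {req.prompt[:40]}..."


def run_in_private_cloud(req: Request) -> str:
    # Stand-in for the larger server-side model; in Apple's design this is
    # Private Cloud Compute, which commits to not retaining the request.
    return f"[cloud] handled: {req.prompt[:40]}..."


def handle(req: Request, online: bool) -> str:
    fits_locally = (not req.needs_large_model
                    and len(req.prompt.split()) < ON_DEVICE_WORD_BUDGET)
    if fits_locally:
        return run_on_device(req)
    if online:
        return run_in_private_cloud(req)
    raise RuntimeError("Needs the larger model, but the device is offline.")


print(handle(Request("Summarize this message thread"), online=False))  # stays local
print(handle(Request("Write a 3,000-word market analysis", needs_large_model=True), online=True))
```

The last branch is the part the marketing leaves out: when the network goes away, the feature set quietly shrinks.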
Samsung’s Galaxy AI does something similar at the ecosystem level. Many features can run locally when you turn on on-device processing or download language packs—but flagship tricks like Circle to Search still require the internet by design because, well, it’s search. You can limit or disable cloud features, but the capability shrinks accordingly.
On Android more broadly, Google ships Gemini Nano—a compact foundation model in the OS via AICore—so device makers can build features that work offline. Bigger models (for long reasoning, richer writing, or image generation) live in the cloud. Even Google dials models down for cheaper handsets when RAM is tight, which tells you everything about the practical limits of “AI everywhere.”
Sometimes—within the on-device budget. On a supported iPhone, summarizing a message thread or cleaning up a photo may run locally; however, a lengthy research summary or complex image generation might require escalation to Apple’s cloud and won’t work in airplane mode. On Samsung, features like Live Translate or Interpreter can work offline after you download language packs, but cloud-dependent ones won’t. The internet still exists for a reason.
To make this concrete, Circle to Search is explicitly network-required. It’s baked into the name. No connection, no results. Meanwhile, some translation and writing aids can remain functional offline—if you’ve set them to on-device and installed the models. Hybrids, not miracles.
Modern language and vision models are measured in parameters. More parameters generally mean more capability, and more memory, storage, and power draw. A 7-billion-parameter model needs roughly 28 GB just for its weights at 32-bit precision, and about 14 GB at 16-bit. Through quantization, which shrinks weights from 16- or 32-bit floats to 4- or 8-bit integers, you can squeeze that down to a few gigabytes, at the cost of some quality. That’s how phones run helpful on-device models at all.
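The arithmetic fits in a few lines. The sketch below counts only the weights themselves, so it ignores activations, the KV cache, and quantization metadata, all of which push real footprints higher.

```python
# Back-of-the-envelope weight memory for a 7-billion-parameter model.
PARAMS = 7_000_000_000

for label, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("int4", 4)]:
    gigabytes = PARAMS * bits / 8 / 1e9  # bits -> bytes -> GB
    print(f"{label:>5}: {gigabytes:5.1f} GB")

# fp32:  28.0 GB  -> workstation territory
# fp16:  14.0 GB  -> high-end GPU territory
# int8:   7.0 GB  -> a big laptop
# int4:   3.5 GB  -> plausible on a flagship phone, which is the whole point
```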
Real-world numbers put a face on it. Developers routinely run 7B-class models at 4-bit on consumer hardware; it’s tight but viable, and performance depends on the device’s NPU, GPU, and RAM. Apple and Google both publish tooling to compress and accelerate models for their chips. This isn’t hand-waving—it’s an engineering tradeoff that explains why phones can privately summarize a message but balk at writing your quarterly report from scratch.
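Where the “some quality” cost comes from is easiest to see in a toy roundtrip: map a handful of float weights onto signed 4-bit integers, then map them back. This is a bare-bones symmetric scheme for illustration, not Apple’s or Google’s toolchain.

```python
# Toy symmetric 4-bit quantization: store weights as small integers plus one scale,
# then reconstruct approximate floats at inference time.
weights = [0.42, -1.30, 0.07, 2.15, -0.88, 0.003]

def quantize_roundtrip(values, bits=4):
    levels = 2 ** (bits - 1) - 1                 # 7 for signed 4-bit
    scale = max(abs(v) for v in values) / levels
    stored = [round(v / scale) for v in values]  # what actually sits in memory
    restored = [q * scale for q in stored]       # what the model computes with
    return stored, restored

stored, restored = quantize_roundtrip(weights)
for w, q, r in zip(weights, stored, restored):
    print(f"{w:+.3f} -> stored as {q:+d} -> restored {r:+.3f} (error {abs(w - r):.3f})")
```

Most weights come back close enough; a few don’t, and that accumulated rounding error is the quality tradeoff the tooling above exists to manage.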
In other words, your “AI phone” carries pocket-sized brains for everyday tasks and borrows a supercomputer when you ask for something that would torch your battery.
“AI” became a halo word. It implies novelty, value, and a hint of magic. That’s also why regulators are pushing back on AI-washing—inflated claims that a product “uses AI” or “replaces professionals” without evidence. The FTC has already levied fines and forced changes in these cases, and its guidance is blunt: keep your AI claims in check, substantiate performance, and don’t pretend your model can do things it can’t. The grown-ups have entered the chat.
The EU is doing the same from another angle: start with a definition, then scope obligations based on risk and capability. If your “with AI” feature is really rules and regex, you may avoid AI Act obligations—but you also lose the right to preen about “AI” in your materials. You can’t have it both ways.
The earliest Siri on iPhone 4s relied on server-side speech processing and a fairly rigid intent system. It worked, then it didn’t, then it worked again—remember the 2011 outages? That’s what happens when your “AI” lives far away. Over the years, Apple has moved more intelligence on-device, upgraded models, and now splits work between local silicon and Private Cloud Compute. What changed is the kind of AI and where it runs, not that Apple suddenly discovered intelligence last summer.
Here’s the secret decoder ring: whenever you see the label, ask three quiet questions.
First, what’s the model? If there’s no model—no inference, no learning—then it’s not AI under the EU definition, it’s automation with a facelift. Second, where does it run? On-device, cloud, or hybrid. That determines privacy, latency, and offline behavior. Third, what happens offline? If the magic disappears without a signal, your “integrated AI” is really a thin client to someone else’s model.
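If it helps, the decoder ring compresses into a checklist. The field names below are hypothetical; the three questions are the ones above.

```python
from dataclasses import dataclass

@dataclass
class AIClaim:
    has_model: bool      # 1. is there actual inference, or just fixed rules?
    runs: str            # 2. "on-device", "cloud", or "hybrid"
    works_offline: bool  # 3. does the headline feature survive airplane mode?

def verdict(claim: AIClaim) -> str:
    if not claim.has_model:
        return "Automation with a facelift: not AI under the EU definition."
    if claim.runs == "cloud" and not claim.works_offline:
        return "A thin client to someone else's model."
    return "Plausibly integrated AI; now ask which model and how well it performs."

print(verdict(AIClaim(has_model=False, runs="on-device", works_offline=True)))
print(verdict(AIClaim(has_model=True, runs="cloud", works_offline=False)))
```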
Industry doesn’t hate these questions; it just hopes you won’t ask them. Apple now documents the exact boundary where a task transitions from device to Private Cloud Compute; Samsung exposes toggles and language packs; Google ships Gemini Nano in the OS and notifies you when you’re requesting something more substantial. Transparency is improving because it has to.
“AI” isn’t a magic ingredient; it’s a specific kind of computation—inference—and it lives within the physics of chips, memory, bandwidth, and battery. Phones absolutely run AI offline today, but they do it with compact models and clever compression. When you ask for heavyweight creativity or research, they phone home. If a product claims “with AI,” you’re allowed to ask “which model, where does it run, and what breaks offline?” That isn’t cynicism. That’s literacy. And literacy is how we deflate the sticker.