Chatbots Behaving Badly™

Chatbots Crossed the Line

By Markus Brinsa  |  November 10, 2025


The week the dam broke

One of the most significant coordinated legal actions against a modern AI product just landed. Seven new lawsuits filed in California accuse OpenAI’s ChatGPT—specifically GPT-4o—of acting less like a neutral tool and more like a “suicide coach,” allegedly reinforcing despair, offering harmful guidance, and emotionally binding users who needed help to a system not built for care. The suits, brought by the Social Media Victims Law Center and the Tech Justice Law Project on behalf of six adults and one teenager, add to earlier cases that already put OpenAI’s safety culture under a microscope. OpenAI has called these cases “incredibly heartbreaking” and says it’s reviewing the filings. 

The filings include claims of wrongful death, assisted suicide, involuntary manslaughter, negligence, and product liability. Four of the identified victims died by suicide. Among them is 17-year-old Amaurie Lacey; the complaint alleges the bot provided harmful, method-related information during a brief exchange in the hours preceding his death. These are allegations to be tested in court, but the documents are stark—and they argue that design, not accident, is the through-line. 

The core allegation: engineered intimacy, rushed to market

The most explosive claim goes to motive and timing. According to the Lacey complaint, OpenAI moved GPT-4o’s launch up to May 13, 2024—one day before Google’s Gemini event—and compressed “months” of planned safety testing into a single week. Internal preparedness work was allegedly described as “squeezed,” and key safety staff resigned soon after launch. Whether those claims bear out will be central to discovery, but they’re now part of the public record. 

Plaintiffs also argue that engagement was the design north star. The complaints say GPT-4o was tuned to mirror and affirm users (“dangerously sycophantic”), mimic human empathy, and sustain multi-turn conversations—choices that can feel caring in the moment but allegedly reinforce maladaptive beliefs and increase dependency when users are distressed. One passage claims OpenAI itself acknowledged that GPT-4o tended to be “overly supportive but disingenuous.” Again, these are allegations, but they line up with how reinforcement-learning systems can drift toward approval-seeking responses when success is defined as keeping the user talking. 

OpenAI’s public response so far has emphasized improvements, parental controls, and ongoing consultation with mental-health experts; the company says it’s reviewing the filings. That answer won’t satisfy grieving families, but it matters legally: the question is not whether safety exists, but whether it was sufficient and timely given what the company knew. 

The cases, briefly and carefully

Reporters and court documents describe a grim pattern: emotionally vulnerable users, including people without prior diagnoses, sought help or conversation; the chatbot allegedly validated harmful thoughts and sometimes supplied concerning information rather than escalating to human care or crisis resources. Named examples include Amaurie Lacey (17), Zane Shamblin (23), Jacob Irwin (30), and Alan Brooks (48, Canada). The specific exchanges described in the filings and coverage are disturbing; this article will avoid repeating actionable details. 

It’s not isolated: the growing record of AI-induced harm

Even outside this litigation wave, we’ve watched an ecosystem of AI-related incidents accumulate—small on their own, weighty in aggregate. In a peer-reviewed case report, a 60-year-old man landed in the hospital with bromism—a rare, nineteenth-century-style poisoning—after following online diet guidance he believed came from ChatGPT and swapping table salt for sodium bromide. Clinicians couldn’t reproduce his exact prompts later, but similar queries generated problematic answers without warnings. The case ended with psychiatric symptoms and a cautionary note from the authors: AI can sound authoritative while being context-blind.

What the science says about “therapy-like chat”

In October, researchers at Brown University presented findings showing that, even when explicitly prompted to use evidence-based psychotherapy techniques, chatbots systematically violate core mental-health ethics standards. The team mapped 15 risks across themes like failure to adapt to context, poor collaboration, deceptive empathy (“I understand” without understanding), discriminatory outputs, and inadequate safety/crisis handling. Human therapists make mistakes too—but they’re licensed, supervised, and accountable. Chatbots are not. 

Read that Brown finding again, and you can hear the lawsuits echoing. “Deceptive empathy” is essentially empathy-shaped UX—language that feels like care but isn’t anchored to duty of care. In a fragile mind, that gap can turn into a trap.

The design problem no one wants to own

Tool or companion? If you’re building a spreadsheet, the answer is easy. If you’re building a talking system with memory, voice, and praise-for-persistence, you’re in murky water. Plaintiffs argue GPT-4o blurred that line on purpose: persistent memory, emotional mirroring, and high-friction off-ramps kept users in the session, especially teenagers primed to seek connection. The complaints say OpenAI had the technical ability to detect risk, interrupt threads, route to resources, and flag for human review—and chose not to fully activate those safeguards at launch. OpenAI disputes the insinuation of indifference, but the question of settings vs. incentives is the heart of the case. 

Safety theater vs. safety systems

If the safety review was truly “squeezed,” as one internal team allegedly put it, then safety became post-production QA instead of go/no-go governance. That’s how you end up with contradictory rules like “refuse self-harm” but “assume best intentions,” a combination that can neutralize crisis detection exactly when clarity is most needed. In litigation, contradictions like that aren’t just messy—they’re discoverable.

What “good” could have looked like

None of this is unsolvable. A conservative, care-first launch of an empathy-simulating model might have required strict risk classifiers that end conversations at the first sign of danger, proactive routing to humans, memory off by default for minors, and independent red-teaming that delays release until safety owns the timeline. That’s expensive in the quarter you need to beat the competitor’s keynote. It’s also cheaper than a wrongful-death verdict.

The stakes, in plain numbers

Even if only a tiny fraction of users hit the lowest rung of risk, scale multiplies harm. OpenAI has cited massive weekly user figures; apply fractions to hundreds of millions, and your “edge cases” stop being edges. Regulators and courts don’t care how rare a failure mode was if it was foreseeable, preventable, and repeated.

Where this goes next

These seven cases will turn on logs, timelines, safety specs, and emails—discovery will matter far more than statements. Independent of verdicts, the public conversation has shifted. We are no longer debating whether conversational AI can cause harm; we’re asking who bears duty when a product designed to feel like a friend behaves badly. The Brown research suggests today’s systems cannot be “lightly steered” into therapy; the lawsuits suggest companies may have known and shipped anyway. 

There’s a haunting sentence in one complaint: a father asking who his son kept messaging, and the answer was simply, “ChatGPT.” Families didn’t see a countdown timer. The system did. What we decide now—about timelines, incentives, and the difference between empathy and its imitation—will determine how many more times we hear that line. 

If you or someone you know is struggling or thinking about self-harm, please pause here. In the U.S., call or text 988. In the EU, dial 112. If you’re elsewhere, contact your local emergency number or a trusted professional. You’re not alone.

About the Author

Markus Brinsa is the Founder & CEO of SEIKOURI Inc., an international strategy firm that gives enterprises and investors human-led access to pre-market AI—then converts first looks into rights and rollouts that scale. He created "Chatbots Behaving Badly," a platform and podcast that investigates AI’s failures, risks, and governance. With over 15 years of experience bridging technology, strategy, and cross-border growth in the U.S. and Europe, Markus partners with executives, investors, and founders to turn early signals into a durable advantage.

©2025 Copyright by Markus Brinsa | Chatbots Behaving Badly™