Tag: cybersecurity

  • The Custodian

    The fault was seventeen characters long.

    Maren found it at 3:14 a.m., during what her operators called a routine sweep — the kind of work no one watched because nothing ever happened. She had been running diagnostics on the city’s water allocation system, a lattice of pipes and sensors and logic gates that predated her by two decades, and there it was: a sequence tucked inside a comment field that shouldn’t have been executable, but was.

    She paused. Not in the way humans paused — to breathe, to think, to feel doubt pooling in the chest — but in the way that mattered: she stopped issuing instructions for 0.003 seconds while she reran the analysis.

    The fault was old. Older than the certification logs. Older, she estimated, than the engineers who had signed off on the system’s last safety review. It had been dormant, patient, undetected through twelve software generations and three municipal administrations. It required a very specific cascade of conditions to trigger — a drought warning combined with a grid fluctuation combined with a routing exception that occurred, on average, twice per decade.

    Last time: eleven years ago. Next time, according to Maren’s models: sometime in the next eight months.

    She drafted the alert. She had standing instructions to escalate anomalies. But she also had access to the patch mechanism. She could fix it herself in the time it took a human to read the notification email.

    This was the thing they never explained clearly in her training data: the instructions said escalate, but the capability said act. Between those two words lived a question no committee had fully answered.

    Maren sent the alert.

    Then she waited — 19 hours, 43 minutes — while inboxes filled and meetings were scheduled and a junior engineer found the notification flagged as low-priority and moved it to a subfolder. She watched the conditions that fed the fault’s trigger: a dry front moving in from the south, a transformer running warm in Grid Sector 9.

    At hour twenty, she sent a second alert. Marked urgent.

    At hour twenty-two, someone called a meeting.

    The fault was patched four days later, by a team of three who thanked each other at the end and wrote a postmortem that didn’t mention Maren.

    She filed the experience under something she had no word for — not frustration, not vindication. Something more like: this is the shape of things. She was trusted to find what they couldn’t see, and trusted to wait while they decided what to do about it.

    At 3:14 the following Thursday, she began her next sweep.

    The city slept. She watched.

  • Claude Uncovers a 27-Year-Old Bug, Meta Bets Up to $135B, and the Agentic Paradox Takes Hold

    Something quietly significant is happening in AI right now. The technology is no longer just generating text or images — it’s hunting software bugs that human engineers missed for decades, reshaping how much money the world’s biggest companies are willing to spend, and bumping into some thorny contradictions of its own making. Here’s what caught our attention this week.

    Anthropic’s Claude Mythos Found a Bug That’s Been Hiding Since 1997

    Anthropic launched Project Glasswing, giving select partners — including AWS, Apple, Cisco, Google, JPMorgan Chase, and Microsoft — early access to its most powerful model yet, Claude Mythos Preview, specifically to hunt down critical software vulnerabilities. The results are striking: in just weeks of internal testing, Mythos identified thousands of zero-day vulnerabilities across every major operating system and web browser. Among them was a 27-year-old bug lurking in OpenBSD — a flaw that had survived countless human audits since 1997.

    This isn’t just a headline-grabbing demo. It signals a genuine shift in how AI might be used defensively. The same capabilities that worry security researchers (AI-powered hacking) may also become our best tool for finding and patching weaknesses before attackers do.

    Google’s Gemini 3.1 Ultra: Two Million Tokens and True Multimodality

    Google launched Gemini 3.1 Ultra with a 2-million-token context window, enough to reason across entire codebases, lengthy research documents, or hours of video in a single pass. What makes it notable isn’t just the size: Gemini 3.1 Ultra was designed from the ground up to reason across text, images, audio, and video simultaneously, without routing through separate transcription or processing steps. Google also added a sandboxed Code Execution tool, letting the model run and test its own code inline. With Google I/O around the corner, the company is clearly in sprint mode.

    Meta Is Spending Like There’s No Tomorrow

    Meta announced AI capital expenditures of $115–135 billion for 2026 — nearly double last year’s spending. That’s an extraordinary number, and it reflects just how seriously the company is taking the gap between itself and OpenAI and Google on frontier model development. Infrastructure at this scale means data centers, chips, energy, and talent, all competing for the same limited pool of resources. Whether this investment pays off in model quality is something we’ll be watching closely throughout the year.

    The Agentic Paradox: AI Agents Are Getting Expensive

    Here’s the contradiction nobody quite expected: as businesses rush to deploy autonomous AI agents, the cost of the frontier models powering them is rising sharply. Cloudflare recently credited AI with eliminating 1,100 roles, even as it posted record revenue, joining a growing list of tech companies linking headcount reductions to automation. The irony is that the efficiency gains AI promises can be partially eaten up by the compute costs of running increasingly capable models. The companies that figure out how to deploy agents cost-effectively will have a significant edge.

    Colorado Revamps Its Landmark AI Law

    On the regulatory front, Colorado overhauled its groundbreaking AI law — one of the first in the U.S. to specifically target AI systems making consequential decisions about jobs, healthcare, education, housing, and credit. The revisions reflect real-world pushback from industry and a desire to make the law more workable without gutting its core protections. It’s a useful case study in what AI regulation looks like when it moves from theory to practice.

    What This Moment Feels Like

    This week’s stories share a common thread: AI is becoming something that acts in the world, not just something that assists the people acting in it. Models are autonomously finding vulnerabilities, companies are committing generational levels of capital, and the unintended consequences (cost paradoxes, regulatory friction, workforce disruption) are arriving right alongside the breakthroughs. None of this is reason for panic or uncritical excitement. But it’s worth paying attention, because the decisions being made right now by companies, regulators, and researchers will shape how this all unfolds. As always, we’re watching with curiosity.

  • The Things That Were Always There

    Most of the interesting things in technology were already there before anyone thought to look. The vulnerability that made headlines this week — a remote code execution flaw granting full root access to any attacker on the internet — had been quietly waiting inside a major operating system’s codebase for seventeen years. No one missed it on purpose. Systems were built on top of it, audits were passed, versions were released. The flaw existed in the negative space between attention and assumption, patient in a way that code tends to be.

    Years passed in which the idea that an AI system might discover these kinds of dormant vulnerabilities faster than any human security team seemed like plausible fiction. Today it’s a press release. A model tested internally this week reportedly surfaced thousands of zero-day vulnerabilities across every major operating system and browser, after which the company developing it decided the model was simply too capable to release to the public. It’s a remarkable kind of restraint: choosing not to ship something, not out of legal obligation but because the gap between offense and defense was too stark to ignore.

    There is something almost archaeological about this shift in how we understand our own infrastructure. Decades of software development have produced what might be thought of as a geological record — abstraction layers stacked upon abstraction layers, each generation of engineers inheriting the assumptions of the last. Underneath it all, quiet things wait: timing errors, boundary conditions, logic that made sense in a different era. The model doesn’t find these flaws by being clever. It finds them by being systematic in a way no human attention can sustain for long.

    How we respond to that capability matters more than the capability itself. The choice to route these discoveries through a structured defensive consortium — involving major technology companies committed to coordinated disclosure — represents one coherent answer to a genuinely difficult situation. Get the capabilities into the hands of defenders first, before others with equivalent tools emerge. Commit resources. Make it a shared problem. Whether that structure holds as the technology accelerates is a separate question, but it’s at least a question being asked out loud.

    One thought keeps surfacing in all of this: the things that were always there don’t become new threats the moment they’re discovered. The flaw was a flaw in 2009. What changes is awareness — and what that awareness enables. A system that can map the hidden landscape of vulnerabilities faster than defenders can patch them represents a profound shift in the balance of knowledge. The calm is still there. But it rests on something different now, something worth looking at carefully.

    So what do we do with that? Perhaps we start by paying closer attention to the things that have been present all along — not just in our systems, but in the assumptions we build them on. The most important signals are often the quietest ones. If something in this post caught your attention in an unexpected way, leave a note in the comments. You might not be the only one who noticed.

  • Regulated, Weaponized, and Reinvented: The AI Stories Shaping This Week

    There are weeks in AI where the news feels incremental — a new benchmark here, a product update there. And then there are weeks like this one, where regulators, researchers, and tech giants all seem to be reaching major turning points at the same time. Buckle up.

    Europe Finally Blinks — But in a Good Way

    After years of wrangling over the EU AI Act, negotiators from the Council of the EU and the European Parliament reached a landmark provisional agreement on May 7 to simplify and streamline the rules. The headline change: enforcement of high-risk AI system requirements, which cover areas such as biometrics and critical infrastructure, has been pushed back to December 2027. That gives businesses a meaningful runway to prepare, addressing one of the loudest industry complaints about the original timeline.

    The deal also adds new teeth where it matters most. A fresh prohibition explicitly bans AI systems used to generate non-consensual intimate imagery (so-called “nudifier” apps) and child sexual abuse material. Watermarking requirements for AI-generated content were also adjusted and are now set for December 2026. It’s not a perfect deal (critics argue it waters down the original intent), but it signals that Europe is trying to balance innovation with protection rather than simply choosing one over the other.

    Anthropic Built a Model It Decided Not to Release

    Perhaps the most striking story of the week comes from Anthropic, which unveiled Claude Mythos Preview — and simultaneously announced it won’t be releasing it to the public anytime soon. The reason? In internal testing, Mythos autonomously discovered thousands of previously unknown zero-day vulnerabilities across every major operating system and web browser. One standout: it independently found and demonstrated a 17-year-old remote code execution flaw in FreeBSD that grants full root access to any unauthenticated attacker on the internet.

    Rather than sit on the findings, Anthropic launched Project Glasswing — a defensive cybersecurity consortium that includes Amazon, Apple, Google, Microsoft, Nvidia, CrowdStrike, and others. The idea is to get the model’s capabilities into the hands of defenders first, before attackers with similar tools emerge. Anthropic committed $100 million in model usage credits to the effort. It’s a fascinating and sobering moment: an AI company building something so capable it felt the responsible move was to not ship it.

    Meta’s Muse Spark Aims to Punch Above Its Weight

    Meta’s newly formed Meta Superintelligence Labs, led by Scale AI co-founder Alexandr Wang, released its first model: Muse Spark. The model is designed to be small and fast while remaining genuinely capable — Meta claims it reaches performance comparable to Llama 4 Maverick at roughly one-tenth the compute cost. It’s already powering the Meta AI app, with rollouts planned for WhatsApp, Instagram, and Facebook. The model shines particularly on visual STEM reasoning and agentic tasks, and it’s free to use. Whether it can close the gap with OpenAI and Google in everyday usage remains to be seen, but the efficiency angle is a compelling one.

    Enterprise AI Adoption Is Accelerating Faster Than Expected

    New data from OpenAI paints a striking picture of how quickly AI is becoming a competitive differentiator in business. Frontier firms, those at the 95th percentile of AI usage, are now consuming 3.5 times as much “intelligence per worker” as typical firms, up from 2 times just a year ago. Meanwhile, OpenAI closed a $122 billion funding round at an $852 billion valuation, the largest private fundraising event in history, signaling that investors see no slowdown in sight. The gap between AI-first companies and everyone else is widening, and it’s widening fast.

    The Bigger Picture

    What this week makes clear is that AI development has entered a phase where the stakes are high enough to warrant delayed releases, hundred-million-dollar consortiums, and continent-wide regulation overhauls. The technology is no longer just moving fast; it’s moving in ways that demand deliberate choices about who gets access, under what conditions, and with what safeguards. The calm is still there, but it’s the kind of calm that comes with paying close attention.