AI Wire

OpenAI model cracks Erdős unit distance problem

OpenAI announced that a general-purpose internal reasoning model autonomously disproved a central conjecture in the planar unit distance problem, first posed by Paul Erdős in 1946 (@gdb, @openai). For nearly 80 years, mathematicians believed optimal configurations looked roughly like square grids; the model found an entirely new family of constructions that outperforms the grid-based ones (@gdb). Sam Altman framed it as a milestone on a steep curve, noting that frontier models hit IMO gold less than a year ago (@sama). Ethan Mollick echoed the point with a June 2024 → May 2026 timeline running from "can't count r's in strawberry" to combinatorial geometry breakthroughs (@emollick). Discussion on r/singularity and r/mathematics centered on the claim that this is the "first time AI has autonomously solved a prominent open problem central to a field of mathematics" (last30days, reddit.com).

Crucially, OpenAI emphasized that the result came from a general-purpose model, not a math-specialized scaffold or a Lean-style neurosymbolic system (@openai, @emollick). Gary Marcus pushed back that we don't yet know the architecture, training, or out-of-distribution behavior, urging patience over cheerleading (@garymarcus). The published chain-of-thought summary alone runs ~125 pages, and the unabridged trace was not released; @tszzl flagged that on page 39 the model describes its key idea as "frightening" (@tszzl). Mollick did napkin math on the compute footprint: the run took roughly 0.6–6.3 kWh of electricity and 3–31 liters of water, "less than three almonds' worth of water" (@emollick).
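A quick way to sanity-check the napkin math is to compare the two quoted ranges. The sketch below assumes the low ends (0.6 kWh, 3 liters) and the high ends (6.3 kWh, 31 liters) belong together, which the source does not state; the variable names are illustrative.

```python
# Ranges quoted in the digest. Pairing low-with-low and high-with-high
# is an assumption, not something the source spells out.
energy_kwh = (0.6, 6.3)
water_liters = (3.0, 31.0)

# Implied water use per kWh at each end of the range.
liters_per_kwh = tuple(w / e for w, e in zip(water_liters, energy_kwh))

for label, ratio in zip(("low", "high"), liters_per_kwh):
    print(f"{label} end: {ratio:.2f} L per kWh")
```

Under that pairing, both ends come out near 5 liters per kWh, which would be consistent with the water range having been derived from the energy range using a roughly fixed conversion factor.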

Altman tied the result to a broader thesis — AGI accelerating research, companies, and individuals — and previewed an offer of $2M in OpenAI credits to every YC company (@sama). Greg Brockman's reaction summed up the mood: "It's very hard to sleep, man" (@gdb).

Cohere ships Command A+ open under Apache 2.0

Cohere released Command A+, billed as its most powerful model yet, as open source under Apache 2.0 (@cohere via @clementdelangue, @_akhaliq). The model is a multimodal 218B-parameter MoE with 25B active parameters; it covers 48 languages and is engineered to run on as few as 2× H100s at W4A4 quantization (@vllm_project). vLLM shipped day-0 support (@vllm_project), and Sebastian Raschka highlighted the parallel-block transformer design described in the tech report, which Cohere claims delivers "equivalent performance but significant improvement in throughput" versus a vanilla transformer block (@rasbt).
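The 2× H100 figure can be checked with back-of-the-envelope arithmetic. This is a minimal sketch, not Cohere's or vLLM's own sizing: it assumes weights are stored at exactly 4 bits each (the "W4" in W4A4), that each H100 carries 80 GB of memory (the source does not name the variant), and it ignores quantization scales, KV cache, and activations. The function and constant names are illustrative.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB, using 1 GB = 1e9 bytes."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

TOTAL_PARAMS_B = 218   # total parameters, from the announcement
H100_MEMORY_GB = 80    # assumption: 80 GB per GPU
NUM_GPUS = 2

budget_gb = H100_MEMORY_GB * NUM_GPUS
w4_gb = weight_memory_gb(TOTAL_PARAMS_B, 4)     # 4-bit weights
fp16_gb = weight_memory_gb(TOTAL_PARAMS_B, 16)  # unquantized 16-bit weights

print(f"memory budget:  {budget_gb} GB")
print(f"4-bit weights:  {w4_gb:.0f} GB (headroom {budget_gb - w4_gb:.0f} GB)")
print(f"16-bit weights: {fp16_gb:.0f} GB (fits: {fp16_gb <= budget_gb})")
```

Under these assumptions the 4-bit weights take about 109 GB of a 160 GB budget, while 16-bit weights would need about 436 GB. Note that all 218B parameters must be resident even though only 25B are active per token, so the MoE design helps throughput rather than memory footprint.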

The framing is explicitly sovereign-AI and enterprise-agentic: Cohere wants developers running production agentic workloads on open weights rather than locked behind APIs (@_akhaliq). External coverage of the launch flagged lossless quantization and native citations as standout features (last30days, reddit.com), though independent third-party benchmarks remained thin in the first 24 hours.

GitHub breach, NGINX RCE, and a rough day for patch cycles

Security feeds were dense. GitHub confirmed that its internal repositories were breached after an employee installed a poisoned Nx Console VS Code extension; the TeamPCP group exfiltrated roughly 3,800 repos in an 18-minute window using a credential stealer targeting 1Password, GitHub tokens, and AWS keys (@thehackersnews). An updated tally suggests over 6,000 potential installs of the malicious v18.95.0. TeamPCP is now allegedly listing ~4,000 repos for sale at $50K+, and a "Mini Shai-Hulud" worm has hit Microsoft's durabletask PyPI package (@thehackersnews).

On the CVE side: a public PoC for NGINX Rift (CVE-2026-42945) now achieves a full ASLR bypass and unauthenticated RCE on rewrite/set directive setups (@thehackersnews); a 9-year-old Linux kernel bug (CVE-2026-46333) gives local root on default Debian, Ubuntu, and Fedora (@thehackersnews); Microsoft mitigated a BitLocker bypass (CVE-2026-45585) via WinRE (@thehackersnews); and Drupal patched a critical PostgreSQL-path flaw enabling unauthenticated RCE (@thehackersnews). Microsoft also disrupted Fox Tempest's $5K–$9K malware-signing-as-a-service ring (@thehackersnews) and open-sourced RAMPART and Clarity for testing AI agent safety against prompt injection and data exfiltration (@thehackersnews). All of this lands against a Reddit r/cybersecurity thread noting mean time-to-exploit has compressed to 2.1 days, with practitioners arguing monthly patch cycles are now obsolete (last30days, reddit.com).

Hugging Face hardware reality check and Personal AI hardware

Hugging Face launched a Hardware leaderboard surfacing what GPUs and CPUs the open-source community actually runs — VRAM distribution, inference trends, and the real OSS stack rather than vendor marketing (@huggingface, @_akhaliq). Early reactions noted the surprising prevalence of RTX 3060s alongside 3090s among power users (@huggingface). HF also shipped dataset-leaderboard filtering by parameter range — e.g., best model under 32B on SWE-bench (@_akhaliq) — and Clément Delangue announced Carbon, a frontier open-weights DNA base model (@clementdelangue). Stability AI released SAME, a music autoencoder with 4096× compression (@_akhaliq).

In parallel, AMD and Hugging Face pitched local-first Personal AI: Ryzen AI Halo developer systems and Gorgon Halo with up to 192GB of unified memory for running 300B+ parameter models locally (@clementdelangue). Thomas Wolf's broader essay, which argues that monoliths are returning as code-rewrite costs collapse, circulated alongside (@huggingface).

Gemini 3.5 Flash, Antigravity, and the agent stack

Google DeepMind shipped Gemini 3.5 Flash (@googledeepmind) and rolled out Science Skills for the Antigravity agent harness, integrating 30+ life-science sources including UniProt and AlphaFold (@googledeepmind). Simon Willison flagged ambiguity around whether "Antigravity" is now Google's generic name for agent harnesses or specifically their Claw competitor (@simonw). Philipp Schmid previewed Gemini's isolated Linux sandbox API, where the model reasons, runs code, browses, and manages files in one call (@_philschmid). Anthropic is scaling GB200 capacity in SpaceX's Colossus 2 through June (@claudedevs), and shipped guidance on making Claude's computer-use reliable in production (@bcherny). On the indie agent front, NetworkChuck moved his Open Claw agents to Nous Research's Hermes (@networkchuck).

Economics, energy, and the AI labor debate

Exa raised $250M from a16z at a $2.2B valuation, pitching itself as the search layer for agents. It now serves Cursor, Cognition, OpenRouter, and 5,000+ companies, and claims 90% less returned text at comparable RAG quality (@swyx). Ramp's Ara Kharazian poured cold water on the "SaaSpocalypse" narrative: Anthropic is overtaking OpenAI and Cursor is overtaking GitHub Copilot in business spend, but the broader claim that AI is killing SaaS isn't in the data yet (@arakharazian). A PFF case study claimed that 2 engineers outshipped a team of 10, shipping 5 times a day versus once every 5 days, with complexity-weighted output at ~10× (@aidotengineer).
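The PFF cadence figures (five ships a day for the 2-engineer team against one ship every five days for the team of 10) can be unpacked with simple arithmetic. This is a minimal sketch: it assumes both rates count the same kind of "ship" over the same period, which the source does not define, and the variable names are illustrative.

```python
from fractions import Fraction

# Ships per day, as quoted in the case study.
small_team_rate = Fraction(5, 1)   # 2 engineers, 5 ships per day
large_team_rate = Fraction(1, 5)   # 10 engineers, 1 ship per 5 days

# Raw ship-frequency ratio between the two teams.
frequency_ratio = small_team_rate / large_team_rate

# The same ratio normalized per engineer.
per_engineer_ratio = (small_team_rate / 2) / (large_team_rate / 10)

# The source's complexity-weighted figure (~10x) implies the small
# team's ships are smaller on average, if both ratios are comparable.
CLAIMED_WEIGHTED_RATIO = Fraction(10)  # approximate, from the source
implied_relative_ship_size = CLAIMED_WEIGHTED_RATIO / frequency_ratio

print(f"ship frequency ratio:       {float(frequency_ratio):.0f}x")
print(f"per-engineer ratio:         {float(per_engineer_ratio):.0f}x")
print(f"implied relative ship size: {float(implied_relative_ship_size):.1f}")
```

Read this way, the raw cadence gap is 25× (125× per engineer), so a complexity-weighted gap of ~10× would mean each of the small team's ships carries roughly 0.4 of the weighted complexity of the large team's.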

Energy and society threads ran hot: Mollick estimated that by 2030 AI could draw as much electricity as Japan, with water use remaining under 1% of the US total but straining some local supplies (@emollick). Gary Marcus floated an LLMs-as-Vietnam analogy, citing massive sunk investment and mounting student skepticism (@garymarcus), while Eliezer Yudkowsky argued that the recurring "skilled prompters will keep their jobs" trope misses how each model generation erases the prior generation's prompt-craft moat (@esyudkowsky). On the research side, Jeremy Howard's group released MINTEval, a benchmark for long-context memory and interference across ~86 updates per task (@jeremyphoward).

The Bottom Line

Today was dominated by OpenAI's claim of the first autonomous AI solution to a prominent open math problem, with most of AI Twitter treating it as an inflection point while a minority urged caution pending architectural details. Underneath the headline, Cohere's Apache-2.0 Command A+, Gemini 3.5 Flash, Hugging Face's open-hardware and biology pushes, and a brutal day of supply-chain breaches kept the open-vs-closed and agent-safety threads moving in parallel.

Dispatch № 28 · Filed Thursday at dawn from Pensive — a second-brain publication.
Set in Bevan, Old Standard TT, Cormorant Garamond & Courier Prime.

