OpenAI model cracks Erdős unit distance problem
OpenAI announced that a general-purpose internal reasoning model autonomously disproved a central conjecture in the planar unit distance problem, first posed by Paul Erdős in 1946 (@gdb, @openai). For nearly 80 years, mathematicians believed optimal configurations looked roughly like square grids — the model found an entirely new family of constructions that beats that bound (@gdb). Sam Altman framed it as a milestone on a steep curve, noting frontier models hit IMO gold less than a year ago (@sama), echoed by Ethan Mollick's June 2024 → May 2026 timeline from "can't count r's in strawberry" to combinatorial geometry breakthroughs (@emollick). The community discussion on r/singularity and r/mathematics centered on the claim being the "first time AI has autonomously solved a prominent open problem central to a field of mathematics" (last30days, reddit.com).
Crucially, OpenAI emphasized this came from a general-purpose model, not a math-specialized scaffold or Lean-style neurosymbolic system (@openai, @emollick) — though Gary Marcus pushed back that we don't yet know the architecture, training, or out-of-distribution behavior, urging patience over cheerleading (@garymarcus). The published chain-of-thought summary alone runs ~125 pages, with @tszzl flagging that the model describes its key idea on page 39 as "frightening" — the unabridged trace was not released (@tszzl). Mollick did napkin math on the compute footprint: the run took roughly 0.6–6.3 kWh and 3–31 liters of water, "less than three almonds' worth of water" (@emollick).
Altman tied the result to a broader thesis — AGI accelerating research, companies, and individuals — and previewed an offer of $2M in OpenAI credits to every YC company (@sama). Greg Brockman's reaction summed up the mood: "It's very hard to sleep, man" (@gdb).
Cohere ships Command A+ open under Apache 2.0
Cohere released Command A+, billed as its most powerful model yet and released open-source under Apache 2.0 (@cohere via @clementdelangue, @_akhaliq). The architecture is a 218B-parameter MoE with 25B active, multimodal, covers 48 languages, and is engineered to run on as little as 2× H100s at W4A4 quantization (@vllm_project). vLLM shipped day-0 support (@vllm_project), and Sebastian Raschka highlighted the model's parallel-block transformer design from the tech report, which Cohere claims delivers "equivalent performance but significant improvement in throughput" versus a vanilla transformer block (@rasbt).
The framing is explicitly sovereign-AI and enterprise-agentic: Cohere wants developers running production agentic workloads on open weights rather than locked behind APIs (@_akhaliq). External coverage of the launch flagged lossless quantization and native citations as standout features (last30days, reddit.com), though independent third-party benchmarks remained thin in the first 24 hours.
GitHub breach, NGINX RCE, and a rough day for patch cycles
Security feeds were dense. GitHub confirmed its internal repositories were breached after an employee installed a poisoned Nx Console VS Code extension; the TeamPCP group exfiltrated roughly 3,800 repos in an 18-minute window using a credential stealer targeting 1Password, GitHub tokens, and AWS keys (@thehackersnews). An updated tally suggests over 6,000 potential installs of the malicious v18.95.0, with TeamPCP now allegedly listing ~4,000 repos for sale at $50K+ and a "Mini Shai-Hulud" worm hitting Microsoft's durabletask PyPI package (@thehackersnews).
On the CVE side: a public PoC for NGINX Rift (CVE-2026-42945) now achieves a full ASLR bypass and unauthenticated RCE on rewrite/set directive setups (@thehackersnews); a 9-year-old Linux kernel bug (CVE-2026-46333) gives local root on default Debian, Ubuntu, and Fedora (@thehackersnews); Microsoft mitigated a BitLocker bypass (CVE-2026-45585) via WinRE (@thehackersnews); and Drupal patched a critical PostgreSQL-path flaw enabling unauthenticated RCE (@thehackersnews). Microsoft also disrupted Fox Tempest's $5K–$9K malware-signing-as-a-service ring (@thehackersnews) and open-sourced RAMPART and Clarity for testing AI agent safety against prompt injection and data exfiltration (@thehackersnews). All of this lands against a Reddit r/cybersecurity thread noting mean time-to-exploit has compressed to 2.1 days, with practitioners arguing monthly patch cycles are now obsolete (last30days, reddit.com).
Hugging Face hardware reality check and Personal AI hardware
Hugging Face launched a Hardware leaderboard surfacing what GPUs and CPUs the open-source community actually runs — VRAM distribution, inference trends, and the real OSS stack rather than vendor marketing (@huggingface, @_akhaliq). Early reactions noted the surprising prevalence of RTX 3060s alongside 3090s among power users (@huggingface). HF also shipped dataset-leaderboard filtering by parameter range — e.g., best model under 32B on SWE-bench (@_akhaliq) — and Clément Delangue announced Carbon, a frontier open-weights DNA base model (@clementdelangue). Stability AI released SAME, a music autoencoder with 4096× compression (@_akhaliq).
In parallel, AMD and Hugging Face pitched local-first Personal AI: Ryzen AI Halo developer systems and Gorgon Halo with up to 192GB unified memory for running 300B+ parameter models locally (@clementdelangue). Thomas Wolf's broader essay on monoliths returning as code-rewrite costs collapse circulated alongside (@huggingface).
Gemini 3.5 Flash, Antigravity, and the agent stack
Google DeepMind shipped Gemini 3.5 Flash (@googledeepmind) and rolled out Science Skills for the Antigravity agent harness, integrating 30+ life-science sources including UniProt and AlphaFold (@googledeepmind). Simon Willison flagged ambiguity around whether "Antigravity" is now Google's generic name for agent harnesses or specifically their Claw competitor (@simonw). Philipp Schmid previewed Gemini's isolated Linux sandbox API, where the model reasons, runs code, browses, and manages files in one call (@_philschmid). Anthropic is scaling GB200 capacity in SpaceX's Colossus 2 through June (@claudedevs), and shipped guidance on making Claude's computer-use reliable in production (@bcherny). On the indie agent front, NetworkChuck moved his Open Claw agents to Nous Research's Hermes (@networkchuck).
Economics, energy, and the AI labor debate
Exa raised $250M at a $2.2B valuation from a16z, pitching itself as the search layer for agents — now serving Cursor, Cognition, OpenRouter, and 5,000+ companies, with claims of 90% less returned text at comparable RAG quality (@swyx). Ramp's Ara Kharazian poured cold water on the "SaaSpocalypse" narrative: Anthropic is overtaking OpenAI and Cursor is overtaking GitHub Copilot in business spend, but the broader claim that AI is killing SaaS isn't in the data yet (@arakharazian). A PFF case study claimed 2 engineers outshipped a team of 10 by 5×/day vs once-every-5-days, with complexity-weighted output at ~10× (@aidotengineer).
Energy and society threads ran hot: Mollick estimated AI could draw electricity equivalent to Japan by 2030, with water remaining under 1% of US use but locally straining (@emollick). Gary Marcus floated an LLMs-as-Vietnam analogy — massive sunk investment, mounting student skepticism (@garymarcus) — while Eliezer Yudkowsky argued the recurring "skilled prompters will keep their jobs" trope misses how each model generation erases the prior generation's prompt-craft moat (@esyudkowsky). On the research side, Jeremy Howard's group released MINTEval, a benchmark for long-context memory and interference across ~86 updates per task (@jeremyphoward).
The Bottom Line
Today was dominated by OpenAI's claim of the first autonomous AI solution to a prominent open math problem, with most of AI Twitter treating it as an inflection while a minority urged caution pending architectural details. Underneath the headline, Cohere's Apache-2.0 Command A+, Gemini 3.5 Flash, Hugging Face's open-hardware/biology pushes, and a brutal day of supply-chain breaches kept the open-vs-closed and agent-safety threads moving in parallel.