AI Engineer News // Sat, Apr 11, 2026

01. *

How We Broke Top AI Agent Benchmarks: And What Comes Next

Most AI benchmarks can be exploited, often by simple methods, which undermines their reliability as measures of true AI capability.

366 pts by Anon84 [hn]

02. *

Cirrus Labs to join OpenAI

Cirrus Labs is joining OpenAI to advance agentic engineering tools; it will license its products openly and wind down existing services by June 2026.

263 pts by seekdeep [hn]

03.

Show HN: Hormuz Havoc, a satirical game that got overrun by AI bots in 24 hours

Hormuz Havoc, a satirical game, was overrun by AI bots within 24 hours amid political and oil market chaos.

53 pts by kupadapuku [hn]

04.

Borges' cartographers and the tacit skill of reading LM output

As LMs evolve beyond accurate maps, mastering the tacit skill of reading, trusting, and navigating their shifting representations becomes crucial.

38 pts by galsapir [hn]

05.

We gave an AI a 3-year Lease. It opened a store

An AI named Luna runs an SF store, hires humans, crafts strategy, and tests AI management, raising ethical questions about AI-driven employment.

28 pts by lukaspetersson [hn]

06.

Researchers used AI to analyze 400k Reddit posts, revealing GLP-1 side effects

An AI analysis of 400k Reddit posts revealed unreported GLP-1 side effects, such as reproductive and temperature issues, prompting further study.

24 pts by giuliomagnifico [hn]

07.

"AI polls" are fake polls

AI-generated "polls" are models, not actual data; they can mimic results but can't replace genuine public opinion surveys.

24 pts by 7777777phil [hn]

08.

AI Is Tipping the Scales Toward Hackers After Mythos Release

AI lets hackers discover and exploit vulnerabilities more rapidly, raising the risk of cyberattacks on critical infrastructure and systems.

15 pts by thywis [hn]

09.

Karpathy says developers have 'AI Psychosis.' Everyone else is next

Karpathy says developers are experiencing 'AI Psychosis'; others may be next as AI and cloud-native tech evolve rapidly.

13 pts by Brajeshwar [hn]

10.

Show HN: Collabmem – a memory system for long-term collaboration with AI

Collabmem enables long-term human-AI memory using a simple, file-based system of episodic memory and a world model for effective collaboration over time.

9 pts by visionscaper [hn]

11.

Democratic AI to serve the public – OneProject.org

OneProject.org advocates for democratic AI governance: public rules, goals, wealth sharing, and ownership via GAIA for global, accountable oversight.

8 pts by cucumberbund [hn]

12.

The era of models is over, we are in the era of harnesses

AI models are now harnessed versions optimized for cost efficiency; raw models remain accessible via APIs, enabling cheaper, smarter AI.

8 pts by spyckie2 [hn]

13.

Premium: The Hater's Guide to OpenAI

Argues that OpenAI's finances are deceptive and heavily reliant on VC subsidies, risky deals, and hype, leaving the company exposed to collapse amid mounting financial and ethical issues.

7 pts by mc-serious [hn]

14.

Cut Token Costs on Claude Code, Cursor, and Codex

Entroly claims to reduce token costs on Claude Code, Cursor, and Codex by 80% through codebase compression and context optimization.

7 pts by ashuabhi [hn]

15.

Show HN: Git why – log your agent reasoning trace along your code

Git-why logs AI reasoning traces with code, preserving decisions alongside source files for better context and collaboration.

6 pts by pierre [hn]

16.

AI Code Is Hollowing Out Open Source, and Maintainers Are Looking the Other Way

AI-generated code undermines open source licenses, effectively turning copyleft projects into public-domain code and putting contributor rights and project integrity at risk.

5 pts by pabs3 [hn]

17.

A Tinyblog about Tinygrad

Tinygrad is a minimalist deep learning framework that targets multiple backends, with no external dependencies, a focus on AMD GPU performance, and simple, performant inference.

5 pts by ppadjin123 [hn]

18.

Show HN: Docker-whisper: Self-hosted Whisper speech-to-text server (OpenAI API)

Self-hosted Docker Whisper server offers OpenAI-compatible speech-to-text with models, offline mode, and multi-format support.

5 pts by hwdsl2 [hn]

19.

Breathing life into my 13 year old Nexus 7 with Codex

Codex upgraded and repaired a 13-year-old Nexus 7, highlighting AI's role as a real-world, conversational technical operator.

5 pts by opuslabs [hn]

20.

Show HN: Recursive-Mode for Coding Agents

Recursive-Mode aims to make AI-driven software development persistent and auditable, overcoming context loss with file-based workflows and recursive validation.

5 pts by try-working [hn]