Claude Code Opus 4.5 daily benchmarks track performance on SWE tasks, detecting statistically significant degradations over 30 days.
// curated from Hacker News with AI
US cybersecurity chief leaked sensitive files to ChatGPT, triggering alerts and a federal review.
Tech market rot predated AI; layoffs signal financial toxicity, excess hiring, and fragile valuation tactics, not AI's fault.
Moltworker enables running Moltbot AI across Cloudflare Workers, using Sandboxes, R2, and AI Gateway for secure, scalable personal AI.
AI models struggle with OpenTelemetry tracing tasks, achieving only 29% success, highlighting challenges in automating distributed system observability.
Mozilla is building a rebel alliance with startups to promote open, trustworthy AI and challenge dominant companies like OpenAI and Anthropic.
AI may reshape engineering jobs, automating routine tasks, enabling faster learning, and allowing new grads to start higher.
White House uses AI for provocative memes: political trolling, deepfakes, and propaganda, exposing AI’s role in mainstream misinformation.
SpaceX considers merging with xAI, potentially consolidating Musk’s ventures and boosting space-based AI and data centers, with IPO plans in motion.
Microsoft beats earnings but stock drops 11% over concerns about slowing cloud growth and rising AI spending.
Google introduces Gemini AI in Chrome for multitasking, image transformation, app integration, personal intelligence, and agentic browsing.
Cryptographic warrants provide verifiable, scope-limited authorization for AI actions, improving accountability over logs alone.
OpenAI’s GPT-5 shows that AI model development is currently loss-making, but long-term profitability depends on sustained growth, innovation, and market dynamics.
Deep learning game AI runs on a Motorola 6809 8-bit CPU, reaching GNU Go level on a Thomson MO5 microcomputer.
Google adds Gemini AI sidebar to Chrome for automated browsing, multitasking, and shopping, with plans for personalized, context-aware AI assistance.
Agent-shell: Emacs-based interface for ACP-powered LLM agents like GPT, Claude, Mistral; supports various agents, setup, customization, and traffic monitoring.
Apple buys secretive AI startup Q.ai for $2B, boosting facial and silent communication tech for future wearable and device integration.
Security flaw exposed children's chat data from AI toy Bondu, risking privacy, safety, and potential misuse of sensitive information.
Anthropic's Claude is treated as a potential sentient being, blending strategic ambiguity with ethical questions about AI consciousness and responsibility.
Critique of the TDD cult highlights its psychological appeal and limitations, warning that AI coding agents may amplify false confidence and shallow success.
Open-source macOS tool transcribes videos from YouTube, TikTok, Instagram, and local files, building an auto-organizing knowledge base.
Apple acquires Israeli startup Q.ai for nearly $2B to advance AI device development.
Apple invests $2B in Israeli AI facial tracking startup Q.ai, potentially enhancing future devices like AirPods and FaceTime.
Waymo launches autonomous rides at SFO, enhancing airport travel with safe, convenient, and scalable service for Bay Area travelers.
Meta's Code World Model (CWM) is a 32-billion-parameter open-source large language model for code generation, reasoning, and system interaction.
Nvidia co-designed AI with DeepSeek, bypassing export controls, enabling China's access to advanced chips and undermining US restrictions.
AI speeds coding but may hinder skill development, especially when relied on too heavily or without active understanding.
AI monitoring system autonomously recovers long-running distributed training jobs on Kubernetes, Slurm, or TensorPool, avoiding GPU waste.
AI model profitability is complex; GPT-5 likely runs at a loss, but overall AI growth and strategic moves suggest long-term profitability potential.
Elon Musk’s SpaceX, Tesla, and xAI are discussing potential mergers, possibly combining space, AI, and EV businesses under one entity.
Apple acquires Israeli AI startup Q.ai, focusing on advanced machine learning for improved audio, communication, and device interaction.
AI partner Einsia reduces LaTeX hassle in Overleaf, letting researchers focus on research instead of formatting and debugging.
Open-source, lightning-fast in-process vector database built on Alibaba's Proxima for scalable similarity search.
Bank of America struggles with Nvidia AI deployment due to regulatory and operational challenges and a lack of in-house MLOps skills.
Microsoft's AI momentum faded as Nadella's initial lead waned, shifting investments toward Anthropic and OpenAI's cloud rivals.
XAI launches 1M digital workers for scalable, affordable automation of tasks like data entry, customer support, and software coding.
Open-source project reveals 57.2% AI-assisted PRs, mainly for refactoring and features; collaboration over full automation dominates.
Amazon found and reported hundreds of thousands of pieces of child abuse material in its AI training data, but source details remain undisclosed.
Toolkit for evaluating frontier AI agents through ARC-AGI-3, covering exploration, memory, and goal assessment in new environments.
Credyt offers real-time, usage-based billing for AI companies, enabling flexible pricing, instant top-ups, and transparent customer balances.
AI benchmarks vary in focus, from bug fixing to real-world tasks; scores need careful interpretation to gauge true AI capabilities.
A visual, step-by-step guide to understanding zero-knowledge proofs, generating real Groth16 proofs in-browser for educational purposes.
News publishers limit Internet Archive access to prevent AI scraping, seeking to protect copyrighted content amid growing concerns.
Experts warn AI bot swarms threaten democracy by manipulating opinion, infiltrating communities, and potentially disrupting elections, demanding global countermeasures.
AI swarms may craft synthetic consensus online, threatening democracy by mimicking real activities and influencing public opinion at scale.