AI Week In Review: Is This AGI?
This might’ve been the craziest week in AI we’ve ever seen—yet.
From models that feel AGI-adjacent to open-source coding agents and a $32B startup built on pure vision, this week is truly historic.
Grab a coffee and sit somewhere comfortable, because we’re diving into the biggest shifts happening this week, the ones that will undoubtedly shape the future of AI.
The Most Historic Week at OpenAI… Yet
The News: OpenAI just had one of its biggest weeks since GPT-3.5, introducing its most advanced reasoning models yet (o3 and o4-mini), an open-source command-line coding agent (Codex CLI), and the GPT-4.1 model family with 1-million-token context windows and drastically reduced pricing. Together, these launches push the boundaries of agentic AI, developer tooling, and multimodal reasoning in a way we’ve never seen before.
What You Need to Know:
o3 and o4-mini:
o3 now holds SOTA performance, scoring over 99% on AIME math benchmarks and clearing a 25% success rate on Humanity’s Last Exam.
o3 redefines benchmarks, with SWE-bench Verified jumping from 48.9% (o1) to 71.7%, a Codeforces Elo of 2727, and roughly 12x gains in advanced math, thanks to an approach that leverages both training-time and test-time compute.
Both models now operate as fully agentic systems: o3 uses tools, executes workflows, and integrates multimodal reasoning. In a live demo, it executed over 600 tool calls in a single response.
Codex CLI:
An open-source terminal agent designed to bring o3’s coding capabilities into developer workflows.
Supports real-time execution of commands, workflows, and tool use directly in the terminal.
Empowers devs to interact with agentic reasoning locally and extend model capabilities through shell environments.
GPT-4.1, 4.1 Mini, and 4.1 Nano:
The suite includes GPT-4.1, 4.1 mini, and 4.1 nano, all designed to outperform GPT-4o on real-world dev tasks.
Each model supports 1M-token context windows, allowing them to process the equivalent of 8 full React codebases in one go.
Pricing drops are substantial: GPT-4.1 now costs $2 per 1M input tokens, with nano offering the lowest price at $0.10 per 1M input tokens (see the quick SDK sketch after this list).
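To make the developer angle concrete, here is a minimal sketch, not an official OpenAI example, of calling the new models through the openai Python SDK. The model IDs ("o3", "gpt-4.1-nano") match the announced API names, but the prompts and the CHANGELOG.md file are placeholders; check your account’s model access and current pricing before building on it.

```python
# Minimal sketch: calling the new reasoning and long-context models via the
# official openai Python SDK (pip install openai). Prompts and file paths are
# placeholders for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Reasoning model: let o3 do the multi-step planning work.
plan = client.chat.completions.create(
    model="o3",
    messages=[{
        "role": "user",
        "content": "Outline a step-by-step plan to migrate a REST API to GraphQL.",
    }],
)
print(plan.choices[0].message.content)

# Cheap long-context model: gpt-4.1-nano for bulk text at $0.10 per 1M input tokens.
with open("CHANGELOG.md") as f:  # placeholder file
    changelog = f.read()

summary = client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role": "user", "content": "Summarize this changelog:\n" + changelog}],
)
print(summary.choices[0].message.content)
```

The split mirrors how OpenAI is positioning the lineup: route hard reasoning to o3 (or o4-mini as a cheaper middle ground) and push high-volume, long-context work to the 4.1 tier.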
Why it Matters: This week may go down as the most pivotal since GPT-3.5. With the arrival of o3, we’re staring directly at a new frontier of AGI-adjacent intelligence—one that doesn’t just answer, but acts, reasons, and builds. Meanwhile, GPT-4.1 is democratizing developer access to powerful AI, allowing for long-context software engineering at record-low prices. Combined, these launches position OpenAI not just as a model maker, but as the infrastructure backbone for a fully agentic, AI-native future.
Google Strikes Back: Gemini 2.5 & Veo 2
The News: Announced in the wake of Google Cloud Next 2025, and in direct response to OpenAI’s recent releases, Gemini 2.5 Flash introduces a novel "thinking budget" feature for hybrid reasoning, offering developers more control over cost and compute. Simultaneously, Veo 2—Google’s answer to OpenAI’s Sora—is now live across the company’s AI suite, unlocking cinematic text-to-video generation for users and developers. While OpenAI dominated headlines, Google is doubling down on practical, flexible, and creative AI use cases that meet consumers and builders where they are.
Inside The Launches:
Gemini 2.5 Flash:
A major leap over 2.0 Flash, it introduces a toggle to enable or disable advanced reasoning on demand.
It delivers strong results across STEM, multi-step reasoning, and visual understanding, at significantly reduced cost levels.
Developers can cap the thinking budget at up to 24,576 tokens to balance speed, accuracy, and cost (see the short SDK sketch after this list).
Now available in preview through Google AI Studio and Vertex AI, plus as an experimental toggle in the Gemini app.
Veo 2:
Paid users can now generate 8-second, 720p cinematic clips from text prompts. Rollout began April 15, 2025, with global access phased in.
Videos are watermarked with SynthID and sharable directly to YouTube Shorts or TikTok. A usage cap applies.
Whisk Animate allows Google One AI Premium users to animate static images into video clips.
Developers can use Veo 2 in Google’s AI Studio, with support for both text and image prompts.
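For developers poking at the thinking budget, here is a minimal sketch using Google’s google-genai Python SDK. The preview model ID ("gemini-2.5-flash-preview-04-17") and the budget values are assumptions based on the launch; verify both against the current AI Studio or Vertex AI docs before relying on them.

```python
# Minimal sketch: capping Gemini 2.5 Flash's "thinking budget" with the
# google-genai SDK (pip install google-genai). Model ID and budget values are
# assumptions based on the preview launch.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")  # placeholder key

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",
    contents="A train leaves at 09:14 and arrives at 11:02. How long is the trip?",
    config=types.GenerateContentConfig(
        # thinking_budget caps the reasoning tokens: 0 disables thinking
        # entirely, while values up to 24576 buy deeper multi-step reasoning
        # at higher latency and cost.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```

Setting the budget to 0 effectively turns 2.5 Flash back into a fast, cheap non-reasoning model, which is exactly the flexibility Google is selling here.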
Why it Matters: While OpenAI's breakthroughs made waves this week, Google's strategy is quietly powerful: bring high-utility AI to everyone. Gemini 2.5 Flash is a flexible, cost-conscious reasoning model built for real-world use, and Veo 2 broadens the frontier of generative video in a responsible and accessible way. These drops reinforce Google’s long-game: AI that’s not just smart, but usable, scalable, and everywhere you build.
NVIDIA Brings AI Chipmaking to the U.S.
The News: NVIDIA has announced its first major U.S. manufacturing effort for AI chips and supercomputers, partnering with TSMC in Arizona and with Foxconn and Wistron in Texas. The move is part of a plan to produce up to $500 billion of AI infrastructure in the U.S. over the next four years.
The Details:
Arizona’s TSMC Phoenix site began producing Blackwell GPUs using 4nm tech, with packaging handled by Amkor and SPIL.
Over 1 million square feet of manufacturing space is dedicated to building and testing chips.
Foxconn in Houston and Wistron in Dallas are building supercomputer assembly and AI server facilities, with production expected in 12–15 months.
Why it matters: NVIDIA’s expansion marks a strategic pivot toward supply chain independence, national security alignment, and economic revitalization. Backed by incentives and public praise from policymakers, the move supports job creation and advances U.S. leadership in foundational AI infrastructure.
Ilya’s SSI Hits a $32B Valuation With No Product
The News: Safe Superintelligence Inc. (SSI), launched by former OpenAI chief scientist Ilya Sutskever, has reportedly raised $2 billion at a $32 billion valuation, making it one of the fastest-growing AI startups to date.
The Details:
The round was led by Greenoaks with a $500M contribution. Lightspeed Venture Partners, Andreessen Horowitz, and DST Global also participated.
Alphabet and Nvidia joined the round, though their investment amounts remain undisclosed.
The company is focused entirely on building superintelligence while maintaining a safety-first mission.
Sutskever previously described SSI’s unique path forward as a “different mountain to climb”, hinting at a bold divergence from traditional AGI strategies.
Why it matters: SSI’s valuation has grown sixfold since its $5B mark in September 2024, without a product in market. The explosive growth reflects massive investor appetite for AI startups, particularly those helmed by OpenAI alumni and focused on long-term superintelligence goals.
Thanks for reading this far! Stay ahead of the curve with my daily AI newsletter—bringing you the latest in AI news, innovation, and leadership every single day, 365 days a year. See you in the next edition!
I think it's really fair to say that o3 is proto-AGI... it shows expert-level intelligence, and it has the most advanced tooling AND reasoning we've ever seen... if that's not AGI, what is?
"We will have AGI by 2025" - Sam Altman
Now look where we are today! I think that o3 is AGI. It hits all the criteria that 99% of experts agree on. It'd be great if everyone could agree on the same definitions, though (this is supposed to be computer SCIENCE at the end of the day).