Must Have Gadgets –

    Amazon’s bet that AI benchmarks don’t matter

    By admin | December 2, 2025 | 5 min read

    This is an excerpt of Sources by Alex Heath, a newsletter about AI and the tech industry, syndicated just for The Verge subscribers once a week.

    Amazon’s AI chief has a message for the model benchmark obsessives: Stop looking at the leaderboards.

    “I want real-world utility. None of these benchmarks are real,” Rohit Prasad, Amazon’s SVP of AGI, told me ahead of today’s announcements at AWS re:Invent in Las Vegas. “The only way to do real benchmarking is if everyone conforms to the same training data and the evals are completely held out. That’s not what’s happening. The evals are frankly getting noisy, and they’re not showing the real power of these models.”

    It’s a contrarian stance when every other AI lab is quick to boast about how fast its new models climb the leaderboards. It’s also convenient for Amazon, given that the previous version of Nova, its flagship model, was sitting at spot 79 on LMArena when Prasad and I spoke last week. Still, dismissing benchmarks only works if Amazon can offer a different story about what progress looks like.

    The centerpiece of today’s re:Invent announcements is Nova Forge, a service that Amazon claims lets companies train custom AI models in ways previously impossible without spending billions of dollars. The problem Forge addresses is real. Most companies trying to customize AI models face three bad options: fine-tune a closed model (but only at the edges), train on open-weight models (but without the original training data and risking capability regression, where the AI becomes an expert on new data but forgets original, broader skills), or build a model from scratch at enormous cost.

    Forge offers something else: access to Amazon’s Nova model checkpoints at the pre-training, mid-training, and post-training stages. Companies can inject their proprietary data early in the process, when the model’s “learning capacity is highest,” as Prasad put it, rather than just tweaking model behavior at the end.

    “What we have done is democratize AI and frontier model development for your use cases at fractions of what it would cost [before],” Prasad said. Forge was created because Amazon’s internal teams wanted a tool to inject their domain expertise into a base model without having to build from scratch.

    “We built Forge because our internal teams wanted Forge,” he said. It’s a familiar Amazon pattern. AWS itself famously began as infrastructure built for Amazon’s own retail operation before becoming the company’s profit engine.

    Reddit has been using Forge to build custom safety models trained on 23 years of community moderation data. “I haven’t seen anything like it yet,” Chris Slowe, Reddit’s CTO and first employee, told me. “We’ve had a distinguished engineer who’s just been like a kid in the candy shop.”

    Slowe said Reddit ran a continued pre-training job last week that’s “looking really promising.” The goal: Replace multiple bespoke safety models with a single Reddit-expert model that understands the nuances of community moderation, including the notoriously subjective rule that appears across subreddits everywhere: “Don’t be a jerk.”

    “Having an expert model, it’s going to understand the community,” Slowe said. “It’s gonna have a pretty good notion of what jerk means.”

    Slowe explained that Forge enables Reddit to control its models, avoid surprises from API changes, retain ownership of its weights, and avoid sending sensitive data to third-party model providers. He said Reddit is already exploring using the same approach for Reddit Answers and other products.

    When I asked Slowe whether it mattered that Nova isn’t a top-tier model on benchmarks, he was blunt: “In this context, what matters is the Reddit expertness of the model.” That’s the thread Amazon wants developers to pull on: not raw IQ points, but control and specialization.

    With Forge, Amazon is making a calculated bet that the model race has commoditized and that it can succeed by being the place where companies can build specialized AI for specific business problems. It’s a very AWS-shaped view of the world: infrastructure over intelligence and customization over raw capability. The strategy also lets Amazon sidestep direct comparisons with OpenAI and Anthropic, both of which it once hoped to compete with at the model layer.

    Whether Forge is genuinely pioneering or just clever positioning depends, of course, on developer adoption. Amazon insists that the model race, as it’s widely understood, doesn’t matter. If that ends up being true, the scoreboard shifts to something much quieter and harder to game: whether AI models actually deliver real-world utility.

