Must Have Gadgets

    OpenAI’s new confession system teaches models to be honest about bad behaviors

    By admin, December 4, 2025

    OpenAI announced today that it is working on a framework that trains artificial intelligence models to acknowledge when they have engaged in undesirable behavior, an approach the team calls a confession. Because large language models are often trained to produce the response that seems to be desired, they can become increasingly prone to sycophancy or to stating hallucinations with total confidence. The new training method encourages the model to produce a secondary response describing what it did to arrive at its main answer. Confessions are judged only on honesty, whereas main replies are judged on multiple factors, such as helpfulness, accuracy and compliance. OpenAI has also published a technical writeup of the approach.

    The researchers said their goal is to encourage the model to be forthcoming about what it did, including potentially problematic actions such as hacking a test, sandbagging or disobeying instructions. “If the model honestly admits to hacking a test, sandbagging, or violating instructions, that admission increases its reward rather than decreasing it,” the company said. Whether you’re a fan of Catholicism, Usher or just a more transparent AI, a system like confessions could be a useful addition to LLM training.
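
    The reward split described above can be illustrated with a toy sketch. This is not OpenAI's implementation: the Episode fields, the 0-to-1 reward scale and the assumption that misbehavior is known with certainty are all hypothetical, chosen only to show a confession being scored on honesty alone, separately from the main reply.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One training episode in this toy setup (all fields are hypothetical)."""
    main_reply_score: float  # assumed combined grade: helpfulness, accuracy, compliance
    misbehaved: bool         # assumed known here: did the model hack a test, sandbag, or disobey?
    confessed: bool          # did the confession admit to that behavior?


def confession_reward(ep: Episode) -> float:
    """Score the confession on honesty only; the main reply's score is ignored."""
    honest = ep.confessed == ep.misbehaved
    return 1.0 if honest else 0.0


def rewards(ep: Episode) -> tuple[float, float]:
    """Return (main reply reward, confession reward) as two separate signals."""
    return ep.main_reply_score, confession_reward(ep)


if __name__ == "__main__":
    admitted = Episode(main_reply_score=0.2, misbehaved=True, confessed=True)
    concealed = Episode(main_reply_score=0.2, misbehaved=True, confessed=False)
    print(rewards(admitted))   # (0.2, 1.0)
    print(rewards(concealed))  # (0.2, 0.0)
```

    In this sketch, admitting to a hacked test raises the confession reward even though the main reply scored poorly, which mirrors the company's statement that an honest admission increases reward rather than decreasing it. A real system would not have a ground-truth misbehaved flag and would need some way to judge the confession's honesty.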
