    Breaking AI News
    Technology & Innovation

    AI Model Issues Highlight Need for More Standards and Testing, Researchers Say

    By Art Ryan | June 23, 2025 | 4 Mins Read

    As the use of artificial intelligence, both benign and adversarial, increases at breakneck speed, more cases of potentially harmful responses are being uncovered. These include hate speech, copyright infringement and sexual content.

    The emergence of these undesirable behaviors is compounded by a lack of regulations and insufficient testing of AI models, researchers told CNBC.

    Getting machine learning models to behave the way they were intended to is also a tall order, said Javier Rando, a researcher in AI.

    “The answer, after almost 15 years of research, is, no, we don’t know how to do this, and it doesn’t look like we are getting better,” Rando, who focuses on adversarial machine learning, told CNBC.

    However, there are some ways to evaluate risks in AI, such as red teaming. The practice involves individuals testing and probing artificial intelligence systems to uncover and identify any potential harm — a modus operandi common in cybersecurity circles.

    Shayne Longpre, a researcher in AI and policy and lead of the Data Provenance Initiative, noted that there are currently not enough people working in red teams.

    While AI startups are now using first-party evaluators or contracted second parties to test their models, opening the testing to third parties such as ordinary users, journalists, researchers and ethical hackers would lead to a more robust evaluation, according to a paper published by Longpre and fellow researchers.

    “Some of the flaws in the systems that people were finding required lawyers, medical doctors to actually vet, actual scientists who are specialized subject matter experts to figure out if this was a flaw or not, because the common person probably couldn’t or wouldn’t have sufficient expertise,” Longpre said.

    Adopting standardized ‘AI flaw’ reports, incentives and ways to disseminate information on these ‘flaws’ in AI systems are some of the recommendations put forth in the paper.

    With this practice having been successfully adopted in other sectors such as software security, “we need that in AI now,” Longpre added.

    Marrying this user-centred practice with governance, policy and other tools would ensure a better understanding of the risks posed by AI tools and users, said Rando.

    No longer a moonshot

    Project Moonshot is one such approach, combining technical solutions with policy mechanisms. Launched by Singapore’s Infocomm Media Development Authority, Project Moonshot is a large language model evaluation toolkit developed with industry players such as IBM and Boston-based DataRobot.

    The toolkit integrates benchmarking, red teaming and testing baselines. There is also an evaluation mechanism which allows AI startups to ensure that their models can be trusted and do no harm to users, Anup Kumar, head of client engineering for data and AI at IBM Asia Pacific, told CNBC.

    Evaluation is a continuous process that should be done both prior to and following the deployment of models, said Kumar, who noted that the response to the toolkit has been mixed.

    “A lot of startups took this as a platform because it was open source, and they started leveraging that. But I think, you know, we can do a lot more.”

    Moving forward, Project Moonshot aims to include customization for specific industry use cases and enable multilingual and multicultural red teaming.

    Higher standards

    Pierre Alquier, Professor of Statistics at the ESSEC Business School, Asia-Pacific, said that tech companies are currently rushing to release their latest AI models without proper evaluation.

    “When a pharmaceutical company designs a new drug, they need months of tests and very serious proof that it is useful and not harmful before they get approved by the government,” he noted, adding that a similar process is in place in the aviation sector.

    AI models need to meet a strict set of conditions before they are approved, Alquier added. Shifting away from broad AI tools toward ones designed for more specific tasks would make it easier to anticipate and control their misuse, he said.

    “LLMs can do too many things, but they are not targeted at tasks that are specific enough,” he said. As a result, “the number of possible misuses is too big for the developers to anticipate all of them.”

    Such broad models make it difficult to define what counts as safe and secure, according to research that Rando was involved in.

    Tech companies should therefore avoid overclaiming that “their defenses are better than they are,” said Rando.

    Source: https://www.cnbc.com/
