    Stanford CRFM partners with Arabic AI on HELM Arabic leaderboard

    By Art Ryan, January 29, 2026

    Stanford University’s Center for Research on Foundation Models (CRFM) has partnered with Arabic.AI to build an evaluation platform for Arabic large language models. The result is HELM Arabic, a public leaderboard that measures model performance on standardized Arabic benchmarks. The platform extends the existing HELM evaluation framework to Arabic-language tasks.

    Stanford CRFM Collaborates with Arabic.AI on HELM Arabic

    The HELM Arabic leaderboard evaluates models across seven Arabic-language tasks: AlGhafa, ArabicMMLU, Arabic EXAMS, MadinahQA, AraTrust, ALRAGE, and ArbMMLU-HT. Together the benchmarks cover multiple-choice reasoning, question answering, grammar understanding, safety evaluation, and academic knowledge. All are drawn from established Arabic datasets.

    Evaluation Methodology Used by Stanford CRFM and Arabic.AI

    HELM Arabic applies the same standardized evaluation process to every model. Instruction-tuned models are prompted zero-shot. Multiple-choice tasks label the answer options with Arabic letters rather than Latin characters. The evaluation samples 1,000 examples per task subset to balance dataset distributions. Optional reasoning features are disabled to keep results consistent across models. The leaderboard records full model prompts and outputs to support reproducibility.
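The following is a minimal Python sketch of two of the steps described above: capping each task subset at 1,000 examples and formatting a zero-shot multiple-choice prompt with Arabic letter labels. It is not the HELM code. The function names, the specific letters, the prompt layout, the fixed seed, and the handling of subsets smaller than the cap are all assumptions made for illustration.

```python
import random

# Assumed option labels; the article only says Arabic letters replace Latin ones.
ARABIC_LABELS = ["أ", "ب", "ج", "د", "هـ"]
MAX_PER_SUBSET = 1000  # per-subset sample size stated in the article


def sample_subset(examples, seed=0, cap=MAX_PER_SUBSET):
    """Return at most `cap` examples, chosen reproducibly (hypothetical helper)."""
    examples = list(examples)
    if len(examples) <= cap:
        # Assumption: subsets smaller than the cap are used in full.
        return examples
    return random.Random(seed).sample(examples, cap)


def build_zero_shot_prompt(question, options):
    """Format one question with no in-context examples (hypothetical layout)."""
    if len(options) > len(ARABIC_LABELS):
        raise ValueError("too many options for the assumed label set")
    lines = [question]
    for label, option in zip(ARABIC_LABELS, options):
        lines.append(f"{label}. {option}")
    return "\n".join(lines)
```

Using a seeded generator means the same 1,000 examples are drawn on every run, which fits the reproducibility goal the article mentions.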

    Model Rankings and Benchmark Results

    In the initial HELM Arabic results, Arabic.AI LLM-X (Pronoia) achieved the highest overall score across all seven tasks. Among open-weights models, Qwen3 235B ranked highest by mean score. Other open-weights models appearing in the top ten include Llama 4 Maverick, Qwen3-Next 80B, and DeepSeek v3.1. Several Arabic-focused models, such as AceGPT-v2, ALLaM, JAIS, and SILMA, were evaluated but did not rank above leading multilingual models.
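Ranking by mean score can be sketched as follows. This assumes an unweighted average over tasks, which the article does not confirm. The function name and the input shape are hypothetical, and no real leaderboard scores are used.

```python
from statistics import mean


def rank_by_mean(scores):
    """Sort models by their average task score, highest first.

    `scores` maps a model name to a {task name: score} dict.
    Assumption: every task carries equal weight.
    """
    means = {model: mean(tasks.values()) for model, tasks in scores.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder values for illustration only, not leaderboard results.
    example = {
        "model_a": {"task_1": 0.50, "task_2": 0.70},
        "model_b": {"task_1": 0.90, "task_2": 0.80},
    }
    for model, score in rank_by_mean(example):
        print(f"{model}: {score:.3f}")
```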

    Purpose of the HELM Arabic Platform

    The collaboration aims to close gaps in Arabic model evaluation infrastructure. HELM Arabic provides a transparent system for comparing proprietary and open models. Researchers can use the platform to replicate results and track progress in Arabic language modeling against consistent benchmarks.

    Source: https://www.middleeastainews.com/p/stanford-crfm-collabs-with-arabic
