    Video-STaR: AI Trains Itself To Comprehend Video

    By Art Ryan, April 19, 2025

    The advance could lead to visually aware AI coaches and teachers that can do everything from correcting a golf swing to training better surgeons.

    Today’s large language models show impressive capabilities in interpreting and generating text. Many scholars dream of similar tools that interpret videos and images, but such tools have remained elusive, mostly because there is too little text describing videos on which to train models. Hiring humans to write those descriptions, a process known as “labeling,” would be time-consuming and expensive.

    That may change with the release of Video Self-Training with augmented Reasoning, or Video-STaR, which was recently published at the International Conference on Learning Representations by a team of researchers from Google Research and Stanford University. Video-STaR enables models to train themselves to accurately reason about and describe the actions in video and images given only auxiliary video metadata and labels. It could lead to a massive, GPT-like dataset for videos, the researchers said.

    “Video-STaR allows AI to engage with dynamic, real-world actions in a way that previous models simply couldn’t,” said doctoral student Orr Zohar, first author of the paper describing Video-STaR. “It could open up exciting new avenues for how AI learns from video data,” said senior author of the paper, Serena Yeung-Levy, an assistant professor of biomedical data science at Stanford School of Medicine. Her lab is developing tools for AI-assisted surgery and surgical skill evaluation. “We hope that these methods will enable a lot of biomedical applications,” Yeung-Levy said.

    The researchers imagine visually aware AI instructors that can analyze videos of human activities ranging from a golf swing to surgery, to provide real-time technique analysis and corrective feedback.

    Reasonable Outcomes

    Existing video datasets have proved overly simplistic and, at best, provide mere descriptions rather than deep reasoning about the content and actions in the videos. The key innovation in Video-STaR is its ability to take advantage of any labeled video dataset, however sparse or detailed its labels.

    Video-STaR uses self-training cycles to improve its comprehension. It is first prompted to answer questions about video content. Those answers are then filtered to keep only the ones that contain the original video labels; that is, Video-STaR discards answers that do not match the labels. Then, Video-STaR re-trains itself on the answers that remain to improve its analytical skills.
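The cycle described above (generate answers, keep those containing the original label, retrain) can be sketched in a few lines of Python. This is only an illustrative toy, not the researchers’ implementation: the names `LabeledVideo`, `keep_answers_with_label`, `self_train`, `toy_answer`, and `toy_retrain` are hypothetical, the “model” is a lookup table, and “retraining” simply memorizes the answers that passed the filter.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class LabeledVideo:
    video_id: str  # hypothetical identifier
    question: str  # prompt posed to the model
    label: str     # label that already ships with the dataset


Answerer = Callable[[LabeledVideo], str]
Kept = List[Tuple[LabeledVideo, str]]


def keep_answers_with_label(videos: List[LabeledVideo], answers: List[str]) -> Kept:
    """Keep only answers whose text contains the video's original label."""
    return [(v, a) for v, a in zip(videos, answers) if v.label.lower() in a.lower()]


def self_train(videos: List[LabeledVideo], answer: Answerer,
               retrain: Callable[[Answerer, Kept], Answerer],
               cycles: int = 2) -> Tuple[Answerer, Kept]:
    """Run generate -> filter -> retrain for a fixed number of cycles."""
    kept: Kept = []
    for _ in range(cycles):
        answers = [answer(v) for v in videos]            # 1. generate answers
        kept = keep_answers_with_label(videos, answers)  # 2. filter by label
        answer = retrain(answer, kept)                   # 3. retrain on what survived
    return answer, kept


videos = [
    LabeledVideo("v1", "What is the athlete doing?", "diving"),
    LabeledVideo("v2", "What is the athlete doing?", "golf"),
]


def toy_answer(video: LabeledVideo) -> str:
    # Stand-in for a video-language model; it gets v2 wrong on purpose.
    guesses = {
        "v1": "The athlete is diving from a platform.",
        "v2": "The person is playing tennis.",
    }
    return guesses[video.video_id]


def toy_retrain(answer: Answerer, kept: Kept) -> Answerer:
    # Stand-in for fine-tuning: memorize the answers that passed the filter.
    memory = {v.video_id: a for v, a in kept}
    return lambda v: memory.get(v.video_id, answer(v))


model, kept = self_train(videos, toy_answer, toy_retrain)
print([v.video_id for v, _ in kept])  # ['v1']
```

In this toy run, the answer for `v2` never mentions its label (“golf”), so it is dropped and never used for retraining. The substring check is an assumption made for the sketch; the article does not specify how the match against the labels is carried out.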

    “In effect, Video-STaR utilizes the existing labels as a form of supervision, a way to check its work,” Zohar said. “We found that these models learn to reason as an emergent behavior.”

    “The model continuously refines itself,” Zohar added. “And over time, generates richer and more accurate responses. This self-training mechanism not only reduces the need for costly human annotations but also makes it possible to train video-language models on a much larger and more diverse set of data.”

    In one example presented in the paper, Video-STaR analyzed videos of a diving competition, correctly assessing the number of somersaults performed, identifying the diver’s tuck position, and evaluating the entry. It then appraised the dive as “quite tough” and issued a reasonably accurate degree-of-difficulty score of 64.68, for a dive that had been rated at 65.6 by human competition judges.
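To put “reasonably accurate” in numbers, the gap between the two scores quoted above can be worked out directly. The short calculation below uses only the two figures from the article; the variable names are illustrative.

```python
predicted = 64.68  # degree-of-difficulty score reported for Video-STaR
judged = 65.6      # score given by the human competition judges

abs_gap = abs(predicted - judged)
rel_gap_pct = 100 * abs_gap / judged

print(f"absolute gap: {abs_gap:.2f}, relative gap: {rel_gap_pct:.1f}%")
# absolute gap: 0.92, relative gap: 1.4%
```

For this single example, the model’s score is within about 1.4 percent of the judges’ rating.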

    Future Focus

    Video-STaR’s potential applications extend beyond improving AI’s ability to answer questions about videos, opening a world of possibilities in fields such as robotics, sports performance analysis, education, and even surgery. In sports, a tool like this could evaluate a golfer’s swing, a tennis player’s stroke, or a gymnast’s routine and offer insights on how to improve their techniques.

    Professor Yeung-Levy imagines similar AI-enabled medical instructors. “For me, one major goal is being able to assess the quality of surgical performance through video analysis,” Yeung-Levy explained. Video-STaR could lead to AI systems that provide constructive feedback on a surgeon’s technique and train more and better surgeons. “Ultimately, it could improve outcomes for patients,” she said.

    Future research will likely focus on improving the label filtering process and extending Video-STaR’s capabilities to more complex, longer-form videos, Zohar noted.

    “The goal is for AI to be able to engage in real conversations about video content, where the user can ask follow-up questions and the model is able to make deeper connections between actions and events in the video content,” Zohar said. “That’s the next frontier.”

    Source: https://hai.stanford.edu/
