    Breaking AI News
    Technology & Innovation

    AI tool generates high-quality images faster than state-of-the-art approaches

    By Art Ryan, March 21, 2025

    The ability to generate high-quality images quickly is crucial for producing realistic simulated environments that can be used to train self-driving cars to avoid unpredictable hazards, making them safer on real streets.

    But the generative artificial intelligence techniques increasingly being used to produce such images have drawbacks. One popular type of model, called a diffusion model, can create stunningly realistic images but is too slow and computationally intensive for many applications. On the other hand, the autoregressive models that power LLMs like ChatGPT are much faster, but they produce poorer-quality images that are often riddled with errors.

    Researchers from MIT and NVIDIA developed a new approach that brings together the best of both methods. Their hybrid image-generation tool uses an autoregressive model to quickly capture the big picture and then a small diffusion model to refine the details of the image.

    Their tool, known as HART (short for hybrid autoregressive transformer), can generate images that match or exceed the quality of state-of-the-art diffusion models, and it does so about nine times faster.

    The generation process consumes fewer computational resources than typical diffusion models, enabling HART to run locally on a commercial laptop or smartphone. A user only needs to enter one natural language prompt into the HART interface to generate an image.

    HART could have a wide range of applications, such as helping researchers train robots to complete complex real-world tasks and aiding designers in producing striking scenes for video games.

    “If you are painting a landscape, and you just paint the entire canvas once, it might not look very good. But if you paint the big picture and then refine the image with smaller brush strokes, your painting could look a lot better. That is the basic idea with HART,” says Haotian Tang SM ’22, PhD ’25, co-lead author of a new paper on HART.

    He is joined by co-lead author Yecheng Wu, an undergraduate student at Tsinghua University; senior author Song Han, an associate professor in the MIT Department of Electrical Engineering and Computer Science (EECS), a member of the MIT-IBM Watson AI Lab, and a distinguished scientist of NVIDIA; as well as others at MIT, Tsinghua University, and NVIDIA. The research will be presented at the International Conference on Learning Representations.

    The best of both worlds

    Popular diffusion models, such as Stable Diffusion and DALL-E, are known to produce highly detailed images. These models generate images through an iterative process where they predict some amount of random noise on each pixel, subtract the noise, then repeat the process of predicting and “de-noising” multiple times until they generate a new image that is completely free of noise.
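
    The predict-and-subtract loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Stable Diffusion or DALL-E are implemented: the `oracle_noise` predictor is a hypothetical stand-in that already knows the clean image, whereas a real diffusion model uses a trained neural network and a noise schedule. The names `toy_denoise` and `oracle_noise` are invented for this sketch.

```python
import random


def toy_denoise(noisy, predict_noise, steps):
    """Repeatedly predict the noise on every pixel and subtract part of it."""
    image = list(noisy)
    for step in range(steps):
        predicted = predict_noise(image)
        # Remove an increasing share of the predicted noise, so that the
        # final step removes all that remains.
        fraction = 1.0 / (steps - step)
        image = [p - fraction * n for p, n in zip(image, predicted)]
    return image


# A tiny four-"pixel" image and a noisy version of it.
clean = [0.2, 0.5, 0.9, 0.4]
rng = random.Random(0)
noisy = [c + rng.gauss(0.0, 1.0) for c in clean]


def oracle_noise(image):
    """Hypothetical perfect predictor: it already knows the clean image."""
    return [p - c for p, c in zip(image, clean)]


result = toy_denoise(noisy, oracle_noise, steps=30)
```

    Every one of the 30 iterations touches every pixel, which is the source of the cost discussed next.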

    Because the diffusion model de-noises all pixels in an image at each step, and there may be 30 or more steps, the process is slow and computationally expensive. But because the model has multiple chances to correct details it got wrong, the images are high-quality.

    Autoregressive models, commonly used for predicting text, can generate images by predicting patches of an image sequentially, a few pixels at a time. They can’t go back and correct their mistakes, but the sequential prediction process is much faster than diffusion.
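
    As a rough sketch of sequential prediction, the loop below appends one token at a time, each computed from the tokens already emitted, and never revisits an earlier choice. The `toy_next_token` rule is an arbitrary, hypothetical stand-in for a trained model, chosen only so the example is deterministic.

```python
def generate_autoregressive(next_token, length, start=()):
    """Append tokens one at a time; earlier tokens are never revised."""
    tokens = list(start)
    while len(tokens) < length:
        tokens.append(next_token(tokens))
    return tokens


def toy_next_token(prefix, vocab_size=8):
    """Hypothetical stand-in for a model: depends only on the prefix."""
    return (sum(prefix) + len(prefix)) % vocab_size


tokens = generate_autoregressive(toy_next_token, length=6)
print(tokens)  # [0, 1, 3, 7, 7, 7]
```

    Because each token depends only on what came before it, a mistake early in the sequence is carried forward rather than corrected.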

    These models use representations known as tokens to make predictions. An autoregressive model utilizes an autoencoder to compress raw image pixels into discrete tokens as well as reconstruct the image from predicted tokens. While this boosts the model’s speed, the information loss that occurs during compression causes errors when the model generates a new image.
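
    The information loss can be seen in a minimal sketch that maps continuous values to the nearest entry of a small codebook. The five-entry `CODEBOOK` and the pixel values are assumptions made up for illustration; a real image tokenizer is a learned autoencoder operating on image patches, not a fixed scalar table.

```python
# Hypothetical codebook: each token index stands for one of these values.
CODEBOOK = [0.0, 0.25, 0.5, 0.75, 1.0]


def encode(pixels):
    """Map each value to the index of its nearest codebook entry."""
    return [
        min(range(len(CODEBOOK)), key=lambda i: abs(CODEBOOK[i] - p))
        for p in pixels
    ]


def decode(tokens):
    """Look each token index back up in the codebook."""
    return [CODEBOOK[t] for t in tokens]


pixels = [0.07, 0.31, 0.58, 0.93]
tokens = encode(pixels)
print(tokens)  # [0, 1, 2, 4]

reconstructed = decode(tokens)
# The difference between input and reconstruction is what compression lost.
errors = [p - r for p, r in zip(pixels, reconstructed)]
```

    The tokens are compact and fast to predict, but `errors` is nonzero for every value here: that detail cannot be recovered from the discrete tokens alone.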

    With HART, the researchers developed a hybrid approach that uses an autoregressive model to predict compressed, discrete image tokens, then a small diffusion model to predict residual tokens. Residual tokens compensate for the model’s information loss by capturing details left out by discrete tokens.
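
    The two-stage idea can be sketched as a coarse pass followed by a residual correction. This is an illustration only, not the HART implementation: the coarse stage here is simple nearest-value rounding, and the residual stage is a hypothetical stand-in (`residual_accuracy`) that recovers a fixed fraction of the true residual, where HART uses a small diffusion model.

```python
def hybrid_reconstruct(target, codebook, residual_accuracy=1.0):
    """Return (coarse, refined) approximations of `target`."""
    # Stage 1: discrete approximation, snapping to the nearest codebook value.
    coarse = [min(codebook, key=lambda c: abs(c - v)) for v in target]
    # Stage 2: stand-in residual model that recovers part of what was lost.
    residual = [residual_accuracy * (v - c) for v, c in zip(target, coarse)]
    refined = [c + r for c, r in zip(coarse, residual)]
    return coarse, refined


def mean_abs_error(a, b):
    """Average absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


target = [0.07, 0.31, 0.58, 0.93]
codebook = [0.0, 0.25, 0.5, 0.75, 1.0]

coarse, refined = hybrid_reconstruct(target, codebook, residual_accuracy=0.8)
coarse_error = mean_abs_error(target, coarse)
refined_error = mean_abs_error(target, refined)
```

    Even an imperfect residual stage lowers the error in this sketch, because it only has to model what the discrete stage left out.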

    “We can achieve a huge boost in terms of reconstruction quality. Our residual tokens learn high-frequency details, like edges of an object, or a person’s hair, eyes, or mouth. These are places where discrete tokens can make mistakes,” says Tang.

    Because the diffusion model only predicts the remaining details after the autoregressive model has done its job, it can accomplish the task in eight steps, instead of the usual 30 or more a standard diffusion model requires to generate an entire image. This minimal overhead of the additional diffusion model allows HART to retain the speed advantage of the autoregressive model while significantly enhancing its ability to generate intricate image details.
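
    The step counts quoted above can be put side by side with a little arithmetic. The sketch counts denoising passes only; it assumes nothing about the cost of each pass or of the autoregressive stage, so the ratio is not the same thing as the overall speedup.

```python
STANDARD_STEPS = 30  # the article says "30 or more", so this is a lower bound
HART_STEPS = 8       # residual-only diffusion steps quoted in the article


def fraction_of_steps(hybrid_steps, standard_steps):
    """Share of denoising passes the hybrid needs relative to the baseline."""
    return hybrid_steps / standard_steps


ratio = fraction_of_steps(HART_STEPS, STANDARD_STEPS)
steps_saved = STANDARD_STEPS - HART_STEPS
print(f"{ratio:.0%} of the passes, {steps_saved} fewer steps")
```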

    “The diffusion model has an easier job to do, which leads to more efficiency,” he adds.

    Outperforming larger models

    During the development of HART, the researchers encountered challenges in effectively integrating the diffusion model to enhance the autoregressive model. They found that incorporating the diffusion model in the early stages of the autoregressive process resulted in an accumulation of errors. Instead, their final design of applying the diffusion model to predict only residual tokens as the final step significantly improved generation quality.

    Their method, which uses a combination of an autoregressive transformer model with 700 million parameters and a lightweight diffusion model with 37 million parameters, can generate images of the same quality as those created by a diffusion model with 2 billion parameters, but it does so about nine times faster. It uses about 31 percent less computation than state-of-the-art models.
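
    For a sense of scale, the parameter counts quoted above can be compared directly. This is plain arithmetic on the article's figures; parameter count alone does not determine speed or compute, so the ratios below should not be read as explaining the ninefold speedup or the 31 percent figure.

```python
AUTOREGRESSIVE_PARAMS = 700_000_000  # autoregressive transformer
DIFFUSION_PARAMS = 37_000_000        # lightweight diffusion model
BASELINE_PARAMS = 2_000_000_000      # diffusion model of comparable quality

hart_total = AUTOREGRESSIVE_PARAMS + DIFFUSION_PARAMS
size_vs_baseline = hart_total / BASELINE_PARAMS
diffusion_share = DIFFUSION_PARAMS / hart_total

print(f"HART total: {hart_total / 1e6:.0f}M parameters")
print(f"Relative to the 2B baseline: {size_vs_baseline:.0%}")
print(f"Share held by the diffusion model: {diffusion_share:.0%}")
```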

    Moreover, because HART uses an autoregressive model to do the bulk of the work (the same type of model that powers LLMs), it can be integrated more readily with the new class of unified vision-language generative models. In the future, one could interact with a unified vision-language generative model, perhaps by asking it to show the intermediate steps required to assemble a piece of furniture.

    “LLMs are a good interface for all sorts of models, like multimodal models and models that can reason. This is a way to push the intelligence to a new frontier. An efficient image-generation model would unlock a lot of possibilities,” he says.

    In the future, the researchers want to go down this path and build vision-language models on top of the HART architecture. Since HART is scalable and generalizable to multiple modalities, they also want to apply it to video generation and audio prediction tasks.

    This research was funded, in part, by the MIT-IBM Watson AI Lab, the MIT and Amazon Science Hub, the MIT AI Hardware Program, and the U.S. National Science Foundation. The GPU infrastructure for training this model was donated by NVIDIA. 

    Source: https://news.mit.edu/
