Yandex researchers develop new methods for compressing large language models, cutting AI deployment costs by up to 8 times

Bangalore (Karnataka) [India], July 29: The Yandex Research team, in collaboration with researchers from IST Austria, NeuralMagic, and KAUST, has developed two innovative compression methods for large language models: Additive Quantization for Language Models (AQLM) and PV-Tuning. Combined, these methods reduce model size by up to 8 times while preserving 95% of response quality. The methods aim to optimize resources and enhance efficiency in running large language models. The research article detailing this approach is featured at the International Conference on Machine Learning (ICML), currently underway in Vienna, Austria.

ANI Jul 29, 2024 18:54 IST

Key features of AQLM and PV-Tuning
AQLM adapts additive quantization, a technique traditionally used for information retrieval, to LLM compression. The resulting method significantly reduces memory consumption while preserving, and in some cases even improving, model accuracy under extreme compression, making it possible to deploy LLMs on everyday devices such as home computers and smartphones.
PV-Tuning addresses errors that may arise during the model compression process. When combined, AQLM and PV-Tuning deliver optimal results -- compact models capable of providing high-quality responses even on limited computing resources.
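Conceptually, additive quantization represents each small group of weights as a sum of codewords drawn from several learned codebooks, storing only the codeword indices. The toy sketch below, a greedy residual quantization with a few k-means steps, is an illustrative stand-in rather than the authors' implementation (AQLM learns its codebooks jointly), but it shows the additive structure:

```python
import numpy as np

def fit_codebooks(weights, num_codebooks=2, codebook_size=4, iters=20, seed=0):
    """Greedy residual quantization: each codebook quantizes what the
    previous codebooks left unexplained. (AQLM itself optimizes the
    codebooks jointly; this sketch only illustrates the additive idea.)"""
    rng = np.random.default_rng(seed)
    residual = weights.copy()
    codebooks, codes = [], []
    for _ in range(num_codebooks):
        # Seed centroids with random weight groups, then run k-means steps.
        centroids = residual[rng.choice(len(residual), codebook_size, replace=False)]
        for _ in range(iters):
            assign = np.argmin(((residual[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
            for k in range(codebook_size):
                if np.any(assign == k):
                    centroids[k] = residual[assign == k].mean(axis=0)
        codebooks.append(centroids)
        codes.append(assign)
        residual = residual - centroids[assign]
    return codebooks, codes

def reconstruct(codebooks, codes):
    # Each decoded group is the SUM of one codeword per codebook.
    return sum(cb[idx] for cb, idx in zip(codebooks, codes))

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 8)).astype(np.float32)      # 64 groups of 8 weights
books, codes = fit_codebooks(W)
W_hat = reconstruct(books, codes)
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)  # relative error, below 1
```

Storing two 2-bit indices per group of eight weights costs only 0.5 bits per weight before codebook overhead; real configurations use far larger codebooks to reach usable accuracy at 2-3 bits per parameter.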
Method evaluation and recognition
The effectiveness of the methods was rigorously assessed using popular open-source models such as Llama 2, Llama 3, Mistral, and others. The researchers compressed these large language models and evaluated answer quality against English-language benchmarks -- WikiText2 and C4 -- maintaining an impressive 95% answer quality even as the models were compressed by 8 times.
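WikiText2 and C4 are language-modeling benchmarks typically scored by perplexity -- the exponential of the model's average negative log-likelihood per token, so lower is better. A minimal sketch of the metric itself (just the formula, not the full evaluation harness):

```python
import math

def perplexity(token_log_probs):
    """exp(mean negative log-likelihood): how 'surprised' the model is
    on average, expressed as an effective branching factor."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every observed token behaves
# like a uniform 4-way guesser, giving a perplexity of about 4.
print(perplexity([math.log(0.25)] * 10))
```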
Who can benefit from AQLM and PV-Tuning
The new methods offer substantial resource savings for companies involved in developing and deploying proprietary language models and open-source LLMs. For instance, the Llama 2 model with 13 billion parameters, post-compression, can now run on just 1 GPU instead of 4, reducing hardware costs by up to 8 times. This means that startups, individual researchers, and LLM enthusiasts can run advanced LLMs such as Llama on their everyday computers.
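The 1-GPU-instead-of-4 figure follows from simple bit-width arithmetic. A back-of-the-envelope check (weights only -- activations, the KV cache, and codebook overhead are ignored here):

```python
# Weight memory for a 13-billion-parameter model at different precisions
# (illustrative arithmetic only).
params = 13_000_000_000

fp16_gb = params * 16 / 8 / 1e9    # 16 bits per weight -> 26.0 GB
two_bit_gb = params * 2 / 8 / 1e9  # ~2 bits per weight -> 3.25 GB

print(f"fp16 : {fp16_gb:.2f} GB")
print(f"2-bit: {two_bit_gb:.2f} GB")
print(f"ratio: {fp16_gb / two_bit_gb:.0f}x")  # 8x
```

At roughly 3.25 GB of weights, such a model fits comfortably in a single consumer GPU's memory, whereas the 26 GB half-precision original does not.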
Exploring new LLM applications
AQLM and PV-Tuning make it possible to deploy models offline on devices with limited computing resources, enabling new use cases for smartphones, smart speakers, and more. With advanced LLMs integrated into them, users can use text and image generation, voice assistance, personalized recommendations, and even real-time language translation without needing an active internet connection.
Moreover, models compressed using the methods can operate up to 4 times faster, as they require fewer computations.
Implementation and access
Developers and researchers worldwide can already use AQLM and PV-Tuning, which are available on GitHub. Demo materials provided by the authors offer guidance for effectively training compressed LLMs for various applications. Additionally, developers can download popular open-source models that have already been compressed using the methods.
ICML highlight
A scientific article by Yandex Research on the AQLM compression method has been featured at ICML, one of the world's most prestigious machine learning conferences. Co-authored with researchers from IST Austria and experts from the AI startup NeuralMagic, the work marks a significant advance in LLM compression technology.
About Yandex
Yandex is a global technology company that builds intelligent products and services powered by machine learning. The company aims to help consumers and businesses better navigate the online and offline world. Since 1997, Yandex has been delivering world-class, locally relevant search and information services and has also developed market-leading on-demand transportation services, navigation products, and other mobile applications for millions of consumers across the globe.
For reference [additional details for media & journalists]
Deploying large language models (LLMs) on consumer hardware is challenging due to the inherent trade-off between model size and computational efficiency. Compression methods, such as quantization, have offered partial solutions, but often compromise model performance.
To address this challenge, researchers from Yandex Research, IST Austria, KAUST, and NeuralMagic developed two compression methods -- Additive Quantization for Language Models (AQLM) and PV-Tuning. AQLM reduces the bit count per model parameter to 2-3 bits while preserving or even enhancing model accuracy, particularly in extreme compression scenarios. PV-Tuning is a representation-agnostic framework that generalizes and improves upon existing fine-tuning strategies.
AQLM's key innovations include learned additive quantization of weight matrices, which adapts to input variability and joint optimization of codebook parameters across layer blocks. This dual strategy enables AQLM to outperform other compression techniques, setting new benchmarks in the field.
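The quoted 2-3 bits per parameter can be accounted for directly: after quantization, each group of weights stores only codeword indices. The configuration numbers below are hypothetical round numbers chosen to land in that range, not the paper's exact settings:

```python
import math

def bits_per_weight(group_size, num_codebooks, codebook_size):
    """Bits of code storage per weight: each group of `group_size` weights
    is encoded by `num_codebooks` indices of log2(codebook_size) bits each.
    The codebook tables themselves are amortized over the whole weight
    matrix and ignored in this estimate."""
    return num_codebooks * math.log2(codebook_size) / group_size

# Groups of 8 weights, one codebook with 2**16 entries: 2.0 bits/weight.
print(bits_per_weight(8, 1, 2**16))
# Two codebooks of 2**12 entries each: 3.0 bits/weight.
print(bits_per_weight(8, 2, 2**12))
```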
AQLM's practicality is demonstrated by its implementations on GPU and CPU architectures, making it suitable for real-world applications. Comparative analysis shows that AQLM can achieve extreme compression without compromising model performance, as evidenced by its superior results in metrics like model perplexity and accuracy in zero-shot tasks.
PV-Tuning provides convergence guarantees in restricted cases and has been shown to outperform previous methods when used for 1-2 bit vector quantization on highly performant models such as Llama and Mistral. By leveraging PV-Tuning, the researchers achieved the first Pareto-optimal quantization for Llama 2 models at 2 bits per parameter.
The effectiveness of the methods was rigorously assessed using popular open-source models such as Llama 2, Mistral, and Mixtral. The researchers compressed these large language models and evaluated answer quality against English-language benchmarks -- WikiText2 and C4 -- maintaining 95% answer quality as the models were compressed by 8 times.

* The closer the average accuracy of answers in tests is to that of the original model, the better the new methods preserve answer quality. The reported results reflect the two methods combined, which compress the models by, on average, 8 times.
(ADVERTORIAL DISCLAIMER: The above press release has been provided by VMPL. ANI will not be responsible in any way for the content of the same)
