Model Update · 2026-03-08 · VentureBeat

New Technique Cuts LLM Memory Use 50x Without Loss

Researchers at MIT have made a breakthrough that could dramatically lower the cost and expand the reach of large language models in enterprise settings. They developed a novel Key-Value (KV) cache compaction technique that can reduce the memory footprint of LLMs by up to 50 times without sacrificing accuracy. The KV cache is a critical memory component that stores temporary data during text generation, and its size grows linearly with the length of the conversation or document, becoming a major bottleneck for long-context workloads.
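To see why that linear growth matters, here is a minimal back-of-the-envelope sketch of KV cache sizing. The model dimensions (32 layers, 32 KV heads, head dimension 128, fp16 storage, roughly a 7B-parameter model) are illustrative assumptions, not details from the MIT work, and the sketch says nothing about how the researchers' compaction technique itself operates.

```python
# Rough KV cache sizing for a single sequence: two tensors (keys and values)
# are stored per layer, each of shape (seq_len, n_kv_heads, head_dim).
# Dimensions below are assumed, 7B-class values, not the researchers' setup.

def kv_cache_bytes(seq_len: int,
                   n_layers: int = 32,
                   n_kv_heads: int = 32,
                   head_dim: int = 128,
                   bytes_per_value: int = 2) -> int:
    """Return approximate KV cache size in bytes (2 bytes per value = fp16)."""
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_value

for tokens in (1_000, 10_000, 100_000):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>7} tokens -> {gib:6.2f} GiB of KV cache")
```

Under these assumptions the cache runs from roughly 0.5 GiB at 1,000 tokens to nearly 49 GiB at 100,000 tokens, which is why a 50x reduction in that footprint would matter so much for long-context deployments.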
