Model Update · 2026-03-08 · VentureBeat

New Technique Cuts LLM Memory Use 50x Without Loss

Researchers at MIT have made a breakthrough that could dramatically lower the cost and expand the reach of large language models in enterprise settings. They developed a novel Key-Value (KV) cache compaction technique that can reduce the memory footprint of LLMs by up to 50 times without sacrificing accuracy. The KV cache is a critical memory component that stores temporary data during text generation, and its size grows linearly with the length of the conversation or document, becoming a major bottleneck for long-context workloads.
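To see why that linear growth matters, here is a minimal back-of-the-envelope sketch of KV cache sizing. The model dimensions (32 layers, 32 KV heads, head dimension 128, fp16 storage, roughly a 7B-parameter model) are illustrative assumptions, not details from the MIT work, and the sketch says nothing about how the researchers' compaction technique itself operates.

```python
# Rough KV cache sizing for a single sequence: two tensors (keys and values)
# are stored per layer, each of shape (seq_len, n_kv_heads, head_dim).
# Dimensions below are assumed, 7B-class values, not the researchers' setup.

def kv_cache_bytes(seq_len: int,
                   n_layers: int = 32,
                   n_kv_heads: int = 32,
                   head_dim: int = 128,
                   bytes_per_value: int = 2) -> int:
    """Return approximate KV cache size in bytes (2 bytes per value = fp16)."""
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_value

for tokens in (1_000, 10_000, 100_000):
    gib = kv_cache_bytes(tokens) / 2**30
    print(f"{tokens:>7} tokens -> {gib:6.2f} GiB of KV cache")
```

Under these assumptions the cache runs from roughly 0.5 GiB at 1,000 tokens to nearly 49 GiB at 100,000 tokens, which is why a 50x reduction in that footprint would matter so much for long-context deployments.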
