Dynamic memory compression: Retrofitting LLMs for accelerated inference

Piotr Nawrot*, Adrian Łańcucki, Marcin Chochowski, David Tarjan, Edoardo M. Ponti

*Corresponding author for this work

Research output: Contribution to conference › Poster › peer-review

Original language: English
Publication status: Accepted/In press - 2 May 2024
Event: The 41st International Conference on Machine Learning - Vienna, Austria
Duration: 21 Jul 2024 to 27 Jul 2024
https://icml.cc/

Conference

Conference: The 41st International Conference on Machine Learning
Abbreviated title: ICML 2024
Country/Territory: Austria
City: Vienna
Period: 21/07/24 to 27/07/24