Extreme Compression of Large Language Models via Additive Quantization
