Benchmarks for lots of quantization types in llama.cpp
Recently, I noticed that a lot of new quantization types have been added to llama.cpp. Out of curiosity, I decided to run some simple tests on every quantization type it supports. There are 27 quantization types in total in llama.cpp, including F16 and F3