LLM Quantization
8-bit Quantization Implementation for the Llama-2-7b Model
Notes on linear attention and related topics.
Lecture Notes on ML Compilation by Tianqi Chen
Write a memory allocator from scratch.
Guide to implementing a parallel FFT in C++
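As a pointer for the quantization items above, here is a minimal sketch of symmetric absmax 8-bit weight quantization, one common scheme in 8-bit LLM inference. The function names and the use of NumPy are illustrative assumptions, not part of any of the listed resources.

```python
import numpy as np

def quantize_absmax(w):
    # Symmetric absmax quantization: map weights into int8 range [-127, 127].
    # scale is the real-valued step size; hypothetical helper for illustration.
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original weights.
    return q.astype(np.float32) * scale

# Toy example on a small random weight matrix.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_absmax(w)
w_hat = dequantize(q, s)
# Rounding error is bounded by half a quantization step.
max_err = np.abs(w - w_hat).max()
```

In a real Llama-2-7b setting this would be applied per tensor (or per channel/row, which typically reduces error), with activations handled separately.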