LLM Quantization

8-bit Quantization Implementation for the Llama-2-7b Model

February 23, 2025 · 18 min · Zhiyang Shen

Write a Memory Allocator for PyTorch

Write a memory allocator from scratch.

May 16, 2024 · 7 min · Zhiyang Shen

FFT On the Road

A guide to implementing FFT in C++ with parallelism

April 20, 2024 · 5 min · Zhiyang Shen

Random Number Generator

Lecture notes on RNG

April 20, 2024 · 3 min · Zhiyang Shen

Alias Method for Sampling from Discrete Distribution

An Introduction to the Alias Method

April 13, 2024 · 1 min · Zhiyang Shen