Hello friends, welcome to my website.
This site is mainly a place for recording my learning notes, sharing my thoughts and experiences, and connecting with like-minded people.
Feel free to explore the different sections of the site, and don't hesitate to reach out if you have any questions or would like to connect.
Blog Posts

Meta Llama family
An overview of how the Meta Llama family of models is constructed
Jun 24, 2025
LLM
Llama
Gumbel-Softmax
Jun 17, 2025
Probability

Efficient LLM inference (Episode 1)
On tackling long sequences
Jun 13, 2025
LLM
Inference
Long sequence

Online softmax
A classical tool for computing block-wise attention
Jun 5, 2025
LLM
Technique

Overview of Large Model Lightweighting Techniques
MCP generated
May 29, 2025
LLM
Lightweighting

Model Context Protocol (MCP)
A brief introduction to the Model Context Protocol
May 28, 2025
MCP
LLM

Understanding transformer—Statistics
On the storage and computation overhead
May 22, 2025
Transformer
LLM
Foundation

Understanding transformer—KV cache
Accelerating inference with a KV cache
May 22, 2025
Transformer
Foundation
LLM

Understanding transformer
Refers to Vaswani, Ashish, et al., "Attention Is All You Need," Advances in NeurIPS 30 (2017).
May 18, 2025
Foundation
LLM

Operator
The basics of operator theory (from Stephen Boyd's course EE364b, Stanford University)
Oct 17, 2023
Operator
Optimization