LLM Theory
Awesome LLM theory articles I chanced upon:
- State of LLMs in 2023 by Hyung Won Chung
- Transformer Math 101 by EleutherAI
- Transformer Inference Arithmetic by Kipply
Retrieval Augmented Generation (RAG) is a framework for retrieving facts from an external knowledge base to ground large language models (LLMs) on the most accurate, up-to-date information. RAG is increasingly popular in industry as it’s simple to implement yet powerful. Here I’ll share some tricks to improve RAG systems.
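As a rough sketch of the basic loop, the snippet below retrieves the best-matching passage from a toy in-memory knowledge base and stuffs it into the prompt before generation. The knowledge base, the word-overlap scorer, and the prompt template are placeholders standing in for a real retriever and LLM call, not a production setup.

```python
# Minimal RAG sketch: retrieve the most relevant passage, then prepend it to
# the prompt so the model answers from that context.

KNOWLEDGE_BASE = [
    "The Eiffel Tower was completed in 1889.",
    "Python 3.12 was released in October 2023.",
    "The Great Barrier Reef is located off the coast of Australia.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank passages by naive word overlap with the query
    (a stand-in for a dense retriever over an external knowledge base)."""
    query_words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Ground the model by placing retrieved facts into the prompt."""
    context = "\n".join(retrieve(query))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

print(build_prompt("When was Python 3.12 released?"))
```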
If we are allowed to train until convergence, we know that full finetuning beats parameter-efficient finetuning (PEFT). But what if we have a fixed compute budget? With a fixed budget, PEFT can go through significantly more tokens. Will full finetuning still be better than PEFT?
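For a sense of how few parameters PEFT actually updates, here is a minimal sketch using Hugging Face `peft` to wrap a model with LoRA and report trainable versus total parameters; the base model (`gpt2`), target modules, and LoRA hyperparameters are illustrative choices, not a recommendation.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Small base model purely for illustration.
model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)

# Prints something like: trainable params ~0.3M of ~124M total (~0.24%).
peft_model.print_trainable_parameters()
```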
How do we train causal language models (e.g. Alpaca, LLaMA, gpt-neox-20b…) with a seq2seq objective? This matters because we want to instruction-tune our causal LMs, especially since Alpaca is the best open model at the time of writing.
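One common recipe (a sketch of the standard label-masking trick, not necessarily the exact method described in the post) is to concatenate prompt and response, then set the prompt positions in the labels to -100 so the causal LM's cross-entropy loss only covers the response tokens. The model, tokenizer, and instruction template below are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small model/tokenizer purely for illustration.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Instruction: Translate to French.\nInput: Hello.\nResponse: "
response = "Bonjour." + tokenizer.eos_token

prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]

input_ids = torch.tensor([prompt_ids + response_ids])
# -100 is ignored by the cross-entropy loss, so only response tokens contribute.
labels = torch.tensor([[-100] * len(prompt_ids) + response_ids])

loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()
```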
A compilation of tricks I find useful.