Week 8: LLM Inference and Scaling Laws

Reminders and announcements

  • Tutorial 3 group meetings are this week. This tutorial will tackle issues with web-scale data, dialect and discrimination, and possible mismatches between the preferences of users and those of developers/researchers.
  • This week, we also have a guest lecture by Edoardo's PhD student, Piotr Nawrot. The guest lecture will focus on sparse attention in Transformers and its accuracy–efficiency trade-offs. The contents of the guest lecture are not examinable.
  • Readings for next week (Post-training: Instruction Tuning, Alignment, and Test-Time Compute):
    • If you're using the PDF: 10.1-10.3
    • If you're using the website: 9.1-9.3

Overview of the Week

This week, we continue our exploration of LLMs. Specifically, we will discuss how GPT-3 introduced the notion of in-context learning (ICL), a test-time strategy for adapting a model to a specific task without updating its parameters. This is achieved by prepending a small number of input–output examples to the LLM's context, as sketched below. Afterwards, we will see how a model's accuracy across tasks can be predicted reliably by scaling laws that relate performance to model size (number of parameters), the number of datapoints, and the number of training steps. The Friday lecture (a non-examinable guest lecture) will then touch upon efficient variants of attention, where the attention weights are sparse.
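To make the ICL mechanism concrete, here is a minimal Python sketch of few-shot prompt construction. The sentiment task, labels, and demonstrations are illustrative assumptions, not examples from the course materials; the resulting string would simply be passed to an LLM as its context.

    # Illustrative demonstrations for a hypothetical sentiment task
    # (not taken from the course materials).
    demonstrations = [
        ("The film was a delight from start to finish.", "positive"),
        ("I would not recommend this restaurant to anyone.", "negative"),
        ("The battery lasts far longer than advertised.", "positive"),
    ]

    def build_icl_prompt(demos, test_input):
        """Prepend input-output demonstrations to the test input.

        The model's parameters are never updated: the task is specified
        purely through the context the model conditions on.
        """
        lines = [f"Review: {text}\nSentiment: {label}\n" for text, label in demos]
        lines.append(f"Review: {test_input}\nSentiment:")  # the LLM completes the label
        return "\n".join(lines)

    prompt = build_icl_prompt(demonstrations, "The plot was predictable and dull.")
    print(prompt)

For the scaling-laws part, one widely used parametric form predicts the loss from the parameter count N and the number of training tokens D. The specific Chinchilla-style parameterisation and coefficients below are the fits reported by Hoffmann et al. (2022), shown only for illustration; they are not necessarily the form used in the course readings.

    def predicted_loss(n_params, n_tokens,
                       E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
        """Chinchilla-style scaling law: L(N, D) = E + A / N**alpha + B / D**beta.

        The default coefficients are the published Hoffmann et al. (2022) fits,
        used here purely as an illustration.
        """
        return E + A / n_params**alpha + B / n_tokens**beta

    # Example: predicted loss for a 1B-parameter model trained on 20B tokens.
    print(predicted_loss(1e9, 20e9))

Fitting such a curve to smaller training runs and extrapolating it is what makes performance at larger scales predictable before training.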

Lectures and reading

Lecture # | Who? | Slides | Reading
1 | EP | Prompting and In-context Learning | 7.3-7.5 (*)
2 | EP | Scaling Laws and LLM Evaluation | 8.8.1 (*), 7.6 (*)
3 | Piotr Nawrot (Guest lecturer) | Memory Compression and Attention Sparsity (not examinable)

Optional reading:

License
All rights reserved The University of Edinburgh