A short introduction to RLHF and post-training focused on language models, by Nathan Lambert
https://rlhfbook.com/