LLM Post-Training with TRL: SFT, DPO, and GRPO | encorp.ai