Tyler's Technical Blog

Posts

NanoGPT Speedrun Living Worklog

Reducing VRAM Footprint in PPO and GRPO Using Selective Log-Softmax

An Extension to BADGE Active Learning for Variable-Sized Batches

Direct Preference Optimization Explained In-depth