I’m Trung Vu.
I’m currently at Bespoke Labs, where I’m leading multiple efforts on data curation and reinforcement learning for LLMs. Prior to Bespoke Labs, I was a tech lead on the recommendations team at YouTube.
Selected projects:
- Open Thoughts: A state-of-the-art open reasoning dataset and data recipe that comes with an extensive set of ablations.
- Stratos: One of the first DeepSeek-R1 distilled datasets, outperforming o1-preview with a 32B model trained on 17k examples.
- Improving Multi-Turn Tool Use with Reinforcement Learning: An experiment in using reinforcement learning to improve multi-turn tool use capabilities for LLM agents.
- Semantic IDs: A tokenization algorithm developed at YouTube that allowed us to use LLMs as recommenders and mitigate cold-start problems. This approach has quickly become the industry standard and has been adopted by other companies (Meta, Tencent, Snap, Spotify, to name a few).