Hi Ronan, how is watching your video different from the fine-tuning repo on trelis.com? Did you skip any parts of the code in the video, or are they the same?
Sorry for the slow reply, and thanks for the question on the livestream today. The answer is that the repos provide the code that goes with the videos. If you want all the details of the code, or to modify it, that's what the repos provide, along with support via GitHub issues. Generally I try to cover most of the content in the videos, but often there isn't time to cover every detail.
Hey question on GPT OSS. Have you been able to get it to work in GRPO with TRL / vLLM?
Howdy!
I haven't run GRPO with TRL / vLLM, but in principle it should work. I'm planning to wait on Unsloth support and perhaps do a run of that kind then.
Thanks! That would be great. It seems like there's some issue right now -- the SFT works, but using vLLM with TRL seems buggy (possibly something with the MoE, or the weights not being set up so they can be easily updated on the vLLM side?)
It seems like a great model size for GRPO training on a single node, though.
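For context, a single-node GRPO run with TRL's vLLM-backed generation is typically set up along these lines. This is a minimal sketch, assuming a recent TRL release where `GRPOConfig` exposes `use_vllm`; the model name, dataset, and reward function are illustrative placeholders, not anything from the thread above:

```python
# Sketch of a GRPO run using TRL with vLLM for generation.
# Assumes recent trl + vllm installs and a GPU node; the model name,
# dataset, and reward function below are placeholders.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # Toy reward: prefer shorter completions (stand-in for a real reward).
    return [-float(len(c)) for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")

config = GRPOConfig(
    output_dir="grpo-run",
    use_vllm=True,       # generation served by vLLM; note that syncing updated
                         # policy weights into vLLM is the step that can be
                         # fragile for some architectures (e.g. MoE models)
    num_generations=8,   # completions sampled per prompt for the group baseline
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder single-node-sized model
    reward_funcs=reward_len,
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

The vLLM path matters here because GRPO samples many completions per prompt each step, so generation throughput dominates; the weight-update path from trainer to vLLM is exactly where the MoE issue described above would surface.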