6 Comments
Damom's avatar

+ https://arxiv.org/abs/2404.16811 = perfect FT model?? Regards, Damon

Trelis Research's avatar

Nice, yeah. It makes intuitive sense that a custom dataset whose questions can only be answered from data in the middle of the text can reduce the lost-in-the-middle problem.
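For anyone who wants to try the idea, here is a minimal sketch of building one such training example: the answer-bearing segment is dropped at a random position among unrelated filler segments. The function name `build_example`, the field names, and the sample text are all made up for illustration; this is not the paper's pipeline, just the general shape of the data.

```python
import random


def build_example(needle, question, answer, fillers, rng):
    """Insert the answer-bearing segment at a random slot among filler segments.

    Hypothetical helper for illustration only; field names are assumptions.
    """
    segments = list(fillers)
    # randrange(len + 1) allows the start, the end, and every slot in between.
    position = rng.randrange(len(segments) + 1)
    segments.insert(position, needle)
    return {
        "context": "\n\n".join(segments),
        "question": question,
        "answer": answer,
        "needle_index": position,
    }


rng = random.Random(0)  # seeded so the example is reproducible
fillers = [f"Filler paragraph {i} about an unrelated topic." for i in range(8)]
example = build_example(
    needle="The access code for the lab is 7421.",
    question="What is the access code for the lab?",
    answer="7421",
    fillers=fillers,
    rng=rng,
)
print(example["needle_index"], example["question"])
```

Sampling the position uniformly means the middle slots get covered as often as the ends, which is the point of such a dataset.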

Damom's avatar

Hi Roman, take a look at https://arxiv.org/abs/2405.16684 for a new concept: scaling laws/parameters/gzip

Trelis Research's avatar

Interesting, so compressibility gives some sense of the data's complexity. Seems that if your data is harder to compress, you need more of it (or at least should weight your compute towards more data rather than a bigger model).
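A rough way to check how compressible a dataset is, using only Python's standard `gzip` module. This is a sketch, not the paper's exact measurement: `gzip_ratio` is a name chosen here, and the two sample strings are toy data. A lower ratio means the text is more compressible.

```python
import gzip
import random


def gzip_ratio(text: str) -> float:
    """Compressed size divided by raw size, in bytes (lower = more compressible)."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("text must be non-empty")
    return len(gzip.compress(raw)) / len(raw)


# Toy data: one highly repetitive string, one random-letter string of equal length.
repetitive = "the cat sat on the mat. " * 200
rng = random.Random(0)
varied = "".join(
    rng.choice("abcdefghijklmnopqrstuvwxyz ") for _ in range(len(repetitive))
)

print(f"repetitive: {gzip_ratio(repetitive):.3f}")
print(f"varied:     {gzip_ratio(varied):.3f}")
```

On a real corpus you would probably compute this over a sample of documents rather than one string, since gzip adds a fixed header overhead that distorts the ratio on very short inputs.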

codeforfun's avatar

Is there a recording of this session? Thanks
