BOTS: A Unified Framework for Bayesian Online Task Selection in LLM Reinforcement Finetuning
Qianli Shen, Daoyuan Chen, Yilun Huang, Zhenqing Ling, Yaliang Li, Bolin Ding, Jingren Zhou. "BOTS: A Unified Framework for Bayesian Online Task Selection in LLM Reinforcement Finetuning." In The Fourteenth International Conference on Learning Representations. 2026.
