Publications
You can also find my articles on my Google Scholar profile.
Jiao, Qirui, Daoyuan Chen, Yilun Huang, Yaliang Li, and Ying Shen. "Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models." arXiv preprint arXiv:2408.04594 (2024). [link][code]
Chen, Daoyuan, Haibin Wang, Yilun Huang, Ce Ge, Yaliang Li, Bolin Ding, and Jingren Zhou. "Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development." arXiv preprint arXiv:2407.11784 (2024). [link][code]
Qin, Zhen, Daoyuan Chen, Wenhao Zhang, Liuyi Yao, Yilun Huang, Bolin Ding, Yaliang Li, and Shuiguang Deng. "The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective." arXiv preprint arXiv:2407.08583 (2024). [link][code]
Jiao, Qirui, Daoyuan Chen, Yilun Huang, Yaliang Li, and Ying Shen. "Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study." arXiv preprint arXiv:2401.17981 (2024). [link]
Chen, Daoyuan, Yilun Huang, Zhijian Ma, Hesen Chen, Xuchen Pan, Ce Ge, Dawei Gao et al. "Data-Juicer: A One-Stop Data Processing System for Large Language Models." In Companion of the 2024 International Conference on Management of Data, pp. 120-134. 2024. [link][code]
Shen, Xuan*, Yaohua Wang*, Ming Lin, Yilun Huang, Hao Tang, Xiuyu Sun, and Yanzhi Wang. "DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6163-6173. 2023. [link][code]
Xu, Xianzhe*, Yiqi Jiang*, Weihua Chen*, Yilun Huang*, Yuan Zhang*, and Xiuyu Sun. "DAMO-YOLO: A Report on Real-Time Object Detection Design." arXiv preprint arXiv:2211.15444 (2022). [link][code]
Li, Yibo, Yilun Huang, Ziyi Zhu, Lemeng Pan, Yongshuai Huang, Lin Du, Zhi Tang, and Liangcai Gao. "Rethinking Table Structure Recognition Using Sequence Labeling Methods." In International Conference on Document Analysis and Recognition, pp. 541-553. Springer, Cham, 2021. [link][code]
Zhu, Ziyi, Liangcai Gao, Yibo Li, Yilun Huang, Lin Du, Ning Lu, and Xianfeng Wang. "NTable: A Dataset for Camera-Based Table Detection." In International Conference on Document Analysis and Recognition, pp. 117-129. Springer, Cham, 2021. [link][code]
Huang, Yilun, Qinqin Yan, Yibo Li, Yifan Chen, Xiong Wang, Liangcai Gao, and Zhi Tang. "A YOLO-Based Table Detection Method." In 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 813-818. IEEE, 2019. [link]
Li, Yibo, Liangcai Gao, Zhi Tang, Qinqin Yan, and Yilun Huang. "A GAN-Based Feature Generator for Table Detection." In 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 763-768. IEEE, 2019. [link]
Gao, Liangcai, Yilun Huang, Hervé Déjean, Jean-Luc Meunier, Qinqin Yan, Yu Fang, Florian Kleber, and Eva Lang. "ICDAR 2019 Competition on Table Detection and Recognition (cTDaR)." In 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1510-1515. IEEE, 2019. [link][code]