NTable: A Dataset for Camera-Based Table Detection
Published in International Conference on Document Analysis and Recognition, 2021
Recommended citation: Zhu, Ziyi, Liangcai Gao, Yibo Li, Yilun Huang, Lin Du, Ning Lu, and Xianfeng Wang. "NTable: A Dataset for Camera-Based Table Detection." In International Conference on Document Analysis and Recognition, pp. 117-129. Springer, Cham, 2021.
Abstract: Compared with raw textual data, information in tabular format is more compact and concise, and easier to compare, retrieve, and understand. Furthermore, in the era of the mobile Internet, there is growing demand to detect and extract tables from photos. However, most existing table detection methods are designed for scanned document images or Portable Document Format (PDF) files, and tables captured in the real world are seldom included in the current mainstream table detection datasets. Therefore, we construct a dataset named NTable for camera-based table detection. NTable consists of a smaller-scale dataset NTable-ori, an augmented dataset NTable-cam, and a generated dataset NTable-gen. The experiments demonstrate that deep learning methods trained on NTable improve the performance of spotting tables in the real world. We will release the dataset to support the development and evaluation of more advanced methods for table detection and other applications in the future.