Труды Института системного программирования РАН (Oct 2018)

Extracting Objects and Their Attributes from Tables in Text Documents

  • Nikita Astrakhantsev

Journal volume & issue
Vol. 21, no. 0

Abstract

Extracting information from tables is an important and rather complex part of information retrieval. For the task of extracting objects from HTML tables, we introduce methods for determining table orientation and for processing aggregating objects (such as Total) and scattered headers (super row labels, subheaders).
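
The abstract names the subtasks but does not describe how they are solved. As a purely illustrative sketch, and not the method of the paper, the Python code below shows one simple way that table orientation and aggregating rows such as Total might be detected. Everything in it is an assumption made for illustration: the function names, the numeric-versus-text type test, the marker words, and the premise that the table is rectangular, has at least two rows and two columns, and carries labels in its first row and first column.

```python
from typing import Sequence


def is_numeric(cell: str) -> bool:
    """Return True if the cell text parses as a number (commas and spaces ignored)."""
    text = cell.replace(",", "").replace(" ", "")
    try:
        float(text)
        return True
    except ValueError:
        return False


def homogeneity(lines: Sequence[Sequence[str]]) -> float:
    """Average share of the majority cell type (numeric or text) over the given lines."""
    scores = []
    for line in lines:
        numeric = sum(is_numeric(cell) for cell in line)
        scores.append(max(numeric, len(line) - numeric) / len(line))
    return sum(scores) / len(scores)


def guess_orientation(table: Sequence[Sequence[str]]) -> str:
    """Guess whether each object occupies a row or a column.

    Assumed heuristic: values of one attribute tend to share a type, so the
    direction along which cell types are more uniform is the attribute direction.
    """
    # Drop the first row and first column, which are assumed to hold labels.
    body_rows = [list(row[1:]) for row in table[1:]]
    body_cols = [list(col) for col in zip(*body_rows)]
    if homogeneity(body_cols) >= homogeneity(body_rows):
        return "objects_in_rows"
    return "objects_in_columns"


def is_aggregate_row(row: Sequence[str],
                     markers: Sequence[str] = ("total", "sum", "overall")) -> bool:
    """Flag a row whose label cell is an aggregate marker (hypothetical marker list)."""
    return row[0].strip().lower() in markers


table = [
    ["City", "Population", "Country"],
    ["Paris", "2100000", "France"],
    ["Berlin", "3600000", "Germany"],
    ["Total", "5700000", ""],
]
orientation = guess_orientation(table)
objects = [row for row in table[1:] if not is_aggregate_row(row)]
```

In this toy table each column holds values of a single type, so the heuristic reports that objects are laid out in rows, and the Total row is excluded from the extracted objects. A real system would likely need richer cell features and handling for the scattered headers mentioned in the abstract.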

Keywords