Jurnal Nasional Teknik Elektro dan Teknologi Informasi (Aug 2016)

Development of an HTML Table Integration Engine for Web Pages

  • Memen Akbar,
  • Fazat Nur Azizah,
  • G. A. Putri Saptawati

DOI
https://doi.org/10.22146/jnteti.v5i3.254
Journal volume & issue
Vol. 5, no. 3
pp. 177 – 183

Abstract

Two problems arise when integrating multiple tables from multiple web pages: structural conflict and semantic conflict. To address them, this study combines existing methods that have already been shown to solve problems in the integration process. The proposed HTML table integration process consists of four phases: (1) locating the table in the web page, (2) separating attributes from data values, (3) integrating the table schemas, and (4) migrating the data values into the integrated schema. The table location in a web page is determined using a heuristic approach, which also separates the table's attributes from its data values. Semantic conflicts that appear while integrating the table schemas are handled using a domain-specific ontology. The resulting data values are then migrated into the integrated schema, with duplicate data checked using a vector space model. The result of the integration is presented as a single HTML table. The approach is implemented as an engine written in Python. Experimental results show that the proposed approach can integrate multiple HTML tables from multiple web pages into a single integrated table.
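
The duplicate check in the migration phase can be illustrated with a minimal sketch. It is not the authors' engine: it assumes each table row is represented as a term-frequency vector over its cell values and that two rows count as duplicates when their cosine similarity reaches a threshold. The function names, the sample rows, and the 0.9 threshold are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tokenize(row):
    """Lowercase a row's cell values and split them into word tokens."""
    return " ".join(str(cell) for cell in row).lower().split()


def cosine_similarity(row_a, row_b):
    """Cosine similarity between the term-frequency vectors of two rows."""
    vec_a, vec_b = Counter(tokenize(row_a)), Counter(tokenize(row_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty row is similar to nothing
    return dot / (norm_a * norm_b)


def migrate_rows(integrated_rows, new_rows, threshold=0.9):
    """Append each new row unless it is a near-duplicate of a kept row.

    The threshold is an assumed value, not one reported in the paper.
    """
    result = list(integrated_rows)
    for row in new_rows:
        if all(cosine_similarity(row, kept) < threshold for kept in result):
            result.append(row)
    return result


# Hypothetical rows that already share one integrated schema.
integrated = [("Asus X455", "4 GB", "500 GB")]
incoming = [("asus x455", "4 GB", "500 GB"), ("Lenovo G40", "2 GB", "500 GB")]
merged = migrate_rows(integrated, incoming)
```

In this sketch the first incoming row differs from the existing row only in letter case, so it is treated as a duplicate and skipped, while the second row is appended. A production engine would likely weight terms (for example with TF-IDF) and tune the threshold per domain.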

Keywords