MP-SPILDL: A Massively Parallel Inductive Logic Learner in Description Logic

Eyad Algahtani

doi:10.1109/ACCESS.2024.3458814

IEEE Access (Jan 2024)

Affiliations

Eyad Algahtani: ORCiD; Department of Information Systems, College of Applied Computer Science, King Saud University, Riyadh, Saudi Arabia

DOI: https://doi.org/10.1109/ACCESS.2024.3458814
Journal volume & issue: Vol. 12
pp. 130884 – 130895

Abstract

This article presents MP-SPILDL, a massively parallel inductive logic learner in Description Logic (DL). MP-SPILDL is a scalable Inductive Logic Programming (ILP) algorithm that exploits existing Big Data infrastructure to perform large-scale inductive logic learning in DL (specifically, the $\mathcal{ALCQI}^{(\mathcal{D})}$ DL language). MP-SPILDL accelerates both hypothesis search and hypothesis evaluation by aggregating the computing power of multi-core CPUs, their vector/SIMD instructions, and multiple GPUs in a Hadoop cluster. For hypothesis search, MP-SPILDL employs a novel MapReduce-based algorithm that performs distributed parallel hypothesis search, together with a novel MapReduce-based procedure that eliminates all redundant hypotheses generated after each learning iteration. Moreover, MP-SPILDL uses a deterministic ordering of hypotheses' operands to avoid exploring redundant areas of the search space, similar to DL-Learner, the state of the art in the DL-based ILP literature. For hypothesis evaluation, MP-SPILDL evaluates hypotheses in parallel, using all CPU cores (combined with their vector instructions) and all GPUs of every machine in the Hadoop cluster. In experiments with an Apache Spark implementation on a Hadoop cluster of three worker machines (36 CPU cores and 7 GPUs in total), MP-SPILDL achieved speedups of up to 13.3-fold using parallel beam search with $beamWidth = 32$ and CPU-based vectorized hypothesis evaluation (the best-case scenario). On small datasets such as Michalski's trains, MP-SPILDL ran slower than the baseline (the worst-case scenario).
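The two redundancy-avoidance ideas mentioned in the abstract (deterministic operand ordering and a map/reduce-style elimination of duplicate hypotheses) can be illustrated with a small sketch. This is not the paper's algorithm or its Spark code: the tuple encoding of DL expressions, the function names `canonical` and `dedupe`, and the use of plain Python in place of a MapReduce job are all assumptions made for illustration.

```python
# Illustrative sketch only. Hypotheses are encoded as nested tuples,
# e.g. ("and", "Train", ("exists", "hasCar", "Long")); this encoding is
# hypothetical and not taken from the paper.

COMMUTATIVE = {"and", "or"}  # operators whose operand order is irrelevant


def canonical(expr):
    """Return expr with operands of commutative operators in a fixed order."""
    if isinstance(expr, str):          # atomic concept or role name
        return expr
    op, *args = expr
    args = [canonical(a) for a in args]
    if op in COMMUTATIVE:
        # Sort by repr so strings and nested tuples compare consistently.
        args = sorted(args, key=repr)
    return (op, *args)


def dedupe(hypotheses):
    """Map each hypothesis to its canonical key, then keep one per key."""
    # "Map" step: emit (key, value) pairs keyed by canonical form.
    pairs = [(canonical(h), h) for h in hypotheses]
    # "Reduce" step: all syntactic variants with the same key collapse to one.
    kept = {}
    for key, _ in pairs:
        kept.setdefault(key, key)
    return list(kept.values())


candidates = [
    ("and", "Train", ("exists", "hasCar", "Long")),
    ("and", ("exists", "hasCar", "Long"), "Train"),  # same concept, reordered
    ("or", "A", "B"),
]
unique = dedupe(candidates)
```

In a distributed setting the same key-then-collapse pattern would typically be expressed as a map to `(key, hypothesis)` pairs followed by a reduce by key; how MP-SPILDL implements this is described in the paper itself, not here.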

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
