Data-Driven Intelligence System for General Recommendations of Deep Learning Architectures

Gjorgji Noveski; Tome Eftimov; Kostadin Mishev; Monika Simjanoska

doi:10.1109/access.2021.3124633

IEEE Access (Jan 2021)

Affiliations

Gjorgji Noveski: Department of Intelligent Systems, Jožef Stefan Institute, Ljubljana, Slovenia
Tome Eftimov: Computer Systems Department, Jožef Stefan Institute, Ljubljana, Slovenia
Kostadin Mishev: Faculty of Computer Science and Engineering, Saints Cyril and Methodius University in Skopje, Skopje, North Macedonia
Monika Simjanoska: Faculty of Computer Science and Engineering, Saints Cyril and Methodius University in Skopje, Skopje, North Macedonia

DOI: https://doi.org/10.1109/access.2021.3124633
Journal volume & issue: Vol. 9
pp. 148710 – 148720

Abstract

Choosing an optimal Deep Learning (DL) architecture and hyperparameters for a particular problem remains a non-trivial task for researchers. The most common approach relies on popular architectures that have proven to work in specific problem domains under the same experimental environment and setup. However, this limits the opportunity to choose or invent novel DL networks that could lead to better results. This paper proposes a novel approach for providing general recommendations of an appropriate DL architecture and its hyperparameters, based on the different configurations presented in thousands of published research papers that examine various problem domains. The recommended architecture can then serve as a starting point for investigating DL architectures for a concrete data set. Natural language processing (NLP) methods are used to create structured data from unstructured scientific papers, upon which intelligent models are trained to propose the optimal DL architecture, layer type, and activation functions. The proposed methodology has three advantages. The first is the ability to use the knowledge and experience accumulated in thousands of DL papers published over the years. The second is its contribution to future research by aiding the process of choosing an optimal DL setup for the particular problem to be analyzed. The third is the scalability and flexibility of the model: it can be easily retrained as new papers are published, and can therefore be continuously improved.
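
To make the idea in the abstract concrete, the following is a minimal illustrative sketch in Python, not the authors' method. It assumes a simple keyword lookup in place of the NLP extraction step and a frequency count in place of the learned models. The vocabularies, function names, and sample texts are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical vocabularies; a real system would extract far richer structure.
ARCHITECTURES = ("cnn", "lstm", "gru", "transformer", "autoencoder")
ACTIVATIONS = ("relu", "tanh", "sigmoid", "softmax")


def extract_terms(text, vocabulary):
    """Return every vocabulary term that occurs as a word in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token in vocabulary]


def build_index(papers):
    """Count architecture and activation mentions per problem domain.

    `papers` is an iterable of (domain, text) pairs (an assumed input format).
    """
    index = defaultdict(
        lambda: {"architecture": Counter(), "activation": Counter()}
    )
    for domain, text in papers:
        index[domain]["architecture"].update(extract_terms(text, ARCHITECTURES))
        index[domain]["activation"].update(extract_terms(text, ACTIVATIONS))
    return index


def recommend(index, domain):
    """Suggest the most frequently mentioned setup for a domain, or None."""
    if domain not in index:
        return None
    return {
        kind: (counts.most_common(1)[0][0] if counts else None)
        for kind, counts in index[domain].items()
    }


# Hypothetical sample corpus, used only to exercise the sketch.
papers = [
    ("image classification", "We use a CNN with ReLU activations."),
    ("image classification", "A CNN and an LSTM with ReLU and softmax."),
    ("text classification", "An LSTM with tanh units."),
]
index = build_index(papers)
suggestion = recommend(index, "image classification")
```

Because the index is rebuilt from the list of papers, adding newly published papers and calling `build_index` again mirrors, in a very simplified way, the retraining property the abstract describes.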

Published in IEEE Access

ISSN: 2169-3536 (Online)
Publisher: IEEE
Country of publisher: United States
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering
Website: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
