Exploring the applicability of low-shot learning in mining software repositories

Jordan Ott; Abigail Atchison; Erik J. Linstead

doi:10.1186/s40537-019-0198-z

Journal of Big Data (May 2019)

Affiliations

Jordan Ott: School of Information and Computer Science, University of California, Irvine
Abigail Atchison: Fowler School of Engineering, Chapman University
Erik J. Linstead: Fowler School of Engineering, Chapman University

DOI: https://doi.org/10.1186/s40537-019-0198-z
Journal volume & issue: Vol. 6, no. 1, pp. 1–10

Abstract

Background: Despite the numerous, well-documented recent successes of deep learning, applying standard deep architectures to many classification problems within empirical software engineering remains problematic because of the large volumes of labeled data required for training. Here we argue that, for some problems, this hurdle can be overcome by combining low-shot learning with simpler deep architectures that reduce the total number of parameters to be learned.

Findings: We apply low-shot learning to the task of classifying UML class and sequence diagrams from GitHub, and demonstrate that surprisingly good performance can be achieved using only tens or hundreds of examples per category when paired with an appropriate architecture. A large, off-the-shelf architecture, on the other hand, performs no better than random guessing even when trained on thousands of samples.

Conclusion: Our findings suggest that identifying problems within empirical software engineering that lend themselves to low-shot learning could accelerate the adoption of deep learning algorithms within the empirical software engineering community.

Published in Journal of Big Data

ISSN: 2196-1115 (Online)
Publisher: SpringerOpen
Country of publisher: United Kingdom
LCC subjects: Technology: Electrical engineering. Electronics. Nuclear engineering: Electronics: Computer engineering. Computer hardware; Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://journalofbigdata.springeropen.com
