Random Shapelet Forest Algorithm Embedded with Canonical Time Series Features

GAO Zhen-zhuo, WANG Zhi-hai, LIU Hai-yang

doi:10.11896/jsjkx.210700226

Jisuanji kexue (Jul 2022)

Affiliations

GAO Zhen-zhuo, WANG Zhi-hai, LIU Hai-yang: School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing 100044, China

DOI: https://doi.org/10.11896/jsjkx.210700226
Journal volume & issue: Vol. 49, no. 7
pp. 40 – 49

Abstract

In recent years, time series classification has attracted increasing research attention. Advanced time series classification methods usually depend on effective feature representations. A shapelet is a discriminative subsequence of a time series that effectively expresses the local shape characteristics of the series. However, high computational cost greatly limits the practicality of shapelet-based time series classification methods. In addition, a traditional shapelet can only describe the overall shape of a subsequence under the Euclidean distance metric, so it is easily disturbed by noise and has difficulty capturing other types of discriminative information contained in the subsequence. To address these problems, this paper proposes a new time series classification algorithm, named random shapelet forest embedded with canonical time series features. The proposed algorithm is based on three key strategies: 1) randomly select shapelets and limit their scope to improve efficiency; 2) embed multiple canonical time series features in the shapelets to improve the adaptability of the algorithm to different classification problems and to compensate for the accuracy loss caused by random shapelet selection; 3) build a random forest classifier on the new feature representations to ensure the generalization ability of the algorithm. Experimental results on 112 UCR time series datasets show that the proposed algorithm is more accurate than the STC algorithm, which is based on exact shapelet search and the shapelet transform technique, as well as many other types of state-of-the-art time series classification algorithms. Moreover, extensive experimental comparisons verify the significant efficiency advantages of the proposed algorithm.

Keywords: time series; classification; shapelet; canonical time series features; random forest
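
The first two strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which canonical features are used or how they are embedded, so the mean, standard deviation, and slope of the best-matching window are stand-ins chosen for illustration. All function names and parameters (`sample_shapelets`, `best_match`, `summary_features`, `transform`, `min_len`, `max_len`) are hypothetical, and equal-length series are assumed.

```python
import math
import random
import statistics

def sample_shapelets(series_list, n_shapelets, min_len, max_len, rng):
    """Strategy 1 (sketch): draw random subsequences with length limited to [min_len, max_len]."""
    shapelets = []
    for _ in range(n_shapelets):
        series = rng.choice(series_list)
        length = rng.randint(min_len, min(max_len, len(series)))
        start = rng.randint(0, len(series) - length)
        shapelets.append(series[start:start + length])
    return shapelets

def best_match(series, shapelet):
    """Return (minimum Euclidean distance, best-matching window) over all sliding windows."""
    m = len(shapelet)
    best_dist, best_window = math.inf, None
    for i in range(len(series) - m + 1):
        window = series[i:i + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, shapelet)))
        if dist < best_dist:
            best_dist, best_window = dist, window
    return best_dist, best_window

def summary_features(window):
    """Stand-in summary statistics (an assumption, not the paper's feature set)."""
    n = len(window)
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)
    # Least-squares slope of the window against the time index 0..n-1.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = (
        sum((t - t_mean) * (x - mean) for t, x in enumerate(window)) / denom
        if denom else 0.0
    )
    return [mean, std, slope]

def transform(series_list, shapelets):
    """Strategy 2 (sketch): per shapelet, emit the distance plus features of the matched window."""
    rows = []
    for series in series_list:
        row = []
        for shapelet in shapelets:
            dist, window = best_match(series, shapelet)
            row.append(dist)
            row.extend(summary_features(window))
        rows.append(row)
    return rows

# Toy usage with made-up data: three series of length 8, four random shapelets.
rng = random.Random(0)
train = [
    [0, 0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 2, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
shapelets = sample_shapelets(train, n_shapelets=4, min_len=3, max_len=5, rng=rng)
features = transform(train, shapelets)
```

Each row of `features` holds four values per shapelet (one distance and three stand-in statistics). For the third strategy, such a matrix could be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`, which is likewise an assumption here rather than something the abstract specifies.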

Published in Jisuanji kexue

ISSN: 1002-137X (Print)
Publisher: Editorial office of Computer Science
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science: Computer software; Technology: Technology (General)
Website: http://www.jsjkx.com/CN/1002-137X/home.shtml
