A New Methodology Based on Imbalanced Classification for Predicting Outliers in Electricity Demand Time Series

Francisco Javier Duque-Pintor; Manuel Jesús Fernández-Gómez; Alicia Troncoso; Francisco Martínez-Álvarez

doi:10.3390/en9090752

Energies (Sep 2016)

Affiliations

Francisco Javier Duque-Pintor: Division of Computer Science, Universidad Pablo de Olavide, ES-41013 Seville, Spain
Manuel Jesús Fernández-Gómez: Division of Computer Science, Universidad Pablo de Olavide, ES-41013 Seville, Spain
Alicia Troncoso: Division of Computer Science, Universidad Pablo de Olavide, ES-41013 Seville, Spain
Francisco Martínez-Álvarez: Division of Computer Science, Universidad Pablo de Olavide, ES-41013 Seville, Spain

Journal volume & issue: Vol. 9, no. 9, p. 752

Abstract

Outliers are common in real-world phenomena. If these anomalous data are not treated properly, the resulting models can be unreliable. Many approaches in the literature focus on detecting outliers a posteriori. This work instead proposes a new methodology to predict their occurrence a priori. The main goal is to predict the occurrence of outliers in time series by using, for the first time, imbalanced classification techniques. To this end, the problem of forecasting outlying data is transformed into a binary classification problem in which the positive class represents the occurrence of outliers. Because outliers are far less numerous than common values, the resulting classification problem is imbalanced. To create the training and test sets, robust statistical methods are used to detect outliers in both sets. Once the outliers have been detected, the instances of the dataset are labeled accordingly: if any of the samples composing the next instance is detected as an outlier, the label is set to one. As a case study, the methodology has been tested on electricity demand time series from the Spanish electricity market, in which most of the outliers were properly forecast.
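The labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the robust statistical method, so a median/MAD rule (modified z-score) is assumed here. The function names, the `window` and `horizon` sizes, the `threshold` value, and the toy demand values are all hypothetical. "The next instance" is read as the block of samples that immediately follows each input window.

```python
import numpy as np


def mad_outlier_mask(series, threshold=3.5):
    """Flag outliers with a median/MAD rule (an assumed robust method)."""
    series = np.asarray(series, dtype=float)
    median = np.median(series)
    mad = np.median(np.abs(series - median))
    if mad == 0:
        # No spread around the median: nothing can be flagged by this rule.
        return np.zeros(series.shape, dtype=bool)
    modified_z = 0.6745 * (series - median) / mad
    return np.abs(modified_z) > threshold


def build_labeled_instances(series, window=4, horizon=2):
    """Turn a time series into (features, label) pairs.

    Each instance holds `window` consecutive samples. Its label is 1 if any
    of the following `horizon` samples is flagged as an outlier, else 0.
    """
    series = np.asarray(series, dtype=float)
    is_outlier = mad_outlier_mask(series)
    features, labels = [], []
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        features.append(series[start:end])
        labels.append(int(is_outlier[end:end + horizon].any()))
    return np.array(features), np.array(labels)


# Toy demand values (made up for illustration); 180 is the only outlier.
demand = [100, 102, 101, 99, 100, 103, 180, 101, 100, 102]
X, y = build_labeled_instances(demand, window=4, horizon=2)
print(X.shape)      # (5, 4)
print(y.tolist())   # [0, 1, 1, 0, 0]
```

In this toy run only two of the five instances are positive. That is the class imbalance the abstract refers to, and it would be far more pronounced on real demand data. Any classifier designed for imbalanced data could then be trained on `X` and `y`.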

Published in Energies

ISSN: 1996-1073 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology
Website: http://www.mdpi.com/journal/energies
