Applied Sciences (Nov 2024)
Walking Back the Data Quantity Assumption to Improve Time Series Prediction in Deep Learning
Abstract
Deep learning techniques have significantly advanced time series prediction by effectively modeling temporal dependencies, particularly for datasets with numerous observations. Although larger datasets are generally associated with improved accuracy, the results of this study demonstrate that this assumption does not always hold. By progressively increasing the amount of training data in a controlled experimental setup, the best predictive metrics were achieved in intermediate iterations, with variations of up to 66% in RMSE and 44% in MAPE across different models and datasets. The findings challenge the notion that more data necessarily leads to better generalization, showing that additional observations can sometimes yield diminishing returns or even degrade predictive performance. These results emphasize the importance of strategically balancing dataset size and model optimization to achieve robust and efficient performance. Such insights offer valuable guidance for time series forecasting, especially in contexts where computational efficiency and predictive accuracy must be jointly optimized.
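The evaluation protocol summarized above can be illustrated with a minimal sketch. This is not the paper's actual pipeline: the forecaster (a naive seasonal repeat), the synthetic series, the 50-step period, and the training fractions are all hypothetical stand-ins, kept only to show how RMSE and MAPE are tracked on a fixed test window while the training window grows.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

# Hypothetical synthetic series: a 50-step seasonal cycle plus noise.
rng = np.random.default_rng(0)
t = np.arange(500)
series = 10 + np.sin(2 * np.pi * t / 50) + rng.normal(0, 0.3, t.size)

# Fixed test window; the training window grows across iterations,
# mimicking the "progressively increasing training data" setup.
test = series[-100:]
results = []
for frac in (0.25, 0.5, 0.75, 1.0):
    n_train = int(400 * frac)
    train = series[400 - n_train:400]
    # Stand-in forecaster: repeat the last 50-step cycle of the train set.
    pred = np.tile(train[-50:], 2)[: test.size]
    results.append((n_train, rmse(test, pred), mape(test, pred)))

for n_train, r, m in results:
    print(f"n_train={n_train:3d}  RMSE={r:.3f}  MAPE={m:.2f}%")
```

Comparing the metric columns across rows is the core of the analysis: if an intermediate `n_train` attains the lowest RMSE/MAPE, the more-data-is-better assumption fails for that model and series.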
Keywords