Intelligent Systems with Applications (Feb 2023)

Click prediction boosting via Bayesian hyperparameter optimization-based ensemble learning pipelines

  • Çağatay Demirel,
  • A. Aylin Tokuç,
  • Ahmet Tezcan Tekin

Journal volume & issue
Vol. 17
p. 200185

Abstract

Online travel agencies (OTAs) advertise their website offers on meta-search bidding engines. Predicting the number of clicks a hotel would receive for a given bid amount is an important step in managing an OTA's advertisement campaign on a meta-search engine, because the bid multiplied by the number of clicks determines the cost incurred. In this work, various regressors are ensembled to improve click prediction performance. After preprocessing, the entire feature set is divided into 5 groups, with the training set preceding the test set in the time domain, and multi-set validation is applied. The training data for each validation set is then subjected to feature elimination, and the selected models are validated with separate ensemble models based on the mean and weighted average of the test predictions. Additionally, a stacked meta-regressor is designed and tested, together with the complete training set, whose click prediction values are extracted in accordance with the out-of-fold prediction principle. The original feature set and the stacked input data are then combined, and level-1 regressors are trained once again to form blended meta-regressors. All individually trained models are then compared pairwise with their ensemble variations. The adjusted R² score is chosen as the main evaluation metric. The meta-models with tree-based ensemble level-1 regressors do not provide any performance improvement over the stand-alone versions, whereas the stacked and blended ensemble models with all other non-tree-based models as level-1 regressors boost click prediction significantly (0.114 and 0.124) compared to their stand-alone versions. Additionally, statistical evidence is provided to support the importance of Bayesian hyperparameter optimization in boosting the performance of level-1 regressors.
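
To make the out-of-fold principle and the adjusted R² metric concrete, the following is a minimal sketch in plain Python, not the authors' pipeline. Several things here are assumptions made only for illustration: the function names (`adjusted_r2`, `fit_line`, `out_of_fold_predictions`), the toy bid and click values, the single-feature least-squares line standing in for a level-1 regressor, and the interleaved fold assignment (the abstract does not state how folds are formed for the stacked input).

```python
from statistics import mean


def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(y_true)
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)


def fit_line(xs, ys):
    """Least-squares line on one feature; returns a predict function.

    Hypothetical stand-in for any level-1 regressor.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def out_of_fold_predictions(fit, xs, ys, n_folds=5):
    """Predict every row with a model that did not see that row in training."""
    n = len(xs)
    oof = [None] * n
    for k in range(n_folds):
        # Interleaved fold assignment: an assumption chosen for brevity.
        held_out = set(range(k, n, n_folds))
        train = [i for i in range(n) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            oof[i] = model(xs[i])
    return oof


# Toy data (made up for illustration): bid amounts and observed clicks.
bids = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
clicks = [2, 3, 5, 4, 7, 8, 8, 11, 12, 13]

oof_clicks = out_of_fold_predictions(fit_line, bids, clicks, n_folds=5)
score = adjusted_r2(clicks, oof_clicks, n_features=1)

# Predicted cost per row: bid multiplied by predicted clicks.
expected_cost = [b * c for b, c in zip(bids, oof_clicks)]
```

In a stacking setup of the kind the abstract describes, a column such as `oof_clicks` from each level-1 regressor would serve as input to the meta-regressor. For a blended variant, those columns would be appended to the original features before retraining.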

Keywords