EURASIP Journal on Wireless Communications and Networking (Dec 2021)

Using machine learning to find the hidden relationship between RTT and TCP throughput in WiFi

  • Aizaz U. Chaudhry

DOI
https://doi.org/10.1186/s13638-021-02076-1
Journal volume & issue
Vol. 2021, no. 1
pp. 1 – 18

Abstract

Is it possible to find hidden relationships among variables in a WiFi network using machine learning (ML)? Can we use ML to find a variable that significantly affects the TCP throughput in WiFi? In this work, we employ a publicly available WiFi dataset to investigate these questions. We use ML techniques, including principal component analysis (PCA), linear regression (LR), and random forest (RF), to study the effect of link speed, received signal strength, round-trip time (RTT), and number of available access points on TCP throughput in WiFi. More specifically, we are interested in employing ML to find the variable that most accurately predicts, and thereby most significantly affects, the throughput. Simple correlation analysis indicates that a combination of multiple variables is more likely to act as a reasonable predictor of the throughput, whereas a single variable, such as RTT, is not likely to predict the throughput with reasonable accuracy on its own. From PCA, the first principal component (PC1) is seen to be highly correlated with RTT. During predictive analysis, it is observed that the LR model is unable to find any hidden relationship between throughput and the other variables. However, the RF model discovers that RTT explains the variation in throughput more closely, and therefore predicts the throughput more accurately, than the other variables. PC1 captures nearly all of the variation in throughput with the RF model and predicts throughput with very high accuracy, which indirectly confirms RTT as the variable that most significantly affects the TCP throughput in WiFi. Consequently, we discover a very close relationship between RTT and TCP throughput using appropriate ML techniques, and these results can be helpful in developing a better understanding of the relationship between latency and throughput for designing future low-latency networks.
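
The contrast between a linear model and a non-linear one can be illustrated with a small sketch. It does not use the paper's dataset or its models: the data are synthetic, generated under the assumption that throughput is window-limited (roughly window size divided by RTT), and the piecewise-constant predictor is only a crude stand-in for tree-based methods such as RF. The window size, noise level, RTT range, and all function names are assumptions chosen for illustration, and the paper's LR result may have other causes than the one shown here.

```python
import random
import statistics

# Assumptions for illustration only (not taken from the paper's dataset).
WINDOW_BYTES = 65536      # hypothetical fixed TCP window
NOISE_STD_MBPS = 1.0      # hypothetical measurement noise
rng = random.Random(0)    # fixed seed so the run is repeatable


def synth_sample():
    """One synthetic (RTT in ms, throughput in Mbps) pair: throughput ~ window / RTT."""
    rtt_ms = rng.uniform(5.0, 200.0)
    mbps = WINDOW_BYTES * 8 / (rtt_ms / 1000.0) / 1e6
    return rtt_ms, max(0.0, mbps + rng.gauss(0.0, NOISE_STD_MBPS))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = statistics.fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_bins(x, y, n_bins=50):
    """Piecewise-constant fit: equal-count bins over sorted x, each storing
    (upper edge, mean y). A rough stand-in for a tree-based regressor."""
    pairs = sorted(zip(x, y))
    size = len(pairs) // n_bins
    bins = []
    for i in range(n_bins):
        end = (i + 1) * size if i < n_bins - 1 else len(pairs)
        chunk = pairs[i * size:end]
        bins.append((chunk[-1][0], statistics.fmean([v for _, v in chunk])))
    return bins


def predict_bins(bins, xi):
    """Return the mean of the first bin whose upper edge covers xi."""
    for upper, mean_y in bins:
        if xi <= upper:
            return mean_y
    return bins[-1][1]


# Generate data and hold out the last quarter for evaluation.
data = [synth_sample() for _ in range(2000)]
train, test = data[:1500], data[1500:]
x_train, y_train = [d[0] for d in train], [d[1] for d in train]
x_test, y_test = [d[0] for d in test], [d[1] for d in test]

slope, intercept = fit_line(x_train, y_train)
bins = fit_bins(x_train, y_train)

r2_linear = r_squared(y_test, [slope * xi + intercept for xi in x_test])
r2_binned = r_squared(y_test, [predict_bins(bins, xi) for xi in x_test])

print(f"R^2, straight line on RTT:       {r2_linear:.3f}")
print(f"R^2, piecewise-constant on RTT:  {r2_binned:.3f}")
```

Under these assumptions the straight line explains far less of the variance than the piecewise-constant fit, because an inverse relationship is poorly approximated by a single slope. With real measurements, library implementations of LR, RF, and PCA would normally replace these hand-written helpers.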

Keywords