Data in Brief (Jun 2024)

Data center TCP dataset

  • Jan Fesl,
  • Tereza Čapková,
  • Michal Konopa,
  • Ladislav Beránek,
  • Jan Fiala,
  • Michal Houda,
  • Petr Chládek,
  • Jana Klicnarová,
  • Radim Remeš,
  • Marek Šulista,
  • Klára Vocetková,
  • Marie Feslová

Journal volume & issue
Vol. 54
p. 110522

Abstract

In this paper, we introduce a dataset of thousands of network flow measurements carried out over TCP in a data center environment. TCP is widely used for reliable data transfer and exists in many versions, which differ in how their congestion control algorithm (CCA) handles link congestion. Our dataset provides a comprehensive comparison of 17 currently used TCP versions with different CCAs. Each TCP flow was measured exactly 50 times to reduce the effect of measurement instability. The comparison is based on 18 quantitative attributes that describe the parameters of a TCP transmission. The dataset is suitable for testing and comparing TCP versions, for designing new CCAs based on machine learning models, and for building and testing machine learning models that identify and optimize existing TCP versions.
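
As an illustration of how repeated measurements of this kind might be compared across CCAs, the following minimal sketch groups records by CCA and reports the mean, sample standard deviation, and run count for one attribute. The record layout, the field names `cca` and `throughput_mbps`, the CCA labels, and all numeric values are hypothetical placeholders; they are not taken from the dataset, whose actual file format and attribute names are not given in the abstract.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_runs(records, attribute):
    """Group repeated measurements by CCA.

    Returns {cca: (mean, sample_stdev, run_count)} for the given attribute.
    """
    by_cca = defaultdict(list)
    for rec in records:
        by_cca[rec["cca"]].append(rec[attribute])
    return {
        # stdev needs at least two values; fall back to 0.0 for a single run
        cca: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0, len(vals))
        for cca, vals in by_cca.items()
    }


# Synthetic placeholder records (assumed layout, NOT values from the dataset).
records = [
    {"cca": "cubic", "throughput_mbps": 940.0},
    {"cca": "cubic", "throughput_mbps": 960.0},
    {"cca": "bbr", "throughput_mbps": 900.0},
    {"cca": "bbr", "throughput_mbps": 920.0},
]

summary = summarize_runs(records, "throughput_mbps")
for cca, (avg, sd, n) in sorted(summary.items()):
    print(f"{cca}: mean={avg:.1f} stdev={sd:.2f} runs={n}")
```

With the real data, each group would presumably hold the 50 repetitions of one flow configuration, and the same summary could be computed for each of the 18 attributes.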

Keywords