IEEE Access (Jan 2023)

Explainable YouTube Video Identification Using Sufficient Input Subsets

  • Waleed Afandi,
  • Syed Muhammad Ammar Hassan Bukhari,
  • Muhammad U. S. Khan,
  • Tahir Maqsood,
  • Muhammad A. B. Fayyaz,
  • Ali R. Ansari,
  • Raheel Nawaz

DOI
https://doi.org/10.1109/ACCESS.2023.3261562
Journal volume & issue
Vol. 11
pp. 33178 – 33188

Abstract

Neural network models are black boxes by nature, and the mechanics behind their predictions are difficult to explain. Insight into the patterns these algorithms identify can help reveal important properties of the subject under study. Such artificial intelligence based algorithms are now used for prediction across many domains. This research focuses on patterns in network traffic that can be leveraged to identify videos streaming over the network. The proposed work applies a sufficient input subset (SIS) model to two separate video identification techniques to understand and explain the patterns they detect. The first technique creates video fingerprints with a period-based algorithm to handle variable bitrate inconsistencies; these fingerprints are passed to a convolutional neural network (CNN) for pattern recognition. The second technique is based on traffic pattern plot identification: it plots packet size against time for each stream and passes the plot to a CNN as an image. For model explainability, the SIS model identifies a subset of input features that is sufficient for the model to reach the same prediction at a given confidence threshold. The SIS generated for each input sample are clustered using DBSCAN, K-Means, and cosine-based hierarchical clustering, and the clustered SIS highlight the common patterns for each class. The SIS patterns learnt by each model for three individual videos are discussed. Furthermore, these patterns are used to investigate misclassifications and to provide a rationale for them, which helps justify the behaviour of the classifier model.
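The idea of a sufficient input subset can be illustrated with a minimal sketch of the backward-selection procedure commonly used to compute one. This is not the authors' implementation: the toy logistic scorer `toy_predict`, its weights, the mask value of 0.0, and the threshold of 0.8 are all assumptions chosen for illustration, standing in for a trained CNN and real traffic features.

```python
import math


def back_select(predict, x, mask_value=0.0):
    """Mask features one at a time, always masking the feature whose removal
    leaves the model's confidence highest. Returns the masking order, so the
    least important features come first."""
    masked = list(x)
    remaining = set(range(len(x)))
    order = []
    while remaining:
        best_idx, best_score = None, -math.inf
        for i in sorted(remaining):
            trial = list(masked)
            trial[i] = mask_value
            score = predict(trial)
            if score > best_score:
                best_idx, best_score = i, score
        masked[best_idx] = mask_value
        remaining.remove(best_idx)
        order.append(best_idx)
    return order


def find_sis(predict, x, threshold, mask_value=0.0):
    """Return indices of one sufficient input subset, or None if the model is
    not confident enough on the full input to begin with."""
    if predict(x) < threshold:
        return None
    order = back_select(predict, x, mask_value)
    subset = set()
    # Re-add features from most to least important until confidence recovers.
    for i in reversed(order):
        subset.add(i)
        candidate = [x[j] if j in subset else mask_value for j in range(len(x))]
        if predict(candidate) >= threshold:
            break
    return sorted(subset)


# Hypothetical stand-in for a trained classifier: a logistic score over
# five features. The weights and bias are invented for this example.
WEIGHTS = [0.1, 2.0, 0.0, 3.0, 0.2]
BIAS = -3.0


def toy_predict(features):
    logit = sum(w * f for w, f in zip(WEIGHTS, features)) + BIAS
    return 1.0 / (1.0 + math.exp(-logit))


sample = [1.0, 1.0, 1.0, 1.0, 1.0]
sis = find_sis(toy_predict, sample, threshold=0.8)
print(sis)  # [1, 3]
```

In this toy case, features 1 and 3 alone keep the confidence above 0.8, so they form the SIS. Clustering such subsets across many samples, as the abstract describes, would then group inputs whose predictions rest on similar features.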

Keywords