PeerJ Computer Science (Sep 2024)

Harnessing AI and analytics to enhance cybersecurity and privacy for collective intelligence systems

  • Muhammad Rehan Naeem,
  • Rashid Amin,
  • Muhammad Farhan,
  • Faiz Abdullah Alotaibi,
  • Mrim M. Alnfiai,
  • Gabriel Avelino Sampedro,
  • Vincent Karovič

DOI
https://doi.org/10.7717/peerj-cs.2264
Journal volume & issue
Vol. 10
p. e2264

Abstract

Read online Read online

Collective intelligence systems like Chat Generative Pre-Trained Transformer (ChatGPT) have emerged. They have brought both promise and peril to cybersecurity and privacy protection. This study introduces novel approaches to harness the power of artificial intelligence (AI) and big data analytics to enhance security and privacy in this new era. Contributions could explore topics such as: leveraging natural language processing (NLP) in ChatGPT-like systems to strengthen information security; evaluating privacy-enhancing technologies to maximize data utility while minimizing personal data exposure; modeling human behavior and agency to build secure and ethical human-centric systems; applying machine learning to detect threats and vulnerabilities in a data-driven manner; using analytics to preserve privacy in large datasets while enabling value creation; crafting AI techniques that operate in a trustworthy and explainable manner. This article advances the state-of-the-art at the intersection of cybersecurity, privacy, human factors, ethics, and cutting-edge AI, providing impactful solutions to emerging challenges. Our research presents a revolutionary approach to malware detection that leverages deep learning (DL) based methodologies to automatically learn features from raw data. Our approach involves constructing a grayscale image from a malware file and extracting features to minimize its size. This process affords us the ability to discern patterns that might remain hidden from other techniques, enabling us to utilize convolutional neural networks (CNNs) to learn from these grayscale images and a stacking ensemble to classify malware. The goal is to model a highly complex nonlinear function with parameters that can be optimized to achieve superior performance. To test our approach, we ran it on over 6,414 malware variants and 2,050 benign files from the MalImg collection, resulting in an impressive 99.86 percent validation accuracy for malware detection. Furthermore, we conducted a classification experiment on 15 malware families and 13 tests with varying parameters to compare our model to other comparable research. Our model outperformed most of the similar research with detection accuracy ranging from 47.07% to 99.81% and a significant increase in detection performance. Our results demonstrate the efficacy of our approach, which unlocks the hidden patterns that underlie complex systems, advancing the frontiers of computational security.

Keywords