Information (Jul 2025)
A Fundamental Statistics Self-Learning Method with Python Programming for Data Science Implementations
Abstract
The increasing demand for data-driven decision making to sustain the innovation and competitiveness of organizations highlights the need for data science education across academia and industry. At its core is a solid understanding of statistics, which is necessary for analyzing data thoroughly and deriving valuable insights. Unfortunately, conventional statistics learning often lacks practice in real-world applications using computer programs, which separates conceptual knowledge of statistical equations from the hands-on skills needed to apply them. Integrating statistics learning with Python programming can offer an effective solution to this problem, since Python, with its extensive and versatile libraries, has become essential in data science implementations. In this paper, we present a self-learning method for fundamental statistics through Python programming for data science studies. Unlike conventional approaches, our method integrates three types of interactive problems, namely the element fill-in-blank problem (EFP), the grammar-concept understanding problem (GUP), and the value trace problem (VTP), in the Programming Learning Assistant System (PLAS). This combination allows students to write code, understand concepts, and trace output values while obtaining instant feedback, so that they can improve their retention, knowledge, and practical skills in learning statistics using Python programming. For evaluations, we generated 22 instances using source codes for fundamental statistics topics and assigned them to 40 first-year undergraduate students at UPN Veteran Jawa Timur, Indonesia. Statistical analysis methods were used to analyze the students' learning performance. The results show that a significant correlation (ρ < 0.05) exists between the students who solved our proposal and those who did not. The results confirm that the proposal can effectively assist students in self-learning fundamental statistics using Python programming for data science implementations.
Keywords