PLoS ONE (Jan 2021)

Analysis of big data job requirements based on K-means text clustering in China.

  • Dai Debao,
  • Ma Yinxia,
  • Zhao Min

DOI
https://doi.org/10.1371/journal.pone.0255419
Journal volume & issue
Vol. 16, no. 8
p. e0255419

Abstract

This paper uses K-means text clustering to characterize the requirements of big data jobs in China, with the aims of helping enterprises and employees identify big data talent and promoting further big data-related research. First, crawler software is used to collect recruitment postings for "big data" from the zhaopin.com recruitment website. Then, Jieba word segmentation and K-means text clustering are used to cluster the big data positions, with the number of clusters determined by the average within-group sum of squares. Finally, big data jobs are divided into 10 categories, and their city distribution, salary levels, education requirements, and experience requirements are analyzed for both the overall data set and the individual clusters, to clarify the characteristics of big data job demand. The results show that demand for big data jobs is concentrated in first-tier and new first-tier cities. Enterprises prefer job seekers with a college or bachelor's degree and more than one year of relevant experience. Wages differ across job types. The higher the position, the higher the education and experience requirements.
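The clustering step can be illustrated with a minimal sketch. It is not the authors' code: the TF-IDF weighting, the random initialisation, and the toy token lists are all assumptions made for illustration, and the token lists stand in for the output of a word segmenter such as Jieba (for example `jieba.lcut`). The sketch runs K-means for several values of k and reports the average within-group sum of squares, the quantity the abstract names for choosing the number of clusters.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Turn non-empty token lists into L2-normalised TF-IDF vectors (plain lists)."""
    vocab = sorted({t for d in docs for t in d})
    index = {t: i for i, t in enumerate(vocab)}
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    vectors = []
    for d in docs:
        vec = [0.0] * len(vocab)
        for term, count in Counter(d).items():
            idf = math.log((1 + n) / (1 + df[term])) + 1  # smoothed IDF
            vec[index[term]] = (count / len(d)) * idf
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Basic Lloyd's algorithm; returns (labels, average within-group sum of squares)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wss / len(points)


# Hypothetical, already-segmented job postings (illustrative only, not from the paper).
docs = [
    ["hadoop", "spark", "hive", "development"],
    ["spark", "hadoop", "etl", "development"],
    ["data", "analysis", "sql", "report"],
    ["sql", "report", "analysis", "excel"],
    ["machine", "learning", "algorithm", "python"],
    ["algorithm", "python", "deep", "learning"],
]

points = tfidf_vectors(docs)
# Average within-group sum of squares for each candidate number of clusters.
curve = {k: kmeans(points, k)[1] for k in range(1, len(docs) + 1)}
for k, avg_wss in curve.items():
    print(k, round(avg_wss, 4))
```

Plotting or inspecting `curve` and picking the k after which the value stops dropping sharply is the usual "elbow" reading of this criterion. Whether the paper applied exactly that rule to arrive at 10 categories is not stated in the abstract.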