Development of Smartphone Application for Markerless Three-Dimensional Motion Capture Based on Deep Learning Model
Yukihiko Aoyagi,
Shigeki Yamada,
Shigeo Ueda,
Chifumi Iseki,
Toshiyuki Kondo,
Keisuke Mori,
Yoshiyuki Kobayashi,
Tadanori Fukami,
Minoru Hoshimaru,
Masatsune Ishikawa,
Yasuyuki Ohta
Affiliations
Yukihiko Aoyagi
Digital Standard Co., Ltd., Osaka 536-0013, Japan
Shigeki Yamada
Department of Neurosurgery, Shiga University of Medical Science, Otsu 520-2192, Japan
Shigeo Ueda
Shin-Aikai Spine Center, Katano Hospital, Katano 576-0043, Japan
Chifumi Iseki
Division of Neurology and Clinical Neuroscience, Department of Internal Medicine III, Yamagata University School of Medicine, Yamagata 990-9585, Japan
Toshiyuki Kondo
Division of Neurology and Clinical Neuroscience, Department of Internal Medicine III, Yamagata University School of Medicine, Yamagata 990-9585, Japan
Keisuke Mori
School of Medicine, Shiga University of Medical Science, Otsu 520-2192, Japan
Yoshiyuki Kobayashi
Human Augmentation Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Kashiwa II Campus, University of Tokyo, Kashiwa 277-0882, Japan
Tadanori Fukami
Department of Informatics and Electronics, Faculty of Engineering, Yamagata University, Yamagata 992-8510, Japan
Minoru Hoshimaru
Shin-Aikai Spine Center, Katano Hospital, Katano 576-0043, Japan
Masatsune Ishikawa
Normal Pressure Hydrocephalus Center, Rakuwakai Otowa Hospital, Kyoto 607-8062, Japan
Yasuyuki Ohta
Division of Neurology and Clinical Neuroscience, Department of Internal Medicine III, Yamagata University School of Medicine, Yamagata 990-9585, Japan
Abstract
To quantitatively assess pathological gait, we developed a novel smartphone application for markerless, full-body human motion tracking in real time from video images captured with a smartphone monocular camera, using deep learning. As training data, we prepared an original three-dimensional (3D) dataset comprising more than 1 million images captured from the 3D motion of 90 humanoid characters, together with the two-dimensional COCO 2017 dataset. The 3D heatmap and offset data, consisting of 28 × 28 × 28 blocks in three red–green–blue channels for each of the 24 key points of whole-body motion, were learned by a convolutional neural network, a modified ResNet34. At each key point, the deviation of the hottest spot from the center of its cell was learned using the tanh function. Our new iOS application detected the relative tri-axial coordinates of the 24 whole-body key points, centered on the navel, in real time without any motion-capture markers. From these relative coordinates, the 3D angles of the neck, lumbar, and bilateral hip, knee, and ankle joints were estimated. Thus, any human motion can be quantitatively and easily assessed with the new smartphone application, named Three-Dimensional Pose Tracker for Gait Test (TDPT-GT), without body markers or multiple cameras.
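The following Swift sketch illustrates the two processing steps described above: decoding a single key point from a 28 × 28 × 28 heatmap with tanh-activated sub-cell offsets, and computing a 3D joint angle from navel-centered relative coordinates. The data layout, function names, and the interpretation of the offsets as half-cell fractions are assumptions made for illustration only and are not taken from the published TDPT-GT implementation.

```swift
import Foundation
import simd

// Decode one key point from a 28 x 28 x 28 heatmap plus per-cell offsets.
// Assumption (illustrative, not from the TDPT-GT code): the heatmap holds one
// confidence value per cell, and the three offset volumes hold tanh-activated
// displacements in (-1, 1), interpreted as fractions of half a cell.
func decodeKeyPoint(heatmap: [[[Float]]],
                    offsetX: [[[Float]]],
                    offsetY: [[[Float]]],
                    offsetZ: [[[Float]]],
                    gridSize: Int = 28) -> SIMD3<Float> {
    // 1. Find the hottest cell in the 3D grid.
    var best: Float = -.infinity
    var bx = 0, by = 0, bz = 0
    for z in 0..<gridSize {
        for y in 0..<gridSize {
            for x in 0..<gridSize where heatmap[z][y][x] > best {
                best = heatmap[z][y][x]
                (bx, by, bz) = (x, y, z)
            }
        }
    }
    // 2. Refine with the learned sub-cell offset and normalize to [0, 1].
    func refine(_ index: Int, _ offset: Float) -> Float {
        (Float(index) + 0.5 + 0.5 * offset) / Float(gridSize)
    }
    return SIMD3<Float>(refine(bx, offsetX[bz][by][bx]),
                        refine(by, offsetY[bz][by][bx]),
                        refine(bz, offsetZ[bz][by][bx]))
}

// Angle (in degrees) at joint b between segments b->a and b->c,
// e.g. the knee angle from the hip, knee, and ankle key points.
func jointAngle(_ a: SIMD3<Float>, _ b: SIMD3<Float>, _ c: SIMD3<Float>) -> Float {
    let u = simd_normalize(a - b)
    let v = simd_normalize(c - b)
    let cosine = max(-1, min(1, simd_dot(u, v)))   // clamp before acos
    return acos(cosine) * 180 / .pi
}

// Example: right knee flexion from navel-centered relative coordinates
// (coordinate values are arbitrary placeholders).
let hip   = SIMD3<Float>(0.10,  0.00, 0.02)
let knee  = SIMD3<Float>(0.12, -0.45, 0.05)
let ankle = SIMD3<Float>(0.11, -0.88, 0.15)
print("Knee angle:", jointAngle(hip, knee, ankle))
```

The same three-point angle routine would apply to the other joints listed in the abstract (neck, lumbar, hip, and ankle), given the corresponding triplets of relative key-point coordinates.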