Exergames have been proposed as a potential tool to improve the current practice of musculoskeletal rehabilitation. Inertial or optical motion capture sensors are commonly used to track the subject’s movements. However, the use of these motion capture tools suffers from the lack of accuracy in estimating joint angles, which could lead to wrong data interpretation. In this study, we proposed a real time quaternion-based fusion scheme, based on the extended Kalman filter, between inertial and visual motion capture sensors, to improve the estimation accuracy of joint angles. The fusion outcome was compared to angles measured using a goniometer. The fusion output shows a better estimation, when compared to inertial measurement units and Kinect outputs. We noted a smaller error (3.96°) compared to the one obtained using inertial sensors (5.04°). The proposed multi-sensor fusion system is therefore accurate enough to be applied, in future works, to our serious game for musculoskeletal rehabilitation.