IEEE Transactions on Neural Systems and Rehabilitation Engineering (Jan 2022)
Intermediate Sensory Feedback Assisted Multi-Step Neural Decoding for Reinforcement Learning Based Brain-Machine Interfaces
Abstract
Reinforcement learning (RL)-based brain-machine interfaces (BMIs) decode movement intention from dynamic neural activity without requiring patients to move their limbs, which makes them promising for clinical applications. The movement task typically requires the subject to reach the target in a single step and rewards the subject immediately. However, a real BMI scenario involves tasks that require multiple steps, during which sensory feedback indicates the status of the prosthesis and the reward is given only at the end of the trial. In practice, subjects internally evaluate this sensory feedback to adjust their motor activity. Existing RL-BMI tasks have not fully used the brain's internal evaluation of sensory feedback to guide decoder training, and an effective tool for assigning credit in multi-step decoding tasks is lacking. We first propose to extract intermediate guidance from the medial prefrontal cortex (mPFC) to assist the learning of multi-step decoding in an RL framework. To explore the neural-action mapping effectively in a large state-action space, a temporal difference (TD) method is incorporated into quantized attention-gated kernel reinforcement learning (QAGKRL), not only to assign credit over the temporal sequence of movements but also to discriminate spatially in the reproducing kernel Hilbert space (RKHS). We test our approach on data collected from the primary motor cortex (M1) and the mPFC of rats while they used brain control to move a cursor to the target in multiple steps. Compared with models that use only the final reward, the intermediate evaluation interpreted from the mPFC improves prediction accuracy by 10.9% on average across subjects, with faster convergence and greater stability. Moreover, our proposed algorithm further increases decoding accuracy by 18.2% compared with existing TD-RL methods. The results reveal the possibility of achieving better multi-step decoding performance for more complicated BMI tasks.
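The credit-assignment idea can be illustrated with a minimal tabular TD(0) sketch. This is not the QAGKRL decoder described above: it assumes a fixed three-step trial, a lookup-table value function instead of a kernel representation, and hypothetical reward values (1.0 at the end of the trial, 0.5 for each intermediate evaluation). The function name `td0_values` and all parameters are illustrative assumptions.

```python
def td0_values(rewards, n_trials=200, alpha=0.1, gamma=0.9):
    """Tabular TD(0) on a fixed chain of steps (illustrative only).

    rewards[t] is the (hypothetical) reward received after step t.
    Returns the learned value of each step plus a terminal value of 0.
    """
    n_steps = len(rewards)
    values = [0.0] * (n_steps + 1)  # values[n_steps] is the terminal state
    for _ in range(n_trials):
        for t in range(n_steps):
            # TD error: reward plus discounted next value minus current value
            td_error = rewards[t] + gamma * values[t + 1] - values[t]
            values[t] += alpha * td_error
    return values


# Reward only at the end of the trial (assumed value 1.0)
final_only = td0_values([0.0, 0.0, 1.0])

# Same trial with assumed intermediate evaluations of 0.5 at each earlier step
with_intermediate = td0_values([0.5, 0.5, 1.0])

# After a single trial, early steps have learned nothing without
# intermediate feedback, but already carry value when it is present.
first_trial_final_only = td0_values([0.0, 0.0, 1.0], n_trials=1)
first_trial_intermediate = td0_values([0.5, 0.5, 1.0], n_trials=1)
```

In this toy setting the final reward reaches the first step only through repeated trials, whereas an intermediate signal updates every step from the first trial onward, which is the intuition behind the faster convergence reported above.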
Keywords