Advanced Intelligent Systems (Oct 2023)
Structure‐Aware Image Translation‐Based Long Future Prediction for Enhancement of Ground Robotic Vehicle Teleoperation
Abstract
Predicting future frames through image‐to‐image translation and using these synthetically generated frames for high‐speed ground vehicle teleoperation is a new concept for addressing latency and enhancing operational performance. In our immediately preceding work, the image quality of the predicted frames was low and much of the scene detail was lost. To preserve the structural details of objects and improve overall image quality in the predicted frames, several novel ideas are proposed herein. A filter has been designed to remove the noise that frame rate inconsistencies introduce into the dense optical flow components. The Pix2Pix base network has been modified and a structure‐aware SSIM‐based perceptual loss function has been implemented. A new dataset of 20 000 training input images and 2000 test input images, with a 500 ms delay between the input and target frames, has been created. Without any additional video transformation steps, the proposed improved model achieved a PSNR of 23.1, an SSIM of 0.65, and an MS‐SSIM of 0.80, a substantial improvement over our previous work. A Fleiss’ kappa score of >0.40 (0.48 for the modified network and 0.46 for the perceptual loss function) supports the reliability of the model.
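For readers unfamiliar with the reported metrics, the following is a minimal sketch of how PSNR and an SSIM‐style score can be computed between a target frame and a predicted frame, and how an SSIM‐based loss term can be derived from the latter. It is an illustration under stated assumptions, not the authors' implementation: the function names (psnr, global_ssim, ssim_loss) are hypothetical, the SSIM here uses whole‐image statistics rather than the usual sliding window, the constants 0.01 and 0.03 are the conventional SSIM defaults, and the abstract does not specify how the SSIM term enters the loss.

```python
import numpy as np


def psnr(target, pred, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two same-shaped images."""
    target = np.asarray(target, dtype=np.float64)
    pred = np.asarray(pred, dtype=np.float64)
    mse = np.mean((target - pred) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)


def global_ssim(target, pred, max_val=255.0):
    """SSIM from whole-image statistics.

    Simplification (assumption): standard SSIM averages this quantity over
    local windows; here a single global window is used for brevity.
    """
    x = np.asarray(target, dtype=np.float64)
    y = np.asarray(pred, dtype=np.float64)
    c1 = (0.01 * max_val) ** 2  # stabilises the luminance term
    c2 = (0.03 * max_val) ** 2  # stabilises the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(target, pred, max_val=255.0):
    """Hypothetical structure-aware loss term: 0 when the images match."""
    return 1.0 - global_ssim(target, pred, max_val)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
    noisy = np.clip(frame + rng.normal(0.0, 10.0, size=frame.shape), 0, 255)
    print("PSNR:", psnr(frame, noisy))
    print("SSIM (global):", global_ssim(frame, noisy))
    print("SSIM loss:", ssim_loss(frame, noisy))
```

In a training setting, a term of this kind would typically be computed on tensors in the deep learning framework in use and combined with the adversarial and reconstruction terms of Pix2Pix; the weighting between them is not given in the abstract.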
Keywords