Sensors (Aug 2023)
A Variable Photo-Model Method for Object Pose and Size Estimation with Stereo Vision in a Complex Home Scene
Abstract
Model-based stereo vision methods can estimate the 6D poses of rigid objects, which helps robots grasp target objects in complex home environments. This study presents a novel approach, called the variable photo-model method, that estimates the pose and size of an unknown object using a single photo of an object from the same category. The method uses pre-trained You Only Look Once (YOLO) v4 weights to detect the object and generate a 2D model from the photo, then converts the segmented 2D photo-model into 3D flat photo-models with different assumed sizes and poses. Through perspective projection and model matching, it searches for the best match between the model and the actual object in the captured stereo images, with a genetic algorithm (GA) optimizing the matching fitness function. Unlike data-driven approaches, the method requires neither multiple photos nor pre-training time to recognize the pose of a single object, which makes it more versatile. Indoor experiments demonstrate that the variable photo-model method effectively estimates the pose and size of target objects within the same class. The findings have practical implications for object detection prior to robotic grasping, particularly because the method is easy to apply and requires little data.
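The search summarized above (hypothesize a pose and size, project the flat model into both stereo images, score the match, and let a GA improve the hypothesis) can be illustrated with a toy sketch. This is not the authors' implementation: the fitness here is a stand-in that compares projected corner positions rather than image content, and the camera constants `F` and `B`, the parameter ranges in `BOUNDS`, the pose parameterization (position, yaw, size), and all function names are assumptions made for illustration only.

```python
import math
import random

F, B = 500.0, 0.1  # assumed focal length (pixels) and stereo baseline (metres)
CORNERS = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]  # unit flat model
# Assumed search ranges for x, y, z (metres), yaw (radians), size (metres).
BOUNDS = [(-0.5, 0.5), (-0.5, 0.5), (0.5, 2.0), (-0.7, 0.7), (0.05, 0.5)]


def project(params):
    """Project the flat model's corners into the left and right images."""
    x, y, z, yaw, size = params
    left, right = [], []
    for cx, cy in CORNERS:
        # Scale the model, rotate it about the vertical axis, then translate.
        px = x + size * cx * math.cos(yaw)
        py = y + size * cy
        pz = z + size * cx * math.sin(yaw)
        left.append((F * px / pz, F * py / pz))
        # The right camera is shifted by the baseline B along the x axis.
        right.append((F * (px - B) / pz, F * py / pz))
    return left, right


def fitness(params, observed):
    """Stand-in fitness: negative squared corner error over both images."""
    pred_left, pred_right = project(params)
    obs_left, obs_right = observed
    err = 0.0
    for p, o in zip(pred_left + pred_right, obs_left + obs_right):
        err += (p[0] - o[0]) ** 2 + (p[1] - o[1]) ** 2
    return -err


def run_ga(observed, generations=150, pop_size=60, seed=0):
    """Simple elitist GA over (x, y, z, yaw, size); returns best and history."""
    rng = random.Random(seed)
    score = lambda p: fitness(p, observed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        history.append(score(pop[0]))
        nxt = pop[:2]  # elitism: the two best individuals survive unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(pop, 3), key=score)
            b = max(rng.sample(pop, 3), key=score)
            child = []
            for (lo, hi), ga, gb in zip(BOUNDS, a, b):
                w = rng.random()  # blend crossover weight
                g = w * ga + (1 - w) * gb + rng.gauss(0, 0.02 * (hi - lo))
                child.append(min(hi, max(lo, g)))  # clamp to the search range
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    history.append(score(best))
    return best, history


if __name__ == "__main__":
    truth = [0.1, -0.05, 1.0, 0.3, 0.2]  # hypothetical ground-truth pose and size
    best, history = run_ga(project(truth))
    print("estimate:", [round(v, 3) for v in best])
    print("final fitness:", history[-1])
```

Because both views are scored together, the disparity between the left and right projections constrains depth, which is what lets size and distance be separated in this toy setting. A real matching function would score image content inside the projected model region instead of known corner positions.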
Keywords