Mathematics (Aug 2024)
Evaluation of Deformable Convolution: An Investigation in Image and Video Classification
Abstract
Convolutional Neural Networks (CNNs) present drawbacks for modeling geometric transformations, caused by the convolution operation’s locality. Deformable convolution (DCON) is a mechanism that solves these drawbacks and improves the robustness. In this study, we clarify the optimal way to replace the standard convolution with its deformable counterpart in a CNN model. To this end, we conducted several experiments using DCONs applied in the layers that conform a small four-layer CNN model and on the four-layers of several ResNets with depths 18, 34, 50, and 101. The models were tested in binary balanced classes with 2D and 3D data. If DCON is used on the first layers of the proposal of model, the computational resources will tend to increase and produce bigger misclassification than the standard CNN. However, if the DCON is used at the end layers, the quantity of Flops will decrease, and the classification accuracy will improve by up to 20% about the base model. Moreover, it gains robustness because it can adapt to the object of interest. Also, the best kernel size of the DCON is three. With these results, we propose a guideline and contribute to understanding the impact of DCON on the robustness of CNNs.
Keywords