Applied AI Letters (Dec 2024)
An Application of 3D Vision Transformers and Explainable AI in Prosthetic Dentistry
Abstract
ABSTRACT To create and validate a transformer‐based deep neural network architecture for classifying 3D scans of teeth for computer‐assisted manufacturing and dental prosthetic rehabilitation surpassing previously reported validation accuracies obtained with convolutional neural networks (CNNs). Voxel‐based representation and encoding input data in a high‐dimensional space forms of preprocessing were investigated using 34 3D models of teeth obtained from intraoral scanning. Independent CNNs and vision transformers (ViTs), and their combination (CNN and ViT hybrid model) were implemented to classify the 3D scans directly from standard tessellation language (.stl) files and an Explainable AI (ExAI) model was generated to qualitatively explore the deterministic patterns that influenced the outcomes of the automation process. The results demonstrate that the CNN and ViT hybrid model architecture surpasses conventional supervised CNN, achieving a consistent validation accuracy of 90% through three‐fold cross‐validation. This process validated our initial findings, where each instance had the opportunity to be part of the validation set, ensuring it remained unseen during training. Furthermore, employing high‐dimensional encoding of input data solely with 3DCNN yields a validation accuracy of 80%. When voxel data preprocessing is utilized, ViT outperforms CNN, achieving validation accuracies of 80% and 50%, respectively. The study also highlighted the saliency map's ability to identify areas of tooth cavity preparation of restorative importance, that can theoretically enable more accurate 3D printed prosthetic outputs. The investigation introduced a CNN and ViT hybrid model for classification of 3D tooth models in digital dentistry, and it was the first to employ ExAI in the efforts to automate the process of dental computer‐assisted manufacturing.
Keywords