Autonomous novel class discovery for vision-based recognition in non-interactive environments

Xuelin Zhang; Feng Liu; Xuelian Cheng; Siyuan Yan; Zhibin Liao; Zongyuan Ge

Cognitive Robotics (Jan 2024)

Affiliations

Xuelin Zhang: Faculty of Information Technology, Monash University, 20 Exhibition Walk, Melbourne 3168, VIC, Australia; Corresponding author.
Feng Liu: The School of Computing and Information Systems, The University of Melbourne, Parkville, Melbourne 3052, VIC, Australia
Xuelian Cheng: Faculty of Information Technology, Monash University, 20 Exhibition Walk, Melbourne 3168, VIC, Australia
Siyuan Yan: Faculty of Information Technology, Monash University, 20 Exhibition Walk, Melbourne 3168, VIC, Australia
Zhibin Liao: Faculty of Sciences, Engineering and Technology, The University of Adelaide, North Terrace, Adelaide, 5001, SA, Australia
Zongyuan Ge: Faculty of Information Technology, Monash University, 20 Exhibition Walk, Melbourne 3168, VIC, Australia

Journal volume & issue: Vol. 4, pp. 191–203

Abstract

Visual recognition with deep learning has recently been shown to be effective in robotic vision. However, these algorithms tend to be built for fixed, structured environments, which is rarely the case in real life. When a robot encounters unknown objects, it must either avoid them or rely on human interaction; avoidance may miss critical objects, and human interaction can be prohibitively costly to obtain for robots in the real world. We consider a practical problem setting, defined as open-set clustering (OSC), that aims to allow robots to automatically discover novel classes with only labelled samples of known classes in hand. To address the OSC problem, we propose a framework combining three approaches: 1) using self-supervised vision transformers to mitigate the discarding of information needed for clustering unknown classes; 2) adaptive weighting of image patches to prioritize patches with richer textures; and 3) a temperature scaling strategy to generate more separable feature embeddings for clustering. We demonstrate the efficacy of our approach on six fine-grained image datasets.
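The abstract does not give formulas for the patch weighting or the temperature scaling, so the following is only a generic sketch of how the two ideas are commonly combined, not the authors' method. It assumes each image patch has an embedding vector and a scalar score (a hypothetical stand-in for "texture richness"), turns the scores into weights with a temperature-scaled softmax, and pools the patch embeddings with those weights. The function names `softmax_with_temperature` and `weighted_pool` and all example values are illustrative assumptions.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Softmax of scores / temperature.

    A lower temperature concentrates the weight on the highest scores;
    a higher temperature moves the weights toward uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_pool(patch_embeddings, patch_scores, temperature=1.0):
    """Weighted average of patch embeddings.

    patch_scores is a hypothetical per-patch score (assumption: higher
    means a more informative patch); it is not defined in the abstract.
    """
    weights = softmax_with_temperature(patch_scores, temperature)
    dim = len(patch_embeddings[0])
    return [
        sum(w * patch[d] for w, patch in zip(weights, patch_embeddings))
        for d in range(dim)
    ]


# Two toy patches with 2-D embeddings; the second patch has the higher score.
patches = [[0.0, 2.0], [2.0, 0.0]]
scores = [0.2, 0.9]
soft = weighted_pool(patches, scores, temperature=1.0)
sharp = weighted_pool(patches, scores, temperature=0.1)
```

With the lower temperature, the pooled vector `sharp` lies closer to the second patch's embedding than `soft` does, which is the sense in which temperature controls how strongly high-scoring patches dominate the image representation passed to a clustering step.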

Published in Cognitive Robotics

ISSN: 2667-2413 (Online)
Publisher: KeAi Communications Co. Ltd.
Country of publisher: China
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.keaipublishing.com/en/journals/cognitive-robotics/
