ELCVIA Electronic Letters on Computer Vision and Image Analysis (Jun 2014)

Towards an interactive index structuring system for content-based image retrieval in large image databases

  • Hien Phuong Lai,
  • Muriel Visani,
  • Alain Boucher,
  • Jean-Marc Ogier

DOI
https://doi.org/10.5565/rev/elcvia.618
Journal volume & issue
Vol. 13, no. 2

Abstract

Read online

In recent years, the expansion of acquisition devices such as digital cameras, the development of storage and transmission techniques and the success of tablet computers facilitate the development of many large image databases as well as the interactions with the users. This thesis [1] deals with the problem of Content-Based Image Retrieval (CBIR) on these huge masses of data. Traditional CBIR systems generally rely on three phases: feature extraction, feature space structuring and retrieval. In this thesis, we are particularly interested in the structuring phase (normally called indexing phase), which plays a very important role in finding information in large databases. This phase aims at organizing the visual feature descriptors of all images into an efficient data structure in order to facilitate, accelerate and improve further retrieval. We assume that the feature extraction phase is completed and the image feature descriptors which are usually low-level features describing the color, shape, texture, etc. of all images are available. Instead of traditional structuring methods, clustering methods which organize image descriptors into groups of similar objects (clusters), without any constraint on the cluster size, are studied. The aim is to obtain an indexed structure more adapted to the retrieval of high dimensional and unbalanced data. Clustering process can be done without prior knowledge (unsupervised clustering) or with a limited amount of prior knowledge (semi-supervised clustering). Due to the “semantic gap” between high-level semantic concepts expressed by the user via the query and the low-level features automatically extracted from the images, the clustering results and therefore the retrieval results are generally different from the wishes of the user. In this thesis, we proposed to involve the user in the clustering phase so that he/she can interact with the system so as to improve the clustering results, and thus improve the performance of the further retrieval. The idea is as follows. Firstly, images are organized into clusters by using an initial clustering. Then, the user visualizes the clustering result and provides feedback to the system in order to guide the re-clustering phase. The system then re-organizes the dataset by using not only the similarity between objects, but also the feedback given by the user in order to reduce the semantic gap. The interactive loop can be iterated until the clustering result satisfies the user. In the case of large database indexing, we assume that the user has no prior knowledge about the image database. Therefore, an unsupervised clustering method is suitable to be used for the initial clustering, when no supervised information is available yet. Then, after receiving the user feedback in each interactive iteration, a semi-supervised clustering can be used for the re-clustering process. Based on a deep study of the state of the art of different unsupervised clustering methods [4] as well as semi- supervised clustering approaches [2, 3], we propose in this thesis a new interactive semi-supervised clustering model [3] involving the user in the clustering phase in order to improve the clustering results. From the formal analysis of different unsupervised clustering methods [4], we chose to experiment some methods which appear to be the most suitable to be used in an incremental context involving the user in the clustering stage. The hierarchical BIRCH unsupervised clustering (Zhang et al., 1996) which gives the best performance from these experiments [4] is chosen to be used as the initial clustering in our model. Then, an interactive loop in which the user provides the feedback to the system and the system re-organizes the image database using the new semi-supervised clustering method proposed in this thesis is iterated until the clustering result satisfies the user. As the user has no prior knowledge about the image database, it is difficult for him/her to label the clusters or the images in the clusters using classes. Therefore, we provide to the user an interactive interface allowing him/her to easily visualize the clustering result and give feedback to the system. Based on the majority of the images displayed for each cluster, the user can specify, by some simple clicks, relevant or non-relevant images for each cluster. The user can also drag and drop images between clusters in order to change the cluster assignment of some images. Then, supervised information is deduced from the user feedback in order to be used for the re-clustering phase using the proposed semi-supervised clustering method. According to our study of the state of the art of different semi-supervised clustering methods, supervised information may consist of class labels for some objects or pairwise constraints (must-link or cannot-link) between objects. The experimental analysis of different semi-supervised clustering methods in the interactive context [2, 3] shows a high performance of the HMRF-kmeans (Basu et al., 2004) which uses pairwise constraints compared with the other methods. Inspired from the HMRF-kmeans method, we proposed a new semi-supervised clustering method [3] for the re-clustering process. Instead of using pairwise constraints between images, our method uses pairwise constraints between the leaf entries (CF entries) of the BIRCH tree as supervised information for guiding the re-organization of the CF entries in the re-clustering phase. As each CF entry groups a set of similar images, pairwise constraints between images can be replaced by a smaller number of pairwise constraints between CF entries, without reducing the quality of supervised information. And therefore, the processing time could be reduced without decreasing the performance. In our model, after receiving user feedback in each interactive iteration, pairwise constraints can be deduced based not only on the user feedback but also on the neighbourhood information. Neighbourhood information groups images according to the willingness of the user to classify them in the same clusters (via user feedback of all interactive iterations). This kind of information helps to maximize the supervised information (pairwise constraints) gained from a same number of user clicks. In order to avoid the subjective dependence of the clustering results on the human user, a software agent simulating the behaviour of the human user for providing feedback to the system is used for the experimental analysis of our system using different image databases of increasing sizes (Wang, PascalVoc2006, Caltech101, Corel30k). Moreover, different strategies for deducing pairwise constraints from user feedback and neighbour- hood information were investigated. Among these strategies, the strategy which keeps only the most “diffi- cult” constraints (must-link between the most distant objects and cannot-links between the closest objects) was shown to give the best trade-off between the performance and the processing time. Furthermore, the experi- mental results show that our model helps to improve the clustering results by involving the user and that our semi-supervised clustering outperforms the HMRF-kmeans, in both performance and processing time. Note that our clustering structure can be used not only for facilitating the further image retrieval, but also for helping the navigation in large image databases. Moreover, in this thesis, we propose a 2D interface for visualizing the group structure of high dimensional image databases.

Keywords