Journal of Techniques (Jun 2023)
Generating Encrypted Document Index Structure Using Tree Browser
Abstract
Document indexing stores documents in a form that allows specific documents to be retrieved efficiently, in terms of both accuracy and time complexity. Many information retrieval systems suffer from security weaknesses and long execution times when retrieving relevant documents, and their indexes also consume a large amount of storage. Confidentiality must therefore be combined with the indexed documents, which otherwise requires a separate process to encrypt them. This paper proposes a new indexing structure, the tree browser (TB), for indexing the files of a large document set in encrypted form. The method represents keywords in a variable-length binary format before they are stored in the index. This binary format both adds a layer of encryption to the stored information and reduces the index size. The proposed method (TB) is applied to the WebKB dataset, which consists of web page documents (semi-structured documents). The experimental results show that the TB-tree reduces the index storage size to 48.5 MB, compared with 307 MB for the traditional index.
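To put the two reported sizes in proportion, the relative reduction can be worked out directly from the figures quoted in the abstract. The sketch below is plain arithmetic on those two values only; the function name `size_reduction` is a hypothetical helper introduced for illustration and is not part of the proposed method.

```python
def size_reduction(reduced_mb: float, baseline_mb: float) -> tuple[float, float]:
    """Return (fraction of the baseline retained, fraction saved)."""
    ratio = reduced_mb / baseline_mb
    return ratio, 1.0 - ratio


# Sizes as reported in the abstract: TB-tree index vs. traditional index.
ratio, saved = size_reduction(48.5, 307.0)
print(f"TB index is {ratio:.1%} of the traditional index ({saved:.1%} smaller)")
# prints "TB index is 15.8% of the traditional index (84.2% smaller)"
```

On the reported figures, the TB index is therefore roughly one sixth the size of the traditional index.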
Keywords