Information Technology and Libraries (Sep 1974)

Application of the Variety-Generator Approach to Searches of Personal Names in Bibliographic Data Bases--Part 2. Optimization of Key-Sets, and Evaluation of Their Retrieval Efficiency

  • Dirk W. Fokker,
  • Michael F. Lynch

DOI
https://doi.org/10.6017/ital.v7i3.8951
Journal volume & issue
Vol. 7, no. 3
pp. 201 – 213

Abstract


Keys consisting of variable-length character strings from the front and rear of surnames, derived by analysis of author names in a particular data base, are used to provide approximate representations of author names. When combined in appropriate ratios, and used together with keys for each of the first two initials of personal names, they provide a high degree of discrimination in search. Methods for optimization of key-sets are described, and the performance of key-sets varying in size between 150 and 300 is determined at file sizes of up to 50,000 name entries. The effects of varying the proportions of the queries present in the file are also examined. The results obtained with fixed-length keys are compared with those for variable-length keys, showing the latter to be greatly superior. Implications of the work for a variety of types of information systems are discussed.
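
As an illustration only, the idea of representing a name by variable-length keys from the front and rear of the surname, together with the first two initials, might be sketched as follows. The key-sets, the longest-match rule, and the names `FRONT_KEYS`, `REAR_KEYS`, `longest_match`, and `encode` are assumptions made for this sketch; the abstract does not give the actual key-sets (which are derived by analysis of a particular data base) or the matching rule.

```python
# Hypothetical, hand-picked key-sets; real ones would be derived from
# the distribution of character strings in the data base.
FRONT_KEYS = {"S", "SM", "SMI", "L", "LY", "F"}
REAR_KEYS = {"H", "TH", "R", "ER", "CH", "NCH"}


def longest_match(s, keys, from_rear=False):
    """Return the longest key that is a prefix (or suffix) of s, else None."""
    for n in range(len(s), 0, -1):
        fragment = s[-n:] if from_rear else s[:n]
        if fragment in keys:
            return fragment
    return None


def encode(surname, initials):
    """Approximate a name as (front key, rear key, first two initials)."""
    s = surname.upper()
    return (
        longest_match(s, FRONT_KEYS),
        longest_match(s, REAR_KEYS, from_rear=True),
        tuple(initials.upper()[:2]),
    )


smith = encode("Smith", "JA")
lynch = encode("Lynch", "MF")
```

Under these assumed key-sets, two names would be treated as candidate matches in a search when their encoded tuples are equal; names with different codes are excluded without comparing the full strings.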