Searching protein databases to identify newly sequenced genes is commonplace in molecular biology laboratories. However, the procedure is rarely taught formally to students, and method cookbooks discuss it only briefly. This article uses a single family of highly diverged uracil-DNA glycosylases, which fall into two distinct groups, to highlight some of the difficulties of identifying such proteins by database searching.