Algorithms (Apr 2022)
Large-Scale Multimodal Piano Music Identification Using Marketplace Fingerprinting
Abstract
This paper studies the problem of identifying piano music in various modalities using a single, unified approach called marketplace fingerprinting. The key defining characteristic of marketplace fingerprinting is choice: we consider a broad range of fingerprint designs based on a generalization of standard n-grams, and then select the fingerprint designs at runtime that are best for a specific query. We show that the large-scale retrieval problem can be framed as an economics problem in which a consumer and a store interact. In our analogy, the runtime search is like a consumer shopping in the store, the items for sale correspond to fingerprints, and purchasing an item corresponds to doing a fingerprint lookup in the database. Using basic principles of economics, we design an efficient marketplace in which the consumer has many options and adopts a rational buying strategy that explicitly considers the cost and expected utility of each item. We evaluate our marketplace fingerprinting approach on four different sheet music retrieval tasks involving sheet music images, MIDI files, and audio recordings. Using a database containing approximately 375,000 pages of sheet music, our method is able to achieve 0.91 mean reciprocal rank with sub-second average runtime on cell phone image queries. On all four retrieval tasks, the marketplace method substantially outperforms previous methods while simultaneously reducing average runtime. We present comprehensive experimental results, as well as detailed analyses to provide deeper intuition into system behavior.
Keywords