The search is operated for a subset of a database, which is specified by the target identifiers. The search result (ranked list) should contain the specified items only.
Given a fast (optimized) ANN system, new vectors are added. Is the updated ANN system still fast?
Existing approximate nearest neighbor search systems suffer from two fundamental problems that are of practical importance but have not received sufficient attention from the research community. First, although existing systems perform well for the whole database, it is difficult to run a search over a subset of the database. Second, there has been no discussion concerning the performance decrement after many items have been newly added to a system. We develop a reconfigurable inverted index (Rii) to resolve these two issues. Based on the standard IVFADC system, we design a data layout such that items are stored linearly. This enables us to efficiently run a subset search by switching the search method to a linear PQ scan if the size of a subset is small. Owing to the linear layout, the data structure can be dynamically adjusted after new items are added, maintaining the fast speed of the system. Extensive comparisons show that Rii achieves a comparable performance with state-of-the art systems such as Faiss.
@inproceedings{acmmm_matsui_2018, author={Yusuke Matsui and Ryota Hinami and Shin'ichi Satoh}, title={Reconfigurable Inverted Index}, booktitle={ACM International Conference on Multimedia (ACMMM)}, pages={1715--1723}, year={2018} }