ACM International Conference on Multimedia (ACMMM) 2018

Reconfigurable Inverted Index

Yusuke Matsui
National Institute of Informatics
Ryota Hinami
The University of Tokyo
Shin'ichi Satoh
National Institute of Informatics

The search is operated for a subset of a database, which is specified by the target identifiers. The search result (ranked list) should contain the specified items only.

Given a fast (optimized) ANN system, new vectors are added. Is the updated ANN system still fast?

Abstract

Existing approximate nearest neighbor search systems suffer from two fundamental problems that are of practical importance but have not received sufficient attention from the research community. First, although existing systems perform well for the whole database, it is difficult to run a search over a subset of the database. Second, there has been no discussion concerning the performance decrement after many items have been newly added to a system. We develop a reconfigurable inverted index (Rii) to resolve these two issues. Based on the standard IVFADC system, we design a data layout such that items are stored linearly. This enables us to efficiently run a subset search by switching the search method to a linear PQ scan if the size of a subset is small. Owing to the linear layout, the data structure can be dynamically adjusted after new items are added, maintaining the fast speed of the system. Extensive comparisons show that Rii achieves a comparable performance with state-of-the art systems such as Faiss.

Install

  • pip install rii

Publication

BibTeX

@misc{cvpr20_tutorial_image_retrieval,
    author = {Yusuke Matsui and Takuma Yamaguchi and Zheng Wang},
    title = {CVPR2020 Tutorial on Image Retrieval in the Wild},
    howpublished = {\url{https://matsui528.github.io/cvpr2020_tutorial_retrieval/}},
    year = {2020}
}

Last updated: July 25, 2020