Simple indexing models such as the classical vector-space model are well tested in practice and generally accepted to perform well. They scale without trouble and cope with gigantic vocabularies. Their biggest downside is that they can match a query to a document only if the words in the query are also present in the document.
Many alternatives exist, in the form of probabilistic indexing, but they are not as mature or as predictable in their behavior as the classical vector-space model. The paper “Polynomial Semantic Indexing” proposes a discriminative model to link queries and documents, trained in a supervised manner on inputs $(q, d^+, d^-)$, where $q$ is a query, $d^+$ is a relevant document, and $d^-$ is an irrelevant document.
The model in this paper combines two ideas: the traditional inner product between normalized word-count vectors as a measure of similarity, and a factorization that embeds the words in a smaller latent space. Ignoring the “polynomial” in the title of the paper, the basic model looks like

$$f(q, d) = q^\top W d = \sum_{i,j} q_i W_{ij} d_j$$

where $W$ is a matrix of weights specified over each pair of words. Note that when $W = I$ (the identity matrix) the equation is nothing but the standard inner product. To avoid the large number of parameters in $W$ (one per pair of words), the authors choose to represent $W$ as the product of smaller matrices $U$ and $V$, plus the identity:

$$W = U^\top V + I, \qquad f(q, d) = q^\top (U^\top V + I)\, d = (Uq)^\top (Vd) + q^\top d$$

The expression above shows clearly how the relevance of a query to a document is a combination of the standard inner product $q^\top d$ and the product of the latent-space embeddings $Uq$ and $Vd$.
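To make the two-term structure concrete, here is a minimal NumPy sketch of the pairwise score. The matrix names `U` and `V`, the vocabulary size, the latent dimension, and the random initialization are all illustrative assumptions, not values taken from the paper. The sketch only checks that the low-rank form agrees with the full matrix $W = U^\top V + I$.

```python
import numpy as np

def psi_score(q, d, U, V):
    """Low-rank pairwise score: latent-space term plus the plain inner product."""
    return float((U @ q) @ (V @ d) + q @ d)

def full_score(q, d, W):
    """Same score written with an explicit word-by-word weight matrix W."""
    return float(q @ W @ d)

# Hypothetical sizes: 8 words in the vocabulary, 3 latent dimensions.
rng = np.random.default_rng(0)
n_words, n_latent = 8, 3
U = rng.normal(scale=0.1, size=(n_latent, n_words))
V = rng.normal(scale=0.1, size=(n_latent, n_words))

# Normalized word-count vectors for a query and a document (random stand-ins).
q = rng.random(n_words)
q /= np.linalg.norm(q)
d = rng.random(n_words)
d /= np.linalg.norm(d)

# The explicit matrix is n_words x n_words; the factors are only n_latent x n_words.
W = U.T @ V + np.eye(n_words)
```

With these definitions `psi_score(q, d, U, V)` and `full_score(q, d, W)` should agree up to floating-point error, while the factored form never builds the `n_words × n_words` matrix.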
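The authors then show that the model generalizes to more than just a pairwise weighting, as in the degree-3 case

$$f^3(q, d) = \sum_{i} (Uq)_i \, (Vd)_i \, (Yd)_i + f^2(q, d)$$

where $f^2$ is the pairwise model and $Y$ is a further embedding matrix applied to the document.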
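As a rough illustration of going beyond pairs, the sketch below adds a degree-3 term on top of the pairwise score. It assumes the higher-order term is the sum of the elementwise product of three latent embeddings, with a hypothetical third matrix `Y` applied to the document, and it reuses the same `U` and `V` for the pairwise part. Both choices are assumptions made for the example.

```python
import numpy as np

def psi_score_degree2(q, d, U, V):
    """Pairwise score: latent term plus the standard inner product."""
    return float((U @ q) @ (V @ d) + q @ d)

def psi_score_degree3(q, d, U, V, Y):
    """Degree-3 score: one query embedding times two document embeddings,
    summed over latent dimensions, added to the pairwise score."""
    cubic_term = np.sum((U @ q) * (V @ d) * (Y @ d))
    return float(cubic_term) + psi_score_degree2(q, d, U, V)
```

Setting `Y` to all zeros removes the cubic term, so the degree-3 score falls back to the pairwise one.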
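The training data consists of tuples of the form $(q, d^+, d^-)$, where $d^+$ and $d^-$ are a relevant and a non-relevant document with respect to the query $q$. The learned $f$ should satisfy $f(q, d^+) > f(q, d^-)$. The paper minimizes the margin ranking loss through stochastic gradient descent:

$$\sum_{(q, d^+, d^-)} \max\left(0,\; 1 - f(q, d^+) + f(q, d^-)\right)$$

The gradient with respect to the matrix $U$ is derived below for the case $1 - f(q, d^+) + f(q, d^-) > 0$ (otherwise the loss for that tuple is zero and there is nothing to update):

$$\frac{\partial}{\partial U}\left[1 - f(q, d^+) + f(q, d^-)\right] = -\frac{\partial}{\partial U}\,(Uq)^\top V (d^+ - d^-) = -V (d^+ - d^-)\, q^\top$$

And so $U$ is updated (and similarly $V$) using a learning rate $\lambda$ (fixed in this paper):

$$U \leftarrow U + \lambda\, V (d^+ - d^-)\, q^\top, \qquad V \leftarrow V + \lambda\, U q\, (d^+ - d^-)^\top$$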
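The training step can be sketched in a few lines of NumPy. This is an illustrative reading of the update for the pairwise model, not the authors' code: the function names, the default learning rate `lr=0.1`, and the margin of 1 are assumptions made for the example.

```python
import numpy as np

def score(q, d, U, V):
    """Pairwise score: latent term plus the standard inner product."""
    return float((U @ q) @ (V @ d) + q @ d)

def margin_loss(q, d_pos, d_neg, U, V):
    """Margin ranking loss for a single (query, relevant, irrelevant) tuple."""
    return max(0.0, 1.0 - score(q, d_pos, U, V) + score(q, d_neg, U, V))

def sgd_step(q, d_pos, d_neg, U, V, lr=0.1):
    """One stochastic gradient step; returns (U, V) unchanged if the margin holds."""
    if margin_loss(q, d_pos, d_neg, U, V) == 0.0:
        return U, V
    diff = d_pos - d_neg
    # Both updates use the old U and V, i.e. they are computed simultaneously.
    new_U = U + lr * np.outer(V @ diff, q)
    new_V = V + lr * np.outer(U @ q, diff)
    return new_U, new_V
```

A training loop would repeatedly sample a tuple and call `sgd_step`; tuples that already satisfy the margin leave the parameters untouched.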
How is it tested?
The authors make use of Wikipedia articles and the links contained therein to create test data: a document $d$ is considered relevant to a query document $q$ if there is a link to $d$ from $q$. A test set of documents and links is then evaluated by taking a document as the query, ranking the rest of the documents, and checking whether the linked documents are ranked higher than the others. A modified version of this experiment is also undertaken in which only a small set of random words is selected from the query document, to mimic a keyword search.
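The evaluation idea can be sketched as follows: score every candidate document against the query, sort, and count how often an unlinked document outranks a linked one. The function names and the pairwise error measure here are assumptions chosen for illustration; the paper's exact metrics are not reproduced.

```python
import numpy as np

def rank_documents(q, docs, U, V):
    """Return document indices sorted from highest to lowest score.
    docs holds one word-count vector per row."""
    scores = (docs @ V.T) @ (U @ q) + docs @ q
    return np.argsort(-scores)

def pairwise_rank_error(ranking, relevant):
    """Fraction of (relevant, irrelevant) pairs in which the irrelevant
    document is ranked above the relevant one."""
    position = np.empty(len(ranking), dtype=int)
    position[ranking] = np.arange(len(ranking))
    rel = [i for i in range(len(ranking)) if i in relevant]
    irr = [i for i in range(len(ranking)) if i not in relevant]
    wrong = sum(int(position[r] > position[i]) for r in rel for i in irr)
    return wrong / (len(rel) * len(irr))
```

Here `relevant` would be the set of documents linked from the query document; an error of 0 means every linked document is ranked above every unlinked one.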
Another interesting application of this model is cross-language document retrieval: query documents are taken from one language and relevant documents from another. In this case, the authors pair a Japanese Wikipedia article (acting as the query document) with its English counterpart, or with a document linked from that counterpart (acting as the relevant document), during training. Then, given a Japanese query document $q_J$ whose English counterpart is $q_E$, we can evaluate whether the model ranks the documents linked from $q_E$ higher than those that are not linked from it.
Bing Bai, Jason Weston, and others. 2009. “Polynomial Semantic Indexing”. Advances in Neural Information Processing Systems 22.