# A Deep Look into neural ranking models for information retrieval


Volume 57, Issue 6, November 2020, 102067

## Abstract

Ranking models lie at the heart of research on information retrieval (IR). During the past decades, different techniques have been proposed for constructing ranking models, from traditional heuristic methods, probabilistic methods, to modern machine learning methods. Recently, with the advance of deep learning technology, we have witnessed a growing body of work in applying shallow or deep neural networks to the ranking problem in IR, referred to as neural ranking models in this paper. The power of neural ranking models lies in the ability to learn from the raw text inputs for the ranking problem to avoid many limitations of hand-crafted features. Neural networks have sufficient capacity to model complicated tasks, which is needed to handle the complexity of relevance estimation in ranking. Since there have been a large variety of neural ranking models proposed, we believe it is the right time to summarize the current status, learn from existing methodologies, and gain some insights for future development. In contrast to existing reviews, in this survey, we will take a deep look into the neural ranking models from different dimensions to analyze their underlying assumptions, major design principles, and learning strategies. We compare these models through benchmark tasks to obtain a comprehensive empirical understanding of the existing techniques. We will also discuss what is missing in the current literature and what are the promising and desired future directions.

## Introduction

Information retrieval is a core task in many real-world applications, such as digital libraries, expert finding, Web search, and so on. Essentially, IR is the activity of obtaining some information resources relevant to an information need from within large collections. As there might be a variety of relevant resources, the returned results are typically ranked with respect to some relevance notion. This ranking of results is a key difference of IR from other problems. Therefore, research on ranking models has always been at the heart of IR.

Many different ranking models have been proposed over the past decades, including vector space models (Salton, Wong, & Yang, 1975), probabilistic models (Robertson & Jones, 1976), and learning to rank (LTR) models (Li, 2011; Liu, 2009). Existing techniques, especially the LTR models, have already achieved great success in many IR applications, e.g., modern Web search engines like Google or Bing. There is still, however, much room for improvement in the effectiveness of these techniques for more complex retrieval tasks.

In recent years, deep neural networks have led to exciting breakthroughs in speech recognition (Hinton et al., 2012), computer vision (Krizhevsky, Sutskever, & Hinton, 2012; LeCun, Bengio, & Hinton, 2015), and natural language processing (NLP) (Bahdanau, Cho, & Bengio, 2014; Goldberg, 2017). These models have been shown to be effective at learning abstract representations from the raw input, and have sufficient model capacity to tackle difficult learning problems. Both of these are desirable properties for ranking models in IR. On one hand, most existing LTR models rely on hand-crafted features, which are usually time-consuming to design and often over-specific in definition. It would be of great value if ranking models could learn the useful ranking features automatically. On the other hand, relevance, as a key notion in IR, is often vague in definition and difficult to estimate since relevance judgments are based on a complicated human cognitive process. Neural models with sufficient model capacity have more potential for learning such complicated tasks than traditional shallow models. Due to these potential benefits and along with the expectation that similar successes with deep learning could be achieved in IR (Craswell, Croft, Guo, Mitra, & de Rijke, 2017a), we have witnessed substantial growth of work in applying neural networks for constructing ranking models in both academia and industry in recent years. Note that in this survey, we focus on neural ranking models for textual retrieval, which is central to IR, but not the only mode that neural models can be used for (Brenner, Zhao, Kutiyanawala, & Yan, 2018; Wan, Wang, Hoi, Wu, Zhu, Zhang, et al., 2014).

Perhaps the first successful model of this type is the Deep Structured Semantic Model (DSSM) (Huang et al., 2013) introduced in 2013, which is a neural ranking model that directly tackles the ad-hoc retrieval task. In the same year, Lu and Li (2013) proposed DeepMatch, which is a deep matching method applied to the Community-based Question Answering (CQA) and micro-blog matching tasks. Note that at the same time or even before this work, there were a number of studies focused on learning low-dimensional representations of texts with neural models (Mikolov, Sutskever, Chen, Corrado, & Dean, 2013b; Salakhutdinov & Hinton, 2009) and using them either within traditional IR models or with some new similarity metrics for ranking tasks. However, we would like to refer to those methods as representation learning models rather than neural ranking models, since they did not directly construct the ranking function with neural networks. Later, between 2014 and 2015, work on neural ranking models began to grow, such as new variants of DSSM (Huang et al., 2013), ARC I and ARC II (Hu, Lu, Li, & Chen, 2014), MatchPyramid (Pang et al., 2016b), and so on. Most of this research focused on short text ranking tasks, such as TREC QA tracks and Microblog tracks (Severyn & Moschitti, 2015). Since 2016, the study of neural ranking models has bloomed, with significant work volume, deeper and more rigorous discussions, and much wider applications (Onal et al., 2018). For example, researchers began to discuss the practical effectiveness of neural ranking models on different ranking tasks (Cohen, Ai, & Croft, 2016; Guo, Fan, Ai, & Croft, 2016). Neural ranking models have been applied to ad-hoc retrieval (Hui, Yates, Berberich, & de Melo, 2017a; Mitra, Diaz, & Craswell, 2017), community-based QA (Qiu & Huang, 2015), conversational search (Yan, Song, & Wu, 2016a), and so on. Researchers began to go beyond the architecture of neural ranking models, paying attention to new training paradigms of neural ranking models (Dehghani, Zamani, Severyn, Kamps, & Croft, 2017b), alternate indexing schemes for neural representations (Zamani, Dehghani, Croft, Learned-Miller, & Kamps, 2018b), integration of external knowledge (Xiong, Callan, & Liu, 2017a; Yang, Qiu, Qu, Guo, Zhang, Croft, et al., 2018), and other novel uses of neural approaches for IR tasks (Fan, Guo, Lan, Xu, Pang, & Cheng, 2017a; Tang & Yang, 2018).

Up to now, we have seen exciting progress on neural ranking models. In academia, several neural ranking models learned from scratch can already outperform state-of-the-art LTR models with tens of hand-crafted features (Fan, Guo, Lan, Xu, Zhai, & Cheng, 2018; Pang, Lan, Guo, Xu, Xu, & Cheng, 2017). Workshops and tutorials on this topic have attracted extensive interest in the IR community (Craswell, Croft, Guo, Mitra, & de Rijke, 2017a; Craswell, Croft, de Rijke, Guo, & Mitra, 2017b). Standard benchmark datasets (Nguyen, Rosenberg, Song, Gao, Tiwary, Majumder, & Deng, 2016b; Yang, Yih, & Meek, 2015), evaluation tasks (Dietz, Verma, Radlinski, & Craswell, 2017), and open-source toolkits (Fan et al., 2017b) have been created to facilitate research and rigorous comparison. Meanwhile, in industry, we have also seen models such as DSSM put into a wide range of practical usage in the enterprise (He, Gao, & Deng, 2014). Neural ranking models already generate the most important features for modern search engines. However, beyond these exciting results, there is still a long way to go for neural ranking models: (1) Neural ranking models have not had the level of breakthroughs achieved by neural methods in speech recognition or computer vision; (2) There is little understanding and few guidelines on the design principles of neural ranking models; (3) We have not identified the special capabilities of neural ranking models that go beyond traditional IR models. Therefore, it is the right moment to take a look back, summarize the current status, and gain some insights for future development.

There have been some related surveys on neural approaches to IR (neural IR for short). For example, Onal et al. (2018) reviewed the current landscape of neural IR research, paying attention to the application of neural methods to different IR tasks. Mitra and Craswell (2017) gave an introduction to neural information retrieval. In their booklet, they talked about fundamentals of text retrieval, and briefly reviewed IR methods employing pre-trained embeddings and neural networks. In contrast to these works, this survey does not try to cover every aspect of neural IR, but will focus on and take a deep look into ranking models with deep neural networks. Specifically, we formulate the existing neural ranking models under a unified framework, and review them from different dimensions to understand their underlying assumptions, major design principles, and learning strategies. We also compare representative neural ranking models through benchmark tasks to obtain a comprehensive empirical understanding. We hope these discussions will help researchers in neural IR learn from previous successes and failures, so that they can develop better neural ranking models in the future. In addition to the model discussion, we also introduce some trending topics in neural IR, including indexing schema, knowledge integration, visualized learning, contextual learning and model explanation. Some of these topics are important but have not been well addressed in this field, while others are very promising directions for future research.

In the following, we will first introduce some typical textual IR tasks addressed by neural ranking models in Section 2. We then provide a unified formulation of neural ranking models in Section 3. In Sections 4–6 we review the existing models with regard to different dimensions as well as making empirical comparisons between them. We discuss trending topics in Section 7 and conclude the paper in Section 8.

## Major applications of neural ranking models

In this section, we describe several major textual IR applications where neural ranking models have been adopted and studied in the literature, including ad-hoc retrieval, question answering, community question answering, and automatic conversation. There are other applications where neural ranking models have been or could be applied, e.g., product search (Brenner et al., 2018), sponsored search (Grbovic, Djuric, Radosavljevic, Silvestri, & Bhamidipati, 2015), and so on. However, due to page

## A unified model formulation

Neural ranking models are mostly studied within the LTR framework. In this section, we give a unified formulation of neural ranking models from a generalized view of LTR problems.

Suppose that $\mathcal{S}$ is the generalized query set, which could be the set of search queries, natural language questions or input utterances, and $\mathcal{T}$ is the generalized document set, which could be the set of documents, answers or responses. Suppose that $\mathcal{Y}=\{1,2,\cdots,l\}$ is the label set where labels represent grades. There exists a
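As a rough illustration of this setup, the sketch below treats a ranking model as a scoring function `f(s, t)` over an element `s` of the generalized query set and a candidate `t` from the generalized document set, and ranks candidates by descending score. The names `score` and `rank`, and the term-overlap scorer itself, are hypothetical stand-ins chosen for illustration; a neural ranking model would replace `score` with a learned network, and the excerpt above does not give the full formulation.

```python
from typing import Callable, List, Tuple


def score(s: str, t: str) -> float:
    """Hypothetical stand-in for f(s, t): fraction of query terms found in t."""
    query_terms = s.lower().split()
    doc_terms = set(t.lower().split())
    if not query_terms:
        return 0.0
    return sum(term in doc_terms for term in query_terms) / len(query_terms)


def rank(s: str, candidates: List[str],
         f: Callable[[str, str], float] = score) -> List[Tuple[str, float]]:
    """Sort candidates for query s by descending score f(s, t)."""
    scored = [(t, f(s, t)) for t in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy candidate set, assumed for illustration only.
    docs = [
        "deep learning for image classification",
        "neural ranking models for information retrieval",
        "a survey of ranking models",
    ]
    for doc, value in rank("neural ranking models", docs):
        print(f"{value:.2f}  {doc}")
```

Passing the scorer as a parameter keeps the ranking step independent of how relevance is estimated, so a hand-crafted scorer and a learned one can be swapped without changing `rank`.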

## Model architecture

Based on the above unified formulation, here we review existing neural ranking model architectures to better understand their basic assumptions and design principles.

## Model learning

Beyond the architecture, in this section, we review the major learning objectives and training strategies adopted by neural ranking models for comprehensive understanding.
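One objective commonly used in the LTR literature is a pairwise hinge loss, which penalizes a model when a relevant candidate is not scored above a non-relevant one by some margin. The sketch below is an assumption about what such an objective can look like, not a description of any specific model reviewed in the survey; the function name `pairwise_hinge_loss` and the default margin of 1.0 are illustrative choices.

```python
from typing import Sequence


def pairwise_hinge_loss(pos_scores: Sequence[float],
                        neg_scores: Sequence[float],
                        margin: float = 1.0) -> float:
    """Mean of max(0, margin - (s_pos - s_neg)) over aligned score pairs."""
    if len(pos_scores) != len(neg_scores):
        raise ValueError("pos_scores and neg_scores must have equal length")
    if len(pos_scores) == 0:
        return 0.0
    # Each pair contributes zero loss once the relevant candidate
    # outscores the non-relevant one by at least `margin`.
    losses = [max(0.0, margin - (p - n)) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)
```

In this sketch, a pair that is already ordered correctly with enough margin contributes nothing, so training effort concentrates on pairs the model currently gets wrong or nearly wrong.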

## Model comparison

In this section, we compare the empirical evaluation results of the previously reviewed neural ranking models on several popular benchmark data sets. We mainly survey and analyze the published results of neural ranking models for the ad-hoc retrieval and QA tasks. Note that sometimes it is difficult to compare published results across different papers: small changes such as different tokenization, stemming, etc. can lead to significant differences. Therefore, we attempt to collect results from
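To make the point about preprocessing concrete, the sketch below scores the same query-document pair under two tokenizers and obtains different term-overlap values. Both tokenizers, the `overlap` function, and the example strings are hypothetical illustrations, not the preprocessing or data used in any of the compared papers.

```python
import re
from typing import Callable, List


def whitespace_tokenize(text: str) -> List[str]:
    """Split on whitespace only; punctuation and case are left untouched."""
    return text.split()


def normalized_tokenize(text: str) -> List[str]:
    """Lowercase the text and keep only alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap(query: str, doc: str, tokenize: Callable[[str], List[str]]) -> float:
    """Fraction of query tokens that also occur in the document."""
    q = tokenize(query)
    d = set(tokenize(doc))
    return sum(tok in d for tok in q) / len(q) if q else 0.0


query = "Neural ranking"
doc = "A survey of neural ranking, evaluated on benchmarks."
print(overlap(query, doc, whitespace_tokenize))  # case and the trailing comma block both matches
print(overlap(query, doc, normalized_tokenize))  # both query terms match after normalization
```

With whitespace splitting the overlap is 0.0, and with lowercasing plus punctuation stripping it is 1.0, which is the kind of gap that makes results from differently preprocessed experiments hard to compare directly.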

## Trending topics

In this section, we discuss several trending topics related to neural ranking models. Some of these topics are important but have not been well addressed in this field, while some are very promising directions for future research.

## Conclusion

The purpose of this survey is to summarize the current research status on neural ranking models, analyze the existing methodologies, and gain some insights for future development. We introduced a unified formulation over the neural ranking models, and reviewed existing models based on this formulation from different dimensions under model architecture and model learning. For model architecture analysis, we reviewed existing models to understand their underlying assumptions and major design

## Acknowledgments

This work was funded by the National Natural Science Foundation of China (NSFC) under Grants no. 61425016 and 61722211, and the Youth Innovation Promotion Association CAS under Grants no. 20144310. This work was supported in part by the UMass Amherst Center for Intelligent Information Retrieval and in part by NSF IIS-1715095. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor.

## References (156)

• T.-Y. Liu, Learning to rank for information retrieval

### Foundations and Trends in Information Retrieval

(2009)

• A. Abujabal et al.

(2019)

(2018)

• Q. Ai et al.

(2018)

• Q. Ai et al.

(2018)

• Q. Ai et al.

### WSDM'19 Workshop on Deep Matching in Practical Applications (DAPA 19)

(2019)

• B.V.D. Akker et al.

(2019)

### Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval

(2011)

• R. Baeza-Yates et al.

### Modern information retrieval

(2011)

• D. Bahdanau et al.

### CoRR

(2014)

• P.N. Bennett et al.

### Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval

(2011)

• P.N. Bennett et al.

### Proceedings of the 35th international ACM SIGIR conference on research and development in information retrieval

(2012)

• L. Boytsov et al.

### Proceedings of the 25th ACM international on conference on information and knowledge management

(2016)

• E. Brenner et al.

### End-to-end neural ranking for ecommerce product search: an application of task models and textual embeddings

(2018)

• C. Burges et al.

(2005)

• C.J. Burges

### From ranknet to lambdarank to lambdamart: An overview

### Learning

(2010)

• O. Chapelle et al.

### Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining

(2010)

• H. Chen et al.

### Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining

(2018)

• L. Chen et al.

### Information retrieval technology

(2018)

• W. Chen et al.

### Proceedings of the 41st international ACM SIGIR conference on research & development in information retrieval

(2018)

• W. Chen et al.

### Advances in neural information processing systems

(2009)

• D. Cohen et al.

### Neu-IR: The SIGIR 2016 Workshop on Neural Information Retrieval

(2016)

• D. Cohen et al.

### The 41st international ACM SIGIR conference on research & development in information retrieval

(2018)

• D. Cohen et al.

### Proceedings of the 41st international ACM SIGIR conference on research & development in information retrieval

(2018)

• D. Cohen et al.

### Proceedings of the ACM SIGIR international conference on theory of information retrieval

(2018)

• D. Cohen et al.

### Proceedings of the 41st international ACM SIGIR conference on research & development in information retrieval, SIGIR

(2018)

• N. Craswell et al.

### Report on the SIGIR 2016 workshop on neural information retrieval (Neu-IR)

(2017)

• N. Craswell et al.

### Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval

(2017)

• Z. Dai et al.

### Proceedings of the eleventh ACM international conference on web search and data mining

(2018)

• M. Dehghani et al.

### CoRR

(2017)

• M. Dehghani et al.

### Proceedings of the 40th international acm sigir conference on research and development in information retrieval

(2017)

• L. Dietz et al.

### Proceedings of the twenty-sixth text retrieval conference, TREC

(2017)

• K. Duh et al.

### Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval

(2008)

• Y. Fan et al.

### Proceedings of the 2017 ACM on conference on information and knowledge management

(2017)

• Y. Fan et al.

### The 41st international ACM SIGIR conference on research & development in information retrieval

(2018)

• Y. Fan et al.

### CoRR

(2017)

• M. Feng et al.

### 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), IEEE

(2015)

• G.W. Furnas et al.

### Communication of the ACM

(1987)

• J. Gao et al.

(2019)

(2018)

• Y. Goldberg

### Synthesis lectures on human language technologies

(2017)

• I. Goodfellow et al.

### Advances in neural information processing systems

(2014)

• M. Grbovic et al.

### Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval

(2015)

• J. Guo et al.

### Proceedings of the 25th ACM international on conference on information and knowledge management

(2016)

• C.V. Gysel et al.

(2018)

• X. He et al.

### Deep learning for natural language processing: Theory and practice

(2014)

• G. Hinton et al.

### IEEE Signal Processing Magazine

(2012)

• D. Hoogeveen et al.

(2015)

• B. Hu et al.

### Advances in neural information processing systems 27

(2014)

• J. Huang et al.

(2017)

• ## Cited by (132)

• Virtual prompt pre-training for prototype-based few-shot relation extraction

2023, Expert Systems with Applications

Prompt tuning with pre-trained language models (PLM) has exhibited outstanding performance by reducing the gap between pre-training tasks and various downstream applications, which requires additional labor efforts in label word mappings and prompt template engineering. However, in a label intensive research domain, e.g., few-shot relation extraction (RE), manually defining label word mappings is particularly challenging, because the number of utilized relation label classes with complex relation names can be extremely large. Besides, the manual prompt development in natural language is subjective to individuals. To tackle these issues, we propose a virtual prompt pre-training method, projecting the virtual prompt to latent space, then fusing with PLM parameters. The pre-training is entity-relation-aware for RE, including the tasks of mask entity prediction, entity typing, distant supervised RE, and contrastive prompt pre-training. The proposed pre-training method can provide robust initialization for prompt encoding, while maintaining the interaction with the PLM. Furthermore, the virtual prompt can effectively avoid the labor efforts and the subjectivity issue in label word mapping and prompt template engineering. Our proposed prompt-based prototype network delivers a novel learning paradigm to model entities and relations via the probability distribution and Euclidean distance of the predictions of query instances and prototypes. The results indicate that our model yields an averaged accuracy gain of 4.21% on two few-shot datasets over strong RE baselines. Based on our proposed framework, our pre-trained model outperforms the strongest RE-related PLM by 6.52%.

• The Threat of Offensive AI to Organizations

2023, Computers and Security

AI has provided us with the ability to automate tasks, extract information from vast amounts of data, and synthesize media that is nearly indistinguishable from the real thing. However, positive tools can also be used for negative purposes. In particular, cyber adversaries can use AI to enhance their attacks and expand their campaigns.

Although offensive AI has been discussed in the past, there is a need to analyze and understand the threat in the context of organizations. For example, how does an AI-capable adversary impact the cyber kill chain? Does AI benefit the attacker more than the defender? What are the most significant AI threats facing organizations today and what will be their impact on the future?

In this study, we explore the threat of offensive AI on organizations. First, we present the background and discuss how AI changes the adversary’s methods, strategies, goals, and overall attack model. Then, through a literature review, we identify 32 offensive AI capabilities which adversaries can use to enhance their attacks. Finally, through a panel survey spanning industry, government and academia, we rank the AI threats and provide insights on the adversaries.

• LaSER: Language-specific event recommendation

2023, Journal of Web Semantics

While societal events often impact people worldwide, a significant fraction of events has a local focus that primarily affects specific language communities. Examples include national elections, the development of the Coronavirus pandemic in different countries, and local film festivals such as the César Awards in France and the Moscow International Film Festival in Russia. However, existing entity recommendation approaches do not sufficiently address the language context of recommendation. This article introduces the novel task of language-specific event recommendation, which aims to recommend events relevant to the user query in the language-specific context. This task can support essential information retrieval activities, including web navigation and exploratory search, considering the language context of user information needs. We propose LaSER, a novel approach toward language-specific event recommendation. LaSER blends the language-specific latent representations (embeddings) of entities and events and spatio-temporal event features in a learning to rank model. This model is trained on publicly available Wikipedia Clickstream data. The results of our user study demonstrate that LaSER outperforms state-of-the-art recommendation baselines by up to 33 percentage points in [emailprotected] concerning the language-specific relevance of recommended events.

• Learning to rank method combining multi-head self-attention with conditional generative adversarial nets

2022, Array

The existing methods of learning to rank often ignore the relationship between ranking features. If the relationship between them can be fully utilized, the performance of learning to rank methods can be improved. Aiming at this problem, an approach of learning to rank that combines a multi-head self-attention mechanism with Conditional Generative Adversarial Nets (CGAN) is proposed in this paper, named *GAN-LTR. The proposed approach improves some design ideas of Information Retrieval Generative Adversarial Networks (IRGAN) framework applied to web search, and a new network model is constructed by integrating convolution layer, multi-head self-attention layer, residual layer, fully connected layer, batch normalization, and dropout technologies into the generator and discriminator of Conditional Generative Adversarial Nets (CGAN). The convolutional neural network is used to extract the ranking feature representation of the hidden layer and capture the internal correlation and interactive information between features. The multi-head self-attention mechanism is used to fuse feature information in multiple vector subspaces and capture the attention weight of features, so as to assign appropriate weights to different features. The experimental results on the MQ2008-semi learning to rank dataset show that compared with IRGAN, our proposed learning to rank method *GAN-LTR has certain performance advantages in various performance indicators on the whole.

• A deep learning based method benefiting from characteristics of patents for semantic relation classification

2022, Journal of Informetrics

The deep learning has become an important technique for semantic relation classification in patent texts. Previous studies just borrowed the relevant models from generic texts to patent texts while keeping structure of the models unchanged. Due to significant distinctions between patent texts and generic ones, this enables the performance of these models in the patent texts to be reduced dramatically. To highlight these distinct characteristics in patent texts, seven annotated corpora from different fields are comprehensively compared in terms of several indicators for linguistic characteristics. Then, a deep learning based method is proposed to benefit from these characteristics. Our method exploits the information from other similar entity pairs as well as that from the sentences mentioning a focal entity pair. The latter stems from the conventional practices, and the former from our meaningful observation: the stronger the connection between two entity pairs is, the more likely they belong to the same relation type. To measure quantitatively the connection between two entity pairs, a similarity indicator on the basis of association rules is raised. Extensive experiments on the corpora of TFH-2020 and ChemProt demonstrate that our method for semantic relation classification is capable of benefiting from characteristic of patent texts.

• ListMAP: Listwise learning to rank as maximum a posteriori estimation

2022, Information Processing and Management

Listwise learning to rank models, which optimize the ranking of a document list, are among the most widely adopted algorithms for finding and ranking relevant documents to user information needs. In this paper, we propose ListMAP, a new listwise learning to rank model with prior distribution that encodes the informativeness of training data and assigns different weights to training instances. The main intuition behind ListMAP is that documents in the training dataset do not have the same impact on training a ranking function. ListMAP formalizes the listwise loss function as a maximum a posteriori estimation problem in which the scoring function must be estimated such that the log probability of the predicted ranked list is maximized given a prior distribution on the labeled data. We provide a model for approximating the prior distribution parameters from a set of observation data. We implement the proposed learning to rank model using neural networks. We theoretically discuss and analyze the characteristics of the introduced model and empirically illustrate its performance on a number of benchmark datasets; namely MQ2007 and MQ2008 of the Letor 4.0 benchmark, Set 1 and Set 2 of the Yahoo! learning to rank challenge data set, and Microsoft 30k and Microsoft 10K datasets. We show that the proposed models are effective across different datasets in terms of information retrieval evaluation metrics NDCG and MRR at positions 1, 3, 5, 10, and 20.

View all citing articles on Scopus

## Recommended articles (6)

• Research article

Network measures: A new paradigm towards reliable novel word sense detection

Information Processing & Management, Volume 57, Issue 6, 2020, Article 102173

In this era of digitization, with the fast flow of information on the web, words are being used to denote newer meanings. Thus novel sense detection becomes a crucial and challenging task in order to build any natural language processing application which depends on the efficient semantic representation of words. With the recent availability of large amounts of digitized texts, automated analysis of language evolution has become possible. Given corpus from two different time periods, the main focus of our work is to detect the words evolved with a novel sense precisely. We pose this problem as a binary classification task to detect whether a new sense of a target word has emerged. This paper presents a unique proposal based on network features to improve the precision of this task of detecting emerged new sense of a target word. For a candidate word where a new sense has been detected by comparing the sense clusters induced at two different time periods, we further compare the network properties of the subgraphs induced from novel sense clusters across these two time periods. Using the mean fractional change in edge density, structural similarity and average path length as features in a Support Vector Machine (SVM) classifier, manual evaluation gives precision values of 0.86 and 0.74 for the task of new sense detection, when tested on 2 distinct time-point pairs, in comparison to the precision values in the range of 0.23-0.32, when the proposed scheme is not used. The outlined method can, therefore, be used as a new post-hoc step to improve the precision of novel word sense detection in a robust and reliable way where the underlying framework uses a graph structure. Another important observation is that even though our proposal is a post-hoc step, it can be used in isolation and that itself results in a very decent performance achieving a precision of 0.54-0.62. Finally, we also show that our method is able to detect well-known historical shifts in 80% cases.

• Research article

Region-action LSTM for mouse interaction sequence based search satisfaction evaluation

Information Processing & Management, Volume 57, Issue 6, 2020, Article 102349

Mouse interaction data contain a lot of interaction information between users and Search Engine Result Pages (SERPs), which can be useful for evaluating search satisfaction. Existing studies use aggregated features or anchor elements to capture the spatial information in mouse interaction data, which might lose valuable mouse cursor movement patterns for estimating search satisfaction. In this paper, we leverage regions together with actions to extract sequences from mouse interaction data. Using regions to capture the spatial information in mouse interaction data would reserve more details of the interaction processes between users and SERPs. To modeling mouse interaction sequences for search satisfaction evaluation, we propose a novel LSTM unit called Region-Action LSTM (RALSTM), which could capture the interactive relations between regions and actions without subjecting the network to higher training complexity. Simultaneously, we propose a data augmentation strategy Multi-Factor Perturbation (MFP) to increase the pattern variations on mouse interaction sequences. We evaluate the proposed approach on open datasets. The experimental results show that the proposed approach achieves significant performance improvement compared with the state-of-the-art search satisfaction evaluation approach.

• Research article

A Contextual Recurrent Collaborative Filtering framework for modelling sequences of venue checkins

Information Processing & Management, Volume 57, Issue 6, 2020, Article 102092

Context-Aware Venue Recommendation (CAVR) systems aim to effectively generate a ranked list of interesting venues users should visit based on their historical feedback (e.g. checkins) and context (e.g. the time of the day or the user’s current location). Such systems are increasingly deployed by Location-based Social Networks (LBSNs) such as Foursquare and Yelp to enhance the satisfaction of the users. Matrix Factorisation (MF) is a popular Collaborative Filtering (CF) technique that can suggest relevant venues to users based on an assumption that similar users are likely to visit similar venues. In recent years, deep neural networks have been successfully applied to recommendation systems. Indeed, various approaches have been previously proposed in the literature to enhance the effectiveness of MF-based approaches by exploiting Recurrent Neural Networks (RNN) models to capture the sequential properties of observed checkins. Moreover, recently, several RNN architectures have been proposed to incorporate contextual information associated with the users’ sequence of checkins (for instance, the time interval or the geographical distance between two successive checkins) to effectively capture such short-term preferences of users. In this work, we propose a Contextual Recurrent Collaborative Filtering framework (CRCF) that leverages the users’ preferred context and the contextual information associated with the users’ sequence of checkins in order to model the users’ short-term preferences for CAVR. In particular, the CRCF framework is built upon two state-of-the-art approaches: namely Deep Recurrent Collaborative Filtering framework (DRCF) and Contextual Attention Recurrent Architecture (CARA). Thorough experiments on three large checkin and rating datasets from commercial LBSNs demonstrate the effectiveness and robustness of our proposed CRCF framework by significantly outperforming various state-of-the-art matrix factorisation approaches. 
In particular, the CRCF framework significantly improves [emailprotected] by 5–20% over the state-of-the-art DRCF framework (Manotumruksa, Macdonald, and Ounis, 2017a) and the CARA architecture (Manotumruksa, Macdonald, and Ounis, 2018) across the three datasets. Furthermore, the CRCF framework is significantly less risky than both the DRCF framework and the CARA architecture across the three datasets.
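The matrix factorisation idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the CRCF, DRCF, or CARA model: it only shows how, once users and venues share a latent space, venues can be ranked by a dot product. The function names, the two-dimensional factors, and the venue names are all hypothetical, and the factors are hand-picked here rather than learned from checkins.

```python
def dot(u, v):
    """Inner product of two equal-length latent vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_venues(user_vec, venue_vecs):
    """Rank venue names by predicted preference, highest score first.

    user_vec:   latent factors for one user (assumed already learned)
    venue_vecs: dict mapping venue name -> latent factors
    """
    scores = {venue: dot(user_vec, vec) for venue, vec in venue_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical, hand-picked factors for illustration only.
user = [0.9, 0.1]
venues = {
    "cafe": [0.8, 0.2],
    "museum": [0.1, 0.9],
    "bar": [0.5, 0.5],
}
ranking = rank_venues(user, venues)
```

Users with similar latent vectors receive similar rankings under this scoring, which is the collaborative filtering assumption the abstract describes.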

• Research article

Eating healthier: Exploring nutrition information for healthier recipe recommendation

Information Processing & Management, Volume 57, Issue 6, 2020, Article 102051

With the boom in personalized recipe sharing networks (e.g., Yummly), a deluge of recipes from different cuisines can be obtained easily. In this paper, we aim to solve a problem that many home cooks encounter when searching for recipes online: finding recipes that best fit a set of ingredients at hand while also following healthy eating guidelines. This task is especially difficult since the lion’s share of online recipes have been shown to be unhealthy. We propose a novel framework named NutRec, which models the interactions between ingredients and their proportions within recipes for the purpose of offering healthy recommendations. Specifically, NutRec consists of three main components: 1) using an embedding-based ingredient predictor to predict the relevant ingredients from user-defined initial ingredients, 2) predicting the amounts of the relevant ingredients with a multi-layer perceptron-based network, and 3) creating a healthy pseudo-recipe with a list of ingredients and their amounts according to the nutritional information and recommending the recipes most similar to the pseudo-recipe. We conduct experiments on two recipe datasets, Allrecipes with 36,429 recipes and Yummly with 89,413 recipes. The empirical results support the framework’s intuition and showcase its ability to retrieve healthier recipes.
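The final retrieval step of the pipeline above (matching a pseudo-recipe against real recipes) might look roughly like the following sketch. The abstract does not state which similarity measure NutRec uses, so cosine similarity over ingredient amounts is an assumption here, and the recipe names, ingredients, and gram amounts are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two recipes given as {ingredient: amount}."""
    shared = set(a) & set(b)
    num = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    den = norm_a * norm_b
    return num / den if den else 0.0


def top_similar(pseudo, recipes, k=2):
    """Return the names of the k recipes most similar to the pseudo-recipe."""
    ranked = sorted(recipes, key=lambda name: cosine(pseudo, recipes[name]),
                    reverse=True)
    return ranked[:k]


# Hypothetical pseudo-recipe and candidate recipes (amounts in grams).
pseudo = {"tomato": 200, "basil": 10, "olive oil": 15}
recipes = {
    "caprese": {"tomato": 180, "basil": 8, "mozzarella": 120},
    "pesto pasta": {"basil": 40, "olive oil": 50, "pasta": 200},
    "brownie": {"butter": 150, "sugar": 200, "cocoa": 60},
}
best = top_similar(pseudo, recipes, k=2)
```

A recipe sharing no ingredients with the pseudo-recipe scores zero, so it falls to the bottom of the ranking.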

• Research article

Hierarchical neural query suggestion with an attention mechanism

Information Processing & Management, Volume 57, Issue 6, 2020, Article 102040

Query suggestions help users of a search engine to refine their queries. Previous work on query suggestion has mainly focused on incorporating directly observable features such as query co-occurrence and semantic similarity. The structure of such features is often set manually, as a result of which hidden dependencies between queries and users may be ignored. We propose an Attention-based Hierarchical Neural Query Suggestion (AHNQS) model that uses an attention mechanism to automatically capture user preferences. AHNQS combines a session-level neural network and a user-level neural network into a hierarchical structure to model the short- and long-term search history of a user. We quantify the improvements of AHNQS over state-of-the-art recurrent neural network-based query suggestion baselines on the AOL query log dataset, with improvements of up to 9.66% and 12.51% in terms of [emailprotected] and [emailprotected], respectively; improvements are especially obvious for short sessions and inactive users with few search sessions.
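To make the attention idea concrete, here is a minimal sketch of dot-product attention pooling over a set of session vectors. It is not the AHNQS architecture: the scoring function, the vector sizes, and the names `attend`, `query`, and `states` are assumptions chosen for illustration. The point is only that attention turns relevance scores into weights that sum to one and uses them to build a weighted summary.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Weight each state by its dot product with the query, then average.

    Returns (weights, context), where context is the weighted sum of states.
    """
    scores = [sum(q * s for q, s in zip(query, st)) for st in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [sum(w * st[i] for w, st in zip(weights, states))
               for i in range(dim)]
    return weights, context


# Hypothetical session vectors and a user-level query vector.
states = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
query = [1.0, 0.0]
weights, context = attend(query, states)
```

The session most aligned with the query vector receives the largest weight, so it contributes most to the pooled representation.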

• Research article

Deep learning on information retrieval and its applications

Deep Learning for Data Analytics, 2020, pp. 125-153

In the domain of information retrieval (IR), the matching of query and document relies on ranking models to calculate the degree of their relevance. Ranking models therefore remain the central component of IR research. During the past decades, there has been a trend moving from traditional approaches to IR toward deep learning approaches. Traditional IR models include basic handcrafted retrieval models, semantic-based models, term dependency-based models, and learning to rank models. The deep learning approaches, on the other hand, involve methods of representation learning, methods of matching function learning, and methods of relevance learning. Recently, we have seen a growing number of publications in both conferences and journals using deep learning techniques to solve IR problems. The capability of neural ranking models to extract features directly from raw text inputs overcomes many limitations of traditional IR models that rely on handcrafted features. Moreover, deep learning methods manage to capture complicated matching patterns for document ranking. In this chapter, we introduce a novel way of classifying these existing IR models, along with their recent improvements and developments. To the best of our knowledge, our approach is the first to classify the existing work according to how the models generate the features and the ranking functions. Moreover, we provide a review of these proposed models to discuss different dimensions and to make empirical comparisons, followed by a conclusion with possible directions for future work.

