Volume 57, Issue 6, November 2020, 102067
Abstract
Ranking models lie at the heart of research on information retrieval (IR). During the past decades, different techniques have been proposed for constructing ranking models, from traditional heuristic methods, probabilistic methods, to modern machine learning methods. Recently, with the advance of deep learning technology, we have witnessed a growing body of work in applying shallow or deep neural networks to the ranking problem in IR, referred to as neural ranking models in this paper. The power of neural ranking models lies in the ability to learn from the raw text inputs for the ranking problem to avoid many limitations of hand-crafted features. Neural networks have sufficient capacity to model complicated tasks, which is needed to handle the complexity of relevance estimation in ranking. Since there have been a large variety of neural ranking models proposed, we believe it is the right time to summarize the current status, learn from existing methodologies, and gain some insights for future development. In contrast to existing reviews, in this survey, we will take a deep look into the neural ranking models from different dimensions to analyze their underlying assumptions, major design principles, and learning strategies. We compare these models through benchmark tasks to obtain a comprehensive empirical understanding of the existing techniques. We will also discuss what is missing in the current literature and what are the promising and desired future directions.
Introduction
Information retrieval is a core task in many real-world applications, such as digital libraries, expert finding, Web search, and so on. Essentially, IR is the activity of obtaining some information resources relevant to an information need from within large collections. As there might be a variety of relevant resources, the returned results are typically ranked with respect to some relevance notion. This ranking of results is a key difference of IR from other problems. Therefore, research on ranking models has always been at the heart of IR.
Many different ranking models have been proposed over the past decades, including vector space models (Salton, Wong, & Yang, 1975), probabilistic models (Robertson & Jones, 1976), and learning to rank (LTR) models (Li, 2011, Liu, 2009). Existing techniques, especially the LTR models, have already achieved great success in many IR applications, e.g., modern Web search engines like Google or Bing. There is still, however, much room for improvement in the effectiveness of these techniques for more complex retrieval tasks.
In recent years, deep neural networks have led to exciting breakthroughs in speech recognition (Hinton et al., 2012), computer vision (Krizhevsky, Sutskever, Hinton, 2012, LeCun, Bengio, Hinton, 2015), and natural language processing (NLP) (Bahdanau, Cho, Bengio, 2014, Goldberg, 2017). These models have been shown to be effective at learning abstract representations from the raw input, and have sufficient model capacity to tackle difficult learning problems. Both of these are desirable properties for ranking models in IR. On one hand, most existing LTR models rely on hand-crafted features, which are usually time-consuming to design and often over-specific in definition. It would be of great value if ranking models could learn the useful ranking features automatically. On the other hand, relevance, as a key notion in IR, is often vague in definition and difficult to estimate since relevance judgments are based on a complicated human cognitive process. Neural models with sufficient model capacity have more potential for learning such complicated tasks than traditional shallow models. Due to these potential benefits and along with the expectation that similar successes with deep learning could be achieved in IR (Craswell, Croft, Guo, Mitra, & de Rijke, 2017a), we have witnessed substantial growth of work in applying neural networks for constructing ranking models in both academia and industry in recent years. Note that in this survey, we focus on neural ranking models for textual retrieval, which is central to IR, but not the only mode that neural models can be used for (Brenner, Zhao, Kutiyanawala, Yan, 2018, Wan, Wang, Hoi, Wu, Zhu, Zhang, et al., 2014).
Perhaps the first successful model of this type is the Deep Structured Semantic Model (DSSM) (Huang et al., 2013) introduced in 2013, which is a neural ranking model that directly tackles the ad-hoc retrieval task. In the same year, Lu and Li (2013) proposed DeepMatch, which is a deep matching method applied to the Community-based Question Answering (CQA) and micro-blog matching tasks. Note that at the same time or even before this work, there were a number of studies focused on learning low-dimensional representations of texts with neural models (Mikolov, Sutskever, Chen, Corrado, Dean, 2013b, Salakhutdinov, Hinton, 2009) and using them either within traditional IR models or with some new similarity metrics for ranking tasks. However, we would like to refer to those methods as representation learning models rather than neural ranking models, since they did not directly construct the ranking function with neural networks. Later, between 2014 and 2015, work on neural ranking models began to grow, such as new variants of DSSM (Huang et al., 2013), ARC I and ARC II (Hu, Lu, Li, & Chen, 2014), MatchPyramid (Pang et al., 2016b), and so on. Most of this research focused on short text ranking tasks, such as TREC QA tracks and Microblog tracks (Severyn & Moschitti, 2015). Since 2016, the study of neural ranking models has bloomed, with significant work volume, deeper and more rigorous discussions, and much wider applications (Onal et al., 2018). For example, researchers began to discuss the practical effectiveness of neural ranking models on different ranking tasks (Cohen, Ai, Croft, 2016, Guo, Fan, Ai, Croft, 2016). Neural ranking models have been applied to ad-hoc retrieval (Hui, Yates, Berberich, de Melo, 2017a, Mitra, Diaz, Craswell, 2017), community-based QA (Qiu & Huang, 2015), conversational search (Yan, Song, & Wu, 2016a), and so on.
Researchers began to go beyond the architecture of neural ranking models, paying attention to new training paradigms of neural ranking models (Dehghani, Zamani, Severyn, Kamps, & Croft, 2017b), alternate indexing schemes for neural representations (Zamani, Dehghani, Croft, Learned-Miller, & Kamps, 2018b), integration of external knowledge (Xiong, Callan, Liu, 2017a, Yang, Qiu, Qu, Guo, Zhang, Croft, et al., 2018), and other novel uses of neural approaches for IR tasks (Fan, Guo, Lan, Xu, Pang, Cheng, 2017a, Tang, Yang, 2018).
Up to now, we have seen exciting progress on neural ranking models. In academia, several neural ranking models learned from scratch can already outperform state-of-the-art LTR models with tens of hand-crafted features (Fan, Guo, Lan, Xu, Zhai, Cheng, 2018, Pang, Lan, Guo, Xu, Xu, Cheng, 2017). Workshops and tutorials on this topic have attracted extensive interest in the IR community (Craswell, Croft, Guo, Mitra, de Rijke, 2017a, Craswell, Croft, de Rijke, Guo, Mitra, 2017b). Standard benchmark datasets (Nguyen, Rosenberg, Song, Gao, Tiwary, Majumder, Deng, 2016b, Yang, Yih, Meek, 2015), evaluation tasks (Dietz, Verma, Radlinski, & Craswell, 2017), and open-source toolkits (Fan et al., 2017b) have been created to facilitate research and rigorous comparison. Meanwhile, in industry, we have also seen models such as DSSM put into a wide range of practical usage in the enterprise (He, Gao, & Deng, 2014). Neural ranking models already generate the most important features for modern search engines. However, beyond these exciting results, there is still a long way to go for neural ranking models: (1) Neural ranking models have not had the level of breakthroughs achieved by neural methods in speech recognition or computer vision; (2) There is little understanding and few guidelines on the design principles of neural ranking models; (3) We have not identified the special capabilities of neural ranking models that go beyond traditional IR models. Therefore, it is the right moment to take a look back, summarize the current status, and gain some insights for future development.
There have been some related surveys on neural approaches to IR (neural IR for short). For example, Onal et al. (2018) reviewed the current landscape of neural IR research, paying attention to the application of neural methods to different IR tasks. Mitra and Craswell (2017) gave an introduction to neural information retrieval. In their booklet, they talked about fundamentals of text retrieval, and briefly reviewed IR methods employing pre-trained embeddings and neural networks. In contrast to these works, this survey does not try to cover every aspect of neural IR, but will focus on and take a deep look into ranking models with deep neural networks. Specifically, we formulate the existing neural ranking models under a unified framework, and review them from different dimensions to understand their underlying assumptions, major design principles, and learning strategies. We also compare representative neural ranking models through benchmark tasks to obtain a comprehensive empirical understanding. We hope these discussions will help researchers in neural IR learn from previous successes and failures, so that they can develop better neural ranking models in the future. In addition to the model discussion, we also introduce some trending topics in neural IR, including indexing schema, knowledge integration, visualized learning, contextual learning and model explanation. Some of these topics are important but have not been well addressed in this field, while others are very promising directions for future research.
In the following, we will first introduce some typical textual IR tasks addressed by neural ranking models in Section 2. We then provide a unified formulation of neural ranking models in Section 3. In Sections 4–6, we review the existing models with regard to different dimensions and make empirical comparisons between them. We discuss trending topics in Section 7 and conclude the paper in Section 8.
Major applications of neural ranking models
In this section, we describe several major textual IR applications where neural ranking models have been adopted and studied in the literature, including ad-hoc retrieval, question answering, community question answering, and automatic conversation. There are other applications where neural ranking models have been or could be applied, e.g., product search (Brenner et al., 2018), sponsored search (Grbovic, Djuric, Radosavljevic, Silvestri, & Bhamidipati, 2015), and so on. However, due to page
A unified model formulation
Neural ranking models are mostly studied within the LTR framework. In this section, we give a unified formulation of neural ranking models from a generalized view of LTR problems.
Suppose that $\mathcal{S}$ is the generalized query set, which could be the set of search queries, natural language questions or input utterances, and $\mathcal{T}$ is the generalized document set, which could be the set of documents, answers or responses. Suppose that $\mathcal{Y} = \{1, 2, \ldots, l\}$ is the label set where labels represent grades. There exists a
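To make the generalized LTR view more concrete, the following is a minimal sketch, not the survey's own formulation: a scoring function takes a generalized query and a generalized document and returns a real-valued score, and the documents are sorted by that score. The names `overlap_score` and `rank` are hypothetical, and the term-overlap scorer is only a toy stand-in for a learned neural ranking function.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for a learned scoring function (an assumption, not a
    model from the survey): the fraction of query terms found in the doc."""
    q_terms = query.lower().split()
    d_terms = set(doc.lower().split())
    if not q_terms:
        return 0.0
    return sum(term in d_terms for term in q_terms) / len(q_terms)


def rank(
    query: str,
    docs: List[str],
    score: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every candidate document against the query and sort by
    descending score; any scoring function can be plugged in."""
    scored = [(doc, score(query, doc)) for doc in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


docs = [
    "neural ranking models for retrieval",
    "cooking pasta at home",
    "ranking web pages",
]
ranked = rank("neural ranking", docs, overlap_score)
for doc, s in ranked:
    print(f"{s:.2f}  {doc}")
```

Because `rank` only depends on the signature of the scoring function, the same skeleton would apply whether the query is a search query, a question, or an utterance, and whether the scorer is hand-crafted or learned.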
Model architecture
Based on the above unified formulation, here we review existing neural ranking model architectures to better understand their basic assumptions and design principles.
Model learning
Beyond the architecture, in this section, we review the major learning objectives and training strategies adopted by neural ranking models for comprehensive understanding.
Model comparison
In this section, we compare the empirical evaluation results of the previously reviewed neural ranking models on several popular benchmark data sets. We mainly survey and analyze the published results of neural ranking models for the ad-hoc retrieval and QA tasks. Note that sometimes it is difficult to compare published results across different papers: small changes such as different tokenization, stemming, etc. can lead to significant differences. Therefore, we attempt to collect results from
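As a small illustration of why preprocessing choices matter when comparing published numbers, the sketch below counts query-term matches for the same query and document under two tokenizers. Both tokenizers, the `overlap` helper, and the example texts are assumptions made for illustration and are not taken from any of the surveyed systems.

```python
import re
from typing import Callable, List


def whitespace_tokens(text: str) -> List[str]:
    """Lowercase and split on whitespace; punctuation stays attached."""
    return text.lower().split()


def alnum_tokens(text: str) -> List[str]:
    """Lowercase and keep only runs of letters and digits, so hyphens
    and trailing punctuation act as separators."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap(query: str, doc: str, tokenize: Callable[[str], List[str]]) -> int:
    """Number of query tokens that also occur in the document."""
    doc_terms = set(tokenize(doc))
    return sum(tok in doc_terms for tok in tokenize(query))


query = "ad-hoc retrieval"
doc = "Neural models for ad hoc retrieval."

ws = overlap(query, doc, whitespace_tokens)
an = overlap(query, doc, alnum_tokens)
print(ws, an)
```

With whitespace splitting, "ad-hoc" and "retrieval." fail to match their counterparts, while the alphanumeric tokenizer matches every query token. Any matching signal built on such tokens would differ accordingly between two otherwise identical pipelines.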
Trending topics
In this section, we discuss several trending topics related to neural ranking models. Some of these topics are important but have not been well addressed in this field, while some are very promising directions for future research.
Conclusion
The purpose of this survey is to summarize the current research status on neural ranking models, analyze the existing methodologies, and gain some insights for future development. We introduced a unified formulation over the neural ranking models, and reviewed existing models based on this formulation from different dimensions under model architecture and model learning. For model architecture analysis, we reviewed existing models to understand their underlying assumptions and major design
Acknowledgments
This work was funded by the National Natural Science Foundation of China (NSFC) under Grants no. 61425016 and 61722211, and the Youth Innovation Promotion Association CAS under Grants no. 20144310. This work was supported in part by the UMass Amherst Center for Intelligent Information Retrieval and in part by NSF IIS-1715095. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor.
References (156)
- T.-Y. Liu
Learning to rank for information retrieval
Foundations and Trends in Information Retrieval
(2009)
- A. Abujabal et al.
ComQA: A community-sourced dataset for complex factoid question answering with paraphrase clusters
Annual Conference of the North American Chapter of the Association for Computational Linguistics
(2019)
- W.U. Ahmad et al.
Multi-task learning for document ranking and query suggestion
Proceedings of the sixth international conference on learning representations
(2018)
- Q. Ai et al.
Learning a deep listwise context model for ranking refinement
Proceedings of the 41st international ACM SIGIR conference on research & development in information retrieval
(2018)
- Q. Ai et al.
Unbiased learning to rank: Theory and practice
Proceedings of the 27th ACM international conference on information and knowledge management
(2018)
- Q. Ai et al.
Learning groupwise scoring functions using deep neural networks
WSDM'19 Workshop on Deep Matching in Practical Applications (DAPA 19)
(2019)
- B.V.D. Akker et al.
ViTOR: Learning to rank webpages based on visual features
The World Wide Web Conference (WWW'19)
(2019)
- N. Asadi et al.
Pseudo test collections for learning web search ranking functions
Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval
(2011)
- R. Baeza-Yates et al.
Modern information retrieval
(2011)
- D. Bahdanau et al.
Neural machine translation by jointly learning to align and translate
CoRR
(2014)
Inferring and using location metadata to personalize web search
Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval
(2011)
Modeling the impact of short- and long-term behavior on search personalization
Proceedings of the 35th international ACM SIGIR conference on research and development in information retrieval
(2012)
Off the beaten path: Let’s replace term-based retrieval with k-NN search
Proceedings of the 25th ACM international on conference on information and knowledge management
(2016)
End-to-end neural ranking for ecommerce product search: an application of task models and textual embeddings
(2018)
Learning to rank using gradient descent
Proceedings of the 22nd international conference on machine learning (ICML)
(2005)
From ranknet to lambdarank to lambdamart: An overview
Learning
(2010)
Multi-task learning for boosting with application to web search ranking
Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining
(2010)
Mix: Multi-channel information crossing for text matching
Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining
(2018)
Ri-match: Integrating both representations and interactions for deep semantic matching
Information retrieval technology
(2018)
Attention-based hierarchical neural query suggestion
Proceedings of the 41st international ACM SIGIR conference on research & development in information retrieval
(2018)
Ranking measures and loss functions in learning to rank
Advances in neural information processing systems
(2009)
Adaptability of neural networks on varying granularity IR tasks
Neu-IR: The SIGIR 2016 Workshop on Neural Information Retrieval
(2016)
Universal approximation functions for fast learning to rank: Replacing expensive regression forests with simple feed-forward networks
The 41st international ACM SIGIR conference on research & development in information retrieval
(2018)
Cross domain regularization for neural ranking models using adversarial learning
Proceedings of the 41st international ACM SIGIR conference on research & development in information retrieval
(2018)
Understanding the representational power of neural retrieval models using NLP tasks
Proceedings of the ACM SIGIR international conference on theory of information retrieval
(2018)
WikiPassageQA: A benchmark collection for research on non-factoid answer passage retrieval
Proceedings of the 41st international ACM SIGIR conference on research & development in information retrieval, SIGIR
(2018)
Report on the SIGIR 2016 workshop on neural information retrieval (Neu-IR)
(2017)
SIGIR 2017 workshop on neural information retrieval (Neu-IR)
Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval
(2017)
Convolutional neural networks for soft-matching n-grams in ad-hoc search
Proceedings of the eleventh ACM international conference on web search and data mining
(2018)
Avoiding your teacher’s mistakes: Training neural networks with controlled weak supervision
CoRR
(2017)
Neural ranking models with weak supervision
Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval
(2017)
TREC complex answer retrieval overview
Proceedings of the twenty-sixth text retrieval conference, TREC
(2017)
Learning to rank with partially-labeled data
Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval
(2008)
Learning visual features from snapshots for web search
Proceedings of the 2017 ACM on conference on information and knowledge management
(2017)
Modeling diverse relevance patterns in ad-hoc retrieval
The 41st international ACM SIGIR conference on research & development in information retrieval
(2018)
Matchzoo: A toolkit for deep text matching
CoRR
(2017)
Applying deep learning to answer selection: A study and an open task
2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), IEEE
(2015)
The vocabulary problem in human-system communication
Communications of the ACM
(1987)
Neural approaches to conversational AI
Foundations and Trends in Information Retrieval
(2019)
A knowledge-grounded neural conversation model
Proceedings of the thirty-second AAAI conference on artificial intelligence, (AAAI)
(2018)
Neural network methods for natural language processing
Synthesis lectures on human language technologies
(2017)
Generative adversarial nets
Advances in neural information processing systems
(2014)
Context- and content-aware embeddings for query rewriting in sponsored search
Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval
(2015)
A deep relevance matching model for ad-hoc retrieval
Proceedings of the 25th ACM international on conference on information and knowledge management
(2016)
Neural vector spaces for unsupervised information retrieval
ACM Transactions on Information Systems
(2018)
Deep learning for natural language processing: Theory and practice
(2014)
Deep neural networks for acoustic modeling in speech recognition
IEEE Signal Processing Magazine
(2012)
CQADupStack: A benchmark data set for community question-answering research
Proceedings of the 20th Australasian document computing symposium
(2015)
Convolutional neural network architectures for matching natural language sentences
Advances in neural information processing systems 27
(2014)
Multi-granularity neural sentence model for measuring short text similarity
Database systems for advanced applications
(2017)
Recommended articles (6)
Research article
Network measures: A new paradigm towards reliable novel word sense detectionInformation Processing & Management, Volume 57, Issue 6, 2020, Article 102173
In this era of digitization, with the fast flow of information on the web, words are being used to denote newer meanings. Thus novel sense detection becomes a crucial and challenging task in order to build any natural language processing application which depends on the efficient semantic representation of words. With the recent availability of large amounts of digitized texts, automated analysis of language evolution has become possible. Given corpus from two different time periods, the main focus of our work is to detect the words evolved with a novel sense precisely. We pose this problem as a binary classification task to detect whether a new sense of a target word has emerged. This paper presents a unique proposal based on network features to improve the precision of this task of detecting emerged new sense of a target word. For a candidate word where a new sense has been detected by comparing the sense clusters induced at two different time periods, we further compare the network properties of the subgraphs induced from novel sense clusters across these two time periods. Using the mean fractional change in edge density, structural similarity and average path length as features in a Support Vector Machine (SVM) classifier, manual evaluation gives precision values of 0.86 and 0.74 for the task of new sense detection, when tested on 2 distinct time-point pairs, in comparison to the precision values in the range of 0.23-0.32, when the proposed scheme is not used. The outlined method can, therefore, be used as a new post-hoc step to improve the precision of novel word sense detection in a robust and reliable way where the underlying framework uses a graph structure. Another important observation is that even though our proposal is a post-hoc step, it can be used in isolation and that itself results in a very decent performance achieving a precision of 0.54-0.62. Finally, we also show that our method is able to detect well-known historical shifts in 80% cases.
Research article
Region-action LSTM for mouse interaction sequence based search satisfaction evaluation
Information Processing & Management, Volume 57, Issue 6, 2020, Article 102349
Mouse interaction data contain a lot of interaction information between users and Search Engine Result Pages (SERPs), which can be useful for evaluating search satisfaction. Existing studies use aggregated features or anchor elements to capture the spatial information in mouse interaction data, which might lose valuable mouse cursor movement patterns for estimating search satisfaction. In this paper, we leverage regions together with actions to extract sequences from mouse interaction data. Using regions to capture the spatial information in mouse interaction data preserves more details of the interaction processes between users and SERPs. To model mouse interaction sequences for search satisfaction evaluation, we propose a novel LSTM unit called Region-Action LSTM (RALSTM), which can capture the interactive relations between regions and actions without subjecting the network to higher training complexity. In addition, we propose a data augmentation strategy, Multi-Factor Perturbation (MFP), to increase the pattern variations in mouse interaction sequences. We evaluate the proposed approach on open datasets. The experimental results show that the proposed approach achieves significant performance improvement compared with the state-of-the-art search satisfaction evaluation approach.
Research article
A Contextual Recurrent Collaborative Filtering framework for modelling sequences of venue checkins
Information Processing & Management, Volume 57, Issue 6, 2020, Article 102092
Context-Aware Venue Recommendation (CAVR) systems aim to effectively generate a ranked list of interesting venues users should visit based on their historical feedback (e.g. checkins) and context (e.g. the time of the day or the user’s current location). Such systems are increasingly deployed by Location-based Social Networks (LBSNs) such as Foursquare and Yelp to enhance the satisfaction of the users. Matrix Factorisation (MF) is a popular Collaborative Filtering (CF) technique that can suggest relevant venues to users based on an assumption that similar users are likely to visit similar venues. In recent years, deep neural networks have been successfully applied to recommendation systems. Indeed, various approaches have been previously proposed in the literature to enhance the effectiveness of MF-based approaches by exploiting Recurrent Neural Networks (RNN) models to capture the sequential properties of observed checkins. Moreover, recently, several RNN architectures have been proposed to incorporate contextual information associated with the users’ sequence of checkins (for instance, the time interval or the geographical distance between two successive checkins) to effectively capture such short-term preferences of users. In this work, we propose a Contextual Recurrent Collaborative Filtering framework (CRCF) that leverages the users’ preferred context and the contextual information associated with the users’ sequence of checkins in order to model the users’ short-term preferences for CAVR. In particular, the CRCF framework is built upon two state-of-the-art approaches: namely Deep Recurrent Collaborative Filtering framework (DRCF) and Contextual Attention Recurrent Architecture (CARA). Thorough experiments on three large checkin and rating datasets from commercial LBSNs demonstrate the effectiveness and robustness of our proposed CRCF framework by significantly outperforming various state-of-the-art matrix factorisation approaches. 
In particular, the CRCF framework significantly improves [emailprotected] by 5–20% over the state-of-the-art DRCF framework (Manotumruksa, Macdonald, and Ounis, 2017a) and the CARA architecture (Manotumruksa, Macdonald, and Ounis, 2018) across the three datasets. Furthermore, the CRCF framework is significantly less risky than both the DRCF framework and the CARA architecture across the three datasets.
Research article
Eating healthier: Exploring nutrition information for healthier recipe recommendation
Information Processing & Management, Volume 57, Issue 6, 2020, Article 102051
With the boom in personalized recipe sharing networks (e.g., Yummly), a deluge of recipes from different cuisines can be obtained easily. In this paper, we aim to solve a problem which many home cooks encounter when searching for recipes online: finding recipes that best fit the set of ingredients at hand while at the same time following healthy eating guidelines. This task is especially difficult since the lion's share of online recipes have been shown to be unhealthy. We propose a novel framework named NutRec, which models the interactions between ingredients and their proportions within recipes for the purpose of offering healthy recommendations. Specifically, NutRec consists of three main components: 1) using an embedding-based ingredient predictor to predict the relevant ingredients given user-defined initial ingredients, 2) predicting the amounts of the relevant ingredients with a multi-layer perceptron-based network, 3) creating a healthy pseudo-recipe with a list of ingredients and their amounts according to the nutritional information, and recommending the recipes most similar to the pseudo-recipe. We conduct experiments on two recipe datasets: Allrecipes with 36,429 recipes and Yummly with 89,413 recipes. The empirical results support the framework’s intuition and showcase its ability to retrieve healthier recipes.
Research article
Hierarchical neural query suggestion with an attention mechanism
Information Processing & Management, Volume 57, Issue 6, 2020, Article 102040
Query suggestions help users of a search engine to refine their queries. Previous work on query suggestion has mainly focused on incorporating directly observable features such as query co-occurrence and semantic similarity. The structure of such features is often set manually, as a result of which hidden dependencies between queries and users may be ignored. We propose an Attention-based Hierarchical Neural Query Suggestion (AHNQS) model that uses an attention mechanism to automatically capture user preferences. AHNQS combines a session-level neural network and a user-level neural network into a hierarchical structure to model the short- and long-term search history of a user. We quantify the improvements of AHNQS over state-of-the-art recurrent neural network-based query suggestion baselines on the AOL query log dataset, with improvements of up to 9.66% and 12.51% in terms of [emailprotected] and [emailprotected], respectively; improvements are especially obvious for short sessions and inactive users with few search sessions.
Research article
Deep learning on information retrieval and its applications
Deep Learning for Data Analytics, 2020, pp. 125-153
In the domain of information retrieval (IR), the matching of query and document relies on ranking models to calculate the degree of their relevance. Therefore, ranking models remain as the central component of the research. During the past decades, there has been a trend moving from traditional approaches to IR toward deep learning approaches to IR. Traditional IR models include basic handcrafted retrieval models, semantic-based models, term dependency-based models, and learning to rank models. The deep learning approaches, on the other hand, involve methods of representation learning, methods of matching function learning, and methods of relevance learning. Recently, we have seen a growing number of publications in both conferences and journals using deep learning techniques to solve the IR problems. The capability of neural ranking models to extract features directly from raw text inputs overcomes many limitations of traditional IR models that rely on handcrafted features. Moreover, the deep learning methods manage to capture complicated matching patterns for document ranking. In this chapter, we introduce a novel way of classifying these existing IR models, along with their recent improvements and developments. To the best of our knowledge, our approach is the first one to classify the existing work according to how they generate the features and the ranking functions. Moreover, we provide a review of these proposed models to discuss different dimensions and to make empirical comparisons, followed by a conclusion with possible directions of future work.
© 2019 Elsevier Ltd. All rights reserved.