Applications of Deep Neural Networks in Text Matching
Text matching aims to determine whether, and to what degree, two sentences convey similar semantic information. It has many application scenarios, such as question answering, text clustering, and text deduplication. In academic research, text matching algorithms learn a multi-dimensional semantic space into which different texts are mapped, so that the semantic information of each text is represented as a vector; a similarity measure defined over that space then scores how close two texts are. Commonly used text matching datasets include MSRP (the Microsoft Research Paraphrase Corpus; see Figure 1 and https://aclweb.org/aclwiki/Paraphrase_Identification_(State_of_the_art)) and LCQMC.
Figure 1: Sample pairs from the MSRP dataset
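The idea of scoring similarity in a learned semantic space can be sketched as follows. This is a minimal illustration, not part of any particular model: the two sentence vectors are hypothetical placeholders standing in for the output of a learned encoder, and cosine similarity is used as the (common, but not only) similarity measure.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two semantic vectors in the shared space."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical sentence vectors, assumed to be produced by some encoder
# that has mapped both texts into the same semantic space.
q = np.array([0.2, 0.7, 0.1])
d = np.array([0.25, 0.6, 0.05])

score = cosine_similarity(q, d)  # values near 1.0 indicate high similarity
```

Any learned encoder can be plugged in to produce `q` and `d`; the sections below differ mainly in how that encoder is built and whether the two texts interact before being scored.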
Currently, text matching models mainly follow one of two architectures: the SE (Sentence Encoding) structure and the SI (Sentence Interaction) structure, as shown in Figure 2.
Figure 2: SE-structured networks vs. SI-structured networks
In the SE structure, a deep neural network encodes the two input texts independently, producing one semantic vector per text; a matching layer then computes the similarity of the two vectors to obtain the semantic similarity of the texts. DSSM (Deep Structured Semantic Model) (Huang P S, He X, Gao J, et al. Learning deep structured semantic models for web search using clickthrough data[C]//Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. ACM, 2013: 2333-2338.) is a representative SE architecture. As shown in Figure 3, the model encodes texts with fully connected layers and computes the probability that a query Q matches each document D_i.
Figure 3: The DSSM network
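A rough sketch of the DSSM scoring pipeline, under simplifying assumptions: the real model feeds high-dimensional letter-trigram features through several fully connected layers, while here small random vectors and a two-layer tanh encoder stand in for illustration. The softmax over smoothed cosine similarities follows the P(D_i | Q) formulation of the paper; the smoothing factor `gamma` is a hyperparameter.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_encode(x, W1, W2):
    # Two fully connected tanh layers, standing in for DSSM's deep encoder.
    h = np.tanh(x @ W1)
    return np.tanh(h @ W2)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical dimensions; DSSM's ~30k-dim letter-trigram input is replaced
# by a small random feature vector purely for illustration.
d_in, d_h, d_out = 50, 32, 16
W1 = rng.normal(scale=0.1, size=(d_in, d_h))
W2 = rng.normal(scale=0.1, size=(d_h, d_out))

query = rng.normal(size=d_in)
docs = [rng.normal(size=d_in) for _ in range(4)]

q_vec = mlp_encode(query, W1, W2)
d_vecs = [mlp_encode(d, W1, W2) for d in docs]

# Softmax over smoothed cosine similarities gives P(D_i | Q).
gamma = 10.0  # smoothing factor, a tunable hyperparameter
sims = np.array([cosine(q_vec, dv) for dv in d_vecs])
probs = np.exp(gamma * sims) / np.exp(gamma * sims).sum()
```

In training, the model maximizes the probability of clicked documents given the query; here only the forward scoring pass is shown.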
Building on DSSM, researchers replaced the original fully connected layers with convolutional and recurrent neural networks, yielding the variants C-DSSM (Shen Y, He X, Gao J, et al. A latent semantic model with convolutional-pooling structure for information retrieval[C]//Proceedings of the 23rd ACM International Conference on Information and Knowledge Management. ACM, 2014: 101-110.) and LSTM-DSSM (Palangi H, Deng L, Shen Y, et al. Semantic modelling with long-short-term memory for information retrieval[J]. arXiv preprint arXiv:1412.6629, 2014.), whose structures are shown in Figures 4 and 5:
Figure 4: The C-DSSM network
Figure 5: The LSTM-DSSM network
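The key change in C-DSSM, replacing fully connected encoding with convolution plus max pooling over word windows, can be sketched as below. This is a simplified single-filter-bank version with made-up dimensions, not the exact architecture of the paper.

```python
import numpy as np

def conv_max_pool(word_vecs, W, window=3):
    """C-DSSM-style layer: slide a window over the word vectors, project each
    window with a shared weight matrix W, then max-pool over positions."""
    n, d = word_vecs.shape
    feats = []
    for i in range(n - window + 1):
        win = word_vecs[i:i + window].reshape(-1)  # concatenate the window
        feats.append(np.tanh(win @ W))
    return np.max(np.stack(feats), axis=0)  # max pooling over time

rng = np.random.default_rng(1)
d_word, d_feat, window = 8, 16, 3          # illustrative sizes
W = rng.normal(scale=0.1, size=(window * d_word, d_feat))

sentence = rng.normal(size=(6, d_word))    # 6 hypothetical word vectors
sent_vec = conv_max_pool(sentence, W, window)
```

LSTM-DSSM makes the analogous swap with a recurrent encoder, taking the final (or pooled) hidden state as the sentence vector.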
SI networks let the semantic information of the two texts interact deeply, typically through attention mechanisms. During this interaction, the model distinguishes semantics the two texts share from semantics that are unrelated, derives a vector representation for each text, and evaluates their similarity on that basis. Figure 6 shows a classic SI network (Wang S, Jiang J. A compare-aggregate model for matching text sequences[J]. arXiv preprint arXiv:1611.01747, 2016.). Unlike SE networks, it introduces an explicit interaction layer, in which several different comparison functions are applied and evaluated to determine which is most effective.
Figure 6: The Compare-Aggregate model
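The attention-based interaction at the heart of SI networks can be sketched as soft alignment between the two sentences, followed by element-wise comparison. The dimensions and the specific comparison (subtraction and multiplication, two of the functions explored in compare-aggregate-style models) are illustrative assumptions, not a faithful reproduction of the paper's architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_align(A, B):
    """Attention-based interaction: each word in A attends over B, producing
    a B-side summary aligned to each position of A."""
    scores = A @ B.T                  # (len_a, len_b) similarity matrix
    attn = softmax(scores, axis=1)    # attention weights over B's words
    return attn @ B                   # (len_a, d) aligned vectors

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 8))           # hypothetical word vectors, sentence A
B = rng.normal(size=(7, 8))           # hypothetical word vectors, sentence B

aligned_B = soft_align(A, B)
# Element-wise comparison of each word with its aligned counterpart;
# the result would feed an aggregation layer (e.g. CNN) in the full model.
compared = np.concatenate([A - aligned_B, A * aligned_B], axis=1)
```

The aggregation step then condenses `compared` into a single matching score; the listed papers differ mainly in how alignment and comparison are parameterized.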
There is a large body of work around SI and SE networks, mostly following the ideas outlined above. Below are some classic models from recent years; interested readers can look up the corresponding papers for a deeper understanding:
Feng M, Xiang B, Glass M R, et al. Applying deep learning to answer selection: A study and an open task[C]//2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). IEEE, 2015: 813-820.
A Compare-Aggregate Model for Matching Text Sequences
A Decomposable Attention Model for Natural Language Inference
Bilateral Multi-Perspective Matching for Natural Language Sentences
Sentence Similarity Learning by Lexical Decomposition and Composition
Enhanced LSTM for Natural Language Inference
Inter-Weighted Alignment Network for Sentence Pair Modeling
Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference
Natural Language Inference over Interaction Space
Multi-Cast Attention Networks for Retrieval-based Question Answering and Response Prediction
Co-Stack Residual Affinity Networks with Multi-level Attention Refinement for Matching Text Sequences
Multiway Attention Networks for Modeling Sentence Pairs
Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information
Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference
I Know What You Want: Semantic Learning for Text Comprehension
Simple and Effective Text Matching with Richer Alignment Features