Segmentation of multi-sentence questions
- Additional Document Info
- View All
Existing question retrieval models work relatively well in finding similar questions in community-based question answering (cQA) services. However, they are designed for single-sentence queries or bag-of-word representations, and are not sufficient to handle multi-sentence questions complemented with various contexts. Segmenting questions into parts that are topically related could assist the retrieval system to not only better understand the user's different information needs but also fetch the most appropriate fragments of questions and answers in cQA archive that are relevant to user's query. In this paper, we propose a graph based approach to segmenting multi-sentence questions. The results from user studies show that our segmentation model outperforms traditional systems in question segmentation by over 30% in user's satisfaction. We incorporate the segmentation model into existing cQA question retrieval framework for more targeted question matching, and the empirical evaluation results demonstrate that the segmentation boosts the question retrieval performance by up to 12.93% in Mean Average Precision and 11.72% in Top One Precision. Our model comes with a comprehensive question detector equipped with both lexical and syntactic features. © 2010 ACM.
author list (cited authors)
Wang, K., Ming, Z., Hu, X., & Chua, T.