Academic Paper

      Discourse Primary-Secondary Relationships in Natural Language Processing

      Abstract:
      Discourse structure analysis, and in particular the recognition of primary-secondary relationships in discourse, is an important research topic in natural language processing. Analyzing discourse primary-secondary relationships helps in understanding the structure and semantics of a discourse, and provides strong support for natural language processing applications such as automatic summarization, topic extraction, and question answering. At present, however, primary-secondary relationship recognition is a bottleneck in discourse structure analysis. Most existing research treats it as an auxiliary step within rhetorical structure analysis and overlooks its importance to discourse structure analysis. This paper therefore places the primary-secondary relationship at the core of discourse structure analysis and studies it as an independent task, separated from rhetorical structure analysis. First, the paper discusses what the discourse primary-secondary relationship is, how to determine it, and why it should be studied. Second, it reviews the state of research from two perspectives (micro-level and macro-level) and three aspects (theoretical frameworks, corpus resources, and computational models). Third, it analyzes the open problems in primary-secondary relationship research and proposes our basic research approach. Finally, it summarizes several directions for future research.
      Authors: CHU Xiao-Min (褚晓敏) [1], ZHU Qiao-Ming (朱巧明) [2], ZHOU Guo-Dong (周国栋) [2]
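      Affiliations: School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006; Natural Language Processing Laboratory, Soochow University, Suzhou, Jiangsu 215006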
      Journal: Chinese Journal of Computers (计算机学报); indexed in ISTIC, EI, PKU
      Year, Volume (Issue): 2017, 40(4)
      Classification number: TP18
      Online publication date: May 18, 2017
      Funding: This work is supported by the National Natural Science Foundation of China, the Ministry of Education China Mobile Research Foundation, and the Jiangsu Provincial Science and Technology Plan (BK20151222).