自然语言处理中的经验性方法-Empirical Methods in Natural Language Processing

Course ID: 04832710
Credits: 3
Department: 信息科学技术学院
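Introduction (translated from Chinese): Empirical Methods in Natural Language Processing is an elective course for senior undergraduates in information-science-related majors. Building on earlier courses in mathematical logic, probability and statistics, and programming, it introduces how to use data-driven empirical methods to solve common problems in natural language processing (especially text data processing), and develops students' hands-on ability to analyze and process large-scale data. It also includes dedicated lectures on selected hot topics, such as statistical machine translation and large-scale information extraction, to introduce more recent research advances. The empirical methods covered in this course mainly refer to an approach that is data-driven, takes corpora as its object, and uses pattern recognition and machine learning as its tools. We hope that through study and practice in this course, students facing a real problem will be able to choose suitable algorithms and optimization methods, program independently, and comfortably handle fairly large-scale text data. We hope that students will benefit from this training whether they go on to further study or are about to enter the workforce.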
Introduction in English: This course is an introduction for undergraduate students who are interested in empirical methods applied to natural language processing. We will emphasize empirical methods, which here mainly means data-driven models with ingredients from pattern recognition and machine learning. We will also survey interesting NLP applications, e.g., word segmentation, tagging, and parsing, and introduce recent advances in statistical machine translation and information extraction. In this course, students will learn what data-driven methods are and how to use those models to build their own systems that analyze massive text data and solve a real NLP problem in practice. The prerequisites for this course include: passion (sure!), some knowledge of Probability Theory and Statistics, a little Mathematical Logic, and Practice of Programming.