  S.M. Projects (2004/2005)  
  Project abstracts can be viewed on the enclosed CD-ROM or on the SMA website.
  CS Programme  
Using Paraphrases for Information Extraction
Student: Shilpa Arora
SMA Supervisors: Assoc Prof Ng Hwee Tou (Singapore) & Prof Leslie P. Kaelbling (MIT)

Project Abstract:

Maximum Entropy has been used for several statistical classification problems in Natural Language Processing (NLP). This project uses the Maximum Entropy approach to build an Information Extraction (IE) system that extracts information about events such as terrorist events. The IE task is modeled as a classification problem in which information extracted from a document is used to fill the slots of a template. Manually generated templates serve as the training set for this system, but the amount of manually labeled data that can be provided is very limited. Our approach therefore uses weakly labeled data, which essentially consists of news articles describing the same event and is thus a rich source of paraphrases. Weakly labeled data is freely available and hence provides ample training examples. A basic Maximum Entropy based IE system was developed, and the improvement in performance from using weakly labeled data was investigated.
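
The classification step described in the abstract can be illustrated with a small sketch. The abstract does not specify the features, slot names, or training procedure, so everything below is an assumption made for illustration: the slot labels (PERPETRATOR, VICTIM, NONE), the feature templates in featurize, the toy training examples, and the use of plain gradient ascent with L2 regularization. A Maximum Entropy classifier is equivalent to multinomial logistic regression over such binary features. The sketch covers only slot classification and does not model the use of weakly labeled data or paraphrases.

```python
import math
from collections import defaultdict

# Hypothetical slot labels; the abstract does not name the template slots.
LABELS = ["PERPETRATOR", "VICTIM", "NONE"]

def featurize(candidate, context):
    """Hypothetical binary features for a candidate phrase and nearby words."""
    feats = ["bias"]
    feats += ["ctx=" + w.lower() for w in context]
    feats.append("cap=" + str(candidate[0].isupper()))
    return feats

def predict_proba(weights, feats):
    """Maximum Entropy (softmax) distribution over slot labels."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in LABELS}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

def train(data, epochs=200, lr=0.5, l2=0.01):
    """Batch gradient ascent on the L2-regularized conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in data:
            probs = predict_proba(weights, feats)
            for y in LABELS:
                # Gradient: observed feature count minus expected count.
                err = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    grad[(f, y)] += err
        for key, g in grad.items():
            weights[key] += lr * (g / len(data) - l2 * weights[key])
    return weights

def classify(weights, candidate, context):
    """Return the most probable slot label for a candidate phrase."""
    probs = predict_proba(weights, featurize(candidate, context))
    return max(probs, key=probs.get)

# Toy, invented training examples: (candidate, context words, slot label).
TOY_TRAIN = [
    ("rebels", ["attack", "by"], "PERPETRATOR"),
    ("gunmen", ["bombing", "by"], "PERPETRATOR"),
    ("mayor", ["killed", "the"], "VICTIM"),
    ("judge", ["wounded", "the"], "VICTIM"),
    ("Tuesday", ["on"], "NONE"),
    ("Monday", ["on"], "NONE"),
]

model = train([(featurize(c, ctx), label) for c, ctx, label in TOY_TRAIN])
print(classify(model, "soldiers", ["attack", "by"]))
```

In a system of the kind the abstract describes, each candidate phrase in a news article would be classified this way, and the phrases assigned to a slot label would fill that slot in the event template.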
