Applications of NLP
CS 5964/6964
Fall 2007

| Date | Topics | Readings | HW | Notes |
|---|---|---|---|---|
| 20 Aug | Introduction to natural language processing: Overview of class | - | HW1 out | - |
| 22 Aug | Basic linguistic theory: Words, sentences, morphology, tagging; Corpora and tools | Unix for Poets; POS tag list | - | - |
| 27 Aug | String processing techniques: Text-to-sound conversion; Finite state machines for language | NLTK 1 | HW1A due | - |
| 29 Aug | Probability 101: Conditional, Bayes rule, chain rule; Estimating probabilities from data | Carmel | - | - |
| 5 Sep | Probability in strings: Noisy-channel framework; Probabilistic automata | - | P1 out | - |
| 10 Sep | Language modeling: Distinguishing good strings from bad | - | HW1B due | - |
| 12 Sep | Language modeling II: Sparse data problem, smoothing | Goodman | - | - |
| 17 Sep | Probabilistic string transformations: Entity tagging | - | - | - |
| 19 Sep | Probabilistic string transformations II: Automatic speech recognition | - | - | - |
| 24 Sep | Catch-up | - | HW1C due | - |
| 26 Sep | Incomplete data: Cryptanalysis, transliteration | - | HW2 out | - |
| 1 Oct | Incomplete data II: EM algorithm | EM notes | - | - |
| 3 Oct | Word-based alignment models: IBM models 1 and 2 | SMT (pp. 11-26) | HW2A due | - |
| 15 Oct | Word-based alignment models II: HMM model; IBM models 3 and 4 | SMT (pp. 30-45) | P1 due | - |
| 17 Oct | Machine translation decoding: Integration with language models | - | P2 out | - |
| 22 Oct | Catch-up | - | - | - |
| 24 Oct | Toward phrase-based translation: Combination of alignments | SMT (pp. 61-71) | HW2B due | - |
| 29 Oct | Phrase-based translation: Beam search; Discriminative training | SMT (pp. 89-99) | - | - |
| 31 Oct | Evaluation: BLEU score | SMT (pp. 157-175) | - | - |
| 5 Nov | Syntax-based translation: Current research directions | - | - | - |
| 7 Nov | Catch-up | - | HW2C due | - |
| 12 Nov | Information Retrieval: Inverted indices, TF-IDF | - | HW3 out; P3 out | - |
| 14 Nov | Single-document summarization: Vector space model; Sentence extraction | - | P2 due | - |
| 19 Nov | Headline generation: Keyword extraction using automata | - | HW3A due | - |
| 21 Nov | Single-document summarization: Discourse and coherence; Responding to queries | - | - | - |
| 26 Nov | Question answering I: Knowledge-lean approaches | - | - | - |
| 28 Nov | Question answering II: Knowledge-rich approaches | - | - | - |
| 3 Dec | Tree transducers and syntactic transformations | - | - | - |
| 5 Dec | CLASS CANCELLED | - | P3, HW3B due | - |

| Segment | Assignment | Topic |
|---|---|---|
| 1 | Homework 1 | Basic linguistics and probability (Solution; hw1c-wordbigram.pl and hw1c-cblm.pl -- rename .txt to .pl) |
| 1 | Project 1 | Language modeling and tagging (Solution and code) |
| 2 | Homework 2 | Incomplete data (Solution and hw2b-make-ej.txt) |
| 2 | Project 2 | Machine translation (Solution and my outputs and code) |
| 3 | Homework 3 | NLP on the Web (Solution) |
| 3 | Project 3 | Headline generation (data) (Solution and my outputs and code) |