Key Information
- Lecture Schedule: Mondays and Wednesdays, 13:00-14:30
- Location: Seoul Campus (Hongneung) Room 9509 + on Zoom (link on KLMS)
- Office Hours: By appointment – Seoul Campus (Hongneung) Room 9402
- Lecture Leader: thorne@kaist.ac.kr
- Anonymous Feedback
Preparation
- For Lab sessions, please bring a laptop/tablet that can run Python 3.9
- It is recommended to use Conda to manage Python environments (https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)
- Please create a fresh Python environment with the following packages (a quick import check is sketched below this list):
jupyter ipykernel matplotlib numpy scikit-learn torch
- Other packages will be introduced for specific labs later.
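Once the environment is set up, the short sketch below can be used to confirm that the base packages import correctly. This is only a minimal check assuming the package list above; the environment name `nlp` in the comments is just an example, and scikit-learn is imported as `sklearn`.

```python
# Minimal environment check, assuming a fresh Conda environment, e.g.:
#   conda create -n nlp python=3.9      # "nlp" is an example name
#   conda activate nlp
#   pip install jupyter ipykernel matplotlib numpy scikit-learn torch
import importlib
import sys

print(f"Python {sys.version.split()[0]}")  # labs assume Python 3.9

# Import names for the required packages (scikit-learn is imported as "sklearn").
REQUIRED = ["jupyter", "ipykernel", "matplotlib", "numpy", "sklearn", "torch"]

for name in REQUIRED:
    try:
        module = importlib.import_module(name)
        print(f"{name}: OK ({getattr(module, '__version__', 'version unknown')})")
    except ImportError:
        print(f"{name}: MISSING - install it before the first lab")
```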
Reading
Natural Language Processing (Jacob Eisenstein) is a very comprehensive resource; some parts of this course are aligned with parts of this book.
Assignment Deadlines
Assessment options: either 6 assignments (100%), or 2 assignments (33%) plus a capstone project (67%)
- Assignments 1-3: due in midterm week (week 8)
- Assignments 4-6: due in finals week (week 16)
- Capstone project: project proposal (week 4), final report (finals week, week 16)
Course Schedule
Date | Week | Topic | Resource |
---|---|---|---|
08-29 | 1 | Lecture: Overview, Themes, Tasks, and Model Evaluation | |
08-31 | 1 | Lab: Math Basics, Bag of Words Model, Feed Forward Networks | Ch3, Ch4 |
09-05 | 2 | Lecture: Recurrent Neural Networks | |
09-07 | 2 | Lab: Sequence Classification (recurrent models) | |
09-12 | 3 | No Lecture (Chuseok) | |
09-14 | 3 | Lab: Sequence Classification 2 (attention) | |
09-19 | 4 | Lecture: Semantics | Ch14 |
09-21 | 4 | Lab: Word Embeddings | Ch14.1-3, 14.5-6 |
09-26 | 5 | Lecture: Token Classification (Tagging) | |
09-28 | 5 | Lab: Structured Prediction (Greedy Decoding, Viterbi, Beam Search) | |
10-03 | 6 | No Lecture (National Foundation Day) | |
10-05 | 6 | Lab: Structured Prediction (CRF) | |
10-10 | 7 | No Lecture (Hangeul Proclamation Day) | |
10-12 | 7 | Lab | |
10-17 | 8 | Catch up | |
10-19 | 8 | Catch up | |
10-24 | 9 | Lecture | |
10-26 | 9 | Lab | |
10-31 | 10 | Lecture | |
11-02 | 10 | Lab | |
11-07 | 11 | Lecture | |
11-09 | 11 | Lab | |
11-14 | 12 | Lecture | |
11-16 | 12 | Lab | |
11-21 | 13 | Lecture | |
11-23 | 13 | Lab | |
11-28 | 14 | Lecture | |
11-30 | 14 | Lab | |
12-05 | 15 | Lecture | |
12-07 | 15 | Lab | |
12-12 | 16 | Catch up | |
12-14 | 16 | Catch up | |
Syllabus
- NLP Overview and Trends
  - Key tasks and architectures
  - Trends: kernels, model engineering, pre-training, prompting
  - Typical modeling choices
  - Evaluation
- Token-based models
  - Statistical properties of language
  - Bag of words classifier (see the illustrative sketch after this list)
  - Stemming
  - Word2vec, GloVe, matrix-based word embeddings
  - Semantic analysis
- Encoders for supervised classification
  - Model types: Feed Forward, CNN, RNN, GRU, LSTM
  - Tasks: Sentiment Analysis, Stance Classification, Entailment
- Contextual models for classification
  - Contextual Encoding, Attention, Self-Attention
  - Model types: Decomposable Attention, ESIM, Transformers
- Sequence Labeling
  - Approaches: Hidden Markov Model, Conditional Random Field
  - Algorithms: Viterbi, Forward-Backward Algorithm
  - Decoding with beam search
  - Applications: Named Entity Recognition, POS Tagging, Tokenization
- Generation
  - Architecture: encoder-decoder
  - Tasks: Question answering, machine translation, image captioning
- Language modeling and pre-training
  - Language modeling types, count-based LMs, ELMo, BERT / RoBERTa, T5 / BART / GPT
  - Representation for tokens: e.g. BPE
- Retrieval
  - Sparse vector based retrieval (e.g. TF-IDF)
  - Dense retrieval
  - Generative retrieval
- Other topics (TBD)
  - Dialog
  - Parsing and grammars
  - Contrastive training
  - Adversarial modeling
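As an illustrative preview of the bag-of-words classifier topic above (not lab material), here is a minimal sketch built from the scikit-learn packages listed under Preparation. The example sentences and labels are invented for illustration only.

```python
# Illustrative sketch only: a bag-of-words text classifier with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny toy dataset, invented for illustration (1 = positive, 0 = negative).
train_texts = [
    "I loved this movie",
    "great acting and a great plot",
    "terrible film, a waste of time",
    "I hated every minute",
]
train_labels = [1, 1, 0, 0]

# Bag of words: each document becomes a vector of token counts,
# which a linear classifier then maps to a label.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

print(model.predict(["a great film", "what a terrible waste"]))
```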