Elective for CS graduate students at the Technische Hochschule Nürnberg.
Class Schedule and Credits
Time and Location:
- Tuesdays, 11:30 AM, room SP.467
Announcements and Discussions: Teams (you’ve been added).
Each week, we will discuss algorithms and their theory before implementing them to gain a better hands-on understanding. The materials consist of a mix of required and recommended readings and slides, as well as a set of mandatory programming and analysis assignments. Pair programming is encouraged; BYOD is strongly recommended!
Assignments
Lectures are accompanied by mandatory assignments that will be reviewed and discussed every week.
All assignments come in the form of prepared Python 3 Jupyter notebooks.
Depending on the topic, they consist of programming, evaluation and/or discussion sections.
Please submit the notebooks (including state/rendered cells) to the Teams channel under Files > Assignments (see schedule below).
Pair-programming is encouraged, but everyone needs to submit an individual file.
Credits
Credits are earned through two components:
- All assignments must be completed on time (see schedule below; pass/fail).
- Oral exam (20’) covering theory and assignments (graded; individual exams).
Note: Materials will be (mostly) in English; the lectures/tutorials will be taught in German unless English speakers are present. The oral exam can be taken in the language of your choice.
Recommended Textbooks
- Chao, K.-M. and Zhang, L.: Sequence Comparison (Springer). Available online through the Ohm Library.
- Sun, R., Giles, L. and van Leeuwen, J.: Sequence Learning: Paradigms, Algorithms and Applications (Springer). Available online through the Ohm Library.
- Huang, X., Acero, A. and Hon, H.-W.: Spoken Language Processing: A Guide to Theory, Algorithm and System Development. (ISBN-13: 978-0130226167)
- Jurafsky, D. and Martin, J.: Speech and Language Processing. 2017. (available online)
- Manning, C., Raghavan, P. and Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, 2008. (available online)
- Goodfellow, I., Bengio, Y. and Courville, A.: Deep Learning. 2016. (available online)
- Schukat-Talamazzini, E.-G.: Automatische Spracherkennung. 1995. (available online)
Please note the required reading remarks in the syllabus.
Important dates
| Date | Info |
|---|---|
| Mar 24 | class ends at 1:50 PM |
| Apr 7 | no class (Easter) |
| May 12 | class ends at 1:50 PM |
| May 26 | no class (Pentecost) |
| Jun 23 | no class |
| Jul 6-10 | oral exams (exact dates tbd) |
Syllabus
| # | Topic | Required Reading | Materials |
|---|---|---|---|
| 1 | comparing sequences: ED/Levenshtein, NW, DTW, modeling cost | Chao/Zhang Ch. 1.2 through 1.4, 2.4 and 3. | introduction, comparing sequences |
| 2 | Markov chains: statistical modeling of discrete sequences | Schukat-Talamazzini Ch. 7.2.{1,2}, 7.3 | markov chains |
| 3 | HMMs, pt. 1: basics, BW, time-alignments | Schukat-Talamazzini Ch. 5 & 8 | Hidden Markov Models |
| 4 | HMMs, pt. 2: Viterbi, beam-decoding, higher-order modeling | | Decoding courtesy of Elmar Nöth |
| 5 | nnets 1: fundamentals, FF, AE, Word2Vec, FastText, ConvNets, embeddings, HMM-DNN | Karpathy 2016: Yes you should understand backprop, Mikolov et al., 2013. Efficient Estimation of Word Representations in Vector Space, Waibel et al., 1989. Phoneme Recognition Using Time-Delay Neural Networks, LeCun et al., 1998. Gradient-based learning applied to document recognition | nnets basics, nnets pt.1 |
| 6 | reinforcement learning, deep Q learning | | reinforcement learning |
| 7 | nnets2: RNN, LSTM, GRU | Pytorch Seq2Seq Tutorial, Pascanu et al. On the difficulty of training recurrent neural networks, Chris Olah Understanding LSTM Networks | slides |
| 8 | nnets3: s2s, Attn, CTC, RNN-T | Graves et al., 2006. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks, Loren Lugosch’s Introduction to RNN-T, Graves et al., 2012. Sequence Transduction with Recurrent Neural Networks | attention, ctc, rnn-t |
| 9 | transformers: basics, architectures for text (BERT, SBERT, GPT) and speech (Wav2Vec2) | Łukasz Kaiser: Attention is all you need, esp. at 15:45ff. Vaswani et al. Attention Is All You Need, Jay Alammar The Illustrated Transformer, Devlin et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, wav2vec: Unsupervised Pre-training for Speech Recognition, Radford et al. Improving Language Understanding by Generative Pre-Training | transfer learning |
| 10 | LLMs as foundation models: benchmarks, GPT, BPE, data (pretraining and fine-tuning), InstructGPT, RLHF; conditioning of transformers: zero-/few-shot, CoT | | road to gpt |
| 11 | state space models | (tbd) | |
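To give a flavor of the hands-on part, here is a minimal sketch of the week 1 material (edit/Levenshtein distance computed by dynamic programming). Function and variable names as well as the uniform costs are illustrative assumptions, not taken from the actual assignment notebooks.

```python
# Minimal sketch: edit (Levenshtein) distance via dynamic programming.
# Names and uniform edit costs are illustrative, not the assignment's API.
def edit_distance(a: str, b: str) -> int:
    # prev[j] holds the distance between the current prefix of a and b[:j];
    # we roll the DP table row by row to keep memory at O(len(b)).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            sub = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + sub))  # substitution or match
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # -> 3
```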
Assignments
| # | Topic | Due |
|---|---|---|
| 1 | Dynamic Programming | Mar 31 |
| 2 | Markov chains | Apr 14 |
| 3 | Hidden Markov Models | |
| 4 | Feed Forward Neural Networks | |
| 5 | Reinforcement learning (tbd) | |
| 6 | Recurrent Neural Networks | |
| 7 | Attention | |
| 8 | Transformers | |
| 9 | Summarization | |
Subscribe to https://github.com/sikoried/sequence-learning/ to follow updates.