CS772: Deep Learning for Natural Language Processing

Spring, 2025

Announcements

Join MS Teams with code: bbcc3pf (use IITB LDAP)

Access course lecture recordings on the course YouTube channel

Previous iterations of the course: 2023 | 2022


Course Details:

CS772: Deep Learning for Natural Language Processing (Spring, 2025)

Department of Computer Science and Engineering

Indian Institute of Technology Bombay

Time Table and Venue:

  • Slot: 14
  • Venue: TBD
  • Tuesday: 5:30 PM to 6:55 PM
  • Friday: 5:30 PM to 6:55 PM


Motivation:

Deep Learning (DL) is a framework for solving AI problems using networks of neurons organized in many layers. DL has found heavy use in Natural Language Processing (NLP) as well, in problems such as machine translation, sentiment and emotion analysis, question answering, and information extraction, substantially improving the performance of automatic systems.

The course CS626 (Speech, NLP, and the Web), taught in the first semester in the CSE Department at IIT Bombay for the last several years, builds a strong foundation in NLP, covering the whole NLP stack from morphology through part-of-speech tagging to parsing, discourse, and pragmatics. Students of the course, typically numbering more than 100, acquire a grip on the tasks, techniques, and linguistics of a wide range of NLP problems.

CS772 (Deep Learning for Natural Language Processing) comes as a natural sequel to CS626. Language tasks are examined through the lens of Deep Learning. Foundations of and advances in Deep Learning are taught, integrated with NLP problems. For example, sequence-to-sequence transformers are covered alongside their application to machine translation. Similarly, word-embedding techniques are taught with applications to text classification, information extraction, and other tasks.


Course Description:

  • Background: History of Neural Nets; History of NLP; Basic Mathematical Machinery - Linear Algebra, Probability, Information Theory, etc.; Basic Linguistic Machinery - Phonology, Morphology, Syntax, Semantics.
  • Introducing Neural Computation: Perceptrons, Feedforward Neural Networks and Backpropagation, Recurrent Neural Nets (a minimal perceptron sketch in Python follows this list).
  • Difference between Classical Machine Learning and Deep Learning: Representation - Symbolic Representation, Distributed Representation, Compositionality; Parametric and non-parametric learning.
  • Word Embeddings: Word2Vec (CBOW and Skip-gram), GloVe, FastText.
  • Application of Word Embedding to Shallow Parsing: Morphological Processing, Part of Speech Tagging and Chunking.
  • Sequence to Sequence (seq2seq) Transformation using Deep Learning: LSTMs and Variants, Attention, Transformers.
  • Deep Neural Net based Language Modeling: XLM, BERT, GPT-2/3, etc.; Subword Modeling; Transfer Learning and Multilingual Modeling.
  • Application of seq2seq in Machine Translation: supervised, semi-supervised, and unsupervised MT; encoder-decoder and attention in MT; Memory Networks in MT.
  • Deep Learning and Deep Parsing: Recursive Neural Nets; Neural Constituency Parsing; Neural Dependency Parsing.
  • Deep Learning and Deep Semantics: Word Embeddings and Word Sense Disambiguation; Semantic Role Labeling with Neural Nets.
  • Neural Text Classification: Sentiment and Emotion labeling with Deep Neural Nets (DNN); DNN-based Question Answering.
  • The Indispensability of DNNs in Multimodal NLP: Advanced problems such as Sarcasm, Metaphor, Humor, and Fake News Detection using multimodality and DNNs.
  • Natural Language Generation: Extractive and Abstractive Summarization with Neural Nets.
  • Explainability
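
As a flavour of the "Introducing Neural Computation" unit above, the following is a minimal sketch of the classical perceptron learning rule in Python (the course's prerequisite language). The toy AND dataset, epoch count, and variable names are purely illustrative choices, not course-supplied code.

    import numpy as np

    # Toy dataset: the logical AND function (illustrative only).
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])

    w = np.zeros(2)   # weight vector
    b = 0.0           # bias

    # Perceptron learning rule: update only on a mistake, nudging the
    # separating hyperplane toward the misclassified point.
    for epoch in range(10):
        for xi, yi in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            if pred != yi:
                w += (yi - pred) * xi
                b += yi - pred

    print(w, b)  # a separating hyperplane for AND, e.g. w=[2. 1.], b=-2.0

Because AND is linearly separable, the perceptron convergence theorem guarantees that this loop stops making updates after finitely many epochs.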

References:

  • Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016.
  • Dan Jurafsky and James H. Martin, Speech and Language Processing, 3rd Edition draft, October 16, 2019.
  • Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola, Dive into Deep Learning, e-book, 2020.
  • Christopher Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
  • Daniel Graupe, Deep Learning Neural Networks: Design and Case Studies, World Scientific Publishing Co., Inc., 2016.
  • Pushpak Bhattacharyya, Machine Translation, CRC Press, 2017.
  • Journals: Computational Linguistics, Natural Language Engineering, Journal of Machine Learning Research (JMLR), Neural Computation, IEEE Transactions on Neural Networks and Learning Systems.
  • Conferences: Annual Meeting of the Association for Computational Linguistics (ACL), Neural Information Processing Systems (NeurIPS), International Conference on Machine Learning (ICML), Empirical Methods in Natural Language Processing (EMNLP).

Pre-requisites

  • Data Structures and Algorithms
  • Programming skill in Python (or a similar language)

Course Instructor

Prof. Pushpak Bhattacharyya


Course Materials

Lecture Topics | Slide Links | Video Links

Contact Us

  • CFILT Lab
  • Room Number: 401, 4th Floor, New CC Building
  • Department of Computer Science and Engineering
  • Indian Institute of Technology Bombay
  • Mumbai 400076, India