Are you curious about how chatbots hold conversations or how ChatGPT generates human-like responses? This course in Natural Language Processing (NLP) is your gateway into the fascinating world where language meets AI. Designed for students and professionals alike, the course blends essential theory with hands-on experience to equip you with the skills needed to build intelligent language systems.
We start by unravelling what makes language so complex—and why teaching machines to understand it is such a challenging task. You’ll explore the inner workings of Natural Language Understanding (NLU) and Generation (NLG), investigate real-world NLP applications, and dive into current trends like large language models (LLMs) and transformer-based systems.
From there, you’ll roll up your sleeves and learn core NLP techniques like tokenization, stemming, lemmatization, and sentence segmentation. You’ll master vector-based approaches like Bag of Words and TF-IDF, then progress to dense word embeddings such as Word2Vec (Skip-gram and CBOW) and GloVe.
As you advance, you’ll build language models, train simple neural networks, and explore techniques for POS tagging, syntactic parsing, and semantic analysis. You’ll also work with knowledge graphs and Word Sense Disambiguation. By the end, you’ll be ready to innovate in the fast-evolving NLP landscape.
Graduates of this NLP course can pursue roles such as NLP Engineer, Machine Learning Engineer, or Data Scientist with a focus on language technologies. Opportunities also exist in AI-driven fields like chatbots, voice assistants, sentiment analysis, and information retrieval. Advanced learners may explore careers in research, LLM fine-tuning, or knowledge graph development.
Are you ready to unlock the power of cutting-edge NLP skills? Join us on this exciting journey into the world of language, AI, and intelligent data processing!
This module introduces learners to the course and its syllabus. The introductory video outlines the skills and knowledge learners can expect to gain. The syllabus reading covers the essential course components, including course values, assessment criteria, grading system, schedule, details of live sessions, and a recommended reading list. Learners can also connect with their peers through a discussion prompt designed to facilitate introductions within the course community.
What's included
2 videos • 1 reading • 1 discussion prompt
2 videos•Total 5 minutes
Course Introduction•3 minutes
Meet Your Instructor: Prof. Dr. Chetana Gavankar•2 minutes
1 reading•Total 10 minutes
Course Overview•10 minutes
1 discussion prompt•Total 10 minutes
Meet Your Peers •10 minutes
Introduction to Natural Language Processing
Module 2•4 hours to complete
Module details
This module introduces the fundamental concepts of Natural Language Processing (NLP). It begins with the definition of NLP and explores a variety of real-world applications. You will gain an understanding of Natural Language Understanding (NLU) and Natural Language Generation (NLG). The module also covers key evaluation metrics used to assess NLP systems. Additionally, a hands-on lab session will guide you through the implementation of basic NLP preprocessing techniques.
Basic NLP Application Development Using NLP Tools•3 minutes
1 discussion prompt•Total 30 minutes
Real-World Challenges and Tools in Natural Language Processing•30 minutes
Text Preprocessing and Analysis in NLP
Module 3•6 hours to complete
Module details
This module introduces essential NLP preprocessing techniques. It begins with regular expressions for text pattern matching, followed by an overview of words and corpora as foundational data sources. Sentence segmentation and tokenization are then covered through practical demonstrations. Finally, the module explores normalization, lemmatization, and stemming as methods to standardise text, with a demo highlighting their differences and effects.
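As a rough illustration of the preprocessing steps named above, the following sketch uses only Python's standard `re` module. The function names, the regular expressions, and the suffix list are assumptions made for this example and are not taken from the course materials; the toy stemmer is far cruder than an established stemmer such as Porter's.

```python
import re

def split_sentences(text):
    # Naive rule: a sentence ends at ., ! or ? followed by whitespace.
    # Abbreviations such as "Dr." would be split incorrectly.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    # A token is a word (optionally with an internal apostrophe)
    # or a single punctuation character.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)

def normalize(tokens):
    # Case folding, plus removal of punctuation-only tokens.
    return [t.lower() for t in tokens if re.search(r"\w", t)]

def toy_stem(token):
    # Hypothetical suffix stripper: chops a few common endings
    # as long as at least three characters remain.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

text = "The cats were running. Dogs barked!"
for sentence in split_sentences(text):
    tokens = normalize(tokenize(sentence))
    print(tokens, [toy_stem(t) for t in tokens])
```

Note that the toy stemmer turns "running" into "runn", which is not a dictionary word. A lemmatizer would instead map it to the lemma "run", which is the kind of difference between stemming and lemmatization that the module description refers to.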
Building a Preprocessing Pipeline: Challenges and Solutions•30 minutes
Vector Semantics
Module 4•3 hours to complete
Module details
This module explores lexical and vector semantics, focusing on computational representations of word meaning. It covers word vectors, Bag of Words, and co-occurrence matrices to capture contextual relationships. Techniques such as TF-IDF are introduced to measure word importance, along with methods for computing word similarity. Practical examples and mathematical exercises on TF-IDF help reinforce these core NLP concepts.
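To make the TF-IDF and word-similarity ideas above concrete, here is a minimal sketch in plain Python. It assumes raw counts for term frequency and a base-10 logarithm for inverse document frequency, which is only one of several common weighting variants; the three toy documents and all function names are invented for this illustration.

```python
import math
from collections import Counter

# Toy corpus (hypothetical), already tokenized.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs are pets".split(),
]

def tf(term, doc):
    # Term frequency as a raw count.
    return Counter(doc)[term]

def idf(term, docs):
    # Inverse document frequency: log10(N / df), or 0 for unseen terms.
    df = sum(1 for d in docs if term in d)
    return math.log10(len(docs) / df) if df else 0.0

def tfidf_vector(doc, docs, vocab):
    return [tf(t, doc) * idf(t, docs) for t in vocab]

def cosine(u, v):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vocab = sorted({t for d in docs for t in d})
vectors = [tfidf_vector(d, docs, vocab) for d in docs]

print(cosine(vectors[0], vectors[1]))  # documents sharing "the", "sat", "on"
print(cosine(vectors[0], vectors[2]))  # no shared terms, so the similarity is 0
```

Because "cat" and "cats" are different strings, the first and third documents share no vocabulary here, which shows why preprocessing choices such as lemmatization affect vector-based similarity.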
Recommended Reading: Foundations of Lexical and Vector Semantics •15 minutes
Recommended Reading: Representing Text Using Vectors •15 minutes
Recommended Reading: Term and Inverse Document Frequency •15 minutes
10 assignments•Total 30 minutes
Lexical Semantics •3 minutes
Why Vectors? •3 minutes
Word and Vectors •3 minutes
Bag of Words•3 minutes
Computing Word Similarity •3 minutes
Cosine Similarity •3 minutes
Cosine Similarity Example •3 minutes
Term Frequency •3 minutes
Inverse Document Frequency •3 minutes
TF-IDF •3 minutes
1 discussion prompt•Total 20 minutes
Applying Vector Semantics in a Real-World Scenario•20 minutes
Word Embedding
Module 5•9 hours to complete
Module details
This module introduces Word Embeddings, focusing on the transition from sparse to dense vector representations of words. It covers Word2Vec models, including Skip-gram and CBOW, explained with simple, intuitive examples. The module also explores GloVe embeddings, which capture global word co-occurrence statistics for improved semantic understanding. Learners will visualise word embeddings to gain insights into how words relate in vector space. Finally, the module highlights real-world applications of word embeddings in NLP tasks like sentiment analysis, machine translation, and question answering.
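The Skip-gram idea mentioned above can be sketched as follows: every word is paired with its neighbours inside a window (positive examples), and a few random words are drawn as negative examples. This is a simplified illustration under stated assumptions: negatives are sampled uniformly here, whereas the original word2vec implementation samples from a smoothed unigram distribution, and the function names and toy vectors are hypothetical.

```python
import math
import random

def skipgram_pairs(tokens, window=2):
    # Positive (target, context) pairs within +/- `window` positions.
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

def negative_samples(target, context, vocab, k, rng):
    # Assumption: uniform sampling, excluding the target and true context.
    candidates = [w for w in vocab if w not in (target, context)]
    return [(target, rng.choice(candidates)) for _ in range(k)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sgns_loss(w, c_pos, c_negs):
    # Negative-sampling log loss for one positive context vector
    # and a list of negative context vectors.
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    loss = -math.log(sigmoid(dot(w, c_pos)))
    for c_neg in c_negs:
        loss -= math.log(sigmoid(-dot(w, c_neg)))
    return loss

tokens = "the quick brown fox".split()
print(skipgram_pairs(tokens, window=1))
print(negative_samples("quick", "brown", sorted(set(tokens)), 2, random.Random(0)))
print(sgns_loss([0.5, -0.2], [0.4, 0.1], [[-0.3, 0.2], [0.1, 0.9]]))
```

Training would adjust the vectors by gradient descent so that this loss decreases, pulling true context words towards the target and pushing sampled negatives away.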
Skip-Gram Negative Training Data Example•3 minutes
SGNS Log Loss Function•3 minutes
Derivative of SGNS Loss Function•3 minutes
SGNS Example Part 1•3 minutes
SGNS Example Part 2•3 minutes
Continuous Bag of Words (CBOW)•3 minutes
Graded Quiz - Modules 3 and 4•60 minutes
SGA-1 Submission: Word Embedding•300 minutes
1 discussion prompt•Total 20 minutes
The Power of Dense Vectors: Choosing an Embedding Model•20 minutes
N-gram Language Modeling
Module 6•4 hours to complete
Module details
This module introduces Language Modeling (LM) and its role in predicting word sequences in natural language. It explores practical applications of LMs and explains N-gram models, including challenges like generalization and handling zero probabilities. Techniques such as smoothing and Stupid Backoff are covered to improve model robustness. The module concludes with methods for evaluating language models using standard metrics.
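A small sketch of a bigram model may help to show the zero-probability problem and how add-one (Laplace) smoothing addresses it. The three-sentence corpus, the sentence-boundary markers `<s>` and `</s>`, and the function names are assumptions made for this example; it does not implement interpolation or backoff.

```python
import math
from collections import Counter

# Toy training corpus (hypothetical) with sentence-boundary markers.
corpus = [
    ["<s>", "i", "am", "sam", "</s>"],
    ["<s>", "sam", "i", "am", "</s>"],
    ["<s>", "i", "do", "not", "like", "green", "eggs", "and", "ham", "</s>"],
]

unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter((a, b) for sent in corpus for a, b in zip(sent, sent[1:]))
V = len(unigrams)  # vocabulary size, including the boundary markers

def p_mle(prev, word):
    # Maximum-likelihood estimate; assumes `prev` occurs in the corpus.
    return bigrams[(prev, word)] / unigrams[prev]

def p_laplace(prev, word):
    # Add-one smoothing: every bigram gets one extra pseudo-count.
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

def perplexity(sentence, prob):
    # Inverse probability of the sentence, normalised by the number of
    # predicted tokens. Fails if `prob` returns 0 for any bigram.
    log_sum = sum(math.log(prob(a, b)) for a, b in zip(sentence, sentence[1:]))
    return math.exp(-log_sum / (len(sentence) - 1))

print(p_mle("<s>", "i"))        # 2/3: two of three sentences start with "i"
print(p_mle("sam", "ham"))      # 0.0: this bigram was never observed
print(p_laplace("sam", "ham"))  # small but non-zero after smoothing
print(perplexity(["<s>", "i", "am", "sam", "</s>"], p_mle))
```

With the unsmoothed estimate, any sentence containing an unseen bigram has probability zero and its perplexity is undefined, which is why the smoothed estimate is the one normally used for evaluation on held-out text.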
Recommended Reading: Language Modelling Introduction•15 minutes
Recommended Reading: N-grams •15 minutes
Recommended Reading: Smoothing •15 minutes
Recommended Reading: Language Modelling Evaluation •15 minutes
13 assignments•Total 39 minutes
What is Language Modeling? •3 minutes
Language Modelling Applications •3 minutes
How to Build a Language Model •3 minutes
Markov Assumption•3 minutes
N-gram Language Models •3 minutes
Bi-gram Computation •3 minutes
Raw Probabilities •3 minutes
Perils of Overfitting •3 minutes
Laplace Smoothing•3 minutes
Interpolation & Backoff•3 minutes
How Good is the Model?•3 minutes
Extrinsic Evaluation •3 minutes
Perplexity & its Example•3 minutes
1 discussion prompt•Total 20 minutes
Balancing Simplicity and Performance in Language Modelling•20 minutes
Neural Networks and Neural Language Models
Module 7•5 hours to complete
Module details
This module explores the use of Neural Networks in Language Modelling, starting with the fundamentals of Feed-Forward Neural Networks and their training process for language tasks. It introduces Neural Language Models, which capture complex patterns in text beyond traditional statistical methods. The module also provides a foundational understanding of Large Language Models (LLMs) and their capabilities. Finally, it introduces Prompt Engineering as a technique to effectively interact with and guide LLMs for various NLP applications.
The Next Generation of Language Modelling: From N-grams to LLMs•20 minutes
Part of Speech Tagging
Module 8•4 hours to complete
Module details
This module introduces Part-of-Speech (POS) tagging, a fundamental task in Natural Language Processing (NLP) that involves assigning grammatical categories (like noun, verb, adjective) to words in text. Starting from basic linguistic foundations and real-world applications, the module traces the evolution of POS tagging techniques, from statistical models like Hidden Markov Models (HMMs) and Maximum Entropy classifiers to modern deep learning approaches using Recurrent Neural Networks (RNNs). Learners will gain a strong theoretical understanding and insight into how POS tagging supports downstream tasks like parsing, named entity recognition, and machine translation. The module includes a hands-on coding demonstration for POS tagging.
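As an illustration of HMM-based tagging, the sketch below runs Viterbi decoding over a two-tag toy model. All probabilities, the tag set, and the function name are invented for this example; a practical tagger would estimate them from an annotated corpus and work with log probabilities to avoid numerical underflow.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    # Assumes `words` is non-empty.
    # best[t][tag] = (probability of the best path ending in `tag`
    #                 at position t, previous tag on that path)
    best = [{tag: (start_p[tag] * emit_p[tag].get(words[0], 0.0), None)
             for tag in tags}]
    for word in words[1:]:
        column = {}
        for tag in tags:
            # Best previous tag for reaching `tag` at this position.
            prob, prev = max((best[-1][p][0] * trans_p[p][tag], p) for p in tags)
            column[tag] = (prob * emit_p[tag].get(word, 0.0), prev)
        best.append(column)

    # Backtrace from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

# Hypothetical model parameters for a noun (N) / verb (V) tag set.
tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"fish": 0.7, "swim": 0.3}, "V": {"fish": 0.4, "swim": 0.6}}

print(viterbi(["fish", "swim"], tags, start_p, trans_p, emit_p))  # ['N', 'V']
```

The decoder keeps only the best-scoring path into each tag at each position, so its cost grows linearly with sentence length rather than exponentially with the number of possible tag sequences.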
POS Tagging: The Right Tool for the Job•30 minutes
Parsing and Applications
Module 9•11 hours to complete
Module details
This module introduces students to the syntactic structure of natural language and its critical role in Natural Language Processing (NLP) applications. Parsing is the task of assigning a structured representation—typically a tree—to a sentence, revealing the grammatical relationships between its components. The module begins by revisiting Context-Free Grammars (CFGs) and how they form the foundation for syntactic parsing. We explore Constituent Parsing, introducing classical parsing techniques such as the CKY (Cocke-Kasami-Younger) algorithm. The module then transitions to modern span-based neural parsing approaches that use neural networks to score and predict parse trees. A significant portion of the module is dedicated to Dependency Parsing, where syntactic structure is represented through direct relationships between words rather than phrases. Students will study both transition-based and graph-based dependency parsers, gaining insight into their strengths, algorithmic designs, and practical performance. Throughout the module, we emphasise real-world NLP applications.
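The CKY algorithm mentioned above can be sketched as a recognizer that fills a table of spans bottom-up. The grammar and lexicon below are hypothetical toy data in Chomsky Normal Form, and the function only answers whether a sentence is derivable; it does not build the parse tree or score alternatives.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky Normal Form.
# BINARY maps a pair of right-hand-side symbols (B, C) to every A with A -> B C.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# LEXICON maps each word to its possible preterminal symbols.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "chased": {"V"},
}

def cky_recognize(words, start="S"):
    n = len(words)
    # table[(i, j)] holds the nonterminals that can derive words[i:j].
    table = defaultdict(set)
    for i, word in enumerate(words):
        table[(i, i + 1)] = set(LEXICON.get(word, ()))
    for length in range(2, n + 1):          # span length
        for i in range(0, n - length + 1):  # span start
            j = i + length
            for k in range(i + 1, j):       # split point
                for left in table[(i, k)]:
                    for right in table[(k, j)]:
                        table[(i, j)] |= BINARY.get((left, right), set())
    return start in table[(0, n)]

print(cky_recognize("the dog chased the cat".split()))  # True
print(cky_recognize("dog the chased".split()))          # False
```

Storing back-pointers alongside each table entry would turn this recognizer into a parser, and attaching scores to the entries is the starting point for the probabilistic and neural span-based variants.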
Parsing Frameworks: Constituent vs. Dependency•30 minutes
Word Senses, Disambiguation, and the Semantic Web
Module 10•5 hours to complete
Module details
This module explores the semantic dimension of natural language by covering both lexical semantics—including word senses, ambiguity, and disambiguation techniques—and the semantic web—a framework for enabling machine-readable, structured understanding of web data. The module starts with foundational concepts in lexical semantics and WordNet, then proceeds to classical and modern word sense disambiguation (WSD) methods. The second part focuses on Semantic Web technologies, covering ontologies, knowledge graphs, RDF/OWL, and their role in enabling intelligent systems and knowledge-driven NLP applications.
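One of the classical knowledge-based WSD methods covered in this module is the Lesk algorithm. A simplified version picks the sense whose dictionary gloss shares the most words with the context. The sketch below uses a small hand-written sense inventory instead of WordNet; the sense labels, glosses, and stopword list are hypothetical and chosen only for illustration.

```python
import re

STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "or",
             "that", "is", "at", "for", "i", "my"}

# Hypothetical toy sense inventory (not actual WordNet glosses).
SENSES = {
    "bank": {
        "bank.finance": "an institution that accepts deposits of money and makes loans",
        "bank.river": "sloping land beside a body of water such as a river",
    }
}

def content_words(text):
    # Lowercased alphabetic tokens with stopwords removed.
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def simplified_lesk(word, context):
    # Choose the sense whose gloss overlaps most with the context.
    # On a tie, the sense listed first wins.
    ctx = content_words(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(ctx & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("bank", "I deposited money at the bank to repay my loans"))
print(simplified_lesk("bank", "We fished from the bank of the river where the water was calm"))
```

Exact string overlap is brittle ("deposited" does not match "deposits"), so practical versions usually lemmatize both the gloss and the context, and may extend each gloss with example sentences or the glosses of related senses.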
Navigating WordNet Hierarchies and Graph Structures•5 minutes
What is Word Sense Disambiguation? •4 minutes
Supervised WSD•8 minutes
Knowledge-Based WSD: Lesk Algorithm•5 minutes
From Syntactic Web to Semantic Web: What's the Problem?•6 minutes
Semantic Web Vision: Data Integration and Automation•3 minutes
Ontologies•4 minutes
Ontology Languages and Their Layers•9 minutes
What is a Knowledge Graph? •3 minutes
Applications in NLP•6 minutes
Module Wrap Up•1 minute
5 readings•Total 130 minutes
Recommended Reading: Word Senses and Lexical Semantics•30 minutes
Code Document: Querying WordNet in Python (using nltk.corpus.wordnet)•10 minutes
Recommended Reading: WordNet and Semantic Lexicons•30 minutes
Recommended Reading: Word Sense Disambiguation (WSD)•30 minutes
Recommended Reading: Introduction to the Semantic Web and Ontologies•30 minutes
14 assignments•Total 42 minutes
What is a Word Sense? •3 minutes
Homonymy vs Polysemy•3 minutes
Sense Relations•3 minutes
Introduction to WordNet and Synsets•3 minutes
Relations in WordNet•3 minutes
Navigating WordNet Hierarchies and Graph Structures•3 minutes
What is Word Sense Disambiguation?•3 minutes
Supervised WSD•3 minutes
Knowledge-Based WSD: Lesk Algorithm•3 minutes
Semantic Web Vision: Data Integration and Automation•3 minutes
Ontologies•3 minutes
Ontology Languages and Their Layers•3 minutes
What is a Knowledge Graph? •3 minutes
Applications in NLP•3 minutes
1 discussion prompt•Total 30 minutes
Disambiguating the Future: WSD and the Semantic Web•30 minutes
Ethical Implications
Module 11•6 hours to complete
Module details
This module introduces students to the evolution of neural network architectures in NLP, beginning with recurrent models (RNNs), progressing through attention mechanisms, and culminating in Transformer-based models that have revolutionised natural language processing. Through hands-on coding and application-driven lessons, students will explore how Transformers power state-of-the-art systems in sentiment analysis (text classification), machine translation, and question answering. The module emphasises both theoretical foundations and practical implementation using modern deep learning frameworks.
Birla Institute of Technology & Science, Pilani (BITS Pilani) is one of only ten private universities in India to be recognised as an Institute of Eminence by the Ministry of Human Resource Development, Government of India. It has been consistently ranked high by both governmental and private ranking agencies for its innovative processes and capabilities that have enabled it to impart quality education and emerge as the best private science and engineering institute in India.
BITS Pilani has four campuses, in Pilani, Goa, Hyderabad, and Dubai, and has been offering bachelor’s, master’s, and certificate programmes for over 58 years, helping to launch the careers of over 1,00,000 professionals.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I purchase the Certificate?
When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.