This week will start where last week left off, exploring some general techniques for doing machine learning. In particular, we will study logistic regression. Logistic regression was an extremely common component of NLP systems, and it's important to study it because deep learning, which is a ubiquitous technique in NLP these days, is a generalisation of logistic regression. You will be learning about deep learning approaches to NLP if you choose to take NLU+ next year.
We will then focus on NLP-specific tasks again, starting with words. In particular, we will start with an analysis of morphology. Words have structure: you can build words from stems and affixes in a highly productive way. The degree to which morphology is productive varies a lot across languages. In some languages, morphology is so productive that a single word can express the kind of content that in English would need a whole clause. Because composing words from their parts is productive, you cannot list all word forms. Moreover, you need to parse words because their structure reveals important information about their meaning, just as you need to parse sentences to recover theirs. We will look at how finite-state transducers can be used to parse and generate words.
We will then introduce the task of part-of-speech (POS) tagging: given a sequence of words, what is the most likely part of speech of each word in the sequence? POS tagging will carry over to week 5 as well.
The content in the following pages is structured as follows:
10: Logistic Regression
11: Morphology
12: POS tagging: An overview
As always, each of the above includes videos, the slides that were used in the videos, required readings, and a post-lecture quiz. The quiz is a chance for you to gauge your understanding of the material presented here, and so we strongly encourage you to review this content in the above order, and then complete the quiz. If there is anything you don't understand, then you have several options:
- Post a question on Piazza;
- Ask a question at the in-person lectures; and/or
- Ask your tutor.