Week 5: Algorithmic bias and RNNs
Reminders and announcements
Welcome to Week 5! This will be my (Sharon's) last week of lecturing, as I will be handing over to Edoardo next week. Despite the early start, I've enjoyed seeing so many of you engaged with the lectures, and I've liked getting the chance to talk to some of you afterward. But I am also looking forward to a break from so much lecture preparation! I will still be seeing some of you in tutorials, and will be back for one more lecture in Week 10, but my involvement in the course will be much less visible after this week.
There's no Additional Materials section this week, as all the materials are here!
- Solutions for Tutorial 1 are now available.
- Exercises for Tutorial 2 are linked again here for convenience. As usual, you should plan to work through the questions before your tutorial group meets this week.
- Reminder: the Tuesday tutorial groups will be meeting in AT 5.01 this week (and will not meet next week, despite what your calendar might say). After that they will return to the usual room, AT 5.04.
- Next week (and the week after) we will have lab sessions, where you'll be able to work with RNNs (which I will cover this week) and Transformers (which Edoardo will cover next week).
- Preview of next week's reading: Mon: JM3 13.8, 12.0-12.3; Wed: JM3 8.1, 8.3; Fri: JM3 8.2, 8.4-8.7, 8.8.2.
Overview of this week
On Monday this week, we will continue to build up our repertoire of different models, this time introducing RNNs (recurrent neural networks), which will finally allow us to relax the strict independence assumption adopted by all the previous language models we've seen: that the next word depends only on a fixed window of preceding words.
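To make that contrast concrete before Monday, here is a minimal sketch (my illustration, not course code or the JM3 formulation; the parameter names and sizes are assumptions) of an untrained RNN language model, showing that the hidden state carries the whole history, unlike an n-gram model's fixed window:

```python
# Minimal sketch (untrained, illustrative only; all names and sizes here
# are assumptions, not from the course materials) of why an RNN relaxes
# the fixed-window assumption: the hidden state h_t is a function of the
# ENTIRE prefix w_1..w_t, not just the last n-1 words.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, hidden_dim = 10, 8, 16

E   = rng.normal(size=(vocab_size, embed_dim))         # word embeddings
W_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden -> hidden
W_x = rng.normal(size=(embed_dim, hidden_dim)) * 0.1   # input  -> hidden
W_o = rng.normal(size=(hidden_dim, vocab_size)) * 0.1  # hidden -> vocab logits

def next_word_probs(word_ids):
    """P(w_{t+1} | w_1, ..., w_t): h carries information about the whole prefix."""
    h = np.zeros(hidden_dim)
    for w in word_ids:
        h = np.tanh(h @ W_h + E[w] @ W_x)  # h is updated at every time step
    logits = h @ W_o
    exp = np.exp(logits - logits.max())    # softmax over the vocabulary
    return exp / exp.sum()

# Two prefixes that agree in their last two words but differ earlier.
# A trigram model would, by construction, give them identical next-word
# distributions; the RNN's differ because h keeps the earlier history.
p1 = next_word_probs([1, 2, 3, 4])
p2 = next_word_probs([5, 6, 3, 4])
print(np.allclose(p1, p2))  # False: earlier context still affects the prediction
```

The recurrence h_t = tanh(W_h h_{t-1} + W_x x_t) is what removes the independence assumption: every earlier word can, in principle, influence the prediction, at the cost of processing the sequence one step at a time. We'll see the proper version of this on Monday.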
The Wed and Fri lectures will be a bit of a change. Instead of focusing on linguistic structure or computational methods, we'll have the first of a few lectures where we will discuss some of the ethical issues that arise when we start to use NLP systems in the real world. In this case, we will be talking about algorithmic bias: what it is, and some examples of where and how it can arise. To illustrate these ideas, I'll focus on an aspect of language that is often overlooked by NLP systems: how social identity and language interact in the use of dialects (or language varieties). Wednesday's lecture will provide some background on how linguists think about dialects, which might be different from how you think about them! Then I'll go into more detail with a case study demonstrating bias in NLP systems against speakers of a particular dialect of English.
Lectures and readings
| Lecture # | Who? | Slides | Reading | 
|---|---|---|---|
| 1 | SG | RNNs | JM3 13.0-13.2 (*), 13.3.1-2, 13.3.3 (*), 13.7 (*) |
| 2 | SG | Dialect and discrimination | Understanding Bias, Part 1* from Machines Gone Wrong by Lim Swee Kiat. JM3 4.12 | 
| 3 | SG | Dialect case study and additional data ethics | See the notes below about these additional readings: | 
Notes about additional reading for Lecture 3:
- School Ethics page: you don't need to read the whole thing (!), just have a quick look at what's there, and read the "Ethics and Integrity" section, which gives a very brief overview.
- ACL Responsible NLP Guidelines: Again, you do not need to read the whole thing in detail. Look over the headings in bold, especially for parts B and D, and consider how these relate to issues we've discussed in this class.
- Luccioni and Viviano (2021): I will refer to this very briefly in the lecture, and it is not urgent to read it right now, but we will ask you to read it before Tutorial 3 group meetings, which will have some questions about the paper.