FNLP: 3: Corpora

This page consists of:

  • three videos of short lectures. They cover:
    • Text Corpora: Motivation
    • Text Corpora: Some basic principles and experimental design
    • Text Corpora: Tokenisation
  • some required reading from Jurafsky and Martin and the NLTK book
  • a quiz that tests your understanding of the material presented here.

Please do the required reading and attempt the quiz. If there is anything you don't understand, you can ask questions in the lecture or on Piazza.

Lecture 3 Slides (whole lecture)
  • Slides: 03_slides.pdf

3a: Text Corpora: Motivation
  • Slides: 03a_slides.pdf

3b: Text Corpora: Basic Principles and Experimental Design
  • Slides: 03b_slides.pdf

3c: Text Corpora: Tokenisation
  • Slides: 03c_slides.pdf

Required Reading

J&M, 2nd edition, chapter 1

NLTK, chapter 11

NOTE: The abbreviation J&M refers to the textbook:

Dan Jurafsky and James H. Martin, Speech and Language Processing.

When we specify 2nd edition, we are referring to the version of the book that was published by Pearson International in 2008.

When we specify 3rd edition, we will supply links to drafts of the relevant parts of that book. The third edition has not been published yet, but the current draft is available at https://web.stanford.edu/~jurafsky/slp3/.

The abbreviation NLTK refers to the textbook:

Bird, S., E. Klein and E. Loper (2009), Natural Language Processing with Python, O'Reilly Media

An (early) online version of this book is here: http://www.nltk.org/book_1ed/.

Quiz 3: Corpora and Sentiment Analysis

These questions are designed to test your understanding of the above course content; doing this quiz does not contribute to your overall grade. Some questions require a text answer, and you can ask for formative feedback on these from your tutor or on Piazza. Other questions are multiple choice or require a numeric answer, and you will get immediate feedback on these. Please don't attempt this quiz until you have familiarised yourself with this lecture and the required reading.

You must be logged in to Learn to do this quiz.

License
All rights reserved The University of Edinburgh
