16 Week 10: Validation
This week we’ll be thinking about how to validate the techniques we’ve used in the preceding weeks. Validation is a necessary part of any text analysis technique: without it, we cannot know whether a measure captures the concept we intend it to capture.
Often we speak of validation in the context of machine labelling of large text data. But validation need not—and should not—be restricted to automated classification tasks. The articles by Ying et al. (2021) and Rodriguez et al. (2021) describe ways to approach validation in unsupervised contexts. Finally, the article by Peterson and Spirling (2018) shows how validation and accuracy might provide a measure of substantive significance.
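To make the machine-labelling case concrete, here is a minimal sketch of the most common form of validation: comparing automated labels against a hand-coded sample. The function name `validation_metrics` and the toy label vectors are hypothetical and used only for illustration; they do not come from any of the readings.

```python
from collections import Counter


def validation_metrics(gold, predicted, positive=1):
    """Compare machine labels with hand-coded (gold) labels for one class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("need at least one labelled document")

    # Tally the four cells of the confusion matrix.
    counts = Counter()
    for g, p in zip(gold, predicted):
        if p == positive and g == positive:
            counts["tp"] += 1
        elif p == positive:
            counts["fp"] += 1
        elif g == positive:
            counts["fn"] += 1
        else:
            counts["tn"] += 1

    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    accuracy = (tp + tn) / len(gold)
    # Guard against division by zero when a class is never predicted or observed.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: 1 = document is about the topic, 0 = it is not.
hand_labels = [1, 0, 1, 1, 0, 0, 1, 0]
machine_labels = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = validation_metrics(hand_labels, machine_labels)
print(metrics)
```

In practice the hand-coded sample would itself need checking, for example with an inter-coder reliability statistic, before it is treated as a gold standard.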
Required reading:
- Ying et al. (2021)
- Rodriguez et al. (2021)
- Peterson and Spirling (2018)
- Manning et al. (2007, ch. 2), available at https://nlp.stanford.edu/IR-book/information-retrieval-book.html
Further reading:
- Krippendorff (2004)
- Denny and Spirling (2018)
- Grimmer and Stewart (2013b)
- Barberá et al. (2021)
- Schiller et al. (2021)
Slides:
- Week 10 Slides