Tuesday 24 January 2012

nltk - Just the Basics

import nltk
nltk.download()

Spawns Edward Loper's Tk download application (works better on Linux than Whine-dows).
Download the "book" collection (about 100MB of space needed).

from nltk.book import *
text1.collocations() - high probability bigrams
text1.concordance("monstrous") - search with in-context results
text1.similar("monstrous") - words that appear in similar contexts
text1.common_contexts(["monstrous", "very"]) - contexts shared by all the listed words
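
To get a feel for what common_contexts reports, here is a rough plain-Python sketch. It is not NLTK's code: it assumes a "context" is just the (previous word, next word) pair around a word, and the names contexts_by_word and shared_contexts are made up for illustration.

```python
from collections import defaultdict


def contexts_by_word(tokens):
    """Map each lowercased word to the set of (previous, next) word pairs around it."""
    lowered = [t.lower() for t in tokens]
    contexts = defaultdict(set)
    for i, word in enumerate(lowered):
        # Placeholder markers for the edges of the text (an assumption of this sketch).
        left = lowered[i - 1] if i > 0 else "*START*"
        right = lowered[i + 1] if i < len(lowered) - 1 else "*END*"
        contexts[word].add((left, right))
    return contexts


def shared_contexts(tokens, words):
    """Return the contexts in which every one of `words` occurs."""
    contexts = contexts_by_word(tokens)
    sets = [contexts.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()


tokens = "a monstrous size and a very size but a monstrous whale".split()
print(sorted(shared_contexts(tokens, ["monstrous", "very"])))
```

On this toy sentence the only shared context is ('a', 'size'), i.e. both words turn up between "a" and "size".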

You can also do visualisation (needs matplotlib).

text1.dispersion_plot(["citizens", "democracy", "freedom", "duties", "America"])
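
The plot shows where in the text each word turns up. A minimal sketch of the data behind it, in plain Python and assuming the text is just a list of tokens (word_offsets is a hypothetical helper, not part of NLTK):

```python
def word_offsets(tokens, words):
    """For each target word, list the token positions where it occurs."""
    targets = set(words)
    offsets = {w: [] for w in words}
    for position, token in enumerate(tokens):
        if token in targets:
            offsets[token].append(position)
    return offsets


tokens = "the citizens value freedom and the citizens have duties".split()
offsets = word_offsets(tokens, ["citizens", "freedom", "duties"])
for word, positions in offsets.items():
    print(word, positions)
```

Each word's list of positions is what would be drawn as one row of tick marks.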

Notes

  • You might need to hack and reinstall to get it working in Python 2.5. Basically, os.walk there can't take followlinks=True (that argument only arrived in Python 2.6).

  • concordance can't process bigrams correctly
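
One possible workaround for the bigram problem is to do the phrase search yourself on the token list. This is only a sketch under that assumption: phrase_concordance and its width parameter are invented here, and it does not reproduce NLTK's output format.

```python
def phrase_concordance(tokens, phrase, width=3):
    """Return (left, match, right) context for every occurrence of a multi-word phrase."""
    target = [w.lower() for w in phrase.split()]
    if not target:
        return []
    lowered = [t.lower() for t in tokens]
    n = len(target)
    hits = []
    for i in range(len(lowered) - n + 1):
        # Compare a window of n tokens against the phrase, ignoring case.
        if lowered[i:i + n] == target:
            left = " ".join(tokens[max(0, i - width):i])
            match = " ".join(tokens[i:i + n])
            right = " ".join(tokens[i + n:i + n + width])
            hits.append((left, match, right))
    return hits


tokens = "He saw the white whale and then the White Whale saw him".split()
for left, match, right in phrase_concordance(tokens, "white whale"):
    print(left, "[" + match + "]", right)
```

With a real text you would pass in its token list (for the book texts, something like text1.tokens) instead of the toy sentence.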
