Semantic Text Analysis

Goal

This short course (8 hours) covers the following aspects of (semantic) text analysis, combining theoretical notions with practical exercises in Python. The topics addressed are:

  • Word prediction: language model
    • N-Gram
  • Text classification
    • Bayesian Classifier
    • Logistic regression
  • Adding Semantics: Computational Lexical Semantics
    • Wordnet
    • Word Disambiguation
  • Sentiment Analysis and Opinion Mining
  • Libraries
    • NLTK http://www.nltk.org/
    • scikit-learn http://scikit-learn.org/
    • word2vec http://radimrehurek.com/gensim/models/word2vec.html
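
As a taste of the first topic (word prediction with an N-gram language model), the following is a minimal sketch of a bigram model in plain Python. It is not taken from the course material: the function names, the `<s>` and `</s>` boundary markers, and the toy corpus are all assumptions made for illustration, and the probabilities are unsmoothed maximum-likelihood estimates.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count bigrams over lowercased, whitespace-tokenised sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> are illustrative sentence-boundary markers.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts


def bigram_probability(counts, prev, curr):
    """Maximum-likelihood estimate of P(curr | prev), without smoothing."""
    total = sum(counts[prev].values())
    return counts[prev][curr] / total if total else 0.0


def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if unseen."""
    followers = counts.get(prev)
    return followers.most_common(1)[0][0] if followers else None


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]

model = train_bigram_model(toy_corpus)
print(predict_next(model, "the"))
print(bigram_probability(model, "the", "cat"))
```

In this toy corpus "the" is followed by "cat" in two of its six occurrences, so the model predicts "cat" with an estimated probability of 1/3. A realistic model would need a much larger corpus and some form of smoothing for unseen bigrams.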

Books / References

D. Jurafsky & J. H. Martin: Speech and Language Processing. Pearson International, 2009.
D. Jurafsky: Slides from NLP courses.
D. Klein: Slides from NLP course.
J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets. Stanford University.
T. K. Landauer, P. W. Foltz, D. Laham: Introduction to Latent Semantic Analysis. Discourse Processes, 25, 259-284, 1998.
J. Perkins: Python 3 Text Processing with NLTK 3 Cookbook. Packt Publishing, 2014.
T. Hauck: Scikit-learn Cookbook. Packt Publishing, 2014.
W. McKinney: Python for Data Analysis. O’Reilly Media, Inc., 2013.

Slides

Part 1 – Theory

Part 2 – Libraries and Tools

Part 3 – Exercises