A sentiment classification approach using stacked supervised learning incorporating web mined features

Speaker: Rejean Lau (University of Alberta)

Sentiment classification is a form of text categorization in which user sentiment is labeled as either positive or negative. It is generally accepted that sentiment analysis is a more challenging classification problem than topic categorization, and that the bag-of-words approach does not perform as well on it. Using the IMDB sentiment dataset from Cornell University, we improve on the results reported by its authors by using a stacked classifier and web-mined features. Using the distance from the SVM hyperplane as the learned feature weights, we achieve classification results comparable to those of topic categorization. Stacked supervised learning is shown to significantly boost performance over the baseline (non-stacked) learner when additional information, such as web-mined features, is incorporated.
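
To make the stacking idea concrete, the following is a minimal sketch and not the speaker's implementation. A level-0 linear SVM is trained on bag-of-words counts, its signed distance from the hyperplane is computed out-of-fold, and a level-1 SVM is trained on that distance together with one extra feature. Everything specific here is an assumption for illustration: the toy reviews, the four-word vocabulary, the placeholder `web_score` value standing in for a web-mined feature, the subgradient-descent trainer, and the omission of a bias term.

```python
import math

def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss; labels are +1/-1.
    Simplification: no bias term, so the hyperplane passes through the origin."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularisation shrink
            if margin < 1.0:  # inside the margin or misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w

def signed_distance(w, x):
    """Signed Euclidean distance from x to the hyperplane w.x = 0."""
    norm = math.sqrt(sum(wj * wj for wj in w)) or 1.0
    return sum(wj * xj for wj, xj in zip(w, x)) / norm

def bag_of_words(text, vocab):
    """Count how often each vocabulary term occurs in the text."""
    tokens = text.lower().split()
    return [float(tokens.count(term)) for term in vocab]

def out_of_fold_distances(X, y, k=2):
    """Level-0 distance for each example, from a model that never saw it."""
    dists = [0.0] * len(X)
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        w = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        for i in range(len(X)):
            if i % k == fold:
                dists[i] = signed_distance(w, X[i])
    return dists

# Hypothetical toy data: (review text, placeholder web-mined score, label).
VOCAB = ["good", "great", "bad", "awful"]
REVIEWS = [
    ("good great film", 0.8, 1),
    ("great good acting", 0.6, 1),
    ("good plot", 0.7, 1),
    ("great fun", 0.9, 1),
    ("bad awful film", -0.7, -1),
    ("awful bad acting", -0.8, -1),
    ("bad plot", -0.6, -1),
    ("awful mess", -0.9, -1),
]

X = [bag_of_words(text, VOCAB) for text, _, _ in REVIEWS]
web = [score for _, score, _ in REVIEWS]
labels = [label for _, _, label in REVIEWS]

# Level 1 sees the out-of-fold SVM distance plus the extra feature.
oof = out_of_fold_distances(X, labels)
meta = [[d, s] for d, s in zip(oof, web)]
level1 = train_linear_svm(meta, labels)

# For prediction, the level-0 model is refit on all training data.
level0 = train_linear_svm(X, labels)

def predict(text, web_score):
    """Return +1 (positive) or -1 (negative) for a new review."""
    d = signed_distance(level0, bag_of_words(text, VOCAB))
    score = signed_distance(level1, [d, web_score])
    return 1 if score >= 0 else -1
```

Computing the level-0 distances out-of-fold keeps the level-1 learner from being trained on outputs the base SVM produced for examples it had already memorised. For instance, `predict("great good story", 0.5)` runs a new review through both levels.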