A master's thesis from Aalborg University

Sentiment Mining in a Location-Based Social Networking Space: Semantically Oriented Rule-Based Reviews' Classification

Author

Term

4th term

Publication year

2011

Pages

101

Abstract

This thesis presents a system that automatically classifies the sentiment—positive or negative—of reviews posted on social media. It follows an unsupervised, linguistically driven approach: instead of learning from labeled examples, it relies on SentiWordNet, a lexical resource that assigns positivity and negativity scores to word meanings. We extract words with natural language processing and combine their scores using pattern-based rules to reach an overall judgment. To identify the most effective procedure, we compared several classifier setups that mix methods at the word and review level. The most promising one was then enhanced with practical language handling, including spelling correction, recognition of emoticons and exclamation marks, and detection of negation. We also conducted an empirical study on Word Sense Disambiguation (WSD), the task of selecting the correct meaning of a word in context. Using test sentences from the SemCor corpus and definitions (glosses) from eXtended WordNet, we designed two definition-centered WSD techniques based on overlaps and semantic relatedness among glosses. Experimental results confirmed that WSD improves sentiment classification performance and indicated that many kinds of words, including nouns, can carry emotional information.

[This abstract was generated with the help of AI]
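
The lexicon-plus-rules idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy lexicon, its scores, the emoticon weights, the negation rule, and the tie-breaking choice are all assumptions made for illustration, and no real SentiWordNet data is used. Spelling correction, exclamation marks, and word sense disambiguation are left out.

```python
import re

# Hypothetical toy lexicon: word -> (positivity, negativity).
# The numbers are invented for illustration; they are NOT SentiWordNet values.
TOY_LEXICON = {
    "good": (0.75, 0.0),
    "great": (0.875, 0.0),
    "friendly": (0.5, 0.0),
    "bad": (0.0, 0.625),
    "slow": (0.0, 0.375),
    "dirty": (0.0, 0.5),
}

# Assumed negation cues and emoticon weights (illustrative only).
NEGATIONS = {"not", "no", "never"}
EMOTICONS = {":)": 0.5, ":-)": 0.5, ":(": -0.5, ":-(": -0.5}

# Match simple emoticons first, then lowercase words (apostrophes allowed).
TOKEN_RE = re.compile(r"[:;]-?[()]|[a-z']+")


def classify(review: str) -> str:
    """Return 'positive' or 'negative' for a review using the toy lexicon."""
    total = 0.0
    negate = False
    for token in TOKEN_RE.findall(review.lower()):
        if token in NEGATIONS or token.endswith("n't"):
            # Assumed rule: a negation flips the next word found in the lexicon.
            negate = True
        elif token in EMOTICONS:
            total += EMOTICONS[token]
        elif token in TOY_LEXICON:
            pos, neg = TOY_LEXICON[token]
            score = pos - neg
            total += -score if negate else score
            negate = False
    # Arbitrary assumption: a zero total is reported as positive.
    return "positive" if total >= 0 else "negative"


if __name__ == "__main__":
    print(classify("The staff was friendly and the food was great :)"))
    print(classify("Not good, the service was slow :("))
    print(classify("The room was not dirty"))
```

In a fuller system the toy dictionary would be replaced by scores looked up per word sense, which is where the word sense disambiguation step mentioned in the abstract would come in.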