Exploring methods to find and label latent topic-groups in a blogging environment

Student thesis: Master's thesis and HD graduation project

  • Seynabou Ndiaye
Twitter is mainly known for its tweets, which have been the subject of much research. However, it also has a regular-sized blog that is growing very fast and allows more room for content. With approximately 1600 blogs, Twitter is a great communication channel between users and businesses that promote themselves.
Since Twitter is one of the biggest social media platforms, this small “failure” made me curious: how is the grouping done, or how could it be done?
What are the possibilities for improvement in this case?
In order to find possible groups of topics, we crawled the Twitter blog to collect 100 blogs, to which we applied a cluster analysis and a topic modelling analysis.
In the context of information retrieval (IR), modelling contextual information in document search has been the subject of several studies [22].

This report describes the experimental design, process and results of extracting hidden structure in our corpus using LSA and k-means, and compares the two methods. We categorise the blogs and compare the result with what Twitter actually proposes on its portal. In this research, we aim for a categorisation that can have a positive impact on blog search. We also discuss the challenges of document clustering through the following questions: What are the challenges of cluster labelling? How can topic labels be found for clusters?
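The clustering and labelling questions above can be illustrated with a minimal sketch. It is not the pipeline used in the thesis: it assumes a TF-IDF representation, a plain k-means with a naive deterministic seeding, and labels taken from the highest-weighted centroid terms. The four toy documents and all function names are invented for illustration, and the LSA step (a truncated SVD of the term-document matrix) is left out to keep the example dependency-free.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised docs."""
    vocab = sorted({term for doc in docs for term in doc})
    doc_freq = Counter(term for doc in docs for term in set(doc))
    n = len(docs)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed inverse document frequency (one common variant).
        vec = [tf[t] * (math.log((1 + n) / (1 + doc_freq[t])) + 1) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm; ASSUMPTION: the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document goes to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(vec, centroids[c]))
            for vec in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

def label_clusters(vocab, centroids, top_n=3):
    """Label each cluster with its top_n highest-weighted centroid terms."""
    labels = []
    for centroid in centroids:
        ranked = sorted(range(len(vocab)), key=lambda i: -centroid[i])
        labels.append([vocab[i] for i in ranked[:top_n]])
    return labels

# Hypothetical toy corpus (not data from the thesis).
toy_docs = [
    "ads campaign brand marketing ads".split(),
    "api developer code sdk api".split(),
    "brand marketing campaign audience".split(),
    "developer sdk code release".split(),
]

vocab, vectors = tfidf(toy_docs)
assignments, centroids = kmeans(vectors, k=2)
labels = label_clusters(vocab, centroids)

for cluster_id, terms in enumerate(labels):
    print(cluster_id, terms)
```

Labelling by top centroid terms is only one possible answer to the labelling question; it tends to favour frequent words over descriptive ones, which is part of why cluster labelling is treated as a challenge in its own right.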
Specialisation: Business Development
Language: English
Publication date: 14 Sep. 2017
Number of pages: 70
ID: 262520614