• Abiram Mohanaraj
  • Elisabeth Niemeyer Laursen
4th semester, Software, Master's (Master's programme)
A key challenge with conversational agents is providing user-tailored responses. This includes structuring personal information about a particular user and referring to it at the right time. Recent research uses knowledge graphs as an ideal representation for storing and retrieving personal information. From our previous semester's work on populating user-specific knowledge graphs, we observe deficiencies in the architectures of existing solutions. The first deficiency is a lack of entity linking to a local personal knowledge graph (PKG), which we refer to as the PKG Statement Linking problem. The second concerns updating the PKG with the information obtained from solving the first; we refer to this as PKG Enrichment. As both of these problems are complex, we focus on the former. We also construct a dataset suitable for training and testing PKG population architectures, since we find no conversational dataset annotated with textual triples, entity links to an open KG, and personal entities. The dataset spans 100 conversations with triple annotations, personal entity annotations, and ConceptNet entity annotations. We choose ConceptNet as the ideal common-sense knowledge graph based on a survey of open knowledge graphs. To collect the annotations, we make three annotation website implementations available. Our proposed solution consists of a personal entity classifier (PEC) and a personal entity disambiguator (PED). The PEC determines whether a non-pronoun entity mention in an utterance is present in the PKG. For this purpose, we propose a new transformer architecture with a modified input embedding and masking layer. As input, we propose a supergraph consisting of a PKG and an Utterance Relation Graph (URG), which combines an utterance with textual triples (i.e., two substrings and the relation in between). We evaluate the PEC against two baseline models, which it outperforms by 35–42% in F1-score.
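The supergraph input described above could be sketched roughly as follows. This is a minimal illustration only: the node representation, the `Triple` type, and the edge-based attention mask construction are assumptions for exposition, not the thesis implementation.

```python
# Hypothetical sketch: combine a PKG and an Utterance Relation Graph
# (URG) into one supergraph, then derive a boolean attention mask of
# the kind a modified transformer masking layer could consume
# (each node attends only to itself and its graph neighbours).
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    head: str      # substring of the utterance
    relation: str  # textual relation between the two substrings
    tail: str      # substring of the utterance

def build_supergraph(pkg_edges, utterance_tokens, triples):
    """Merge utterance tokens, URG triples, and PKG edges into a
    single node list plus an undirected edge set."""
    nodes = list(utterance_tokens)
    edges = set()
    # URG: link each triple's head/tail substring via its relation node
    for t in triples:
        for part in (t.head, t.relation, t.tail):
            if part not in nodes:
                nodes.append(part)
        edges.add((t.head, t.relation))
        edges.add((t.relation, t.tail))
    # PKG: add personal-entity edges to the same graph
    for h, r, tl in pkg_edges:
        for part in (h, r, tl):
            if part not in nodes:
                nodes.append(part)
        edges.add((h, r))
        edges.add((r, tl))
    return nodes, edges

def attention_mask(nodes, edges):
    """Boolean matrix: entry (i, j) is True when node j is visible to
    node i, i.e. j is i itself or a direct neighbour."""
    index = {n: i for i, n in enumerate(nodes)}
    n = len(nodes)
    mask = [[i == j for j in range(n)] for i in range(n)]
    for a, b in edges:
        i, j = index[a], index[b]
        mask[i][j] = mask[j][i] = True
    return mask
```

In this sketch, shared substrings (e.g. a PKG entity that also appears in a triple) collapse into a single node, which is what lets attention flow between the utterance side and the personal-knowledge side of the supergraph.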
The second component, the PED, is a heuristic-based method for linking an entity mention to a specific personal entity; it also incorporates coreference resolution for linking pronouns. In our experiment with the PED, we achieve an F1-score of 0.87. We find that the critical component of the architecture is the PEC, as errors in its output compound and significantly degrade the performance of the PED. Though our findings show room for improvement, we have uncovered vital information about the foundations of personal knowledge graph population, bringing us one step closer to personalised conversation.
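A heuristic disambiguator of the kind described above could look like the following sketch. The specific rules (normalised string matching, most-recent-mention pronoun resolution) and all names are illustrative assumptions, not the method evaluated in the thesis.

```python
# Illustrative personal entity disambiguator (PED) sketch:
# non-pronoun mentions are matched against PKG entities by string
# heuristics; pronouns are resolved by a simple coreference rule
# (most recently mentioned entity that exists in the PKG).
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}

def disambiguate(mention, pkg_entities, recent_entities):
    """Link a mention to a PKG entity, or return None if no link holds.

    mention         -- surface form from the utterance
    pkg_entities    -- set of known personal entities in the PKG
    recent_entities -- entities mentioned earlier, most recent last
    """
    m = mention.strip().lower()
    if m in PRONOUNS:
        # Coreference heuristic: walk the mention history backwards
        # and take the first entity present in the PKG.
        for candidate in reversed(recent_entities):
            if candidate in pkg_entities:
                return candidate
        return None
    # Exact match on normalised names first, then substring match.
    for e in pkg_entities:
        if e.lower() == m:
            return e
    for e in pkg_entities:
        if m in e.lower() or e.lower() in m:
            return e
    return None
```

The compounding-error effect noted above is visible even in this toy version: if an upstream classifier wrongly decides a mention belongs in the PKG, every downstream link derived from it inherits the mistake.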
Language: English
Publication date: 2022
Number of pages: 28
ID: 473195124