Analysis and Classification of Word Co-Occurrence Networks From Alzheimer’s Patients and Controls

Tristan Millington, Saturnino Luz

Research output: Contribution to journal (article, peer-reviewed)

Abstract

In this paper, we construct word co-occurrence networks from transcripts of controls and of patients with potential Alzheimer’s disease (AD), using the ADReSS challenge dataset of spontaneous speech. We examine measures of the structure of these networks for significant differences, finding that networks from Alzheimer’s patients have lower heterogeneity and centralization but higher edge density. We then use these measures, a network embedding method, and some measures from the word frequency distribution to classify the transcripts as control or Alzheimer’s, and to estimate a participant's cognitive test score from the transcript. We find it is possible to distinguish between the AD and control networks on structure alone, achieving 66.7% accuracy on the test set, and to predict cognitive scores with a root mean squared error of 5.675. Using the network measures is more successful than using the network embedding method. However, if the networks are shuffled, we find that relatively few of the measures differ, indicating that word frequency drives many of the network properties. This observation is borne out by the classification experiments, where word frequency measures perform similarly to the network measures.
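The following Python sketch illustrates the kind of network the abstract describes; it is not the authors' code. It assumes whitespace tokenization, an undirected and unweighted graph, and a co-occurrence window of two adjacent words. It also assumes common textbook definitions for the three measures: edge density as the fraction of possible edges present, heterogeneity as the coefficient of variation of the degree distribution, and centralization as Freeman degree centralization. The paper may define any of these differently, and the function names `cooccurrence_edges` and `network_measures` are hypothetical.

```python
from statistics import mean, pstdev


def cooccurrence_edges(text, window=2):
    """Return undirected edges between distinct words that fall within `window` tokens.

    Assumption: lowercase whitespace tokenization; window=2 links adjacent words only.
    """
    tokens = text.lower().split()
    edges = set()
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[i] != tokens[j]:  # skip self-loops from repeated words
                edges.add(frozenset((tokens[i], tokens[j])))
    return edges


def network_measures(edges):
    """Compute edge density, degree heterogeneity, and degree centralization.

    Assumption: these are standard definitions, not necessarily the paper's.
    """
    degree = {}
    for edge in edges:
        for word in edge:
            degree[word] = degree.get(word, 0) + 1

    n = len(degree)
    if n < 3:
        raise ValueError("need at least three connected words")

    degrees = list(degree.values())
    max_degree = max(degrees)
    return {
        # fraction of the n*(n-1)/2 possible edges that are present
        "density": 2 * len(edges) / (n * (n - 1)),
        # coefficient of variation of the degree distribution
        "heterogeneity": pstdev(degrees) / mean(degrees),
        # Freeman degree centralization, normalized by the star-graph maximum
        "centralization": sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2)),
    }


if __name__ == "__main__":
    sample = "the boy takes the cookie"  # made-up sentence, not from the dataset
    print(network_measures(cooccurrence_edges(sample)))
```

Measures like these could then be collected per transcript into a feature vector for a classifier or regressor, which is the general workflow the abstract outlines.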
Original language: English
Number of pages: 12
Journal: Frontiers in Computer Science
Publication status: Published - 29 Apr 2021

