Small Town or Metropolis? Analyzing the Relationship between Population Size and Language

Amy Rechkemmer, Steve Wilson, Rada Mihalcea

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


Variation in the language used by different cultures has long been a topic of study for researchers in linguistics and psychology, but language is often compared across multiple countries in order to show a difference in culture. As a geographically large country whose citizens are diverse in background and experience, the U.S. also contains cultural differences within its own borders. Using a set of over 2 million posts from distinct Twitter users around the country dating back as far as 2014, we ask the following question: is there a difference in how Americans express themselves online depending on whether they reside in an urban or rural area? We categorize Twitter users as either urban or rural and identify ideas and language that are more commonly expressed in tweets written by one population than by the other. We take this further by analyzing how the language of specific U.S. cities compares to that of other cities and by training models to predict whether a user is from an urban or rural area. We publicly release the tweet and user IDs that can be used to reconstruct the dataset for future studies in this direction.
Original language: English
Title of host publication: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020)
Publisher: European Language Resources Association (ELRA)
Number of pages: 5
ISBN (Electronic): 979-10-95546-34-4
Publication status: Published - 16 May 2020
Event: 12th Language Resources and Evaluation Conference - Le Palais du Pharo, Marseille, France
Duration: 11 May 2020 to 16 May 2020
Conference number: 12


Conference: 12th Language Resources and Evaluation Conference
Abbreviated title: LREC 2020


Keywords
  • population
  • cities
  • culture
  • social media


