Session stitching using sequence fingerprinting for web page visits

Johannes De Smedt, Ewelina Lacka, Spyro Nita, Hans-Helmut Kohls, Ross Paton

Research output: Contribution to journal › Article › peer-review

Abstract

The nature of people’s web navigation has significantly changed in recent years. The advent of smartphones and other handheld devices has given rise to web users consulting websites with more than one device, or using a shared device. As a result, large volumes of seemingly disjoint data are available, which when analysed together can support decision-making. The task of identifying web sessions by linking such data back to a specific person, however, is hard. The idea of session stitching aims to overcome this by using machine learning inference to identify similar or identical users. Many such efforts use various demographic data or device-based features to train matching algorithms. However, these variables are often not available for every dataset or are recorded differently, making a streamlined setup difficult. Besides, they often result in vast feature spaces which are hard to use for actionable interpretation.

In this paper, we present an alternative approach based on the fingerprinting of web pages visited by users in a single session. By learning behavioral patterns from these sequences of page visits, we obtain features that can be used for matching without requiring sensitive user-agent data such as IP, geolocation, or device details, as is common with other approaches. Using these sequential fingerprints does not rely on pre-defined features, but only requires the recording of web page visits, making our approach actionable. The approach is empirically tested on real-life web logs and compared with matching using regular user-agent features and state-of-the-art embedding techniques. Results in an ecommerce context show that sequential features can still obtain strong performance with fewer features, facilitating decision-making on session stitching and informing subsequent related activities such as marketing or customer analysis.
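
As a rough illustration of the general idea of matching sessions on page-visit sequences alone, the sketch below counts n-grams of consecutive page visits per session and compares sessions by cosine similarity. This is not the method evaluated in the paper: the n-gram representation, the similarity measure, the function names (`fingerprint`, `cosine`, `best_match`), and the example page names are all assumptions chosen for brevity.

```python
from collections import Counter
from math import sqrt


def fingerprint(pages, n=2):
    """Count the n-grams of consecutive page visits in one session."""
    return Counter(tuple(pages[i:i + n]) for i in range(len(pages) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(query, candidates, n=2):
    """Return (session_id, score) for the candidate most similar to query."""
    q = fingerprint(query, n)
    scored = {sid: cosine(q, fingerprint(pages, n))
              for sid, pages in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical sessions: page names are illustrative, not taken from the paper's data.
sessions = {
    "s1": ["home", "shoes", "cart", "checkout"],
    "s2": ["home", "blog", "about"],
}
match, score = best_match(["home", "shoes", "cart"], sessions)
```

Only the recorded page sequence enters the comparison here; no IP, geolocation, or device field is needed. A learned model over such sequence features would replace the plain similarity score in a realistic setup.
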
Original language: English
Article number: 113579
Journal: Decision Support Systems
Early online date: 28 Apr 2021
DOIs
Publication status: E-pub ahead of print - 28 Apr 2021

Keywords / Materials (for Non-textual outputs)

  • session stitching
  • web analytics
  • sequence mining
  • session fingerprinting
