Edinburgh Research Explorer

The VADA Architecture for Cost-Effective Data Wrangling

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

  • Nikolaos Konstantinou
  • Martin Koehler
  • Edward Abel
  • Cristina Civili
  • Bernd Neumayr
  • Emanuel Sallinger
  • Alvaro A.A. Fernandes
  • Georg Gottlob
  • John A. Keane
  • Leonid Libkin
  • Norman W. Paton

Open Access permissions

Open

Documents

http://dl.acm.org/citation.cfm?doid=3035918.3058730
Original language: English
Title of host publication: Proceedings of the 2017 ACM International Conference on Management of Data
Place of Publication: New York, NY, USA
Publisher: ACM
Pages: 1599-1602
Number of pages: 4
ISBN (Print): 978-1-4503-4197-4
Publication status: Published - 9 May 2017
Event: 2017 ACM International Conference on Management of Data - Chicago, United States
Duration: 14 May 2017 to 19 May 2017
http://sigmod2017.org/

Publication series

Name: SIGMOD '17
Publisher: ACM

Conference

Conference: 2017 ACM International Conference on Management of Data
Abbreviated title: SIGMOD/PODS 2017
Country: United States
City: Chicago
Period: 14/05/17 to 19/05/17

Abstract

Data wrangling, the multi-faceted process by which the data required by an application is identified, extracted, cleaned and integrated, is often cumbersome and labor intensive. In this paper, we present an architecture that supports a complete data wrangling lifecycle, orchestrates components dynamically, builds on automation wherever possible, is informed by whatever data is available, refines automatically produced results in the light of feedback, takes into account the user's priorities, and supports data scientists with diverse skill sets. The architecture is demonstrated in practice for wrangling property sales and open government data.

    Research areas

  • data wrangling

ID: 36374809