Edinburgh Research Explorer

Data Wrangling for Big Data: Challenges and Opportunities

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Open Access permissions

Open

Documents

    Final published version, 507 KB, PDF-document

    Licence: Creative Commons: Attribution (CC-BY)

http://openproceedings.org/html/pages/2016_edbt.html
Original language: English
Title of host publication: Advances in Database Technology — EDBT 2016
Subtitle of host publication: Proceedings of the 19th International Conference on Extending Database Technology
Pages: 473-478
Number of pages: 6
Publication status: Published - 2016
Event: 19th International Conference on Extending Database Technology - Bordeaux, France
Duration: 15 Mar 2016 to 18 Mar 2016
http://edbticdt2016.labri.fr/

Publication series

Name: Advances in Database Technology
Publisher: University of Konstanz
ISSN (Print): 2367-2005

Conference

Conference: 19th International Conference on Extending Database Technology
Abbreviated title: EDBT 2016
Country: France
City: Bordeaux
Period: 15/03/16 to 18/03/16

Abstract

Data wrangling is the process by which the data required by an application is identified, extracted, cleaned and integrated, to yield a data set that is suitable for exploration and analysis. Although there are widely used Extract, Transform and Load (ETL) techniques and platforms, they often require manual work from technical and domain experts at different stages of the process. When confronted with the 4 V's of big data (volume, velocity, variety and veracity), manual intervention may make ETL prohibitively expensive. This paper argues that providing cost-effective, highly automated approaches to data wrangling involves significant research challenges, requiring fundamental changes to established areas such as data extraction, integration and cleaning, and to the ways in which these areas are brought together. Specifically, the paper discusses the importance of comprehensive support for context awareness within data wrangling, and the need for adaptive, pay-as-you-go solutions that automatically tune the wrangling process to the requirements and resources of the specific application.

ID: 25070474