Abstract
Data wrangling is the process by which the data required by an application is identified, extracted, cleaned and integrated, to yield a data set that is suitable for exploration and analysis. Although there are widely used Extract, Transform and Load (ETL) techniques and platforms, they often require manual work from technical and domain experts at different stages of the process. When confronted with the 4 V’s of big data (volume, velocity, variety and veracity), manual intervention may make ETL prohibitively expensive. This paper argues that providing cost-effective, highly-automated approaches to data wrangling involves significant research challenges, requiring fundamental changes to established areas such as data extraction, integration and cleaning, and to the ways in which these areas are brought together. Specifically, the paper discusses the importance of comprehensive support for context awareness within data wrangling, and the need for adaptive, pay-as-you-go solutions that automatically tune the wrangling process to the requirements and resources of the specific application.
| Original language | English |
|---|---|
| Title of host publication | Advances in Database Technology — EDBT 2016 |
| Subtitle of host publication | Proceedings of the 19th International Conference on Extending Database Technology |
| Pages | 473-478 |
| Number of pages | 6 |
| DOIs | |
| Publication status | Published - 2016 |
| Event | 19th International Conference on Extending Database Technology, Bordeaux, France. Duration: 15 Mar 2016 → 18 Mar 2016. http://edbticdt2016.labri.fr/ |
Publication series
| Name | Advances in Database Technology |
|---|---|
| Publisher | University of Konstanz |
| ISSN (Print) | 2367-2005 |
Conference
| Conference | 19th International Conference on Extending Database Technology |
|---|---|
| Abbreviated title | EDBT 2016 |
| Country/Territory | France |
| City | Bordeaux |
| Period | 15/03/16 → 18/03/16 |
| Internet address | http://edbticdt2016.labri.fr/ |
Fingerprint
Dive into the research topics of 'Data Wrangling for Big Data: Challenges and Opportunities'. Together they form a unique fingerprint.
Projects
- 1 Finished
VADA: Value Added Data Systems: Principles and Architecture
Libkin, L. (Principal Investigator), Buneman, P. (Co-investigator), Fan, W. (Co-investigator) & Pieris, A. (Co-investigator)
1/04/15 → 30/09/20
Project: Research