Segmented Gel Electrophoresis Images used for Training GelGenie Models

  • Matthew Aquilina (Creator)
  • Nathan Wu (Creator)
  • Kiros Kwan (Creator)
  • Filip Busic (Creator)
  • James Dodd (Creator)
  • Laura Nicolas-Saenz (Creator)
  • Alan O'Callaghan (Creator)
  • Peter Bankhead (Creator)
  • Katherine Dunn (Creator)
  • Siyuan Stella Wang (Contributor)
  • Yichen Zhao (Contributor)
  • Thomas Mayer (Contributor)
  • Huangchen Cui (Contributor)
  • Joana Reis (Contributor)
  • Ricarda Törner (Contributor)

Dataset

Description

This repository contains gel images and corresponding hand-labelled segmentation maps (575 total) from various sources. The images vary in size, shape, gel content and imaging conditions. In more detail:

  • matthew_gels (301 total), matthew_gels_2 (85 total) and nathan_gels (37 total): images originating from previous experiments in the Dunn Lab.
  • quantitation_ladder_gels (35 total): gels generated specifically for the GelGenie project, containing gel ladders in each well. These were used to generate the data for Figure 1 in the main paper.
  • stella_gels_for_finetuning (26 total): gel images provided by Siyuan Stella Wang (Wyss Institute/Dana-Farber Cancer Institute), which we used to fine-tune our U-Net model and produce the results in Figure 4C of the main paper.
  • external_gels (25 total): images provided by 5 different researchers: Yichen Zhao (images generated at the University of Waterloo), Huangchen Cui (images generated at Tsinghua University), Thomas Mayer (images generated at the Technical University of Munich), Joana Reis (images generated at the Dana-Farber Cancer Institute) and Ricarda Törner (images generated at the Dana-Farber Cancer Institute). These were used as an external unseen test set for our fine-tuned model.
  • lsdb_gels (66 total): segmentation maps of images downloaded from the RGP Caps dataset, which is available from https://dbarchive.biosciencedbc.jp/en/rgp-caps/desc.html. The original images have been shared under a Creative Commons Attribution-Share Alike 2.1 Japan license. Permission was obtained from the data depositors for the training of our models on these images, as well as for the sharing of derived segmentation masks under the CC-BY license. To use this portion of the dataset, you will need to download the original images from the website and place them alongside our segmentation masks, mirroring the setup used for all the other datasets in this repository.

The folders are organised as follows:

  • The images, val_images and test_images folders contain the original images, split into training, validation and testing partitions respectively (except for the lsdb_gels dataset).
  • The masks, val_masks and test_masks folders contain the corresponding segmentation maps for each image in the main folders. Naming is identical, i.e. an image named test_image.tif will have a segmentation map also named test_image.tif.
  • The external_gels folder does not have validation or test partitions.

The segmentation maps are 8-bit images in which a pixel value of 0 corresponds to a background pixel and a pixel value of 1 corresponds to a foreground pixel.

This is the main dataset used to train the GelGenie models, for which the corresponding source code and GUI can be downloaded at https://github.com/mattaq31/GelGenie.
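
As an illustration of the folder naming described above, the following is a minimal Python sketch that pairs each image with its segmentation map by filename. It is not part of the GelGenie code base: the function name pair_images_with_masks, the split labels "train", "val" and "test", and the assumption that the split folders sit directly inside each dataset folder (e.g. matthew_gels/images) are all hypothetical and should be adjusted to the actual layout of the download.

```python
from pathlib import Path

# Split label -> (image folder, mask folder). The folder names come from the
# description above; the split labels themselves are illustrative.
SPLIT_FOLDERS = {
    "train": ("images", "masks"),
    "val": ("val_images", "val_masks"),
    "test": ("test_images", "test_masks"),
}


def _index_by_name(folder):
    """Map filename -> path for every file in folder (empty if folder is absent)."""
    if not folder.is_dir():
        return {}
    return {p.name: p for p in folder.iterdir() if p.is_file()}


def pair_images_with_masks(dataset_dir, split="train"):
    """Return (pairs, images_without_mask, masks_without_image) for one split.

    Assumption: the split folders sit directly inside dataset_dir,
    e.g. matthew_gels/images and matthew_gels/masks.
    """
    image_folder, mask_folder = SPLIT_FOLDERS[split]
    images = _index_by_name(Path(dataset_dir) / image_folder)
    masks = _index_by_name(Path(dataset_dir) / mask_folder)

    # An image and its segmentation map share the same filename.
    shared = sorted(images.keys() & masks.keys())
    pairs = [(images[name], masks[name]) for name in shared]
    images_without_mask = sorted(images.keys() - masks.keys())
    masks_without_image = sorted(masks.keys() - images.keys())
    return pairs, images_without_mask, masks_without_image
```

For the lsdb_gels folder, a check of this kind could be run after downloading the original images: any name left in masks_without_image would indicate a mask whose original image has not yet been placed alongside it (assuming the downloaded files keep the same names as the masks).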

Data Citation

Aquilina, M., Wu, N., Kwan, K., Bušić, F., Dodd, J., Nicolás-Sáenz, L., O'Callaghan, A., Bankhead, P., & Dunn, K. (2025). Segmented Gel Electrophoresis Images used for Training GelGenie Models [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14641949
Date made available: 13 Jan 2025
Publisher: Zenodo