AIRRSHIP: Simulating human B cell receptor repertoire sequences

Catherine Sutherland, Graeme J.M. Cowan

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

SUMMARY: Adaptive Immune Receptor Repertoire Sequencing is a rapidly developing field that has advanced understanding of the role of the adaptive immune system in health and disease. Numerous tools have been developed to analyse the complex data produced by this technique, but work to compare their accuracy and reliability has been limited. Thorough, systematic assessment of their performance depends on the ability to produce high-quality simulated datasets with known ground truth. We have developed AIRRSHIP, a flexible and fast Python package that produces synthetic human B cell receptor sequences. AIRRSHIP uses a comprehensive set of reference data to replicate key mechanisms in the immunoglobulin recombination process, with a particular focus on junctional complexity. Repertoires generated by AIRRSHIP are highly similar to published data, and all steps in the sequence generation process are recorded. These data can be used not only to determine the accuracy of repertoire analysis tools but also, by tuning the large number of user-controllable parameters, to give insight into factors that contribute to inaccuracies in results.

AVAILABILITY AND IMPLEMENTATION: AIRRSHIP is implemented in Python. It is available online and on PyPI, and documentation is also available online.
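
To make the idea of simulated sequences with known ground truth concrete, the following is a minimal sketch of a toy recombination simulator. It is not AIRRSHIP's code or API. The germline sequences are made up, the function and parameter names (`simulate_sequence`, `max_trim`, `max_n_add`) are hypothetical, and trimming and insertion lengths are drawn uniformly as a simplifying assumption, whereas the summary states that AIRRSHIP relies on reference data. The sketch covers only segment choice, end trimming, and random junctional insertions, and records every choice alongside the sequence.

```python
import random

# Made-up germline segments for illustration only (not real reference alleles).
V_SEGMENTS = {"V_toy1": "CAGGTGCAGCTGGTGCAGTCTGG", "V_toy2": "GAGGTGCAGCTGGTGGAGTCTGG"}
D_SEGMENTS = {"D_toy1": "GGTATAGCAGCAGCTGGTAC", "D_toy2": "GTATTACTATGGTTCGGGGAG"}
J_SEGMENTS = {"J_toy1": "ACTACTTTGACTACTGGGGCCAG", "J_toy2": "CTGGTTCGACCCCTGGGGCCAG"}


def trim(seq, n_5prime, n_3prime):
    """Remove n_5prime bases from the start and n_3prime bases from the end."""
    return seq[n_5prime:len(seq) - n_3prime]


def random_nucleotides(rng, n):
    """Return n random bases, standing in for non-templated junctional insertions."""
    return "".join(rng.choice("ACGT") for _ in range(n))


def simulate_sequence(rng, max_trim=4, max_n_add=6):
    """Build one toy V-D-J rearrangement and record every step taken.

    Assumption: all lengths are drawn uniformly; a realistic simulator
    would sample them from empirical distributions.
    """
    v_call = rng.choice(sorted(V_SEGMENTS))
    d_call = rng.choice(sorted(D_SEGMENTS))
    j_call = rng.choice(sorted(J_SEGMENTS))

    # Trimming happens at the ends that take part in the junctions.
    v_trim_3 = rng.randint(0, max_trim)
    d_trim_5 = rng.randint(0, max_trim)
    d_trim_3 = rng.randint(0, max_trim)
    j_trim_5 = rng.randint(0, max_trim)

    # Random insertions at the V-D (np1) and D-J (np2) junctions.
    np1 = random_nucleotides(rng, rng.randint(0, max_n_add))
    np2 = random_nucleotides(rng, rng.randint(0, max_n_add))

    sequence = (
        trim(V_SEGMENTS[v_call], 0, v_trim_3)
        + np1
        + trim(D_SEGMENTS[d_call], d_trim_5, d_trim_3)
        + np2
        + trim(J_SEGMENTS[j_call], j_trim_5, 0)
    )

    # The returned record is the ground truth for this sequence.
    return {
        "sequence": sequence,
        "v_call": v_call, "d_call": d_call, "j_call": j_call,
        "v_trim_3": v_trim_3, "d_trim_5": d_trim_5,
        "d_trim_3": d_trim_3, "j_trim_5": j_trim_5,
        "np1": np1, "np2": np2,
    }


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run can be repeated
    for _ in range(3):
        record = simulate_sequence(rng)
        print(record["v_call"], record["d_call"], record["j_call"], record["sequence"])
```

Because each record stores the segments, trimming amounts, and inserted bases that produced the sequence, the gene calls returned by an analysis tool could be compared directly against `v_call`, `d_call`, and `j_call`. Raising `max_trim` or `max_n_add` in this sketch would be one way to explore how junctional complexity affects a tool's accuracy.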

Original language: English
Article number: btad365
Number of pages: 4
Issue number: 6
Publication status: Published - 5 Jun 2023

Keywords / Materials (for Non-textual outputs)

  • documentation
  • humans
  • reproducibility of results
  • software

