Preface

Why a reference book?

With the information gained over the last two decades, it is time to re-evaluate how we infer and analyze genes and alleles in the expressed repertoire. IGHV alleles hold valuable information about the function of the adaptive immune system. This book uses IGH repertoires from VDJbase.org (Omer et al. 2020) and describes a new way of exploring the expressed alleles and inferring genotypes. The book:

  • Shows the expressed repertoire data of four projects from VDJbase.org.

  • Includes interactive and dynamic visualizations.

  • Integrates applications that allow you to “play” with the data to gain new insights.

  • Express yourself! Each page includes a comment section where you can add insights and thoughts about the expressed alleles and visualizations (under development).

Genotyping summarizes the genes and alleles that an individual carries (Yaari and Kleinstein 2015). Although capturing this information from genomic data is technically feasible, it is very challenging because these genomic loci are highly repetitive (Watson et al. 2013). In recent years, several methods have been developed to deduce this information from expressed B cell receptor repertoires (Slabodkin et al. 2021). These methods rely on a pre-set maximum number of alleles per gene that can appear in the genotype and on the overall expression of the alleles. Although the current methods have their merits, they have several caveats that can bias the inferred genotypes (Zhang et al. 2015). One major pitfall is the quality of the alignment. Each aligned sequence is assigned one or more matching allele annotations. Several factors can cause errors in the final annotation of a given sequence; the major ones are somatic hypermutation, the length of the sequenced V(D)J segment, and the quality of the sequencing. The existing genotype inference methods attempt to overcome these factors, with limited success.

In this book, we propose abandoning the concept of genes when inferring a genotype and instead determining the alleles directly. To better understand the influence of the alleles on one another, we divided the alleles into allele similarity clusters (ASCs). The ASCs were formed based on the nucleotide-level similarity between the alleles’ germline sequences (for more information, see chapter 1). The concept of allele clustering is not new and was previously implemented by IMGT (Giudicelli and Lefranc 1999). Yet, with the information gained over the last two decades on the expressed alleles, we sought to re-group the alleles. Re-grouping the alleles into similarity clusters is consistent with the multiple annotations observed for sequences within the expressed repertoires.
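To make the idea of grouping alleles by nucleotide-level similarity concrete, here is a minimal sketch. It is not the procedure used in this book (chapter 1 describes that); the distance measure (fraction of mismatched positions between pre-aligned sequences), the single-linkage rule, the 0.1 cutoff, and the toy sequences and allele names are all assumptions chosen for illustration.

```python
from itertools import combinations


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_alleles(germlines: dict[str, str], max_distance: float) -> list[list[str]]:
    """Single-linkage grouping: two alleles share a cluster if a chain of
    pairwise distances <= max_distance connects them (hypothetical rule)."""
    parent = {name: name for name in germlines}

    def find(name: str) -> str:
        # Follow parent links to the cluster representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(germlines, 2):
        if hamming_fraction(germlines[a], germlines[b]) <= max_distance:
            parent[find(a)] = find(b)

    clusters: dict[str, list[str]] = {}
    for name in germlines:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy 20-nt sequences with made-up names; these are NOT real IGHV germlines.
toy_germlines = {
    "alleleA": "ACGTACGTACGTACGTACGT",
    "alleleB": "ACGTACGTACGTACGTACGA",  # one mismatch relative to alleleA
    "alleleC": "TTTTGGGGCCCCAAAATTTT",
    "alleleD": "TTTTGGGGCCCCAAAATTTA",  # one mismatch relative to alleleC
}

clusters = cluster_alleles(toy_germlines, max_distance=0.1)
print(clusters)
```

In this toy input, alleleA and alleleB fall into one cluster and alleleC and alleleD into another, because each pair differs at a single position while the two pairs differ at most positions. A sequence that aligns ambiguously to alleleA and alleleB would then still be assigned to a single cluster, which is the intuition behind working at the cluster level.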

How to read this book?

Each chapter of this book showcases a different IGHV ASC. The chapters include information on the expression of the alleles assigned to these ASCs in several projects. Our ASC-based genotyping method has preset allele thresholds, along with options to interactively “play” with the data and gain insights into the ASC’s expressed alleles and genotype combinations.

We included the allele groups for full-length and partial-length (BIOMED-2; Van Dongen et al. 2003) V segments. The ASC numbers differ between the two V lengths because the clustering is based on the given germline length.
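The dependence of the clusters on sequence length can be illustrated with a small sketch. It assumes, purely for illustration, that the partial-length sequences lack the first few positions of the full-length ones; the trim length, allele names, and sequences are hypothetical and do not correspond to real germlines or to the actual BIOMED-2 amplicon boundaries.

```python
def mismatches(a: str, b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


# Toy full-length sequences that differ only in their first four positions.
full_length = {
    "alleleX": "AAAACCGGTTACGTACGTAC",
    "alleleY": "TTTTCCGGTTACGTACGTAC",
}

# Hypothetical number of leading positions missing from the partial-length sequences.
trim = 4
partial_length = {name: seq[trim:] for name, seq in full_length.items()}

full_diff = mismatches(full_length["alleleX"], full_length["alleleY"])
partial_diff = mismatches(partial_length["alleleX"], partial_length["alleleY"])
print(full_diff, partial_diff)
```

Here the two toy alleles are distinguishable at full length but identical once trimmed, so any similarity-based grouping would treat them differently at the two lengths. This is why cluster membership, and therefore ASC numbering, need not agree between the full and partial sets.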

Who are we?

We are part of the AIRR Community and members of the IARC committee. Our goal is to make immune repertoire exploration more accessible and fun! We want to create a space where we can discuss and explore the IGH reference and gain insights from the community.

References

Corcoran, Martin M., Ganesh E. Phad, Néstor Vázquez Bernat, Christiane Stahl-Hennig, Noriyuki Sumida, Mats A. A. Persson, Marcel Martin, and Gunilla B. Karlsson Hedestam. 2016. “Production of Individualized V Gene Databases Reveals High Levels of Immunoglobulin Genetic Diversity.” Nature Communications 7 (1): 13642. https://doi.org/10.1038/ncomms13642.
Gadala-Maria, Daniel, Gur Yaari, Mohamed Uduman, and Steven H. Kleinstein. 2015. “Automated Analysis of High-Throughput B-Cell Sequencing Data Reveals a High Frequency of Novel Immunoglobulin V Gene Segment Alleles.” Proceedings of the National Academy of Sciences 112 (8): E862–70. https://doi.org/10.1073/pnas.1417683112.
Giudicelli, Veronique, and Marie-Paule Lefranc. 1999. “Ontology for Immunogenetics: The IMGT-ONTOLOGY.” Bioinformatics 15 (12): 1047–54.
Matsuda, Fumihiko, Kazuo Ishii, Patrice Bourvagnet, Kei-ichi Kuma, Hidenori Hayashida, Takashi Miyata, and Tasuku Honjo. 1998. “The Complete Nucleotide Sequence of the Human Immunoglobulin Heavy Chain Variable Region Locus.” The Journal of Experimental Medicine 188 (11): 2151–62.
Ohlin, Mats. 2021. “Poorly Expressed Alleles of Several Human Immunoglobulin Heavy Chain Variable Genes Are Common in the Human Population.” Frontiers in Immunology 11: 603980.
Omer, Aviv, Or Shemesh, Ayelet Peres, Pazit Polak, Adrian J Shepherd, Corey T Watson, Scott D Boyd, Andrew M Collins, William Lees, and Gur Yaari. 2020. “VDJbase: An Adaptive Immune Receptor Genotype and Haplotype Database.” Nucleic Acids Research 48 (D1): D1051–56.
Slabodkin, Andrei, Maria Chernigovskaya, Ivana Mikocziova, Rahmad Akbar, Lonneke Scheffer, Milena Pavlović, Habib Bashour, et al. 2021. “Individualized VDJ Recombination Predisposes the Available Ig Sequence Space.” Genome Research 31 (12): 2209–24.
Van Dongen, JJM, AW Langerak, M Brüggemann, PAS Evans, M Hummel, FL Lavender, E Delabesse, et al. 2003. “Design and Standardization of PCR Primers and Protocols for Detection of Clonal Immunoglobulin and T-Cell Receptor Gene Recombinations in Suspect Lymphoproliferations: Report of the BIOMED-2 Concerted Action BMH4-CT98-3936.” Leukemia 17 (12): 2257–317.
Watson, Corey T, Karyn M Steinberg, John Huddleston, Rene L Warren, Maika Malig, Jacqueline Schein, A Jeremy Willsey, et al. 2013. “Complete Haplotype Sequence of the Human Immunoglobulin Heavy-Chain Variable, Diversity, and Joining Genes and Characterization of Allelic and Copy-Number Variation.” The American Journal of Human Genetics 92 (4): 530–46.
Yaari, Gur, and Steven H Kleinstein. 2015. “Practical Guidelines for B-Cell Receptor Repertoire Sequencing Analysis.” Genome Medicine 7 (1): 1–14.
Zhang, Bochao, Wenzhao Meng, Eline T Luning Prak, and Uri Hershberg. 2015. “Discrimination of Germline V Genes at Different Sequencing Lengths and Mutational Burdens: A New Tool for Identifying and Evaluating the Reliability of V Gene Assignment.” Journal of Immunological Methods 427: 105–16.