1  ASC-based genotype

Genotyping an individual’s repertoire is becoming common practice in downstream analysis. Several tools are available for this inference, notably TIgGER (Gadala-Maria et al. 2015) and IgDiscover (Corcoran et al. 2016). Although these methods infer the genotype with high accuracy, they often fail to detect low-frequency alleles: the restrictions under which they operate favor specificity over sensitivity.

Aside from low-frequency alleles, another limitation that can hinder genotype inference is the multiple assignment of sequences. Each sequence in the repertoire is assigned an inferred allele for each of its V(D)J segments. These assignments can be influenced by several factors, such as sequencing errors, somatic hypermutation, amplicon length, and the initial reference set. These confounding factors can result in more than one allele being assigned to a single sequence segment. Such multiple assignments have a downstream effect on genotype inference, and each tool deals with this effect in its own way.
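To make the multiple-assignment issue concrete, the following minimal Python sketch tallies, for each allele, how many sequences were assigned to it alone and how many carried it as one of several candidates. It assumes that multiple assignments are written as a comma-separated string, which is a common convention in aligner output; the function name `count_allele_assignments` and the toy calls are hypothetical and are not taken from TIgGER or IgDiscover.

```python
from collections import Counter


def count_allele_assignments(v_calls):
    """Tally single and multiple assignments per allele.

    v_calls: iterable of allele-call strings; a multiple assignment is
    assumed to be comma-separated, e.g. "IGHV1-69*01,IGHV1-69*02".
    """
    single = Counter()    # sequences assigned to exactly one allele
    multiple = Counter()  # sequences where the allele is one of several candidates
    for call in v_calls:
        alleles = [a.strip() for a in call.split(",") if a.strip()]
        target = single if len(alleles) == 1 else multiple
        # set() guards against the same allele being listed twice in one call
        for allele in set(alleles):
            target[allele] += 1
    return single, multiple


# Toy data for illustration only
calls = [
    "IGHV1-69*01",
    "IGHV1-69*01,IGHV1-69*02",
    "IGHV1-69*02",
    "IGHV1-69*01",
]
single, multiple = count_allele_assignments(calls)
print(dict(single))
print(dict(multiple))
```

Keeping the two tallies separate makes it visible how much of an allele's apparent support comes from ambiguous sequences, which is the quantity a genotyping method has to decide how to handle.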

In this reference book, we examined the allele distribution across the population and derived thresholds for genotype inference.
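As a rough illustration of how a threshold-based genotype call could work, the sketch below keeps an allele when its share of the counted sequences reaches a cutoff. Everything here is an assumption made for illustration: the function name `call_genotype`, the choice of denominator (the sum of the counts passed in), the allele names, and the cutoff values are placeholders and are not the thresholds derived in this book.

```python
def call_genotype(allele_counts, thresholds, default_threshold):
    """Return the alleles whose usage fraction meets their threshold.

    allele_counts: dict mapping allele name -> number of assigned sequences
    thresholds: dict mapping allele name -> allele-specific cutoff (fraction)
    default_threshold: cutoff used for alleles without a specific entry
    """
    total = sum(allele_counts.values())
    if total == 0:
        return {}
    genotype = {}
    for allele, n in allele_counts.items():
        fraction = n / total  # assumed denominator: all counted sequences
        if fraction >= thresholds.get(allele, default_threshold):
            genotype[allele] = fraction
    return genotype


# Toy counts and placeholder cutoffs, for illustration only
counts = {"A*01": 900, "A*02": 95, "A*03": 5}
print(call_genotype(counts, {}, 0.01))
print(call_genotype(counts, {"A*02": 0.1}, 0.01))
```

Allowing an allele-specific cutoff alongside a default reflects the idea that a single global cutoff may be too strict for alleles that are typically expressed at low frequency.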

1.1 Allele thresholds

The following thresholds were used to determine the presence of an allele in the genotype.

Corcoran, Martin M., Ganesh E. Phad, Néstor Vázquez Bernat, Christiane Stahl-Hennig, Noriyuki Sumida, Mats A. A. Persson, Marcel Martin, and Gunilla B. Karlsson Hedestam. 2016. “Production of Individualized V Gene Databases Reveals High Levels of Immunoglobulin Genetic Diversity.” Nature Communications 7 (1): 13642. https://doi.org/10.1038/ncomms13642.
Gadala-Maria, Daniel, Gur Yaari, Mohamed Uduman, and Steven H. Kleinstein. 2015. “Automated Analysis of High-Throughput B-Cell Sequencing Data Reveals a High Frequency of Novel Immunoglobulin V Gene Segment Alleles.” Proceedings of the National Academy of Sciences 112 (8): E862–70. https://doi.org/10.1073/pnas.1417683112.