dsvast.blogg.se

Immune repertoire










Probabilistic analysis of putative recombination scenarios and comparison to existing methods. Synthetic 130-bp reads of recombined hypermutation-free IGH sequences and 60-bp reads of TRB sequences were generated with a 5 × 10⁻³ error rate, and processed for analysis by IGoR and two existing methods, MiXCR and Partis. IGoR ranks putative scenarios in descending order of likelihood. a Distribution of the rank of the true scenario as called by IGoR for both TRB and IGH. Note that the best-ranked (maximum-likelihood) scenario is the correct one in less than 30% of cases. b Distribution of the number of scenarios that need to be enumerated (from most to least likely) to include the true scenario with 50% (blue), 75% (green), 90% (red), or 95% (cyan) confidence for IGH (see Supplementary Fig. 10 for the equivalent figure for TRB). c Frequency with which IGoR, MiXCR, and Partis call the correct scenario of recombination as the most likely one (“scenario”) in hypermutation-free IGH, as well as each separate feature of the scenario (“V gene,” etc.). “Failed” corresponds to sequences for which the algorithm did not output an assignment. d Usage frequency of the TRB D gene conditioned on the J gene, as inferred by IGoR and MiXCR (Partis does not handle TCR sequences). IGoR recovers the physiological exclusion between D2 and J1, while MiXCR does not.

a Position weight matrix (PWM) model for predicting hypermutation hotspots in IGH. Each nucleotide σ at position i within ±m of the hypermutation site (in red) has an additive contribution e_i(σ) to the hypermutation log odds (Eq. …). The PWM is learned by expectation-maximization from the out-of-frame sequences of memory B cells. b Comparison between the observed mutation rate per nucleotide and its prediction by the PWM model, as a function of position along the V segment, for the four most frequent V genes. Pearson correlation coefficient ρ and gene usage are given for each. c PWMs inferred from the V, D, and J genes.
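The additive log-odds idea in the PWM caption can be illustrated with a small Python sketch. This is not IGoR's code: the window half-width `M`, the table `E` of contributions e_i(σ), the baseline term `BASELINE_LOG_ODDS`, and both function names are assumptions, and every number is invented for illustration. The sketch only shows how per-position contributions add up in log-odds space and how the sum maps back to a mutation probability.

```python
import math

M = 2  # half-width m of the context window (hypothetical value)

# Hypothetical contributions e_i(sigma), indexed by offset i = -M..M.
# All numbers are invented for illustration, not learned values.
E = {
    -2: {"A": 0.2, "C": -0.1, "G": 0.0, "T": -0.1},
    -1: {"A": 0.5, "C": -0.3, "G": 0.1, "T": -0.3},
    0: {"A": -0.2, "C": 0.1, "G": 0.4, "T": -0.3},
    1: {"A": -0.1, "C": 0.3, "G": -0.1, "T": -0.1},
    2: {"A": 0.0, "C": -0.2, "G": -0.1, "T": 0.3},
}
BASELINE_LOG_ODDS = math.log(0.01)  # hypothetical baseline term (an assumption)


def mutation_log_odds(seq, pos):
    """Sum the additive contributions of the 2*M + 1 nucleotides around pos."""
    if pos - M < 0 or pos + M >= len(seq):
        raise ValueError("context window falls outside the sequence")
    return BASELINE_LOG_ODDS + sum(E[i][seq[pos + i]] for i in range(-M, M + 1))


def mutation_probability(seq, pos):
    """Convert log odds to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-mutation_log_odds(seq, pos)))
```

Calling `mutation_probability("AAGCT", 2)` scores the central G using its two neighbours on each side. In the setting described by the caption the contributions would be learned by expectation-maximization rather than written by hand.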
a V(D)J recombination proceeds by joining randomly selected segments (V, D, and J segments in the case of TRB and IGH). Each segment gets trimmed at its ends (hashed areas), and a varying number of non-templated insertions are added between them (orange). Hypermutations (in the case of B cells) or sequencing errors (in red) further enhance diversity. IGoR lists putative recombination scenarios consistent with the observed sequence, and weighs them according to their likelihood. b The likelihood of each scenario is computed using a Bayesian network of dependencies between the recombination features (V, D, J segment choices, insertions, and deletions), as illustrated here for the human TRB locus. Architectures for TRA and IGH are described in Methods. In the learning mode, IGoR learns recombination statistics from data sequences. In the analysis mode, IGoR outputs detailed recombination scenario statistics for each sequence. In the generation mode, IGoR produces synthetic sequences with specified recombination statistics.
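To make the scenario-weighting idea concrete, here is a toy Python sketch of a factorized scenario probability, scenario ranking, and synthetic scenario generation. It is not IGoR's implementation or interface. The factorization P(V) · P(J) · P(D | J) · P(insVD) · P(insDJ) is a simplifying assumption (deletions, the match to the observed read, and the error model are all left out), the gene names and probabilities are invented, and the function names are hypothetical. The only detail taken from the text is that D2 does not pair with J1.

```python
import random

# Invented toy tables; a real model would learn these from data.
P_V = {"V1": 0.7, "V2": 0.3}
P_J = {"J1": 0.4, "J2": 0.6}
P_D_GIVEN_J = {
    "J1": {"D1": 1.0, "D2": 0.0},  # D2 excluded with J1, as noted in the text
    "J2": {"D1": 0.5, "D2": 0.5},
}
P_INS = {0: 0.5, 1: 0.3, 2: 0.2}  # insertion-length law, reused for VD and DJ


def scenario_probability(scenario):
    """Multiply the conditional factors of the (assumed) Bayesian network."""
    v, d, j, ins_vd, ins_dj = scenario
    return P_V[v] * P_J[j] * P_D_GIVEN_J[j][d] * P_INS[ins_vd] * P_INS[ins_dj]


def rank_scenarios(candidates):
    """Return (scenario, normalized weight) pairs, most likely first."""
    weights = [scenario_probability(s) for s in candidates]
    total = sum(weights)
    if total == 0:
        raise ValueError("no candidate scenario has non-zero probability")
    ranked = sorted(zip(candidates, weights), key=lambda pair: -pair[1])
    return [(s, w / total) for s, w in ranked]


def _sample(table, rng):
    """Draw one key of table with probability proportional to its value."""
    return rng.choices(list(table), weights=list(table.values()))[0]


def generate_scenario(rng):
    """Sample features in an order that respects the dependency D | J."""
    v = _sample(P_V, rng)
    j = _sample(P_J, rng)
    d = _sample(P_D_GIVEN_J[j], rng)
    return (v, d, j, _sample(P_INS, rng), _sample(P_INS, rng))
```

A scenario here is a tuple such as `("V1", "D1", "J2", 1, 0)`. `rank_scenarios` takes the candidate list as given; enumerating the scenarios consistent with an actual read is the hard part and is not attempted in this sketch.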










