
MobiVision V(D)J Algorithm Introduction

Algorithm Overview

Barcode and UMI correction

The schematic diagram of the VDJ library generated by the MobiNova platform is as follows:

 

 

As the structure above shows, the 5' end of Read1 carries the cell label sequence (20 bp) and the UMI sequence (10 bp). To check whether the cell label carried by Read1 is correct, MobiVision compares the cell label in each sequenced fragment against the cell labels in the known whitelist. The MobiCube high-throughput single-cell V(D)J v1.0 kit currently provides nearly 3,000,000 cell label sequences. Sequencing reads that meet either of the following conditions are retained:

  • The cell label of Read1 is in the whitelist;
  • The cell label of Read1 is not in the whitelist, but its minimum Hamming distance to a whitelist cell label is <= 2. In this case the cell label in Read1 is corrected to that whitelist cell label.

For reads that pass, Read1 keeps only the corrected cell label sequence and the UMI sequence. Read2 is left unprocessed at this step.
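The whitelist check above can be sketched as follows. This is a minimal illustration, not MobiVision's implementation: the function names are hypothetical, the layout "cell label first, then UMI" is an assumption, and dropping reads whose nearest whitelist label is ambiguous (a tie) is also an assumption, since the text does not say how ties are handled. A linear scan like this would be far too slow for a whitelist of millions of labels; it only shows the rule.

```python
from typing import Optional, Set, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def split_read1(seq: str, label_len: int = 20, umi_len: int = 10) -> Tuple[str, str]:
    """Split Read1 into (cell label, UMI); assumes the label comes first."""
    return seq[:label_len], seq[label_len:label_len + umi_len]


def correct_cell_label(label: str, whitelist: Set[str], max_dist: int = 2) -> Optional[str]:
    """Return the whitelist label for `label`, or None if the read is dropped."""
    if label in whitelist:
        return label  # exact match, nothing to correct
    best, best_dist, tie = None, max_dist + 1, False
    for candidate in whitelist:
        if len(candidate) != len(label):
            continue
        d = hamming(label, candidate)
        if d < best_dist:
            best, best_dist, tie = candidate, d, False
        elif d == best_dist:
            tie = True  # assumption: ambiguous corrections are rejected
    if best is None or tie:
        return None
    return best
```

For example, with a two-entry whitelist, a label with two mismatches is corrected and a label with three mismatches is dropped.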

 

In the FASTQ data obtained after cell label correction:

  • There may be a 13 bp TSO sequence at the 5' end of the Read1 fragment, and a poly A sequence at the 3' end.
  • There may be a poly T sequence at the 5' end of the Read2 fragment, and the reverse complement of the 13 bp TSO sequence at the 3' end.
  • TSO, poly A, poly T and similar sequences reduce the alignment rate of the library. They therefore need to be removed from both ends of the insert before alignment.
  • Removing adapter, poly A and poly T sequences may leave inserts that are too short, and short fragments are more likely to align incorrectly. After adapter removal, reads whose insert is shorter than 30 bp are therefore filtered out.
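A minimal sketch of the trimming and length filter described above. The `TSO` value is a made-up 13 bp placeholder, not the real kit sequence, and `min_poly` (the shortest homopolymer run treated as poly A or poly T) is an assumed parameter; the text specifies neither. Only exact TSO matches are trimmed here, whereas a real trimmer would usually tolerate mismatches.

```python
import re

TSO = "ACGTACGTACGTA"  # hypothetical 13 bp placeholder, NOT the real TSO sequence
MIN_INSERT_LEN = 30    # reads shorter than this after trimming are dropped

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_read1(seq: str, min_poly: int = 10) -> str:
    """Remove a 5' TSO and a 3' poly A tail from Read1, if present."""
    if seq.startswith(TSO):
        seq = seq[len(TSO):]
    return re.sub(r"A{%d,}$" % min_poly, "", seq)


def trim_read2(seq: str, min_poly: int = 10) -> str:
    """Remove a 5' poly T run and a 3' reverse-complemented TSO from Read2."""
    seq = re.sub(r"^T{%d,}" % min_poly, "", seq)
    tso_rc = revcomp(TSO)
    if seq.endswith(tso_rc):
        seq = seq[:-len(tso_rc)]
    return seq


def long_enough(seq: str) -> bool:
    """Keep only inserts of at least MIN_INSERT_LEN bases."""
    return len(seq) >= MIN_INSERT_LEN
```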

Check VDJ gene chain type

Align the inner primers to the FASTQ inserts, then calculate the ratio of reads aligned to TCR inner primers to all reads aligned to inner primers. If the ratio is greater than 80%, the library is considered a TCR library; if the ratio is less than 20%, it is considered a BCR library; otherwise it is an ALL (BCR+TCR) library.
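The threshold rule above amounts to a three-way decision on one ratio. A small sketch, with a hypothetical function name and the read counts assumed to come from the primer alignment step:

```python
def classify_library(tcr_primer_reads: int, bcr_primer_reads: int) -> str:
    """Classify a library as 'TCR', 'BCR' or 'ALL' from inner-primer read counts."""
    total = tcr_primer_reads + bcr_primer_reads
    if total == 0:
        raise ValueError("no reads aligned to any inner primer")
    tcr_ratio = tcr_primer_reads / total
    if tcr_ratio > 0.8:
        return "TCR"
    if tcr_ratio < 0.2:
        return "BCR"
    return "ALL"  # mixed BCR+TCR library
```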

VDJ gene sequence filter

To keep assembly accurate and fast, all reads are aligned to the V(D)J reference sequences and reads that do not align are discarded. Only the aligned reads are used in the subsequent assembly.

Assemble contig

Collect the reads that share a barcode into one set of FASTQ records, use the de Bruijn graph algorithm to assemble the short fragments into transcripts, and obtain the full-length sequence (contig). Each base of the contig is given a base quality value, and the number of supporting UMIs and reads is also recorded. The same operation is performed for every barcode to obtain the contigs of each barcode.
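To illustrate the idea behind de Bruijn graph assembly, here is a toy sketch for the reads of a single barcode. It is not MobiVision's assembler: it assumes error-free reads, follows only an unambiguous, unbranched path, and does not compute base qualities or UMI and read support. The function names and the choice of `k` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


def build_graph(reads: Iterable[str], k: int) -> Dict[str, Set[str]]:
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def assemble(reads: Iterable[str], k: int) -> Optional[str]:
    """Walk the single unbranched path of the graph and return it as a contig."""
    graph = build_graph(reads, k)
    indegree: Dict[str, int] = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1
    starts = [node for node in graph if indegree[node] == 0]
    if len(starts) != 1:
        return None  # no unique start node: too ambiguous for this toy walk
    node = starts[0]
    contig = node
    visited = {node}
    # Extend while the current node has exactly one successor.
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in visited:
            break  # stop on a cycle instead of looping forever
        contig += nxt[-1]
        visited.add(nxt)
        node = nxt
    return contig
```

Three overlapping reads cut from one sequence are enough to rebuild it, provided every (k-1)-mer in that sequence is unique.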

V(D)J Annotation

The purpose of V(D)J annotation is to find biologically functional receptor sequences. A contig must meet the following conditions:

1. The structure is complete, that is, the contig covers the full-length sequence.

2. It begins with a start codon, and there is no stop codon in the V-J region.

3. The distance from the start codon of the V gene to the last codon of the J gene, divided by 3, is an integer, that is, the V-J region is in frame.

4. The sequence contains the CDR3 region, and the length of the region spanned by V-J is reasonable, to avoid structural abnormalities.

5. The difference between the total length of the V and J reference segments and the length from the first codon of V to the last codon of J is between -25 and 25 amino acids; for IGH it is between -55 and 25 amino acids.
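Conditions 2 and 3 are simple to express in code. The sketch below assumes the annotation step has already produced the coordinates of the V start codon and of the end of the last J codon (`v_start` and `j_end` are hypothetical names), and it assumes ATG as the start codon. The length checks in conditions 4 and 5 are left out.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame_orf(contig: str, v_start: int, j_end: int) -> bool:
    """Check start codon, reading frame and absence of stop codons.

    v_start: 0-based index of the first base of the V start codon.
    j_end:   0-based index one past the last base of the last J codon.
    """
    region = contig[v_start:j_end]
    if len(region) % 3 != 0:
        return False  # V start and J end are not in the same frame
    if not region.startswith("ATG"):
        return False  # no start codon at the V start position
    codons = (region[i:i + 3] for i in range(0, len(region), 3))
    return not any(codon in STOP_CODONS for codon in codons)
```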

CDR3 is determined by looking for the conserved motifs on the left and right sides of CDR3. The CDR3 must start with a C (cysteine), be 5-27 amino acids long, and contain no stop codon. If more than one candidate CDR3 is found, the one with the highest score is taken as the CDR3 region; if the scores are equal, the longer CDR3 sequence is selected.
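A rough sketch of such a motif search on a translated contig. Several points are assumptions, because the text does not spell them out: the right-hand motif is taken to be the commonly described J-region pattern F/W-G-X-G, the reported CDR3 runs from the C through that F or W, and the 5-27 length counts both ends. The scoring scheme is not described in the text, so this sketch treats all candidates as equally scored and applies only the "longer wins" tie-break.

```python
from typing import List, Optional


def find_cdr3(protein: str, min_len: int = 5, max_len: int = 27) -> Optional[str]:
    """Return the longest C...[FW] stretch that is followed by G-X-G, or None.

    `protein` is an amino acid string in which '*' marks a stop codon.
    """
    candidates: List[str] = []
    for start, aa in enumerate(protein):
        if aa != "C":
            continue
        for length in range(min_len, max_len + 1):
            end = start + length  # one past the last CDR3 residue
            if end + 3 > len(protein):
                break  # not enough sequence left for the G-X-G motif
            cdr3 = protein[start:end]
            if "*" in cdr3:
                break  # longer candidates would contain the stop as well
            if cdr3[-1] in "FW" and protein[end] == "G" and protein[end + 2] == "G":
                candidates.append(cdr3)
    if not candidates:
        return None
    return max(candidates, key=len)
```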

Cell Calling

Cell calling is based on whether a barcode contains a valid contig: only a barcode with a valid contig is treated as a real cell rather than an empty droplet or a doublet. In general, the following conditions must be met to select cells expressing V(D)J genes. Only T or B cells undergo V(D)J rearrangement and produce full-length transcripts. A retained barcode must be supported by a sufficient UMI count, to rule out interference from background mRNA. In addition, each UMI must be supported by a sufficient number of reads, to rule out library contamination and sample index hopping.

Assignment of clonotype

Cell barcodes are grouped into clonotypes: barcodes whose paired receptor sequences are the same or similar are assigned to the same clonotype.

The clonotype results include the following fields, which can be used in downstream analysis.

1. clonotype_id: the clonotype identifier

2. frequency: the number of cell barcodes assigned to the clonotype_id

3. proportion: the proportion of cell barcodes assigned to the clonotype_id

4. CDR3_aa: the amino acid sequence of CDR3

5. CDR3_nt: the nucleotide sequence of CDR3
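As an illustration of how such a table could be derived, the sketch below groups barcodes whose chains are exactly identical and fills in the five fields. Grouping "similar" sequences is not attempted, and the input layout, the `clonotype1`-style identifiers and the `chain:sequence` joining with `;` are assumptions made for the example rather than the tool's actual output format.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Chain = Tuple[str, str, str]  # (chain name, CDR3 amino acids, CDR3 nucleotides)


def assign_clonotypes(cells: Dict[str, Sequence[Chain]]) -> List[dict]:
    """Group barcodes with identical chain sets and summarize each group."""
    # Sort chains so that the order in which they were reported does not matter.
    keys = {barcode: tuple(sorted(chains)) for barcode, chains in cells.items()}
    counts = Counter(keys.values())
    total = sum(counts.values())
    rows = []
    # most_common() lists the largest clonotype first.
    for rank, (key, n) in enumerate(counts.most_common(), start=1):
        rows.append({
            "clonotype_id": f"clonotype{rank}",
            "frequency": n,
            "proportion": n / total,
            "CDR3_aa": ";".join(f"{chain}:{aa}" for chain, aa, _ in key),
            "CDR3_nt": ";".join(f"{chain}:{nt}" for chain, _, nt in key),
        })
    return rows
```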

Quality control report

When mobivision vdj runs, it computes statistics on the raw data and analysis results of the whole library and generates a quality control report. The report describes the library as it is, without any data screening or filtering, and is meant to help users judge the overall quality of the raw data and the analysis results. If necessary, users can adjust the library results according to the quality control report before starting downstream analysis.
