Alfalfa Gene Editing Database
Nr:
Query id | Subject id | identity % | alignment length | mismatches | gap openings | q. start | q. end | s. start | s. end | e-value | bit score |
---|---|---|---|---|---|---|---|---|---|---|---|
MS.gene053288.t1 | XP_013469813.1 | 74.1 | 205 | 46 | 1 | 1 | 205 | 20 | 217 | 1.30E-48 | 203 |
Swissprot:
Query id | Subject id | identity % | alignment length | mismatches | gap openings | q. start | q. end | s. start | s. end | e-value | bit score |
---|---|---|---|---|---|---|---|---|---|---|---|
Trembl:
Query id | Subject id | identity % | alignment length | mismatches | gap openings | q. start | q. end | s. start | s. end | e-value | bit score |
---|---|---|---|---|---|---|---|---|---|---|---|
MS.gene053288.t1 | A0A072VQE2 | 74.1 | 205 | 46 | 1 | 1 | 205 | 20 | 217 | 9.7e-49 | 203.0 |
TFs/TRs:
Gene ID | Type | Classification |
---|---|---|
Protein Kinases:
Gene ID | Type | Classification |
---|---|---|
Network:
Co-expression Network:
Gene1 | Gene2 | correlation coefficient | p_value | FDR |
---|---|---|---|---|
MS.gene053287 | MS.gene053288 | 0.990511 | 5.54E-183 | -1.69E-46 |
MS.gene053288 | MS.gene053289 | 0.990478 | 7.96E-183 | -1.69E-46 |
MS.gene053288 | MS.gene26868 | 0.936289 | 2.20E-97 | -1.69E-46 |
MS.gene053288 | MS.gene93545 | 0.807308 | 5.39E-50 | -1.69E-46 |
MS.gene053288 | MS.gene93574 | 0.816326 | 5.86E-52 | -1.69E-46 |
MS.gene053288 | MS.gene95784 | 0.919017 | 7.61E-87 | -1.69E-46 |
PPI:
Gene1 | Gene2 | Type |
---|---|---|
Query id | Subject id | identity % | alignment length | mismatches | gap openings | q. start | q. end | s. start | s. end | e-value | bit score |
---|---|---|---|---|---|---|---|---|---|---|---|
MS.gene053288.t1 | MTR_1g103420 | 95.122 | 205 | 3 | 1 | 1 | 205 | 20 | 217 | 4.24e-130 | 365 |
MS.gene053288.t1 | MTR_5g006940 | 59.512 | 205 | 71 | 5 | 1 | 205 | 20 | 212 | 5.20e-63 | 194 |
MS.gene053288.t1 | MTR_1g071720 | 56.190 | 105 | 42 | 4 | 8 | 109 | 27 | 130 | 1.53e-31 | 113 |
MS.gene053288.t1 | MTR_1g071720 | 56.190 | 105 | 42 | 4 | 8 | 109 | 27 | 130 | 1.68e-31 | 113 |
MS.gene053288.t1 | MTR_7g095250 | 48.544 | 103 | 46 | 3 | 18 | 113 | 2 | 104 | 7.93e-23 | 90.9 |
Query id | Subject id | identity % | alignment length | mismatches | gap openings | q. start | q. end | s. start | s. end | e-value | bit score |
---|---|---|---|---|---|---|---|---|---|---|---|
MS.gene053288.t1 | AT1G05450 | 60.993 | 141 | 48 | 3 | 7 | 147 | 27 | 160 | 2.39e-46 | 152 |
MS.gene053288.t1 | AT3G22620 | 47.692 | 195 | 80 | 6 | 13 | 205 | 29 | 203 | 2.62e-41 | 139 |
MS.gene053288.t1 | AT2G48140 | 50.331 | 151 | 63 | 6 | 1 | 140 | 15 | 164 | 5.44e-35 | 122 |
MS.gene053288.t1 | AT2G48140 | 57.009 | 107 | 44 | 2 | 1 | 106 | 15 | 120 | 2.56e-34 | 119 |
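Some subjects appear more than once in the alignment tables above (for example AT2G48140), because a single subject can yield several local alignments. The sketch below shows one way to collapse such rows to the best-scoring alignment per subject. The field names and the `parse_hits` and `best_hit_per_subject` helpers are hypothetical names chosen for this example; the rows are copied from the table above.

```python
# Rows copied from the alignment table above (AT* subject ids).
ROWS = """\
MS.gene053288.t1 | AT1G05450 | 60.993 | 141 | 48 | 3 | 7 | 147 | 27 | 160 | 2.39e-46 | 152 |
MS.gene053288.t1 | AT3G22620 | 47.692 | 195 | 80 | 6 | 13 | 205 | 29 | 203 | 2.62e-41 | 139 |
MS.gene053288.t1 | AT2G48140 | 50.331 | 151 | 63 | 6 | 1 | 140 | 15 | 164 | 5.44e-35 | 122 |
MS.gene053288.t1 | AT2G48140 | 57.009 | 107 | 44 | 2 | 1 | 106 | 15 | 120 | 2.56e-34 | 119 |
"""

# Hypothetical field names, in the column order used by the tables above.
FIELDS = ["query", "subject", "identity", "length", "mismatches", "gaps",
          "q_start", "q_end", "s_start", "s_end", "e_value", "bit_score"]


def parse_hits(text):
    """Turn pipe-delimited rows into dicts with numeric columns converted."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        cells = [c.strip() for c in line.split("|")][:len(FIELDS)]
        hit = dict(zip(FIELDS, cells))
        for key in FIELDS[2:]:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


def best_hit_per_subject(hits):
    """Keep the highest bit-score alignment for each subject id."""
    best = {}
    for hit in hits:
        current = best.get(hit["subject"])
        if current is None or hit["bit_score"] > current["bit_score"]:
            best[hit["subject"]] = hit
    return sorted(best.values(), key=lambda h: h["bit_score"], reverse=True)


for hit in best_hit_per_subject(parse_hits(ROWS)):
    print(hit["subject"], hit["identity"], hit["e_value"], hit["bit_score"])
```

Selecting by bit score rather than by e-value is a design choice for this sketch; either criterion picks the same alignment for the rows shown here.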
Found 66 sgRNAs with CRISPR-Local
Found 81 sgRNAs with CRISPR-GE
CRISPR-Local
sgRNA_sequence | on_target_score | Position | Region |
---|---|---|---|
CACTCTCTTCACACCTTGTA+TGG | 0.248146 | 1.4:-2322659 | MS.gene053288:CDS |
AATCCATGCCACTACTTGTT+AGG | 0.277189 | 1.4:+2322569 | None:intergenic |
AAGAGAGTGCTTATTGTAGA+AGG | 0.288971 | 1.4:+2322673 | None:intergenic |
GGAGATGAAGTTGGTGAAGC+TGG | 0.347569 | 1.4:+2321819 | None:intergenic |
ATAGCTAAGGTTCTATTGAT+TGG | 0.350447 | 1.4:+2322505 | None:intergenic |
CTTACAGGTAGTAGTGCTAA+TGG | 0.358055 | 1.4:-2322631 | MS.gene053288:CDS |
CCATCGCCGTCGGTGAATTC+CGG | 0.375246 | 1.4:-2321753 | MS.gene053288:CDS |
GTCATTAATATTGTCATGGC+TGG | 0.381229 | 1.4:-2322730 | None:intergenic |
CTTCCAGTTGTCGATGTAGA+TGG | 0.386358 | 1.4:+2321726 | None:intergenic |
TACTGTTCTATTGATTGCAC+TGG | 0.398561 | 1.4:-2321649 | MS.gene053288:CDS |
GAGATGAAGTTGGTGAAGCT+GGG | 0.416122 | 1.4:+2321820 | None:intergenic |
GGCGATGGCGGAGTCAAAAT+AGG | 0.422484 | 1.4:+2321768 | None:intergenic |
TGCTAATAACCTTGAGGACT+AGG | 0.423773 | 1.4:+2321965 | None:intergenic |
CTTATTGTAGAAGGGTTACA+TGG | 0.446986 | 1.4:+2322682 | None:intergenic |
TCACCTGCATCTTCTCCCTC+AGG | 0.449806 | 1.4:-2322001 | MS.gene053288:CDS |
ATCCATGCCACTACTTGTTA+GGG | 0.450316 | 1.4:+2322570 | None:intergenic |
TGGGGTGTTGATTTGTGCAT+TGG | 0.461296 | 1.4:+2322702 | None:intergenic |
TGAGACATTGTAAGATGTAA+TGG | 0.469321 | 1.4:+2321674 | None:intergenic |
GCTAATAACCTTGAGGACTA+GGG | 0.472496 | 1.4:+2321966 | None:intergenic |
TTCACCAACTTCATCTCCTC+TGG | 0.485231 | 1.4:-2321814 | MS.gene053288:CDS |
CCTCGTGCTTGTAACATGCC+TGG | 0.487865 | 1.4:-2322475 | MS.gene053288:CDS |
ATTCCATCTACATCGACAAC+TGG | 0.490256 | 1.4:-2321729 | MS.gene053288:CDS |
GCTTCACCACTTCCTGCTCC+TGG | 0.495227 | 1.4:-2322304 | MS.gene053288:intron |
GCATTACCAGGAGCAGGAAG+TGG | 0.522233 | 1.4:+2322298 | None:intergenic |
AGAGAGTGCTTATTGTAGAA+GGG | 0.525567 | 1.4:+2322674 | None:intergenic |
AATCCTGAGGGAGAAGATGC+AGG | 0.533311 | 1.4:+2321998 | None:intergenic |
AGCAGGAAGTGGTGAAGCTG+AGG | 0.535400 | 1.4:+2322309 | None:intergenic |
CCGGAATTCACCGACGGCGA+TGG | 0.538150 | 1.4:+2321753 | None:intergenic |
ATTCCAACCCCTAGTCCTCA+AGG | 0.540237 | 1.4:-2321974 | MS.gene053288:intron |
ACTCTCTTCACACCTTGTAT+GGG | 0.540459 | 1.4:-2322658 | MS.gene053288:CDS |
AATTTGGCATTACCAGGAGC+AGG | 0.546347 | 1.4:+2322292 | None:intergenic |
TTCACAGGTCCTGTGGCTCT+AGG | 0.546401 | 1.4:-2322034 | MS.gene053288:intron |
TAACCTTGAGGACTAGGGGT+TGG | 0.546635 | 1.4:+2321971 | None:intergenic |
ATGTATATTCACAGGTCCTG+TGG | 0.548141 | 1.4:-2322041 | MS.gene053288:intron |
ATCCCTAACAAGTAGTGGCA+TGG | 0.551127 | 1.4:-2322572 | MS.gene053288:CDS |
CTTAAATCCCTAACAAGTAG+TGG | 0.551269 | 1.4:-2322577 | MS.gene053288:CDS |
TGTCGATGTAGATGGAATAC+CGG | 0.553646 | 1.4:+2321734 | None:intergenic |
TGAGATGGTCCTAGAGCCAC+AGG | 0.556902 | 1.4:+2322025 | None:intergenic |
ACTGTTCTATTGATTGCACT+GGG | 0.566423 | 1.4:-2321648 | MS.gene053288:CDS |
CCTGTAAGAAAACCCATACA+AGG | 0.567820 | 1.4:+2322646 | None:intergenic |
TAAGATGTAATGGCAGATGA+TGG | 0.568483 | 1.4:+2321684 | None:intergenic |
TTGCATTGAACTGGAACACC+AGG | 0.568933 | 1.4:+2322457 | None:intergenic |
GGAACATTTGTGTCTTGTTG+TGG | 0.569274 | 1.4:+2321789 | None:intergenic |
AAGAGCACCGCAACATTCAG+TGG | 0.581316 | 1.4:+2322597 | None:intergenic |
TTTGACTCCGCCATCGCCGT+CGG | 0.585425 | 1.4:-2321763 | MS.gene053288:CDS |
AGGGAACACTAGCAGTTACA+AGG | 0.586483 | 1.4:+2322536 | None:intergenic |
AGTGCAATCAATAGAACAGT+AGG | 0.586789 | 1.4:+2321651 | None:intergenic |
GTGTCTTGTTGTGGTGCCAG+AGG | 0.590812 | 1.4:+2321798 | None:intergenic |
GCACCGCAACATTCAGTGGT+TGG | 0.593515 | 1.4:+2322601 | None:intergenic |
TCTACATCGACAACTGGAAG+CGG | 0.603015 | 1.4:-2321723 | MS.gene053288:CDS |
GAAGATGCAGGTGAATGAGA+TGG | 0.607870 | 1.4:+2322010 | None:intergenic |
GAATTCACCGACGGCGATGG+CGG | 0.628016 | 1.4:+2321756 | None:intergenic |
GGTGTTCCAGTTCAATGCAA+AGG | 0.630148 | 1.4:-2322454 | MS.gene053288:intron |
CCAGGCATGTTACAAGCACG+AGG | 0.631598 | 1.4:+2322475 | None:intergenic |
ACGAGGGAGAGAGATAGCTA+AGG | 0.635982 | 1.4:+2322492 | None:intergenic |
TGGTGAAGCTGGGAGATCAG+AGG | 0.637738 | 1.4:+2321830 | None:intergenic |
CTAATAACCTTGAGGACTAG+GGG | 0.642081 | 1.4:+2321967 | None:intergenic |
TCACCAACCACTGAATGTTG+CGG | 0.642405 | 1.4:-2322604 | MS.gene053288:CDS |
TTATTGTAGAAGGGTTACAT+GGG | 0.644859 | 1.4:+2322683 | None:intergenic |
GCAACATTCAGTGGTTGGTG+AGG | 0.651267 | 1.4:+2322606 | None:intergenic |
GGGGTTGGAATAAATCCTGA+GGG | 0.662198 | 1.4:+2321986 | None:intergenic |
AGGGGTTGGAATAAATCCTG+AGG | 0.672890 | 1.4:+2321985 | None:intergenic |
GGTGCCAGAGGAGATGAAGT+TGG | 0.674504 | 1.4:+2321810 | None:intergenic |
TATTGTAGAAGGGTTACATG+GGG | 0.692481 | 1.4:+2322684 | None:intergenic |
GGAATACCGGAATTCACCGA+CGG | 0.713881 | 1.4:+2321747 | None:intergenic |
CAGGCATGTTACAAGCACGA+GGG | 0.748196 | 1.4:+2322476 | None:intergenic |
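As an illustration of how the CRISPR-Local rows above might be shortlisted, the sketch below parses a few rows copied from the table, keeps guides annotated as CDS, and sorts them by on-target score. The helper names and the 0.5 score cut-off are assumptions made for this example and are not part of the database.

```python
# A few rows copied from the CRISPR-Local table above.
ROWS = """\
CACTCTCTTCACACCTTGTA+TGG | 0.248146 | 1.4:-2322659 | MS.gene053288:CDS |
TTTGACTCCGCCATCGCCGT+CGG | 0.585425 | 1.4:-2321763 | MS.gene053288:CDS |
TCTACATCGACAACTGGAAG+CGG | 0.603015 | 1.4:-2321723 | MS.gene053288:CDS |
TCACCAACCACTGAATGTTG+CGG | 0.642405 | 1.4:-2322604 | MS.gene053288:CDS |
CAGGCATGTTACAAGCACGA+GGG | 0.748196 | 1.4:+2322476 | None:intergenic |
"""


def parse_row(line):
    """Split one table row into protospacer, PAM, score, position and region."""
    seq, score, position, region = [c.strip() for c in line.split("|")][:4]
    protospacer, pam = seq.split("+")
    return {"protospacer": protospacer, "pam": pam, "score": float(score),
            "position": position, "region": region}


def shortlist(text, min_score=0.5, feature="CDS"):
    """Keep guides in the given feature with score >= min_score, best first.

    min_score=0.5 is an arbitrary illustrative threshold.
    """
    guides = [parse_row(line) for line in text.splitlines() if line.strip()]
    kept = [g for g in guides
            if g["region"].endswith(":" + feature) and g["score"] >= min_score]
    return sorted(kept, key=lambda g: g["score"], reverse=True)


for guide in shortlist(ROWS):
    print(guide["protospacer"], guide["pam"], guide["score"], guide["position"])
```

The highest-scoring row in this sample is annotated as intergenic, so it is dropped by the region filter even though its score passes the cut-off.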
CRISPR-GE
badsite warning | sgRNA_sequence | Strand | Position | Region | GC_content |
---|---|---|---|---|---|
!! | AAATTATTGAAATTGAAATT+TGG | + | chr1.4:2322065-2322084 | None:intergenic | 10.0% |
!! | CATTATTAAGATTAATTCAA+AGG | + | chr1.4:2322271-2322290 | None:intergenic | 15.0% |
!!! | GTTAGAAAATATTTTTGTTA+GGG | + | chr1.4:2322123-2322142 | None:intergenic | 15.0% |
!!! | TGTTAGAAAATATTTTTGTT+AGG | + | chr1.4:2322124-2322143 | None:intergenic | 15.0% |
! | TCAAAGAAAAATGTAAGAGT+TGG | + | chr1.4:2322481-2322500 | None:intergenic | 25.0% |
! | TCAAAGGAAAAATAGTTTAG+AGG | + | chr1.4:2322255-2322274 | None:intergenic | 25.0% |
!! | TTTTGTTAGGGATTATAGAA+GGG | + | chr1.4:2322111-2322130 | None:intergenic | 25.0% |
!! | TTTTTGTTAGGGATTATAGA+AGG | + | chr1.4:2322112-2322131 | None:intergenic | 25.0% |
!!! | CTATTGATTGGAATTTTGAA+GGG | + | chr1.4:2321824-2321843 | None:intergenic | 25.0% |
!!! | TCTATTGATTGGAATTTTGA+AGG | + | chr1.4:2321825-2321844 | None:intergenic | 25.0% |
!!! | TGATTTTGATGTATATTCAC+AGG | - | chr1.4:2322289-2322308 | MS.gene053288:intron | 25.0% |
| | AATTGAAATTTGGCATTACC+AGG | + | chr1.4:2322055-2322074 | None:intergenic | 30.0% |
| | ATAGCTAAGGTTCTATTGAT+TGG | + | chr1.4:2321836-2321855 | None:intergenic | 30.0% |
| | CTATCATCTAGTAACATTGT+AGG | - | chr1.4:2322222-2322241 | MS.gene053288:intron | 30.0% |
| | TGAGACATTGTAAGATGTAA+TGG | + | chr1.4:2322667-2322686 | None:intergenic | 30.0% |
| | TTATTGTAGAAGGGTTACAT+GGG | + | chr1.4:2321658-2321677 | None:intergenic | 30.0% |
! | TTTTTACCTTTGCATTGAAC+TGG | + | chr1.4:2321893-2321912 | None:intergenic | 30.0% |
| | AGTGCAATCAATAGAACAGT+AGG | + | chr1.4:2322690-2322709 | None:intergenic | 35.0% |
| | CTTAAATCCCTAACAAGTAG+TGG | - | chr1.4:2321761-2321780 | MS.gene053288:CDS | 35.0% |
| | CTTATTGTAGAAGGGTTACA+TGG | + | chr1.4:2321659-2321678 | None:intergenic | 35.0% |
| | GTGTAATGCTAATAACCTTG+AGG | + | chr1.4:2322382-2322401 | None:intergenic | 35.0% |
| | TAAGATGTAATGGCAGATGA+TGG | + | chr1.4:2322657-2322676 | None:intergenic | 35.0% |
| | TATTGTAGAAGGGTTACATG+GGG | + | chr1.4:2321657-2321676 | None:intergenic | 35.0% |
! | ACTGTTCTATTGATTGCACT+GGG | - | chr1.4:2322690-2322709 | MS.gene053288:CDS | 35.0% |
! | TACTGTTCTATTGATTGCAC+TGG | - | chr1.4:2322689-2322708 | MS.gene053288:CDS | 35.0% |
!! | AAGAGAGTGCTTATTGTAGA+AGG | + | chr1.4:2321668-2321687 | None:intergenic | 35.0% |
!! | AGAGAGTGCTTATTGTAGAA+GGG | + | chr1.4:2321667-2321686 | None:intergenic | 35.0% |
| | AATCCATGCCACTACTTGTT+AGG | + | chr1.4:2321772-2321791 | None:intergenic | 40.0% |
| | ACTCTCTTCACACCTTGTAT+GGG | - | chr1.4:2321680-2321699 | MS.gene053288:CDS | 40.0% |
| | ATGTATATTCACAGGTCCTG+TGG | - | chr1.4:2322297-2322316 | MS.gene053288:intron | 40.0% |
| | ATTCCATCTACATCGACAAC+TGG | - | chr1.4:2322609-2322628 | MS.gene053288:CDS | 40.0% |
| | CCTGTAAGAAAACCCATACA+AGG | + | chr1.4:2321695-2321714 | None:intergenic | 40.0% |
| | CTAATAACCTTGAGGACTAG+GGG | + | chr1.4:2322374-2322393 | None:intergenic | 40.0% |
| | CTTACAGGTAGTAGTGCTAA+TGG | - | chr1.4:2321707-2321726 | MS.gene053288:CDS | 40.0% |
| | GCTAATAACCTTGAGGACTA+GGG | + | chr1.4:2322375-2322394 | None:intergenic | 40.0% |
| | TGCTAATAACCTTGAGGACT+AGG | + | chr1.4:2322376-2322395 | None:intergenic | 40.0% |
| | TGTCGATGTAGATGGAATAC+CGG | + | chr1.4:2322607-2322626 | None:intergenic | 40.0% |
! | ATCCATGCCACTACTTGTTA+GGG | + | chr1.4:2321771-2321790 | None:intergenic | 40.0% |
! | CCTTGTATGGGTTTTCTTAC+AGG | - | chr1.4:2321692-2321711 | MS.gene053288:CDS | 40.0% |
! | GGAACATTTGTGTCTTGTTG+TGG | + | chr1.4:2322552-2322571 | None:intergenic | 40.0% |
| | AATTTGGCATTACCAGGAGC+AGG | + | chr1.4:2322049-2322068 | None:intergenic | 45.0% |
| | AGGGAACACTAGCAGTTACA+AGG | + | chr1.4:2321805-2321824 | None:intergenic | 45.0% |
| | AGGGGTTGGAATAAATCCTG+AGG | + | chr1.4:2322356-2322375 | None:intergenic | 45.0% |
| | ATCCCTAACAAGTAGTGGCA+TGG | - | chr1.4:2321766-2321785 | MS.gene053288:CDS | 45.0% |
| | CACTCTCTTCACACCTTGTA+TGG | - | chr1.4:2321679-2321698 | MS.gene053288:CDS | 45.0% |
| | CTTCCAGTTGTCGATGTAGA+TGG | + | chr1.4:2322615-2322634 | None:intergenic | 45.0% |
| | GAAGATGCAGGTGAATGAGA+TGG | + | chr1.4:2322331-2322350 | None:intergenic | 45.0% |
| | GGGGTTGGAATAAATCCTGA+GGG | + | chr1.4:2322355-2322374 | None:intergenic | 45.0% |
| | GGTGTTCCAGTTCAATGCAA+AGG | - | chr1.4:2321884-2321903 | MS.gene053288:intron | 45.0% |
| | TCACCAACCACTGAATGTTG+CGG | - | chr1.4:2321734-2321753 | MS.gene053288:CDS | 45.0% |
| | TCTACATCGACAACTGGAAG+CGG | - | chr1.4:2322615-2322634 | MS.gene053288:CDS | 45.0% |
| | TTCACCAACTTCATCTCCTC+TGG | - | chr1.4:2322524-2322543 | MS.gene053288:CDS | 45.0% |
| | TTGCATTGAACTGGAACACC+AGG | + | chr1.4:2321884-2321903 | None:intergenic | 45.0% |
!! | GAGATGAAGTTGGTGAAGCT+GGG | + | chr1.4:2322521-2322540 | None:intergenic | 45.0% |
!! | TGGGGTGTTGATTTGTGCAT+TGG | + | chr1.4:2321639-2321658 | None:intergenic | 45.0% |
| | AATCCTGAGGGAGAAGATGC+AGG | + | chr1.4:2322343-2322362 | None:intergenic | 50.0% |
| | ACGAGGGAGAGAGATAGCTA+AGG | + | chr1.4:2321849-2321868 | None:intergenic | 50.0% |
| | ATTCCAACCCCTAGTCCTCA+AGG | - | chr1.4:2322364-2322383 | MS.gene053288:intron | 50.0% |
| | CAGGCATGTTACAAGCACGA+GGG | + | chr1.4:2321865-2321884 | None:intergenic | 50.0% |
| | GGAATACCGGAATTCACCGA+CGG | + | chr1.4:2322594-2322613 | None:intergenic | 50.0% |
| | TAACCTTGAGGACTAGGGGT+TGG | + | chr1.4:2322370-2322389 | None:intergenic | 50.0% |
! | GCAACATTCAGTGGTTGGTG+AGG | + | chr1.4:2321735-2321754 | None:intergenic | 50.0% |
!! | AAGAGCACCGCAACATTCAG+TGG | + | chr1.4:2321744-2321763 | None:intergenic | 50.0% |
!! | GGAGATGAAGTTGGTGAAGC+TGG | + | chr1.4:2322522-2322541 | None:intergenic | 50.0% |
| | AGCAGGAAGTGGTGAAGCTG+AGG | + | chr1.4:2322032-2322051 | None:intergenic | 55.0% |
| | CCAGGCATGTTACAAGCACG+AGG | + | chr1.4:2321866-2321885 | None:intergenic | 55.0% |
| | CCTCGTGCTTGTAACATGCC+TGG | - | chr1.4:2321863-2321882 | MS.gene053288:intron | 55.0% |
| | GCACCGCAACATTCAGTGGT+TGG | + | chr1.4:2321740-2321759 | None:intergenic | 55.0% |
| | GCATTACCAGGAGCAGGAAG+TGG | + | chr1.4:2322043-2322062 | None:intergenic | 55.0% |
| | GGTGCCAGAGGAGATGAAGT+TGG | + | chr1.4:2322531-2322550 | None:intergenic | 55.0% |
| | TCACCTGCATCTTCTCCCTC+AGG | - | chr1.4:2322337-2322356 | MS.gene053288:intron | 55.0% |
| | TGAGATGGTCCTAGAGCCAC+AGG | + | chr1.4:2322316-2322335 | None:intergenic | 55.0% |
| | TGGTGAAGCTGGGAGATCAG+AGG | + | chr1.4:2322511-2322530 | None:intergenic | 55.0% |
| | TTCACAGGTCCTGTGGCTCT+AGG | - | chr1.4:2322304-2322323 | MS.gene053288:intron | 55.0% |
! | GTGTCTTGTTGTGGTGCCAG+AGG | + | chr1.4:2322543-2322562 | None:intergenic | 55.0% |
!! | GGCGATGGCGGAGTCAAAAT+AGG | + | chr1.4:2322573-2322592 | None:intergenic | 55.0% |
| | CCATCGCCGTCGGTGAATTC+CGG | - | chr1.4:2322585-2322604 | MS.gene053288:CDS | 60.0% |
| | GAATTCACCGACGGCGATGG+CGG | + | chr1.4:2322585-2322604 | None:intergenic | 60.0% |
| | GCTTCACCACTTCCTGCTCC+TGG | - | chr1.4:2322034-2322053 | MS.gene053288:intron | 60.0% |
| | TTTGACTCCGCCATCGCCGT+CGG | - | chr1.4:2322575-2322594 | MS.gene053288:CDS | 60.0% |
| | CCGGAATTCACCGACGGCGA+TGG | + | chr1.4:2322588-2322607 | None:intergenic | 65.0% |
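The GC_content column in the CRISPR-GE table appears to be the percentage of G and C bases in the 20-nt protospacer, with the PAM after the `+` excluded. A minimal sketch of that calculation follows; `gc_content` is a hypothetical helper written for this example, and the rule itself is inferred from the table values rather than taken from CRISPR-GE documentation.

```python
def gc_content(guide):
    """Return the GC percentage of the protospacer, ignoring the PAM.

    `guide` is written as in the tables above: protospacer + "+" + PAM.
    Assumption: only the part before "+" contributes to GC content.
    """
    protospacer = guide.split("+")[0].upper()
    gc = sum(base in "GC" for base in protospacer)
    return round(100.0 * gc / len(protospacer), 1)


# Two sequences taken from the CRISPR-GE table above.
print(gc_content("AAATTATTGAAATTGAAATT+TGG"))  # listed as 10.0% in the table
print(gc_content("CCGGAATTCACCGACGGCGA+TGG"))  # listed as 65.0% in the table
```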
Chromosome | Type | Start | End | Strand | Name |
---|---|---|---|---|---|
chr1.4 | gene | 2321623 | 2322737 | 2321623 | ID=MS.gene053288 |
chr1.4 | mRNA | 2321623 | 2322737 | 2321623 | ID=MS.gene053288.t1;Parent=MS.gene053288 |
chr1.4 | exon | 2322455 | 2322737 | 2322455 | ID=MS.gene053288.t1.exon1;Parent=MS.gene053288.t1 |
chr1.4 | CDS | 2322455 | 2322737 | 2322455 | ID=cds.MS.gene053288.t1;Parent=MS.gene053288.t1 |
chr1.4 | exon | 2322305 | 2322331 | 2322305 | ID=MS.gene053288.t1.exon2;Parent=MS.gene053288.t1 |
chr1.4 | CDS | 2322305 | 2322331 | 2322305 | ID=cds.MS.gene053288.t1;Parent=MS.gene053288.t1 |
chr1.4 | exon | 2321975 | 2322049 | 2321975 | ID=MS.gene053288.t1.exon3;Parent=MS.gene053288.t1 |
chr1.4 | CDS | 2321975 | 2322049 | 2321975 | ID=cds.MS.gene053288.t1;Parent=MS.gene053288.t1 |
chr1.4 | exon | 2321623 | 2321855 | 2321623 | ID=MS.gene053288.t1.exon4;Parent=MS.gene053288.t1 |
chr1.4 | CDS | 2321623 | 2321855 | 2321623 | ID=cds.MS.gene053288.t1;Parent=MS.gene053288.t1 |
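The CDS rows above can be cross-checked against the alignment tables, where the query spans positions 1 to 205. The sketch below sums the four CDS segments, assuming the Start and End columns are 1-based and inclusive (as in GFF-style annotation) and that the final codon is a stop codon. Both points are assumptions for this example.

```python
# CDS segments (start, end) copied from the annotation table above.
CDS_SEGMENTS = [
    (2322455, 2322737),
    (2322305, 2322331),
    (2321975, 2322049),
    (2321623, 2321855),
]


def cds_length(segments):
    """Total CDS length in nucleotides, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


length = cds_length(CDS_SEGMENTS)
codons, remainder = divmod(length, 3)

print("CDS length (nt):", length)
print("codons:", codons, "remainder:", remainder)
# Assumption: the last codon is a stop codon, so the protein has codons - 1 residues.
print("predicted protein length (aa):", codons - 1)
```

Under these assumptions the segments add up to 618 nt, or 206 codons, which gives a 205-residue protein, in agreement with the q. end value of 205 reported in the alignment tables.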