52 ncRNA genes are predicted using a combination of methods depending on their type. tRNAs are predicted using tRNAscan-SE, rRNAs using RNAmmer, and all other types using covariance models and sequences from Rfam. ncRNA genes 1 {'colour_key' => '[biotype]','caption' => 'ncRNA genes','name' => 'ncRNA genes','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ncrna'}
1 Dust is a program that identifies low-complexity sequences (regions of the genome with a biased distribution of nucleotides, such as a repeat). The Dust module is widely used with BLAST to prevent 'sticky' regions from producing false hits. Low complexity (Dust) 1 \N
36 UniParc mapping based on sequence checksums UniParc cross-reference 0 \N
31 RepeatMasker is used to find repeats and low-complexity sequences. This track usually shows repeats alone (not low-complexity sequences). Repeats 1 \N
35 match Protein 0 \N
30 CpG islands are regions of nucleic acid sequence containing a high number of adjacent cytosine-guanine pairs (along one strand). Usually unmethylated, they are associated with promoters and regulatory regions. They are determined from the genomic sequence using a program written by G. Micklem, similar to newcpgreport in the EMBOSS package. CpG islands 1 \N
34 Sequences from various databases are matched to Ensembl transcripts using Exonerate. These are external references, or 'Xrefs'. DNA match 0 \N
12 Gap feature annotated in ENA Gap (ENA) 1 \N
3 Protein coding genes annotated in ENA Genes 1 {'colour_key' => '[biotype]','caption' => 'Genes','name' => 'Genes','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'multi_caption' => 'Genes','key' => 'ena_genes'}
2 Tandem Repeats Finder locates adjacent copies of a pattern of nucleotides. Tandem repeats (TRF) 1 \N
223 NCBI-BlastP search against ProDom families ProDom 1 {'type' => 'domain'}
238 Protein domains and motifs in the Pfam database. Pfam 1 {'type' => 'domain'}
229 Protein domains and motifs in the SMART database. SMART 1 {'type' => 'domain'}
224 Gene3D analysis as of interpro_scan.pl Gene3D 1 {'type' => 'domain'}
231 Protein domains and motifs in the TIGRFAM database. TIGRFAM 1 {'type' => 'domain'}
234 Prediction of transmembrane helices in proteins by TMHMM. Transmembrane helices 1 \N
239 InterPro2GO file is generated manually by the InterPro team at the EBI. InterPro2GO mapping 0 \N
226 Protein domains and motifs from the PIR (Protein Information Resource) Superfamily database. PIRSF 1 {'type' => 'domain'}
228 Protein domains and motifs from the PROSITE patterns database are aligned to the genome. PROSITE patterns 1 {'type' => 'domain'}
225 Protein domains and motifs from the PROSITE profiles database are aligned to the genome. PROSITE profiles 1 {'type' => 'domain'}
230 Protein domains and motifs in the SUPERFAMILY database. Superfamily 1 {'type' => 'domain'}
236 HAMAP is a system, based on manual protein annotation, that identifies and semi-automatically annotates proteins that are part of well-conserved families or subfamilies: the HAMAP families. HAMAP 1 {'type' => 'domain'}
235 Identification of peptide low complexity sequences by Seg. Low complexity (Seg) 1 \N
227 Protein fingerprints (groups of conserved motifs) are aligned to the genome. These motifs come from the PRINTS database. Prints 1 {'type' => 'domain'}
233 Prediction of signal peptide cleavage sites by SignalP. Cleavage site (SignalP) 1 {'type' => 'feature'}
237 HMM-Panther families PANTHER 1 {'type' => 'domain'}
198 The Gene Ontology XRef projection pipeline GO projected xrefs 1 \N
232 Prediction of coiled-coil regions in proteins by Ncoils. Coiled-coils (Ncoils) 1 \N
240 InterPro2Pathway mapping is obtained from InterProScan results. InterPro2Pathway mapping 0 \N
199 UniProt cross-reference derived transitively from a UniParc identifier UniParc-derived cross-reference 1 \N
200 Cross-reference derived transitively from a UniProt record UniProt-derived cross-reference 0 \N
201 Covariance models from Rfam (release 12.1), aligned to the genome with 'cmscan' from the Infernal suite of programs. Rfam Models 1 {'type' => 'rna','default' => {'contigviewbottom' => 'as_alignment_label'}}
202 tRNA models predicted with tRNAscan-SE (release 1.3.1). tRNA Models 1 {'type' => 'rna'}
222 Conserved Domain Database model CDD 1 {'type' => 'domain'}