UniProt

UniProt Knowledgebase is the central hub for the collection of functional information on proteins, with accurate, consistent, and rich annotation. In addition to capturing the core data mandatory for each UniProt entry (principally, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and clear indications of the quality of annotation in the form of evidence attribution of experimental and computational data.

Created by merging the data in Swiss-Prot, TrEMBL and PIR-PSD, UniProt Knowledgebase entries are annotated to an even greater level of detail than that already achieved in the source databases.

The UniProt Knowledgebase consists of two sections:

  • The Swiss-Prot section contains manually annotated records with information extracted from the literature and curator-evaluated computational analysis.
  • The TrEMBL section contains computationally analyzed records that await full manual annotation.

The UniRef databases are derived from the UniProt and UniParc databases. UniRef aims to provide a nonredundant sequence set, where identical or similar sequences are merged into a cluster that is represented in the database by just one sequence. Removing the redundancy allows faster and more informative sequence similarity searches. The merging has been performed at three identity levels: 100%, 90% and 50%.

  • UniRef100: nonredundant protein sequence set where merging has been performed at the 100% identity level.

  • UniRef90: nonredundant protein sequence set where merging has been performed at the 90% identity level.

  • UniRef50: nonredundant protein sequence set where merging has been performed at the 50% identity level.
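
The effect of the identity level on cluster count can be illustrated with a small Python sketch. This is a toy model only and not the procedure UniRef uses: the `identity` function is a crude, ungapped position-by-position comparison standing in for a real alignment-based identity, the greedy `cluster` function and the five example sequences are hypothetical, and all names are chosen here for illustration.

```python
from __future__ import annotations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions, ungapped and left-aligned.

    Assumption: a crude stand-in for alignment-based sequence identity.
    """
    longer = max(len(a), len(b))
    if longer == 0:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / longer


def cluster(seqs: dict[str, str], threshold: float) -> dict[str, list[str]]:
    """Greedy clustering: map each representative ID to the IDs it represents."""
    clusters: dict[str, list[str]] = {}
    # Process longest sequences first, so each representative is the
    # longest member of its cluster.
    for seq_id in sorted(seqs, key=lambda k: len(seqs[k]), reverse=True):
        for rep in clusters:
            if identity(seqs[rep], seqs[seq_id]) >= threshold:
                clusters[rep].append(seq_id)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters[seq_id] = [seq_id]
    return clusters


# Hypothetical toy sequences, not real UniProt entries.
toy = {
    "A": "MKTAYIAKQR",
    "B": "MKTAYIAKQR",  # identical to A
    "C": "MKTAYIAKQK",  # 9 of 10 positions match A
    "D": "MKTAYLLLLL",  # 5 of 10 positions match A
    "E": "GGGGGGGGGG",  # unrelated
}

for level in (1.0, 0.9, 0.5):
    result = cluster(toy, level)
    print(f"{level:.0%} identity: {len(result)} clusters -> {result}")
```

On this toy data the five sequences collapse into four, three and two clusters at the 100%, 90% and 50% levels respectively, which is why a search against a lower-identity set has fewer sequences to compare against.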

At CSC you can use UniProt with BLAST and EMBOSS.