The UniProt Knowledgebase is the central hub for the collection of
functional information on proteins, with accurate, consistent, and rich
annotation. In addition to capturing the core data mandatory for each
UniProt entry (principally, the amino acid sequence, protein name or
description, taxonomic data and citation information), as much
annotation information as possible is added. This includes widely
accepted biological ontologies, classifications and cross-references,
and clear indications of the quality of annotation in the form of
evidence attribution of experimental and computational data.
Created by merging the data in Swiss-Prot, TrEMBL and PIR-PSD, UniProt
Knowledgebase entries are annotated to an even greater level of detail
than that already achieved in the source databases.
The UniProt Knowledgebase consists of two sections:
- The Swiss-Prot section contains manually annotated records with information extracted from the literature and curator-evaluated computational analysis.
- The TrEMBL section contains computationally analyzed records that await full manual annotation.
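
The split between the two sections is visible in sequence files exported from the Knowledgebase. As a minimal sketch, assuming the FASTA header layout `>db|accession|entry_name description`, where `db` is `sp` for Swiss-Prot and `tr` for TrEMBL, the section of a record could be read off its header as follows. The function name `parse_header` and the lookup table are hypothetical helpers introduced only for this illustration.

```python
# Sketch: classify a UniProtKB FASTA header by its section prefix.
# Assumption: headers look like ">db|accession|entry_name description".

SECTION_BY_PREFIX = {
    "sp": "Swiss-Prot (manually annotated)",
    "tr": "TrEMBL (computationally analyzed)",
}


def parse_header(header: str) -> dict:
    """Split a FASTA header line into section, accession and entry name."""
    line = header.lstrip(">").strip()
    # The first whitespace-delimited token holds the pipe-separated fields.
    tokens = line.split(None, 1)
    identifier = tokens[0] if tokens else ""
    parts = identifier.split("|")
    if len(parts) != 3 or parts[0] not in SECTION_BY_PREFIX:
        raise ValueError(f"not a recognized UniProtKB header: {header!r}")
    prefix, accession, entry_name = parts
    return {
        "section": SECTION_BY_PREFIX[prefix],
        "accession": accession,
        "entry_name": entry_name,
    }


example = ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606"
print(parse_header(example))
```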
The UniRef databases are derived from the UniProt and UniParc databases. UniRef aims to provide a nonredundant sequence set in which identical or similar sequences are merged into a cluster that is represented in the database by a single sequence. Removing the redundancy allows faster and more informative sequence similarity searches. Merging is performed at three identity levels: 100%, 90% and 50%.
- UniRef100: nonredundant protein sequence set in which sequences are merged at the 100% identity level.
- UniRef90: nonredundant protein sequence set in which sequences are merged at the 90% identity level.
- UniRef50: nonredundant protein sequence set in which sequences are merged at the 50% identity level.
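
To make the idea of merging at an identity level concrete, here is a toy sketch in Python. It is not the procedure used to build UniRef, which is not described here. It assumes a simple greedy scheme and a deliberately crude, ungapped identity measure, and the names `identity`, `cluster` and the sequences `seqA` to `seqE` are all hypothetical.

```python
# Toy sketch of identity-threshold clustering (NOT the UniRef production method).
# identity() is a crude, ungapped stand-in for a real alignment-based score.
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions, measured against the longer sequence."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def cluster(sequences: Dict[str, str], threshold: float) -> Dict[str, List[str]]:
    """Greedy clustering: each sequence joins the first representative it
    matches at >= threshold, otherwise it founds a new cluster."""
    clusters: Dict[str, List[str]] = {}
    # Longest sequences first, then by ID so the result is deterministic.
    for seq_id in sorted(sequences, key=lambda s: (-len(sequences[s]), s)):
        for rep_id in clusters:
            if identity(sequences[seq_id], sequences[rep_id]) >= threshold:
                clusters[rep_id].append(seq_id)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[seq_id] = [seq_id]
    return clusters


# Hypothetical toy sequences, ten residues each.
toy = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYIAKQR",  # identical to seqA
    "seqC": "MKTAYIAKQL",  # 9 of 10 positions match seqA
    "seqD": "MKTAYGGGGG",  # 5 of 10 positions match seqA
    "seqE": "WWWWWWWWWW",  # no positions match seqA
}

for level in (1.0, 0.9, 0.5):
    print(level, cluster(toy, level))
```

With these toy inputs, only `seqA` and `seqB` share a cluster at the 100% level, `seqC` joins them at 90%, and `seqD` joins at 50%, while `seqE` always remains on its own. Lower thresholds therefore give fewer, larger clusters, each represented by a single sequence.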