NCBI's Reference Sequence (RefSeq) database was established to overcome the redundancy and errors typical of sequence repositories such as EMBL and GenBank. RefSeq aims to provide a comprehensive, non-redundant set of curated, high-quality sequences, including genomic DNA, transcripts, and proteins, for major research organisms. At the moment it covers over 1000 viruses, 100 bacteria, and numerous higher organisms, including human, mouse, rat, zebrafish and Arabidopsis. RefSeq selects one representative sequence for each locus, so the only duplicates represent splice variants. In addition to the high-quality entries, RefSeq also contains alignment-based model transcripts, which can be distinguished by accession numbers starting with X.
At CSC you can run BLAST searches against the following RefSeq nucleotide sequences: transcripts and non-coding RNA (accessions NM, NR, XM and XR), gene sequences (NG), microbial and Arabidopsis chromosomes (NC) and human genomic contigs (NT).
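
When working through BLAST hits, it can help to sort accessions by the prefixes listed above. The following is a minimal Python sketch, not part of any CSC or NCBI tool. The function name `describe_accession` and the category labels are hypothetical, and the code assumes accessions have the usual RefSeq form of a two-letter prefix followed by an underscore (for example `NM_000546.6`).

```python
# Prefix table built from the accession classes listed in the text.
# The category labels are illustrative wording, not official NCBI terms.
REFSEQ_NUCLEOTIDE_PREFIXES = {
    "NM": "transcript or non-coding RNA",
    "NR": "transcript or non-coding RNA",
    "XM": "transcript or non-coding RNA",
    "XR": "transcript or non-coding RNA",
    "NG": "gene sequence",
    "NC": "chromosome",
    "NT": "genomic contig",
}


def describe_accession(accession):
    """Return (category, is_model) for a RefSeq nucleotide accession.

    Returns None when the prefix is not one of those listed above.
    Assumes the prefix is the part before the first underscore.
    """
    prefix = accession.strip().split("_", 1)[0].upper()
    category = REFSEQ_NUCLEOTIDE_PREFIXES.get(prefix)
    if category is None:
        return None
    # Accessions starting with X are alignment-based model entries.
    return category, prefix.startswith("X")


if __name__ == "__main__":
    # Example accessions are for illustration only.
    for acc in ["NM_000546.6", "XM_011541469.2", "NC_003070.9"]:
        print(acc, describe_accession(acc))
```

Checking the `is_model` flag is a simple way to separate curated entries from model transcripts before interpreting a hit list.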