RefSeq Genes, TSS and other annotations for protein-coding genes.
Transcription Start Sites (TSS), Transcription End Sites (TES)
and CDS start sites from the RefSeq annotation
- Raw data was downloaded from: RefSeq
- Input file format: GFF
- Download date: 3-10-2017
From M. musculus (March 2012 GRCm38/mm10).
The GFF file was converted to SGA format using an in-house
script. Transcripts were retained if the ID was of the
type 'NM_XXXXXX', and CDS if the protein ID was similar to
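
The in-house script is not part of this description, so the following is only a minimal sketch of what such a conversion could look like. It assumes that transcripts appear as 'mRNA' features in the GFF file, that the accession is stored in a 'transcript_id' attribute, and that SGA lines are tab-separated with the fields sequence ID, feature, position, strand, count and an optional description. The function names (parse_attributes, gff_to_sga) and the fixed count of 1 are hypothetical. CDS start sites are left out of the sketch.

```python
import re

# Accessions of the form NM_XXXXXX, with an optional version suffix.
NM_ID = re.compile(r"^NM_\d+(\.\d+)?$")


def parse_attributes(field):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gff_to_sga(lines):
    """Return sorted SGA lines (TSS and TES) for NM_ transcripts only."""
    records = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF directives/comments
        cols = line.rstrip("\n").split("\t")
        # Assumption: transcripts are annotated as 'mRNA' features.
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        seqid, strand = cols[0], cols[6]
        start, end = int(cols[3]), int(cols[4])
        if strand not in ("+", "-"):
            continue
        # Assumption: the accession lives in the 'transcript_id' attribute.
        tid = parse_attributes(cols[8]).get("transcript_id", "")
        if not NM_ID.match(tid):
            continue
        # On the minus strand the transcript starts at the GFF 'end' coordinate.
        tss, tes = (start, end) if strand == "+" else (end, start)
        records.append((seqid, "TSS", tss, strand, 1, tid))
        records.append((seqid, "TES", tes, strand, 1, tid))
    # Sort by sequence, then position, then strand.
    records.sort(key=lambda r: (r[0], r[2], r[3]))
    return ["\t".join(str(x) for x in r) for r in records]
```

Called on the lines of a GFF file, gff_to_sga returns one TSS line and one TES line per retained transcript. Entries with other accession prefixes (for example XM_) are dropped.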
O'Leary NA, Wright MW, Brister JR, Ciufo S, Haddad D, McVeigh R, Rajput B, Robbertse B, Smith-White B, Ako-Adjei D, Astashyn A, Badretdin A, Bao Y, Blinkova O, Brover V, Chetvernin V, Choi J, Cox E, Ermolaeva O, Farrell CM, Goldfarb T, Gupta T, Haft D, Hatcher E, Hlavina W, Joardar VS, Kodali VK, Li W, Maglott D, Masterson P, McGarvey KM, Murphy MR, O'Neill K, Pujar S, Rangwala SH, Rausch D, Riddick LD, Schoch C, Shkeda A, Storz SS, Sun H, Thibaud-Nissen F, Tolstoy I, Tully RE, Vatsan AR, Wallin C, Webb D, Wu W, Landrum MJ, Kimchi A, Tatusova T, DiCuccio M, Kitts P, Murphy TD, Pruitt KD
Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation.
Nucleic Acids Res. 2016 Jan 4;44(D1):D733-45. doi: 10.1093/nar/gkv1189. Epub 2015 Nov 8.