
What does this gene do?

2017-09-10 14:01:03 生信人

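You get a sequence and see one base (or amino acid) after another. Does it all feel abstract and confusing?

What is it actually for?

Don't worry. Let me walk you through how to work out what this sequence really is.

How?

With gene function annotation, of course!

Then let's get started.

OK!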

 

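Gene function annotation mainly works by aligning a gene's sequence against various databases and retrieving the functional annotation of the matching entries. Put simply: the function of sequence A in a database is known; we align our sequence B against the database and it happens to hit A; we then infer that B probably has the same function as A (the rationale being that sequence similarity is closely related to gene function). Commonly used databases include Nt, Nr, Swiss-Prot, TrEMBL, eggNOG, KEGG, InterPro, and GO.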
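The "B hits A, so B probably does what A does" idea can be sketched in a few lines of Python. This is only a minimal illustration, not a full pipeline: the hit table is made-up data laid out like BLAST tabular output (`-outfmt 6`), the `KNOWN_FUNCTION` dictionary and all gene and protein names are hypothetical, and the E-value cutoff of 1e-5 is an assumed, commonly used threshold rather than something this post prescribes.

```python
import csv
import io

# Made-up hits in BLAST tabular layout (-outfmt 6). Columns: query, subject,
# % identity, alignment length, mismatches, gap opens, q.start, q.end,
# s.start, s.end, e-value, bit score.
BLAST_TSV = """\
geneB\tprotA\t92.5\t200\t15\t0\t1\t200\t1\t200\t1e-100\t380
geneB\tprotC\t45.0\t180\t99\t3\t5\t184\t10\t189\t1e-20\t95.1
geneD\tprotE\t30.0\t60\t42\t1\t1\t60\t3\t62\t0.5\t28.0
"""

# Hypothetical functional descriptions of the database sequences.
KNOWN_FUNCTION = {
    "protA": "ATP synthase subunit alpha",
    "protC": "ATPase family protein",
    "protE": "hypothetical kinase",
}


def best_hits(tsv_text, max_evalue=1e-5):
    """Return {query: subject} keeping the highest bit score per query."""
    best = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue  # hit too weak to support a functional inference
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def annotate(tsv_text, functions, max_evalue=1e-5):
    """Transfer the description of the best hit to each query."""
    hits = best_hits(tsv_text, max_evalue)
    return {query: functions.get(subject, "unknown") for query, subject in hits.items()}


annotations = annotate(BLAST_TSV, KNOWN_FUNCTION)
print(annotations)
```

In this toy data `geneD` gets no annotation because its only hit fails the E-value cutoff. That is usually the safer outcome than transferring a function from a weak match.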

 

1. BLAST databases

 

 

 

 

Nt and Nr are NCBI's non-redundant nucleotide and protein sequence databases, respectively.

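Once downloaded locally, these databases can be used for large-scale gene annotation. For a small number of sequences, you can instead use NCBI's online BLAST and select the NT or NR database. I have already covered some of the relevant usage in earlier posts:

A detailed guide to NCBI online BLAST

Splitting NT and NR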
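For the local, large-scale case, a search against a downloaded database is typically launched with the BLAST+ command-line programs. The sketch below only assembles such a command; it assumes BLAST+ is installed and that a formatted database named `nr` is available locally. The file names, the helper `build_blast_command`, and the parameter values (E-value 1e-5, 4 threads, 5 targets) are illustrative assumptions.

```python
import shlex


def build_blast_command(program, query_fasta, database, out_path,
                        evalue=1e-5, threads=4, max_targets=5):
    """Assemble a BLAST+ command line that writes tabular output (-outfmt 6)."""
    if program not in {"blastn", "blastp", "blastx"}:
        raise ValueError(f"unsupported program: {program}")
    return [
        program,
        "-query", query_fasta,
        "-db", database,
        "-out", out_path,
        "-outfmt", "6",
        "-evalue", str(evalue),
        "-num_threads", str(threads),
        "-max_target_seqs", str(max_targets),
    ]


# Hypothetical file names; protein queries go against Nr with blastp,
# nucleotide queries against Nt with blastn.
cmd = build_blast_command("blastp", "proteins.fa", "nr", "proteins_vs_nr.tsv")
print(shlex.join(cmd))
```

To actually run the search, the list can be passed to `subprocess.run(cmd, check=True)`. The code stops at printing the command so it works without BLAST+ or the database present.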

 

 

2. UniProtKB

 

 

 

 

1) Swiss-Prot

A curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy, and a high level of integration with other databases.

The protein annotations here have all been checked manually, so their accuracy is very high.

2) TrEMBL

A computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot.

The functional annotations here are predicted (inferred from sequence homology) and complement UniProtKB/Swiss-Prot.

3) Online alignment:

http://www.uniprot.org/blast/
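When annotating against a UniProtKB FASTA file, it helps to know whether a hit comes from Swiss-Prot (manually reviewed) or TrEMBL (predicted). In UniProt FASTA headers the identifier has the form `db|accession|entry_name`, where `db` is `sp` for Swiss-Prot and `tr` for TrEMBL. The following is a small sketch of a parser for that identifier; the function name `parse_uniprot_id` is my own, and the TrEMBL example identifier is made up for illustration.

```python
def parse_uniprot_id(identifier):
    """Split a UniProt FASTA identifier such as 'sp|P69905|HBA_HUMAN'.

    Accepts a bare identifier or a full header line (with or without '>').
    """
    tokens = identifier.lstrip(">").split()
    if not tokens:
        raise ValueError("empty identifier")
    parts = tokens[0].split("|")
    if len(parts) != 3 or parts[0] not in ("sp", "tr"):
        raise ValueError(f"not a UniProt identifier: {identifier!r}")
    db, accession, entry_name = parts
    return {
        "accession": accession,
        "entry_name": entry_name,
        "reviewed": db == "sp",  # sp = Swiss-Prot, tr = TrEMBL
    }


swissprot_hit = parse_uniprot_id(
    ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens"
)
# Made-up TrEMBL-style identifier, for illustration only.
trembl_hit = parse_uniprot_id("tr|A0A000XXX0|A0A000XXX0_EXAMP")
print(swissprot_hit["reviewed"], trembl_hit["reviewed"])
```

One reasonable use is to prefer a reviewed hit over an unreviewed one when both score similarly, since the Swiss-Prot annotation has been checked by a curator.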

 

 

3. eggNOG

 

 

 

eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups): the database currently covers 2031 eukaryotic and prokaryotic organisms, as well as precomputed mappings for 1655 additional prokaryotes and 352 viruses.

A dedicated alignment tool for eggNOG, eggNOG-mapper, is available online at:

http://eggnogdb.embl.de/#/app/emapper

 

 

 

4. KEGG

 

 

 

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies.

KEGG website: http://www.kegg.jp/

 

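For usage details, see:

An up-to-date practical introduction to KEGG

For more explanation, see:

How to read a KEGG map

A simple way to color genes in a KEGG Pathway

The KEGG automatic annotation service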

 

 

5. InterPro annotation

 

 

 

 

InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several different databases (referred to as member databases) that make up the InterPro consortium.

 

Website: http://www.ebi.ac.uk/interpro/

 

 

 

 

Of the member databases, Pfam, PRINTS, PROSITE, ProDom, and SMART are the five I consider the best.

 

 

6. GO

 

 

 

 

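Gene Ontology (GO) describes gene products in a species-independent way. We can use the web-based browser AmiGO (http://amigo.geneontology.org/cgi-bin/amigo/go.cgi) or the software OBO-Edit to search for any GO term.

Biologists today waste far too much time and effort searching for biological information. The root cause is inconsistent terminology in biology: definitions that shift over time and from person to person are hard for a precise computer to search, and cannot be fully handled by hand either. For example, if you are looking for a drug target for a new antibiotic, you may want to find all gene products involved in bacterial protein synthesis, especially those that differ significantly from the protein synthesis components in humans. But if one database describes these gene products as "translation" and another as "protein synthesis", a computer can hardly tell that two terms so different in wording are identical in function. The Gene Ontology (GO) project is the result of an effort to make the descriptions of gene product function consistent across databases.

GO has three major categories (three perspectives on a gene product, which may overlap):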

Biological Process: Those processes specifically pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms. A process is a collection of molecular events with a defined beginning and end.

Cellular Component: The part of a cell or its extracellular environment in which a gene product is located. A gene product may be located in one or more parts of a cell and its location may be as specific as a particular macromolecular complex, that is, a stable, persistent association of macromolecules that function together.

Molecular Function: Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.

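GO annotation can be done with InterProScan (running InterProScan on a single sequence directly yields its GO IDs). The corresponding mapping file, which can be downloaded for local large-scale GO annotation of genes, is:

http://www.geneontology.org/external2go/interpro2go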
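As a rough sketch of how the mapping file can be used locally: each data line in interpro2go has the layout `InterPro:IPRxxxxxx name > GO:term name ; GO:xxxxxxx`, and comment lines start with `!`. The code below parses such lines and maps genes to GO IDs through their InterPro hits. The excerpt, the InterPro IDs, and the gene-to-InterPro table are all made up for illustration; in practice the gene-to-InterPro table would come from your own InterProScan results.

```python
import re
from collections import defaultdict

# Made-up excerpt in the interpro2go layout (the InterPro entries are hypothetical).
SAMPLE = """\
!Illustrative excerpt, not real data
InterPro:IPR900001 Example domain A > GO:DNA binding ; GO:0003677
InterPro:IPR900001 Example domain A > GO:zinc ion binding ; GO:0008270
InterPro:IPR900002 Example domain B > GO:nucleus ; GO:0005634
"""

LINE_RE = re.compile(r"^InterPro:(IPR\d+)\s.*>\s*GO:.*;\s*(GO:\d{7})\s*$")


def parse_interpro2go(lines):
    """Return {InterPro ID: set of GO IDs} from interpro2go-style lines."""
    mapping = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blanks and comment lines
        match = LINE_RE.match(line)
        if match:
            mapping[match.group(1)].add(match.group(2))
    return dict(mapping)


def go_terms_for_genes(gene_to_ipr, ipr_to_go):
    """Collect the GO IDs of every InterPro entry assigned to each gene."""
    result = {}
    for gene, ipr_ids in gene_to_ipr.items():
        terms = set()
        for ipr_id in ipr_ids:
            terms |= ipr_to_go.get(ipr_id, set())
        result[gene] = sorted(terms)
    return result


ipr_to_go = parse_interpro2go(SAMPLE.splitlines())
# Hypothetical InterProScan results: gene -> InterPro entries it matched.
gene_to_ipr = {"gene1": ["IPR900001"], "gene2": ["IPR900002", "IPR999999"], "gene3": []}
gene_to_go = go_terms_for_genes(gene_to_ipr, ipr_to_go)
print(gene_to_go)
```

For the real file, replace `SAMPLE.splitlines()` with an open file handle on the downloaded interpro2go file. InterPro entries without a GO mapping simply contribute nothing.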

Blast2GO also has a trial version available for download:

https://www.blast2go.com/blast2go-pro/download-b2g