Biological Databases and Online Tools

Biological Databases

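Biological databases are life-science information repositories that collect data from scientific experiments, published literature, high-throughput experimental technologies, and computational analyses. They contain information from fields including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics.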

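Biological databases can be broadly classified into sequence, structure, and functional databases. Sequence databases store nucleic acid and protein sequences; structure databases store structural information for RNA and proteins; functional databases provide information on the physiological role of gene products (for example, enzyme activities, mutant phenotypes, and biological pathways).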

Database Types

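Two concepts come up often when discussing biological databases: primary and secondary databases. A primary database stores data obtained directly from experiments. A secondary database uses other databases (for example, primary databases) as its information source and stores the results of processing or analyzing that data as needed.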
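The primary/secondary distinction can be sketched in a few lines of Python. This is only an illustration: the record schemas, the accession `EXAMPLE0001`, and the choice of GC content as the derived annotation are all assumptions made for the example, not the layout of any real database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimaryRecord:
    """A raw, experimentally obtained entry (hypothetical schema)."""
    accession: str
    sequence: str


@dataclass(frozen=True)
class SecondaryRecord:
    """A derived entry that points back to its primary source (hypothetical schema)."""
    source_accession: str
    length: int
    gc_content: float


def derive(record: PrimaryRecord) -> SecondaryRecord:
    """Compute simple derived annotations from a primary record."""
    seq = record.sequence.upper()
    gc = sum(base in "GC" for base in seq)
    return SecondaryRecord(
        source_accession=record.accession,
        length=len(seq),
        gc_content=gc / len(seq) if seq else 0.0,
    )


# Made-up primary entry; the secondary entry is computed from it.
primary = PrimaryRecord(accession="EXAMPLE0001", sequence="ATGCGC")
secondary = derive(primary)
```

The secondary record keeps a reference to the accession it was derived from, which is what lets a curated resource stay traceable to the underlying experimental data.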

 

Finding Databases

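An important resource for finding biological databases is the special database issue of the journal NAR (Nucleic Acids Research), which categorizes many publicly available online databases related to biology and bioinformatics. As of 2018, it listed 1,737 databases.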

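NAR divides all databases into 15 categories: nucleotide sequence databases, RNA sequence databases, protein sequence databases, structure databases, genomics databases (non-vertebrate), metabolic and signaling pathway databases, human and other vertebrate genome databases, human gene and disease databases, microarray data and other gene expression databases, proteomics resource databases, other molecular biology databases, organelle databases, plant databases, immunological databases, and cell biology databases.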

 

Online Tools

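In addition to cataloguing biological databases, NAR also publishes each year a collection of web resources for the analysis and visualization of molecular biology data.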

 

Table 1. Web resources from the 2017 NAR web server issue

Web server name  URL  Brief description 
agriGO v2  http://systemsbiology.cau.edu.cn/agriGOv2/  GO analysis for agricultural species 
AMMOS2  http://drugmod.rpbs.univ-paris-diderot.fr/ammosHome.php  Energy minimization of protein–ligand complexes 
antiSMASH  http://antismash.secondarymetabolites.org/  Secondary metabolite biosynthetic gene cluster mining in bacterial and fungal genomes 
ARTS  http://arts.ziemertlab.com  Biosynthetic gene cluster mining for novel antibiotics 
BAR 3.0  http://bar.biocomp.unibo.it/bar3  Protein structure and function annotation 
BepiPred-2.0  http://www.cbs.dtu.dk/services/BepiPred-2.0/  B-cell epitope prediction from a protein sequence 
BioAtlas  http://bioatlas.compbio.sdu.dk  Visualization of microbiome and metagenome locations 
BIS2Analyzer  http://www.lcqb.upmc.fr/BIS2Analyzer/  Analysis of coevolving amino-acid pairs in protein sequences 
BusyBee  https://ccb-microbe.cs.uni-saarland.de/busybee  Metagenome binning 
CAFE  https://github.com/younglululu/CAFE  Stand-alone program for alignment-free comparison of metagenome data 
Cancer PanorOmics  http://panoromics.irbbarcelona.org  Mapping of cancer mutations to 3D protein–protein interaction sites 
COFACTOR  http://zhanglab.ccmb.med.umich.edu/COFACTOR/  Structure-based protein function annotation 
compleXView  http://xvis.genzentrum.lmu.de/compleXView  Protein-protein interaction based on affinity purification mass spectrometry 
ConTra v3  http://bioit2.irc.ugent.be/contra/v3  Transcription factor binding sites analysis 
CPC2  http://cpc2.cbi.pku.edu.cn  Protein coding potential of RNA transcripts 
CSPADE  http://cspade.fimm.fi/  Chemoinformatics bioactivity assay visualization 
CSTEA  http://comp-sysbio.org/cstea/  Analysis of time-series gene expression data on cell state transitions 
DEOGEN2  http://deogen2.mutaframe.com/  Prediction of deleterious mutations in proteins 
DNAproDB  http://dnaprodb.usc.edu  Structural analysis of DNA–protein complexes 
DSSR  http://jmol.x3dna.org  DNA and RNA structure visualization 
DynOmics  http://dyn.life.nthu.edu.tw/oENM/  Protein molecular dynamics using elastic network models 
EBISearch  http://www.ebi.ac.uk/ebisearch  Web services text search in EMBL-EBI data 
FireProt  http://loschmidt.chemi.muni.cz/fireprot  Design of thermostable proteins 
GalaxyHomomer  http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=HOMOMER  Prediction of protein homo-oligomer structure 
GASS-WEB  http://gass.unifei.edu.br/  Identification of enzyme active sites 
GeMSTONE  http://gemstone.yulab.org/  Genetic variant prioritization in human disease 
Gene ORGANizer  http://geneorganizer.huji.ac.il  Linkage of human genes to their affected body organs 
GenProBiS  http://genprobis.insilab.org  Mapping of SNPs to protein binding sites 
GEPIA  http://gepia.cancer-pku.cn/  Analysis of differential gene expression in cancer 
GeSeq  https://chlorobox.mpimp-golm.mpg.de/geseq.html  Annotation of chloroplast genomes 
GibbsCluster  http://www.cbs.dtu.dk/services/GibbsCluster-2.0  Detection of protein short linear motifs 
GPCR-SSFE 2.0  http://www.ssfa-7tmr.de/ssfe2/  Homology modeling of G-protein coupled receptors 
GWAB  http://www.inetbio.org/gwab/  Network-based genome wide association analysis 
HDOCK  http://hdock.phys.hust.edu.cn/  Protein–protein and protein–DNA/RNA docking 
HGVA  http://bioinfodev.hpc.cam.ac.uk/web-apps/hgva  Archive of human genetic variant annotations 
HH-MOTiF  http://chimborazo.biochem.mpg.de/  Detection of protein short linear motifs 
I-TASSER-MR  http://zhanglab.ccmb.med.umich.edu/I-TASSER-MR/  Protein structure modeling for X-ray crystallography 
INTAA  http://bioinfo.uochb.cas.cz/INTAA/  Analysis of amino acid interaction energies 
IntaRNA 2.0  http://rna.informatik.uni-freiburg.de/IntaRNA/Input.jsp  Prediction of interactions between RNA molecules 
IslandViewer 4.0  http://www.pathogenomics.sfu.ca/islandviewer4/  Prediction of bacterial genomic islands (horizontal gene transfer) 
kpLogo  http://kplogo.wi.mit.edu/  Detection and visualization of short sequence motifs 
LigParGen  http://jorgensenresearch.com/ligpargen  Force field parameters for molecular dynamics 
LimTox  http://limtox.bioinfo.cnio.es  Text mining for compound toxicity 
mCSM-NA  http://structure.bioc.cam.ac.uk/mcsm_na  Prediction of protein mutation effect on nucleic acid binding affinity 
MicrobiomeAnalyst  http://microbiomeanalyst.ca  Analysis of microbiome data 
MinePath  http://www.minepath.org  Differential expression analysis for regulatory network subpaths 
ModFOLD6  http://www.reading.ac.uk/bioinf/ModFOLD/  Protein structure quality assessment 
mTCTScan  http://jjwanglab.org/mTCTScan  Mutation prioritization for cancer drug response 
MutaGene  https://www.ncbi.nlm.nih.gov/projects/mutagene/  Visualization and analysis of mutational profiles in cancer 
NNAlign-2.0  http://www.cbs.dtu.dk/services/NNAlign-2.0  Detection of ligand motifs for receptor–ligand interactions 
NOREVA  http://server.idrb.cqu.edu.cn/noreva/  Evaluation of data normalization methods for mass spectrometry based metabolomics data 
Olelo  http://www.hpi.de/plattner/olelo  Text mining in PubMed 
OmicSeq  http://www.omicseq.org  Search for omics data in major repositories 
P4P  http://sing.ei.uvigo.es/p4p  Bacterial strain classification based on peptide datasets 
Pathview  http://pathview.uncc.edu/  Visualization and annotation of metabolic pathways 
pepATTRACT  http://bioserv.rpbs.univ-paris-diderot.fr/services/pepATTRACT  Prediction of protein–peptide docking 
PharmMapper  http://lilab.ecust.edu.cn/pharmmapper  Drug target search using pharmacophore mapping 
PhD-SNPg  http://snps.biofold.org/phd-snpg  Deleterious SNP classification 
PIGSPro  http://cassandra.med.uniroma1.it/AbPrediction/web/pigs.php  Modeling of immunoglobulin variable domains 
plantiSMASH  http://plantismash.secondarymetabolites.org  Detection of biosynthetic gene clusters in plants 
PMut  http://mmb.irbbarcelona.org/PMut/  Prediction of disease potential for protein mutations 
Prism3  http://prism3.magarveylab.ca/prism  Prediction of natural product structures from biosynthetic gene clusters 
ProteinsAPI  http://www.ebi.ac.uk/proteins/api  Web service for protein data from UniProtKB 
ProteinsPlus  http://proteins.plus  Structure-based modeling of proteins 
ProteoSign  http://bioinformatics.med.uoc.gr/ProteoSign  Protein differential abundance analysis 
ReFOLD  http://www.reading.ac.uk/bioinf/ReFOLD/  Protein structure refinement 
RegulatorTrail  https://regulatortrail.bioinf.uni-sb.de  Analysis of transcription factors and target genes 
RiPPMiner  http://www.nii.ac.in/rippminer.html  Prediction of chemical structures for ribosomally synthesized and post translationally modified peptides 
RNA workbench  https://github.com/bgruening/galaxy-rna-workbench  Stand-alone collection of tools for analyzing RNAseq and RNA sequence data 
RNA-MoIP  http://rnamoip.cs.mcgill.ca/  Prediction of RNA 2D and 3D structure 
SBSPKSv2  http://www.nii.ac.in/sbspks2.html  Analysis of polyketide synthases 
SCENERY  http://mensxmachina.org/en/software/  Network reconstruction from cytometry data 
SDM  http://structure.bioc.cam.ac.uk/sdm2  Prediction of stability in protein mutants 
SeMPI  http://www.pharmaceutical-bioinformatics.de/sempi/  Prediction of polyketide synthase products from biosynthetic gene clusters 
SLiMSearch  http://slim.ucd.ie/slimsearch/  Detection of protein short linear motifs 
SODA  http://protein.bio.unipd.it/soda/  Prediction of solubility in protein mutants 
SpartaABC  http://spartaabc.tau.ac.il/webserver  Sequence simulation with indels 
ThreaDomEx  http://zhanglab.ccmb.med.umich.edu/ThreaDomEx  Prediction of protein domains and domain boundaries 
Tools at EMBL-EBI  http://www.ebi.ac.uk/Tools/webservices/  Web service tools from EMBL-EBI 
TraitRateProp  http://traitrate.tau.ac.il/prop  Test of sequence evolution association with phenotype 
TRAPP  http://trapp.h-its.org  Analysis of protein binding site dynamics 
VCF.Filter  https://biomedical-sequencing.at/VCFFilter/  Stand-alone program for filtering and annotating genetic variants in vcf files 
Web3DMol  http://web3dmol.duapp.com/  Protein structure visualization 
WebGestalt  http://www.webgestalt.org  Gene set functional enrichment analysis 
WoPPER  http://WoPPER.ba.itb.cnr.it/  Detection of bacterial genome regions with coordinated gene expression changes 
XSuLT  http://structure.bioc.cam.ac.uk/xsult  Annotation and visualization of protein multiple sequence alignment 
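For readers who want to filter or search the listing programmatically, here is a minimal parsing sketch. It assumes each row has exactly three columns separated by runs of two or more spaces, as in the listing above; the `WebServer` type and `parse_row` helper are names invented for this example.

```python
import re
from typing import NamedTuple


class WebServer(NamedTuple):
    name: str
    url: str
    description: str


def parse_row(line: str) -> WebServer:
    """Split one listing row on runs of two or more whitespace characters."""
    fields = re.split(r"\s{2,}", line.strip())
    if len(fields) != 3:
        raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
    return WebServer(*fields)


# Two rows copied from the table above.
rows = [
    "agriGO v2  http://systemsbiology.cau.edu.cn/agriGOv2/  GO analysis for agricultural species ",
    "WebGestalt  http://www.webgestalt.org  Gene set functional enrichment analysis ",
]
servers = [parse_row(row) for row in rows]

# Example query: tools whose description mentions enrichment analysis.
enrichment_tools = [s.name for s in servers if "enrichment" in s.description.lower()]
```

Splitting on two or more spaces rather than on any whitespace keeps multi-word names such as "agriGO v2" intact.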

 

Table 2. Web resources from the 2018 NAR web server issue

Web server name  URL  Brief description 
AAI-profiler  http://ekhidna2.biocenter.helsinki.fi/AAI  proteome average amino acid identity comparison 
AlloFinder  http://mdl.shsmu.edu.cn/ALF/  allosteric modulator identification 
ArDock  http://ardock.ibcp.fr  protein–protein interaction region prediction 
BAGEL4  http://bagel4.molgenrug.nl  secondary metabolite gene clusters (RIPPs, bacteriocins) 
BaMM  https://bammmotif.mpibpc.mpg.de  nucleotide binding motifs 
BeStSel  http://bestsel.elte.hu  circular dichroism spectroscopy based protein secondary structure analysis 
BRepertoire  http://mabra.biomed.kcl.ac.uk/BRepertoire  antibody repertoire analysis 
BUSCA  http://busca.biocomp.unibo.it  protein subcellular localization prediction 
CABS-flex 2.0  http://biocomp.chem.uw.edu.pl/CABSflex2  simulation of protein structure flexibility 
CalFitter  https://loschmidt.chemi.muni.cz/calfitter/  protein thermal denaturation analysis 
CASTp 3.0  http://sts.bioe.uic.edu/castp/  topology of protein pockets, cavities and channels 
CavityPlus  http://www.pkumdl.cn/cavityplus  protein binding site cavities 
CellAtlasSearch  http://www.cellatlassearch.com  single cell gene expression data search 
cgDNAweb  http://cgDNAweb.epfl.ch  double-stranded DNA coarse-grain models 
CircadiOmics  http://circadiomics.ics.uci.edu  circadian rhythm dataset analysis and repository 
COACH-D  http://yanglab.nankai.edu.cn/COACH-D/  protein–ligand binding site prediction 
Coloc-stats  https://hyperbrowser.uio.no/coloc-stats/  genomic location enrichment analysis 
ComplexContact  http://raptorx2.uchicago.edu/ComplexContact/  protein heterodimer complex residue–residue contact prediction 
CoNekT-Plants  http://conekt.plant.tools  comparative analyses of plant gene co-expression 
CRISPOR  http://crispor.org  guide sequences for CRISPR/Cas9 genome editing 
CRISPRCasFinder  https://crisprcas.i2bc.paris-saclay.fr  CRISPR array and Cas gene detection 
CSAR-web  http://genome.cs.nthu.edu.tw/CSAR-web  contig scaffolding 
dbCAN2  http://cys.bios.niu.edu/dbCAN2  carbohydrate-active enzyme annotation 
DynaMut  http://biosig.unimelb.edu.au/dynamut/  point mutation effects on protein stability and dynamics 
easyFRAP-web  https://easyfrap.vmnet.upatras.gr/  protein mobility analysis with fluorescence recovery after photobleaching data 
EviNet  https://www.evinet.org/  gene set network enrichment analysis 
ezTag  http://eztag.bioqrator.org  biomedical concept annotation 
FragFit  http://proteinformatics.de/FragFit  protein segment modeling of cryo-EM density maps 
Freiburg RNA tools  http://rna.informatik.uni-freiburg.de  RNA analysis 
GADGET  http://gadget.biosci.gatech.edu  population-based distributions of genetic variants 
Galaxy  https://usegalaxy.org  biomedical data analysis workflows 
Galaxy HiCExplorer  https://hicexplorer.usegalaxy.eu  chromatin 3D conformation analysis 
GDA  http://gda.unimore.it/  integration of drug response, gene expression profiles and mutations for cancer 
GeneMANIA  http://genemania.org  gene function prediction 
geno2pheno[ngs-freq]  http://ngs.geno2pheno.org  viral drug resistance prediction 
GIANT 2.0  http://giant-v2.princeton.edu  human tissue-specific gene functional relationships 
GPCRM  http://gpcrm.biomodellab.eu/  G protein-coupled receptors structure modeling 
gRINN  http://grinn.readthedocs.io  protein molecular dynamics residue interaction energies 
GWAS4D  http://mulinlab.org/gwas4d  prioritization of regulatory variants from GWAS data 
HMMER  http://www.ebi.ac.uk/Tools/hmmer  profile hidden Markov models homology search 
HotSpot Wizard 3.0  http://loschmidt.chemi.muni.cz/hotspotwizard3  protein engineering directed mutation 
HPEPDOCK  http://huanglab.phys.hust.edu.cn/hpepdock/  peptide–protein docking 
HSYMDOCK  http://huanglab.phys.hust.edu.cn/hsymdock/  symmetric protein complex docking 
InterEvDock2  http://bioserv.rpbs.univ-paris-diderot.fr/services/InterEvDock2/  protein–protein docking 
INTERSPIA  http://bioinfo.konkuk.ac.kr/INTERSPIA/  protein–protein interactions in multiple species 
iPath3.0  http://pathways.embl.de  metabolic pathway visualization and customization 
IUPred2A  http://iupred2a.elte.hu  intrinsically disordered protein regions 
Kinact  http://biosig.unimelb.edu.au/kinact/  kinase activating missense mutations prediction 
KnotGenome  http://knotgenom.cent.uw.edu.pl/  topological analysis of chromosome knots and links 
LitVar  https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar  genetic variant information retrieval from PubMed 
LOLAweb  http://lolaweb.databio.org  genomic region enrichment analysis 
MetaboAnalyst 4.0  http://metaboanalyst.ca  metabolomics data analysis 
MetExplore  https://metexplore.toulouse.inra.fr/metexplore2/  metabolic network analysis 
MiGA  http://microbial-genomes.org/  prokaryotic genome and metagenome classification 
MISTIC2  https://mistic2.leloir.org.ar  residue pair covariation in protein families 
MOLEonline  https://mole.upol.cz  biomolecule channels, tunnels, and pores 
mTM-align  http://yanglab.nankai.edu.cn/mTM-align/  protein structure multiple alignment and database search 
Mutalisk  http://mutalisk.org  somatic mutations correlation with genomic, transcriptional and epigenomic features 
Ocean Gene Atlas  http://tara-oceans.mio.osupytheas.fr/ocean-gene-atlas/  marine plankton gene geolocation and abundance 
oli2go  http://oli2go.ait.ac.at/  PCR primer and hybridization probe design for non-human DNA 
OmicsNet  http://www.omicsnet.ca  molecular interactions networks visualization 
oriTfinder  http://bioinfo-mml.sjtu.edu.cn/oriTfinder  origin of transfer sites in bacterial mobile genetic elements 
PaintOmics 3  http://bioinfo.cipf.es/paintomics/  visualization of omics data on KEGG pathways 
PANNZER2  http://ekhidna2.biocenter.helsinki.fi/sanspanz/  protein function prediction 
PatScanUI  https://patscan.secondarymetabolites.org/  DNA and protein sequence pattern search 
PhytoNet  http://www.gene2function.de  phytoplankton gene expression profiles 
pirScan  http://cosbi4.ee.ncku.edu.tw/pirScan/  piRNA target prediction 
ProTox-II  http://tox.charite.de/protox_II  chemical toxicity prediction 
psRNATarget  http://plantgrn.noble.org/psRNATarget/  plant small RNA target prediction 
PSSMSearch  http://slim.ucd.ie/pssmsearch/  protein motifs for binding and post-translational modification 
PUG-REST  https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest  PubChem cheminformatics programmatic access 
RepeatsDB-lite  http://protein.bio.unipd.it/repeatsdb-lite  tandem repeats in proteins 
RNApdbee 2.0  http://lepus.cs.put.poznan.pl/rnapdbee-2.0/  RNA secondary structure annotation 
RSAT  http://www.rsat.eu/  DNA regulatory motifs 
SMARTIV  http://smartiv.technion.ac.il/  RNA sequence and structure motifs for RNA binding proteins 
SNPnexus  http://www.snp-nexus.org  SNP functional annotation 
SPAR  https://www.lisanwanglab.org/SPAR  analysis of small RNA sequencing data 
SWISS-MODEL  https://swissmodel.expasy.org  structure homology modeling for proteins and protein complexes 
TAM 2.0  http://www.scse.hebut.edu.cn/tam/  microRNA set enrichment analysis 
TCRmodel  http://tcrmodel.ibbr.umd.edu/  T cell receptor structure modeling 
UNRES  http://unres-server.chem.ug.edu.pl  coarse-grained simulation of protein structure 
VarAFT  http://varaft.eu  disease-causing variants annotation 
WEGO 2.0  http://wego.genomics.org.cn  Gene Ontology visualization 
X2K Web  http://X2K.cloud  kinase enrichment analysis for differentially expressed gene signatures 
xiSPEC  http://spectrumviewer.org  proteomics mass spectrometry data analysis 
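Several entries above, such as PUG-REST, provide programmatic rather than browser-based access. As a rough sketch of what that looks like, the snippet below only builds a request URL for compound properties and does not contact any server. The base URL and the `compound/name/.../property/.../JSON` path layout reflect my understanding of the PubChem service and should be checked against the PUG-REST documentation linked in the table; `compound_property_url` is a helper name made up for this example.

```python
from typing import Sequence
from urllib.parse import quote

# Assumed service root; verify against the PUG-REST documentation.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def compound_property_url(name: str, properties: Sequence[str], fmt: str = "JSON") -> str:
    """Build a URL requesting the given properties of a compound looked up by name."""
    return "/".join([
        PUG_REST_BASE,
        "compound",
        "name",
        quote(name, safe=""),   # percent-encode spaces and slashes in the name
        "property",
        ",".join(properties),
        fmt,
    ])


url = compound_property_url("aspirin", ["MolecularFormula", "MolecularWeight"])
print(url)
```

The resulting URL could then be fetched with any HTTP client; keeping URL construction separate from the network call makes it easy to test offline.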

 

References

https://en.wikipedia.org/wiki/Biological_database

The 2018 Nucleic Acids Research database issue and the online molecular biology database collection

Editorial: The 15th annual Nucleic Acids Research web server issue 2017

Editorial: The 16th annual Nucleic Acids Research web server issue 2018
