Results 1 - 10 of 263
The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003
Nucleic Acids Res, 2003
Multiple sequence alignment with the Clustal series of programs
Nucleic Acids Res, 2003
Cited by 139 (3 self)
The Clustal series of programs are widely used in molecular biology for the multiple alignment of both nucleic acid and protein sequences and for preparing phylogenetic trees. The popularity of the programs depends on a number of factors, including not only the accuracy of the results, but also the robustness, portability and user-friendliness of the programs. New features include NEXUS and FASTA format output, printing range numbers and faster tree calculation. Although Clustal was originally developed to run on a local computer, numerous Web servers have been set up, notably at the EBI
UniProt: the Universal Protein Knowledgebase
Nucleic Acids Res, 2004
Cited by 135 (20 self)
To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross-references). For convenient sequence searches, UniProt also provides several non-redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online
SCOP database in 2004: refinements integrate structure and sequence family data
2004
Cited by 124 (5 self)
The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. Protein domains in SCOP are hierarchically classified into families, superfamilies, folds and classes. The continual accumulation of sequence and structural data allows more rigorous analysis and provides important information for understanding the protein world and its evolutionary repertoire. SCOP participates in a project that aims to rationalize and integrate the data on proteins held in several sequence and structure databases. As part of this project, starting with release 1.63, we have initiated a refinement of the SCOP classification, which introduces a number of changes mostly at the levels below superfamily. The pending SCOP reclassification will be carried out gradually through a number of future releases. In addition to the expanded set of static links to external resources, available at the level of domain entries, we have started modernization of the interface capabilities of SCOP allowing more dynamic links with other databases. SCOP can be accessed at http://scop.mrc-lmb.cam.ac.uk/scop.
The InterPro Database, 2003 brings increased coverage and new features
Nucleic Acids Res, 2003
Cited by 123 (11 self)
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of
Rfam: An RNA family database
Nucleic Acids Res, 2003
Cited by 114 (1 self)
Rfam is a collection of multiple sequence alignments and covariance models representing non-coding RNA families. Rfam is available on the web in the UK at http://www.sanger.ac.uk/Software/Rfam/ and in the US at http://rfam.wustl.edu/. These websites allow the user to search a query sequence against a library of covariance models, and view multiple sequence alignments and family annotation. The database can also be downloaded in flatfile form and searched locally using the INFERNAL package (http://infernal.wustl.edu/). The first release of Rfam (1.0) contains 25 families, which annotate over 50000 non-coding RNA genes in the taxonomic divisions of the EMBL nucleotide database.
The InterPro database, an integrated documentation resource for protein families, domains and functional sites
Cited by 99 (12 self)
Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1 000 000 hits from 462 500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk.
The Molecular Biology Database Collection: 2005 update
Nucleic Acids Res, 2005
Cited by 99 (0 self)
The NAR Molecular Biology Database Collection is a public online resource that contains links to all databases described in this issue of Nucleic Acids Research. In addition, this collection lists databases that have been featured in previous issues of NAR, as well as selected other databases that are freely available to the public and may be useful to the molecular biologist. The 2006 update includes 858 databases, 139 more than the previous one. The databases come with brief summaries, many of which have been updated recently. Each database is assigned a stable accession number that does not change if the database moves to a new location and its URL, authors' names or the contact person address are updated. The complete database list and summaries are available online at the Nucleic Acids Research website
Pfam: clans, web tools and services
Nucleic Acids Res, 2006
Cited by 78 (5 self)
Pfam is a database of protein families that currently contains 7973 entries (release 18.0). A recent development in Pfam has enabled the grouping of related families into clans. Pfam clans are described in detail, together with the new associated web pages. Improvements to the range of Pfam web tools and the first set of Pfam web services that allow programmatic access to the database and associated tools are also presented. Pfam is available on the web in the UK
Predicting Protein-Protein Interactions From Primary Structure
2001
Cited by 76 (2 self)
Motivation: An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. The expectation is that this will provide a fuller appreciation of cellular processes and networks at the protein level, ultimately leading to a better understanding of disease mechanisms and suggesting new means for intervention. This paper addresses the question: can protein-protein interactions be predicted directly from primary structure and associated data? Using a diverse database of known protein interactions, a Support Vector Machine (SVM) learning system was trained to recognize and predict interactions based solely on primary structure and associated physicochemical properties. Results: Inductive accuracy of the trained system, defined here as the percentage of correct protein interaction predictions for previously unseen test sets, averaged 80% for the ensemble of statistical experiments. Future proteomics studies may benefit from this research by proceeding directly from the automated identification of a cell's gene products to prediction of protein interaction pairs. Contact: dgough@bioeng.ucsd.edu

