Results 1 - 10 of 1,070
MAFFT version 5: improvement in accuracy of multiple sequence alignment
- Nucleic Acids Res, 2005
Cited by 801 (5 self)
The accuracy of the multiple sequence alignment program MAFFT has been improved. The new version (5.3) of MAFFT offers new iterative refinement options, H-INS-i, F-INS-i and G-INS-i, in which pairwise alignment information is incorporated into the objective function. These new options of MAFFT showed higher accuracy than currently available methods, including TCoffee version 2 and CLUSTAL W, in benchmark tests consisting of alignments of >50 sequences. Like the previously available options, the new options of MAFFT can handle hundreds of sequences on a standard desktop computer. We also examined the effect of the number of homologues included in an alignment. For a multiple alignment consisting of 8 sequences with low similarity, the accuracy was improved (2–10 percentage points) when the sequences were aligned together with dozens of their close homologues (E-value < 10^-5 to 10^-20) collected from a database. Such improvement was generally observed for most methods, but was remarkably large for the new options of MAFFT proposed here. Thus, we made a Ruby script, mafftE.rb, which aligns the input sequences together with their close homologues collected from SwissProt using NCBI-BLAST.
Pfam protein families database
- Nucleic Acids Research, 2008, 36(Database issue): D281–D288
Cited by 771 (13 self)
Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile hidden Markov models. The current release of Pfam (22.0) contains 9318 protein families. Pfam is now based not only on the UniProtKB sequence database, but also on NCBI GenPept and on sequences from selected metagenomics projects. Pfam is available on the web from the consortium members using a new, consistent and improved website design in the UK
The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003
- Nucleic Acids Res, 2003
"... The SWISS-PROT protein knowledgebase ..."
Multiple sequence alignment with the Clustal series of programs
- Nucleic Acids Res, 2003
Cited by 747 (5 self)
The Clustal series of programs are widely used in molecular biology for the multiple alignment of both nucleic acid and protein sequences and for preparing phylogenetic trees. The popularity of the programs depends on a number of factors, including not only the accuracy of the results, but also the robustness, portability and user-friendliness of the programs. New features include NEXUS and FASTA format output, printing range numbers and faster tree calculation. Although Clustal was originally developed to run on a local computer, numerous Web servers have been set up, notably at the EBI
The SWISS-MODEL Workspace: A web-based environment for protein structure homology modelling
- Bioinformatics, 2005
Cited by 575 (5 self)
Motivation: Homology models of proteins are of great interest for planning and analyzing biological experiments when no experimental three-dimensional structures are available. Building homology models requires specialized programs and up-to-date sequence and structural databases. Integrating all required tools, programs and databases into a single web-based workspace facilitates access to homology modelling from a computer with web connection without the need of downloading and installing large program packages and databases. Results: SWISS-MODEL Workspace is a web-based integrated service dedicated to protein structure homology modelling. It assists and guides the user in building protein homology models at different levels of complexity. A personal working environment is provided for each user where several modelling projects can be carried out in parallel. Protein sequence and structure databases necessary for modelling are accessible from the workspace and are updated in regular intervals. Tools for template selection, model building, and structure quality evaluation can be invoked from within the workspace. Workflow and usage of the workspace are illustrated by modelling human Cyclin A1 and human Transmembrane Protease
TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes
- Nucleic Acids Res, 2006
Cited by 482 (5 self)
The TRANSFAC database on transcription factors, their binding sites, nucleotide distribution matrices and regulated genes as well as the complementing database TRANSCompel on composite elements have been further enhanced on various levels. A new web interface with different search options and integrated versions of Match™ and Patch™ provides increased functionality for TRANSFAC. The list of databases which are linked to the common GENE table of TRANSFAC and TRANSCompel has been
UniProt: the Universal Protein Knowledgebase
- Nucleic Acids Res, 2004
Cited by 335 (27 self)
To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross-references). For convenient sequence searches, UniProt also provides several non-redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online
The Universal Protein Resource (UniProt): an expanding universe of protein information
- Nucleic Acids Res, 2006
Cited by 302 (20 self)
The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at
Pfam: clans, web tools and services
- Nucleic Acids Res, 2006
Cited by 296 (13 self)
Pfam is a database of protein families that currently contains 7973 entries (release 18.0). A recent development in Pfam has enabled the grouping of related families into clans. Pfam clans are described in detail, together with the new associated web pages. Improvements to the range of Pfam web tools and the first set of Pfam web services that allow programmatic access to the database and associated tools are also presented. Pfam is available on the web in the UK
Rfam: an RNA family database
- Nucleic Acids Res, 2003
Cited by 289 (7 self)
Rfam is a collection of multiple sequence alignments and covariance models representing non-coding RNA families. Rfam is available on the web in the UK at