ISREC Profile Homepage
Overview
Generalized profiles are a very sensitive method for discovering distant
sequence relationships. Unlike conventional sequence comparison and database
search methods, which use a single sequence as the query, the profile method
uses a profile constructed from a family of related sequences. These profiles are
normally derived from multiple alignments of the initial sequence set. In addition
to the sequences themselves, a profile contains the following information:
- which types of residues are allowed at which position
- which positions are important (i.e. highly conserved) and which are not
- which positions or regions can allow insertions, and which regions may be dispensable
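The three kinds of information listed above can be illustrated with a deliberately simplified sketch. It assumes a toy, ungapped profile in which each position holds a table of residue scores; the names `TOY_PROFILE`, `DEFAULT_SCORE`, `score_window` and `best_match` and all score values are hypothetical and are not taken from PROSITE or the pftools package. Real generalized profiles additionally carry position-specific insertion and deletion parameters and are aligned by dynamic programming, which this sketch leaves out.

```python
from typing import Dict, List, Tuple

# Hypothetical toy profile: one table of residue scores per position.
# All values are invented for illustration only.
Profile = List[Dict[str, int]]

TOY_PROFILE: Profile = [
    {"G": 5, "A": 2},  # position 1: G preferred, A tolerated
    {"K": 4, "R": 4},  # position 2: either basic residue scores equally
    {"S": 6, "T": 5},  # position 3: high scores mark a conserved position
]
DEFAULT_SCORE = -3  # assumed penalty for a residue not listed at a position


def score_window(profile: Profile, window: str) -> int:
    """Sum the position-specific scores of an ungapped window."""
    return sum(column.get(residue, DEFAULT_SCORE)
               for column, residue in zip(profile, window))


def best_match(profile: Profile, sequence: str) -> Tuple[int, int]:
    """Return (score, start index) of the best ungapped window."""
    width = len(profile)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the profile")
    best_score = score_window(profile, sequence[:width])
    best_start = 0
    for start in range(1, len(sequence) - width + 1):
        score = score_window(profile, sequence[start:start + width])
        if score > best_score:  # keep the first window on ties
            best_score, best_start = score, start
    return best_score, best_start


if __name__ == "__main__":
    print(best_match(TOY_PROFILE, "MMGKSLL"))
```

Because every position has its own score table, a query built this way rewards conserved positions more strongly than a single query sequence could.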
In collaboration with Amos Bairoch in Geneva, we are currently creating
profiles of various protein domains that are being incorporated into the
PROSITE pattern library. For this purpose, we created a new, generalized
profile format containing many more parameters than the previous one.
A new set of profile search programs can take advantage of these new
parameters and allow more sensitive searches as well as novel types of
searches.
For a detailed description of this format and related topics see the
documents below.
Selected references
- The original profile method:
- Gribskov, M., McLachlan, A.D. and Eisenberg, D. (1987)
Profile analysis: detection of distantly related proteins.
Proc. Natl. Acad. Sci. USA 84:4355-4358
- Improvements to the profile method:
- Lüthy, R., Xenarios, I. and Bucher, P. (1994)
Improving the sensitivity of the sequence profile method.
Prot. Sci. 3:139-146
- Thompson, J.D., Higgins, D.G. and Gibson, T. (1994)
Improved sensitivity of profile searches through the use of sequence
weights and gap excision.
Comput. Applicat. Biosci. 10:19-29
- The generalized sequence profiles:
- Bucher, P. and Bairoch, A. (1994)
A generalized profile syntax for biomolecular sequence motifs and its function in automatic sequence
interpretation.
In: Proceedings of the 2nd ISMB Conference, pp. 53-61, AAAI press.
- Bucher, P., Karplus, K., Moeri, N. and Hofmann, K. (1996)
A flexible search technique based on generalized profiles.
Computers and Chemistry 20:3-24
- The PROSITE pattern library:
- Bairoch, A., Bucher, P. and Hofmann, K. (1996)
The PROSITE database, its status in 1995.
Nucleic Acids Res. 24:189-196
For various applications of the generalized profile technique, see our
publication list and check the documents listed below.
Documents on generalized profile syntax and methods
- The syntax of profiles in PROSITE
- This document is part of the current PROSITE release. It contains a
detailed description of the format and provides all information needed for
writing programs that read or write the new format. Note, however, that we
have also released a set of free programs that perform sequence comparisons and
database searches with profiles in the new format. This program package
also contains portable routines for reading and writing the new format
that can be used in other programs as well.
- PROSITE users manual
- This document, written by Amos Bairoch, explains all the information stored in
PROSITE and how it can be used.
- Methods for the construction of profile entries for the PROSITE database
- (K. Hofmann and P. Bucher, 1995).
Poster presented at the 3rd International Conference on Intelligent Systems for Molecular Biology,
Cambridge/UK, July 1995. This document explains how the generalized profiles in the PROSITE database are constructed.
Issues such as iterative profile refinement and profile scaling are briefly discussed.
- Normalized profile scores
- This document deals with the assessment of the statistical significance
of matches found by the profile search methods. Application of the 'normalized
profile score' (NScore) is explained.
A collection of posters on profile applications
- Benefits of a Generalized Profile Syntax for Biomolecular Sequence Motifs
- (K. Hofmann and P. Bucher, 1994).
Poster presented at the 3rd conference on Genes, Proteins and Computers, Chester/UK 1994.
This poster is also available in compressed Postscript format.
It contains a description of the advantages of profile-based database searches. As an example, the detection
of sequence similarity between inositol-monophosphatase, fructose-1,6-bisphosphatase and
inositol polyphosphate 1-monophosphatase is demonstrated.
- Detection and Analysis of Distantly Related C2-like Membrane Attachment Domains
- (K. Hofmann and P. Bucher, 1995).
Poster presented at the 1st European Protein Society Meeting, Davos/CH 1995.
This poster is also available in compressed Postscript format.
The generalized profile method is used to demonstrate the occurrence of C2-like domains in proteins
like the novel PLC isoforms, phospholipase C, cytosolic phospholipase A2, perforin, and many more.
- Conserved sequence domains in cell cycle regulatory proteins
- (K. Hofmann and P. Bucher, 1996).
Poster presented at the joint ISREC/AACR meeting "Cancer and the Cell cycle", Lausanne/CH January 1996.
This document shows several examples of weakly conserved domains in cell cycle regulatory proteins, which
have been detected using the profile method.
Profile-related software
- ISREC ProfileScan Server
- (Search the profile entries in PROSITE with your sequence).
This is an experimental implementation of the pfscan program.
The profile entries contained in PROSITE, recognizable by the keyword
MATRIX, can be searched with a single, user-supplied sequence.
Major new data release and Pfam now searchable!
- Download the pftools package
- The pftools package contains programs for generalized profile applications. The source
code in FORTRAN77 and executables for various platforms are available. The current
release 1.0 contains the programs pfsearch, pfscan, and GtoP.
Problems should be reported to
Philipp Bucher, the author of the package.
Pftools 2.0 now available!
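As noted for the ProfileScan server above, profile entries in PROSITE are recognizable by the keyword MATRIX. The following minimal sketch shows how such entries might be picked out of a PROSITE-style flat file. It assumes that each entry starts with an ID line of the form "ID   ENTRY_NAME; ENTRY_TYPE." and that entries are separated by "//" lines; the function name `matrix_entry_names` and the sample entries are hypothetical. The PROSITE user manual remains the authoritative description of the format.

```python
from typing import List


def matrix_entry_names(flat_file_text: str) -> List[str]:
    """Collect the names of entries whose ID line has the type MATRIX."""
    names = []
    for line in flat_file_text.splitlines():
        if not line.startswith("ID "):
            continue
        # Assumed ID line layout: "ID   ENTRY_NAME; ENTRY_TYPE."
        fields = line[2:].strip().rstrip(".").split(";")
        if len(fields) == 2 and fields[1].strip() == "MATRIX":
            names.append(fields[0].strip())
    return names


# Hypothetical sample text, not copied from a real PROSITE release.
SAMPLE = """\
ID   EXAMPLE_PATTERN; PATTERN.
DE   Hypothetical pattern entry.
//
ID   EXAMPLE_PROFILE; MATRIX.
DE   Hypothetical profile entry.
//
"""

if __name__ == "__main__":
    print(matrix_entry_names(SAMPLE))
```

For real searches, the pfscan and pfsearch programs from the pftools package read the profile entries directly, so a filter like this is only useful for quick inspection of a library file.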
Anyone who would like more information on profiles, or who would like to
contribute profiles or good multiple alignments of protein domains, should
contact Philipp Bucher or Kay Hofmann.
For more information on PROSITE, visit the PROSITE homepage in Geneva.