Conserved sequence domains in cell cycle regulatory proteins
Kay Hofmann and Philipp Bucher, Swiss Institute for
Experimental Cancer Research, 1066 Epalinges, Switzerland.
Proteins involved in signal transduction and cell cycle regulation
frequently consist of several independent domains, each of them
mediating a particular type of interaction. Typically, such domains
can be identified by analyzing regions of local sequence homology
occurring in otherwise unrelated sequences. In some cases, such as
the SH2 or SH3 domains, these homologies are detectable by
conventional sequence comparison methods. By application of 'generalized
profiles', a very sensitive method for the detection of weak sequence
similarities, we discovered several novel sequence motifs in proteins
involved in cell cycle control. In collaboration with Amos Bairoch
in Geneva, we are assembling a database of motif descriptors (PROSITE),
which is made freely available and can be searched using free
programs or WWW-servers.
Further information is available electronically from the following
sources:
The generalized sequence profile method
Sequence profiles offer a very sensitive method for
the discovery of distant sequence relationships. In contrast to
conventional sequence comparison and database searching methods,
the query is not a single sequence but a profile
constructed from a family of related sequences. These profiles
are normally derived from multiple alignments of the initial sequence
set. In addition to the sequences themselves, a profile contains
the following information:
- which types of residues are allowed at each position
- which positions are important (i.e. highly conserved) and which
are not
- which positions or regions can tolerate insertions and which regions
may be dispensable
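The three kinds of information listed above can be illustrated with a small Python sketch. It is not the generalized profile format itself, only a simplified position-specific scoring matrix. The function names, the uniform background frequency, the pseudocount, and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Return one dict per alignment column holding log-odds residue
    scores, a conservation value, and the fraction of gap characters."""
    n_seqs = len(alignment)
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # Assumption: a uniform background; real profiles use observed frequencies.
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        column = [row[col] for row in alignment]
        counts = Counter(c for c in column if c in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        freqs = {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}

        # Which residues are allowed: log-odds score per residue type.
        scores = {aa: math.log2(freqs[aa] / background) for aa in AMINO_ACIDS}
        # How important the position is: relative entropy versus background.
        conservation = sum(f * math.log2(f / background) for f in freqs.values())
        # Where insertions or deletions are tolerated: gap fraction.
        gap_fraction = column.count("-") / n_seqs

        profile.append({
            "scores": scores,
            "conservation": conservation,
            "gap_fraction": gap_fraction,
        })
    return profile


def score_ungapped(profile, segment):
    """Sum the position-specific scores of an ungapped segment."""
    return sum(pos["scores"].get(aa, 0.0) for pos, aa in zip(profile, segment))


# Hypothetical toy alignment, not taken from any real protein family.
toy_alignment = ["GKVLR", "GRVIR", "GKV-R", "GKALR"]
toy_profile = build_profile(toy_alignment)
```

In this toy case the invariant first column receives a higher conservation value than the variable, gap-containing fourth column, and a segment resembling the family scores higher than an unrelated one.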
For the detection of distant relationships, the assessment of
statistical significance is very important. We derive an estimate
of the error probability from an empirical score distribution
obtained with a randomized protein database.
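As a rough illustration of this idea, the following Python sketch estimates an error probability from scores obtained on a randomized database. Randomizing by shuffling each sequence is an assumption, since the text does not specify the procedure. The scoring function `toy_score`, the motif, and the three database sequences are hypothetical stand-ins for a real profile search.

```python
import random


def shuffle_database(sequences, rng):
    """Shuffle the residues of every sequence, preserving its composition."""
    shuffled = []
    for seq in sequences:
        residues = list(seq)
        rng.shuffle(residues)
        shuffled.append("".join(residues))
    return shuffled


def empirical_error_probability(observed_score, null_scores):
    """Fraction of randomized scores at least as high as the observed one.
    Adding 1 to numerator and denominator avoids reporting exactly zero."""
    n_at_least = sum(1 for s in null_scores if s >= observed_score)
    return (n_at_least + 1) / (len(null_scores) + 1)


def toy_score(seq, motif="GKVLR"):
    """Hypothetical stand-in for a profile score: the best number of
    identities with a fixed motif over all ungapped windows."""
    best = 0
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        best = max(best, sum(a == b for a, b in zip(window, motif)))
    return best


# Hypothetical database; a fixed seed keeps the run reproducible.
rng = random.Random(0)
database = ["MSTGKVLRAAPLE", "MKRLLAGVGKSTE", "AVLRGKPPWYTQE"]

null_scores = []
for _ in range(200):
    null_scores.extend(toy_score(s) for s in shuffle_database(database, rng))

p_value = empirical_error_probability(toy_score(database[0]), null_scores)
```

A small `p_value` means that scores as high as the observed one are rare in the randomized database, which is the sense in which a match is called significant here.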
An additional advantage of the profile method is the possibility
of iterative profile refinement. Sequences with highly significant
similarity to the profile are aligned to the profile and included
in the profile construction process for the next round of database
searches.
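The refinement cycle described above can be sketched in Python as a simple loop. This is a minimal illustration under strong assumptions: all sequences are taken to be pre-aligned and of equal length, a fixed score threshold stands in for the significance test, and the helper names (`refine_profile`, `build_allowed_sets`, `count_allowed`) and toy sequences are hypothetical.

```python
def build_allowed_sets(members):
    """Toy profile: the set of residues seen at each position."""
    return [set(column) for column in zip(*members)]


def count_allowed(profile, seq):
    """Toy score: number of positions whose residue the profile allows."""
    return sum(aa in allowed for allowed, aa in zip(profile, seq))


def refine_profile(seed_members, database, build, score, threshold, max_rounds=5):
    """Rebuild the profile each round from all members found so far and
    add every database sequence that scores at or above the threshold."""
    members = list(seed_members)
    for _ in range(max_rounds):
        profile = build(members)
        hits = [
            seq for seq in database
            if seq not in members and score(profile, seq) >= threshold
        ]
        if not hits:
            break  # converged: no new significant sequences
        members.extend(hits)
    return build(members), members


# Hypothetical toy data: each round recruits one more distant relative.
seed = ["GKVLR"]
database = ["GKVIR", "GRVIR", "GRAIR", "WWWWW"]
final_profile, members = refine_profile(
    seed, database, build_allowed_sets, count_allowed, threshold=4
)
```

In this toy run, the sequence `GRAIR` matches the seed at only two positions and is recruited only after two intermediate sequences have broadened the profile. The same mechanism can also pull in false positives, which is why the text restricts inclusion to highly significant matches.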
Currently, we are constructing generalized profiles for the most
important of the highly divergent protein domains and families.
These profiles will be included in the PROSITE pattern collection.
We are also trying to detect and describe previously unknown homology
domains, with emphasis on proteins involved in signal transduction
and regulatory pathways.
Selected references
The original profile method:
Gribskov, M., McLachlan, A.D. and Eisenberg, D. (1987),
Proc. Natl. Acad. Sci. USA 84:4355-4358.
Improvements to the profile method:
Luthy, R., Xenarios, I. and Bucher, P. (1994),
Prot. Sci. 3:139-146.
The generalized sequence profiles:
Bucher, P. and Bairoch, A. (1994),
in: Proceedings of the 2nd ISMB Conference, pp. 53-61, AAAI press.
Bucher, P., Karplus, K., Moeri, N. and Hofmann, K. (1996),
Computers & Chemistry, in press.
The PROSITE pattern and profile library:
Bairoch, A., Bucher, P. and Hofmann, K. (1996),
Nucleic Acids Res. 24:189-196.
The FHA-domain: A sequence motif possibly mediating phospho-Ser/Thr-specific
interactions is found in several cell cycle regulatory proteins
The FHA-domain is a sequence motif found in several nuclear protein
kinases, transcription factors, and other proteins. Several of
these proteins are involved in cell cycle regulation, among them:
- S. pombe dma1, a protein involved in the spindle checkpoint
(see poster B27)
- S. cerevisiae DUN1 and RAD53/SAD1/MEC2/SPK1, two protein kinases
linking the S-phase checkpoint to DNA-damage repair.
- S. pombe cds1, a kinase acting in the S-phase checkpoint.
In the plant phosphatase KAPP, a region containing the FHA-domain
in its center has been shown to interact specifically with a receptor-type
Ser/Thr kinase only if the kinase is autophosphorylated. A multiple
alignment of representative sequences and an updated list of proteins
containing FHA-domains are shown below.
Hofmann, K. and Bucher, P. (1995), Trends Biochem. Sci. 20:347-349
[gif]
[Postscript]
Abbreviations: FHA: FHA-domain; RF: RING-finger; FH: fork head domain;
ZF: zinc finger; ATP,da: Walker1/2 ATP binding motifs; white vertical bars:
predicted transmembrane regions.
[gif]
[Postscript]
The UBA-domain: A sequence motif found in multiple enzyme
classes of the ubiquitination pathway
The UBA-domain is a novel sequence motif found in several proteins
having connections to ubiquitin and the ubiquitination pathway:
- Bovine E2-25K and several other ubiquitin-conjugating enzymes
(E2), catalyzing the second step in protein ubiquitination.
- Drosophila hyperplastic discs protein, a putative ubiquitin-protein
ligase (E3), catalyzing the third and final step in protein ubiquitination.
- Human ubiquitin isopeptidase T and several other ubiquitin
C-terminal hydrolases, catalyzing regulatory protein-deubiquitination.
- S. cerevisiae RAD23 and its mammalian homologues. These proteins
act in UV excision repair and contain an N-terminal domain similar
to ubiquitin itself.
- Mammalian proto-oncogene c-cbl, a protein interacting with
multiple signal transduction factors. Cbl has recently been shown
to undergo regulatory ubiquitination upon macrophage stimulation.
A multiple alignment of representative sequences and the domain
structure of proteins containing UBA-domains are shown below.
[gif]
[Postscript]
Abbreviations: U: UBA-domain; H1,H2: Deubiquitinase catalytic domains;
UBC: UBC catalytic domain; ub: ubiquitin-homology; RF: RING-finger;
pab: PABP C-terminus homology; KA1: KA1-domain; EH: EPS15-homology domain;
HECT: Ubiquitin-ligase (E3) catalytic domain; UX: UX-domain.
[gif]
[Postscript]
The GRK-domain: a putative Gα-interacting protein domain
The GRK-domain is a novel sequence motif found in several proteins
involved in signalling by heterotrimeric G-proteins. Sequences
containing single or multiple copies of the GRK-domain include:
- S. cerevisiae alpha-factor signaling regulator SST2, a protein
probably acting by desensitizing the alpha-factor response.
- C. elegans G-protein signaling regulator EGL-10, which also
contains a region similar to Gγ-subunits.
- Human Gα-interacting protein GAIP.
- Human G0/G1 switch regulatory protein 8.
- Human B-cell activation protein BL34.
- The complete family of characterized G-protein coupled receptor
kinases. These proteins phosphorylate activated G-protein coupled
receptors, such as the beta-adrenergic receptor, thus desensitizing
the response. Several of these kinases possess a C-terminal PH-domain
that might interact with Gβγ-subunits and target the kinase to
the membrane. The presence of a GRK-domain in the N-terminal receptor-recognition
region suggests the participation of Gα-subunits in the recognition
process.
The domain structure of proteins containing GRK-domains is shown
below.
[gif]
[Postscript]
Abbreviations: GRK: GRK-domain; PH: PH-domain; GPg: G-protein gamma
subunit homology; black box: GRK-specific extension of the kinase domain.
The BB-domain is present in multiple proteins of the STE pathway and
other signal transduction proteins
The BB-domain is a sequence motif initially detected in the two BEM1-
interacting proteins BEB1 and BOB1. It also occurs in several other
proteins probably involved in signal transduction, including:
- S. pombe ste4 protein.
- S. pombe byr2 protein kinase and its homolog from S. cerevisiae, STE11.
- Drosophila polyhomeotic proximal protein PHP.
- Drosophila BicC protein.
- S. cerevisiae BEB1 and BOB1, two proteins involved in budding.
- A non-receptor tyrosine kinase from Dictyostelium.
- Human LAR-interacting protein. In this protein, the interaction with the
tyrosine phosphatase LAR has been mapped to the region containing three
adjacent copies of the BB-domain.
- The complete Eph-family of tyrosine kinases.
The domain structure of proteins containing BB-domains is shown
below.
[gif]
[Postscript]
Abbreviations: BB: BB-domain; PH: PH-domain; SH3: SH3-domain; KH:
KH-domain; A: Ankyrin-repeat; DHR: DHR/PDZ-domain; white vertical bars:
predicted transmembrane regions.