Overlapping non-C2H2 motifs                 C2H2new pattern

Motif      Name     Gene    Pos.  Prefix    XXCXXXXXCXXXXXXXXXXXXHXXXXXXH/C  Suffix
GATA                                        XXCXN---CXXXXTXLWRRXXXGXXX--C    NA C
           YJL110c  GZF3    129             PVCKN---CLTSTTPLWRRDEHGAML--C    NA C
           YKR034w  DAL80   29              PTCQN---CFTVKTPLWRRDEHGTVL--C    NA C
GAL4                                         GCXX---CRXXKXXCXXXXXXCXX---C    6-8C
           YDR207c  UME6    770             TGCWI---CRLRKKKCTEERPHCFN---C     6 C
           YML099c  ARG81   20              TGCWT---CRGRKVKCDLRHPHCQR---C     6 C
           YLR256w  HAP1    63              LSCTI---CRKRKVKCDKLRPHCQQ---C     8 C
RING1                                         CXX---C    9-39   CXHXX---C    XX C4-48C2C
           YBR114w  RAD16   537             VICQL---CNDEAEEPIESKCHHKF---C     2 C 16 C2C
           YDR266c          64              ELCVI---CARKLTYVSLTPCHHKT---C     2 C 12 C2C
RING2                             C2C 9-39    CXHXX-CXXC   4-48         C     2 C
           YDL175c          63    C2C 7     KDCPHII-CSYCGATDDHYSRHCPKAIQC     2 C
           YIL079c          76    C2C 7     RNCPHVI-CTYCGFMDDHYSQHCPKAIIC     2 C
           YKR017c          179   C2C 9     LECGHEY-CINCYRHYIKDKLHEGNIITC     2 C
C8MOT                             C2C10-22    CXX---CXXXXCXXC 10-17           C2C
           YLR005w  SSL1    429   C2C 20    YRCED---CKQEFCVDCDVFIHEIL---H    1C2C
ZZFIN                             C2C7-11C  XXCXEYDLCXXC 5-9     H 2-5  H
           YDR448w  ADA2    7     C2C 11C   AICPEYDLCVPCFSQGSYTGKHRPY---H
BBOX                              C2C 8       CXX---CXXXXCXXXCXXXH 4-5  C
           YHR040w          5     C2C 6     YKCPR---CLVQTCSLECSKKHKTRDN-C
           YML041c          244   C2C 6     SSCVN---CGNKICSVSCFKLHNETR--C
RPOL                                          CXX---CXXXXXXCXXH 26             C2C
           YOR341w  RPA190  62              NLCST---CGLDEKFCPGHQGHIELPVPC    16C2C
NEWM1                             C2C 9         H---CXXCXXCXXXXDHHCXXXXXC
           YNL326c          106   C2C 5     DRCHH---CSSCDVCILKMDHHCPWFAEC
Unknown
           YML068w          294   C2C 3C9   IQCQK---CHFVFCFDCLHAWHGYNNK-C
           YMR187c          15    H11       CICWI---CLEESTYDSTWLQHTCG---C    4H2C

Homology to other subfamilies

Subfamily  Name     Gene    Pos.            XXCXXXXXCXXXXXXXXXXXXHXXXXXXH/C
1 Deamin.  YHR144c  DCD1    180   C5        VGCVIVRECRVIATGYNGTPRHLTN---C    4C2C
2 Kinases  YPR054w  SMK1    152   C5H5H7    ILCTLNG-CLKICDFGLARGIHAGFFK-C     H4H
3 Ligases  YGR184c  UBR1    136   C2C1C1    DTCVL---CIHCFNPKDHVNHHVCTDI-C    7C1C
4 Dehydr.  YHL041w          18    H6H3H1    TSCL----CRIIYVGWKSFWKHFFF---C    1H

Figure legend: Wrong fingers that overlap with other, non-C2H2 'finger' motifs, and questionable fingers found in yeast proteins that belong to homologous non-Zfp protein subfamilies.

Overlapping non-C2H2 motifs:

The first column gives the name of the non-C2H2 motif; the next three columns give the yeast ORF name, the gene name and the position of the 'finger' in the parent protein; the last three columns show the alignment of the sequence part that overlaps the C2H2new pattern with the non-C2H2 finger motif (the motif itself is shown in the first line of each group). Note that all non-C2H2 motifs have additional sequence parts to the left and/or right of their C2H2new part, listed under Prefix and Suffix. Numbers in the sequence columns denote spacers of amino acids of any type, with the given length or range of lengths; X means any amino acid. RING1 and RING2 indicate two types of overlap. Colour code: purple, C; blue, H.

Homology to subfamilies:

The first column gives a short name for the protein subfamily; all other columns are as described above. The short names denote the following subfamilies, first described for the yeast proteins in the given references: 1 Deamin. = cytidine deaminase; 2 Kinases = MAP kinase; 3 Ligases = UBR1 ligase; 4 Dehydr. = NADH dehydrogenase.
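
The notation described in this legend (X for any amino acid, numbers for spacers, H/C for alternative residues) can be read mechanically. The following Python sketch is illustrative only and is not the search procedure used to produce the table. The function names are hypothetical. The first check works column by column on the already aligned sequences as printed (gaps written as '-'), because the spacing rules of the actual C2H2new search pattern are not stated here. The second function assumes that a bare number n means a spacer of exactly n residues and n-m a spacer of n to m residues.

```python
import re

# C2H2new pattern as printed in the table header; the trailing "H/C"
# means the last position is either H or C.
C2H2NEW = "XXCXXXXXCXXXXXXXXXXXXHXXXXXXH/C"


def pattern_columns(pattern):
    """Split a pattern into one entry per alignment column.

    'X' becomes None (any residue or gap); 'A/B' becomes {'A', 'B'};
    any other letter becomes a one-element set.
    """
    columns = []
    for token in re.findall(r"[A-Z]/[A-Z]|[A-Z]", pattern):
        if token == "X":
            columns.append(None)
        else:
            columns.append(set(token.replace("/", "")))
    return columns


def fits_pattern(aligned_seq, pattern=C2H2NEW):
    """Check an aligned sequence (gaps as '-') against the pattern, column by column."""
    columns = pattern_columns(pattern)
    if len(aligned_seq) != len(columns):
        return False
    return all(allowed is None or residue in allowed
               for residue, allowed in zip(aligned_seq, columns))


def spacer_notation_to_regex(notation):
    """Turn Prefix/Suffix notation such as 'C2C 9-39' into a regular expression.

    Assumption: n is a spacer of n residues, n-m a spacer of n to m residues,
    'X' is any single residue, other letters are literal residues.
    Blanks and alignment gaps are ignored.
    """
    parts = []
    for token in re.findall(r"\d+-\d+|\d+|[A-Z]", notation):
        if "-" in token:
            low, high = token.split("-")
            parts.append(".{%s,%s}" % (low, high))
        elif token.isdigit():
            parts.append(".{%s}" % token)
        elif token == "X":
            parts.append(".")
        else:
            parts.append(token)
    return "".join(parts)


if __name__ == "__main__":
    # GZF3 row of the GATA group, copied from the table.
    print(fits_pattern("PVCKN---CLTSTTPLWRRDEHGAML--C"))
    # RING2 prefix from the table.
    print(spacer_notation_to_regex("C2C 9-39"))
```

The column-wise check keeps the alignment gaps exactly as shown in the table, so it only confirms that a printed row is consistent with the header pattern; scanning unaligned protein sequences would need the variable spacings of the pattern, which this legend does not specify.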