Conserved domains on  [gi|30695806|ref|NP_191905|]

transducin family protein / WD-40 repeat family protein [Arabidopsis thaliana]

List of domain hits

Name Accession Description Interval E-value
WD40 super family cl29593
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
86-335 1.49e-27

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


The actual alignment was detected with superfamily member cd00200:

Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 113.97  E-value: 1.49e-27
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   86 LIAGGLvDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSepshfplLK 165
Cdd:cd00200   66 LASGSS-DKTIRLWDL--------ETGECVRTLTGHTSYVSSVAFSP-DGRILSSSSRDKTIKVWDVETGK-------CL 128
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  166 GSGSATQGEISFISWNrKVQQILASTSYNGTTVIWDLRKQKPIINFA---DSVRRrcsvLQWNPNvTTQIMVASDDDssp 242
Cdd:cd00200  129 TTLRGHTDWVNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTghtGEVNS----VAFSPD-GEKLLSSSSDG--- 199
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  243 TLKLWDMRNIMSpVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVI 322
Cdd:cd00200  200 TIKLWDLSTGKC-LGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLA 277
                        250
                 ....*....|...
gi 30695806  323 SAsSFDGKIGIYN 335
Cdd:cd00200  278 SG-SADGTIRIWD 289
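The alignment rows above can be summarized numerically. The following is a minimal Python sketch, not part of the CD-Search output, that counts identical residues over the aligned columns of one pair of rows. It assumes the display convention that lowercase query letters and `-` characters mark unaligned or gapped columns; the function name `aligned_identity` is hypothetical.

```python
def aligned_identity(query: str, subject: str) -> tuple[int, int]:
    """Return (identical, aligned) column counts for one pair of alignment rows."""
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    identical = aligned = 0
    for q, s in zip(query, subject):
        # Assumed convention: lowercase letters and '-' mark columns that are
        # not part of the aligned core, so they are skipped.
        if not (q.isupper() and s.isupper()):
            continue
        aligned += 1
        identical += q == s
    return identical, aligned


# Last block of the cd00200 alignment shown above.
query_row = "SAsSFDGKIGIYN"    # gi 30695806, residues 323-335
subject_row = "SG-SADGTIRIWD"  # cd00200, positions 278-289
identical, aligned = aligned_identity(query_row, subject_row)
print(f"{identical}/{aligned} aligned columns are identical")
```

Summing the two counts over every block of a hit would give an overall identity for that hit, under the same assumption about the display convention.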
Atrophin-1 super family cl38111
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
740-960 6.20e-11

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


The actual alignment was detected with superfamily member pfam03154:

Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 66.71  E-value: 6.20e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    740 ESYAEILASQGLLTTAMKYLKVLDSGGLSPELSILRDRISLSAEPETNTTAsgnTQPQSTMPYNQEPTQAQPNvlANPYD 819
Cdd:pfam03154  155 ESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPS---VPPQGSPATSQPPNQTQST--AAPHT 229
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    820 NQYQQPYTDSYYVPQvSHPPMQQPTmfmphqaQPAPQPSFTPAPTSnaQPSMRTTfVPSTPPALKNADQYQQPTMSSHSF 899
Cdd:pfam03154  230 LIQQTPTLHPQRLPS-PHPPLQPMT-------QPPPPSQVSPQPLP--QPSLHGQ-MPPMPHSLQTGPSHMQHPVPPQPF 298
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695806    900 TGPSNNAY-PVPPGPGQYAPSGPSQLGQYP--NPKMPQVVAPAAGPIGFTPMATPGVAPRSVQP 960
Cdd:pfam03154  299 PLTPQSSQsQVPPGPSPAAPGQSQQRIHTPpsQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTP 362
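Each hit carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value). As a rough illustration, the Python sketch below parses that line and computes how much of the domain model the alignment spans. The names `HitSummary`, `parse_summary` and `model_coverage` are hypothetical, and the regular expression assumes the exact label wording shown on this page.

```python
import re
from typing import NamedTuple

SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?P<multi>\[Multi-domain\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<evalue>\S+)"
)


class HitSummary(NamedTuple):
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    evalue: float


def parse_summary(line: str) -> HitSummary:
    """Parse one 'Pssm-ID: ... E-value: ...' summary line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a summary line: {line!r}")
    return HitSummary(
        pssm_id=int(m["pssm_id"]),
        multi_domain=m["multi"] is not None,
        cd_length=int(m["cd_length"]),
        bit_score=float(m["bit_score"]),
        evalue=float(m["evalue"]),
    )


def model_coverage(first: int, last: int, cd_length: int) -> float:
    """Fraction of the domain model spanned by positions first..last (inclusive)."""
    return (last - first + 1) / cd_length


hit = parse_summary(
    "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 66.71  E-value: 6.20e-11"
)
# The pfam03154 rows above run from model position 155 to 362.
print(hit, model_coverage(155, 362, hit.cd_length))
```

A low coverage value, as for this hit, indicates that only part of the model is matched, which is worth keeping in mind when interpreting hits to proline- and glutamine-rich regions.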
ACE1-Sec16-like super family cl14807
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
566-782 2.01e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein, all of which contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


The actual alignment was detected with superfamily member cd09233:

Pssm-ID: 449359 [Multi-domain]  Cd Length: 314  Bit Score: 60.35  E-value: 2.01e-09
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  566 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSSAPYM--------------KVVSAMVNNDL 630
Cdd:cd09233   69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSRFARSESKLNDplqtlyqlfsgnspEAITELADNPA 147
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  631 RSLIytrSHKFWKETLALLCT-FAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAG--------NVDRTVEIWSRSLan 701
Cdd:cd09233  148 EAEW---ALGNWREHLAIILSnRTSNLDLEALV-ELGDLLAQRGLVEAAHICYLLAGvplgpypsSPSSCLLGGAVHN-- 221
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  702 eRDGRSYAELLQDLMEKTLVLALATGNKKFS-ASLC--KLFesYAEILASQGLLTTAMKYL----KVLDSGGLSP----- 769
Cdd:cd09233  222 -KSPRTFATPEAIQLTEIYEYALSLGNPQFGlPHLQpyKLI--HAARLAELGLVSEALKYCeaiaSSLKSLTKSPyydpn 298
                        250
                 ....*....|....*.
gi 30695806  770 ---ELSILRDRISLSA 782
Cdd:cd09233  299 llaQLQDLSERLSGTS 314
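The three superfamily hits listed above occupy query intervals 86-335, 740-960 and 566-782. A small Python sketch, assuming inclusive residue intervals copied by hand from the hit table, shows how overlapping hits could be flagged; the helper names are hypothetical.

```python
from itertools import combinations

# (name, first residue, last residue, E-value) copied from the concise hit list
hits = [
    ("WD40 super family", 86, 335, 1.49e-27),
    ("Atrophin-1 super family", 740, 960, 6.20e-11),
    ("ACE1-Sec16-like super family", 566, 782, 2.01e-09),
]


def overlap_length(a_first: int, a_last: int, b_first: int, b_last: int) -> int:
    """Number of residues shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a_last, b_last) - max(a_first, b_first) + 1)


def overlapping_pairs(hit_list):
    """Return (name_a, name_b, shared_residues) for every overlapping pair."""
    pairs = []
    for (name_a, a1, a2, _), (name_b, b1, b2, _) in combinations(hit_list, 2):
        shared = overlap_length(a1, a2, b1, b2)
        if shared:
            pairs.append((name_a, name_b, shared))
    return pairs


for name_a, name_b, shared in overlapping_pairs(hits):
    print(f"{name_a} and {name_b} share {shared} query residues")
```

For these intervals the only overlap is between the Atrophin-1 and ACE1-Sec16-like hits, around residues 740-782.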
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
86-335 1.49e-27

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 113.97  E-value: 1.49e-27
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   86 LIAGGLvDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSepshfplLK 165
Cdd:cd00200   66 LASGSS-DKTIRLWDL--------ETGECVRTLTGHTSYVSSVAFSP-DGRILSSSSRDKTIKVWDVETGK-------CL 128
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  166 GSGSATQGEISFISWNrKVQQILASTSYNGTTVIWDLRKQKPIINFA---DSVRRrcsvLQWNPNvTTQIMVASDDDssp 242
Cdd:cd00200  129 TTLRGHTDWVNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTghtGEVNS----VAFSPD-GEKLLSSSSDG--- 199
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  243 TLKLWDMRNIMSpVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVI 322
Cdd:cd00200  200 TIKLWDLSTGKC-LGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLA 277
                        250
                 ....*....|...
gi 30695806  323 SAsSFDGKIGIYN 335
Cdd:cd00200  278 SG-SADGTIRIWD 289
WD40 COG2319
WD40 repeat [General function prediction only];
6-338 2.18e-25

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 110.00  E-value: 2.18e-25
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    6 GVGRSASVALAPDAPYMAAGTMAGAVDLsFSSSANLEIFKLDFQSDdrdlplvgeipsseRFNRLAWGRNGSgseefalg 85
Cdd:COG2319   77 HTAAVLSVAFSPDGRLLASASADGTVRL-WDLATGLLLRTLTGHTG--------------AVRSVAFSPDGK-------- 133
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   86 LIAGGLVDGNIDLWNPLSligsqpseNALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPSHFPllk 165
Cdd:COG2319  134 TLASGSADGTVRLWDLAT--------GKLLRTLTGHSGAVTSVAFSP-DGKLLASGSDDGTVRLWDLATGKLLRTLT--- 201
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  166 gsgsATQGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADSVRRRCSVlQWNPNvtTQIMVASDDDSspTLK 245
Cdd:COG2319  202 ----GHTGAVRSVAFSPD-GKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSV-AFSPD--GRLLASGSADG--TVR 271
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  246 LWDMRNImSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVISAS 325
Cdd:COG2319  272 LWDLATG-ELLRTLTGHSGGVNSVAFSP-DGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGS 349
                        330
                 ....*....|...
gi 30695806  326 SfDGKIGIYNIEG 338
Cdd:COG2319  350 D-DGTVRLWDLAT 361
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
740-960 6.20e-11

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 66.71  E-value: 6.20e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    740 ESYAEILASQGLLTTAMKYLKVLDSGGLSPELSILRDRISLSAEPETNTTAsgnTQPQSTMPYNQEPTQAQPNvlANPYD 819
Cdd:pfam03154  155 ESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPS---VPPQGSPATSQPPNQTQST--AAPHT 229
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    820 NQYQQPYTDSYYVPQvSHPPMQQPTmfmphqaQPAPQPSFTPAPTSnaQPSMRTTfVPSTPPALKNADQYQQPTMSSHSF 899
Cdd:pfam03154  230 LIQQTPTLHPQRLPS-PHPPLQPMT-------QPPPPSQVSPQPLP--QPSLHGQ-MPPMPHSLQTGPSHMQHPVPPQPF 298
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695806    900 TGPSNNAY-PVPPGPGQYAPSGPSQLGQYP--NPKMPQVVAPAAGPIGFTPMATPGVAPRSVQP 960
Cdd:pfam03154  299 PLTPQSSQsQVPPGPSPAAPGQSQQRIHTPpsQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTP 362
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
566-782 2.01e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein, all of which contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 60.35  E-value: 2.01e-09
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  566 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSSAPYM--------------KVVSAMVNNDL 630
Cdd:cd09233   69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSRFARSESKLNDplqtlyqlfsgnspEAITELADNPA 147
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  631 RSLIytrSHKFWKETLALLCT-FAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAG--------NVDRTVEIWSRSLan 701
Cdd:cd09233  148 EAEW---ALGNWREHLAIILSnRTSNLDLEALV-ELGDLLAQRGLVEAAHICYLLAGvplgpypsSPSSCLLGGAVHN-- 221
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  702 eRDGRSYAELLQDLMEKTLVLALATGNKKFS-ASLC--KLFesYAEILASQGLLTTAMKYL----KVLDSGGLSP----- 769
Cdd:cd09233  222 -KSPRTFATPEAIQLTEIYEYALSLGNPQFGlPHLQpyKLI--HAARLAELGLVSEALKYCeaiaSSLKSLTKSPyydpn 298
                        250
                 ....*....|....*.
gi 30695806  770 ---ELSILRDRISLSA 782
Cdd:cd09233  299 llaQLQDLSERLSGTS 314
PTZ00420 PTZ00420
coronin; Provisional
104-225 1.03e-08

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 59.19  E-value: 1.03e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   104 LIGSQPSENAL----VGHLSVHKGPVRGLEFNAISSNLLASGADDGEICIWDLlkPSEPSHFPLLKGSGSATQG---EIS 176
Cdd:PTZ00420   52 LIGAIRLENQMrkppVIKLKGHTSSILDLQFNPCFSEILASGSEDLTIRVWEI--PHNDESVKEIKDPQCILKGhkkKIS 129
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*....
gi 30695806   177 FISWNRKVQQILASTSYNGTTVIWDLRKQKPIinFADSVRRRCSVLQWN 225
Cdd:PTZ00420  130 IIDWNPMNYYIMCSSGFDSFVNIWDIENEKRA--FQINMPKKLSSLKWN 176
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
566-779 1.03e-06

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 51.79  E-value: 1.03e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    566 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSsapyMKVVSAMVNNDLRsLIY--------- 635
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFVRSE----FKGSNNKSGESLA-ALYqvfagnsee 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    636 -----TRSHKF-------WKETLALLCTFAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAGNVD-RTVEIWSRSLANE 702
Cdd:pfam12931   75 avdelVPPSKNalwaldnWRETLALVLSNRSPGDVEALL-ALGDLLAQYGRTEAAHICFLLAGLPLsQTVLLGADHVRFP 153
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    703 RDGRSYAE--LLQDLMEktLVLALATGNKKFSASLC----KLFesYAEILASQGLLTTAMKY-------LKVLDSG---- 765
Cdd:pfam12931  154 STFGNDLEsiLLTEIYE--YALSLSPPQPPFVGLPHllpyKLQ--HAAVLAEYGLVSEAQKYcdaitasLKSLTKKspyy 229
                          250
                   ....*....|....*.
gi 30695806    766 --GLSPELSILRDRIS 779
Cdd:pfam12931  230 hpTLLAQLEDLSNRLS 245
PHA03247 PHA03247
large tegument protein UL36; Provisional
784-1018 1.10e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.40  E-value: 1.10e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   784 PETNTTASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPYTDSYYVPQVSHPPMQQPTMF-MPHQAQPAPQPSfTPA 862
Cdd:PHA03247 2631 PSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGsLTSLADPPPPPP-TPE 2709
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   863 PtsnaQPSMRTTFVPsTPPALKNADQyqqptmsshsfTGPSNNAYPVPPGP--GQYAPSGPSQLGQYPNPKMPQVVAPAA 940
Cdd:PHA03247 2710 P----APHALVSATP-LPPGPAAARQ-----------ASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPAPAPPA 2773
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   941 GPIGFTPMAT--PGVAPRSVQ----PASPPTQQAAAQAAPAPATPPPTVQTADTSNVPAHQKPVIATLTRLFNETSEALG 1014
Cdd:PHA03247 2774 APAAGPPRRLtrPAVASLSESreslPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLG 2853

                  ....
gi 30695806  1015 GARA 1018
Cdd:PHA03247 2854 GSVA 2857
YppG COG5894
Spore coat protein YppG [Cell cycle control, cell division, chromosome partitioning];
793-859 1.85e-05

Spore coat protein YppG [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 444596 [Multi-domain]  Cd Length: 112  Bit Score: 44.85  E-value: 1.85e-05
                         10        20        30        40        50        60        70
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 30695806  793 NTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPY---TDSYYvPQvsHPPmqQPTMFM---PHQAQPAPQPSF 859
Cdd:COG5894    2 HWYQRNMNMYHYARPALRPEQPYGPYQNQHQQPYyqqTNTQQ-PF--PPP--SPTPYPspkPLQTQPSQFQSL 69
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
114-151 3.02e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 41.91  E-value: 3.02e-05
                            10        20        30
                    ....*....|....*....|....*....|....*...
gi 30695806     114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD 151
Cdd:smart00320    4 LLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
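The WD40 descriptions on this page mention a GH dipeptide near the start of a repeat and a WD dipeptide at its end. The Python sketch below is a crude heuristic scan for that pattern, not the profile-based method CD-Search uses. The function name is hypothetical, and the default gap bounds (20 to 35 residues between GH and WD) are illustrative assumptions chosen so that the rows above are found; they are not taken from the domain model.

```python
import re


def gh_wd_candidates(sequence: str, min_gap: int = 20, max_gap: int = 35):
    """Return (start, end) 0-based half-open spans of GH...WD candidates.

    Gaps ('-') are removed and lowercase letters are upper-cased first, so
    alignment rows can be passed in directly.
    """
    seq = sequence.upper().replace("-", "")
    # Lookahead allows overlapping candidates; the lazy quantifier picks the
    # nearest WD that satisfies the gap bounds for each GH.
    pattern = re.compile(rf"(?=(GH.{{{min_gap},{max_gap}}}?WD))")
    return [(m.start(1), m.end(1)) for m in pattern.finditer(seq)]


query_row = "LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD"    # gi 30695806, 114-151
subject_row = "LLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD"  # smart00320, 4-40
print(gh_wd_candidates(query_row))
print(gh_wd_candidates(subject_row))
```

Many real WD40 repeats lack a strict GH or WD dipeptide, so a scan like this will miss repeats that the profile search detects; it is only useful as a quick sanity check.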
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
795-863 6.70e-05

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 44.40  E-value: 6.70e-05
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695806     795 QPQSTMPYNQE--PTQAQPNVLANPYDNQYQQPYTDsyyvPQVSHPPMQQPT---MFMPHQAQPAPQPSFTPAP 863
Cdd:smart00818   71 QPLMPVPGQHSmtPTQHHQPNLPQPAQQPFQPQPLQ----PPQPQQPMQPQPpvhPIPPLPPQPPLPPMFPMQP 140
WD40 pfam00400
WD domain, G-beta repeat;
114-151 9.66e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 40.79  E-value: 9.66e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 30695806    114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD 151
Cdd:pfam00400    3 LLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
790-942 2.09e-04

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 45.02  E-value: 2.09e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  790 ASGNTQP-QST-MPYNQEPTQAQPNVLANPYDNQYQQPytdsyyVPQVSHppMQQPTMFM--PHQAQpapQPSFTPAPTS 865
Cdd:cd22553  188 AGGGNQAlQAQvIPQLAQAAQLQPQQLAQVSSQGYIQQ------IPANAS--QQQPQMVQqgPNQSG---QIIGQVASAS 256
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  866 NAQPSMRTTFVPSTPPALKNADQYQQ----PTMS----SHSFTGPSNNAYPV----------PPGPGQYAPSGPSQLGQY 927
Cdd:cd22553  257 SIQAAAIPLTVYTGALAGQNGSNQQQvgqiVTSPiqgmTQGLTAPASSSIPTvvqqqaiqgnPLPPGTQIIAAGQQLQQD 336
                        170
                 ....*....|....*....
gi 30695806  928 PN-PKMPQVVA---PAAGP 942
Cdd:cd22553  337 PNdPTKWQVVAdgtPGSKK 355
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
85-337 7.36e-27

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 111.66  E-value: 7.36e-27
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   85 GLIAGGLVDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPSHFpll 164
Cdd:cd00200   22 KLLATGSGDGTIKVWDL--------ETGELLRTLKGHTGPVRDVAASA-DGTYLASGSSDKTIRLWDLETGECVRTL--- 89
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  165 KGSgsatQGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADsvrRRCSVLQWNPNVTTQIMVASDDDSspTL 244
Cdd:cd00200   90 TGH----TSYVSSVAFSPD-GRILSSSSRDKTIKVWDVETGKCLTTLRG---HTDWVNSVAFSPDGTFVASSSQDG--TI 159
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  245 KLWDMRNiMSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKiPGVISA 324
Cdd:cd00200  160 KLWDLRT-GKCVATLTGHTGEVNSVAFSP-DGEKLLSSSSDGTIKLWDLSTGKCLGTLRGHENGVNSVAFSPD-GYLLAS 236
                        250
                 ....*....|...
gi 30695806  325 SSFDGKIGIYNIE 337
Cdd:cd00200  237 GSEDGTIRVWDLR 249
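Each hit on this page carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The sketch below shows one way such a line could be parsed into typed fields. The function name `parse_hit_summary`, the field names, and the regular expression are illustrative assumptions based only on the lines shown here; they are not part of any NCBI tool or API.

```python
import re

# Hypothetical pattern, written against the summary lines visible on this page.
# The bracketed tag (e.g. "[Multi-domain]") is treated as optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]*)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[0-9.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)


def parse_hit_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "flag": m.group("flag"),  # None when no bracketed tag is present
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


if __name__ == "__main__":
    line = "Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 111.66  E-value: 7.36e-27"
    print(parse_hit_summary(line))
```

Parsed this way, hits can be sorted or filtered by E-value without relying on the page order.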
WD40 COG2319
WD40 repeat [General function prediction only];
6-338 2.18e-25

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 110.00  E-value: 2.18e-25
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    6 GVGRSASVALAPDAPYMAAGTMAGAVDLsFSSSANLEIFKLDFQSDdrdlplvgeipsseRFNRLAWGRNGSgseefalg 85
Cdd:COG2319   77 HTAAVLSVAFSPDGRLLASASADGTVRL-WDLATGLLLRTLTGHTG--------------AVRSVAFSPDGK-------- 133
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   86 LIAGGLVDGNIDLWNPLSligsqpseNALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPSHFPllk 165
Cdd:COG2319  134 TLASGSADGTVRLWDLAT--------GKLLRTLTGHSGAVTSVAFSP-DGKLLASGSDDGTVRLWDLATGKLLRTLT--- 201
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  166 gsgsATQGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADSVRRRCSVlQWNPNvtTQIMVASDDDSspTLK 245
Cdd:COG2319  202 ----GHTGAVRSVAFSPD-GKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSV-AFSPD--GRLLASGSADG--TVR 271
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  246 LWDMRNImSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVISAS 325
Cdd:COG2319  272 LWDLATG-ELLRTLTGHSGGVNSVAFSP-DGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGS 349
                        330
                 ....*....|...
gi 30695806  326 SfDGKIGIYNIEG 338
Cdd:COG2319  350 D-DGTVRLWDLAT 361
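The paired query and Cdd rows in these alignments can be compared column by column. The sketch below counts identical residues over aligned columns only. It assumes the usual CD-Search display convention that lowercase letters mark unaligned residues and "-" marks a gap, so a column counts only when both rows hold an uppercase letter. The function name `aligned_identity` is hypothetical, and the coordinates and labels are expected to have been stripped from the rows beforehand.

```python
def aligned_identity(query_row, subject_row):
    """Return (identical, aligned) column counts for two equal-length alignment rows.

    Assumption: lowercase letters are unaligned residues and '-' is a gap,
    so a column is counted only if both characters are uppercase letters.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


if __name__ == "__main__":
    # Final alignment block of the hit above (query 326-338, COG2319 350-361).
    identical, aligned = aligned_identity("SfDGKIGIYNIEG", "D-DGTVRLWDLAT")
    print(identical, aligned)
```

Applying the function to each block of a hit and summing the counts gives a whole-alignment figure. It is a raw identity count, not the score CD-Search reports.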
WD40 COG2319
WD40 repeat [General function prediction only];
12-338 1.57e-24

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 107.30  E-value: 1.57e-24
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   12 SVALAPDAPYMAAGtmagavdlsfSSSANLEIFkldfqsDDRDLPLVGEIPS-SERFNRLAWGRNGSgseefalgLIAGG 90
Cdd:COG2319  125 SVAFSPDGKTLASG----------SADGTVRLW------DLATGKLLRTLTGhSGAVTSVAFSPDGK--------LLASG 180
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   91 LVDGNIDLWNPLSligsqpseNALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPShfpLLKGSGsa 170
Cdd:COG2319  181 SDDGTVRLWDLAT--------GKLLRTLTGHTGAVRSVAFSP-DGKLLASGSADGTVRLWDLATGKLLR---TLTGHS-- 246
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  171 tqGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADSVRRRCSVlQWNPNvTTQIMVASDDDsspTLKLWDMR 250
Cdd:COG2319  247 --GSVRSVAFSPD-GRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSV-AFSPD-GKLLASGSDDG---TVRLWDLA 318
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  251 NImSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVISAsSFDGK 330
Cdd:COG2319  319 TG-KLLRTLTGHTGAVRSVAFSP-DGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSPDGRTLASG-SADGT 395

                 ....*...
gi 30695806  331 IGIYNIEG 338
Cdd:COG2319  396 VRLWDLAT 403
WD40 COG2319
WD40 repeat [General function prediction only];
50-338 7.21e-18

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 87.27  E-value: 7.21e-18
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   50 SDDRDLPLVGEIPSSERFNRLAWGRNGSGSEEFALGLIAGGLVDGNIDLWNPLSLigsqpsenALVGHLSVHKGPVRGLE 129
Cdd:COG2319   14 ADLALALLAAALGALLLLLLGLAAAVASLAASPDGARLAAGAGDLTLLLLDAAAG--------ALLATLLGHTAAVLSVA 85
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  130 FNAiSSNLLASGADDGEICIWDLLKPSEPSHFpllkgsgSATQGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPII 209
Cdd:COG2319   86 FSP-DGRLLASASADGTVRLWDLATGLLLRTL-------TGHTGAVRSVAFSPD-GKTLASGSADGTVRLWDLATGKLLR 156
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  210 NF---ADSVRRrcsvLQWNPNvtTQIMVASDDDSspTLKLWDMRNImSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDN 286
Cdd:COG2319  157 TLtghSGAVTS----VAFSPD--GKLLASGSDDG--TVRLWDLATG-KLLRTLTGHTGAVRSVAFSP-DGKLLASGSADG 226
                        250       260       270       280       290
                 ....*....|....*....|....*....|....*....|....*....|...
gi 30695806  287 RTICWDTNTAEIVAELPAGNNWNFDVHWYPKipG-VISASSFDGKIGIYNIEG 338
Cdd:COG2319  227 TVRLWDLATGKLLRTLTGHSGSVRSVAFSPD--GrLLASGSADGTVRLWDLAT 277
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
114-340 7.40e-17

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 82.38  E-value: 7.40e-17
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPSHFPLLKGSGSAtqgeISFISWNRKvqqiLASTSY 193
Cdd:cd00200    1 LRRTLKGHTGGVTCVAFSP-DGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRD----VAASADGTY----LASGSS 71
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  194 NGTTVIWDLRKQKPIinfadsvrrrcsvlqwnpnvttqimvasdddssptlklwdmrnimspvREFTGHQRGVIAMEWCP 273
Cdd:cd00200   72 DKTIRLWDLETGECV------------------------------------------------RTLTGHTSYVSSVAFSP 103
                        170       180       190       200       210       220
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 30695806  274 sDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKiPGVISASSFDGKIGIYNIEGCS 340
Cdd:cd00200  104 -DGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSPD-GTFVASSSQDGTIKLWDLRTGK 168
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
740-960 6.20e-11

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 66.71  E-value: 6.20e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    740 ESYAEILASQGLLTTAMKYLKVLDSGGLSPELSILRDRISLSAEPETNTTAsgnTQPQSTMPYNQEPTQAQPNvlANPYD 819
Cdd:pfam03154  155 ESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPS---VPPQGSPATSQPPNQTQST--AAPHT 229
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    820 NQYQQPYTDSYYVPQvSHPPMQQPTmfmphqaQPAPQPSFTPAPTSnaQPSMRTTfVPSTPPALKNADQYQQPTMSSHSF 899
Cdd:pfam03154  230 LIQQTPTLHPQRLPS-PHPPLQPMT-------QPPPPSQVSPQPLP--QPSLHGQ-MPPMPHSLQTGPSHMQHPVPPQPF 298
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695806    900 TGPSNNAY-PVPPGPGQYAPSGPSQLGQYP--NPKMPQVVAPAAGPIGFTPMATPGVAPRSVQP 960
Cdd:pfam03154  299 PLTPQSSQsQVPPGPSPAAPGQSQQRIHTPpsQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTP 362
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
566-782 2.01e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 60.35  E-value: 2.01e-09
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  566 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSSAPYM--------------KVVSAMVNNDL 630
Cdd:cd09233   69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSRFARSESKLNDplqtlyqlfsgnspEAITELADNPA 147
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  631 RSLIytrSHKFWKETLALLCT-FAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAG--------NVDRTVEIWSRSLan 701
Cdd:cd09233  148 EAEW---ALGNWREHLAIILSnRTSNLDLEALV-ELGDLLAQRGLVEAAHICYLLAGvplgpypsSPSSCLLGGAVHN-- 221
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  702 eRDGRSYAELLQDLMEKTLVLALATGNKKFS-ASLC--KLFesYAEILASQGLLTTAMKYL----KVLDSGGLSP----- 769
Cdd:cd09233  222 -KSPRTFATPEAIQLTEIYEYALSLGNPQFGlPHLQpyKLI--HAARLAELGLVSEALKYCeaiaSSLKSLTKSPyydpn 298
                        250
                 ....*....|....*.
gi 30695806  770 ---ELSILRDRISLSA 782
Cdd:cd09233  299 llaQLQDLSERLSGTS 314
PTZ00420 PTZ00420
coronin; Provisional
104-225 1.03e-08

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 59.19  E-value: 1.03e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   104 LIGSQPSENAL----VGHLSVHKGPVRGLEFNAISSNLLASGADDGEICIWDLlkPSEPSHFPLLKGSGSATQG---EIS 176
Cdd:PTZ00420   52 LIGAIRLENQMrkppVIKLKGHTSSILDLQFNPCFSEILASGSEDLTIRVWEI--PHNDESVKEIKDPQCILKGhkkKIS 129
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*....
gi 30695806   177 FISWNRKVQQILASTSYNGTTVIWDLRKQKPIinFADSVRRRCSVLQWN 225
Cdd:PTZ00420  130 IIDWNPMNYYIMCSSGFDSFVNIWDIENEKRA--FQINMPKKLSSLKWN 176
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
50-262 1.89e-07

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 55.48  E-value: 1.89e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    50 SDDRDL--PLVgEIPSSERFNRLAWgrngsgsEEFALGLIAGGLVDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRG 127
Cdd:PLN00181  517 KDGRDIhyPVV-ELASRSKLSGICW-------NSYIKSQVASSNFEGVVQVWDV--------ARSQLVTEMKEHEKRVWS 580
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   128 LEFNAISSNLLASGADDGEICIWDLlkpSEPSHFPLLKgsgsaTQGEISFISWNRKVQQILASTSYNGTTVIWDLRKQKP 207
Cdd:PLN00181  581 IDYSSADPTLLASGSDDGSVKLWSI---NQGVSIGTIK-----TKANICCVQFPSESGRSLAFGSADHKVYYYDLRNPKL 652
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   208 IINFADSVRRRCSVLQWnpnVTTQIMVASDDDSspTLKLWDMRNIMS-----PVREFTGH 262
Cdd:PLN00181  653 PLCTMIGHSKTVSYVRF---VDSSTLVSSSTDN--TLKLWDLSMSISginetPLHSFMGH 707
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
751-934 2.12e-07

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 51.96  E-value: 2.12e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    751 LLTTAmkyLKVLDSGGLSPELSILRDRISLSAEPETNTTASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPytdsy 830
Cdd:pfam15240    5 LLTVA---LLALSSAQSSSEDVSQEDSPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPGGPQQP----- 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    831 yVPQVSHPPMQQPtmfmPHQAQPAPQPSFTPAPTSNAQPSmrttfvPSTPPALKNADQYQQPTMSSHSFTGPSNNAYPVP 910
Cdd:pfam15240   77 -PPQGGKQKPQGP----PPQGGPRPPPGKPQGPPPQGGNQ------QQGPPPPGKPQGPPPQGGGPPPQGGNQQGPPPPP 145
                          170       180
                   ....*....|....*....|....
gi 30695806    911 PGPGQYAPSGPSQLGQYPNPkmPQ 934
Cdd:pfam15240  146 PGNPQGPPQRPPQPGNPQGP--PQ 167
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
566-779 1.03e-06

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 51.79  E-value: 1.03e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    566 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSsapyMKVVSAMVNNDLRsLIY--------- 635
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFVRSE----FKGSNNKSGESLA-ALYqvfagnsee 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    636 -----TRSHKF-------WKETLALLCTFAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAGNVD-RTVEIWSRSLANE 702
Cdd:pfam12931   75 avdelVPPSKNalwaldnWRETLALVLSNRSPGDVEALL-ALGDLLAQYGRTEAAHICFLLAGLPLsQTVLLGADHVRFP 153
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    703 RDGRSYAE--LLQDLMEktLVLALATGNKKFSASLC----KLFesYAEILASQGLLTTAMKY-------LKVLDSG---- 765
Cdd:pfam12931  154 STFGNDLEsiLLTEIYE--YALSLSPPQPPFVGLPHllpyKLQ--HAAVLAEYGLVSEAQKYcdaitasLKSLTKKspyy 229
                          250
                   ....*....|....*.
gi 30695806    766 --GLSPELSILRDRIS 779
Cdd:pfam12931  230 hpTLLAQLEDLSNRLS 245
PHA03247 PHA03247
large tegument protein UL36; Provisional
784-1018 1.10e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.40  E-value: 1.10e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   784 PETNTTASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPYTDSYYVPQVSHPPMQQPTMF-MPHQAQPAPQPSfTPA 862
Cdd:PHA03247 2631 PSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGsLTSLADPPPPPP-TPE 2709
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   863 PtsnaQPSMRTTFVPsTPPALKNADQyqqptmsshsfTGPSNNAYPVPPGP--GQYAPSGPSQLGQYPNPKMPQVVAPAA 940
Cdd:PHA03247 2710 P----APHALVSATP-LPPGPAAARQ-----------ASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPAPAPPA 2773
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   941 GPIGFTPMAT--PGVAPRSVQ----PASPPTQQAAAQAAPAPATPPPTVQTADTSNVPAHQKPVIATLTRLFNETSEALG 1014
Cdd:PHA03247 2774 APAAGPPRRLtrPAVASLSESreslPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLG 2853

                  ....
gi 30695806  1015 GARA 1018
Cdd:PHA03247 2854 GSVA 2857
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
781-1003 2.05e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 52.08  E-value: 2.05e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    781 SAEPETNTTASGNTQP-----QSTMPYNQEPTQAQPNVLAnPYDNQYQQPYTDSyyvPQVSHPPMQQPTMFMPHQAQPAP 855
Cdd:pfam03154  262 SPQPLPQPSLHGQMPPmphslQTGPSHMQHPVPPQPFPLT-PQSSQSQVPPGPS---PAAPGQSQQRIHTPPSQSQLQSQ 337
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    856 QPS----FTPAPTSnaQPSMRTTfvPSTP-PALKNADQYQQPTMSS--HSFTGPSNnaYPVPPG-------PGQYAPSG- 920
Cdd:pfam03154  338 QPPreqpLPPAPLS--MPHIKPP--PTTPiPQLPNPQSHKHPPHLSgpSPFQMNSN--LPPPPAlkplsslSTHHPPSAh 411
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    921 --PSQLgqypNPKMPQVVAPAAGPIGFTPMAT---------PGVAPRSVQPASPPTQQAAAQAAPAPATPPPTVQTADTS 989
Cdd:pfam03154  412 ppPLQL----MPQSQQLPPPPAQPPVLTQSQSlpppaashpPTSGLHQVPSQSPFPQHPFVPGGPPPITPPSGPPTSTSS 487
                          250
                   ....*....|....
gi 30695806    990 NVPAHQKPVIATLT 1003
Cdd:pfam03154  488 AMPGIQPPSSASVS 501
PRK10263 PRK10263
DNA translocase FtsK; Provisional
795-955 2.77e-06

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 51.62  E-value: 2.77e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   795 QPQStmpYNQEPTQAQPNVlanPYDNQYQQPYTDSYYVPQVSHPPMQQPTMFMPHQAQPAPQPSFTPAPTsnaQPSMRTT 874
Cdd:PRK10263  376 APEG---YPQQSQYAQPAV---QYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPE---QPVAGNA 446
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   875 FVPSTP-------PALKNADQYQQPTMSSHSFTGPSNNAYP--VPPGPG--QYAPSGP----------------SQLGQY 927
Cdd:PRK10263  447 WQAEEQqstfapqSTYQTEQTYQQPAAQEPLYQQPQPVEQQpvVEPEPVveETKPARPplyyfeeveekrarerEQLAAW 526
                         170       180       190
                  ....*....|....*....|....*....|...
gi 30695806   928 ----PNP-KMPQVVAPAAGPigFTPMATPGVAP 955
Cdd:PRK10263  527 yqpiPEPvKEPEPIKSSLKA--PSVAAVPPVEA 557
DUF3824 pfam12868
Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It ...
817-920 6.24e-06

Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known.


Pssm-ID: 372351 [Multi-domain]  Cd Length: 145  Bit Score: 47.04  E-value: 6.24e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    817 PYDNQYQQPYTDSYYVPQVSHPPMQqptmfmPHQAQPAPQP---SFTPAPTSNAQPSmrTTFVPSTPPALKNADQYQQPT 893
Cdd:pfam12868   43 RYEDDYRDYYEDPYSPSPYPPSPAG------PYASQGQYYPetnYFPPPPGSTPQPP--VDPQPNAPPPPYNPADYPPPP 114
                           90       100
                   ....*....|....*....|....*..
gi 30695806    894 MSSHSftgPSNNAYPVPPGPGQYAPSG 920
Cdd:pfam12868  115 GAAPP---PQPYQYPPPPGPDPYAPRP 138
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
783-1008 1.16e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 49.65  E-value: 1.16e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    783 EPETNTTASGNTQPQSTMP-YNQEPTQAQPNvlANPYDNQYQqPYTDSYYVP--QVSH-----PPMQQPtmFMPHQAQPA 854
Cdd:pfam09770  106 QPAARAAQSSAQPPASSLPqYQYASQQSQQP--SKPVRTGYE-KYKEPEPIPdlQVDAslwgvAPKKAA--APAPAPQPA 180
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    855 PQPSFTPAPTSN------------AQPSMRTTFVPS---TPPALKNADQYQQPTMSSHSFTGPSNNAYPVPPGPGQYAPS 919
Cdd:pfam09770  181 AQPASLPAPSRKmmsleeveaamrAQAKKPAQQPAPapaQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQG 260
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    920 GPSQLGQYPNPKMPQVVAPAAGPIGFTpmATPGVAPRSVQPAS--------PPTQQAAAQAAPAPATPPPTVQTADTSNV 991
Cdd:pfam09770  261 HPVTILQRPQSPQPDPAQPSIQPQAQQ--FHQQPPPVPVQPTQilqnpnrlSAARVGYPQNPQPGVQPAPAHQAHRQQGS 338
                          250       260
                   ....*....|....*....|
gi 30695806    992 PAHQKPVIAT---LTRLFNE 1008
Cdd:pfam09770  339 FGRQAPIITHpqqLAQLSEE 358
PRK10263 PRK10263
DNA translocase FtsK; Provisional
781-909 1.39e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 49.31  E-value: 1.39e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   781 SAEPETNTTASGNTQPQSTMPYNQEPTQ---AQPNVLANPyDNQYQQPYTDSYYVPQVSHPpmQQPTMFMPHQAQP---- 853
Cdd:PRK10263  748 IVEPVQQPQQPVAPQQQYQQPQQPVAPQpqyQQPQQPVAP-QPQYQQPQQPVAPQPQYQQP--QQPVAPQPQYQQPqqpv 824
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 30695806   854 APQPSFTpAPTSNAQPSMRTTFVpsTPPALKNADQ--YQQPTMSSHS---FTGPSNNAYPV 909
Cdd:PRK10263  825 APQPQYQ-QPQQPVAPQPQDTLL--HPLLMRNGDSrpLHKPTTPLPSldlLTPPPSEVEPV 882
YppG COG5894
Spore coat protein YppG [Cell cycle control, cell division, chromosome partitioning];
793-859 1.85e-05

Spore coat protein YppG [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 444596 [Multi-domain]  Cd Length: 112  Bit Score: 44.85  E-value: 1.85e-05
                         10        20        30        40        50        60        70
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 30695806  793 NTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPY---TDSYYvPQvsHPPmqQPTMFM---PHQAQPAPQPSF 859
Cdd:COG5894    2 HWYQRNMNMYHYARPALRPEQPYGPYQNQHQQPYyqqTNTQQ-PF--PPP--SPTPYPspkPLQTQPSQFQSL 69
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
114-151 3.02e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 41.91  E-value: 3.02e-05
                            10        20        30
                    ....*....|....*....|....*....|....*...
gi 30695806     114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD 151
Cdd:smart00320    4 LLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
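The cd00200 description earlier in this list characterizes a WD40 repeat as a conserved core bracketed by a GH dipeptide and a WD dipeptide. The sketch below turns that description into a crude pattern scan. It is a heuristic illustration, not the profile-based method CD-Search uses. The function name `find_gh_wd_spans` and the default gap bounds (20 to 40 residues between GH and WD) are assumptions chosen for this example, not values taken from the record.

```python
import re


def find_gh_wd_spans(seq, min_gap=20, max_gap=40):
    """Return 0-based (start, end) spans that run from a GH dipeptide to the
    nearest WD dipeptide lying min_gap..max_gap residues downstream.

    The gap bounds are assumed, tunable parameters. Real WD40 repeats often
    lack one or both dipeptides, so this scan will miss many true repeats.
    """
    # A lookahead keeps the scan from consuming the gap, so every GH is tested.
    pattern = re.compile(r"GH(?=(.{%d,%d}?)WD)" % (min_gap, max_gap))
    spans = []
    for m in pattern.finditer(seq.upper()):
        gap = len(m.group(1))
        spans.append((m.start(), m.end() + gap + 2))  # +2 covers the WD itself
    return spans


if __name__ == "__main__":
    # Query residues 114-151 from the smart00320 hit above, with the
    # lowercase (unaligned) residue written in uppercase.
    fragment = "LVGHLSVHKGPVRGLEFNAISSNLLASGADDGEICIWD"
    for start, end in find_gh_wd_spans(fragment):
        print(start, end, fragment[start:end])
```

Adding 114 to the reported start converts it to the query numbering used in the alignment. Many of the repeats aligned on this page would not be found by this scan, which is why profile searches are used instead.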
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
795-983 3.46e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 48.22  E-value: 3.46e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    795 QPQSTMPYNQEPTQAQPNV-------LANPYDNQYQQPYTDSyyVPQVSHPPMQQP----TMFMPHQAQPAPQP-SFTPA 862
Cdd:pfam03154  291 HPVPPQPFPLTPQSSQSQVppgpspaAPGQSQQRIHTPPSQS--QLQSQQPPREQPlppaPLSMPHIKPPPTTPiPQLPN 368
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    863 PTSNAQP-------------------------SMRTTFVPST-PPALKNADQYQ-------QPTMSSHSFTGPSNNAYPV 909
Cdd:pfam03154  369 PQSHKHPphlsgpspfqmnsnlppppalkplsSLSTHHPPSAhPPPLQLMPQSQqlppppaQPPVLTQSQSLPPPAASHP 448
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 30695806    910 PPGPGQYAPSGPSqLGQYP-NPKMPQVVAPAAGPIGFTPMATPGVAPRSVQPASPPTQQAAAqaapapatPPPTV 983
Cdd:pfam03154  449 PTSGLHQVPSQSP-FPQHPfVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAA--------VSCPL 514
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
795-863 6.70e-05

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 44.40  E-value: 6.70e-05
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695806     795 QPQSTMPYNQE--PTQAQPNVLANPYDNQYQQPYTDsyyvPQVSHPPMQQPT---MFMPHQAQPAPQPSFTPAP 863
Cdd:smart00818   71 QPLMPVPGQHSmtPTQHHQPNLPQPAQQPFQPQPLQ----PPQPQQPMQPQPpvhPIPPLPPQPPLPPMFPMQP 140
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
253-292 7.99e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 40.76  E-value: 7.99e-05
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 30695806     253 MSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWD 292
Cdd:smart00320    2 GELLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
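
Each alignment on this page pairs a query row (gi 30695806) with a Cdd model row. As a minimal sketch in Python, the identical columns of one such pair can be tallied as follows. It assumes that dashes are gaps and that lowercase letters mark unaligned residues; the function name `count_identities` is hypothetical and not part of any NCBI tool.

```python
def count_identities(query_row, model_row):
    """Return (identical, aligned) column counts for two equal-length alignment rows."""
    if len(query_row) != len(model_row):
        raise ValueError("alignment rows must have the same length")
    identical = aligned = 0
    for q, m in zip(query_row, model_row):
        # Assumption: '-' is a gap and lowercase letters are unaligned residues,
        # so neither kind of column counts as aligned.
        if q == "-" or m == "-" or q.islower() or m.islower():
            continue
        aligned += 1
        if q == m:
            identical += 1
    return identical, aligned


# Rows copied from the smart00320 hit above (query residues 253-292).
query = "MSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWD"
model = "GELLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD"
identical, aligned = count_identities(query, model)
print(f"{identical} of {aligned} aligned columns are identical")
```

Identity alone says little for short repeats like this one; the E-value and bit score reported with each hit remain the primary measures of significance.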
WD40 pfam00400
WD domain, G-beta repeat;
114-151 9.66e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 40.79  E-value: 9.66e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 30695806    114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD 151
Cdd:pfam00400    3 LLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
PRK13335 PRK13335
superantigen-like protein SSL3; Reviewed;
777-892 1.28e-04

superantigen-like protein SSL3; Reviewed;


Pssm-ID: 139494 [Multi-domain]  Cd Length: 356  Bit Score: 45.50  E-value: 1.28e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   777 RISLSAEPETNTTASGNTQPQSTMPYNQEPTQAQPNVLANPYDnQYQQPYTDSYYVPQVSHPPmqQPTMFMPhQAQP--- 853
Cdd:PRK13335   51 MINITAGANSATTQAANTRQERTPKLEKAPNTNEEKTSASKIE-KISQPKQEEQKSLNISATP--APKQEQS-QTTTest 126
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|
gi 30695806   854 APQPSFTPAPTSNA-QPSMRTTFVPSTPPALKNADQYQQP 892
Cdd:PRK13335  127 TPKTKVTTPPSTNTpQPMQSTKSDTPQSPTIKQAQTDMTP 166
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
780-952 1.63e-04

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 45.77  E-value: 1.63e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    780 LSAEPETNTTASGNTQPQSTMPYNQEPT-QAQPNVLANPYDNQYQQPYTDSYYVPQ------VSHPPMQQPTMFMphqaQ 852
Cdd:pfam09606  298 MSIGDQNNYQQQQTRQQQQQQGGNHPAAhQQQMNQSVGQGGQVVALGGLNHLETWNpgnfggLGANPMQRGQPGM----M 373
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    853 PAPQPsftpaptsnaqpsmrttfVPSTPPALKNADQYQQPTMSSHsftgpsnNAYPVPPG--PGQYAPSGPS---QLGQY 927
Cdd:pfam09606  374 SSPSP------------------VPGQQVRQVTPNQFMRQSPQPS-------VPSPQGPGsqPPQSHPGGMIpspALIPS 428
                          170       180       190
                   ....*....|....*....|....*....|
gi 30695806    928 PNPKMPQVVA-----PAAGPIGftPMATPG 952
Cdd:pfam09606  429 PSPQMSQQPAqqrtiGQDSPGG--SLNTPG 456
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
790-942 2.09e-04

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
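
The consensus NNRCRCCYY quoted above is written in IUPAC ambiguity codes. A minimal Python sketch of one way to expand such a consensus into a regular expression and scan a DNA string with it is given here. The helper `motif_to_regex` and the test sequence are hypothetical illustrations, only one strand is scanned, and nothing here comes from the CDD record itself.

```python
import re

# IUPAC nucleotide codes: the plain bases plus the three ambiguity
# codes that appear in the consensus (R = A/G, Y = C/T, N = any base).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


pattern = motif_to_regex("NNRCRCCYY")

# Hypothetical sequence, not taken from any real promoter.
sequence = "TTAAGCGCCCTGG"
for match in pattern.finditer(sequence):
    print(match.start(), match.group())
```

A fuller scan would also test the reverse complement, since a binding site can lie on either strand.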


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 45.02  E-value: 2.09e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  790 ASGNTQP-QST-MPYNQEPTQAQPNVLANPYDNQYQQPytdsyyVPQVSHppMQQPTMFM--PHQAQpapQPSFTPAPTS 865
Cdd:cd22553  188 AGGGNQAlQAQvIPQLAQAAQLQPQQLAQVSSQGYIQQ------IPANAS--QQQPQMVQqgPNQSG---QIIGQVASAS 256
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806  866 NAQPSMRTTFVPSTPPALKNADQYQQ----PTMS----SHSFTGPSNNAYPV----------PPGPGQYAPSGPSQLGQY 927
Cdd:cd22553  257 SIQAAAIPLTVYTGALAGQNGSNQQQvgqiVTSPiqgmTQGLTAPASSSIPTvvqqqaiqgnPLPPGTQIIAAGQQLQQD 336
                        170
                 ....*....|....*....
gi 30695806  928 PN-PKMPQVVA---PAAGP 942
Cdd:cd22553  337 PNdPTKWQVVAdgtPGSKK 355
PHA02682 PHA02682
ORF080 virion core protein; Provisional
786-1056 3.70e-04

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 43.70  E-value: 3.70e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   786 TNTTASGNTQ-PQSTMPYNQEPtqAQPNVLANPYDnQYQQPYTDSYYVPQVSHPP--MQQPT---MFMPHQAQPAPQPSF 859
Cdd:PHA02682   19 ADTSSSLFTKcPQATIPAPAAP--CPPDADVDPLD-KYSVKEAGRYYQSRLKANSacMQRPSgqsPLAPSPACAAPAPAC 95
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   860 -TPAPTSNAQPSMRTTFVPSTPPALknadqyqqptmsshSFTGPSNNAYPVPPGPGQYAPsgPSQLGQYPNPKMPQVV-A 937
Cdd:PHA02682   96 pACAPAAPAPAVTCPAPAPACPPAT--------------APTCPPPAVCPAPARPAPACP--PSTRQCPPAPPLPTPKpA 159
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   938 PAAGPIGFTPMATPGVAPRSVQPaspptqqaaaqaapapatpppTVQTAdtsnvPAhQKPVIAtlTRLFNETSEALGGAR 1017
Cdd:PHA02682  160 PAAKPIFLHNQLPPPDYPAASCP---------------------TIETA-----PA-ASPVLE--PRIPDKIIDADNDDK 210
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*.
gi 30695806  1018 aNTTKKR--EIEDNSRKLGALFVKL-----NSGDISKNAADKLAQL 1056
Cdd:PHA02682  211 -DLIKKElaDIADSVRDLNAESLSLtrdieNAKSTTQAAIDDLRRL 255
WD40 pfam00400
WD domain, G-beta repeat;
253-292 3.79e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 38.87  E-value: 3.79e-04
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 30695806    253 MSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWD 292
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
PHA03247 PHA03247
large tegument protein UL36; Provisional
781-963 4.19e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.93  E-value: 4.19e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   781 SAEPETNTTASGNTQPQSTMPYNQEPTQAQP-------NVLANPYDNQYQQPYTDSYYVPQV-SHPP---MQQPTMFMPH 849
Cdd:PHA03247 2816 AALPPAASPAGPLPPPTSAQPTAPPPPPGPPppslplgGSVAPGGDVRRRPPSRSPAAKPAApARPPvrrLARPAVSRST 2895
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   850 QAQPAPQPSFTPAPTSNAQPSMRTTfvPSTPPALKNADQYQQPTMSSHsftgpsnnayPVPPGPGQYAPSGPSqlGQYPN 929
Cdd:PHA03247 2896 ESFALPPDQPERPPQPQAPPPPQPQ--PQPPPPPQPQPPPPPPPRPQP----------PLAPTTDPAGAGEPS--GAVPQ 2961
                         170       180       190
                  ....*....|....*....|....*....|....
gi 30695806   930 PKMPQVVApaaGPIGFTPMATPGVAPRSVQPASP 963
Cdd:PHA03247 2962 PWLGALVP---GRVAVPRFRVPQPAPSREAPASS 2992
DUF4106 pfam13388
Protein of unknown function (DUF4106); This family of proteins is found in large numbers in ...
784-873 6.72e-04

Protein of unknown function (DUF4106); This family of proteins is found in large numbers in the Trichomonas vaginalis proteome. The function of this protein is unknown.


Pssm-ID: 404296  Cd Length: 431  Bit Score: 43.35  E-value: 6.72e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    784 PETNTTASGNTQPQSTMPYNQEPTQaQPNVlanpyDNQYQQPytdsyyvpqVSHPPMQQPTMFMPHQAQPAPQPSFTPAP 863
Cdd:pfam13388  186 PKTFTSSHGHRHRHAPKPTVQNPAQ-QPTV-----QNPAQQP---------TQQPTVQNPAQQQNPAQQPPPQPAQQPTV 250
                           90
                   ....*....|
gi 30695806    864 TSNAQPSMRT 873
Cdd:pfam13388  251 QNPAQQQPQT 260
PHA03378 PHA03378
EBNA-3B; Provisional
790-958 6.72e-04

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 43.90  E-value: 6.72e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   790 ASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPytdsyyvPQVSHP----PMQQPtmfmphQAQPAPQPSFTPAPTS 865
Cdd:PHA03378  730 APGRARPPAAAPGRARPPAAAPGRARPPAAAPGRAR-------PPAAAPgaptPQPPP------QAPPAPQQRPRGAPTP 796
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   866 NAQPSMRTTFVPSTPPALKnADQYQQPTMSSHSFTGPSNNAYPVPPGPGQYAPSGPsqLGQYPNPKMPQVVAPAAGPIGF 945
Cdd:PHA03378  797 QPPPQAGPTSMQLMPRAAP-GQQGPTKQILRQLLTGGVKRGRPSLKKPAALERQAA--AGPTPSPGSGTSDKIVQAPVFY 873
                         170
                  ....*....|...
gi 30695806   946 TPMATPGVAPRSV 958
Cdd:PHA03378  874 PPVLQPIQVMRQL 886
PHA03378 PHA03378
EBNA-3B; Provisional
788-975 1.62e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 42.75  E-value: 1.62e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   788 TTASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPYTDSYYVPQVSH-PPMQQPTMFMPHQAQPAPQPsFTPAPTsN 866
Cdd:PHA03378  603 SQTPEPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHqPPQVEITPYKPTWTQIGHIP-YQPSPT-G 680
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   867 AQPSMRTTFVPST--PPAlknadqyQQPTMSSHSFTGPSN----NAYPVPPGPGQYAPSGPSQLGQYPNPKMPQVVAP-- 938
Cdd:PHA03378  681 ANTMLPIQWAPGTmqPPP-------RAPTPMRPPAAPPGRaqrpAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPgr 753
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|..
gi 30695806   939 -----AAGPIGFTPMATPGVAPRSVQPASPPTQQAAAQAAPA 975
Cdd:PHA03378  754 arppaAAPGRARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPT 795
YppG pfam14179
YppG-like protein; The YppG-like protein family includes the B. subtilis YppG protein, which ...
803-858 1.64e-03

YppG-like protein; The YppG-like protein family includes the B. subtilis YppG protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 115 and 181 amino acids in length. There are two completely conserved residues (F and G) that may be functionally important.


Pssm-ID: 372950 [Multi-domain]  Cd Length: 101  Bit Score: 39.02  E-value: 1.64e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 30695806    803 NQEPTQAQPNVLANPYDN---QYQQPYTDSYYVPQVSHPPMQQptmfMPHQAQPAPQPS 858
Cdd:pfam14179    1 YQHNSQPYPYFSQQVYQQpvqPQYPPFAPQQYMPQPPMPYMNP----YPKQQPQQQQPS 55
PHA03378 PHA03378
EBNA-3B; Provisional
790-962 2.20e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 42.36  E-value: 2.20e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   790 ASGNTQPQSTMPYNQEPTQAQPNVLANPYDnqyqqpytdsyyVPQVSHPPMQQPTMFMPHQAQPAPQPSFTPAPTSNAQP 869
Cdd:PHA03378  690 APGTMQPPPRAPTPMRPPAAPPGRAQRPAA------------ATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPP 757
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   870 SMRTTFVPSTPPALKNADQYQQPTmsshsftgpsnnaypVPPGPGQY-----APSGPSQLGQYPNPKMPQVVAPAAGPIG 944
Cdd:PHA03378  758 AAAPGRARPPAAAPGAPTPQPPPQ---------------APPAPQQRprgapTPQPPPQAGPTSMQLMPRAAPGQQGPTK 822
                         170
                  ....*....|....*...
gi 30695806   945 FTPMATPGVAPRSVQPAS 962
Cdd:PHA03378  823 QILRQLLTGGVKRGRPSL 840
PTZ00421 PTZ00421
coronin; Provisional
93-313 2.37e-03

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 41.80  E-value: 2.37e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    93 DGNIDLWN-PLSLIGSQPSENALvgHLSVHKGPVRGLEFNAISSNLLASGADDGEICIWDLLKPSEpshfpllKGSGSAT 171
Cdd:PTZ00421   97 DGTIMGWGiPEEGLTQNISDPIV--HLQGHTKKVGIVSFHPSAMNVLASAGADMVVNVWDVERGKA-------VEVIKCH 167
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   172 QGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADSVRRRCSVLQWNPNVTTQIMVASDDDSSPTLKLWDMRN 251
Cdd:PTZ00421  168 SDQITSLEWNLD-GSLLCTTSKDKKLNIIDPRDGTIVSSVEAHASAKSQRCLWAKRKDLIITLGCSKSQQRQIMLWDTRK 246
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695806   252 IMSPVREFTGHQRGVIAMEWCPSDSSYLLTCAK-----------DNR-TICWDTNTAEIVAELPAGNNWNFDVH 313
Cdd:PTZ00421  247 MASPYSTVDLDQSSALFIPFFDEDTNLLYIGSKgegnircfelmNERlTFCSSYSSVEPHKGLCMMPKWSLDTR 320
PRK12757 PRK12757
cell division protein FtsN; Provisional
795-882 5.37e-03

cell division protein FtsN; Provisional


Pssm-ID: 237191 [Multi-domain]  Cd Length: 256  Bit Score: 40.03  E-value: 5.37e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   795 QPQSTMpyNQEPTQAqPNVlanPYDNQYQQPYTDSYYVPQVSHPPMQQPTMFMPHQAQPAPQPSfTPAPTSNAQPSMRTT 874
Cdd:PRK12757   90 QMQADM--RQQPTQL-SEV---PYNEQTPQVPRSTVQIQQQAQQQQPPATTAQPQPVTPPRQTT-APVQPQTPAPVRTQP 162

                  ....*...
gi 30695806   875 FVPSTPPA 882
Cdd:PRK12757  163 AAPVTQAV 170
PRK10263 PRK10263
DNA translocase FtsK; Provisional
813-997 5.41e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 40.84  E-value: 5.41e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   813 VLANPYD-----NQYQQPYTDSYYvPQVSHPPMQQPTmfmphqAQPAPQPSFTPAPTSNAQPSMRTTFVPSTPPALknad 887
Cdd:PRK10263  285 VAADPDDvlfsgNRATQPEYDEYD-PLLNGAPITEPV------AVAAAATTATQSWAAPVEPVTQTPPVASVDVPP---- 353
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806   888 qyQQPTMSSHSFTGPSNNAYPVPPGPGQYAPSgpsqlGQYPNPKMPQvVAPAAGPIgftPMATPGVAPRSVQPASPPTQQ 967
Cdd:PRK10263  354 --AQPTVAWQPVPGPQTGEPVIAPAPEGYPQQ-----SQYAQPAVQY-NEPLQQPV---QPQQPYYAPAAEQPAQQPYYA 422
                         170       180       190
                  ....*....|....*....|....*....|
gi 30695806   968 AAAQAAPAPATPPPTVQTADTSNVPAHQKP 997
Cdd:PRK10263  423 PAPEQPAQQPYYAPAPEQPVAGNAWQAEEQ 452
Gag_spuma pfam03276
Spumavirus gag protein;
855-960 9.26e-03

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 40.12  E-value: 9.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695806    855 PQPSFTPAPTSNAQPSMRTTfVPSTPPALKNADQyqqpTMSSHSFTGPSNNAYPVPPGPG-QYAPSGP--SQLGQYPN-- 929
Cdd:pfam03276  196 PSLPAIGGIHLPAIPGIHAR-APPGNIARSLGDD----IMPSLGDAGMPQPRFAFHPGNPfAEAEGHPfaEAEGERPRdi 270
                           90       100       110
                   ....*....|....*....|....*....|..
gi 30695806    930 PKMPQVVAPAA-GPIGFTPMATPGVAPRSVQP 960
Cdd:pfam03276  271 PRAPRIDAPSApAIPAIQPIAPPMIPPIGAPI 302
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
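
Every hit in the list carries a score line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...", and the search reports an E-value threshold of 0.01. A minimal Python sketch of how those lines could be parsed and checked against that threshold follows. The regular expression, the `parse_score_line` helper, and the use of `<=` for the cutoff are assumptions based only on the text of this page, not on any NCBI specification.

```python
import re

# Pattern inferred from the score lines on this page; the bracketed
# "[Multi-domain]" flag is optional (the DUF4106 hit omits it).
SCORE_LINE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_score_line(line):
    """Return the fields of one score line as a dict, or None if it does not match."""
    m = SCORE_LINE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


# Two score lines copied from the hit list above.
lines = [
    "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 48.22  E-value: 3.46e-05",
    "Pssm-ID: 404296  Cd Length: 431  Bit Score: 43.35  E-value: 6.72e-04",
]

E_VALUE_THRESHOLD = 0.01  # value from "Preset Options" above
parsed = [parse_score_line(line) for line in lines]
hits = [h for h in parsed if h is not None and h["e_value"] <= E_VALUE_THRESHOLD]
for hit in hits:
    print(hit["pssm_id"], hit["bit_score"], hit["e_value"])
```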
