Conserved domains on  [gi|30695804|ref|NP_851024|]

transducin family protein / WD-40 repeat family protein [Arabidopsis thaliana]

List of domain hits

Name Accession Description Interval E-value
WD40 super family cl29593
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
86-335 1.57e-27

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


The actual alignment was detected with superfamily member cd00200:

Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 113.58  E-value: 1.57e-27
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   86 LIAGGLvDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSepshfplLK 165
Cdd:cd00200   66 LASGSS-DKTIRLWDL--------ETGECVRTLTGHTSYVSSVAFSP-DGRILSSSSRDKTIKVWDVETGK-------CL 128
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  166 GSGSATQGEISFISWNrKVQQILASTSYNGTTVIWDLRKQKPIINFA---DSVRRrcsvLQWNPNvTTQIMVASDDDssp 242
Cdd:cd00200  129 TTLRGHTDWVNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTghtGEVNS----VAFSPD-GEKLLSSSSDG--- 199
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  243 TLKLWDMRNIMSpVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVI 322
Cdd:cd00200  200 TIKLWDLSTGKC-LGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLA 277
                        250
                 ....*....|...
gi 30695804  323 SAsSFDGKIGIYN 335
Cdd:cd00200  278 SG-SADGTIRIWD 289
Atrophin-1 super family cl38111
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
742-962 6.00e-11

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


The actual alignment was detected with superfamily member pfam03154:

Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 67.10  E-value: 6.00e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    742 ESYAEILASQGLLTTAMKYLKVLDSGGLSPELSILRDRISLSAEPETNTTAsgnTQPQSTMPYNQEPTQAQPNvlANPYD 821
Cdd:pfam03154  155 ESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPS---VPPQGSPATSQPPNQTQST--AAPHT 229
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    822 NQYQQPYTDSYYVPQvSHPPMQQPTmfmphqaQPAPQPSFTPAPTSnaQPSMRTTfVPSTPPALKNADQYQQPTMSSHSF 901
Cdd:pfam03154  230 LIQQTPTLHPQRLPS-PHPPLQPMT-------QPPPPSQVSPQPLP--QPSLHGQ-MPPMPHSLQTGPSHMQHPVPPQPF 298
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695804    902 TGPSNNAY-PVPPGPGQYAPSGPSQLGQYP--NPKMPQVVAPAAGPIGFTPMATPGVAPRSVQP 962
Cdd:pfam03154  299 PLTPQSSQsQVPPGPSPAAPGQSQQRIHTPpsQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTP 362
ACE1-Sec16-like super family cl14807
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
568-784 2.02e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way, Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


The actual alignment was detected with superfamily member cd09233:

Pssm-ID: 449359 [Multi-domain]  Cd Length: 314  Bit Score: 60.35  E-value: 2.02e-09
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  568 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSSAPYM--------------KVVSAMVNNDL 632
Cdd:cd09233   69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSRFARSESKLNDplqtlyqlfsgnspEAITELADNPA 147
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  633 RSLIytrSHKFWKETLALLCT-FAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAG--------NVDRTVEIWSRSLan 703
Cdd:cd09233  148 EAEW---ALGNWREHLAIILSnRTSNLDLEALV-ELGDLLAQRGLVEAAHICYLLAGvplgpypsSPSSCLLGGAVHN-- 221
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  704 eRDGRSYAELLQDLMEKTLVLALATGNKKFS-ASLC--KLFesYAEILASQGLLTTAMKYL----KVLDSGGLSP----- 771
Cdd:cd09233  222 -KSPRTFATPEAIQLTEIYEYALSLGNPQFGlPHLQpyKLI--HAARLAELGLVSEALKYCeaiaSSLKSLTKSPyydpn 298
                        250
                 ....*....|....*.
gi 30695804  772 ---ELSILRDRISLSA 784
Cdd:cd09233  299 llaQLQDLSERLSGTS 314
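
Each hit above carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". For readers who want to collect these values from a saved copy of the page, the following is a minimal Python sketch. It assumes the line layout seen in this listing, with the "[Multi-domain]" tag being optional; the names SUMMARY_RE, HitSummary and parse_summary are hypothetical and are not part of any NCBI tool or API.

```python
import re
from dataclasses import dataclass

# Regex for the per-hit summary line as it appears in this page dump.
# Assumption: the "[Multi-domain]" tag is optional and fields keep this order.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?P<multi>\s*\[Multi-domain\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


@dataclass
class HitSummary:
    """Hypothetical container for one summary line."""
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


def parse_summary(line: str) -> HitSummary:
    """Parse one 'Pssm-ID: ...' summary line into a HitSummary."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return HitSummary(
        pssm_id=int(m.group("pssm_id")),
        multi_domain=m.group("multi") is not None,
        cd_length=int(m.group("cd_length")),
        bit_score=float(m.group("bit_score")),
        e_value=float(m.group("e_value")),
    )


if __name__ == "__main__":
    line = ("Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  "
            "Bit Score: 113.58  E-value: 1.57e-27")
    print(parse_summary(line))
```

Sorting the parsed records by e_value reproduces the order in which the hits are listed on the page.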
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
86-335 1.57e-27

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 113.58  E-value: 1.57e-27
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   86 LIAGGLvDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSepshfplLK 165
Cdd:cd00200   66 LASGSS-DKTIRLWDL--------ETGECVRTLTGHTSYVSSVAFSP-DGRILSSSSRDKTIKVWDVETGK-------CL 128
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  166 GSGSATQGEISFISWNrKVQQILASTSYNGTTVIWDLRKQKPIINFA---DSVRRrcsvLQWNPNvTTQIMVASDDDssp 242
Cdd:cd00200  129 TTLRGHTDWVNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTghtGEVNS----VAFSPD-GEKLLSSSSDG--- 199
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  243 TLKLWDMRNIMSpVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVI 322
Cdd:cd00200  200 TIKLWDLSTGKC-LGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLA 277
                        250
                 ....*....|...
gi 30695804  323 SAsSFDGKIGIYN 335
Cdd:cd00200  278 SG-SADGTIRIWD 289
WD40 COG2319
WD40 repeat [General function prediction only];
6-338 2.29e-25

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 110.00  E-value: 2.29e-25
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    6 GVGRSASVALAPDAPYMAAGTMAGAVDLsFSSSANLEIFKLDFQSDdrdlplvgeipsseRFNRLAWGRNGSgseefalg 85
Cdd:COG2319   77 HTAAVLSVAFSPDGRLLASASADGTVRL-WDLATGLLLRTLTGHTG--------------AVRSVAFSPDGK-------- 133
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   86 LIAGGLVDGNIDLWNPLSligsqpseNALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPSHFPllk 165
Cdd:COG2319  134 TLASGSADGTVRLWDLAT--------GKLLRTLTGHSGAVTSVAFSP-DGKLLASGSDDGTVRLWDLATGKLLRTLT--- 201
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  166 gsgsATQGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADSVRRRCSVlQWNPNvtTQIMVASDDDSspTLK 245
Cdd:COG2319  202 ----GHTGAVRSVAFSPD-GKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSV-AFSPD--GRLLASGSADG--TVR 271
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  246 LWDMRNImSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVISAS 325
Cdd:COG2319  272 LWDLATG-ELLRTLTGHSGGVNSVAFSP-DGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGS 349
                        330
                 ....*....|...
gi 30695804  326 SfDGKIGIYNIEG 338
Cdd:COG2319  350 D-DGTVRLWDLAT 361
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
742-962 6.00e-11

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 67.10  E-value: 6.00e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    742 ESYAEILASQGLLTTAMKYLKVLDSGGLSPELSILRDRISLSAEPETNTTAsgnTQPQSTMPYNQEPTQAQPNvlANPYD 821
Cdd:pfam03154  155 ESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPS---VPPQGSPATSQPPNQTQST--AAPHT 229
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    822 NQYQQPYTDSYYVPQvSHPPMQQPTmfmphqaQPAPQPSFTPAPTSnaQPSMRTTfVPSTPPALKNADQYQQPTMSSHSF 901
Cdd:pfam03154  230 LIQQTPTLHPQRLPS-PHPPLQPMT-------QPPPPSQVSPQPLP--QPSLHGQ-MPPMPHSLQTGPSHMQHPVPPQPF 298
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695804    902 TGPSNNAY-PVPPGPGQYAPSGPSQLGQYP--NPKMPQVVAPAAGPIGFTPMATPGVAPRSVQP 962
Cdd:pfam03154  299 PLTPQSSQsQVPPGPSPAAPGQSQQRIHTPpsQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTP 362
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
568-784 2.02e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way, Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 60.35  E-value: 2.02e-09
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  568 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSSAPYM--------------KVVSAMVNNDL 632
Cdd:cd09233   69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSRFARSESKLNDplqtlyqlfsgnspEAITELADNPA 147
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  633 RSLIytrSHKFWKETLALLCT-FAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAG--------NVDRTVEIWSRSLan 703
Cdd:cd09233  148 EAEW---ALGNWREHLAIILSnRTSNLDLEALV-ELGDLLAQRGLVEAAHICYLLAGvplgpypsSPSSCLLGGAVHN-- 221
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  704 eRDGRSYAELLQDLMEKTLVLALATGNKKFS-ASLC--KLFesYAEILASQGLLTTAMKYL----KVLDSGGLSP----- 771
Cdd:cd09233  222 -KSPRTFATPEAIQLTEIYEYALSLGNPQFGlPHLQpyKLI--HAARLAELGLVSEALKYCeaiaSSLKSLTKSPyydpn 298
                        250
                 ....*....|....*.
gi 30695804  772 ---ELSILRDRISLSA 784
Cdd:cd09233  299 llaQLQDLSERLSGTS 314
PTZ00420 PTZ00420
coronin; Provisional
104-225 1.03e-08

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 59.19  E-value: 1.03e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   104 LIGSQPSENAL----VGHLSVHKGPVRGLEFNAISSNLLASGADDGEICIWDLlkPSEPSHFPLLKGSGSATQG---EIS 176
Cdd:PTZ00420   52 LIGAIRLENQMrkppVIKLKGHTSSILDLQFNPCFSEILASGSEDLTIRVWEI--PHNDESVKEIKDPQCILKGhkkKIS 129
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*....
gi 30695804   177 FISWNRKVQQILASTSYNGTTVIWDLRKQKPIinFADSVRRRCSVLQWN 225
Cdd:PTZ00420  130 IIDWNPMNYYIMCSSGFDSFVNIWDIENEKRA--FQINMPKKLSSLKWN 176
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
568-781 9.66e-07

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 51.79  E-value: 9.66e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    568 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSsapyMKVVSAMVNNDLRsLIY--------- 637
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFVRSE----FKGSNNKSGESLA-ALYqvfagnsee 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    638 -----TRSHKF-------WKETLALLCTFAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAGNVD-RTVEIWSRSLANE 704
Cdd:pfam12931   75 avdelVPPSKNalwaldnWRETLALVLSNRSPGDVEALL-ALGDLLAQYGRTEAAHICFLLAGLPLsQTVLLGADHVRFP 153
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    705 RDGRSYAE--LLQDLMEktLVLALATGNKKFSASLC----KLFesYAEILASQGLLTTAMKY-------LKVLDSG---- 767
Cdd:pfam12931  154 STFGNDLEsiLLTEIYE--YALSLSPPQPPFVGLPHllpyKLQ--HAAVLAEYGLVSEAQKYcdaitasLKSLTKKspyy 229
                          250
                   ....*....|....*.
gi 30695804    768 --GLSPELSILRDRIS 781
Cdd:pfam12931  230 hpTLLAQLEDLSNRLS 245
PHA03247 PHA03247
large tegument protein UL36; Provisional
786-1020 1.09e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.40  E-value: 1.09e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   786 PETNTTASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPYTDSYYVPQVSHPPMQQPTMF-MPHQAQPAPQPSfTPA 864
Cdd:PHA03247 2631 PSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGsLTSLADPPPPPP-TPE 2709
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   865 PtsnaQPSMRTTFVPsTPPALKNADQyqqptmsshsfTGPSNNAYPVPPGP--GQYAPSGPSQLGQYPNPKMPQVVAPAA 942
Cdd:PHA03247 2710 P----APHALVSATP-LPPGPAAARQ-----------ASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPAPAPPA 2773
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   943 GPIGFTPMAT--PGVAPRSVQ----PASPPTQQAAAQAAPAPATPPPTVQTADTSNVPAHQKPVIATLTRLFNETSEALG 1016
Cdd:PHA03247 2774 APAAGPPRRLtrPAVASLSESreslPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLG 2853

                  ....
gi 30695804  1017 GARA 1020
Cdd:PHA03247 2854 GSVA 2857
YppG COG5894
Spore coat protein YppG [Cell cycle control, cell division, chromosome partitioning];
795-861 1.84e-05

Spore coat protein YppG [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 444596 [Multi-domain]  Cd Length: 112  Bit Score: 44.85  E-value: 1.84e-05
                         10        20        30        40        50        60        70
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 30695804  795 NTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPY---TDSYYvPQvsHPPmqQPTMFM---PHQAQPAPQPSF 861
Cdd:COG5894    2 HWYQRNMNMYHYARPALRPEQPYGPYQNQHQQPYyqqTNTQQ-PF--PPP--SPTPYPspkPLQTQPSQFQSL 69
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
114-151 3.08e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 41.91  E-value: 3.08e-05
                            10        20        30
                    ....*....|....*....|....*....|....*...
gi 30695804     114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD 151
Cdd:smart00320    4 LLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
797-865 6.46e-05

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 44.78  E-value: 6.46e-05
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695804     797 QPQSTMPYNQE--PTQAQPNVLANPYDNQYQQPYTDsyyvPQVSHPPMQQPT---MFMPHQAQPAPQPSFTPAP 865
Cdd:smart00818   71 QPLMPVPGQHSmtPTQHHQPNLPQPAQQPFQPQPLQ----PPQPQQPMQPQPpvhPIPPLPPQPPLPPMFPMQP 140
WD40 pfam00400
WD domain, G-beta repeat;
114-151 9.68e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 40.79  E-value: 9.68e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 30695804    114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD 151
Cdd:pfam00400    3 LLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
792-944 2.17e-04

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 45.02  E-value: 2.17e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  792 ASGNTQP-QST-MPYNQEPTQAQPNVLANPYDNQYQQPytdsyyVPQVSHppMQQPTMFM--PHQAQpapQPSFTPAPTS 867
Cdd:cd22553  188 AGGGNQAlQAQvIPQLAQAAQLQPQQLAQVSSQGYIQQ------IPANAS--QQQPQMVQqgPNQSG---QIIGQVASAS 256
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  868 NAQPSMRTTFVPSTPPALKNADQYQQ----PTMS----SHSFTGPSNNAYPV----------PPGPGQYAPSGPSQLGQY 929
Cdd:cd22553  257 SIQAAAIPLTVYTGALAGQNGSNQQQvgqiVTSPiqgmTQGLTAPASSSIPTvvqqqaiqgnPLPPGTQIIAAGQQLQQD 336
                        170
                 ....*....|....*....
gi 30695804  930 PN-PKMPQVVA---PAAGP 944
Cdd:cd22553  337 PNdPTKWQVVAdgtPGSKK 355
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
86-335 1.57e-27

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 113.58  E-value: 1.57e-27
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   86 LIAGGLvDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSepshfplLK 165
Cdd:cd00200   66 LASGSS-DKTIRLWDL--------ETGECVRTLTGHTSYVSSVAFSP-DGRILSSSSRDKTIKVWDVETGK-------CL 128
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  166 GSGSATQGEISFISWNrKVQQILASTSYNGTTVIWDLRKQKPIINFA---DSVRRrcsvLQWNPNvTTQIMVASDDDssp 242
Cdd:cd00200  129 TTLRGHTDWVNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTghtGEVNS----VAFSPD-GEKLLSSSSDG--- 199
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  243 TLKLWDMRNIMSpVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVI 322
Cdd:cd00200  200 TIKLWDLSTGKC-LGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLA 277
                        250
                 ....*....|...
gi 30695804  323 SAsSFDGKIGIYN 335
Cdd:cd00200  278 SG-SADGTIRIWD 289
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
85-337 8.04e-27

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 111.66  E-value: 8.04e-27
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   85 GLIAGGLVDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPSHFpll 164
Cdd:cd00200   22 KLLATGSGDGTIKVWDL--------ETGELLRTLKGHTGPVRDVAASA-DGTYLASGSSDKTIRLWDLETGECVRTL--- 89
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  165 KGSgsatQGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADsvrRRCSVLQWNPNVTTQIMVASDDDSspTL 244
Cdd:cd00200   90 TGH----TSYVSSVAFSPD-GRILSSSSRDKTIKVWDVETGKCLTTLRG---HTDWVNSVAFSPDGTFVASSSQDG--TI 159
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  245 KLWDMRNiMSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKiPGVISA 324
Cdd:cd00200  160 KLWDLRT-GKCVATLTGHTGEVNSVAFSP-DGEKLLSSSSDGTIKLWDLSTGKCLGTLRGHENGVNSVAFSPD-GYLLAS 236
                        250
                 ....*....|...
gi 30695804  325 SSFDGKIGIYNIE 337
Cdd:cd00200  237 GSEDGTIRVWDLR 249
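
Each hit above carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". A minimal sketch of how that line could be parsed is given below. The function name `parse_summary` and the dictionary keys are illustrative assumptions, not part of any NCBI tool, and the pattern is only fitted to the summary lines as they appear on this page.

```python
import re

# Hypothetical helper: the field names are chosen for illustration only.
# The bracketed tag (e.g. "[Multi-domain]") is optional on this page.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<tag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("tag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


# Summary lines copied from two hits on this page.
tagged = "Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 111.66  E-value: 8.04e-27"
untagged = "Pssm-ID: 432884  Cd Length: 279  Bit Score: 51.79  E-value: 9.66e-07"

print(parse_summary(tagged))
print(parse_summary(untagged))
```

Parsing the E-value into a float makes it straightforward to sort or filter hits by significance instead of comparing strings.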
WD40 COG2319
WD40 repeat [General function prediction only];
6-338 2.29e-25

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 110.00  E-value: 2.29e-25
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    6 GVGRSASVALAPDAPYMAAGTMAGAVDLsFSSSANLEIFKLDFQSDdrdlplvgeipsseRFNRLAWGRNGSgseefalg 85
Cdd:COG2319   77 HTAAVLSVAFSPDGRLLASASADGTVRL-WDLATGLLLRTLTGHTG--------------AVRSVAFSPDGK-------- 133
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   86 LIAGGLVDGNIDLWNPLSligsqpseNALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPSHFPllk 165
Cdd:COG2319  134 TLASGSADGTVRLWDLAT--------GKLLRTLTGHSGAVTSVAFSP-DGKLLASGSDDGTVRLWDLATGKLLRTLT--- 201
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  166 gsgsATQGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADSVRRRCSVlQWNPNvtTQIMVASDDDSspTLK 245
Cdd:COG2319  202 ----GHTGAVRSVAFSPD-GKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSV-AFSPD--GRLLASGSADG--TVR 271
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  246 LWDMRNImSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVISAS 325
Cdd:COG2319  272 LWDLATG-ELLRTLTGHSGGVNSVAFSP-DGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGS 349
                        330
                 ....*....|...
gi 30695804  326 SfDGKIGIYNIEG 338
Cdd:COG2319  350 D-DGTVRLWDLAT 361
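
The alignment rows above give a start coordinate, the aligned residues, and an end coordinate. A minimal sketch of a consistency check follows, assuming that every letter (upper or lower case) counts as one residue and that "-" marks a gap; the helper names are hypothetical. Under that assumption, start + residues - 1 should equal the end coordinate.

```python
def parse_alignment_row(row):
    """Split a row such as 'gi 30695804  326 SfDGKIGIYNIEG 338'
    into (label, start, sequence, end)."""
    parts = row.split()
    start, seq, end = int(parts[-3]), parts[-2], int(parts[-1])
    label = " ".join(parts[:-3])
    return label, start, seq, end


def coordinates_consistent(row):
    """True if start + (number of residues) - 1 equals the end coordinate.

    Assumption: letters of either case are residues, '-' is a gap.
    """
    _, start, seq, end = parse_alignment_row(row)
    residues = sum(1 for ch in seq if ch.isalpha())
    return start + residues - 1 == end


# The last two rows of the COG2319 alignment above.
query_row = "gi 30695804  326 SfDGKIGIYNIEG 338"
subject_row = "Cdd:COG2319  350 D-DGTVRLWDLAT 361"

print(parse_alignment_row(query_row))
print(coordinates_consistent(query_row), coordinates_consistent(subject_row))
```

A row that fails this check is a useful signal that the text was truncated or re-wrapped during extraction.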
WD40 COG2319
WD40 repeat [General function prediction only];
12-338 1.72e-24

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 107.30  E-value: 1.72e-24
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   12 SVALAPDAPYMAAGtmagavdlsfSSSANLEIFkldfqsDDRDLPLVGEIPS-SERFNRLAWGRNGSgseefalgLIAGG 90
Cdd:COG2319  125 SVAFSPDGKTLASG----------SADGTVRLW------DLATGKLLRTLTGhSGAVTSVAFSPDGK--------LLASG 180
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   91 LVDGNIDLWNPLSligsqpseNALVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPShfpLLKGSGsa 170
Cdd:COG2319  181 SDDGTVRLWDLAT--------GKLLRTLTGHTGAVRSVAFSP-DGKLLASGSADGTVRLWDLATGKLLR---TLTGHS-- 246
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  171 tqGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADSVRRRCSVlQWNPNvTTQIMVASDDDsspTLKLWDMR 250
Cdd:COG2319  247 --GSVRSVAFSPD-GRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSV-AFSPD-GKLLASGSDDG---TVRLWDLA 318
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  251 NImSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKIPGVISAsSFDGK 330
Cdd:COG2319  319 TG-KLLRTLTGHTGAVRSVAFSP-DGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSPDGRTLASG-SADGT 395

                 ....*...
gi 30695804  331 IGIYNIEG 338
Cdd:COG2319  396 VRLWDLAT 403
WD40 COG2319
WD40 repeat [General function prediction only];
50-338 7.84e-18

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 87.27  E-value: 7.84e-18
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   50 SDDRDLPLVGEIPSSERFNRLAWGRNGSGSEEFALGLIAGGLVDGNIDLWNPLSLigsqpsenALVGHLSVHKGPVRGLE 129
Cdd:COG2319   14 ADLALALLAAALGALLLLLLGLAAAVASLAASPDGARLAAGAGDLTLLLLDAAAG--------ALLATLLGHTAAVLSVA 85
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  130 FNAiSSNLLASGADDGEICIWDLLKPSEPSHFpllkgsgSATQGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPII 209
Cdd:COG2319   86 FSP-DGRLLASASADGTVRLWDLATGLLLRTL-------TGHTGAVRSVAFSPD-GKTLASGSADGTVRLWDLATGKLLR 156
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  210 NF---ADSVRRrcsvLQWNPNvtTQIMVASDDDSspTLKLWDMRNImSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDN 286
Cdd:COG2319  157 TLtghSGAVTS----VAFSPD--GKLLASGSDDG--TVRLWDLATG-KLLRTLTGHTGAVRSVAFSP-DGKLLASGSADG 226
                        250       260       270       280       290
                 ....*....|....*....|....*....|....*....|....*....|...
gi 30695804  287 RTICWDTNTAEIVAELPAGNNWNFDVHWYPKipG-VISASSFDGKIGIYNIEG 338
Cdd:COG2319  227 TVRLWDLATGKLLRTLTGHSGSVRSVAFSPD--GrLLASGSADGTVRLWDLAT 277
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
114-340 7.77e-17

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 82.38  E-value: 7.77e-17
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWDLLKPSEPSHFPLLKGSGSAtqgeISFISWNRKvqqiLASTSY 193
Cdd:cd00200    1 LRRTLKGHTGGVTCVAFSP-DGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRD----VAASADGTY----LASGSS 71
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  194 NGTTVIWDLRKQKPIinfadsvrrrcsvlqwnpnvttqimvasdddssptlklwdmrnimspvREFTGHQRGVIAMEWCP 273
Cdd:cd00200   72 DKTIRLWDLETGECV------------------------------------------------RTLTGHTSYVSSVAFSP 103
                        170       180       190       200       210       220
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 30695804  274 sDSSYLLTCAKDNRTICWDTNTAEIVAELPAGNNWNFDVHWYPKiPGVISASSFDGKIGIYNIEGCS 340
Cdd:cd00200  104 -DGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSPD-GTFVASSSQDGTIKLWDLRTGK 168
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
742-962 6.00e-11

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 67.10  E-value: 6.00e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    742 ESYAEILASQGLLTTAMKYLKVLDSGGLSPELSILRDRISLSAEPETNTTAsgnTQPQSTMPYNQEPTQAQPNvlANPYD 821
Cdd:pfam03154  155 ESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPS---VPPQGSPATSQPPNQTQST--AAPHT 229
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    822 NQYQQPYTDSYYVPQvSHPPMQQPTmfmphqaQPAPQPSFTPAPTSnaQPSMRTTfVPSTPPALKNADQYQQPTMSSHSF 901
Cdd:pfam03154  230 LIQQTPTLHPQRLPS-PHPPLQPMT-------QPPPPSQVSPQPLP--QPSLHGQ-MPPMPHSLQTGPSHMQHPVPPQPF 298
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695804    902 TGPSNNAY-PVPPGPGQYAPSGPSQLGQYP--NPKMPQVVAPAAGPIGFTPMATPGVAPRSVQP 962
Cdd:pfam03154  299 PLTPQSSQsQVPPGPSPAAPGQSQQRIHTPpsQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTP 362
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
568-784 2.02e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 60.35  E-value: 2.02e-09
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  568 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSSAPYM--------------KVVSAMVNNDL 632
Cdd:cd09233   69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSRFARSESKLNDplqtlyqlfsgnspEAITELADNPA 147
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  633 RSLIytrSHKFWKETLALLCT-FAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAG--------NVDRTVEIWSRSLan 703
Cdd:cd09233  148 EAEW---ALGNWREHLAIILSnRTSNLDLEALV-ELGDLLAQRGLVEAAHICYLLAGvplgpypsSPSSCLLGGAVHN-- 221
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  704 eRDGRSYAELLQDLMEKTLVLALATGNKKFS-ASLC--KLFesYAEILASQGLLTTAMKYL----KVLDSGGLSP----- 771
Cdd:cd09233  222 -KSPRTFATPEAIQLTEIYEYALSLGNPQFGlPHLQpyKLI--HAARLAELGLVSEALKYCeaiaSSLKSLTKSPyydpn 298
                        250
                 ....*....|....*.
gi 30695804  772 ---ELSILRDRISLSA 784
Cdd:cd09233  299 llaQLQDLSERLSGTS 314
PTZ00420 PTZ00420
coronin; Provisional
104-225 1.03e-08

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 59.19  E-value: 1.03e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   104 LIGSQPSENAL----VGHLSVHKGPVRGLEFNAISSNLLASGADDGEICIWDLlkPSEPSHFPLLKGSGSATQG---EIS 176
Cdd:PTZ00420   52 LIGAIRLENQMrkppVIKLKGHTSSILDLQFNPCFSEILASGSEDLTIRVWEI--PHNDESVKEIKDPQCILKGhkkKIS 129
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*....
gi 30695804   177 FISWNRKVQQILASTSYNGTTVIWDLRKQKPIinFADSVRRRCSVLQWN 225
Cdd:PTZ00420  130 IIDWNPMNYYIMCSSGFDSFVNIWDIENEKRA--FQINMPKKLSSLKWN 176
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
50-262 1.89e-07

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 55.48  E-value: 1.89e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    50 SDDRDL--PLVgEIPSSERFNRLAWgrngsgsEEFALGLIAGGLVDGNIDLWNPlsligsqpSENALVGHLSVHKGPVRG 127
Cdd:PLN00181  517 KDGRDIhyPVV-ELASRSKLSGICW-------NSYIKSQVASSNFEGVVQVWDV--------ARSQLVTEMKEHEKRVWS 580
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   128 LEFNAISSNLLASGADDGEICIWDLlkpSEPSHFPLLKgsgsaTQGEISFISWNRKVQQILASTSYNGTTVIWDLRKQKP 207
Cdd:PLN00181  581 IDYSSADPTLLASGSDDGSVKLWSI---NQGVSIGTIK-----TKANICCVQFPSESGRSLAFGSADHKVYYYDLRNPKL 652
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   208 IINFADSVRRRCSVLQWnpnVTTQIMVASDDDSspTLKLWDMRNIMS-----PVREFTGH 262
Cdd:PLN00181  653 PLCTMIGHSKTVSYVRF---VDSSTLVSSSTDN--TLKLWDLSMSISginetPLHSFMGH 707
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
753-936 2.14e-07

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 51.96  E-value: 2.14e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    753 LLTTAmkyLKVLDSGGLSPELSILRDRISLSAEPETNTTASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPytdsy 832
Cdd:pfam15240    5 LLTVA---LLALSSAQSSSEDVSQEDSPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPGGPQQP----- 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    833 yVPQVSHPPMQQPtmfmPHQAQPAPQPSFTPAPTSNAQPSmrttfvPSTPPALKNADQYQQPTMSSHSFTGPSNNAYPVP 912
Cdd:pfam15240   77 -PPQGGKQKPQGP----PPQGGPRPPPGKPQGPPPQGGNQ------QQGPPPPGKPQGPPPQGGGPPPQGGNQQGPPPPP 145
                          170       180
                   ....*....|....*....|....
gi 30695804    913 PGPGQYAPSGPSQLGQYPNPkmPQ 936
Cdd:pfam15240  146 PGNPQGPPQRPPQPGNPQGP--PQ 167
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
568-781 9.66e-07

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 51.79  E-value: 9.66e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    568 IQRALIVGDYKEAVDQCItANKM-ADALVIAHVGGTALWESTREKYLKTSsapyMKVVSAMVNNDLRsLIY--------- 637
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFVRSE----FKGSNNKSGESLA-ALYqvfagnsee 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    638 -----TRSHKF-------WKETLALLCTFAQGEQWTTLCdALASKLMAAGNTLAAVLCYICAGNVD-RTVEIWSRSLANE 704
Cdd:pfam12931   75 avdelVPPSKNalwaldnWRETLALVLSNRSPGDVEALL-ALGDLLAQYGRTEAAHICFLLAGLPLsQTVLLGADHVRFP 153
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    705 RDGRSYAE--LLQDLMEktLVLALATGNKKFSASLC----KLFesYAEILASQGLLTTAMKY-------LKVLDSG---- 767
Cdd:pfam12931  154 STFGNDLEsiLLTEIYE--YALSLSPPQPPFVGLPHllpyKLQ--HAAVLAEYGLVSEAQKYcdaitasLKSLTKKspyy 229
                          250
                   ....*....|....*.
gi 30695804    768 --GLSPELSILRDRIS 781
Cdd:pfam12931  230 hpTLLAQLEDLSNRLS 245
PHA03247 PHA03247
large tegument protein UL36; Provisional
786-1020 1.09e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.40  E-value: 1.09e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   786 PETNTTASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPYTDSYYVPQVSHPPMQQPTMF-MPHQAQPAPQPSfTPA 864
Cdd:PHA03247 2631 PSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGsLTSLADPPPPPP-TPE 2709
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   865 PtsnaQPSMRTTFVPsTPPALKNADQyqqptmsshsfTGPSNNAYPVPPGP--GQYAPSGPSQLGQYPNPKMPQVVAPAA 942
Cdd:PHA03247 2710 P----APHALVSATP-LPPGPAAARQ-----------ASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPAPAPPA 2773
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   943 GPIGFTPMAT--PGVAPRSVQ----PASPPTQQAAAQAAPAPATPPPTVQTADTSNVPAHQKPVIATLTRLFNETSEALG 1016
Cdd:PHA03247 2774 APAAGPPRRLtrPAVASLSESreslPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLG 2853

                  ....
gi 30695804  1017 GARA 1020
Cdd:PHA03247 2854 GSVA 2857
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
783-1005 2.01e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 52.08  E-value: 2.01e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    783 SAEPETNTTASGNTQP-----QSTMPYNQEPTQAQPNVLAnPYDNQYQQPYTDSyyvPQVSHPPMQQPTMFMPHQAQPAP 857
Cdd:pfam03154  262 SPQPLPQPSLHGQMPPmphslQTGPSHMQHPVPPQPFPLT-PQSSQSQVPPGPS---PAAPGQSQQRIHTPPSQSQLQSQ 337
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    858 QPS----FTPAPTSnaQPSMRTTfvPSTP-PALKNADQYQQPTMSS--HSFTGPSNnaYPVPPG-------PGQYAPSG- 922
Cdd:pfam03154  338 QPPreqpLPPAPLS--MPHIKPP--PTTPiPQLPNPQSHKHPPHLSgpSPFQMNSN--LPPPPAlkplsslSTHHPPSAh 411
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    923 --PSQLgqypNPKMPQVVAPAAGPIGFTPMAT---------PGVAPRSVQPASPPTQQAAAQAAPAPATPPPTVQTADTS 991
Cdd:pfam03154  412 ppPLQL----MPQSQQLPPPPAQPPVLTQSQSlpppaashpPTSGLHQVPSQSPFPQHPFVPGGPPPITPPSGPPTSTSS 487
                          250
                   ....*....|....
gi 30695804    992 NVPAHQKPVIATLT 1005
Cdd:pfam03154  488 AMPGIQPPSSASVS 501
PRK10263 PRK10263
DNA translocase FtsK; Provisional
797-957 2.75e-06

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 51.62  E-value: 2.75e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   797 QPQStmpYNQEPTQAQPNVlanPYDNQYQQPYTDSYYVPQVSHPPMQQPTMFMPHQAQPAPQPSFTPAPTsnaQPSMRTT 876
Cdd:PRK10263  376 APEG---YPQQSQYAQPAV---QYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPE---QPVAGNA 446
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   877 FVPSTP-------PALKNADQYQQPTMSSHSFTGPSNNAYP--VPPGPG--QYAPSGP----------------SQLGQY 929
Cdd:PRK10263  447 WQAEEQqstfapqSTYQTEQTYQQPAAQEPLYQQPQPVEQQpvVEPEPVveETKPARPplyyfeeveekrarerEQLAAW 526
                         170       180       190
                  ....*....|....*....|....*....|...
gi 30695804   930 ----PNP-KMPQVVAPAAGPigFTPMATPGVAP 957
Cdd:PRK10263  527 yqpiPEPvKEPEPIKSSLKA--PSVAAVPPVEA 557
DUF3824 pfam12868
Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It ...
819-922 6.02e-06

Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known.


Pssm-ID: 372351 [Multi-domain]  Cd Length: 145  Bit Score: 47.04  E-value: 6.02e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    819 PYDNQYQQPYTDSYYVPQVSHPPMQqptmfmPHQAQPAPQP---SFTPAPTSNAQPSmrTTFVPSTPPALKNADQYQQPT 895
Cdd:pfam12868   43 RYEDDYRDYYEDPYSPSPYPPSPAG------PYASQGQYYPetnYFPPPPGSTPQPP--VDPQPNAPPPPYNPADYPPPP 114
                           90       100
                   ....*....|....*....|....*..
gi 30695804    896 MSSHSftgPSNNAYPVPPGPGQYAPSG 922
Cdd:pfam12868  115 GAAPP---PQPYQYPPPPGPDPYAPRP 138
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
785-1010 1.13e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 49.65  E-value: 1.13e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    785 EPETNTTASGNTQPQSTMP-YNQEPTQAQPNvlANPYDNQYQqPYTDSYYVP--QVSH-----PPMQQPtmFMPHQAQPA 856
Cdd:pfam09770  106 QPAARAAQSSAQPPASSLPqYQYASQQSQQP--SKPVRTGYE-KYKEPEPIPdlQVDAslwgvAPKKAA--APAPAPQPA 180
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    857 PQPSFTPAPTSN------------AQPSMRTTFVPS---TPPALKNADQYQQPTMSSHSFTGPSNNAYPVPPGPGQYAPS 921
Cdd:pfam09770  181 AQPASLPAPSRKmmsleeveaamrAQAKKPAQQPAPapaQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQG 260
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    922 GPSQLGQYPNPKMPQVVAPAAGPIGFTpmATPGVAPRSVQPAS--------PPTQQAAAQAAPAPATPPPTVQTADTSNV 993
Cdd:pfam09770  261 HPVTILQRPQSPQPDPAQPSIQPQAQQ--FHQQPPPVPVQPTQilqnpnrlSAARVGYPQNPQPGVQPAPAHQAHRQQGS 338
                          250       260
                   ....*....|....*....|
gi 30695804    994 PAHQKPVIAT---LTRLFNE 1010
Cdd:pfam09770  339 FGRQAPIITHpqqLAQLSEE 358
PRK10263 PRK10263
DNA translocase FtsK; Provisional
783-911 1.39e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 49.31  E-value: 1.39e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   783 SAEPETNTTASGNTQPQSTMPYNQEPTQ---AQPNVLANPyDNQYQQPYTDSYYVPQVSHPpmQQPTMFMPHQAQP---- 855
Cdd:PRK10263  748 IVEPVQQPQQPVAPQQQYQQPQQPVAPQpqyQQPQQPVAP-QPQYQQPQQPVAPQPQYQQP--QQPVAPQPQYQQPqqpv 824
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 30695804   856 APQPSFTpAPTSNAQPSMRTTFVpsTPPALKNADQ--YQQPTMSSHS---FTGPSNNAYPV 911
Cdd:PRK10263  825 APQPQYQ-QPQQPVAPQPQDTLL--HPLLMRNGDSrpLHKPTTPLPSldlLTPPPSEVEPV 882
YppG COG5894
Spore coat protein YppG [Cell cycle control, cell division, chromosome partitioning];
795-861 1.84e-05

Spore coat protein YppG [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 444596 [Multi-domain]  Cd Length: 112  Bit Score: 44.85  E-value: 1.84e-05
                         10        20        30        40        50        60        70
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 30695804  795 NTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPY---TDSYYvPQvsHPPmqQPTMFM---PHQAQPAPQPSF 861
Cdd:COG5894    2 HWYQRNMNMYHYARPALRPEQPYGPYQNQHQQPYyqqTNTQQ-PF--PPP--SPTPYPspkPLQTQPSQFQSL 69
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
114-151 3.08e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 41.91  E-value: 3.08e-05
                            10        20        30
                    ....*....|....*....|....*....|....*...
gi 30695804     114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD 151
Cdd:smart00320    4 LLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
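
The WD40 description on this page says a repeat typically has a GH dipeptide near its N-terminus and a WD dipeptide at its C-terminus, with a conserved core between them. A rough sketch of scanning a sequence for such GH...WD spans is shown below. The gap bounds (23 to 41 residues between GH and WD) and the function name are assumptions made for illustration and are not stated on this page. The dipeptides are typical rather than invariant, so a scan like this will miss many genuine repeats and is no substitute for the profile-based hits listed here.

```python
import re


def find_gh_wd_cores(seq, offset=1, min_gap=23, max_gap=41):
    """Return (start, end, matched_text) for each GH...WD span.

    offset is the protein coordinate of seq[0]. min_gap and max_gap bound
    the number of residues between GH and WD; both are assumed values.
    Matching is lazy, so the shortest qualifying span is reported.
    """
    pattern = re.compile(r"GH[A-Z]{%d,%d}?WD" % (min_gap, max_gap))
    seq = seq.upper()  # lower-case (unaligned) residues are still residues
    return [
        (offset + m.start(), offset + m.end() - 1, m.group())
        for m in pattern.finditer(seq)
    ]


# Query segment 114-151 from the smart00320 alignment above.
query = "LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD"
hits = find_gh_wd_cores(query, offset=114)

for start, end, text in hits:
    print(start, end, text)
```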
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
797-985 3.35e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 48.22  E-value: 3.35e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    797 QPQSTMPYNQEPTQAQPNV-------LANPYDNQYQQPYTDSyyVPQVSHPPMQQP----TMFMPH---------QAQPA 856
Cdd:pfam03154  291 HPVPPQPFPLTPQSSQSQVppgpspaAPGQSQQRIHTPPSQS--QLQSQQPPREQPlppaPLSMPHikpppttpiPQLPN 368
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    857 PQ----PSFTPAPT-----SNAQP--------SMRTTFVPST-PPALKNADQYQ-------QPTMSSHSFTGPSNNAYPV 911
Cdd:pfam03154  369 PQshkhPPHLSGPSpfqmnSNLPPppalkplsSLSTHHPPSAhPPPLQLMPQSQqlppppaQPPVLTQSQSLPPPAASHP 448
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 30695804    912 PPGPGQYAPSGPSqLGQYP-NPKMPQVVAPAAGPIGFTPMATPGVAPRSVQPASPPTQQAAAqaapapatPPPTV 985
Cdd:pfam03154  449 PTSGLHQVPSQSP-FPQHPfVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAA--------VSCPL 514
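
Each alignment row above carries a start coordinate, an aligned string, and an end coordinate. As a rough sanity check, the number of residue letters in the aligned string (gaps excluded; lowercase letters counted as residues) should equal end - start + 1. The sketch below assumes the row layout shown on this page; the regular expression and function names are illustrative and are not part of any NCBI tool.

```python
import re

# Matches rows such as "gi 30695804    797 QPQ...QAQPA 856" or "Cdd:pfam03154  291 HPV...QLPN 368".
ROW = re.compile(r"^(?:gi \d+|Cdd:\S+)\s+(\d+)\s+(\S+)\s+(\d+)$")


def parse_row(line: str):
    """Split an alignment row into (start, aligned_string, end)."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"not an alignment row: {line!r}")
    return int(match.group(1)), match.group(2), int(match.group(3))


def residue_count(aligned: str) -> int:
    """Count residue letters, ignoring gap characters ('-')."""
    return sum(ch.isalpha() for ch in aligned)


def is_consistent(line: str) -> bool:
    """Check that the residue count agrees with the stated coordinates."""
    start, aligned, end = parse_row(line)
    return residue_count(aligned) == end - start + 1


query_row = ("gi 30695804    797 QPQSTMPYNQEPTQAQPNV-------LANPYDNQYQQPYTDSyyVPQVSHPPMQQP"
             "----TMFMPH---------QAQPA 856")
print(parse_row(query_row)[0], parse_row(query_row)[2], is_consistent(query_row))
```

A failed check would suggest that a row was truncated or garbled when the page was copied.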
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
797-865 6.46e-05

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 44.78  E-value: 6.46e-05
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695804     797 QPQSTMPYNQE--PTQAQPNVLANPYDNQYQQPYTDsyyvPQVSHPPMQQPT---MFMPHQAQPAPQPSFTPAP 865
Cdd:smart00818   71 QPLMPVPGQHSmtPTQHHQPNLPQPAQQPFQPQPLQ----PPQPQQPMQPQPpvhPIPPLPPQPPLPPMFPMQP 140
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
253-292 8.08e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 40.76  E-value: 8.08e-05
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 30695804     253 MSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWD 292
Cdd:smart00320    2 GELLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
WD40 pfam00400
WD domain, G-beta repeat;
114-151 9.68e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 40.79  E-value: 9.68e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 30695804    114 LVGHLSVHKGPVRGLEFNAiSSNLLASGADDGEICIWD 151
Cdd:pfam00400    3 LLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
PRK13335 PRK13335
superantigen-like protein SSL3; Reviewed;
779-894 1.28e-04

superantigen-like protein SSL3; Reviewed;


Pssm-ID: 139494 [Multi-domain]  Cd Length: 356  Bit Score: 45.50  E-value: 1.28e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   779 RISLSAEPETNTTASGNTQPQSTMPYNQEPTQAQPNVLANPYDnQYQQPYTDSYYVPQVSHPPmqQPTMFMPhQAQP--- 855
Cdd:PRK13335   51 MINITAGANSATTQAANTRQERTPKLEKAPNTNEEKTSASKIE-KISQPKQEEQKSLNISATP--APKQEQS-QTTTest 126
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|
gi 30695804   856 APQPSFTPAPTSNA-QPSMRTTFVPSTPPALKNADQYQQP 894
Cdd:PRK13335  127 TPKTKVTTPPSTNTpQPMQSTKSDTPQSPTIKQAQTDMTP 166
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
782-954 1.64e-04

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 45.77  E-value: 1.64e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    782 LSAEPETNTTASGNTQPQSTMPYNQEPT-QAQPNVLANPYDNQYQQPYTDSYYVPQ------VSHPPMQQPTMFMphqaQ 854
Cdd:pfam09606  298 MSIGDQNNYQQQQTRQQQQQQGGNHPAAhQQQMNQSVGQGGQVVALGGLNHLETWNpgnfggLGANPMQRGQPGM----M 373
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    855 PAPQPsftpaptsnaqpsmrttfVPSTPPALKNADQYQQPTMSSHsftgpsnNAYPVPPG--PGQYAPSGPS---QLGQY 929
Cdd:pfam09606  374 SSPSP------------------VPGQQVRQVTPNQFMRQSPQPS-------VPSPQGPGsqPPQSHPGGMIpspALIPS 428
                          170       180       190
                   ....*....|....*....|....*....|
gi 30695804    930 PNPKMPQVVA-----PAAGPIGftPMATPG 954
Cdd:pfam09606  429 PSPQMSQQPAqqrtiGQDSPGG--SLNTPG 456
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
792-944 2.17e-04

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
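
A degenerate consensus such as NNRCRCCYY can be scanned for by expanding each IUPAC code into a character class. The sketch below is illustrative only: it covers just the codes used in the description (N, R, Y plus the four bases), scans the given strand only, and uses a made-up test sequence. The names `IUPAC` and `find_sites` are hypothetical.

```python
import re

# Only the IUPAC codes that appear in the consensus above are handled.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def find_sites(motif: str, dna: str):
    """Return (offset, site) pairs for every match, overlapping ones included."""
    body = "".join(IUPAC[code] for code in motif.upper())
    # A lookahead with a capture group lets finditer report overlapping sites.
    pattern = re.compile(f"(?=({body}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


# Hypothetical toy sequence, not a real promoter.
print(find_sites("NNRCRCCYY", "TTAAGCACCTTGG"))
```

Scanning the reverse complement as well would be needed for a strand-agnostic search; that step is left out here to keep the sketch short.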


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 45.02  E-value: 2.17e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  792 ASGNTQP-QST-MPYNQEPTQAQPNVLANPYDNQYQQPytdsyyVPQVSHppMQQPTMFM--PHQAQpapQPSFTPAPTS 867
Cdd:cd22553  188 AGGGNQAlQAQvIPQLAQAAQLQPQQLAQVSSQGYIQQ------IPANAS--QQQPQMVQqgPNQSG---QIIGQVASAS 256
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804  868 NAQPSMRTTFVPSTPPALKNADQYQQ----PTMS----SHSFTGPSNNAYPV----------PPGPGQYAPSGPSQLGQY 929
Cdd:cd22553  257 SIQAAAIPLTVYTGALAGQNGSNQQQvgqiVTSPiqgmTQGLTAPASSSIPTvvqqqaiqgnPLPPGTQIIAAGQQLQQD 336
                        170
                 ....*....|....*....
gi 30695804  930 PN-PKMPQVVA---PAAGP 944
Cdd:cd22553  337 PNdPTKWQVVAdgtPGSKK 355
PHA02682 PHA02682
ORF080 virion core protein; Provisional
788-1058 3.74e-04

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 43.70  E-value: 3.74e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   788 TNTTASGNTQ-PQSTMPYNQEPtqAQPNVLANPYDnQYQQPYTDSYYVPQVSHPP--MQQPT---MFMPHQAQPAPQPSF 861
Cdd:PHA02682   19 ADTSSSLFTKcPQATIPAPAAP--CPPDADVDPLD-KYSVKEAGRYYQSRLKANSacMQRPSgqsPLAPSPACAAPAPAC 95
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   862 -TPAPTSNAQPSMRTTFVPSTPPALknadqyqqptmsshSFTGPSNNAYPVPPGPGQYAPsgPSQLGQYPNPKMPQVV-A 939
Cdd:PHA02682   96 pACAPAAPAPAVTCPAPAPACPPAT--------------APTCPPPAVCPAPARPAPACP--PSTRQCPPAPPLPTPKpA 159
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   940 PAAGPIGFTPMATPGVAPRSVQPaspptqqaaaqaapapatpppTVQTAdtsnvPAhQKPVIAtlTRLFNETSEALGGAR 1019
Cdd:PHA02682  160 PAAKPIFLHNQLPPPDYPAASCP---------------------TIETA-----PA-ASPVLE--PRIPDKIIDADNDDK 210
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*.
gi 30695804  1020 aNTTKKR--EIEDNSRKLGALFVKL-----NSGDISKNAADKLAQL 1058
Cdd:PHA02682  211 -DLIKKElaDIADSVRDLNAESLSLtrdieNAKSTTQAAIDDLRRL 255
WD40 pfam00400
WD domain, G-beta repeat;
253-292 3.80e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 38.87  E-value: 3.80e-04
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|
gi 30695804    253 MSPVREFTGHQRGVIAMEWCPsDSSYLLTCAKDNRTICWD 292
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVWD 39
PHA03247 PHA03247
large tegument protein UL36; Provisional
783-965 4.16e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.93  E-value: 4.16e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   783 SAEPETNTTASGNTQPQSTMPYNQEPTQAQP-------NVLANPYDNQYQQPYTDSYYVPQV-SHPP---MQQPTMFMPH 851
Cdd:PHA03247 2816 AALPPAASPAGPLPPPTSAQPTAPPPPPGPPppslplgGSVAPGGDVRRRPPSRSPAAKPAApARPPvrrLARPAVSRST 2895
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   852 QAQPAPQPSFTPAPTSNAQPSMRTTfvPSTPPALKNADQYQQPTMSSHsftgpsnnayPVPPGPGQYAPSGPSqlGQYPN 931
Cdd:PHA03247 2896 ESFALPPDQPERPPQPQAPPPPQPQ--PQPPPPPQPQPPPPPPPRPQP----------PLAPTTDPAGAGEPS--GAVPQ 2961
                         170       180       190
                  ....*....|....*....|....*....|....
gi 30695804   932 PKMPQVVApaaGPIGFTPMATPGVAPRSVQPASP 965
Cdd:PHA03247 2962 PWLGALVP---GRVAVPRFRVPQPAPSREAPASS 2992
DUF4106 pfam13388
Protein of unknown function (DUF4106); This family of proteins is found in large numbers in ...
786-875 6.18e-04

Protein of unknown function (DUF4106); This family of proteins is found in large numbers in the Trichomonas vaginalis proteome. The function of this protein is unknown.


Pssm-ID: 404296  Cd Length: 431  Bit Score: 43.73  E-value: 6.18e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    786 PETNTTASGNTQPQSTMPYNQEPTQaQPNVlanpyDNQYQQPytdsyyvpqVSHPPMQQPTMFMPHQAQPAPQPSFTPAP 865
Cdd:pfam13388  186 PKTFTSSHGHRHRHAPKPTVQNPAQ-QPTV-----QNPAQQP---------TQQPTVQNPAQQQNPAQQPPPQPAQQPTV 250
                           90
                   ....*....|
gi 30695804    866 TSNAQPSMRT 875
Cdd:pfam13388  251 QNPAQQQPQT 260
PHA03378 PHA03378
EBNA-3B; Provisional
792-960 6.80e-04

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 43.90  E-value: 6.80e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   792 ASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPytdsyyvPQVSHP----PMQQPtmfmphQAQPAPQPSFTPAPTS 867
Cdd:PHA03378  730 APGRARPPAAAPGRARPPAAAPGRARPPAAAPGRAR-------PPAAAPgaptPQPPP------QAPPAPQQRPRGAPTP 796
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   868 NAQPSMRTTFVPSTPPALKnADQYQQPTMSSHSFTGPSNNAYPVPPGPGQYAPSGPsqLGQYPNPKMPQVVAPAAGPIGF 947
Cdd:PHA03378  797 QPPPQAGPTSMQLMPRAAP-GQQGPTKQILRQLLTGGVKRGRPSLKKPAALERQAA--AGPTPSPGSGTSDKIVQAPVFY 873
                         170
                  ....*....|...
gi 30695804   948 TPMATPGVAPRSV 960
Cdd:PHA03378  874 PPVLQPIQVMRQL 886
YppG pfam14179
YppG-like protein; The YppG-like protein family includes the B. subtilis YppG protein, which ...
805-860 1.61e-03

YppG-like protein; The YppG-like protein family includes the B. subtilis YppG protein, which is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 115 and 181 amino acids in length. There are two completely conserved residues (F and G) that may be functionally important.


Pssm-ID: 372950 [Multi-domain]  Cd Length: 101  Bit Score: 39.02  E-value: 1.61e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 30695804    805 NQEPTQAQPNVLANPYDN---QYQQPYTDSYYVPQVSHPPMQQptmfMPHQAQPAPQPS 860
Cdd:pfam14179    1 YQHNSQPYPYFSQQVYQQpvqPQYPPFAPQQYMPQPPMPYMNP----YPKQQPQQQQPS 55
PHA03378 PHA03378
EBNA-3B; Provisional
790-977 1.63e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 42.75  E-value: 1.63e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   790 TTASGNTQPQSTMPYNQEPTQAQPNVLANPYDNQYQQPYTDSYYVPQVSH-PPMQQPTMFMPHQAQPAPQPsFTPAPTsN 868
Cdd:PHA03378  603 SQTPEPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHqPPQVEITPYKPTWTQIGHIP-YQPSPT-G 680
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   869 AQPSMRTTFVPST--PPAlknadqyQQPTMSSHSFTGPSN----NAYPVPPGPGQYAPSGPSQLGQYPNPKMPQVVAP-- 940
Cdd:PHA03378  681 ANTMLPIQWAPGTmqPPP-------RAPTPMRPPAAPPGRaqrpAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPgr 753
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|..
gi 30695804   941 -----AAGPIGFTPMATPGVAPRSVQPASPPTQQAAAQAAPA 977
Cdd:PHA03378  754 arppaAAPGRARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPT 795
PHA03378 PHA03378
EBNA-3B; Provisional
792-964 2.21e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 42.36  E-value: 2.21e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   792 ASGNTQPQSTMPYNQEPTQAQPNVLANPYDnqyqqpytdsyyVPQVSHPPMQQPTMFMPHQAQPAPQPSFTPAPTSNAQP 871
Cdd:PHA03378  690 APGTMQPPPRAPTPMRPPAAPPGRAQRPAA------------ATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPP 757
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   872 SMRTTFVPSTPPALKNADQYQQPTmsshsftgpsnnaypVPPGPGQY-----APSGPSQLGQYPNPKMPQVVAPAAGPIG 946
Cdd:PHA03378  758 AAAPGRARPPAAAPGAPTPQPPPQ---------------APPAPQQRprgapTPQPPPQAGPTSMQLMPRAAPGQQGPTK 822
                         170
                  ....*....|....*...
gi 30695804   947 FTPMATPGVAPRSVQPAS 964
Cdd:PHA03378  823 QILRQLLTGGVKRGRPSL 840
PTZ00421 PTZ00421
coronin; Provisional
93-313 2.37e-03

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 41.80  E-value: 2.37e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    93 DGNIDLWN-PLSLIGSQPSENALvgHLSVHKGPVRGLEFNAISSNLLASGADDGEICIWDLLKPSEpshfpllKGSGSAT 171
Cdd:PTZ00421   97 DGTIMGWGiPEEGLTQNISDPIV--HLQGHTKKVGIVSFHPSAMNVLASAGADMVVNVWDVERGKA-------VEVIKCH 167
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   172 QGEISFISWNRKvQQILASTSYNGTTVIWDLRKQKPIINFADSVRRRCSVLQWNPNVTTQIMVASDDDSSPTLKLWDMRN 251
Cdd:PTZ00421  168 SDQITSLEWNLD-GSLLCTTSKDKKLNIIDPRDGTIVSSVEAHASAKSQRCLWAKRKDLIITLGCSKSQQRQIMLWDTRK 246
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 30695804   252 IMSPVREFTGHQRGVIAMEWCPSDSSYLLTCAK-----------DNR-TICWDTNTAEIVAELPAGNNWNFDVH 313
Cdd:PTZ00421  247 MASPYSTVDLDQSSALFIPFFDEDTNLLYIGSKgegnircfelmNERlTFCSSYSSVEPHKGLCMMPKWSLDTR 320
PRK12757 PRK12757
cell division protein FtsN; Provisional
797-884 5.38e-03

cell division protein FtsN; Provisional


Pssm-ID: 237191 [Multi-domain]  Cd Length: 256  Bit Score: 40.03  E-value: 5.38e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   797 QPQSTMpyNQEPTQAqPNVlanPYDNQYQQPYTDSYYVPQVSHPPMQQPTMFMPHQAQPAPQPSfTPAPTSNAQPSMRTT 876
Cdd:PRK12757   90 QMQADM--RQQPTQL-SEV---PYNEQTPQVPRSTVQIQQQAQQQQPPATTAQPQPVTPPRQTT-APVQPQTPAPVRTQP 162

                  ....*...
gi 30695804   877 FVPSTPPA 884
Cdd:PRK12757  163 AAPVTQAV 170
PRK10263 PRK10263
DNA translocase FtsK; Provisional
815-999 5.38e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 40.84  E-value: 5.38e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   815 VLANPYD-----NQYQQPYTDSYYvPQVSHPPMQQPTmfmphqAQPAPQPSFTPAPTSNAQPSMRTTFVPSTPPALknad 889
Cdd:PRK10263  285 VAADPDDvlfsgNRATQPEYDEYD-PLLNGAPITEPV------AVAAAATTATQSWAAPVEPVTQTPPVASVDVPP---- 353
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804   890 qyQQPTMSSHSFTGPSNNAYPVPPGPGQYAPSgpsqlGQYPNPKMPQvVAPAAGPIgftPMATPGVAPRSVQPASPPTQQ 969
Cdd:PRK10263  354 --AQPTVAWQPVPGPQTGEPVIAPAPEGYPQQ-----SQYAQPAVQY-NEPLQQPV---QPQQPYYAPAAEQPAQQPYYA 422
                         170       180       190
                  ....*....|....*....|....*....|
gi 30695804   970 AAAQAAPAPATPPPTVQTADTSNVPAHQKP 999
Cdd:PRK10263  423 PAPEQPAQQPYYAPAPEQPVAGNAWQAEEQ 452
Gag_spuma pfam03276
Spumavirus gag protein;
857-962 9.36e-03

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 40.12  E-value: 9.36e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 30695804    857 PQPSFTPAPTSNAQPSMRTTfVPSTPPALKNADQyqqpTMSSHSFTGPSNNAYPVPPGPG-QYAPSGP--SQLGQYPN-- 931
Cdd:pfam03276  196 PSLPAIGGIHLPAIPGIHAR-APPGNIARSLGDD----IMPSLGDAGMPQPRFAFHPGNPfAEAEGHPfaEAEGERPRdi 270
                           90       100       110
                   ....*....|....*....|....*....|..
gi 30695804    932 PKMPQVVAPAA-GPIGFTPMATPGVAPRSVQP 962
Cdd:pfam03276  271 PRAPRIDAPSApAIPAIQPIAPPMIPPIGAPI 302
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:Database: CDSEARCH/cdd   Low complexity filter: no  Composition Based Adjustment: yes   E-value threshold: 0.01
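
The E-value threshold listed above can be applied to the per-hit summary lines ("Pssm-ID: ... E-value: ...") with a small parser. This is a sketch under stated assumptions: the field layout is taken from the lines on this page, the function names are hypothetical, and treating the threshold as inclusive (<=) is an assumption rather than documented CD-Search behaviour.

```python
import re

# Field layout assumed from the summary lines on this page.
SUMMARY = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)


def parse_summary(line: str) -> dict:
    """Extract the numeric fields from one hit summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(match.group(1)),
        "cd_length": int(match.group(2)),
        "bit_score": float(match.group(3)),
        "e_value": float(match.group(4)),
    }


def passes_threshold(line: str, threshold: float = 0.01) -> bool:
    """True when the hit's E-value is at or below the threshold (assumed inclusive)."""
    return parse_summary(line)["e_value"] <= threshold


hit = "Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 40.12  E-value: 9.36e-03"
print(parse_summary(hit), passes_threshold(hit))
```

Lowering `threshold` in the call gives a stricter filter, for example to keep only the hits reported with E-values near 1e-05.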
