Conserved domains on [gi|21553119|ref|NP_660140|]

EMILIN-2 isoform 1 precursor [Mus musculus]

List of domain hits

Name Accession Description Interval E-value
EMI pfam07546
48-118 5.12e-21

EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains.



Pssm-ID: 462204  Cd Length: 69  Bit Score: 87.86  E-value: 5.12e-21
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 21553119     48 KNWCAYivnKNVSCTVQEGSESFIQAQYNCP----WNQMPCPSalvYRVNFRPRFVTRYKIVTQLEWRCCPGFRG 118
Cdd:pfam07546    1 RNVCAY---KVVSCVVVTGTESYVQPVYKPYltwcAGHRRCST---YRTTYRPAYRQVYKTVTRLEWRCCPGWGG 69
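
As a rough way to read the alignment above, the following sketch counts identical residues over the aligned columns of the two rows. It assumes the usual CD-Search display convention (uppercase letters are aligned columns, lowercase letters are unaligned residues, dashes are gaps). The function name `aligned_identity` is illustrative and is not part of any NCBI tool.

```python
def aligned_identity(query_row: str, subject_row: str) -> tuple[int, int]:
    """Return (identical, aligned) counts for two equal-length alignment rows.

    Assumption: a column counts as aligned only when both rows show an
    uppercase letter; lowercase letters and '-' mark unaligned residues
    and gaps and are skipped.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    identical = aligned = 0
    for q, s in zip(query_row, subject_row):
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned


# Rows copied from the EMI (pfam07546) alignment, query residues 48-118.
QUERY = "KNWCAYivnKNVSCTVQEGSESFIQAQYNCP----WNQMPCPSalvYRVNFRPRFVTRYKIVTQLEWRCCPGFRG"
PSSM = "RNVCAY---KVVSCVVVTGTESYVQPVYKPYltwcAGHRRCST---YRTTYRPAYRQVYKTVTRLEWRCCPGWGG"

if __name__ == "__main__":
    identical, aligned = aligned_identity(QUERY, PSSM)
    print(f"{identical}/{aligned} aligned columns identical "
          f"({100 * identical / aligned:.1f}%)")
```

The same function can be applied to any query/Cdd row pair on this page, provided both rows are copied without the leading labels and residue numbers.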
C1q super family cl23878
929-1067 3.04e-13

C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.


The actual alignment was detected with superfamily member pfam00386:

Pssm-ID: 420072 [Multi-domain]  Cd Length: 126  Bit Score: 67.31  E-value: 3.04e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    929 FSAGLTQKPFPSDGGVVLFNKVLVNDGDVYNPNTGIFTAPYDGRYLITATLT--PERDTYVEavLSVSNASVAQLHTAGY 1006
Cdd:pfam00386    2 FSAGRTTGLTAPNEQPVRFDKVLTNIGGHYDPATGKFTCPVPGVYYFSYHITtvDGKSLYVS--LVKNGQEVVSFYDQPQ 79
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 21553119   1007 RREFleyhrppgavHTCGGpgafHLIVHLKAGDGVNVVVTGGRLAHTDFDEMYSTFSGvFL 1067
Cdd:pfam00386   80 KGSL----------DVASG----SVVLELQRGDEVWLQLTGYNGLYYDGSDTDSTFSG-FL 125
DR0291 super family cl34310
265-393 8.15e-06

Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General function prediction only];


The actual alignment was detected with superfamily member COG1579:

Pssm-ID: 441187 [Multi-domain]  Cd Length: 236  Bit Score: 48.38  E-value: 8.15e-06
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  265 KDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQ------EAAQGptvTMTTNELYQA------YVDSKIDALRE 332
Cdd:COG1579   34 AELEDELAALEARLEAAKTELEDLEKEIKRLELEIEEVEarikkyEEQLG---NVRNNKEYEAlqkeieSLKRRISDLED 110
                         90       100       110       120       130       140
                 ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 21553119  333 ELMEGMDRkLADLKNtceyKLVGLQQQCDDygssylgVIELIGEKEASLKKDIADLRAQLQ 393
Cdd:COG1579  111 EILELMER-IEELEE----ELAELEAELAE-------LEAELEEKKAELDEELAELEAELE 159
COG4372 COG4372
264-581 9.75e-05

Uncharacterized protein, contains DUF3084 domain [Function unknown];



Pssm-ID: 443500 [Multi-domain]  Cd Length: 370  Bit Score: 46.05  E-value: 9.75e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  264 MKDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAaqgptvtmttnelyQAYVDSKIDALREELmEGMDRKLA 343
Cdd:COG4372   54 LEQAREELEQLEEELEQARSELEQLEEELEELNEQLQAAQAE--------------LAQAQEELESLQEEA-EELQEELE 118
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  344 DLKNtcEYKLvgLQQQCDDYGSSYLGVIELIGEKEA---SLKKDIADLRAQLQDPVAQPSCCNGQKSSdfgPQIKALDQK 420
Cdd:COG4372  119 ELQK--ERQD--LEQQRKQLEAQIAELQSEIAEREEelkELEEQLESLQEELAALEQELQALSEAEAE---QALDELLKE 191
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  421 IERVA-------EATRMLNGRLDNEFDRLSVPEPDADFDARWTELDARINVTEKNAEEHCFYIEETLRGTINGEVDDLRK 493
Cdd:COG4372  192 ANRNAekeeelaEAEKLIESLPRELAEELLEAKDSLEAKLGLALSALLDALELEEDKEELLEEVILKEIEELELAILVEK 271
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  494 LLNEKIHSLEDRLGIVLQAANSSDVELTPMGPALPE-QPGAENEQVLMELSRLKDKVQVVEDFCLQSLPHGIDGALPSVE 572
Cdd:COG4372  272 DTEEEELEIAALELEALEEAALELKLLALLLNLAALsLIGALEDALLAALLELAKKLELALAILLAELADLLQLLLVGLL 351

                 ....*....
gi 21553119  573 DLTHVSLSL 581
Cdd:COG4372  352 DNDVLELLS 360
Collagen pfam01391
851-911 1.86e-04

Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine, the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.



Pssm-ID: 460189 [Multi-domain]  Cd Length: 57  Bit Score: 40.17  E-value: 1.86e-04
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 21553119    851 PGRTGLPFLPGSSGvimETGEAGPPGRMGVSGRGLPRGVDGQMGQ-GPihssEGYAGAPGYP 911
Cdd:pfam01391    3 PGPPGPPGPPGPPG---PPGPPGPPGPPGPPGEPGPPGPPGPPGPpGP----PGAPGAPGPP 57
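
The G-X-Y periodicity described for pfam01391 (glycine at every third position) can be checked against the aligned query segment. The sketch below is illustrative only: `glycine_by_frame` is a hypothetical helper, and it assumes that removing the gap character and uppercasing the unaligned residues of the query row recovers the contiguous segment 851-911.

```python
def glycine_by_frame(segment: str) -> list[int]:
    """Count glycines falling on every third position, for each of 3 frames."""
    counts = [0, 0, 0]
    for index, residue in enumerate(segment):
        if residue == "G":
            counts[index % 3] += 1
    return counts


# Query row copied from the pfam01391 alignment (residues 851-911).
ROW = "PGRTGLPFLPGSSGvimETGEAGPPGRMGVSGRGLPRGVDGQMGQ-GPihssEGYAGAPGYP"
# Drop the gap character and uppercase the unaligned residues to recover
# the contiguous query segment (an assumption about the display format).
SEGMENT = ROW.replace("-", "").upper()

if __name__ == "__main__":
    counts = glycine_by_frame(SEGMENT)
    best = max(range(3), key=counts.__getitem__)
    slots = len(range(best, len(SEGMENT), 3))
    print(f"frame {best}: {counts[best]} of {slots} positions are glycine")
```

A strong bias toward one frame is what a G-X-Y repeat would produce; the count says nothing about proline or hydroxy-proline content at the X and Y positions.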
PHA03378 super family cl33729
780-924 3.41e-03

EBNA-3B; Provisional


The actual alignment was detected with superfamily member PHA03378:

Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 41.59  E-value: 3.41e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   780 KFQPSAT-EEPSEATEGPSGKTPLESTRPSEEAPTEPPRLTPLPEDPAGPPQTGQQPvlpqRPLQPPPLPAWPGRTGLPF 858
Cdd:PHA03378  599 VPHPSQTpEPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPP----QVEITPYKPTWTQIGHIPY 674
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 21553119   859 LPGSSGVIMETGEAGPPGRMGVSGRGLPRGVDGQMGQGPIHSSegyAGAPGyPKSPPVTTPGVPLP 924
Cdd:PHA03378  675 QPSPTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAPPGRAQRP---AAATG-RARPPAAAPGRARP 736
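
Several of the hits listed above cover overlapping stretches of the query. A minimal sketch for finding such overlaps from the Interval column follows. The `Hit` record and `overlapping_pairs` function are hypothetical names introduced here, and intervals are treated as inclusive residue ranges.

```python
from itertools import combinations
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    start: int
    end: int
    evalue: float


# Intervals and E-values copied from the list of domain hits above.
HITS = [
    Hit("EMI", 48, 118, 5.12e-21),
    Hit("C1q super family", 929, 1067, 3.04e-13),
    Hit("DR0291 super family", 265, 393, 8.15e-06),
    Hit("COG4372", 264, 581, 9.75e-05),
    Hit("Collagen", 851, 911, 1.86e-04),
    Hit("PHA03378 super family", 780, 924, 3.41e-03),
]


def overlapping_pairs(hits: list[Hit]) -> list[tuple[str, str]]:
    """Return name pairs whose inclusive residue intervals share a position."""
    return [
        (a.name, b.name)
        for a, b in combinations(hits, 2)
        if a.start <= b.end and b.start <= a.end
    ]


if __name__ == "__main__":
    for first, second in overlapping_pairs(HITS):
        print(f"{first} overlaps {second}")
```

When two hits overlap, the one with the lower E-value is usually the more informative annotation for that region, though that is a rule of thumb and not something the page itself states.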
 
Name Accession Description Interval E-value
EMI pfam07546
48-118 5.12e-21

EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains.


Pssm-ID: 462204  Cd Length: 69  Bit Score: 87.86  E-value: 5.12e-21
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 21553119     48 KNWCAYivnKNVSCTVQEGSESFIQAQYNCP----WNQMPCPSalvYRVNFRPRFVTRYKIVTQLEWRCCPGFRG 118
Cdd:pfam07546    1 RNVCAY---KVVSCVVVTGTESYVQPVYKPYltwcAGHRRCST---YRTTYRPAYRQVYKTVTRLEWRCCPGWGG 69
C1q pfam00386
929-1067 3.04e-13

C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.


Pssm-ID: 395310 [Multi-domain]  Cd Length: 126  Bit Score: 67.31  E-value: 3.04e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    929 FSAGLTQKPFPSDGGVVLFNKVLVNDGDVYNPNTGIFTAPYDGRYLITATLT--PERDTYVEavLSVSNASVAQLHTAGY 1006
Cdd:pfam00386    2 FSAGRTTGLTAPNEQPVRFDKVLTNIGGHYDPATGKFTCPVPGVYYFSYHITtvDGKSLYVS--LVKNGQEVVSFYDQPQ 79
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 21553119   1007 RREFleyhrppgavHTCGGpgafHLIVHLKAGDGVNVVVTGGRLAHTDFDEMYSTFSGvFL 1067
Cdd:pfam00386   80 KGSL----------DVASG----SVVLELQRGDEVWLQLTGYNGLYYDGSDTDSTFSG-FL 125
C1Q smart00110
927-1069 9.64e-10

Complement component C1q domain; Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor.


Pssm-ID: 128420  Cd Length: 135  Bit Score: 57.70  E-value: 9.64e-10
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119     927 VSFSAGLTqKPFPSDGGVVLFNKVLVNDGDVYNPNTGIFTAPYDGRYLITATLTpERDTYVEAVLSVSNASVaqlhtagy 1006
Cdd:smart00110    8 SAFSVIRS-NRPPPPGQPIRFDKVLYNQQGHYDPRTGKFTCPVPGVYYFSYHVE-SKGRNVKVSLMKNGIQV-------- 77
                            90       100       110       120       130       140
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 21553119    1007 RREFLEYHRpPGAVHTCGGpgafhLIVHLKAGDGV---NVVVTGGRLAHTdfdEMYSTFSGVFLYP 1069
Cdd:smart00110   78 MSTYDEYQK-GLYDVASGG-----ALLQLRQGDQVwleLPDEKNGLYAGE---YVDSTFSGFLLFP 134
DR0291 COG1579
265-393 8.15e-06

Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General function prediction only];


Pssm-ID: 441187 [Multi-domain]  Cd Length: 236  Bit Score: 48.38  E-value: 8.15e-06
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  265 KDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQ------EAAQGptvTMTTNELYQA------YVDSKIDALRE 332
Cdd:COG1579   34 AELEDELAALEARLEAAKTELEDLEKEIKRLELEIEEVEarikkyEEQLG---NVRNNKEYEAlqkeieSLKRRISDLED 110
                         90       100       110       120       130       140
                 ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 21553119  333 ELMEGMDRkLADLKNtceyKLVGLQQQCDDygssylgVIELIGEKEASLKKDIADLRAQLQ 393
Cdd:COG1579  111 EILELMER-IEELEE----ELAELEAELAE-------LEAELEEKKAELDEELAELEAELE 159
COG4372 COG4372
264-581 9.75e-05

Uncharacterized protein, contains DUF3084 domain [Function unknown];


Pssm-ID: 443500 [Multi-domain]  Cd Length: 370  Bit Score: 46.05  E-value: 9.75e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  264 MKDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAaqgptvtmttnelyQAYVDSKIDALREELmEGMDRKLA 343
Cdd:COG4372   54 LEQAREELEQLEEELEQARSELEQLEEELEELNEQLQAAQAE--------------LAQAQEELESLQEEA-EELQEELE 118
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  344 DLKNtcEYKLvgLQQQCDDYGSSYLGVIELIGEKEA---SLKKDIADLRAQLQDPVAQPSCCNGQKSSdfgPQIKALDQK 420
Cdd:COG4372  119 ELQK--ERQD--LEQQRKQLEAQIAELQSEIAEREEelkELEEQLESLQEELAALEQELQALSEAEAE---QALDELLKE 191
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  421 IERVA-------EATRMLNGRLDNEFDRLSVPEPDADFDARWTELDARINVTEKNAEEHCFYIEETLRGTINGEVDDLRK 493
Cdd:COG4372  192 ANRNAekeeelaEAEKLIESLPRELAEELLEAKDSLEAKLGLALSALLDALELEEDKEELLEEVILKEIEELELAILVEK 271
                        250       260       270       280       290       300       310       320
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  494 LLNEKIHSLEDRLGIVLQAANSSDVELTPMGPALPE-QPGAENEQVLMELSRLKDKVQVVEDFCLQSLPHGIDGALPSVE 572
Cdd:COG4372  272 DTEEEELEIAALELEALEEAALELKLLALLLNLAALsLIGALEDALLAALLELAKKLELALAILLAELADLLQLLLVGLL 351

                 ....*....
gi 21553119  573 DLTHVSLSL 581
Cdd:COG4372  352 DNDVLELLS 360
SMC_prok_B TIGR02168
261-614 1.37e-04

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 46.20  E-value: 1.37e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    261 ESGMKDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAAQGPTVTMTTNElyqayvdSKIDALREELMEgMDR 340
Cdd:TIGR02168  746 EERIAQLSKELTELEAEIEELEERLEEAEEELAEAEAEIEELEAQIEQLKEELKALR-------EALDELRAELTL-LNE 817
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    341 KLADLKNTCEyklvGLQQQCDDYGSSylgvIELIGEKEASLKKDIADLRAQLQDPVAQPSccngqKSSDfgpQIKALDQK 420
Cdd:TIGR02168  818 EAANLRERLE----SLERRIAATERR----LEDLEEQIEELSEDIESLAAEIEELEELIE-----ELES---ELEALLNE 881
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    421 IERVAEATRMLNGRLDNefdrlsvpepdadfdarwteLDARINVTEKNaeehcfyieetlrgtiNGEVDDLRKLLNEKIH 500
Cdd:TIGR02168  882 RASLEEALALLRSELEE--------------------LSEELRELESK----------------RSELRRELEELREKLA 925
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    501 SLEDRL-GIVLQAAN-----SSDVELTPMGP-ALPEQPGAENEQVLMELSRLKDKVQ--------VVEDFC-----LQSL 560
Cdd:TIGR02168  926 QLELRLeGLEVRIDNlqerlSEEYSLTLEEAeALENKIEDDEEEARRRLKRLENKIKelgpvnlaAIEEYEelkerYDFL 1005
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|....
gi 21553119    561 PHGIDGALPSVEDLthvsLSLLESLNDTMHRQFQETshsIQKLQEDVNALHSQL 614
Cdd:TIGR02168 1006 TAQKEDLTEAKETL----EEAIEEIDREARERFKDT---FDQVNENFQRVFPKL 1052
Collagen pfam01391
851-911 1.86e-04

Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine, the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.


Pssm-ID: 460189 [Multi-domain]  Cd Length: 57  Bit Score: 40.17  E-value: 1.86e-04
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 21553119    851 PGRTGLPFLPGSSGvimETGEAGPPGRMGVSGRGLPRGVDGQMGQ-GPihssEGYAGAPGYP 911
Cdd:pfam01391    3 PGPPGPPGPPGPPG---PPGPPGPPGPPGPPGEPGPPGPPGPPGPpGP----PGAPGAPGPP 57
Cast pfam10174
269-442 4.72e-04

RIM-binding protein of the cytomatrix active zone; This is a family of proteins that form part of the CAZ (cytomatrix at the active zone) complex which is involved in determining the site of synaptic vesicle fusion. The C-terminus is a PDZ-binding motif that binds directly to RIM (a small G protein Rab-3A effector). The family also contains four coiled-coil domains.


Pssm-ID: 431111 [Multi-domain]  Cd Length: 766  Bit Score: 44.43  E-value: 4.72e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    269 SELAEVKDTLKTKS-------DKLEELDGKVKGYEGQLKQLQEAAQG------PTVT-MTTNELYQAYVDSKIDALRE-- 332
Cdd:pfam10174  380 GEIRDLKDMLDVKErkinvlqKKIENLQEQLRDKDKQLAGLKERVKSlqtdssNTDTaLTTLEEALSEKERIIERLKEqr 459
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    333 --------ELMEGMDRKLADLKNTC----------EYKLVGLQQQCDDYGSSYLGVIELIGEKEASLKKDIAD-LRAQLQ 393
Cdd:pfam10174  460 eredrerlEELESLKKENKDLKEKVsalqpeltekESSLIDLKEHASSLASSGLKKDSKLKSLEIAVEQKKEEcSKLENQ 539
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 21553119    394 DPVAQPSCCNGQKSSDFGPQIKALDQKIERVAEATrmlnGRLDNEFDRL 442
Cdd:pfam10174  540 LKKAHNAEEAVRTNPEINDRIRLLEQEVARYKEES----GKAQAEVERL 584
SPEC cd00176
266-471 7.66e-04

Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here


Pssm-ID: 238103 [Multi-domain]  Cd Length: 213  Bit Score: 42.05  E-value: 7.66e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  266 DIKSELAEVKDTLKtksdKLEELDGKVKGYEGQLKQLQEAAQgptvTMTTNELYQA-YVDSKIDALR---EELMEGMDRK 341
Cdd:cd00176   27 DYGDDLESVEALLK----KHEALEAELAAHEERVEALNELGE----QLIEEGHPDAeEIQERLEELNqrwEELRELAEER 98
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  342 LADLKNtcEYKLVGLQQQCDDygssylgVIELIGEKEASLK-----KDIADLRAQLqdpvaqpsccngQKSSDFGPQIKA 416
Cdd:cd00176   99 RQRLEE--ALDLQQFFRDADD-------LEQWLEEKEAALAsedlgKDLESVEELL------------KKHKELEEELEA 157
                        170       180       190       200       210
                 ....*....|....*....|....*....|....*....|....*....|....*
gi 21553119  417 LDQKIERVAEATRMLNGRLdNEFDRLSVPEPDADFDARWTELDARINVTEKNAEE 471
Cdd:cd00176  158 HEPRLKSLNELAEELLEEG-HPDADEEIEEKLEELNERWEELLELAEERQKKLEE 211
PRK03918 PRK03918
265-507 1.18e-03

DNA double-strand break repair ATPase Rad50;


Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 43.13  E-value: 1.18e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   265 KDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAAqgptvtmttnelyqayvdSKIDALREEL--MEGMDRKL 342
Cdd:PRK03918  196 KEKEKELEEVLREINEISSELPELREELEKLEKEVKELEELK------------------EEIEELEKELesLEGSKRKL 257
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   343 -ADLKNTCEY------KLVGLQQQCDD------YGSSYLGVIEL---IGEKEASLKKDIADLRAQLQdpvaqpsccNGQK 406
Cdd:PRK03918  258 eEKIRELEERieelkkEIEELEEKVKElkelkeKAEEYIKLSEFyeeYLDELREIEKRLSRLEEEIN---------GIEE 328
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   407 ssdfgpQIKALDQKIERVAEatrmLNGRLDNEFDRLSVPEPDAdfdarwtELDARINVTEKNAEEH-----CFYIEEtlr 481
Cdd:PRK03918  329 ------RIKELEEKEERLEE----LKKKLKELEKRLEELEERH-------ELYEEAKAKKEELERLkkrltGLTPEK--- 388
                         250       260
                  ....*....|....*....|....*....
gi 21553119   482 gtINGEVDDLRKL---LNEKIHSLEDRLG 507
Cdd:PRK03918  389 --LEKELEELEKAkeeIEEEISKITARIG 415
SMC_prok_A TIGR02169
248-426 2.84e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 41.98  E-value: 2.84e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    248 AEPSQLPGIPSSKESGMKDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAAQGPTVTMttNELYQ--AYVDS 325
Cdd:TIGR02169  301 AEIASLERSIAEKERELEDAEERLAKLEAEIDKLLAEIEELEREIEEERKRRDKLTEEYAELKEEL--EDLRAelEEVDK 378
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    326 KIDALREELM------EGMDRKLADLKNTCEYKLVGLQQqcddygssylgvielIGEKEASLKKDIADLRAQLQDPVAQp 399
Cdd:TIGR02169  379 EFAETRDELKdyreklEKLKREINELKRELDRLQEELQR---------------LSEELADLNAAIAGIEAKINELEEE- 442
                          170       180
                   ....*....|....*....|....*..
gi 21553119    400 sccngqkSSDFGPQIKALDQKIERVAE 426
Cdd:TIGR02169  443 -------KEDKALEIKKQEWKLEQLAA 462
PHA03378 PHA03378
780-924 3.41e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 41.59  E-value: 3.41e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   780 KFQPSAT-EEPSEATEGPSGKTPLESTRPSEEAPTEPPRLTPLPEDPAGPPQTGQQPvlpqRPLQPPPLPAWPGRTGLPF 858
Cdd:PHA03378  599 VPHPSQTpEPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPP----QVEITPYKPTWTQIGHIPY 674
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 21553119   859 LPGSSGVIMETGEAGPPGRMGVSGRGLPRGVDGQMGQGPIHSSegyAGAPGyPKSPPVTTPGVPLP 924
Cdd:PHA03378  675 QPSPTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAPPGRAQRP---AAATG-RARPPAAAPGRARP 736
gly_rich_SclB NF038329
782-922 7.18e-03

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface-anchored adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 40.27  E-value: 7.18e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   782 QPSATEEPSEATE-GPSGKT-PLESTRPSEEAPTEPPRLTPLPEDPAGPPQTGQQPVLPQRPLQPPPLPAWP-------- 851
Cdd:NF038329  181 EAGAKGPAGEKGPqGPRGETgPAGEQGPAGPAGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPdgpagkdg 260
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   852 --GRTGLPFLPGSSGVIMETGEAGPPGRMGVSG-RGLP--------------RGVDGQMGQGPIHSSEGYAGAPGYPKSP 914
Cdd:NF038329  261 prGDRGEAGPDGPDGKDGERGPVGPAGKDGQNGkDGLPgkdgkdgqngkdglPGKDGKDGQPGKDGLPGKDGKDGQPGKP 340

                  ....*...
gi 21553119   915 PVTTPGVP 922
Cdd:NF038329  341 APKTPEVP 348
 
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
261-614 1.37e-04

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 46.20  E-value: 1.37e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    261 ESGMKDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAAQGPTVTMTTNElyqayvdSKIDALREELMEgMDR 340
Cdd:TIGR02168  746 EERIAQLSKELTELEAEIEELEERLEEAEEELAEAEAEIEELEAQIEQLKEELKALR-------EALDELRAELTL-LNE 817
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    341 KLADLKNTCEyklvGLQQQCDDYGSSylgvIELIGEKEASLKKDIADLRAQLQDPVAQPSccngqKSSDfgpQIKALDQK 420
Cdd:TIGR02168  818 EAANLRERLE----SLERRIAATERR----LEDLEEQIEELSEDIESLAAEIEELEELIE-----ELES---ELEALLNE 881
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    421 IERVAEATRMLNGRLDNefdrlsvpepdadfdarwteLDARINVTEKNaeehcfyieetlrgtiNGEVDDLRKLLNEKIH 500
Cdd:TIGR02168  882 RASLEEALALLRSELEE--------------------LSEELRELESK----------------RSELRRELEELREKLA 925
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    501 SLEDRL-GIVLQAAN-----SSDVELTPMGP-ALPEQPGAENEQVLMELSRLKDKVQ--------VVEDFC-----LQSL 560
Cdd:TIGR02168  926 QLELRLeGLEVRIDNlqerlSEEYSLTLEEAeALENKIEDDEEEARRRLKRLENKIKelgpvnlaAIEEYEelkerYDFL 1005
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|....
gi 21553119    561 PHGIDGALPSVEDLthvsLSLLESLNDTMHRQFQETshsIQKLQEDVNALHSQL 614
Cdd:TIGR02168 1006 TAQKEDLTEAKETL----EEAIEEIDREARERFKDT---FDQVNENFQRVFPKL 1052
Collagen pfam01391
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ...
851-911 1.86e-04

Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine, the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.


Pssm-ID: 460189 [Multi-domain]  Cd Length: 57  Bit Score: 40.17  E-value: 1.86e-04
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 21553119    851 PGRTGLPFLPGSSGvimETGEAGPPGRMGVSGRGLPRGVDGQMGQ-GPihssEGYAGAPGYP 911
Cdd:pfam01391    3 PGPPGPPGPPGPPG---PPGPPGPPGPPGPPGEPGPPGPPGPPGPpGP----PGAPGAPGPP 57
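
The G-X-Y periodicity described for this family can be checked directly on an alignment row. The following is a minimal sketch, not part of the CD-Search output: the function name `gxy_frame_stats` is hypothetical, and it assumes that alignment gaps appear as `-` and that lowercase letters are ordinary residues. It strips the gaps, then counts glycines at every third position for each of the three possible reading frames and reports the frame with the most glycines.

```python
def gxy_frame_stats(aligned_row):
    """Return (best_frame, glycine_hits, positions_in_frame) for a G-X-Y check.

    aligned_row is one sequence row of an alignment block; gap characters
    are dropped and case is ignored before counting.
    """
    seq = "".join(ch for ch in aligned_row.upper() if ch.isalpha())
    stats = []
    for frame in range(3):
        positions = range(frame, len(seq), 3)
        hits = sum(seq[i] == "G" for i in positions)
        stats.append((hits, frame, len(positions)))
    # max() keeps the first frame on ties, so frame 0 wins an all-zero case.
    hits, frame, total = max(stats, key=lambda s: s[0])
    return frame, hits, total


# Cdd:pfam01391 row copied from the alignment above (model positions 3-57).
CONSENSUS_ROW = "PGPPGPPGPPGPPG---PPGPPGPPGPPGPPGEPGPPGPPGPPGPpGP----PGAPGAPGPP"

print(gxy_frame_stats(CONSENSUS_ROW))  # (1, 18, 18)
```

For the pfam01391 row every third residue in the best frame is glycine (18 of 18). Running the same function on the gi 21553119 row gives a rough impression of how closely the query follows the repeat, although a simple count like this ignores where the insertions fall.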
COG4913 COG4913
Uncharacterized conserved protein, contains a C-terminal ATPase domain [Function unknown];
265-547 4.16e-04



Pssm-ID: 443941 [Multi-domain]  Cd Length: 1089  Bit Score: 44.52  E-value: 4.16e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  265 KDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAAQgptvtMTTNELYQAYVDSKIDALREELmEGMDRKLAD 344
Cdd:COG4913  613 AALEAELAELEEELAEAEERLEALEAELDALQERREALQRLAE-----YSWDEIDVASAEREIAELEAEL-ERLDASSDD 686
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  345 LKntceyklvGLQQQcddygssylgvieligekEASLKKDIADLRAQLQDpvaqpscCNGqkssdfgpQIKALDQKIERV 424
Cdd:COG4913  687 LA--------ALEEQ------------------LEELEAELEELEEELDE-------LKG--------EIGRLEKELEQA 725
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  425 AEATRMLNGRLDnEFDRLSVPEPDADFDARWTELDarinvteknAEEHcfyiEETLRGTINGEVDDLRkllnEKIHSLED 504
Cdd:COG4913  726 EEELDELQDRLE-AAEDLARLELRALLEERFAAAL---------GDAV----ERELRENLEERIDALR----ARLNRAEE 787
                        250       260       270       280
                 ....*....|....*....|....*....|....*....|....
gi 21553119  505 RLGIVLQAANSS-DVELTPMGPALpeqpgAENEQVLMELSRLKD 547
Cdd:COG4913  788 ELERAMRAFNREwPAETADLDADL-----ESLPEYLALLDRLEE 826
Cast pfam10174
RIM-binding protein of the cytomatrix active zone; This is a family of proteins that form part ...
269-442 4.72e-04

RIM-binding protein of the cytomatrix active zone; This is a family of proteins that form part of the CAZ (cytomatrix at the active zone) complex which is involved in determining the site of synaptic vesicle fusion. The C-terminus is a PDZ-binding motif that binds directly to RIM (a small G protein Rab-3A effector). The family also contains four coiled-coil domains.


Pssm-ID: 431111 [Multi-domain]  Cd Length: 766  Bit Score: 44.43  E-value: 4.72e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    269 SELAEVKDTLKTKS-------DKLEELDGKVKGYEGQLKQLQEAAQG------PTVT-MTTNELYQAYVDSKIDALRE-- 332
Cdd:pfam10174  380 GEIRDLKDMLDVKErkinvlqKKIENLQEQLRDKDKQLAGLKERVKSlqtdssNTDTaLTTLEEALSEKERIIERLKEqr 459
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    333 --------ELMEGMDRKLADLKNTC----------EYKLVGLQQQCDDYGSSYLGVIELIGEKEASLKKDIAD-LRAQLQ 393
Cdd:pfam10174  460 eredrerlEELESLKKENKDLKEKVsalqpeltekESSLIDLKEHASSLASSGLKKDSKLKSLEIAVEQKKEEcSKLENQ 539
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 21553119    394 DPVAQPSCCNGQKSSDFGPQIKALDQKIERVAEATrmlnGRLDNEFDRL 442
Cdd:pfam10174  540 LKKAHNAEEAVRTNPEINDRIRLLEQEVARYKEES----GKAQAEVERL 584
SPEC cd00176
Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members ...
266-471 7.66e-04

Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here


Pssm-ID: 238103 [Multi-domain]  Cd Length: 213  Bit Score: 42.05  E-value: 7.66e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  266 DIKSELAEVKDTLKtksdKLEELDGKVKGYEGQLKQLQEAAQgptvTMTTNELYQA-YVDSKIDALR---EELMEGMDRK 341
Cdd:cd00176   27 DYGDDLESVEALLK----KHEALEAELAAHEERVEALNELGE----QLIEEGHPDAeEIQERLEELNqrwEELRELAEER 98
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  342 LADLKNtcEYKLVGLQQQCDDygssylgVIELIGEKEASLK-----KDIADLRAQLqdpvaqpsccngQKSSDFGPQIKA 416
Cdd:cd00176   99 RQRLEE--ALDLQQFFRDADD-------LEQWLEEKEAALAsedlgKDLESVEELL------------KKHKELEEELEA 157
                        170       180       190       200       210
                 ....*....|....*....|....*....|....*....|....*....|....*
gi 21553119  417 LDQKIERVAEATRMLNGRLdNEFDRLSVPEPDADFDARWTELDARINVTEKNAEE 471
Cdd:cd00176  158 HEPRLKSLNELAEELLEEG-HPDADEEIEEKLEELNERWEELLELAEERQKKLEE 211
PRK03918 PRK03918
DNA double-strand break repair ATPase Rad50;
265-507 1.18e-03



Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 43.13  E-value: 1.18e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   265 KDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAAqgptvtmttnelyqayvdSKIDALREEL--MEGMDRKL 342
Cdd:PRK03918  196 KEKEKELEEVLREINEISSELPELREELEKLEKEVKELEELK------------------EEIEELEKELesLEGSKRKL 257
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   343 -ADLKNTCEY------KLVGLQQQCDD------YGSSYLGVIEL---IGEKEASLKKDIADLRAQLQdpvaqpsccNGQK 406
Cdd:PRK03918  258 eEKIRELEERieelkkEIEELEEKVKElkelkeKAEEYIKLSEFyeeYLDELREIEKRLSRLEEEIN---------GIEE 328
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   407 ssdfgpQIKALDQKIERVAEatrmLNGRLDNEFDRLSVPEPDAdfdarwtELDARINVTEKNAEEH-----CFYIEEtlr 481
Cdd:PRK03918  329 ------RIKELEEKEERLEE----LKKKLKELEKRLEELEERH-------ELYEEAKAKKEELERLkkrltGLTPEK--- 388
                         250       260
                  ....*....|....*....|....*....
gi 21553119   482 gtINGEVDDLRKL---LNEKIHSLEDRLG 507
Cdd:PRK03918  389 --LEKELEELEKAkeeIEEEISKITARIG 415
DR0291 COG1579
Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General ...
266-442 1.70e-03

Predicted nucleic acid-binding protein DR0291, contains C4-type Zn-ribbon domain [General function prediction only];


Pssm-ID: 441187 [Multi-domain]  Cd Length: 236  Bit Score: 41.45  E-value: 1.70e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  266 DIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAAQGPTVTMTTNELYQAYVDSKIDALREELMEGMDRK---- 341
Cdd:COG1579   14 ELDSELDRLEHRLKELPAELAELEDELAALEARLEAAKTELEDLEKEIKRLELEIEEVEARIKKYEEQLGNVRNNKeyea 93
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  342 -------LADLKNTCEYKLVGLQQQCDDYGSsylgVIELIGEKEASLKKDIADLRAQLQDpvaqpsccngqkssdfgpQI 414
Cdd:COG1579   94 lqkeiesLKRRISDLEDEILELMERIEELEE----ELAELEAELAELEAELEEKKAELDE------------------EL 151
                        170       180       190
                 ....*....|....*....|....*....|..
gi 21553119  415 KALDQKIERV----AEATRMLNGRLDNEFDRL 442
Cdd:COG1579  152 AELEAELEELeaerEELAAKIPPELLALYERI 183
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
248-426 2.84e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 41.98  E-value: 2.84e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    248 AEPSQLPGIPSSKESGMKDIKSELAEVKDTLKTKSDKLEELDGKVKGYEGQLKQLQEAAQGPTVTMttNELYQ--AYVDS 325
Cdd:TIGR02169  301 AEIASLERSIAEKERELEDAEERLAKLEAEIDKLLAEIEELEREIEEERKRRDKLTEEYAELKEEL--EDLRAelEEVDK 378
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119    326 KIDALREELM------EGMDRKLADLKNTCEYKLVGLQQqcddygssylgvielIGEKEASLKKDIADLRAQLQDPVAQp 399
Cdd:TIGR02169  379 EFAETRDELKdyreklEKLKREINELKRELDRLQEELQR---------------LSEELADLNAAIAGIEAKINELEEE- 442
                          170       180
                   ....*....|....*....|....*..
gi 21553119    400 sccngqkSSDFGPQIKALDQKIERVAE 426
Cdd:TIGR02169  443 -------KEDKALEIKKQEWKLEQLAA 462
PHA03378 PHA03378
EBNA-3B; Provisional
780-924 3.41e-03



Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 41.59  E-value: 3.41e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   780 KFQPSAT-EEPSEATEGPSGKTPLESTRPSEEAPTEPPRLTPLPEDPAGPPQTGQQPvlpqRPLQPPPLPAWPGRTGLPF 858
Cdd:PHA03378  599 VPHPSQTpEPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPP----QVEITPYKPTWTQIGHIPY 674
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 21553119   859 LPGSSGVIMETGEAGPPGRMGVSGRGLPRGVDGQMGQGPIHSSegyAGAPGyPKSPPVTTPGVPLP 924
Cdd:PHA03378  675 QPSPTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAPPGRAQRP---AAATG-RARPPAAAPGRARP 736
COG4487 COG4487
Uncharacterized conserved protein, contains DUF2130 domain [Function unknown];
265-424 3.72e-03



Pssm-ID: 443580 [Multi-domain]  Cd Length: 425  Bit Score: 41.08  E-value: 3.72e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  265 KDIKSELAEVKDTlktKSDKLEELDGKVKGYEGQLKQLQEAAQgptVTMTTNELYQAY-VDSKIDALREELMEgMDRKLA 343
Cdd:COG4487   40 ADAAKREAALELA---EAKAKAQLQEQVAEKDAEIAELRARLE---AEERKKALAVAEeKEKELAALQEALAE-KDAKLA 112
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119  344 DLKNTcEYKLVGLQQQCDDYGSsylgviELIGEKEASLKKDIADLRAQLQDPVAQPSCCNGQ-KSSDFGPQIKALDQKIE 422
Cdd:COG4487  113 ELQAK-ELELLKKERELEDAKR------EAELTVEKERDEELDELKEKLKKEEEEKQLAEKSlKVAEYEKQLKDMQEQIE 185

                 ..
gi 21553119  423 RV 424
Cdd:COG4487  186 EL 187
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
782-922 7.18e-03

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface-anchored adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 40.27  E-value: 7.18e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   782 QPSATEEPSEATE-GPSGKT-PLESTRPSEEAPTEPPRLTPLPEDPAGPPQTGQQPVLPQRPLQPPPLPAWP-------- 851
Cdd:NF038329  181 EAGAKGPAGEKGPqGPRGETgPAGEQGPAGPAGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPdgpagkdg 260
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 21553119   852 --GRTGLPFLPGSSGVIMETGEAGPPGRMGVSG-RGLP--------------RGVDGQMGQGPIHSSEGYAGAPGYPKSP 914
Cdd:NF038329  261 prGDRGEAGPDGPDGKDGERGPVGPAGKDGQNGkDGLPgkdgkdgqngkdglPGKDGKDGQPGKDGLPGKDGKDGQPGKP 340

                  ....*...
gi 21553119   915 PVTTPGVP 922
Cdd:NF038329  341 APKTPEVP 348
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
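
Each hit on this page carries a summary line of the form `Pssm-ID: ...  Cd Length: ...  Bit Score: ...  E-value: ...`, and the preset above reports an E-value threshold of 0.01. The sketch below shows one way such lines could be parsed and compared against that threshold. The names `parse_summary` and `passes_threshold` are hypothetical, the regular expression is inferred only from the lines shown on this page, and treating the threshold as inclusive is an assumption.

```python
import re

# Pattern inferred from the summary lines on this page (an assumption,
# not an official CD-Search format specification).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one hit summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def passes_threshold(hit, threshold=0.01):
    """True if the hit's E-value is at or below the threshold (inclusive by assumption)."""
    return hit["e_value"] <= threshold


# Two summary lines copied from this page.
lines = [
    "Pssm-ID: 441187 [Multi-domain]  Cd Length: 236  Bit Score: 48.38  E-value: 8.15e-06",
    "Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 40.27  E-value: 7.18e-03",
]
hits = [parse_summary(line) for line in lines]
for hit in hits:
    print(hit["pssm_id"], hit["e_value"], passes_threshold(hit))
```

Both example hits fall below 0.01, which is consistent with their being listed under this preset.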

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.