Protein Sequence Analysis Group

Bioinformatics Institute
  Supplementary materials to:
  More than 1001 problems with protein domain databases: transmembrane regions, signal peptides and the issue of sequence homology

Wing-Cheong Wong, Sebastian Maurer-Stroh, Frank Eisenhaber

Full reference
PLoS Computational Biology, v6, n7, doi:10.1371/journal.pcbi.1000867, PMID: 20686689

List of problematic Pfam domains (Release 23)
List of 'clean' alignments and HMM (non-SP/TM segment >= 40 positions)

Below, we provide three individual lists of domain models from Pfam release 23: those with a transmembrane (TM) region, those with a signal peptide (SP) region, and those with both. In tabulated form, the reader can look up the location of the SP and TM regions in each model, the cleaned alignment (in "aln" format) and the HMM derived from it (the latter two as individual files). Further down, an additional table lists Pfam domain models where the remaining globular part would be shorter than 40 residue positions; for these, no cleaned alignment or HMM is provided.
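As an illustration of the 40-position criterion, the following minimal Python sketch computes which parts of a model remain once annotated SP/TM regions are removed and checks whether any remaining part reaches the threshold. It is not the authors' cleanup procedure: the function names, the 1-based inclusive coordinates and the choice to test the longest contiguous remaining segment (rather than the total number of remaining positions) are assumptions made for this example.

```python
# Hypothetical helper names; only the threshold of 40 positions comes from this page.
MIN_GLOBULAR_POSITIONS = 40


def remaining_segments(model_length, masked_regions):
    """Return 1-based inclusive (start, end) segments not covered by SP/TM regions."""
    segments = []
    next_free = 1  # first position not yet assigned to a masked region
    for start, end in sorted(masked_regions):
        if start > next_free:
            segments.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= model_length:
        segments.append((next_free, model_length))
    return segments


def has_clean_segment(model_length, masked_regions, threshold=MIN_GLOBULAR_POSITIONS):
    """Assumption: a model qualifies if its longest remaining segment meets the threshold."""
    lengths = [end - start + 1
               for start, end in remaining_segments(model_length, masked_regions)]
    return max(lengths, default=0) >= threshold


if __name__ == "__main__":
    # Made-up coordinates: a 120-position model with an SP region and a TM region.
    print(remaining_segments(120, [(1, 22), (60, 80)]))
    print(has_clean_segment(120, [(1, 22), (60, 80)]))
```

The coordinates in the usage lines are invented for illustration; real SP/TM locations should be taken from the tables linked on this page.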

  • TM-containing domains (783 of 1050 entries) click here

  • SP-containing domains (127 of 135 entries) click here

  • SP&TM-containing domains (25 of 29 entries) click here

  • SP/TM-containing HMMs (935 models, 9 MB) download
    (the file "Pfam_rel23_globalHMM_cleanup.rar" is a WinRAR archive)

Supplementary information of Supplementary file Table S1, Figure 4
Supplementary information of Supplementary file Table S2, Figure 5
Supplementary information of Supplementary file Table S3
Supplementary information of Table 5
