Order-Sensitive Clustering for Remote Homologous Protein Detection
Traditional sequence alignment methods are effective in identifying homologous proteins that are highly similar. However, these approaches do not perform well on remote homologous proteins (proteins whose 3D structures are similar but whose sequences are not). Recent biological research reveals that protein sequences contain residues that determine the 3D structure of proteins.
In this work, we investigate incorporating this information to aid in the clustering of protein databases. We capture protein residues in the form of patterns with fixed order among them. First, the significant patterns are extracted from the protein sequences. Based on the extracted patterns, we perform sequence mining to generate the order among them. Finally, we adopt a partition-based method to cluster protein sequences using the patterns and order features.
Experiments on the COG and SCOP40 datasets show that our new approach is able to generate high-quality clusters that are similar to those determined manually by biologists.
--- by Jin Chen, Wynne Hsu and Mong Li Lee, ICTAI, 2003.
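
To make the three steps in the abstract concrete, here is a minimal sketch of one way such a pipeline could look. It is not the authors' method: the abstract does not say how patterns are scored, how order is mined, or which partition-based algorithm is used. The sketch assumes that "significant patterns" are k-mers occurring in at least `min_support` sequences, that "order" is a set of ordered pattern pairs, and that clustering is a simple k-medoids over Jaccard distance. All function names, parameters, and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(seqs, k=3, min_support=2):
    """Assumed step 1: keep k-mers that occur in at least `min_support` sequences."""
    support = Counter()
    for s in seqs:
        # Count each k-mer once per sequence.
        support.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return {p for p, n in support.items() if n >= min_support}


def order_features(seq, patterns, k=3):
    """Assumed step 2: pattern features plus ordered pairs (a first occurs before b)."""
    first = {}
    for i in range(len(seq) - k + 1):
        p = seq[i:i + k]
        if p in patterns and p not in first:
            first[p] = i
    feats = set(first)
    # Patterns sorted by first occurrence; each pair (a, b) means a precedes b.
    for a, b in combinations(sorted(first, key=first.get), 2):
        feats.add((a, b))
    return feats


def jaccard_distance(x, y):
    union = x | y
    return 1.0 - len(x & y) / len(union) if union else 0.0


def k_medoids(items, k, iters=20):
    """Assumed step 3: a simple partition-based clustering seeded with the first k items."""
    medoids = list(range(k))
    labels = []
    for _ in range(iters):
        # Assign every item to its nearest medoid.
        labels = [
            min(range(k), key=lambda c: jaccard_distance(x, items[medoids[c]]))
            for x in items
        ]
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the old medoid for an empty cluster
                continue
            # The new medoid minimises total distance to the other members.
            new_medoids.append(min(
                members,
                key=lambda i: sum(jaccard_distance(items[i], items[j]) for j in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


# Toy sequences (made up): the same two patterns appear in opposite orders.
seqs = ["ACDGGWYF", "WYFGGACD", "ACDKKWYF", "WYFKKACD"]
patterns = frequent_patterns(seqs)
feats = [order_features(s, patterns) for s in seqs]
labels = k_medoids(feats, k=2)
print(labels)  # [0, 1, 0, 1]
```

In this toy input all four sequences share the same patterns, so only the order features separate them. That is the kind of signal the abstract suggests plain pattern matching would miss.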
