Friday, October 08, 2004

Systematic Assessment of High-Throughput Experimental Data for Reliable Protein Interactions using Network Topology

Protein interactions detected by high-throughput experimental methods such as yeast two-hybrid have been reported to be highly error-prone. This work introduces a novel measure called IRAP for assessing the reliability of a protein interaction based on the underlying topology of the protein interaction network.

A candidate protein interaction is considered to be reliable if it is involved in a closed loop in which the alternative path of interactions between the two interacting proteins is strong. We design an algorithm to compute the IRAP value for each interaction in a protein interaction network. We then validate IRAP as a measure for assessing the reliability of protein-protein interactions from conventional high-throughput experiments.
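The closed-loop idea can be illustrated with a minimal sketch. The abstract does not give the IRAP formula, so the following assumes each interaction carries a score in (0, 1] and that the strength of a path is the product of its edge scores; the function name `strongest_alternative_path` and the example scores are hypothetical, not taken from the paper.

```python
import heapq
import math


def strongest_alternative_path(weights, u, v):
    """Strength of the strongest u-v path that avoids the direct edge u-v.

    `weights` maps frozenset({a, b}) to a score in (0, 1]. Both the per-edge
    score and the product-of-scores path strength are assumptions made for
    this sketch; the abstract does not define them.
    """
    # Build an adjacency list, skipping the interaction being assessed.
    direct = frozenset((u, v))
    adj = {}
    for edge, w in weights.items():
        if edge == direct:
            continue
        a, b = tuple(edge)
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))

    # Dijkstra on -log(w): minimising summed cost maximises the product.
    best = {u: 0.0}
    heap = [(0.0, u)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == v:
            return math.exp(-cost)
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, w in adj.get(node, []):
            new_cost = cost - math.log(w)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return 0.0  # no alternative path, so the edge is not in a closed loop


# Hypothetical network: A-B is the candidate interaction.
network = {
    frozenset(("A", "B")): 0.5,
    frozenset(("A", "C")): 0.9,
    frozenset(("C", "B")): 0.8,
    frozenset(("A", "D")): 0.4,
    frozenset(("D", "B")): 0.9,
    frozenset(("X", "Y")): 0.7,
}
score_ab = strongest_alternative_path(network, "A", "B")
score_xy = strongest_alternative_path(network, "X", "Y")
```

Here the path A-C-B (0.9 × 0.8) beats A-D-B (0.4 × 0.9), so A-B is supported by a strong alternative path, while X-Y has none and scores zero.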

We devise a heuristic algorithm to compute IRAP that achieves a 40% speedup in runtime while maintaining 95% accuracy.

--- by Jin Chen, Wynne Hsu, Mong Li Lee and See-Kiong Ng, ICTAI, 2004.

Order-Sensitive Clustering for Remote Homologous Protein Detection

Traditional sequence alignment methods are effective in identifying homologous proteins that are highly similar. However, these approaches do not perform well on remote homologous proteins (proteins whose 3D structures are similar but whose sequences are not). Recent biological research reveals that protein sequences contain residues that determine the 3D structure of proteins.

In this work, we investigate incorporating this information to aid in the clustering of protein databases. We capture protein residues in the form of patterns with a fixed order among them. First, the significant patterns are extracted from the protein sequences. Based on the extracted patterns, we perform sequence mining to generate the order among them. Finally, we adopt a partition-based method to cluster protein sequences using the pattern and order features.
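As a rough illustration of pattern and order features, the sketch below assumes the significant patterns are already given as plain substrings and compares two sequences by Jaccard similarity over the patterns they contain and the relative order of those patterns. The helper names, the toy patterns, and the choice of Jaccard similarity are assumptions for this example; the abstract does not specify the mining step or the similarity measure.

```python
from itertools import combinations


def order_features(sequence, patterns):
    """Ordered pairs (p, q) such that p first occurs before q in the sequence."""
    # First occurrence of each pattern that is present in the sequence.
    positions = {p: sequence.find(p) for p in patterns if p in sequence}
    ordered = sorted(positions, key=positions.get)
    # combinations() keeps list order, so each pair is (earlier, later).
    return set(combinations(ordered, 2))


def similarity(seq_a, seq_b, patterns):
    """Jaccard similarity over shared patterns and shared pattern orderings."""
    feats_a = order_features(seq_a, patterns) | {(p,) for p in patterns if p in seq_a}
    feats_b = order_features(seq_b, patterns) | {(p,) for p in patterns if p in seq_b}
    union = feats_a | feats_b
    return len(feats_a & feats_b) / len(union) if union else 0.0


# Hypothetical patterns and sequences, for illustration only.
patterns = ["GK", "DE", "HW"]
seq1 = "AAGKAADEAAHW"  # order: GK, DE, HW
seq2 = "HWAAGKAADE"    # order: HW, GK, DE
sim_12 = similarity(seq1, seq2, patterns)
sim_11 = similarity(seq1, seq1, patterns)
```

Both sequences contain all three patterns but agree on only one of the three orderings, so they score lower than a pure pattern-presence comparison would suggest. A similarity of this kind could then be passed to a partition-based method such as k-medoids.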

Experiments on the COG and SCOP40 datasets show that our new approach generates high-quality clusters that are similar to those determined manually by biologists.

--- by Jin Chen, Wynne Hsu and Mong Li Lee, ICTAI, 2003.