Combination of direct information methods and alignments improves contact prediction.



Recently, several new contact prediction methods have been published. They use large sets of multiple aligned sequences and assume that correlations between columns in these alignments can be the result of indirect interactions.

These methods are clearly superior to earlier methods when it comes to predicting contacts in proteins. Here, we ask whether combining different multiple sequence alignments and different prediction programs can improve the predictions further.


We demonstrate that combining predictions from two prediction methods, PSICOV and plmDCA, and two alignment methods, HHblits and jackhmmer, at four different E-value cutoffs (1e-40, 1e-10, 1e-4, 1) provides a relative improvement of more than 20% over the best single method and a general prediction precision of around 70%. Precision is measured on the top L non-short-range contacts (i.e. separated by at least 5 residues in sequence space) for a protein chain of length L.
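The evaluation criterion above (precision over the top L non-short-range contacts) can be illustrated with a minimal Python sketch. It is not taken from the PconsC source: the function name `top_l_precision`, the nested-list score matrix, and the boolean matrix of true contacts are assumptions made for illustration only.

```python
def top_l_precision(scores, true_contacts, min_separation=5):
    """Fraction of true contacts among the top-L predicted residue pairs.

    scores         -- L x L nested list of predicted contact scores (hypothetical input)
    true_contacts  -- L x L nested list of booleans from a known structure (hypothetical input)
    min_separation -- minimum sequence separation |i - j| for a pair to be considered
    """
    length = len(scores)
    # Collect every pair (i, j) with i < j and j - i >= min_separation.
    candidates = [
        (scores[i][j], i, j)
        for i in range(length)
        for j in range(i + min_separation, length)
    ]
    # Rank pairs from the highest to the lowest predicted score.
    candidates.sort(key=lambda item: item[0], reverse=True)
    # Keep at most L pairs, where L is the chain length.
    top = candidates[:length]
    if not top:
        return 0.0
    hits = sum(1 for _, i, j in top if true_contacts[i][j])
    return hits / len(top)


if __name__ == "__main__":
    # Toy chain of length 10: the score is the sequence separation and a
    # "true" contact is any pair separated by at least 7 residues.
    n = 10
    toy_scores = [[abs(i - j) for j in range(n)] for i in range(n)]
    toy_truth = [[abs(i - j) >= 7 for j in range(n)] for i in range(n)]
    print(top_l_precision(toy_scores, toy_truth))
```

Only the upper triangle is scanned, so each residue pair is counted once, and pairs closer than `min_separation` never enter the ranking regardless of their score.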


The source code for PconsC is freely available at http://c.pcons.net/