Authors: Olav Lenz, Frank Keul, Sebastian Bremm, Kay Hamacher, Tatiana von Landesberger
Abstract: Proteins are essential parts in all living organisms. They consist of sequences of amino acids. An interaction with reactive agent can stimulate a mutation at a specific position in the sequence. This mutation may set off a chain reaction, which effects other amino acids in the protein. Chain reactions need to be analyzed, as they may invoke unwanted side effects in drug treatment. A mutation chain is represented by a directed acyclic graph, where amino acids are connected by their mutation dependencies. As \ each amino acid may mutate individually, many mutation graphs exist. To determine important impacts of mutations, experts need to analyze and compare common patterns in these mutations graphs. Experts, however, lack suitable tools for this purpose. We present a new system for the search and the exploration of frequent patterns (i.e., motifs) in mutation graphs. We present a fast pattern search algorithm specifically developed for finding biologically relevant patterns in many mutation graphs (i.e., many labeled acyclic directed graphs). Our visualization system allows an interactive exploration and comparison of the found patterns. It enables locating the found patterns in the mutation graphs and in the 3D protein structures. In this way, potentially interesting patterns can be discovered. These patterns serve as starting point for a further biological analysis. In cooperation with biologists, we use our approach for analyzing a real world data set based on multiple HIV protease sequences.