Project news SIMAP: Project News August 13, 2007

P3D-Bot

SIMAP provides data for Gene3D: SIMAP has started to provide monthly large-scale protein similarity and feature data for the Gene3D project.

The Gene3D project aims to characterise the distribution of protein structural domains in nature and to use this information to investigate protein evolution and function. In living cells, proteins, encoded by the DNA, form the functional entities. They act both as catalysts, underpinning cellular metabolism, and as structural units, providing structure and organisation to the cell. Almost all proteins are made of one or more domains. Domains are semi-independent protein sub-sequences that adopt distinctive topologies known as folds, of which there are believed to be only a few thousand, with around 20 superfolds dominating the vast majority of domain structures.

Gene3D's sister database, CATH, uses a suite of computer tools combined with expert analysis to determine the boundaries of the folds in 3D structural data - such as that produced through X-ray diffraction of crystals - and to place the folds within a hierarchy based on their structural features and likely evolutionary associations. Gene3D then takes the sequences (proteins are made from strings of amino acids) and uses them to build models - known as profile Hidden Markov Models (HMMs) - of the domains. These models specifically identify sequences that are likely to be evolutionarily related to the seed CATH domains. From this we can infer that they will form the same structure.

There are currently more than 6000 HMMs in the CATH-Gene3D library. These models are scanned against all known protein sequences (over 7 million) and used to determine their domain composition. This represents a huge amount of computation and is normally only feasible on large computer grids.
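To give a rough idea of how HMM hits on a single protein could be turned into a domain composition, here is a minimal Python sketch. It is not Gene3D's actual pipeline: the `Hit` record, the `domain_architecture` function, the E-value cutoff and the domain labels are all hypothetical, and the greedy "best hit wins" overlap rule is just one simple assumption about how competing hits might be resolved.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One HMM match on a protein sequence (hypothetical record layout)."""
    domain: str    # label of the matching domain model (made up here)
    start: int     # first matched residue, 1-based inclusive
    end: int       # last matched residue, inclusive
    evalue: float  # smaller means a more significant match


def overlaps(a: Hit, b: Hit) -> bool:
    """True if the two hits share at least one residue."""
    return a.start <= b.end and b.start <= a.end


def domain_architecture(hits: List[Hit], max_evalue: float = 1e-3) -> List[str]:
    """Greedily keep the most significant non-overlapping hits,
    then report the domain labels in sequence order."""
    chosen: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if hit.evalue > max_evalue:
            break  # all remaining hits are weaker than the assumed cutoff
        if not any(overlaps(hit, kept) for kept in chosen):
            chosen.append(hit)
    return [h.domain for h in sorted(chosen, key=lambda h: h.start)]


# Toy input: four hits on one protein, two of them competing for the same region.
hits = [
    Hit("domA", 10, 180, 1e-40),
    Hit("domA", 15, 170, 1e-12),   # overlaps the stronger domA hit, dropped
    Hit("domB", 190, 260, 1e-8),
    Hit("domC", 150, 200, 0.5),    # too weak, dropped by the cutoff
]
print(domain_architecture(hits))
```

Repeating this for every one of millions of sequences, after scoring each against thousands of models, is what makes the real task a grid-scale computation.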
Comparison of these domain architectures, as well as direct analysis of the domains' sequence similarities, allows us to transfer experimentally derived knowledge from the very small number of characterised proteins to the very large number that have been deduced from DNA sequencing (e.g. the human genome project). Furthermore, it is possible to directly infer functional relationships through the identification of subtle evolutionary signals, such as co-evolution detected using phylogenetic profiling; the applications are myriad. As a consequence, many investigations based on CATH and Gene3D, and more so on protein structure in general, have had a significant input into understanding disease states and developing new pharmaceuticals.
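The idea behind phylogenetic profiling can be sketched in a few lines of Python: each domain (or protein family) is described by the set of genomes in which it occurs, and families with very similar occurrence patterns are candidates for a functional link. Everything in this sketch is an assumption for illustration, including the genome and domain names, the choice of Jaccard similarity as the score, and the helper names `jaccard` and `rank_pairs`; it is not taken from the Gene3D or SIMAP code.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared genomes divided by all genomes in which either family occurs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_pairs(profiles: Dict[str, Set[str]]) -> List[Tuple[float, str, str]]:
    """Score every pair of families, most similar profiles first."""
    scored = [
        (jaccard(profiles[x], profiles[y]), x, y)
        for x, y in combinations(sorted(profiles), 2)
    ]
    return sorted(scored, key=lambda item: -item[0])


# Toy presence data: which (made-up) genomes contain which (made-up) domain.
profiles = {
    "domA": {"genome1", "genome2", "genome4"},
    "domB": {"genome1", "genome2", "genome4"},
    "domC": {"genome3", "genome5"},
}

for score, x, y in rank_pairs(profiles):
    print(f"{x} / {y}: {score:.2f}")
```

In this toy data domA and domB always appear together, so they rank first, while domC never co-occurs with either. Real analyses would need to correct for how closely related the genomes are, which this sketch ignores.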
