CATH-Structural database in bioinformatics

Define the term CATH as used in structural databases in bioinformatics.

CATH stands for Class, Architecture, Topology and Homologous superfamily. The four levels are as follows:

C-CLASS – Class is determined by the secondary structure composition of a domain and the way those secondary structures pack together. Three major classes are recognised: alpha, beta and alpha-beta. The alpha-beta class includes both alternating alpha/beta structures and alpha+beta structures. CATH differs from SCOP in that it incorporates some automation in classifying protein structures, although it is not fully automated.

A fourth class has since been added for proteins with low secondary structure content.
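As a small illustration of how the class level is used in practice, the sketch below reads the class from a dotted CATH identifier. It assumes the usual numbering in which the first field is the class (1 to 4, matching the four classes described above); the function name `cath_class` and the example identifiers are chosen here for illustration only.

```python
# Class numbers as they appear in the first field of a dotted CATH identifier.
# Only the four classes described in the text are covered (an assumption of this sketch).
CATH_CLASSES = {
    1: "mainly alpha",
    2: "mainly beta",
    3: "alpha-beta",
    4: "few secondary structures",
}


def cath_class(cath_id: str) -> str:
    """Return the class name for an identifier such as '3.40.50.300'."""
    first_field = cath_id.strip().split(".")[0]
    # Raises KeyError for class numbers outside the four handled here.
    return CATH_CLASSES[int(first_field)]


print(cath_class("3.40.50.300"))
```

Identifiers whose first field is not 1 to 4 raise a `KeyError`, which keeps the sketch from silently mislabelling anything it does not know about.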

A-ARCHITECTURE – Architecture describes the overall shape of the domain structure, as determined by the orientations of the secondary structures, and ignores the connectivity between them. This level is currently assigned manually. Literature references are also given for well-known architectures.

T-TOPOLOGY – Topology is almost the same as architecture; the only difference is that structural similarity, including how the secondary structures are connected, is also taken into account. Structures are compared using SSAP, an algorithm developed by Taylor and Orengo in 1989, and CATHEDRAL, developed by Harrison et al.

Domains with an SSAP score of at least 70, in which at least 60% of the larger protein matches the smaller protein, are assigned to the same fold.
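The fold rule above can be written as a simple check. This is only a sketch: the function names and the way the overlap is computed (equivalent residues divided by the length of the larger domain) are assumptions made here, and the SSAP score is taken as already computed by an external tool.

```python
def overlap_percent(equivalent_residues: int, larger_length: int) -> float:
    """Percentage of the larger domain that is equivalent to the smaller one."""
    if larger_length <= 0:
        raise ValueError("larger_length must be positive")
    return 100.0 * equivalent_residues / larger_length


def same_fold(ssap_score: float, overlap: float) -> bool:
    """Apply the fold thresholds quoted in the text: SSAP >= 70 and overlap >= 60%."""
    return ssap_score >= 70.0 and overlap >= 60.0


# Hypothetical pair: 95 equivalent residues, larger domain of 140 residues, SSAP 74.2
print(same_fold(74.2, overlap_percent(95, 140)))
```

Both conditions must hold together, so a high SSAP score alone is not enough when the smaller domain covers too little of the larger one.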

Some highly populated fold groups are found at this level, such as the beta 2-layer sandwich and the alpha-beta 3-layer sandwich.

Other structure-based algorithms used by this database are DETECTIVE, PUU and DOMAK, which are used to identify domain boundaries.

NOTE: Due to how secondary structures are interconnected, varying topologies can still result in the same overall architecture.

H-HOMOLOGOUS SUPERFAMILY - Protein domains that are thought to share a common ancestor are grouped at this level. Similarities between them are identified by sequence profiling and by structural comparison with SSAP. Two domains are placed in the same superfamily if they meet one of the following criteria:

• Sequence identity >= 35%, overlap >= 60% of larger structure equivalent to smaller.
• SSAP score >= 80.0, sequence identity >= 20%, 60% of larger structure equivalent to smaller.
• SSAP score >= 70.0, 60% of larger structure equivalent to smaller, and domains which have related functions, which is informed by the literature and Pfam protein family database.
• Significant () similarity from HMM-sequence searches and HMM-HMM comparisons using SAM, HMMER and PRC
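The criteria above can be combined into a single decision function, sketched below. It assumes that meeting any one criterion is sufficient. The last criterion has no numeric threshold in the list, so it is represented by a caller-supplied flag, as is the "related functions" judgement; the class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class DomainComparison:
    seq_identity: float            # percent sequence identity, 0-100
    ssap_score: float              # SSAP score, 0-100
    overlap: float                 # percent of larger structure equivalent to smaller
    related_function: bool = False  # judged from literature / Pfam (set by the caller)
    hmm_significant: bool = False   # significant HMM-based similarity (set by the caller)


def same_superfamily(c: DomainComparison) -> bool:
    """Return True if any one of the listed criteria is met (assumed 'any of')."""
    enough_overlap = c.overlap >= 60.0
    if c.seq_identity >= 35.0 and enough_overlap:
        return True
    if c.ssap_score >= 80.0 and c.seq_identity >= 20.0 and enough_overlap:
        return True
    if c.ssap_score >= 70.0 and enough_overlap and c.related_function:
        return True
    return c.hmm_significant


# Hypothetical comparison: low identity, but high SSAP score and good overlap
print(same_superfamily(DomainComparison(seq_identity=22.0, ssap_score=83.5, overlap=71.0)))
```

Keeping the two judgement-based inputs as explicit flags makes it clear which parts of the decision come from numeric thresholds and which depend on curation or external searches.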

©TutorsGlobe All rights reserved 2022-2023.