Monday, October 13, 2014

Protein 3D structure

Protein Data Bank

The Protein Data Bank (PDB) is a repository for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. The data is freely available to everyone around the world. The structures are obtained mainly by two methods: X-ray crystallography and NMR spectroscopy.

  • Structures in PDB have their own IDs (identifiers), and we can access a structure by typing its ID into the PDB search engine.
  • My pet gene is CYP2C9. The protein associated with this gene has the PDB ID 1OG5, but on further research I found that 1OG2 also corresponds to my pet gene; the two entries differ in the ligands bound to the protein.
  • 1OG5 has two ligands associated with it: S-WARFARIN and a Heme C group (this is the entry I am considering right now).
  • 1OG2 has only the Heme C group.
  • 1OG5 is from Homo sapiens, and S-WARFARIN is the relevant ligand bound to it.
                                                               3D structure of 1OG5
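
As a minimal sketch of looking up an entry by its identifier, the snippet below checks that a string has the four-character shape of a classic PDB ID and builds the matching RCSB addresses. The helper names are made up for illustration, and the URL patterns are an assumption based on how the RCSB site is commonly accessed, so they may change.

```python
import re

# Classic PDB IDs are four characters: a digit followed by three letters or digits.
PDB_ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def is_pdb_id(text):
    """Return True if text looks like a classic four-character PDB ID."""
    return bool(PDB_ID_PATTERN.match(text.strip()))


def pdb_urls(pdb_id):
    """Build the entry page and coordinate file URLs (assumed RCSB patterns)."""
    pdb_id = pdb_id.strip().upper()
    if not is_pdb_id(pdb_id):
        raise ValueError(f"not a valid PDB ID: {pdb_id!r}")
    return {
        "entry_page": f"https://www.rcsb.org/structure/{pdb_id}",
        "pdb_file": f"https://files.rcsb.org/download/{pdb_id}.pdb",
    }


if __name__ == "__main__":
    # The two entries discussed above for CYP2C9.
    for pdb_id in ("1OG5", "1OG2"):
        print(pdb_id, pdb_urls(pdb_id)["pdb_file"])
```

A gene symbol such as CYP2C9 is rejected by the check, which is a quick way to catch mixing up gene names and structure IDs.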


Cn3D Protein Visualization Tool:

  • Cn3D ("see in 3D") is a helper application for your web browser that allows you to view 3-dimensional structures from NCBI's Entrez Structure database. Cn3D is provided for Windows and Macintosh.
  • Cn3D simultaneously displays structure, sequence, and alignment, and now has powerful annotation and alignment editing features.
  • We obtained the Cn3D structure file from the Structure module in NCBI and visualized it using this tool.
  • We can change the type of view by going to Style menu ----> Rendering shortcuts ------> Space fill.
  • To mark the disease-causing site, ILE at position 359, we select that residue in the sequence window.


 Fig: the molecule (pink) in space-fill format, with ILE at position 359 highlighted in the sequence window (yellow)

  • To show the site at position 359 in the "Worms" view: Style menu ----> Rendering shortcuts ------> Worms.

 Fig: the ILE site at position 359 in the "Worms" view
  • Then go to Select menu ------> Show selected residues.

 Fig: only the ILE residue at position 359, in the "Worms" view

Secondary Structures:

The secondary structure associated with the site (ILE at position 359) was taken from the PDB secondary structure view, which uses CATH and SCOP.


         Fig: ILE at position 359 lies in an Alpha helix


DNA STAR Protein Secondary Structure Analysis:

The Protean software from DNA STAR was used to view the secondary structure of the protein. It uses two algorithms, Garnier-Robson and Chou-Fasman, and both predict that this site is a Beta sheet.

 Fig: ILE at position 359 predicted as a Beta sheet

There is a contrast between the predictions: the DNA STAR algorithms predict the site as a Beta sheet, whereas PDB (SCOP) shows it as a Helix.

Monday, October 6, 2014

Protein Structure prediction and human variation

This post is mostly about the protein and allelic variant analysis that I have done using DNA STAR.

Online Mendelian Inheritance in Man (OMIM) provides the allelic variants that are required for the analysis. My gene has 3 allelic variants. These changes correspond to changes in the amino acids by substitution, deletion, etc.

TOLBUTAMIDE POOR METABOLIZER - ILE359LEU
WARFARIN SENSITIVITY - ARG144CYS
WARFARIN SENSITIVITY - LEU208VAL


I considered the first allelic variant and performed the analysis for my protein.

The ile359-to-leu (I359L) substitution results from a 1075A-C transversion in the CYP2C9 gene and is also known as rs1057910 and CYP2C9*3. The variant leads to reduced warfarin metabolism and increased risk of bleeding.

This variation was introduced using the EditSeq module from DNA STAR, where we can edit the protein sequence and convert it into the allelic variant protein sequence.
Below is a quick screenshot showing the change in the protein sequence at position 359.




Note:
If you want to recheck that the amino acid substitution is correct, try reverse translating the edited protein and see that there is a corresponding change in the gene sequence.
The amino acid change performed here converts Isoleucine to Leucine.
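
The recheck in the note can also be sketched as arithmetic: nucleotide 1075 of the coding sequence should fall on the first base of codon 359, and changing that base from A to C should turn an isoleucine codon into a leucine codon. The snippet below is only an illustration with made-up helper names; it tries all three isoleucine codons rather than assuming which one the gene actually uses.

```python
# Minimal codon table: only the isoleucine and leucine codons needed here.
CODON_TO_AA = {
    "ATT": "Ile", "ATC": "Ile", "ATA": "Ile",
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
    "TTA": "Leu", "TTG": "Leu",
}


def codon_of(cds_position):
    """Map a 1-based CDS nucleotide position to (codon number, offset in codon)."""
    codon_number = (cds_position - 1) // 3 + 1
    offset = (cds_position - 1) % 3  # 0 means the first base of the codon
    return codon_number, offset


def mutate_codon(codon, offset, new_base):
    """Return the codon with the base at the given offset replaced by new_base."""
    return codon[:offset] + new_base + codon[offset + 1:]


if __name__ == "__main__":
    codon_number, offset = codon_of(1075)
    print("codon", codon_number, "offset", offset)
    # Apply the A-to-C change to every possible isoleucine codon.
    for ile_codon in ("ATT", "ATC", "ATA"):
        variant = mutate_codon(ile_codon, offset, "C")
        print(ile_codon, "->", variant, CODON_TO_AA[variant])
```

Whichever isoleucine codon sits at position 359, an A-to-C change at its first base gives a leucine codon, which agrees with the I359L substitution described above.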
After replacing the amino acid, we need to analyze the changes in protein structure, hydrophobicity and other physical properties. For this there are two applications, Protean and Protean 3D (the next version of Protean, which is better at visualization but a little harder to navigate). I selected a window of amino acids 355-365 to look for changes in either the secondary structure or the hydrophobicity. We look at more than one method because the hydrophobicity values vary between algorithms. The DNA STAR Protean tool can be used to view the changes in the physical properties.


                                                                        CYP2C9 actual protein



                                                                      Allelic variant (ILE359LEU)


We notice no significant change in the features of the allelic region compared with the original protein. The Kyte-Doolittle hydrophobicity plot shows no significant difference even after the amino acid change. The threshold values for amino acid hydrophobicity, as obtained by different algorithms, are given below.


Table 1: Representing threshold values for hydrophobicity in amino acids
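
To make the "no significant difference" observation concrete, here is a rough sketch of a Kyte-Doolittle window average, which is the kind of calculation behind such a plot. It is not what Protean does internally; the peptide is a made-up toy sequence, not CYP2C9, and the function names are only illustrative. On the Kyte-Doolittle scale isoleucine scores 4.5 and leucine 3.8, so a single I-to-L swap shifts an 11-residue window mean by only (3.8 - 4.5) / 11, about -0.06.

```python
# Kyte-Doolittle hydropathy scale (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(sequence, center, half_width=5):
    """Mean Kyte-Doolittle score over a window centred on a 1-based position."""
    start = max(center - half_width, 1)
    end = min(center + half_width, len(sequence))
    window = sequence[start - 1:end]
    return sum(KYTE_DOOLITTLE[aa] for aa in window) / len(window)


def substitute(sequence, position, ref, alt):
    """Replace the residue at a 1-based position after checking the reference."""
    if sequence[position - 1] != ref:
        raise ValueError(f"expected {ref} at {position}, found {sequence[position - 1]}")
    return sequence[:position - 1] + alt + sequence[position:]


if __name__ == "__main__":
    # Hypothetical 21-residue peptide with isoleucine at position 11.
    toy = "GSTKADPLTEIVSRNAQDLKG"
    variant = substitute(toy, 11, "I", "L")
    before = window_hydropathy(toy, 11)
    after = window_hydropathy(variant, 11)
    print(round(before, 3), round(after, 3), round(after - before, 3))
```

With the real sequence, the same call centred on position 359 with half_width=5 would cover residues 354-364, close to the 355-365 window used above.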

Secondary structure prediction:

I used Protean 3D to analyze the secondary structure of the natural and variant proteins. The prediction is done with the Chou-Fasman algorithm, an empirical technique developed for predicting secondary structures in proteins.

                                                                      Natural protein




                                                                         Allelic Variant




First we select a window of about 10 amino acids with position 359 at its centre. With this procedure there is not much difference between the two proteins, although the Beta sheet prediction differs slightly. Apart from flexibility, I could not see much of a change. I suppose a more detailed analysis of the protein in the future could explain why there is no change when an amino acid that matters for the protein's function is replaced.
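
One possible reason only the Beta sheet track moves a little: Chou-Fasman style methods average per-residue propensities over short stretches, so a single swap changes a window mean only by the propensity difference divided by the window length. The sketch below uses the commonly quoted Chou-Fasman values for isoleucine and leucine; these numbers are an assumption on my part and should be checked against the original table, and the function is illustrative rather than what Protean 3D actually runs.

```python
# Commonly quoted Chou-Fasman propensities for the two residues involved
# (assumed values; verify against the original table before relying on them).
PROPENSITY = {
    "I": {"helix": 1.08, "sheet": 1.60},
    "L": {"helix": 1.21, "sheet": 1.30},
}


def propensity_shift(ref, alt, window):
    """Change in the window-mean propensity when one residue ref becomes alt."""
    return {
        state: (PROPENSITY[alt][state] - PROPENSITY[ref][state]) / window
        for state in ("helix", "sheet")
    }


if __name__ == "__main__":
    # Window of 1 is the raw per-residue difference; larger windows dilute it.
    for window in (1, 5, 10):
        shift = propensity_shift("I", "L", window)
        print(window, {state: round(value, 3) for state, value in shift.items()})
```

Under these assumed values the sheet propensity drops and the helix propensity rises slightly, but both residues stay above 1.0 for sheet, which would fit a prediction that barely changes.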


My protein is not associated with the membrane, so there is not much of a difference in the transmembrane region, which we can visualize in Protean 3D.

Key points:
A codon change in the CDS region of a gene would generally be expected to correspond to a functional or structural change. Here the amino acid (codon) change results in a functional change, but not in a structural or other physical change that these predictions could detect.