Monday, December 8, 2014

Pathway Analysis using IPA

Description about IPA:

  • IPA stands for Ingenuity Pathway Analysis.
  • It mainly deals with creating molecular networks (algorithmically generated pathways).
  • It groups your data into the diseases and biological functions that are overrepresented in it.
  • It determines which signaling and metabolic canonical pathways are overrepresented.
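To give a feel for what "overrepresented" means, here is a minimal sketch of a one-sided hypergeometric (Fisher's exact style) test on the overlap between a gene list and a pathway. I am assuming this general technique for illustration only; it is not IPA's internal code, and the counts in the example are made-up numbers.

```python
from math import comb

def overrepresentation_p(population, pathway_size, sample_size, overlap):
    """Probability of seeing at least `overlap` pathway genes when
    `sample_size` genes are drawn without replacement from `population`
    genes, of which `pathway_size` belong to the pathway."""
    total = comb(population, sample_size)
    upper = min(pathway_size, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20000 genes in total, 100 in the pathway,
# 50 genes in my list, 5 of which fall in the pathway.
p = overrepresentation_p(20000, 100, 50, 5)
print(f"p = {p:.3g}")
```

A small p-value says the overlap is larger than chance alone would explain, which is the sense in which a pathway is overrepresented in the data.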
There are three major aspects that can be explored in IPA:
  • Drug Interactions
  • Growing/Generating a pathway
  • Canonical Pathways
As my pet gene is CYP2C9, I first had to find out whether it is actually present in IPA for pathway analysis. Searching for it in IPA returns the results as a table, where I found my gene, but there were no drug interactions associated with it.

Growing Pathway:
  • I searched for CYP2C9, added it to a new pathway, and then grew that pathway.
  • Add to my pathway -> build -> grow -> (settings: direct, human, chemical molecule types and biologic drug removed)
  • Then click Apply.


  • In this scenario, however, there are only 10 molecules and 10 relationships under direct association.
  • So we need to add more pathways associated with CYP2C9 until we see at least 20-25 relationships.


  • With the molecules held constant, the number of relationships increased to 23.
  • If there are more than 25 relationships, we use the "Trim" option to reduce the number.

  • We can see that there are no inhibitors or activators, but there are definitely protein-protein interactions, for example between CYP2C9 and CYP2D6.
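The counting and trimming above can be pictured with a toy relationship list. This is only a sketch of the idea: the `trim` rule (keep relationships touching the seed gene first) is my own assumption and not how IPA's Trim option is implemented, and MOL_1 and MOL_2 are placeholder names, not real IPA results.

```python
# Each relationship is (molecule_a, molecule_b, relationship_type).
# MOL_1 and MOL_2 are placeholders, not molecules reported by IPA.
relationships = [
    ("CYP2C9", "CYP2D6", "protein-protein interaction"),
    ("CYP2C9", "MOL_1", "protein-protein interaction"),
    ("MOL_1", "MOL_2", "protein-protein interaction"),
]

def molecules(rels):
    """Return the set of distinct molecules that appear in any relationship."""
    return {m for a, b, _ in rels for m in (a, b)}

def trim(rels, seed, limit=25):
    """Keep at most `limit` relationships, preferring those that touch `seed`."""
    touching = [r for r in rels if seed in r[:2]]
    others = [r for r in rels if seed not in r[:2]]
    return (touching + others)[:limit]

print(len(molecules(relationships)), "molecules,", len(relationships), "relationships")
print(trim(relationships, "CYP2C9", limit=2))
```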
 
Canonical pathway:
  • Canonical means the standard, well-established pathway.
  • Here I chose Xenobiotic metabolism as my canonical pathway.

  • I chose this pathway because the major condition associated with my gene involves anticoagulation: variants of the gene are the reason for slow warfarin metabolism.

XENOBIOTICS

A xenobiotic is a foreign chemical substance found within an organism that is not normally naturally produced by or expected to be present within that organism. It can also cover substances which are present in much higher concentrations than are usual. Specifically, drugs such as antibiotics are xenobiotics in humans because the human body does not produce them itself, nor are they part of a normal diet.

 

  • The major gene associated with my pathway, xenobiotic metabolism signaling, is CYP2C9, which is obviously why I chose this pathway! In addition, we observe CYP3A4 and CYP2C. The figure below shows the disease associated with my pet gene. This gene plays a vital role in deep vein thrombosis, a disease associated with anticoagulation.

     

     

Monday, December 1, 2014

GENOME ANALYSIS

What's a genome?

In modern molecular biology and genetics the genome is the genetic material of an organism. It is encoded either in DNA or, for many types of viruses, in RNA. The genome includes both the genes and the non-coding sequences of the DNA/RNA. This genetic material plays a vital role in every person. It helps in understanding a disease condition, present or future. In my opinion, analyzing a person's full genome can definitely help in disease prevention.

How can genome analysis be done?

Bioinformatics plays a great role in genetics. The synergy of these two fields is creating a new entity in science and technology. A bioinformatics approach to genomic analysis takes several components into consideration:

1a. Sequencing
1b. Analysis of nucleic acid seq.
2. Analysis of protein seq.
3. Molecular structure prediction
4. Molecular interaction
5. Metabolic and regulatory networks
6. Gene & Protein expression data
7. Drug screening: ab initio drug design OR drug compound screening in a database of molecules
8. Genetic variability

There are various types of genome analysis.

Quick references on how these analyses and DNA sequencing are done:

1.) https://www.youtube.com/watch?v=Q6UxQf1sVe8
2.) https://www.youtube.com/watch?v=fFzeGGrV7io

Whole genome sequencing and its uses in prevention of disease

Technological advances often outpace our ability to effectively use them, a situation that certainly could pertain to modern genomics. Breathtaking advances in genetic sequencing technology have the potential to make whole genome sequencing (WGS) available for healthcare and disease prevention. 

In the near future, WGS will transform diagnostic testing in the subset of patients with disorders resulting from disruption of a single gene or chromosomal region. Burgeoning application of WGS in a variety of clinical settings will allow assessment of the diagnostic yield in various subsets of symptomatic patients, guiding its widespread use in this setting. However, although WGS will almost certainly be a powerful diagnostic tool for patients with such disorders, whether such analysis will be a valuable clinical tool for those with common diseases is doubtful, for the simple reason that such disorders have many contributing non-genetic etiologies and because our ability to interpret the combinatorial effects of common genetic variants remains limited.

Three major factors that can be influential during WGS are:

1.) Family history
2.) Protein structure prediction
3.) Sequence comparison

Family History:

First, family history may or may not play a crucial role in disease, because many things influence your overall health and likelihood of developing a disease. Sometimes, it's not clear what causes a disease. Many diseases are thought to be caused by a combination of genetic, lifestyle, and environmental factors. The importance of any particular factor varies from person to person. If you have a disease, does that mean your children and grandchildren will get it, too? Not necessarily. They may have a greater chance of developing the disease than someone without a similar family history. But they are not certain to get the disease.

Common health problems that can run in a family include:
  • Alzheimer's disease/dementia
  • Arthritis
  • Asthma
  • Blood clots
  • Cancer
  • Depression
  • Diabetes
  • Heart disease
  • High cholesterol
  • High blood pressure
  • Pregnancy losses and birth defects
  • Stroke.  

 

Genetic Diseases

Some diseases are clearly genetic. This means the disease comes from a mutation, or harmful change, in a gene inherited from one or both parents. Genes are small structures in your body's cells that determine how you look and tell your body how to work. Examples of genetic diseases are Huntington's disease, cystic fibrosis, and muscular dystrophy.

Protein Structure prediction and Sequence comparison:

Protein structure should be predicted when analyzing a particular disease, as this is the initial step that tells us about the interaction mechanisms with the receptors associated with the protein. Various databases provide protein structures, such as the Protein Data Bank (http://www.rcsb.org/pdb/home/home.do).
There are cases where the protein structure cannot be found, or where only hypothetical structures are available. In those situations we can use sequence comparison to help design the required protein. Sequence comparison can be done with the BLAST tool (http://blast.ncbi.nlm.nih.gov/Blast.cgi).
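To show what sequence comparison does at its simplest, here is a small sketch of a Needleman-Wunsch global alignment with a percent-identity score. Note that this is not what BLAST runs (BLAST uses a faster heuristic local alignment); the scoring values and the two short sequences are arbitrary choices of mine for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment.
    Returns (score, aligned_a, aligned_b) with '-' marking gaps."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

def percent_identity(aligned_a, aligned_b):
    """Share of alignment columns where both sequences carry the same residue."""
    same = sum(1 for p, q in zip(aligned_a, aligned_b) if p == q and p != "-")
    return 100.0 * same / len(aligned_a)

# Toy sequences, not real CYP2C9 data.
s, x, y = global_align("ACGT", "ACT")
print(x, y, s, percent_identity(x, y))  # prints: ACGT AC-T 1 75.0
```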

WGS on Alzheimer's Disease:

This was a project carried out with various families who participated in the study. Whole genome sequencing (WGS) was performed for selected subjects from large multigenerational extended families with late-onset Alzheimer's disease to identify novel genes and alleles associated with the occurrence of late-onset AD. The full pedigrees of the individual families had to be considered, as the study concentrates on inheritance.

https://www.niagads.org/adsp/content/study-design
http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41

Let's consider this example scenario in explaining genome analysis:

John has been confirmed as homozygous for a variant in the gene CYP2C9 that is associated with warfarin sensitivity and deep vein thrombosis. He has had complex responses to multiple medications and has multiple clinical conditions that are not explained by the disease variant or by one year of testing at a traditional diagnostic center.


Warfarin sensitivity:

Warfarin is the most commonly prescribed anticoagulant and is among the top 20 most prescribed drugs in the US. Patients suffering from blood clotting diseases such as deep vein thrombosis (DVT) are given this medication. John is a patient suffering from DVT and warfarin sensitivity. The sensitivity is attributed to the variant *2 or *3 alleles of CYP2C9. Traditional anticoagulant treatment was not working for him, so on the physician's suggestion John's parents decided to go with WGS.

I think participating in a clinical trial that offers full exome analysis for John and his parents at no personal cost is a good option. The various medications did not work on him, which may mean that other variants in his genome are contributing to the disease. His parents also get the exome analysis free of charge, which matters because blood clotting disorders can be inherited. I feel this is an advantage for both the researchers and the participants.

Paying $5-$10k for this procedure is not of much use, as they have the option of participating in research instead. If the research turns up incidental findings that were not found at the traditional diagnostic testing center, they can use the money saved for future medications. This decision depends on John's financial and health situation.

Any incidental findings could play a major role in John's life. WGS will turn out to be a pioneering tool in disease treatment.


 Part 2


Converting the variant to VCF format:

1.) OMIM was used to identify the "rs ID" of the variant associated with my disease.


2.) Then I navigated to the “Gene” section of NCBI. From the “Genomic Region and Transcripts” section, I selected “GRCh38 Primary Assembly” and looked up my variant by searching for “rs1057910” in the search box provided, then zoomed in to the DNA view. Position 47639 is counted from the start of the CYP2C9 gene (chromosome 10), which is taken as position 1.



 3.) The NCBI viewer shows my variant in exon 7.


VCF format according to VCF 4.1:

#CHROM   POS     ID           REF   ALT   QUAL    FILTER   INFO             FORMAT    CB00001

10       47639   rs1057910    A     C     25      PASS     NS=1;DP=35;DB    GT:GQ     1|1:52
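To check that the record reads the way I intend, here is a minimal sketch that splits it into named fields and interprets the genotype. It is a hand-rolled parser for this one line, not a full VCF reader: real VCF files are tab-separated and start with header lines, while this sketch simply splits on whitespace. The function names are my own.

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]

def parse_vcf_record(line, sample_names):
    """Split one VCF data line into a dict of fixed fields, INFO and samples."""
    fields = line.split()
    record = dict(zip(VCF_COLUMNS, fields[:9]))
    record["POS"] = int(record["POS"])
    record["QUAL"] = float(record["QUAL"])
    # INFO holds key=value pairs plus bare flags such as DB.
    info = {}
    for item in record["INFO"].split(";"):
        key, _, value = item.partition("=")
        info[key] = value if value else True
    record["INFO"] = info
    # FORMAT names the per-sample sub-fields, e.g. GT:GQ.
    keys = record["FORMAT"].split(":")
    record["samples"] = {
        name: dict(zip(keys, column.split(":")))
        for name, column in zip(sample_names, fields[9:])
    }
    return record

def zygosity(gt):
    """Describe a diploid GT value such as 0/1 or 1|1."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "missing"
    if len(set(alleles)) == 1:
        return "homozygous reference" if alleles[0] == "0" else "homozygous alternate"
    return "heterozygous"

line = "10 47639 rs1057910 A C 25 PASS NS=1;DP=35;DB GT:GQ 1|1:52"
rec = parse_vcf_record(line, ["CB00001"])
print(rec["ID"], zygosity(rec["samples"]["CB00001"]["GT"]))
# prints: rs1057910 homozygous alternate
```

The genotype 1|1 means both copies carry the alternate allele C, which matches John being homozygous for the variant.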