IPA Pathways

Today we will be looking at the program IPA (Ingenuity Pathway Analysis) and some of what it can accomplish. IPA is excellent for mapping out pathways and finding connections between genes or proteins. For our exercise we are going to look at ASPA and some of its associated connections. IPA also gives you a lot of secondary information about your molecule of interest; for example, I found that there are no drugs currently associated with ASPA. Below is an example of a pathway that you can build with IPA. ASPA appears to be either understudied or not very active, because when I limited the pathway to humans IPA did not find any connections. Only when I included multiple species and kept indirect interactions checked did I start to find connections. The two indirect connections are casein and FOLR1.

[Image: ASPA pathway]

According to the pathway, 11 inhibitors, 2 activators, and 2 protein-protein interaction molecules are connected to ASPA. The 11 inhibitors are miRNAs, which are small noncoding RNAs that regulate gene expression. The two activators are enzymes, which catalyze chemical reactions. One protein-protein interaction molecule is a transporter and the other is involved in a few different functions.

In the picture below you can see that IPA gives the location of each molecule on the pathway. A majority of the nodes are found in the cytoplasm. This makes sense because the miRNAs would be found where their complementary mRNAs are found. As for the two enzymes, aminoacylase and aspartoacylase, IPA has placed them in the Other category because their cellular location is unknown to IPA. FOLR1 is found in the plasma membrane because it is a transporter. As a side note, FOLR1 may be a cofactor for virus entry into some cells, including for the filovirus family, which Ebola is part of. Casein is also in the Other category. This is probably because there are multiple types of casein found in many different areas.

[Image: ASPA pathway with cellular locations]

Canonical Pathway

Another feature of IPA is canonical pathways. This lists all the canonical pathways that the genes on your pathway take part in. IPA does not have a canonical pathway for ASPA, so I will have to use casein. Casein is involved in the Prolactin Signaling pathway. This is a very broad pathway that is involved in six main areas: “reproduction and lactation, growth and development, endocrinology and metabolism, brain and behaviour, immunomodulation and osmoregulation”. The pathway begins with the secretion of the hormone prolactin from the pituitary gland. Prolactin itself is involved in more than 300 separate effects. As you can see in the picture below, casein, or more specifically casein beta, is involved in the lactation process.

[Image: casein in the Prolactin Signaling pathway]

Genome Analysis

Part 1

In this post I’ll be taking a look at whole genome sequencing. This is a powerful new tool for researchers and doctors. Technology today allows us to sequence a person's entire genome at a relatively low cost. This can be used for many types of research and even by your doctor to diagnose a condition. This new technology does bring many obstacles and issues. Today I’ll be looking at one such scenario and weighing in on what would be the best plan of action for a patient with Canavan disease.

In the scenario we have a patient with Canavan disease. Some treatments have begun, but the patient has shown a complex response to multiple medications that is not explained by the disease variant or by 1 year of traditional diagnostic work. The patient’s family is faced with the following choices.

  • Participate in a clinical trial offering full exome analysis for [Mary/John] and their parents at no personal cost.
  • Seek full genome analysis and work with their insurance provider to seek coverage, a 4-6 month negotiation.
  • Pay out of pocket for the full genome analysis ($5-10k).
  • Use direct-to-consumer services and perform independent analysis of the raw results.

Unfortunately, for Canavan disease the choice of treatment would most likely be a moot point. There is currently no cure for Canavan disease and no standardized treatment. Most patients only live into early childhood, with a rare few living beyond that, although this may not be true for long, as there is some exciting research on the horizon in the form of gene therapy and triacetin supplementation. As for what this family should do, unfortunately I believe they will need to pick the third option. This is based purely on how severe a case of Canavan disease the patient has. There are some cases of Canavan patients living into their twenties, although this is quite rare. Their doctor could measure the patient’s aspartoacylase levels to get some sense of how severe the disease is. Most likely the patient would only live for about 18 months, and if they have already undergone 1 year of traditional diagnosis they simply would not have time to wait for their insurance company. Because of the severity of the disease they also couldn’t risk the clinical trial, because it might not find the problem and they couldn’t afford to waste that time. As for the fourth option, I still believe that it would simply take too much time, unless someone in their family had experience with whole genomes and could quickly assess the issue.

Another ethical issue with whole genome analysis is incidental information. Incidental information is the information you get from a patient’s genome that they did not originally ask for. This raises a question: if this patient had his genome analysed, what should the doctor do with the incidental information? I believe that this should be completely up to the patient. One argument for that position is brought up in the article “When Getting Your Genome is Terrifying”. In the article the author speaks about how he personally would be scared to learn about major life-changing health issues his genome might show. He is living a happy life and doesn’t want that to change. Personally I don’t share that fear and would learn what my genome has to say about me in a heartbeat. This would allow me to plan and hopefully prevent issues so that in the long run I could have a happier and healthier life. But I do believe that it should still be up to the patient, because there are those who could not handle that information in a safe and productive way. I feel that forcing that information on a patient violates the patient’s autonomy. While the information would most likely be beneficial, a lot of people simply don’t want the extra burden and worry about what their genome might say about them.

Part 2

In this second part I will be writing out, in VCF format, the ASPA variant that I have been using in previous assignments. VCF is a format for storing whole genome data that only saves the variants from the genome being analysed.

In this picture you can see the rsID given to the variant that causes Canavan disease from ASPA. The ID is rs28940279.

[Image: SNP ID]

In this picture you can see the rsID on the NCBI genome browser and look at its actual position on chromosome 17. The variant is at position 3,499,000 from the start of the chromosome and is located in the sixth and last exon. In the second picture you can see the exons outlined in red and the rsID labeled at the top.

[Image: variant position in the NCBI genome browser]

[Image: exons]

Below is the VCF format for my pet gene’s variant given the information in the assignment.

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001
17 3499000 rs28940279 A C 25 PASS NS=1;DP=35;AF=1 GT:GQ 0|1:52
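
To show how the columns of that record fit together, here is a minimal Python sketch that splits a single VCF data line into named fields. It is only an illustration under a few assumptions: `parse_vcf_line` is a hypothetical helper, not part of any library; the line is written with a plain integer position and no spaces inside INFO, since VCF positions carry no thousands separators and INFO entries are separated by semicolons only; and whitespace splitting is used because the record is shown with spaces here, whereas real VCF files are tab-delimited.

```python
# Column names of the eight fixed VCF fields plus FORMAT.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]


def parse_vcf_line(line, sample_names):
    """Parse one VCF data line into a dictionary (hypothetical helper)."""
    # Real VCF is tab-delimited; split() is used because the post shows spaces.
    fields = line.split()
    record = dict(zip(VCF_COLUMNS, fields[:9]))
    record["POS"] = int(record["POS"])
    record["QUAL"] = float(record["QUAL"])

    # INFO is a semicolon-separated list of KEY=VALUE pairs (or bare flags).
    info = {}
    for entry in record["INFO"].split(";"):
        key, _, value = entry.partition("=")
        info[key] = value if value else True
    record["INFO"] = info

    # Each sample column follows the key order declared in FORMAT.
    format_keys = record["FORMAT"].split(":")
    record["SAMPLES"] = {
        name: dict(zip(format_keys, column.split(":")))
        for name, column in zip(sample_names, fields[9:])
    }
    return record


line = "17 3499000 rs28940279 A C 25 PASS NS=1;DP=35;AF=1 GT:GQ 0|1:52"
record = parse_vcf_line(line, ["NA00001"])
print(record["ID"], record["POS"], record["SAMPLES"]["NA00001"]["GT"])
```

The genotype `0|1` reads as a phased heterozygote: one copy of the reference allele (A) and one copy of the alternate allele (C).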

ASPA Diagnostics

In this post I’ll be looking at some of the different tools for diagnosing ASPA variants, or more specifically Canavan disease, the disease state caused by an ASPA variant. I’ll look at the different variants, the different tests available for Canavan disease, and possible restriction sites that could be used to check for ASPA variants.

Currently there are twenty ASPA variants listed on ClinVar. Of the twenty, seventeen are listed as likely disease causing and three have uncertain significance. There definitely appears to be a large gap in clinical research into the different variants of ASPA. Of the twenty on ClinVar, not a single variant is recognized by a professional society or an expert panel. All twenty are from a single submitter. This is somewhat surprising to me because of the nature of Canavan disease. Tragically, once you are diagnosed with Canavan disease at birth there is no current treatment and life expectancy is very short, but parents can be screened for the variant in their own genomes to know the risk of their child inheriting Canavan disease. This is a very rare disease that only affects ~1/10,000 Ashkenazi Jews and is unknown in the general population, which could explain the lack of research.

Currently there are quite a few tests to diagnose Canavan disease. According to the NCBI Gene Testing reference, there are fifty-four different labs around the world that will do testing for Canavan disease. Even though there are quite a few laboratories that can perform Canavan disease screening, most follow the same type of testing. All current testing involves checking for the disease-causing variants or a deletion/duplication analysis. Some of the techniques are Multiplex Ligation-dependent Probe Amplification, Sanger sequencing, next-generation sequencing, SNP detection, and microarray.

Another method for diagnosing some genetic diseases is using restriction enzymes and running an agarose gel. This relies on the variant in question changing an enzyme cutting site. I took a look at the A854C variant in ASPA. This variant actually turns out to be a very good candidate for restriction enzyme analysis. There are three restriction sites (NotI, EagI, and BsmI; EagI and BsmI are six-cutters, while NotI recognizes an eight-base site) that were created by the variant, as you can see in the picture below. Each of these sites is also unique in the sequence, so in my opinion any of these enzymes could be used for diagnosis.

[Image: restriction sites created by the ASPA variant]

I did run into a strange issue with GeneQuest when I was attempting to run an agarose gel simulation. When I added the enzymes mentioned above to GeneQuest, the program recognized the new enzyme sites in the variant, but when I ran the agarose simulation the gel looked the same as the wild type. I tried the simulation with all three enzymes, but the results were the same for all three. There was an additional enzyme site created (AciI), and this simulation can be seen below. In the laboratory this wouldn’t be particularly useful because of how small the smallest bands are. Further laboratory investigation would be needed to find out why none of the other sites were cutting the sequence in the simulation. The band sizes created by AciI are ~4, ~45, and ~1396 bp.

[Image: AciI agarose gel simulation]
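
The idea behind a restriction digest test can be sketched in a few lines of Python: find where the enzyme's recognition site occurs, then turn the cut positions into fragment lengths. This is only a rough illustration and not how GeneQuest works internally. The two short sequences are made-up toy sequences (not ASPA), and `find_cut_positions` and `fragment_sizes` are hypothetical helper names. The example uses EagI, which recognizes CGGCCG and cuts after the first C; because that site is palindromic, searching one strand is enough.

```python
def find_cut_positions(sequence, site, cut_offset):
    """Return the cut positions for every occurrence of `site`.

    A returned value p means the molecule is cut after its first p bases.
    `cut_offset` is how many bases into the site the enzyme cuts.
    Only the given strand is searched, which is sufficient for
    palindromic sites such as EagI's CGGCCG.
    """
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    return positions


def fragment_sizes(sequence_length, cut_positions):
    """Fragment lengths from a complete digest of a linear molecule."""
    boundaries = [0] + sorted(set(cut_positions)) + [sequence_length]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Toy sequences (hypothetical): a single A-to-C change creates an EagI site.
wild_type = "ATGCAACGGACGTTAGCTAA"
variant = "ATGCAACGGCCGTTAGCTAA"

for name, seq in [("wild type", wild_type), ("variant", variant)]:
    cuts = find_cut_positions(seq, "CGGCCG", 1)  # EagI cuts C^GGCCG
    print(name, fragment_sizes(len(seq), cuts))
```

On a gel, the wild type would give one uncut band while the variant would give two smaller bands, which is the difference a diagnostic digest relies on.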

Ebola Antigen Analysis

In my analysis I took a look at KM233100.1 and protein AIG96469.1. I used the IEDB MHC-I processing predictions tool, which takes a protein sequence and gives you the predicted peptide affinity for MHC class I. An IC50 value (in nM) of less than 50 indicates high affinity, values between 50 and 500 indicate intermediate affinity, and values between 500 and 5000 indicate low affinity. According to the prediction tool there are three likely epitope candidates: as you can see in the picture, there are three peptides with an IC50 of less than 50 nM. This isn’t a guarantee that these are the only epitope candidates, because some peptides have a low affinity but are still epitopes.

Further analysis of the first peptide in the picture shows that this is already a known epitope, with the actual epitope being one AA longer. The second prediction is also a known epitope that is one AA longer than the prediction. Finally, the last predicted epitope is also an already known epitope, but this time the prediction matched it exactly. All three epitopes were identified in the same paper. To me this analysis shows that the prediction tool is doing a good job at predicting the correct affinity for peptides. This would be a very powerful tool when conducting research on finding epitopes. It would cut out a lot of the guesswork and give a very good foundation to build upon. Another analysis by the Virus Pathogen Resource shows a possible 119 predicted epitopes, while the MHC-I processing predictions tool only shows 17 peptides with even intermediate affinity. Further experimental analysis is definitely needed to find all the correct epitopes.

[Image: predicted MHC-I peptides]
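
The affinity cutoffs described above are easy to express as a small function. The sketch below is only an illustration of those thresholds, not part of the IEDB tool: `classify_affinity` is a hypothetical name, the peptide names and IC50 values are made-up placeholders rather than results from my analysis, and how the exact boundary values (50, 500, 5000) are assigned is my own assumption.

```python
def classify_affinity(ic50_nm):
    """Bin a predicted IC50 (in nM) using the cutoffs quoted in the post."""
    if ic50_nm < 50:
        return "high"
    if ic50_nm < 500:
        return "intermediate"
    if ic50_nm < 5000:
        return "low"
    # Above 5000 nM is outside the ranges described in the post.
    return "outside stated ranges"


# Placeholder predictions (hypothetical values, for illustration only).
predictions = {
    "peptide_A": 12.0,
    "peptide_B": 230.0,
    "peptide_C": 1800.0,
    "peptide_D": 9000.0,
}

# Keep only the strongest predicted binders as epitope candidates.
candidates = [
    name for name, ic50 in predictions.items()
    if classify_affinity(ic50) == "high"
]
print(candidates)
```

Filtering on the high-affinity bin alone is a convenient first pass, but as noted above some real epitopes bind with low affinity, so a looser cutoff may be worth checking too.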

Week 6 Assignment

Today I’ll be looking at the 3D structure of my pet gene ASPA. When looking through the NCBI structure database I decided on using the entry with the PDB ID of 2O53. It was one of the most recent 3D models and appeared to have no alterations. This protein came as a dimer, but I hid one of the two chains to better examine my target area. In my last post I mentioned the E285A mutation that has been found to cause Canavan disease, and I compared the wild type protein to the E285A variant with various structural modeling algorithms. Today I’ll be looking at the 285th AA in the actual 3D model.

Below you can see the space-filling model. While it is hard to see, the highlighted region sits in a small concave pocket on the 3D model. This is believed to be where the catalytic domain is.

[Image: space-filling model]

This next picture gives a closer look at the target spot. From here you can see that position 285 is part of a turn between two beta sheets.

[Image: zoomed wireframe view]

When actually examining the 3D structure I found that quite a few of the predictions the algorithms gave me in my last post were incorrect or slightly off. First off, the 3D model shows that the 285 region is actually a simple turn, as you can see above. The Protean 3D algorithm guessed that the area would be an alpha or beta region and not a turn. Protean 3D also predicted that the 285 area would be hydrophobic and the area directly after would be hydrophilic. As you can see from the picture below, the 285 region is actually hydrophilic. The predictions weren’t totally wrong, because the model does show a hydrophobic region followed by a hydrophilic region just a few AAs down the protein.

[Image: hydrophobicity view]

The charge prediction was also a little off, as you can see in the picture below. The 285 spot is negative as predicted, but Protean 3D also predicted that a few AAs on each side would be negative. The 3D model instead shows that the AAs around the 285 spot are neutral.

[Image: charge view]
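
To make the hydrophobicity and charge comparison concrete, here is a small Python sketch of per-residue properties. It does not reproduce Protean 3D's algorithms, which I have not seen. As a stand-in it assumes the widely used Kyte-Doolittle hydropathy scale (negative values are hydrophilic) and a simple side-chain charge at neutral pH with histidine treated as neutral. `describe_residue` and `windowed_hydropathy` are hypothetical helper names.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Side-chain charge at neutral pH (assumption: histidine counted as neutral).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def describe_residue(aa):
    """Summarize hydropathy and charge for a one-letter amino acid code."""
    hydropathy = KYTE_DOOLITTLE[aa]
    return {
        "hydropathy": hydropathy,
        "hydrophilic": hydropathy < 0,
        "charge": SIDE_CHAIN_CHARGE.get(aa, 0),
    }


def windowed_hydropathy(sequence, window=5):
    """Average hydropathy over a sliding window along a peptide."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# E285A swaps glutamate (E) for alanine (A).
print("E:", describe_residue("E"))
print("A:", describe_residue("A"))
```

On this scale glutamate is hydrophilic and negatively charged, which agrees with what the 3D model shows at position 285, while alanine is mildly hydrophobic and uncharged.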

Overall I would say that the Protean 3D algorithms had some issues predicting what this protein would actually look like. The predictions were definitely helpful for seeing what the E285A variant would do, but the actual 3D model has quite a few differences.