PHAR5350 - Bioinformatics Assignment - De Montfort University, UK

All tasks must be performed INDEPENDENTLY. Complete both mini projects and submit them as a single Word document or PDF containing your results.

Part A - Dr Webb

Task 1 - Halothane is an inhaled anaesthetic used for general anaesthesia.

i. Interrogate the PharmGKB database for pharmacogenes associated with Halothane. List the genes you identify and the first variation for each gene.

ii. Use the GWAS Catalog to determine whether the gene/s you listed in 1a have been associated with a phenotypic response in genome-wide association studies. Present your results and then identify any variations relevant to Halothane or its adverse effects.

iii. Produce a summary table for each of the SNPs you have identified in 1a and 1b from their dbSNP entries (this should include comment on any population diversity information that is available).

Task 2 - Investigate the single nucleotide polymorphism database (dbSNP) for variations in the gene/s you identified in question 1a. For each gene:

a. How many variations are present in total?
b. How many variations are located in the coding region?
c. How many variations are nonsynonymous?
d. How many variations are pathogenic?

i. Create a table that catalogues the information you have collected to answer the above questions.

For the first pathogenic variation/s you identified in 2a, access the Variation Viewer view of ~1000 bases around it in the GRCh38.p12 assembly (annotation release 109).

ii. For each gene variation, create a table that catalogues the variation type and the molecular consequences of the variations in this region.

Which population group/s require the highest and lowest number of tag SNPs to cover the region?

iv. Repeat the analysis with r2 set to 0.8 and the allele frequency altered to 10%. Describe the effect this change has on the results and explain why it occurs.
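
The two thresholds in the task above can be illustrated with a small worked example. The sketch below is only an illustration under stated assumptions: the haplotypes are made-up toy data, the function names are hypothetical, and the greedy selection is a generic approach that is not necessarily the algorithm used by whichever tagging tool you run. It computes pairwise r2 from phased 0/1 haplotypes, filters SNPs by minor allele frequency, and then picks tag SNPs until every remaining SNP is captured.

```python
def allele_freq(haplotypes, i):
    """Frequency of the allele coded 1 at SNP index i."""
    return sum(h[i] for h in haplotypes) / len(haplotypes)


def r_squared(haplotypes, i, j):
    """Pairwise LD r2 = D^2 / (p_i(1-p_i) p_j(1-p_j)) from phased haplotypes."""
    n = len(haplotypes)
    p_i = allele_freq(haplotypes, i)
    p_j = allele_freq(haplotypes, j)
    p_ij = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ij - p_i * p_j
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    return 0.0 if denom == 0 else d * d / denom


def greedy_tag_snps(haplotypes, r2_min=0.8, maf_min=0.05):
    """Greedy tag SNP selection (illustrative, not any specific tool's method)."""
    n_snps = len(haplotypes[0])
    # Keep only SNPs whose minor allele frequency meets the cutoff.
    kept = []
    for i in range(n_snps):
        f = allele_freq(haplotypes, i)
        if min(f, 1 - f) >= maf_min:
            kept.append(i)
    untagged = set(kept)
    tags = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs.
        best = max(
            sorted(untagged),
            key=lambda i: sum(
                r_squared(haplotypes, i, j) >= r2_min for j in untagged
            ),
        )
        tags.append(best)
        captured = {j for j in untagged if r_squared(haplotypes, best, j) >= r2_min}
        untagged -= captured | {best}
    return tags


# Toy data (hypothetical): 16 haplotypes, 4 SNPs.
# SNP 0 and SNP 1 are identical, SNP 2 is independent of them,
# SNP 3 is rare (1 copy in 16, frequency 0.0625).
haps = [
    (1 if k < 8 else 0, 1 if k < 8 else 0, 1 if k % 2 == 0 else 0, 1 if k == 0 else 0)
    for k in range(16)
]

tags_maf_5 = greedy_tag_snps(haps, r2_min=0.8, maf_min=0.05)
tags_maf_10 = greedy_tag_snps(haps, r2_min=0.8, maf_min=0.10)
print("tag SNPs at 5% cutoff:", tags_maf_5)
print("tag SNPs at 10% cutoff:", tags_maf_10)
```

In this toy set, the rare SNP is not in strong LD with anything else, so it needs its own tag at the lower frequency cutoff and is dropped from consideration at the higher one. Whether the same pattern holds for your region depends on the real population data.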

Task 4 - You are required to design a test to identify individuals who are at risk of malignant hyperthermia when using volatile anaesthetics such as halothane.

i. Consider the variations you have identified for questions 1 and 2. Discuss which you would include in the test.

ii. Perform a literature search to identify additional variations for inclusion.

iii. Provide a critical evaluation of whether there are any population groups for which the test results would be most beneficial.

Part B - Dr Smith

Sequence -

MSALCWGRGAAGLKRALRPCGRPGLPGKEGTAGGVCGPRRSSSASPQEQDQDRRKDWGHV
ELLEVLQARVRQLQAESVSEVVVNRVDVARLPECGSGDGSLQPPRKVQMGAKDATPVPCG
RWAKILEKDKRTQQMRMQRLKAKLQMPFQSGEFKALTRRLQVEPRLLSKQMAGCLEDCTR
QAPESPWEEQLAQLLQEAPGKLSLDVEQAPSGQHSQAQLSGQQQRLLAFFKCCLLTDQLP
LAHHLLVVHHGQRQKRKLLTLDMYNAVMLGWARQGAFKELVYVLFMVKDAGLTPDLLSYA
AALQCMGRQDQDAGTIERCLEQMSQEGLKLQALFTAVLLSEEDRATVLKAVHKVKPTFSL
PPQLPPPVNTSKLLRDVYAKDGRVSYPKLHLPLKTLQCLFEKQLHMELASRVCVVSVEKP
TLPSKEVKHARKTLKTLRDQWEKALCRALRETKNRLEREVYEGRFSLYPFLCLLDEREVV
RMLLQVLQALPAQGESFTTLARELSARTFSRHVVQRQRVSGQVQALQNHYRKYLCLLASD
AEVPEPCLPRQYWEALGAPEALREQPWPLPVQMELGKLLAEMLVQATQMPCSLDKPHHSS
RLVPVLYHVYSFRNVQQIGILKPHPAYVQLLEKAAEPTLTFEAVDVPMLCPPLPWTSPHS
GAFLLSPTKLMRTVEGATQHQELLETCPPTALHGALDALTQLGNCAWRVNGRVLDLVLQL
FQAKGCPQLGVPAPPSEAPQPPEAHLPHSAAPARKAELRRELAHCQKVAREMHSLRAEAL
YRLSLAQHLRDRVFWLPHNMDFRGRTYPCPPHFNHLGSDVARALLEFAQGRPLGPHGLDW
LKIHLVNLTGLKKREPLRKRLAFAEEVMDDILDSADQPLTGRKWWMGAEEPWQTLACCME
VANAVRASDPAAYVSHLPVHQDGSCNGLQHYAALGRDSVGAASVNLEPSDVPQDVYSGVA
AQVEVFRRQDAQRGMRVAQVLEGFITRKVVKQTVMTVVYGVTRYGGRLQIEKRLRELSDF
PQEFVWEASHYLVRQVFKSLQEMFSGTRAIQHWLTESARLISHMGSVVEWVTPLGVPVIQ
PYRLDSKVKQIGGGIQSITYTHNGDISRKPNTRKQKNGFPPNFIHSLDSSHMMLTALHCY
RKGLTFVSVHDCYWTHAADVSVMNQVCREQFVRLHSEPILQDLSRFLVKRFCSEPQKILE
ASQLKETLQAVPKPGAFDLEQVKRSTYFFS

Task 1 - You have been provided with the sequence of a human protein (above).

i) Identify what the sequence is and discuss its function.

ii) Prepare a multiple sequence alignment of sequence 1 with other similar sequences from different organisms.

iii) Using your multiple sequence alignment, produce a phylogenetic tree. In your answer, consider adding an outgroup to help with rooting. Discuss your findings.

In your comparison, discuss the experimental design and consider the platforms and the number and range of samples used.

a. Series GSE7905
b. Series GSE2361

iii) Determine the expression profile of CYP2E1 in normal human tissues using the above two series.

a. In which tissue is CYP2E1 expression most prominent?
b. How similar are the results from the two studies?
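
One way to put a number on "how similar" is a rank correlation over the tissues that both series have in common, since expression values from different platforms are not on the same scale. The sketch below is a minimal illustration under stated assumptions: the tissue labels and expression values are hypothetical placeholders (not values from GSE7905 or GSE2361), and the function names are invented for this example. It finds the tissue with the highest value in each profile and computes a Spearman correlation on the shared tissues.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (lowest) upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_on_shared(profile_a, profile_b):
    """Spearman correlation restricted to tissues present in both profiles."""
    shared = sorted(set(profile_a) & set(profile_b))
    ranks_a = average_ranks([profile_a[t] for t in shared])
    ranks_b = average_ranks([profile_b[t] for t in shared])
    return shared, pearson(ranks_a, ranks_b)


def top_tissue(profile):
    """Tissue label with the highest expression value."""
    return max(profile, key=profile.get)


# Hypothetical placeholder profiles: tissue label -> expression value.
# Replace these with the per-tissue values you extract from each series.
study_1 = {"t1": 9.1, "t2": 4.0, "t3": 3.2, "t4": 2.5, "t5": 1.0}
study_2 = {"t1": 850.0, "t2": 120.0, "t3": 60.0, "t4": 75.0, "t6": 10.0}

shared, rho = spearman_on_shared(study_1, study_2)
print("shared tissues:", shared)
print("Spearman rho:", round(rho, 3))
print("top tissue per study:", top_tissue(study_1), top_tissue(study_2))
```

A correlation coefficient summarises agreement in tissue ordering only; it says nothing about differences in sample numbers or experimental design, which still need to be discussed separately.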

iv) Compare and discuss the expression profile of CYP2E1 from these two studies with that shown in the GTEx RNA-Seq study (53 tissues) in the EBI Gene Expression Atlas.

a. Based on what you know of its function, are these results to be expected? 