Structural bioinformatics - Project
General information
In this mini project you will apply the concepts and methods that we learned in structural bioinformatics to study a human protein that is associated with neurodegenerative disorders. The different sections correspond to topics covered over the 3 weeks. The code required to answer the questions below is the same as that available in the online reading material from the previous classes.
The structure of Sphingolipid delta(4)-desaturase (DES1)
The human Sphingolipid delta(4)-desaturase (DES1, UniProt ID O15121) is an enzyme involved in sphingolipid metabolism. Mutations in this protein can lead to diseases including hypomyelinating leukodystrophy, a neurodevelopmental disorder caused by a defect in the formation of the myelin sheath in the brain. This protein has no experimental structure, and it is too divergent from any experimental structure determined to date to build a structural model by homology. In this project we are going to take advantage of AlphaFold2 to study the predicted structure of DES1.
For this purpose you can use RStudio and the terminal via: http://cousteau-rstudio.ethz.ch
We will make use of an AlphaFold2 predicted structure that is already available in the /nfs/teaching/551-0132-00L/11_Structural_bioinformatics_project/FS24_pdb directory. The file is named <your ETH user name>.fna. Begin by copying your file to your home directory. If you are using RStudio for this exercise, don’t forget to set your working directory.
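The copy step can be sketched in the terminal as follows. This is a minimal sketch that assumes your ETH user name matches the shell's $USER variable; the helper name copy_structure is hypothetical and only used for illustration.

```shell
# Directory named above that holds the predicted structures.
SRC_DIR="/nfs/teaching/551-0132-00L/11_Structural_bioinformatics_project/FS24_pdb"

# Hypothetical helper: copy <user name>.fna from a source to a destination directory.
copy_structure() {
    src="$1/$2.fna"    # $1: source directory, $2: user name
    if [ ! -f "$src" ]; then
        echo "File not found: $src" >&2
        return 1
    fi
    cp "$src" "$3/"    # $3: destination directory
}

# On the course server (assumption: $USER is your ETH user name):
# copy_structure "$SRC_DIR" "$USER" "$HOME"
```

If the file is reported as not found, check the user name and the directory listing before continuing.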
Using the bio3d library in R, open this PDB file and answer the following questions:
How many chains are in this file?
What is the x coordinate of the atom with index number 23?
Use the dm() function in bio3d to calculate all pairwise distances between the C-alpha atoms of the protein, then answer:
What is the distance between the C-alpha atoms with indices 10 and 20, rounded to the first decimal place?
What is the mean distance between all residues, rounded to the first decimal place?
Comparing the structure of human DES1 with other desaturases
Given that there are no structures in the PDB that are closely related to DES1 (your file), we are going to study the structural similarity of the human DES1 structure with other AlphaFold2 predicted structures across different species. This would allow us to, for example, understand whether studying this protein in different species could give us relevant information about the human protein.
To do this you will use the PDB files for the predicted structures of related proteins in X. laevis (frog.pdb), C. elegans (worm.pdb), D. melanogaster (fly.pdb) and S. pombe (pombe.pdb). These are available in /nfs/teaching/551-0132-00L/11_Structural_bioinformatics_project.
In order to make a list of PDBs for alignment you can use the following command that we did not practice in class:
my_pdbs <- c("./<your ETH user name>.fna", "./frog.pdb", "./pombe.pdb", "./worm.pdb", "./fly.pdb")
As we did in class, use the pdbaln() and pdbfit() functions to align the sequences and the structures described above. After calculating the similarity of the structures using the rmsd() function, answer the following questions:
What is the RMSD between your structure (<your ETH user name>.fna) and the frog structure (rounded to the second decimal place)?
What is the mean RMSD over all pairs of structures (rounded to the second decimal place)?
Mutations in Sphingolipid delta(4)-desaturase
The gene encoding the human Sphingolipid delta(4)-desaturase is a well-known disease-associated gene. For example, the mutations N255S, A280V, N113D and R133W are known to be disease causing, while L175Q, D65N and N267S are known to be benign. We will take advantage of the predicted structure to study the impact of these mutations at a structural level using FoldX. Such a structural analysis allows us to study mutations that already have a known effect and potentially to make predictions for new mutations found in patients.
For this, you will find a structure to study in the “foldx” folder (within /nfs/teaching/551-0132-00L/11_Structural_bioinformatics_project) called DES1_foldx.pdb. The following analysis is done in the terminal. Don’t forget that you may need to run “ml FoldX” in the terminal before running FoldX, and that the “rotabase.txt” file must be present in the directory where you are running FoldX. If you don’t have this file you can copy it to the folder using:
cp /nfs/nas22/fs2202/biol_micro_teaching/software/easybuild/software/FoldX/4.0/bin/rotabase.txt ./
You will then need to create the file “individual_list.txt” with the mutations described, written in FoldX notation (wild-type residue, chain, position, mutant residue): NA255S; AA280V; NA113D; RA133W; LA175Q; DA65N; NA267S;
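One way to create this file from the terminal is shown below. The sketch assumes that FoldX reads one mutant per line, each terminated by a semicolon, so the seven single mutations are written on seven separate lines.

```shell
# Write one mutation per line; each entry ends with a semicolon.
printf '%s\n' \
    'NA255S;' 'AA280V;' 'NA113D;' 'RA133W;' \
    'LA175Q;' 'DA65N;' 'NA267S;' > individual_list.txt

# Quick check: seven lines are expected.
wc -l < individual_list.txt
```

Writing the file with printf rather than a text editor avoids stray blank lines or Windows line endings, which can be hard to spot.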
Then use the FoldX BuildModel command to generate the mutated structures and predict the impact of the mutations. The result of the predictions will be listed in the file Dif_DES1_foldx.fxout. Analyzing the results of these predictions, answer the following questions:
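A BuildModel run might look like the sketch below. The executable name (here foldx, overridable through the hypothetical FOLDX_BIN variable) is an assumption and may differ on the course server. The sketch also assumes that DES1_foldx.pdb, individual_list.txt and rotabase.txt are all in the current directory.

```shell
# Assumption: the FoldX executable is called "foldx" once the module is loaded.
FOLDX_BIN="${FOLDX_BIN:-foldx}"

if command -v "$FOLDX_BIN" >/dev/null 2>&1; then
    # Build the mutant models listed in individual_list.txt.
    "$FOLDX_BIN" --command=BuildModel \
        --pdb=DES1_foldx.pdb \
        --mutant-file=individual_list.txt
else
    echo "FoldX not found; try 'ml FoldX' first." >&2
fi
```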
Which of the mutations can be considered destabilizing, with a total energy ddG > 2 kcal/mol?
NA255S;
AA280V;
NA113D;
RA133W;
LA175Q;
DA65N;
NA267S;
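To list candidate mutations, the total energy column of the output file can be filtered in the terminal. This sketch assumes that Dif_DES1_foldx.fxout is tab-separated, that its column-header row starts with Pdb, and that the second column holds the total energy; the function name destabilizing is hypothetical. The rows are assumed to follow the order of the entries in individual_list.txt.

```shell
# Hypothetical helper: print model name and total energy for rows above a threshold.
destabilizing() {
    # $1: Dif_*.fxout file, $2: ddG threshold
    awk -F'\t' -v thr="$2" '
        $1 == "Pdb" { header = 1; next }    # column-header row; data rows follow
        header && NF > 1 && $2 + 0 > thr { print $1 "\t" $2 }
    ' "$1"
}

# Example call once FoldX has finished:
# destabilizing Dif_DES1_foldx.fxout 2
```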
For the mutation AA280V, which energy term contributes most strongly to the final energy prediction? (Multiple choice)
entropy mainchain
Van der Waals clashes
Solvation Polar
Backbone Hbond
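The individual energy terms for one mutation can be compared with a similar filter. This sketch takes "contributes most strongly" to mean the term with the largest absolute value, which is an interpretation rather than something the output file states. It makes the same assumptions about the file layout (tab-separated, header row starting with Pdb, total energy in the second column), and the function name top_term is hypothetical.

```shell
# Hypothetical helper: report the energy term with the largest absolute value.
top_term() {
    # $1: Dif_*.fxout file, $2: 1-based row number of the mutation
    awk -F'\t' -v row="$2" '
        $1 == "Pdb" {                       # header row: remember the term names
            for (i = 1; i <= NF; i++) name[i] = $i
            header = 1
            next
        }
        header && NF > 2 {
            n++
            if (n != row) next
            best = 3
            bestabs = -1
            for (i = 3; i <= NF; i++) {     # skip model name and total energy
                v = $i + 0
                a = (v < 0) ? -v : v
                if (a > bestabs) { bestabs = a; best = i }
            }
            print name[best] "\t" $best
            exit
        }
    ' "$1"
}

# Example: AA280V is the second entry in individual_list.txt
# top_term Dif_DES1_foldx.fxout 2
```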
Answer submission