I’m trying to figure out how to use AlphaFold, a program that predicts how amino acid sequences fold. I’ve been following the directions on the creators’ website for downloading and running it, but there’s one part I don’t understand (see the sections mentioned in the link below), and because a mistake here could be costly, I would like to understand it before going further. To me, the website seems to be saying that I have to download the software and then also download the two terabytes of data that the model was trained on, and that having both the software and the training data will allow me to predict the folding of new amino acid sequences (almost as if I had to train the model myself). This seems odd to me because, although I don’t know much about modelling software, it was my understanding that the software creators use data to train a model, find the model that best predicts new inputs, and then give you that pre-set model.

My question - do I have to download both the AlphaFold model and all that data in order to use the software to predict the folding of new amino acid sequences? In particular, if all I want is a visualization of the folding and the accompanying statistics, is this the simplest way for someone with little background in programming (although my university does have the storage capacity for the data if need be)?

(See the Model Parameters and Running AlphaFold sections of this GitHub link: https://github.com/deepmind/alphafold)

Aaron
  • I’m voting to close this question because it is asking for details concerning the installation and use of software. Re: "*do I have to download both the Alphafold model and all that data in order use the software to predict the folding of new amino acid sequences?*" – acvill Aug 20 '21 at 18:25
  • @acvill thanks for your comment. I disagree; I think it’s about the best way to analyze the folding of specific protein sequences – Aaron Aug 20 '21 at 18:27
  • Because the focus of my question, and my ultimate goal, is analyzing the folding of protein sequences, I strongly believe it falls within the scope of allowed questions on this site. In addition, because AlphaFold is a groundbreaking method and its source code was only released recently, this question will be relevant to current and future researchers who will analyze protein folding using the software – Aaron Aug 20 '21 at 18:43
  • Your question consists of two parts: (1) *do I have to download 2+ TB of model and data to use the software?* Their GitHub page makes it clear that, yes, you do, if you want to run AF locally. Domen gives an alternative, but this question is outside the scope of the site. (2) *is AF the simplest way to predict the folding of new amino acid sequences?* [Similar general questions about coding recommendations for protein modeling](https://biology.stackexchange.com/questions/8871) have been asked and closed in the past. – acvill Aug 20 '21 at 18:56
  • That said, [general questions about computational protein structure prediction](https://biology.stackexchange.com/questions/93993) are on-topic and well-received. I suggest taking a look at the [ongoing discussions on Biostars](https://www.biostars.org/p/9484744/) concerning the use and accessibility of AlphaFold. – acvill Aug 20 '21 at 19:08
  • Questions regarding analytical methods, such as those related to bioinformatics or biological simulation, that do not ask **explicitly** about the *underlying biology* are considered off-topic. One place where this is covered in the [help] is the page ["What topics can I ask about here?"](https://biology.stackexchange.com/help/on-topic). – tyersome Aug 20 '21 at 20:22
  • In addition to the other comments, this sort of software is usually run on high-performance computing clusters, which have large numbers of CPUs/GPUs as well as the storage needed. Most computers would struggle to run this locally without a specialist build. – bob1 Aug 21 '21 at 09:02

1 Answer

If you have little experience with programming, I strongly suggest using the official Google Colab notebook for AlphaFold. It involves little more than a few button clicks and pasting in the amino acid sequence.

The official notebook folds without templates, but there is also an unofficial notebook that does use template generation.
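
If you go the notebook route, it can help to tidy the sequence before pasting it in. Here is a minimal Python sketch, assuming the notebook expects a plain string of the 20 standard one-letter amino acid codes. `clean_sequence` and `STANDARD_AMINO_ACIDS` are hypothetical names of my own, not part of AlphaFold or the Colab notebook, and the example sequence is made up.

```python
# Hypothetical helper: NOT part of AlphaFold or its Colab notebook.
# Assumption: the notebook wants only the 20 standard one-letter codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Drop FASTA header lines and whitespace, uppercase, then validate."""
    lines = [line.strip() for line in raw.splitlines()]
    # Lines starting with '>' are FASTA headers, not residues.
    residues = "".join(line for line in lines if line and not line.startswith(">"))
    # Remove any spaces or tabs left inside the sequence lines.
    sequence = "".join(residues.split()).upper()
    if not sequence:
        raise ValueError("no residues found")
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"non-standard residue codes: {', '.join(invalid)}")
    return sequence


if __name__ == "__main__":
    # Made-up FASTA-style input, for illustration only.
    example = ">my_protein\nmkta yiak\nQRQISFVK\n"
    print(clean_sequence(example))
```

The function raises a `ValueError` for ambiguous codes such as `X` or `B` rather than silently dropping them, so you can decide yourself how to handle those positions.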

Domen