Add documentation for using Rosetta with QM packages #48

Open
wants to merge 12 commits into
base: master
1 change: 1 addition & 0 deletions rosetta_basics/Rosetta-Basics.md
Original file line number Diff line number Diff line change
@@ -45,6 +45,7 @@
* [[Design-centric guidance terms|design-guidance-terms]]
* [[Hydrogen bonding score term|hbonds]]
* [[Centroid score terms]]
* [[Quantum mechanical energy calculations in Rosetta | qm-energy-calculations ]]
- [[Symmetry]]
- [[Minimization | Minimization Overview]] - Backbone and/or side chain degrees of freedom
- [[Comparing Structures]]
@@ -0,0 +1,34 @@
# GAMESS Point Energy Tutorial 1: Running SPE calculation on a cyclic peptide of known structure

[[Back to Quantum mechanical energy calculations in Rosetta|qm-energy-calculations]]

A single point energy (SPE) calculation with GAMESS can be run through Rosetta directly from the command line. Depending on preference, the user can either use an XML script or supply all options through the command-line interface. Two tutorials demonstrate how this can be done; this is the first.


Suppose we want to know the SPE of the peptide with PDB ID 5vav. First, download the structure from the PDB. Next, create a file that will serve as the flags file for Rosetta (and, through Rosetta, for GAMESS). We will call this file rosetta.flags and give it the following options:

-----------------------------------------------------------------------------------
-in:file:s 5vav.pdb
-in:file:fullatom true
-score:weights gamess_qm.wts
-GAMESS_executable_version 00
-GAMESS_path /home/bturzo/gamess
-quantum_mechanics::GAMESS::rosetta_GAMESS_bridge_temp_directory ./
-quantum_mechanics::GAMESS::clean_rosetta_GAMESS_bridge_temp_directory false
-quantum_mechanics::GAMESS::clean_GAMESS_scratch_directory false
-quantum_mechanics::GAMESS::GAMESS_threads 10
-quantum_mechanics:GAMESS:gamess_qm_energy_geo_opt false
-out:levels core.quantum_mechanics.RosettaQMDiskAccessManager_GAMESS_Output:500
-----------------------------------------------------------------------------------

In this example:

- `-in:file:s` specifies the PDB file (5vav.pdb) on which the SPE calculation is run.
- `-in:file:fullatom` enables full-atom input of the PDB file (Vikram is this option required?).
- `-score:weights` enables the QM score term, provided that the weights file (gamess_qm.wts) contains the `gamess_qm_energy` line shown at the end of this list. The weights file can also be used to pass further options to GAMESS (discussed later).
- `-quantum_mechanics::GAMESS::rosetta_GAMESS_bridge_temp_directory` sets the directory in which three files (.err, .inp, and .log) are written; in this example, that is the current directory. The .inp file contains the input that was generated for GAMESS. The .log and .err files contain, respectively, the output and error messages produced by GAMESS at runtime.

---------------------
gamess_qm_energy 1.0
---------------------
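
The two input files described above can also be generated by a short script. The following is a minimal Python sketch and not part of Rosetta: the helper name `write_spe_inputs` is hypothetical, the flag values are copied from the example above (including the site-specific `-GAMESS_path`, which must be changed for your installation), and the script only writes the files; it does not launch Rosetta or GAMESS.

```python
from pathlib import Path

# Flags copied verbatim from the example rosetta.flags shown above.
# The GAMESS path is site-specific and must be edited for your system.
ROSETTA_FLAGS = [
    "-in:file:s 5vav.pdb",
    "-in:file:fullatom true",
    "-score:weights gamess_qm.wts",
    "-GAMESS_executable_version 00",
    "-GAMESS_path /home/bturzo/gamess",
    "-quantum_mechanics::GAMESS::rosetta_GAMESS_bridge_temp_directory ./",
    "-quantum_mechanics::GAMESS::clean_rosetta_GAMESS_bridge_temp_directory false",
    "-quantum_mechanics::GAMESS::clean_GAMESS_scratch_directory false",
    "-quantum_mechanics::GAMESS::GAMESS_threads 10",
    "-quantum_mechanics:GAMESS:gamess_qm_energy_geo_opt false",
    "-out:levels core.quantum_mechanics.RosettaQMDiskAccessManager_GAMESS_Output:500",
]

# Single-line weights file that switches on the QM score term.
WEIGHTS_LINES = ["gamess_qm_energy 1.0"]


def write_spe_inputs(directory="."):
    """Write rosetta.flags and gamess_qm.wts into `directory`.

    Hypothetical convenience helper for this tutorial; returns both paths.
    """
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    flags_path = out_dir / "rosetta.flags"
    weights_path = out_dir / "gamess_qm.wts"
    flags_path.write_text("\n".join(ROSETTA_FLAGS) + "\n")
    weights_path.write_text("\n".join(WEIGHTS_LINES) + "\n")
    return flags_path, weights_path
```

Calling `write_spe_inputs()` in the directory that holds 5vav.pdb produces both files. Keeping the flags in a list makes it easy to change one value (for example, the thread count) without retyping the whole file.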
99 changes: 99 additions & 0 deletions rosetta_basics/scoring/qm-energy-calculations.md
@@ -0,0 +1,99 @@
# Quantum mechanical energy calculations in Rosetta

Back to [[Rosetta basics|Rosetta-Basics]].
Page created 16 November 2021 by Vikram K. Mulligan, Flatiron Institute ([email protected]).

[[_TOC_]]

## Summary

Traditionally, Rosetta has used a quasi-Newtonian force field for energy calculations. This has allowed Rosetta protocols to score a large macromolecular structure rapidly (typically in milliseconds) and repeatedly, permitting large-scale sampling of conformation and/or sequence space. The downside, however, has been that force fields are of finite accuracy. In late 2021, we added support for carrying out quantum mechanical energy and geometry optimization calculations in the context of a Rosetta protocol, by calling out to a third-party quantum chemistry software package. This page summarizes how to set up and use this functionality.

## Important considerations

### Molecular system size, level of theory, and computation time

TODO

### Computer memory

TODO

### CPU usage (especially in multi-threaded or multi-process contexts)

TODO

### Disk usage

TODO

## Supported third-party quantum chemistry software packages

All Rosetta QM calculations are performed through calls to third-party quantum chemistry software packages. These must be downloaded and installed separately, and users must have appropriate licences and privileges to use them. Supported packages include:

### The General Atomic and Molecular Electronic Structure System (GAMESS)

[[GAMESS|https://www.msg.chem.iastate.edu/index.html]] is a versatile quantum chemistry package written in FORTRAN, and developed by the [[Gordon group at Iowa State University|https://www.msg.chem.iastate.edu/group/members.html]]. Users may agree to the licence agreement and obtain the software from [[the GAMESS download page|https://www.msg.chem.iastate.edu/gamess/download.html]].

#### Installation and setup

To use GAMESS with Rosetta... TODO

#### Compiling GAMESS

#### Using GAMESS with Rosetta

##### Point energy calculations with GAMESS within a Rosetta protocol

A single point energy (SPE) calculation with GAMESS can be run through Rosetta directly from the command line. Depending on preference, the user can either use an XML script or supply all options through the command-line interface. Two tutorials demonstrate how this can be done:

* [[Tutorial #1|GAMESSPointEnergyTutorial1]]
* [[Tutorial #2|GAMESSPointEnergyTutorial2]]

###### Dealing with the influence of solvent

When modelling biological macromolecules, the effect of solvent cannot be discounted. Indeed, the hydrophobic effect is the dominant effect that causes proteins to fold. Solvent affects computed energies in two ways. First, there is the enthalpy of interaction of a biomolecule with the surrounding solvent. Second, the shape and interacting groups presented by a molecule influence solvent entropy. The hydrophobic effect is largely an entropic effect: hydrophobic groups that are unable to form hydrogen bonds with water force an ordering of water molecules when they are solvated. This is because the accessible low-enthalpy states, in which water molecules satisfy their hydrogen bonds by bonding to one another, are fewer when some water conformations are effectively prohibited by the presence of a non-hydrogen bonding group.

GAMESS can model solvent effects in three ways:

1. By ignoring them completely. This is equivalent to modelling the macromolecule in vacuum, and can be achieved by setting `-quantum_mechanics:GAMESS:default_solvent GAS` on the commandline, or `gamess_solvent="GAS"` in the scoring function setup. In most circumstances, this is not advised: one would not expect a protein to fold, for instance, if water is not present.

2. By modelling only electrostatic interactions with the solvent, and electrostatic screening effects. The default PCM (polarizable continuum) model, with solvent set to the default value of `WATER`, will do this. This is roughly equivalent to explicitly modelling water molecules and carrying out a vacuum calculation, up to the limits of accuracy of any continuum (implicit solvent) model. This does _not_ capture the effects of the macromolecule's conformation on solvent entropy. Again, this is not advised in most circumstances. However, since the `gamess_qm_energy` is a scoring _term_ and not an independent scoring _function_, one can use a sum of QM and molecular mechanics terms to capture solvent enthalpy and entropy. For instance:

```xml
<ScoreFunction name="qm_with_rosetta_solvation" weights="empty.wts" >
    <!-- Use GAMESS for enthalpic interactions including solvent electrostatics: -->
    <Reweight scoretype="gamess_qm_energy" weight="1.0" />
    <!-- Use Rosetta's ref2015 energy function for solvent entropy effects: -->
    <Reweight scoretype="fa_sol" weight="1.0" />
    <Reweight scoretype="fa_intra_sol_xover4" weight="1.0" />
    <Reweight scoretype="lk_ball_wtd" weight="1.0" />
    <Set ... QM setup here ... />
</ScoreFunction>
```

3. By modelling electrostatic interactions with the water, plus *c*avitation, *d*ispersion, and local solvent *s*tructure effects (the CDS terms), using the empirically fitted SMD (solvation model based on density) model. The CDS part of the SMD model approximates the entropic effects, though this has not been widely tested for biological macromolecules, particularly for simulating solvent entropy-mediated effects like protein folding. In theory, though, this model should be more general than Rosetta's solvation model, and should support a wider range of solvents (including many organic solvents). To enable the SMD solvation model, set `-quantum_mechanics:GAMESS:default_use_smd_solvent true` on the commandline, or `gamess_use_smd_solvent="true"` in the scorefunction settings.
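
The three solvent treatments above map onto a small set of command-line flags. The following Python sketch collects them in one place so that a flags file can be assembled programmatically. The flag names are the ones given in the text; the mode labels (`vacuum`, `pcm`, `smd`) and the helper `solvent_flags` are invented for this sketch. It also assumes that `WATER` may be passed explicitly to `default_solvent` (the text only states that it is the default), and that the SMD switch is used together with a named solvent.

```python
# Flag names are taken from the text above; the mode labels are
# hypothetical names chosen for this sketch.
SOLVENT_MODE_FLAGS = {
    # 1. No solvent: equivalent to a vacuum calculation.
    "vacuum": ["-quantum_mechanics:GAMESS:default_solvent GAS"],
    # 2. PCM continuum water: electrostatics and screening only.
    #    Assumption: WATER (the stated default) can be given explicitly.
    "pcm": ["-quantum_mechanics:GAMESS:default_solvent WATER"],
    # 3. SMD: PCM electrostatics plus the empirically fitted CDS terms.
    #    Assumption: the SMD switch is combined with a named solvent.
    "smd": [
        "-quantum_mechanics:GAMESS:default_solvent WATER",
        "-quantum_mechanics:GAMESS:default_use_smd_solvent true",
    ],
}


def solvent_flags(mode):
    """Return the flags-file lines for one of the three solvent treatments."""
    if mode not in SOLVENT_MODE_FLAGS:
        raise ValueError(
            f"unknown solvent mode {mode!r}; "
            f"expected one of {sorted(SOLVENT_MODE_FLAGS)}"
        )
    # Return a copy so callers cannot modify the table by accident.
    return list(SOLVENT_MODE_FLAGS[mode])
```

For example, `solvent_flags("vacuum")` returns the single line that selects the `GAS` solvent, which can be appended to an existing flags file.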

##### Geometry optimization with GAMESS within a Rosetta protocol

TODO

##### Transition state identification with GAMESS within a Rosetta protocol

TODO

#### Rosetta-GAMESS bridge code organization

TODO

### Psi4

TODO

### Orca

TODO

### NWChem

TODO
4 changes: 2 additions & 2 deletions scripting_documentation/RosettaScripts/RosettaScripts.md
@@ -248,7 +248,7 @@ apps.public.rosetta_scripts.rosetta_scripts:
apps.public.rosetta_scripts.rosetta_scripts: The rosetta_scripts application will now exit.
```

You can also get help on the syntax of any mover, filter, task operation, or residue selector using the `-parser:info <name1> <name2> <name3> ...` flag. For example, the following commandline will provide information on the [[MutateResidue|MutateResidueMover]] mover and the [[HbondsToAtom|HbondsToAtomFilter]] filter:
You can also get help on the syntax of any mover, filter, task operation, packer palette, scorefunction, or residue selector using the `-parser:info <name1> <name2> <name3> ...` flag. For example, the following commandline will provide information on the [[MutateResidue|MutateResidueMover]] mover and the [[HbondsToAtom|HbondsToAtomFilter]] filter:

```
./bin/rosetta_scripts.default.linuxgccrelease -info MutateResidue HbondsToAtom
@@ -910,4 +910,4 @@ Another good troubleshooting tool is to simplify your XML. Try creating a stripp

<!-- SEO
scriptvars
-->
-->
@@ -34,7 +34,7 @@ The trRosettaConstraintGenerator requires compilation with Tensorflow support.

References and author information for the trRosettaConstraintGenerator constraint generator:

trRosetta Neural Network's citation(s):
trRosetta neural network's citation(s):
Yang J, Anishchenko I, Park H, Peng Z, Ovchinnikov S, and Baker D. (2020). Improved protein structure prediction using predicted interresidue orientations. Proc Natl Acad Sci USA 117(3):1496-503. doi: 10.1073/pnas.1914677117.

trRosettaConstraintGenerator ConstraintGenerator's author(s):
@@ -14,9 +14,10 @@ Computes the binding energy for the complex and if it is below the threshold ret
relax_unbound="(true &bool;)" translate_by="(100 &real;)"
relax_mover="(&string;)" filter="(&string;)" chain_num="(&string;)"
extreme_value_removal="(false &bool;)" dump_pdbs="(false &bool;)"
enable_caching="(false &bool;)"
task_operations="(&task_operation_comma_separated_list;)"
packer_palette="(&named_packer_palette;)" scorefxn="(&string;)"
confidence="(1.0 &real;)" />
final_scorefxn="(&string;)" confidence="(1.0 &real;)" />
```

- **threshold**: If ddG value is lower than this value, filter returns True (passes).
@@ -36,9 +37,11 @@ Computes the binding energy for the complex and if it is below the threshold ret
- **chain_num**: Allows you to specify a list of chain numbers to use to calculate the ddg, rather than a single jump. You cannot move chain 1, moving all the other chains is the same thing as moving chain 1, so do that instead. Use independently of jump.
- **extreme_value_removal**: Compute the ddg value `repeats` times, sort the results, and remove the top and bottom evaluations. This should reduce the noise levels in trajectories involving 1000s of evaluations. If set to true, repeats must be set to at least 3.
- **dump_pdbs**: Dump debugging PDB files. Dumps 6 pdbs per instance: BOUND_before_repack, BOUND_after_repack, BOUND_after_relax, UNBOUND_before_repack, UNBOUND_after_repack, and UNBOUND_after_relax.
- **enable_caching**: Cache DDG calculations to avoid recomputing them during reporting. The cache is reset each time compute() is run, so the object can still be used on different poses. Note that caching requires updating a data member, so if this object is shared by multiple threads, data races will occur (i.e., caching is not thread-safe).
- **task_operations**: A comma-separated list of TaskOperations to use.
- **packer_palette**: A previously-defined PackerPalette to use, which specifies the set of residue types with which to design (to be pruned with TaskOperations).
- **scorefxn**: Name of score function to use
- **final_scorefxn**: A scoring function to use for final scoring of the docked and undocked poses (to compute the difference). If not provided, the scoring function provided with the scorefxn option is used. The option of scoring with a different final scoring function is provided to allow a more accurate, more expensive calculation to be done for this (e.g. using RosettaQM quantum chemistry calculations).
- **confidence**: Probability that the pose will be filtered out if it does not pass this Filter
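
The fallback behaviour of `final_scorefxn` can be summarized in a few lines. The sketch below is illustrative Python, not Rosetta code: `compute_ddg` is a hypothetical function, the scoring functions are stand-in callables that map a pose to an energy, and the sign convention (bound score minus unbound score) is an assumption. Repacking, relaxation, and repeats are omitted.

```python
def compute_ddg(bound_pose, unbound_pose, scorefxn, final_scorefxn=None):
    """Return the score difference between the bound and unbound poses.

    `scorefxn` and `final_scorefxn` are callables (pose -> float) standing
    in for Rosetta scoring functions. When `final_scorefxn` is not given,
    `scorefxn` is used for the final scoring as well.
    """
    # Fall back to the ordinary scoring function if no final one is set.
    scorer = final_scorefxn if final_scorefxn is not None else scorefxn
    # Assumed convention: ddG = E(bound) - E(unbound).
    return scorer(bound_pose) - scorer(unbound_pose)
```

This separation lets the inexpensive `scorefxn` drive the repacking and relaxation steps while a costlier function, such as one containing the `gamess_qm_energy` term, is evaluated only for the final bound and unbound scores.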

---
@@ -13,7 +13,7 @@ Filter based on any score that can be calculated in fragment_picker.
outputs_name="(pose &string;)" csblast="(&string;)"
blast_pgp="(&string;)" placeholder_seqs="(&string;)"
sparks-x="(&string;)" sparks-x_query="(&string;)" psipred="(&string;)"
vall_path="(/home/benchmark/rosetta/database//sampling/vall.jul19.2011.gz &string;)"
vall_path="(/Users/vmulligan/rosetta_devcopy/Rosetta/main/database//sampling/vall.jul19.2011.gz &string;)"
frags_scoring_config="(&string;)" n_frags="(200 &non_negative_integer;)"
n_candidates="(1000 &non_negative_integer;)"
print_to_pdb="(false &xs:boolean;)"