Merge pull request #82 from cheminfo/isomers
Isomers generator + Stock search
rschlm authored Oct 18, 2024
2 parents af0c0d1 + 602d8b5 commit a745911
Showing 39 changed files with 249 additions and 41 deletions.
12 changes: 12 additions & 0 deletions docs/20_samples/10_sample-edition/README.md
Expand Up @@ -37,6 +37,18 @@ You will see several modules covering the canvas and a few buttons. The buttons
- **Attachments**: A list of all files attached \(e.g. JCAMP-DX files\).
:::

## Safety

The `Safety` button allows you to define the safety information associated with the sample. You can add the GHS pictograms manually, as shown below.

![Safety](images/add_delete_safety.gif)

On the right of the window, you can see the list of **GHS Hazard Statements** as well as the **GHS Precautionary Statements**. You can add or delete them by clicking on the corresponding buttons. You can also search by the code (e.g. H302) or by the description. The process of adding GHS statements is shown below.

![Add GHS](images/statements.gif)

Once you have added all the necessary GHS pictograms, hazard and precautionary statements, you can save the information by clicking on the green `Save data` button.

:::note Upload spectra
To upload spectra via drag and drop, use the application-specific view. Those views are designed to automatically handle the conversion into a standard format.
That is, if you want to upload a PXRD attachment to your sample, you need to open the PXRD view.
12 changes: 3 additions & 9 deletions docs/20_samples/20_substructure-search/README.md
Expand Up @@ -2,6 +2,8 @@
slug: /uuid/aaa5f97c7cde94741de2938b106bb0d4
---

import AQF from "../../includes/advanced_query_features/README.md";

# Structure search

You can perform a structure search if you are looking for molecules with a specific pattern or a fragment. This tool is useful when navigating many samples.
Expand All @@ -27,12 +29,4 @@ The similarity search is based on [Tanimoto algorithm](https://en.wikipedia.org/

Structure edition is powered by [OCL editor].
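
As a rough illustration of the Tanimoto coefficient on which the similarity search is based, the sketch below compares two fingerprints represented as sets of "on" bit positions. The fingerprints and the function name `tanimoto` are invented for this example; the fingerprints actually used by the application are not described here.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen for this sketch: two empty fingerprints are identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical fingerprints (positions of the bits that are set).
query = {1, 4, 7, 9}
candidate = {1, 4, 8, 9, 12}
print(tanimoto(query, candidate))  # 3 shared bits out of 6 distinct bits: 0.5
```

A value of 1 means the two fingerprints are identical, and 0 means they share no bits.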

## Advanced features

You can fine-tune the search by specifying atomic and bond properties. These options can be accessed by hovering over the atom or bond of interest and pressing `q`. For example, you can allow certain atoms at this specific position, or you can modify the ring size.

![advanced options](advanced_options.gif)

Furthermore, you can include separate molecules in the search. This will result in structures containing both fragments with no restrictions on orientation or connectivity. One of the molecules can be selected and excluded from the search: structures containing the excluded fragment are removed completely from the search results. The excluded fragment is highlighted in pink.

![excluded fragment](excluded_fragment.gif)
<AQF/>
@@ -1,8 +1,11 @@
---
slug: /uuid/3ec507be0774fdc7abbe80cb07c600f3
title: In silico fragmentation
---

# Fragmentation
import PeakLabels from '../includes/peak-labels/README.md';

# *In silico* fragmentation

This tool is an *in silico* fragmentation tool for small molecules. It uses a database of known fragmentation patterns to predict the most likely fragmentation pattern for a given molecule.

Expand Down Expand Up @@ -49,3 +52,46 @@ When the mouse is over a peak, the corresponding fragment is highlighted in the

![list_mass](images/list_mass.png)

## Explore peaks

The user can get a detailed view of the spectrum by clicking on the `Explore peaks` tab.

![explore](images/explore_peaks.png)


:::tip Peak Labels

<details>
<summary>
Annotation of the peaks in the spectrum.
</summary>
<div>

<PeakLabels />

</div>
</details>
:::


## DB Search

This tool predicts the fragmentation of a molecule *in silico*. First, **select a peak from the experimental spectrum**. Then click on the `Search in the database` button; this searches the [octochemdb](https://github.com/cheminfo/octochemdb) database for molecules that are bioactive or natural products. The results are shown in the table on the left.

![db_results](images/db_results.png)

Once a molecule is selected, the algorithm will perform a fragmentation *in silico* and show the fragmentation tree in the top panel.

![fragmentation_tree](images/fragmentation_tree.png)

The preferences for the DB search and the fragmentation can be set in the right panel.

![prefs_db](images/prefs.png)



## Reactions

The **Reactions** view shows the list of reactions that are applied to the molecule to generate the fragments. The user can filter the reactions by `Labels`, `Schema`, `Reaction`, `Kind`, `Mode` and `Description`.

![reactions](images/reactions.png)
Expand Up @@ -8,6 +8,7 @@ import TOCInline from '@theme/TOCInline'
import Assignment from './includes/assignment/README.md';
import Similarity from './includes/similarity/README.md';
import Taxonomy from '../../../includes/taxonomy/README.md';
import PeakLabels from '../includes/peak-labels/README.md';

<TOCInline toc={toc} />

Expand Down Expand Up @@ -158,27 +159,7 @@ if (entry.ms.em < 300 && entry.charge === 2) return true;

:::

#### Relative mass and MF determination

This view normally displays the mass of the peaks, but it can also display masses relative to a specific peak.

1. Click on a peak to change the `Monoisotopic mass` value
2. Click on the checkbox `Relative mass` on the top right

![preferences](images/prefs.png)

Candidate molecular formulas (MF) can also be displayed for the relative mass. They are calculated using the following criteria:

- allowed atoms are based on the `Ranges`
- only neutral losses are considered
- the charge of the entity losing this neutral fragment is defined in `Charge` (1 by default)
- you should change the number in the cell `Show MF` in order to annotate the peaks with the corresponding MF

It is also possible to define the color of the MF annotation depending on the precision. By default, if no MF is found within a precision of 20 ppm, no MF is displayed.

![colors](images/colors_mass.png)

![mass](images/mass.png)
<PeakLabels/>

### Results table

23 changes: 23 additions & 0 deletions docs/30_structural_analysis/mass/includes/peak-labels/README.md
@@ -0,0 +1,23 @@


#### Relative mass and MF determination

This view normally displays the mass of the peaks, but it can also display masses relative to a specific peak.

1. Click on a peak to change the `Monoisotopic mass` value
2. Click on the checkbox `Relative mass` on the top right

![preferences](images/prefs.png)

Candidate molecular formulas (MF) can also be displayed for the relative mass. They are calculated using the following criteria:

- allowed atoms are based on the `Ranges`
- only neutral losses are considered
- the charge of the entity losing this neutral fragment is defined in `Charge` (1 by default)
- you should change the number in the cell `Show MF` in order to annotate the peaks with the corresponding MF

It is also possible to define the color of the MF annotation depending on the precision. By default, if no MF is found within a precision of 20 ppm, no MF is displayed.

![colors](images/colors_mass.png)

![mass](images/mass.png)
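
To make the relative mass and the ppm precision criterion more concrete, here is a minimal sketch using the usual definition of a ppm error. The function names and the example masses are hypothetical, and how the application computes the error internally is an assumption, not something stated in this page.

```python
def relative_mass(peak_mass: float, reference_mass: float) -> float:
    """Mass of a peak relative to the selected reference peak."""
    return peak_mass - reference_mass


def ppm_error(observed: float, theoretical: float) -> float:
    """Deviation of an observed mass from a theoretical mass, in ppm."""
    return (observed - theoretical) / theoretical * 1e6


def within_precision(observed: float, theoretical: float, max_ppm: float = 20.0) -> bool:
    """True if the observed mass matches the theoretical one within max_ppm."""
    return abs(ppm_error(observed, theoretical)) <= max_ppm


# Hypothetical peaks: the difference is compared with the monoisotopic mass of H2O.
WATER_MASS = 18.010565
observed_loss = abs(relative_mass(181.0857, 199.0965))
print(within_precision(observed_loss, WATER_MASS))
```

With the default threshold of 20 ppm, a candidate whose error exceeds the threshold would not be annotated.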
3 changes: 2 additions & 1 deletion docs/50_machine_learning/includes/selec_norm_prev/README.md
Expand Up @@ -2,6 +2,7 @@

import SelectSpectra from '../select_spectra/README.md'
import SpectraNormalization from '../normalization/README.md'
import SpectraPreprocessing from '../../../includes/preprocessing/README.md'
import SuperimposeSpectraManipulation from '../visualization/README.md'


Expand All @@ -13,7 +14,7 @@ The first step is to select the spectra :

Once spectra have been selected, data normalization filters can be applied :

<SpectraNormalization/>
<SpectraPreprocessing/>

The superimposed spectra can be manipulated without numerous :

42 changes: 42 additions & 0 deletions docs/60_cheminformatics/100_search-stock/README.md
@@ -0,0 +1,42 @@
---
slug: /uuid/eeb03ca6c7a82d043456704a340e6d04
---

import AQF from "../../includes/advanced_query_features/README.md";

# Search Stock

`Search Stock` is a tool to manage the stock of chemical compounds in your lab. The view is divided into three main sections: the search and print section on the left, the list of the compounds in the middle, and the details of the selected compound on the right.

![general_view](general_view.png)

## Searching for a Compound

To search for a specific compound within the stock, you can draw a structure on the left panel and select either the `substructure` or the `similarity` search. The `substructure` search will return all compounds that contain the drawn structure, while the `similarity` search will return all compounds that are similar to the drawn structure. You can also search by name in the search bar.

![search](substructure_search.gif)

Once the substructure or the similar structure is drawn, the search results will be displayed on the right panel. It is possible to click on one of the compounds to get additional information.

<AQF/>

## Stock Information

On the right panel, you can find all the information about the selected compound. The information includes the molecular formula, the structure, the molecular weight, the quantity, the location, the purity, and many more details that can be provided.

:::note
Note that the products are grouped by structure; a given structure can have multiple references representing different batches of the same compound.
:::

It is easy to change the status of a product. You can change the quantity, the location, the purity, and whether the product is available. The modifications will be saved in the `Stock modification history`.

![details](details.png)

The `Stock modification history` shows the history of the stock modifications. It is possible to see the status, the location, the user that made the modification and the date of the modification of the product.

## Print

To print the current stock status for a specific compound, click on the `Print` button. This will create a PDF file with all the information about the compound. You can choose the printer and the location in the following panel.

![print](print.png)

16 changes: 16 additions & 0 deletions docs/60_cheminformatics/80_maygen/README.md
@@ -0,0 +1,16 @@
---
slug: /uuid/7cead2ae3da71090cb17baa3856ea38b
---

# Maygen: Isomers generation

## Introduction

Maygen is a tool to generate isomers from a molecular formula. It is based on the paper [MAYGEN: an open-source chemical structure generator for constitutional isomers based on the orderly generation principle](https://doi.org/10.1186/s13321-021-00529-9).

## Isomers generation

To generate isomers, you need to enter a molecular formula in the input field. The molecular formula is then parsed and isomers are generated.
The generated isomers are displayed in the table below the input field. It is possible to search by substructure in the generated isomers by drawing the substructure in the structure editor. The tool also provides a list of SMILES and idCodes of the generated isomers.

![Maygen](maygen.gif)
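
The parsing step mentioned above can be pictured with the following simplified sketch. It is not the parser used by the application: the function name `parse_molecular_formula` is hypothetical, and it only supports element symbols followed by optional counts (no parentheses, charges or isotopes).

```python
import re
from collections import Counter

# An element symbol (one uppercase letter, optional lowercase letter) and an optional count.
ELEMENT_PATTERN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_molecular_formula(formula: str) -> dict[str, int]:
    """Return the number of atoms of each element in a simple molecular formula."""
    counts = Counter()
    position = 0
    for match in ELEMENT_PATTERN.finditer(formula):
        if match.start() != position:
            raise ValueError(f"Unexpected character at index {position} in {formula!r}")
        symbol, count = match.groups()
        counts[symbol] += int(count) if count else 1
        position = match.end()
    if position != len(formula) or not counts:
        raise ValueError(f"Cannot parse molecular formula {formula!r}")
    return dict(counts)


print(parse_molecular_formula("C6H12O6"))
```

The resulting element counts are the kind of input a structure generator needs before it can enumerate isomers.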
Binary file added docs/60_cheminformatics/80_maygen/maygen.gif
47 changes: 47 additions & 0 deletions docs/60_cheminformatics/90_surge/README.md
@@ -0,0 +1,47 @@
---
slug: /uuid/f16d016423c4bfa9ec3c72137682a12e
---

# Surge: Isomers generation

## Introduction

Surge is a tool to generate isomers from a molecular formula. It is based on the paper [Surge: a fast open-source chemical graph generator](https://doi.org/10.1186/s13321-022-00604-9).

## Isomers generation

This tool provides a list of options for the generation of isomers. The list of options is given in the panel shown below.

![Surge Options](surge_options.png)

The options are as follows:

- **Molecular Formula**: The molecular formula for which the isomers are to be generated.
- **Limit**: The maximum number of isomers to be generated.
- **Timeout**: The maximum time in seconds to generate isomers (maximum 30s).
- **Calculate IdCode**: Whether to calculate the idCode for each isomer.
- **Disallow triple bonds**: Whether to disallow triple bonds in the isomers.
- **Require Planarity**: Whether to require planarity in the isomers.
- **Limit 3 rings**: Limit the number of rings of length 3, with the format `max` or `min:max`.
- **Limit 5 rings**: Limit the number of rings of length 5, with the format `max` or `min:max`.
- **No small ring triple bonds**: Whether to disallow triple bonds in rings of size up to 7.
- **Bredt's rule one**: Whether to apply [Bredt's rule](https://en.wikipedia.org/wiki/Bredt's_rule) for two rings $ij$ with one bond in common (33, 34, 35, 36, 44, 45).
- **Bredt's rule two**: Whether to apply [Bredt's rule](https://en.wikipedia.org/wiki/Bredt's_rule) for two rings $ij$ with two bonds in common ($ij$ up to 56).
- **Bredt's rule three**: Whether to apply [Bredt's rule](https://en.wikipedia.org/wiki/Bredt's_rule) for two rings of size 6 sharing three bonds.
- **No K33 K24**: Whether to disallow K33 and K24 subgraphs.
- **No cone**: Whether to disallow the cone of P4 and K4 with 3-ear subgraphs.
- **No allene**: Whether to disallow allenes (A=A=A), in a ring or not.
- **No allene in small rings**: Whether to disallow allenes in rings of size up to 8.
- **No small rings common atoms**: No atom in more than one ring of length 3 or 4.
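
The `max` or `min:max` format accepted by the ring-limit options can be illustrated with a small sketch. The helper `parse_ring_limit` is hypothetical, and the assumption that a bare `max` implies a minimum of 0 is ours, not something documented here.

```python
def parse_ring_limit(text: str) -> tuple[int, int]:
    """Parse a ring limit written as 'max' or 'min:max' into a (min, max) pair."""
    parts = text.strip().split(":")
    if len(parts) == 1:
        minimum, maximum = 0, int(parts[0])  # assumption: no minimum means 0
    elif len(parts) == 2:
        minimum, maximum = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"Expected 'max' or 'min:max', got {text!r}")
    if minimum < 0 or minimum > maximum:
        raise ValueError(f"Invalid ring limit {text!r}")
    return minimum, maximum


print(parse_ring_limit("2"))    # at most two rings
print(parse_ring_limit("1:3"))  # between one and three rings
```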

After setting all the options, click on the `Search structural isomers` button to generate the isomers. The results are shown in a table as shown below.

![Surge Results](table_isom.png)

Lists of `SMILES` and `idCodes` are shown on the right panel and can easily be copied.

## Substructure search

Using the chemical structure editor, you can draw a structure and search for the substructure in the generated isomers. While the substructure is being drawn, the search is performed in real-time and the results are shown in the table.

![substructure search](substructure_search.gif)
Binary file added docs/60_cheminformatics/90_surge/table_isom.png
20 changes: 20 additions & 0 deletions docs/includes/advanced_query_features/README.md
@@ -0,0 +1,20 @@

<details>
<summary>
Advanced Query Features
</summary>
<div>

## Advanced Query features

You can fine-tune the search by specifying atomic and bond properties. These options can be accessed by hovering over the atom or bond of interest and pressing `q`. For example, you can allow certain atoms at this specific position, or you can modify the ring size.

![advanced options](advanced_options.gif)

Furthermore, you can include separate molecules in the search. This will result in structures containing both fragments with no restrictions on orientation or connectivity. One of the molecules can be selected and excluded from the search: structures containing the excluded fragment are removed completely from the search results. The excluded fragment is highlighted in pink.

![excluded fragment](excluded_fragment.gif)

</div>

</details>
44 changes: 35 additions & 9 deletions docs/includes/preprocessing/README.md
Expand Up @@ -9,25 +9,51 @@

## Preprocessing

You can apply the following modifications to the spectra to enhance the visualization. The modifications include the following:
![preprocessing](preprocessing.png)

### Filters

You can apply the following `filters` to the spectra to enhance the visualization:

- `Center Mean` : subtract the mean from every variable observation in the dataset, so that the new variable's mean is centered at 0.
- `Center Median` : subtract the median from every variable observation in the dataset, so that the new variable's median is centered at 0.
- `Divide by SD` : divide every variable observation in the dataset by the standard deviation, which yields a distribution with a standard deviation equal to 1.
- `Divide by max Y` : divide every value by the maximum y-value, so that the maximum y-value becomes 1.
- `Normed`: Specify a value in the `value` field and select the type of normalization:
  - `Sum to value`: normalize the integral under the curve so that it sums to the specified value.
  - `Absolute sum to value`: normalize the integral under the curve so that the absolute sum sums to the specified value.
  - `Max to value`: normalize the maximum value to the specified value.
- `Rescale (x to y)` : rescale the graph such that the y-values fit between specified minimum and maximum values.
- `Normalize (sum to n)` : normalize the integral under the curve so that it sums to n.
- `Multiply (value)` : multiply every y-value by a scalar.
- `Add (value)` : add a scalar to every y-value.
- `First derivative` : calculate the first derivative of the spectra.
- `Second derivative` : calculate the second derivative of the spectra.
- `Third derivative` : calculate the third derivative of the spectra.
- `Savitzky-Golay` : smooth the spectra and calculate derivatives based on the following parameters:
  - `Window`: smoothing window size, must be an odd number, greater than 5.
  - `Derivative`: derivative order.
  - `Polynomial`: the degree of the polynomial used to calculate the Savitzky-Golay.
- `AirPLS baseline` : baseline correction using the adaptive iteratively reweighted penalized least squares algorithm.
- `Rolling average baseline` :
- `Iterative polynomial baseline` : baseline correction using iterative polynomial fitting algorithm.
- `Rolling ball baseline `:
- `Rolling median baseline `:
- `Rolling average baseline` : baseline correction using a rolling average.
- `Rolling median baseline` : baseline correction using a rolling median.
- `Rolling ball baseline` : baseline correction using a rolling ball.
- `Ensure growing X values`: ensure that the x-values are in increasing order.
- `Function on X` : apply a function to the x-values. For example, `log(x)`.
- `Function on Y` : apply a function to the y-values. For example, `log10(y+1)`.
- `Calibrate X` : calibrate the x-values with the parameters `from`, `to`, `nbPeak` and `targetX`.
- `Pareto normalization` : Pareto scaling, which uses the square root of standard deviation as the scaling factor, circumvents the amplification of noise by retaining a small portion of magnitude information. [10.1016/j.molstruc.2007.12.026](https://dx.doi.org/10.1016/j.molstruc.2007.12.026)
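
To give an idea of what some of the filters above compute, here is a minimal sketch of `Rescale (x to y)` and `Pareto normalization` applied to a plain list of y-values. The function names are hypothetical, and two choices are assumptions rather than documented behavior: the Pareto scaling mean-centers the data first (the usual definition), and the sample standard deviation is used.

```python
import math
import statistics


def rescale(y: list[float], minimum: float = 0.0, maximum: float = 1.0) -> list[float]:
    """Linearly map the y-values so that they span [minimum, maximum]."""
    low, high = min(y), max(y)
    if high == low:
        raise ValueError("Cannot rescale constant data")
    return [minimum + (v - low) * (maximum - minimum) / (high - low) for v in y]


def pareto_scale(y: list[float]) -> list[float]:
    """Mean-center the values and divide them by the square root of the SD."""
    mean = statistics.fmean(y)
    sd = statistics.stdev(y)  # assumption: sample standard deviation
    if sd == 0:
        raise ValueError("Cannot scale constant data")
    return [(v - mean) / math.sqrt(sd) for v in y]


print(rescale([2.0, 4.0, 6.0]))
print(pareto_scale([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Because the divisor is the square root of the standard deviation rather than the standard deviation itself, large variations stay larger than small ones after scaling.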

One classical preprocessing algorithm is [Standard Normal Variate (SNV)](http://wiki.eigenvector.com/index.php?title=Advanced_Preprocessing:_Sample_Normalization#SNV_.28Standard_Normal_Variate.29). This preprocessing can be achieved by selecting the two options `Center Mean` and `Divide by SD`.
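
The combination described above can be sketched for a single spectrum as follows. The function name `standard_normal_variate` is hypothetical, and the use of the sample standard deviation is an assumption; the application may compute it differently.

```python
import statistics


def standard_normal_variate(y: list[float]) -> list[float]:
    """SNV: center the spectrum on its mean, then divide by its standard deviation."""
    mean = statistics.fmean(y)
    sd = statistics.stdev(y)  # assumption: sample standard deviation
    if sd == 0:
        raise ValueError("Cannot apply SNV to a constant spectrum")
    centered = [v - mean for v in y]   # step 1: Center Mean
    return [v / sd for v in centered]  # step 2: Divide by SD


spectrum = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(standard_normal_variate(spectrum))
```

After this transformation each spectrum has a mean of 0 and a standard deviation of 1, which makes spectra with different overall intensities easier to compare.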

### Selecting the range

A certain range of x-values can be selected to show only a part of the spectrum using `Range`.

### Exclusions

Depending on the analysis, some regions should be removed using `Exclusions` in order to improve the visualization.

![add preprocessing](preprocessing.gif)
### Number of points

`Number of points` can be changed to reduce the number of points in the spectra.

</div>

Binary file removed docs/includes/preprocessing/preprocessing.gif
Binary file added docs/includes/preprocessing/preprocessing.png
