3. Structure standardization

(c) 2019, 2020 Dr. Ramil Nugmanov;

(c) 2019 Dr. Timur Madzhidov; Ravil Mukhametgaleev

(c) 2022 Valentina Afoninf

For installation instructions, package information, and the tutorial files, see https://github.com/cimm-kzn/CGRtools

NOTE: The tutorial should be run sequentially from the start. Running cells out of order will lead to unexpected results.

[1]:
import pkg_resources
if pkg_resources.get_distribution('CGRtools').version.split('.')[:2] != ['4', '1']:
    print('WARNING. Tutorial was tested on 4.1 version of CGRtools')
else:
    print('Welcome!')
Welcome!
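
pkg_resources is deprecated in recent setuptools releases. As a minimal sketch, the same check can be written with the standard-library importlib.metadata module (Python 3.8+). The names major_minor, check_cgrtools and EXPECTED are illustrative and not part of CGRtools.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = ('4', '1')  # illustrative constant: the release series the tutorial was tested on


def major_minor(version_string):
    """Return the first two dot-separated components of a version string."""
    return tuple(version_string.split('.')[:2])


def check_cgrtools():
    """Return a status message for the installed CGRtools version."""
    try:
        installed = version('CGRtools')
    except PackageNotFoundError:
        # extra branch not present in the original cell
        return 'WARNING. CGRtools is not installed'
    if major_minor(installed) != EXPECTED:
        return 'WARNING. Tutorial was tested on 4.1 version of CGRtools'
    return 'Welcome!'


print(check_cgrtools())
```

Unlike the original cell, this variant also reports a missing installation instead of raising an exception.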
[2]:
# load data for tutorial
from pickle import load
from traceback import format_exc

with open('molecules.dat', 'rb') as f:
    molecules = load(f) # list of MoleculeContainer objects
with open('reactions.dat', 'rb') as f:
    reactions = load(f) # list of ReactionContainer objects

m1, m2, m3, m4 = molecules # molecule
r2 = reactions[2] # reaction
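
The cell above opens the data files by relative path, so it fails with a bare FileNotFoundError when the notebook is started from another directory. The following is a minimal sketch of a loader with a more explicit message. The helper name load_pickled is hypothetical, and the hint about the working directory is an assumption based on the relative paths used above. As with any pickle, only load files from a trusted source.

```python
from pathlib import Path
from pickle import load


def load_pickled(path):
    """Load one pickled object, with a clearer error if the file is missing."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f'{path} not found; start the notebook from the directory with the tutorial files'
        )
    with path.open('rb') as f:
        return load(f)
```

With this helper the loading step would read `molecules = load_pickled('molecules.dat')`. Unpickling the tutorial files still requires CGRtools to be importable.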

3.1. Molecules

MoleculeContainer has standardize, kekule, thiele, neutralize, implicify_hydrogens and canonicalize methods.

Method thiele transforms the Kekule representation of rings into the aromatized one. Method standardize applies functional group standardization rules (more than 50 rules) to a molecule.

Method canonicalize applies a set of methods: neutralize, standardize, kekule, implicify_hydrogens, thiele.
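
To illustrate the sequence described above, here is a minimal sketch that calls the listed methods one after another. It is not the implementation of canonicalize, which may do more internally. The helper apply_steps and the class RecordingMolecule are hypothetical stand-ins, used so that the example runs without CGRtools.

```python
# Order taken from the text above; the real canonicalize may differ in details.
CANONICALIZE_STEPS = ('neutralize', 'standardize', 'kekule', 'implicify_hydrogens', 'thiele')


def apply_steps(molecule, steps=CANONICALIZE_STEPS):
    """Call each named method on the molecule in order and collect the return values."""
    return {name: getattr(molecule, name)() for name in steps}


class RecordingMolecule:
    """Hypothetical stand-in (not a CGRtools class) that records which methods were called."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Invoked only for attributes not found normally, i.e. the step names.
        def method():
            self.calls.append(name)
            return True
        return method


demo = RecordingMolecule()
results = apply_steps(demo)
print(demo.calls)
# → ['neutralize', 'standardize', 'kekule', 'implicify_hydrogens', 'thiele']
```

Because apply_steps only relies on method names, the same call should work on an object that really provides these methods, such as the molecules loaded in cell [2].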

[3]:
m3 # molecule with kekulized ring
[3]:
../_images/tutorial_3_standardization_4_0.svg
[4]:
m3.standardize()  # apply standardization. Returns True if any group found
[4]:
True
[5]:
m3 # group-standardized structure.
[5]:
../_images/tutorial_3_standardization_6_0.svg
[6]:
m3.thiele() # aromatizes and returns True when any ring is found
[6]:
True
[7]:
m3
[7]:
../_images/tutorial_3_standardization_8_0.svg

Molecules have explicify_hydrogens and implicify_hydrogens methods to handle hydrogens.

These methods are used to add or remove hydrogens in a molecule.

Note: implicify_hydrogens works for aromatic rings only in Kekule form. For aromatized forms, explicify_hydrogens requires the kekule and, optionally, thiele procedures to be applied beforehand.
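
The ordering constraint in the note can be wrapped in a small helper. This is a minimal sketch: the function implicify_via_kekule and the class StubMolecule are hypothetical, and the assumption that implicify_hydrogens returns the number of removed hydrogens is inferred from the cell outputs in this tutorial.

```python
def implicify_via_kekule(molecule, rearomatize=False):
    """Remove explicit hydrogens after converting rings to Kekule form.

    Expects an object with kekule, implicify_hydrogens and thiele methods,
    as listed for MoleculeContainer in this tutorial.
    """
    molecule.kekule()                         # aromatic rings must be in Kekule form first
    removed = molecule.implicify_hydrogens()  # assumed to return the number of removed hydrogens
    if rearomatize:
        molecule.thiele()                     # optional: return to the aromatized representation
    return removed


class StubMolecule:
    """Hypothetical stand-in (not a CGRtools class) that records the call order."""

    def __init__(self):
        self.calls = []

    def kekule(self):
        self.calls.append('kekule')
        return True

    def implicify_hydrogens(self):
        self.calls.append('implicify_hydrogens')
        return 5

    def thiele(self):
        self.calls.append('thiele')
        return True


stub = StubMolecule()
removed = implicify_via_kekule(stub, rearomatize=True)
print(removed, stub.calls)
# → 5 ['kekule', 'implicify_hydrogens', 'thiele']
```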

[8]:
m3.explicify_hydrogens() # returns the number of added hydrogens
[8]:
5
[9]:
m3
[9]:
../_images/tutorial_3_standardization_11_0.svg
[10]:
m3.clean2d() # coordinates are not calculated for added hydrogen atoms,
# so they are drawn at the same position in the previous image; clean2d recalculates the layout
m3
[10]:
../_images/tutorial_3_standardization_12_0.svg
[11]:
m3.kekule()
m3.implicify_hydrogens()
[11]:
5
[12]:
m3
[12]:
../_images/tutorial_3_standardization_14_0.svg

3.2. Reaction standardization

ReactionContainer has the same methods as MoleculeContainer. In this case they are applied to all molecules in the reaction.

[13]:
r2
[13]:
../_images/tutorial_3_standardization_16_0.svg
[14]:
r2.standardize()
r2.explicify_hydrogens()
r2.clean2d()
r2
[14]:
../_images/tutorial_3_standardization_17_0.svg
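
Conceptually, calling a standardization method on a reaction is similar to calling it on every molecule of that reaction. The sketch below illustrates this idea only and is not the ReactionContainer implementation. It assumes the reaction object exposes reactants, reagents and products sequences. The helper apply_to_all and the class StubMolecule are hypothetical stand-ins, so the example runs without CGRtools.

```python
from itertools import chain
from types import SimpleNamespace


def apply_to_all(reaction, method_name):
    """Call the named method on every molecule of the reaction; return per-molecule results."""
    molecules = chain(reaction.reactants, reaction.reagents, reaction.products)
    return [getattr(m, method_name)() for m in molecules]


class StubMolecule:
    """Hypothetical stand-in (not a CGRtools class)."""

    def __init__(self):
        self.standardized = False

    def standardize(self):
        self.standardized = True
        return True


# Stand-in reaction: two reactants, no reagents, one product.
demo_reaction = SimpleNamespace(
    reactants=[StubMolecule(), StubMolecule()],
    reagents=[],
    products=[StubMolecule()],
)
results = apply_to_all(demo_reaction, 'standardize')
print(results)
# → [True, True, True]
```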