3. Structure standardization
2019, 2020 Dr. Ramil Nugmanov;
2019 Dr. Timur Madzhidov; Ravil Mukhametgaleev
2022 Valentina Afonina
For installation instructions, package information and the tutorial files, see https://github.com/cimm-kzn/CGRtools
NOTE: Run the tutorial sequentially from the start. Running cells out of order will lead to unexpected results.
[1]:
import pkg_resources
if pkg_resources.get_distribution('CGRtools').version.split('.')[:2] != ['4', '1']:
    print('WARNING. Tutorial was tested on 4.1 version of CGRtools')
else:
    print('Welcome!')
Welcome!
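pkg_resources is deprecated in recent setuptools releases, so the same check can also be sketched with the standard-library importlib.metadata module (Python 3.8+). The helper names major_minor and check_tested_version below are hypothetical and not part of CGRtools; the sketch only builds a message and does not import the package itself.

```python
from importlib.metadata import PackageNotFoundError, version


def major_minor(version_string):
    """Return the first two dot-separated components of a version string."""
    return tuple(version_string.split('.')[:2])


def check_tested_version(package='CGRtools', tested=('4', '1')):
    """Compare the installed release of `package` with the tested major.minor."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f'{package} is not installed'
    if major_minor(installed) != tested:
        return (f'WARNING. Tutorial was tested on {".".join(tested)} '
                f'version of {package}, found {installed}')
    return 'Welcome!'


print(check_tested_version())
```

Comparing only the first two components keeps the check tolerant of patch releases, as in the original cell.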
[2]:
# load data for tutorial
from pickle import load
from traceback import format_exc
with open('molecules.dat', 'rb') as f:
    molecules = load(f) # list of MoleculeContainer objects
with open('reactions.dat', 'rb') as f:
    reactions = load(f) # list of ReactionContainer objects
m1, m2, m3, m4 = molecules # molecule
r2 = reactions[2] # reaction
3.1. Molecules
MoleculeContainer has standardize, kekule, thiele, neutralize, implicify_hydrogens and canonicalize methods.
Method thiele transforms the Kekule representation of rings into the aromatized one. Method standardize applies functional group standardization rules to molecules (more than 50 rules).
Method canonicalize applies the following set of methods: neutralize, standardize, kekule, implicify_hydrogens, thiele.
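The method list given for canonicalize can be read as a small pipeline. The sketch below calls the steps one by one in the listed order and reports which of them returned a truthy value. It is an illustration only, not the CGRtools implementation: canonicalize_stepwise and RecordingMolecule are hypothetical names, and the assumption that every step returns a truthy value when it changed the molecule is confirmed by this tutorial only for standardize and thiele.

```python
# Order in which, according to the text, canonicalize applies the methods.
CANONICALIZE_STEPS = ('neutralize', 'standardize', 'kekule',
                      'implicify_hydrogens', 'thiele')


def canonicalize_stepwise(molecule, steps=CANONICALIZE_STEPS):
    """Call each step in order; return the names of steps with a truthy result."""
    changed = []
    for name in steps:
        # assumption: a truthy result means the step modified the molecule
        if getattr(molecule, name)():
            changed.append(name)
    return changed


class RecordingMolecule:
    """Hypothetical stand-in for MoleculeContainer; it only records the calls."""

    def __init__(self, truthy=()):
        self.calls = []
        self._truthy = set(truthy)

    def __getattr__(self, name):
        # reached only for attributes not found normally, i.e. the step names
        if name not in CANONICALIZE_STEPS:
            raise AttributeError(name)

        def step():
            self.calls.append(name)
            return name in self._truthy
        return step


demo = RecordingMolecule(truthy={'standardize', 'thiele'})
print(canonicalize_stepwise(demo))  # ['standardize', 'thiele']
print(demo.calls)                   # all five steps, in the listed order
```

With a real MoleculeContainer the same helper could be used to see which individual step actually touched a structure, something a single canonicalize call does not reveal.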
[3]:
m3 # molecule with kekulized ring
[3]:
[4]:
m3.standardize() # apply standardization. Returns True if any group found
[4]:
True
[5]:
m3 # group-standardized structure.
[5]:
[6]:
m3.thiele() # aromatizes rings; returns True if any ring was found
[6]:
True
[7]:
m3
[7]:
Molecules have explicify_hydrogens and implicify_hydrogens methods to handle hydrogens.
These methods are used to add or remove hydrogens in a molecule.
Note that implicify_hydrogens works for aromatic rings only in kekule form. For aromatized forms, explicify_hydrogens requires the kekule and, optionally, thiele procedures to be applied beforehand.
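One way to see why hydrogen handling depends on the Kekule form is simple valence arithmetic: with integer bond orders the number of implicit hydrogens on an atom follows directly, whereas two aromatic bonds alone do not distinguish, for example, a pyridine-type nitrogen from a pyrrole-type one. The sketch below is a simplified illustration of that arithmetic, not the algorithm used by CGRtools; DEFAULT_VALENCE and implicit_hydrogens are hypothetical names, and charges and radicals are ignored.

```python
# Typical neutral valences for a few elements (simplified on purpose).
DEFAULT_VALENCE = {'C': 4, 'N': 3, 'O': 2}


def implicit_hydrogens(element, bond_orders):
    """Implicit hydrogen count = typical valence minus the sum of bond orders."""
    return max(DEFAULT_VALENCE[element] - sum(bond_orders), 0)


# Kekule forms: ring bond orders are explicit integers.
print(implicit_hydrogens('C', [1, 2]))  # benzene carbon -> 1
print(implicit_hydrogens('N', [1, 2]))  # pyridine nitrogen -> 0
print(implicit_hydrogens('N', [1, 1]))  # pyrrole nitrogen -> 1
```

In an aromatized representation both nitrogens would simply carry two aromatic ring bonds, so the difference between 0 and 1 hydrogens has to come from a prior Kekule assignment.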
[8]:
m3.explicify_hydrogens() # returns the number of added hydrogens
[8]:
5
[9]:
m3
[9]:
[10]:
m3.clean2d() # coordinates are not calculated for added hydrogen atoms,
# so without this call they appear at the same position in the image
m3
[10]:
[11]:
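m3.kekule() # implicify_hydrogens needs aromatic rings in kekule form
m3.implicify_hydrogens() # returns the number of removed hydrogens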
[11]:
5
[12]:
m3
[12]:
3.2. Reactions standardization
ReactionContainer has the same methods as MoleculeContainer. In this case they are applied to all molecules in the reaction.
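The idea of a reaction-level method that delegates to every molecule can be sketched with toy classes. ToyMolecule and ToyReaction are hypothetical stand-ins, not CGRtools classes; the grouping into reactants, reagents and products and the "True if anything changed" return value are assumptions made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMolecule:
    """Hypothetical stand-in for MoleculeContainer."""
    name: str
    needs_fix: bool = False

    def standardize(self):
        # report whether a fix was needed, then mark the molecule as fixed
        changed, self.needs_fix = self.needs_fix, False
        return changed


@dataclass
class ToyReaction:
    """Hypothetical stand-in for ReactionContainer."""
    reactants: list = field(default_factory=list)
    reagents: list = field(default_factory=list)
    products: list = field(default_factory=list)

    def molecules(self):
        yield from self.reactants
        yield from self.reagents
        yield from self.products

    def standardize(self):
        # build a list first so that every molecule is processed,
        # instead of stopping at the first truthy result
        return any([m.standardize() for m in self.molecules()])


reaction = ToyReaction(reactants=[ToyMolecule('acid', needs_fix=True)],
                       products=[ToyMolecule('ester', needs_fix=True)])
print(reaction.standardize())  # True: at least one molecule was changed
print(reaction.standardize())  # False: nothing left to fix
```

The list inside any() matters: a bare generator would short-circuit and leave later molecules unprocessed.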
[13]:
r2
[13]:
[14]:
r2.standardize()
r2.explicify_hydrogens()
r2.clean2d()
r2
[14]: