CGRtools.files package

Available file parsers and writers:

class CGRtools.files.ERDFWrite(file, *, append: bool = False, write3d: bool = False, mapping: bool = True)

MDL V3000 RDF file writer. Works like a file object opened for writing and supports use as a context manager. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters
  • append – append to the existing file (True) or overwrite it (False). For a buffered writer object, append=False writes the RDF header and append=True omits it.

  • write3d – for molecules, write the first set of 3D coordinates instead of 2D coordinates, if 3D coordinates exist.

  • mapping – write atom mapping.

write(data)

class CGRtools.files.ESDFWrite(file, *, write3d: int = 0, mapping: bool = True, append: bool = False)

MDL V3000 SDF file writer. Works like a file object opened for writing and supports use as a context manager. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters
  • write3d – for molecules, write 3D coordinates instead of 2D coordinates, if 3D coordinates exist: 0 writes 2D only, 1 writes the first set of 3D coordinates, 2 writes all sets of 3D coordinates in sequence.

  • mapping – write atom mapping.

write(data)

Write a single molecule to the file.

class CGRtools.files.INCHIRead(file, header=None, ignore_stereo=False, **kwargs)

Reader for files with one InChI per line. Works like an opened file object and supports use as a context manager. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object. If header=None, each line should start with an InChI string, optionally followed by a space- or tab-separated list of key:value (or key=value) data.

For example:

InChI=1S/C2H5/c1-2/h1H2,2H3/q+1 id:123 key=value

If header=True, the first line of the file should be a space- or tab-separated list of keys, including a key for the InChI column. For example:

ignored_inchi_key key1 key2
InChI=1S/C2H5/c1-2/h1H2,2H3/q+1 1 2

It is also possible to pass a list of keys (without the InChI column key) to map the space- or tab-separated values that follow the InChI: header=['key1', 'key2']. The order of the keys matters.
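
The two line layouts described above can be sketched in plain Python. This is only an illustration of the documented format, not the library's parser; the function name split_record is hypothetical, and the sketch assumes that every line starts with the structure string.

```python
import re
from typing import Dict, List, Optional, Tuple


def split_record(line: str, header: Optional[List[str]] = None) -> Tuple[str, Dict[str, str]]:
    """Split one line into the structure string and its metadata (illustrative only)."""
    # The first whitespace-separated token is the InChI; the rest are data fields.
    structure, *fields = line.split()
    if header is None:
        # header=None layout: every field is key:value or key=value.
        pairs = (re.split(r"[:=]", field, maxsplit=1) for field in fields)
        meta = {pair[0]: pair[1] for pair in pairs if len(pair) == 2}
    else:
        # header=[...] layout: values are matched to keys by position.
        meta = dict(zip(header, fields))
    return structure, meta


inchi, meta = split_record("InChI=1S/C2H5/c1-2/h1H2,2H3/q+1 id:123 key=value")
print(inchi, meta)
```

Because the header keys are matched by position, swapping 'key1' and 'key2' in the list swaps the values they receive.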

Parameters
  • ignore – Skip some data checks or try to fix some errors.

  • remap – Renumber atoms starting from one.

  • store_log – If the parser produced any log messages, store them in .meta under the key CGRtoolsParserLog.

  • calc_cis_trans – Calculate cis/trans marks from 2D coordinates.

  • ignore_stereo – Ignore stereo data.

close(force=False)

Close the opened file.

Parameters

force – Force closing of an externally opened file or buffer.

classmethod create_parser(*args, **kwargs)

Create an InChI parser function configured the same way as an INCHIRead object.

parse(inchi: str) → Union[MoleculeContainer, Dict[str, str]]

Convert an InChI string into a MoleculeContainer object. The string should start with an InChI, optionally followed by a space- or tab-separated list of key:value (or key=value) data.

read() → List[MoleculeContainer]

Parse the whole file.

Returns

List of parsed molecules.

class CGRtools.files.MRVRead(file, **kwargs)

ChemAxon MRV file reader. Works like an opened file object and supports use as a context manager. On initialization it accepts a file opened in binary mode, a string path to a file, a pathlib.Path object, or another binary buffered reader object.

Parameters
  • ignore – Skip some data checks or try to fix some errors.

  • remap – Renumber atoms starting from one.

  • store_log – If the parser produced any log messages, store them in .meta under the key CGRtoolsParserLog.

  • calc_cis_trans – Calculate cis/trans marks from 2D coordinates.

  • ignore_stereo – Ignore stereo data.

close(force=False)

Close the opened file.

Parameters

force – Force closing of an externally opened file or buffer.

read()

Parse the whole file.

Returns

List of parsed molecules or reactions.

class CGRtools.files.MRVWrite(file)

ChemAxon MRV file writer. Works like a file object opened for writing and supports use as a context manager. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

close(force=False)

Write the closing tag of the MRV file and close the opened file.

Parameters

force – Force closing of an externally opened file or buffer.

write(data)

Write a single molecule or reaction to the file.

class CGRtools.files.PDBRead(file, ignore=False, element_name_priority=False, parse_as_single=False, atom_name_map=None, **kwargs)

PDB file reader. Works like an opened file object and supports use as a context manager. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object.

Multiple structures in the same file, separated by ENDMDL, are supported. Only ATOM and HETATM records are parsed. An END or ENDMDL record is required at the end of the file.
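
The record handling described above can be illustrated with a short plain-Python sketch. It is not the reader's implementation; the function name split_models is hypothetical, and the sketch assumes the standard PDB layout in which the record name occupies the first six columns.

```python
from typing import List


def split_models(pdb_text: str) -> List[List[str]]:
    """Group ATOM/HETATM lines into models closed by ENDMDL or END (illustrative only)."""
    models: List[List[str]] = []
    current: List[str] = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # record name lives in columns 1-6
        if record in ("ATOM", "HETATM"):
            current.append(line)
        elif record in ("ENDMDL", "END"):
            # A terminator closes the model collected so far, if any.
            if current:
                models.append(current)
                current = []
        # All other record types are ignored in this sketch.
    return models
```

Atom lines that are not followed by END or ENDMDL are never emitted by this sketch, which mirrors the requirement that a terminator closes the file.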

Parameters
  • ignore – Skip some data checks or try to fix some errors.

  • store_log – If the parser produced any log messages, store them in .meta under the key CGRtoolsParserLog.

  • element_name_priority – For ligands, use the value of the element symbol column and ignore the atom name column.

  • parse_as_single – Useful if all models in the file describe the same structure. The 2D graph is restored from the first model, and the other models are returned as conformers.

  • atom_name_map – Dictionary of atom name replacements, e.g. {'Ow': 'O'}. Keys should be capitalized.
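
A mapping such as {'Ow': 'O'} could be applied as in the sketch below. How the reader actually performs the lookup is not stated here, so the capitalization step and the helper name normalize_atom_name are assumptions made for illustration.

```python
from typing import Dict


def normalize_atom_name(name: str, atom_name_map: Dict[str, str]) -> str:
    """Look up a replacement for an atom name (assumed lookup rule, illustrative only)."""
    # Assumption: names are capitalized before lookup, which is why
    # the keys of the map should be capitalized ('Ow', not 'OW').
    key = name.strip().capitalize()
    return atom_name_map.get(key, key)


print(normalize_atom_name("OW", {"Ow": "O"}))
```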

class CGRtools.files.RDFRead(*args, **kwargs)

MDL RDF file reader. Works like an opened file object and supports use as a context manager. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object.

Parameters
  • indexable

    If True: the seek and tell methods, the object size, and subscription are supported. This works only with a real file (a file path is given) on Unix-like systems, because the external grep utility is used. The object behaves like a normal open file.

    If False: works like a generator that converts each record into a ReactionContainer and returns the objects in order. Records with errors are skipped.

  • ignore – Skip some data checks or try to fix some errors.

  • remap – Renumber atoms starting from one.

  • store_log – If the parser produced any log messages, store them in .meta under the key CGRtoolsParserLog.

  • calc_cis_trans – Calculate cis/trans marks from 2D coordinates.

  • ignore_stereo – Ignore stereo data.

seek(offset)

Move to the record with the given number in the original file.

Parameters

offset – record number

tell()

Returns

Number of records processed from the original file.

class CGRtools.files.RDFWrite(file, *, append: bool = False, write3d: bool = False, mapping: bool = True)

MDL RDF file writer. Works like a file object opened for writing and supports use as a context manager. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters
  • append – append to the existing file (True) or overwrite it (False). For a buffered writer object, append=False writes the RDF header and append=True omits it.

  • write3d – for molecules, write the first set of 3D coordinates instead of 2D coordinates, if 3D coordinates exist.

  • mapping – write atom mapping.
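
The effect of append on an already opened buffer can be pictured with the following sketch. It does not use the library; the helper name start_rdf_stream is hypothetical, and the exact spacing and date layout of the $DATM line are assumptions.

```python
import io
from datetime import datetime
from typing import TextIO


def start_rdf_stream(buffer: TextIO, append: bool = False) -> TextIO:
    """Write the RDF header unless the caller is appending (illustrative only)."""
    if not append:
        # An RDF file starts with a $RDFILE line followed by a $DATM date stamp.
        stamp = datetime.now().strftime("%m/%d/%y %H:%M")
        buffer.write(f"$RDFILE 1\n$DATM    {stamp}\n")
    return buffer


fresh = start_rdf_stream(io.StringIO())
continued = start_rdf_stream(io.StringIO(), append=True)
print(repr(fresh.getvalue()), repr(continued.getvalue()))
```

Omitting the header when appending avoids a second $RDFILE line in the middle of an existing file.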

write(data)

class CGRtools.files.SDFRead(*args, **kwargs)

MDL SDF file reader. Works like an opened file object and supports use as a context manager. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object.

Parameters
  • indexable

    If True: the seek and tell methods, the object size, and subscription are supported. This works only with a real file (a file path is given) on Unix-like systems, because the external grep utility is used. The object behaves like a normal open file.

    If False: works like a generator that converts each record into a MoleculeContainer and returns the objects in order. Records with errors are skipped.

  • ignore – Skip some data checks or try to fix some errors.

  • remap – Renumber atoms starting from one.

  • store_log – If the parser produced any log messages, store them in .meta under the key CGRtoolsParserLog.

  • calc_cis_trans – Calculate cis/trans marks from 2D coordinates.

  • ignore_stereo – Ignore stereo data.

seek(offset)

Move to the record with the given number in the original file.

Parameters

offset – record number

tell()

Returns

Number of records processed from the original file.
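
Record-based seeking needs a table of the positions at which records start. The sketch below builds such a table in pure Python by scanning for the $$$$ record delimiter of SDF files. It stands in for the external grep call mentioned above and is not the library's code; the names index_records and read_record are hypothetical.

```python
import io
from typing import List, TextIO


def index_records(stream: TextIO) -> List[int]:
    """Return the stream position at which each SDF record starts (illustrative only)."""
    stream.seek(0)
    offsets = [0]
    while True:
        line = stream.readline()
        if not line:
            break
        if line.startswith("$$$$"):
            # The next record, if any, starts right after the delimiter line.
            offsets.append(stream.tell())
    if offsets[-1] == stream.tell():
        offsets.pop()  # the last position is the end of the file, not a record
    return offsets


def read_record(stream: TextIO, offsets: List[int], number: int) -> str:
    """Jump to record `number` and return its text without the delimiter."""
    stream.seek(offsets[number])
    lines = []
    for line in iter(stream.readline, ""):
        if line.startswith("$$$$"):
            break
        lines.append(line)
    return "".join(lines)


sdf = io.StringIO("mol A\n$$$$\nmol B\n$$$$\nmol C\n$$$$\n")
table = index_records(sdf)
print(len(table), read_record(sdf, table, 1))
```

The length of the table gives the number of records, and indexing into it gives random access, which corresponds to the object size and subscription features of an indexable reader.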

class CGRtools.files.SDFWrite(file, *, write3d: int = 0, mapping: bool = True, append: bool = False)

MDL SDF file writer. Works like a file object opened for writing and supports use as a context manager. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters
  • write3d – for molecules, write 3D coordinates instead of 2D coordinates, if 3D coordinates exist: 0 writes 2D only, 1 writes the first set of 3D coordinates, 2 writes all sets of 3D coordinates in sequence.

  • mapping – write atom mapping.

write(data)

Write a single molecule to the file.

class CGRtools.files.SMILESRead(file, header=None, ignore_stereo=False, **kwargs)

Reader for files with one SMILES per line. Works like an opened file object and supports use as a context manager. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object.

If header=None, each line should start with a SMILES string, optionally followed by a space- or tab-separated list of key:value (or key=value) data. For example:

C=C>>CC id:123 key=value

If header=True, the first line of the file should be a space- or tab-separated list of keys, including a key for the SMILES column. For example:

ignored_smi_key key1 key2
CCN 1 2

It is also possible to pass a list of keys (without the SMILES column key) to map the space- or tab-separated values that follow the SMILES: header=['key1', 'key2']. The order of the keys matters.

In reaction SMILES, the dot (.) bond symbol should be used only to separate molecules.
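
The role of the dot in reactions can be seen by splitting a reaction SMILES by hand. The sketch below is not the library's parser; the function name split_reaction is hypothetical, and it assumes a plain reaction SMILES with exactly two > separators.

```python
from typing import List, Tuple


def split_reaction(smiles: str) -> Tuple[List[str], List[str], List[str]]:
    """Split 'reactants>reagents>products' into lists of molecules (illustrative only)."""
    reactants, reagents, products = smiles.split(">")
    # Within each part, every dot starts a new molecule, so a dot cannot
    # also be used as a bond inside one molecule.
    parts = tuple([mol for mol in part.split(".") if mol] for part in (reactants, reagents, products))
    return parts  # type: ignore[return-value]


print(split_reaction("C=C>>CC"))
print(split_reaction("CC(=O)O.OCC>[H+]>CC(=O)OCC.O"))
```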

Parameters
  • ignore – Skip some data checks or try to fix some errors.

  • remap – Renumber atoms starting from one.

  • store_log – If the parser produced any log messages, store them in .meta under the key CGRtoolsParserLog.

  • ignore_stereo – Ignore stereo data.

close(force=False)

Close opened file.

Parameters

force – Force closing of externally opened file or buffer.

classmethod create_parser(*args, **kwargs)

Create a SMILES parser function configured the same way as a SMILESRead object.

parse(smiles: str) → Union[MoleculeContainer, CGRContainer, ReactionContainer, Dict[str, str]]

SMILES string parser.

read() → List[Union[MoleculeContainer, CGRContainer, ReactionContainer]]

Parse whole file.

Returns

List of parsed molecules or reactions.

class CGRtools.files.XYZRead(file, **kwargs)

XYZ file reader. Works like an opened file object and supports use as a context manager. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object.

Multiple structures in the same file are supported. The second line of a structure can store the total charge of the system. For example:

2
charge=-1
O 0.0 0.0 0.0
H 1.0 0.0 0.0

Parameters
  • radius_multiplier – Multiplier applied to the sum of the covalent radii of two atoms when deciding whether they are bonded.

  • store_log – If the parser produced any log messages, store them in .meta under the key CGRtoolsParserLog.
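
A distance criterion of this kind can be sketched as follows. The exact rule used by the reader is not given here, so the comparison below is an assumption; the name is_bonded is hypothetical, and the radii are the Cordero et al. (2008) covalent radii in ångströms, which may differ from the table the library uses.

```python
from math import dist
from typing import Tuple

Atom = Tuple[str, float, float, float]

# Covalent radii in Å (Cordero et al., 2008); illustrative subset only.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def is_bonded(a: Atom, b: Atom, radius_multiplier: float) -> bool:
    """Assumed rule: bonded if distance <= multiplier * (r_a + r_b)."""
    limit = radius_multiplier * (COVALENT_RADII[a[0]] + COVALENT_RADII[b[0]])
    return dist(a[1:], b[1:]) <= limit


oxygen = ("O", 0.0, 0.0, 0.0)
hydrogen = ("H", 1.0, 0.0, 0.0)
print(is_bonded(oxygen, hydrogen, 1.25), is_bonded(oxygen, hydrogen, 1.0))
```

For the O and H atoms 1.0 Å apart, the sum of the radii is 0.97 Å, so a multiplier of 1.0 leaves them unbonded while 1.25 raises the limit to about 1.21 Å and connects them. The multiplier value 1.25 is chosen only for illustration.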

classmethod create_parser(*args, **kwargs)

Create an XYZ parser function configured the same way as an XYZRead object.

from_xyz(matrix, charge=0, radical=0)

parse(matrix: Iterable[Tuple[str, float, float, float]], charge: int = 0, radical: int = 0)
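
The matrix argument is a sequence of (symbol, x, y, z) tuples. The sketch below shows how one XYZ block, such as the charged example given for XYZRead, could be turned into that shape by hand. It is not the reader's implementation; the function name xyz_to_matrix is hypothetical, and it handles a single structure only.

```python
from typing import List, Tuple

Matrix = List[Tuple[str, float, float, float]]


def xyz_to_matrix(text: str) -> Tuple[Matrix, int]:
    """Convert one XYZ block into (matrix, charge) (illustrative only)."""
    lines = text.strip().splitlines()
    count = int(lines[0])      # first line: number of atoms
    comment = lines[1].strip() # second line: may hold charge=<int>
    charge = int(comment.split("=", 1)[1]) if comment.startswith("charge=") else 0
    matrix: Matrix = []
    for line in lines[2:2 + count]:
        symbol, x, y, z = line.split()[:4]
        matrix.append((symbol, float(x), float(y), float(z)))
    return matrix, charge


block = "2\ncharge=-1\nO 0.0 0.0 0.0\nH 1.0 0.0 0.0\n"
matrix, charge = xyz_to_matrix(block)
print(matrix, charge)
```

The resulting pair matches the shape of the matrix and charge arguments in the signature above; radical is left at its default in this sketch.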