openmolecules.org

 
Toggle-off absolute configuration while determining Murcko scaffold [message #608] Wed, 21 August 2019 14:00
nbehrnd
Dear Thomas,

using both DW and the third-party Python module RDKit to
determine the Murcko scaffolds of molecules, including those with
a stereogenic center, I noticed that DataWarrior (DW) retains the
stereochemical information in the trimmed fragment. Reading the
same SMILES string as DW (5.0.0), RDKit (version 2019.1 with
Python 2), however, trims this information off.

Question: Is there an option to instruct DW to likewise 'forget'
this piece of information when writing the SMILES string?

Input SMILES string used in both programs:
C(=O)(C)O[C@H]1[C@H]([C@H](n2c3c(c(ncn3)N)nc2)O[C@@H]1CO)O

DW's output SMILES string for the Murcko scaffold:
C(C1)CO[C@H]1n1c2ncncc2nc1

MWE for processing with rdkit:
from rdkit import Chem  # provides MolFromSmiles / MolToSmiles
from rdkit.Chem.Scaffolds import MurckoScaffold

source = 'C(=O)(C)O[C@H]1[C@H]([C@H](n2c3c(c(ncn3)N)nc2)O[C@@H]1CO)O'
mol = Chem.MolFromSmiles(source)
core = MurckoScaffold.GetScaffoldForMol(mol)  # Murcko scaffold as a new Mol
print(Chem.MolToSmiles(core))

>>> c1ncc2ncn(C3CCCO3)c2n1

which, constitutionally, is the same molecular structure.

This may be more than a cosmetic issue, because the search for
the Murcko scaffold of

Brc1ccc([C@@]2(CC(=O)CCC2)CN(=O)=O)cc1

actually leads DW to invert the absolute configuration from
initially R to now S (expressed by the loss of one @ in the
string O=C(CCC1)C[C@H]1c1ccccc1).

[Updated on: Wed, 21 August 2019 14:34]


Re: Toggle-off absolute configuration while determining Murcko scaffold [message #615 is a reply to message #608] Sat, 24 August 2019 01:09
thomas
Dear Norwid,

to me the current handling is correct. The change from R to S in your sample SMILES is expected: removing the methylene-nitro group changes the substituent priorities at the stereocenter, while the stereochemistry itself is correctly retained in the Murcko scaffold.


If you generate SMILES from the scaffolds, you may then use 'Find and replace' to remove all '@' symbols from them.
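
For SMILES strings exported from DW, the same replacement could also be done outside the program. The following is only a minimal sketch in plain Python; the function name `strip_chirality_marks` is made up for this illustration, and it assumes that '@' occurs in the strings solely as a tetrahedral chirality mark. Bond stereo marks ('/' and '\') are deliberately left untouched.

```python
def strip_chirality_marks(smiles):
    """Return the SMILES string with all '@' / '@@' chirality marks removed."""
    return smiles.replace("@", "")


# The two scaffold SMILES written by DW, as quoted earlier in this thread.
scaffolds = [
    "C(C1)CO[C@H]1n1c2ncncc2nc1",
    "O=C(CCC1)C[C@H]1c1ccccc1",
]

for smiles in scaffolds:
    print(strip_chirality_marks(smiles))
```

The former stereocenters remain bracketed, e.g. as [CH], which still describes the same atom without a configuration.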

Thomas
  • Attachment: t.png
    (Size: 12.89KB, Downloaded 546 times)
Re: Toggle-off absolute configuration while determining Murcko scaffold [message #616 is a reply to message #615] Sun, 25 August 2019 18:56
nbehrnd
Dear Thomas,

as probed, «Find and Replace» (under the Edit tab) indeed offers
exactly the tool I needed to simplify the SMILES as intended. This
change propagates well both into subsequent stages of structure display
and into the retrieval -- either by structure or by string -- of
structures sharing a common Murcko scaffold.

Thank you,
Norwid
  • Attachment: example.png
    (Size: 4.82KB, Downloaded 661 times)
  • Attachment: example.dwar
    (Size: 6.69KB, Downloaded 401 times)