Bug in how stereochemistry is reported [message #1184]
Fri, 08 January 2021 11:52
richards99
Messages: 42 Registered: May 2020 Location: UK
Hi,
When a molecule's 3D coordinates have been generated in another software package and the result is loaded into DataWarrior as an SDF, the stereochemistry shown in the table is not always correct, and is sometimes undefined.
I don't know how this works internally, so I am only speculating, but I suspect DW may be picking up flags from a 2D representation without looking at the 3D coordinates. Could DW check whether 3D coordinates are present and, if they are, display the stereochemistry defined by the 3D conformation?
This has been causing a lot of problems: when generating SMILES in DW for molecules with 3D coordinates, it sometimes returns the wrong stereochemistry or leaves it undefined.
Thanks,
Simon.
Re: Bug in how stereochemistry is reported [message #1186 is a reply to message #1184]
Wed, 13 January 2021 17:51
thomas
Messages: 711 Registered: June 2014
Hi Simon,
DataWarrior pays a lot of attention to handling stereochemistry correctly. When it reads SD-files that have 3D-coordinates, it uses those 3D-coordinates to determine the stereo parities. To verify whether there is a problem, I did the following:
- I generated a file with 1000 random drug-like molecules
- I added a column with the stereo center count (found 0 to 5 stereo centers) and removed all rows with 0 stereo centers
- I added 3D-coordinates ending up with 480 molecules with 2D- and 3D-coordinates; none of them were racemic
- I added a column with row numbers
- I saved the file as SD-file (version 2) using 3D- rather than 2D-coordinates
- I merged the SD-file into the original file using the Row-No as merge key
- Then I checked whether the structures were perceived as equal
- Except for 2 rows, the molecules were identical. These two rows had unspecified double bond geometries in the first place, and the conformer generator picked one at random, which differed from the original. Thus, stereo centers matched in all cases.
To understand what doesn't work in your case, I need to understand better what you do. One reason may be that the software you use to generate your 3D-coordinates and to export SD-files exports the molecules as racemates, i.e. with the 'chiral flag' set to 0. Please check your SD-file with a text editor. The fourth line of each record (the fourth line of the file, and the fourth line after every $$$$ line) is called the 'counts line'. It contains the number of atoms and bonds and other information about the molecule. It looks like:
49 50 0 0 1 0 0 0 0 0999 V2000
If the fifth entry is a '1', then the structure is meant to be enantiomerically pure. If it is a '0', then you have a racemic mixture containing the given structure plus its mirror image. If DataWarrior reads a racemate from a molfile, it shows green stereo bonds with an '&' sign, which means: this and the opposite stereo configuration. Stereo bonds of absolute stereo centers are drawn in red. In the case of racemates, DataWarrior does not necessarily display the enantiomer given in the SD-file, because DataWarrior stores structures in a canonical way. For racemic mixtures it doesn't matter which enantiomer is stored, as long as it is marked as racemic. For exact molecule matching, however, it is important that racemic mixtures are also stored in a reproducible way.
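To check a whole SD-file without opening it in a text editor, something like the following could be used. It is only a minimal sketch and not part of DataWarrior: the function name `chiral_flags` is made up for illustration, and it assumes the standard fixed-width V2000 layout, in which the chiral flag occupies characters 13 to 15 of the counts line.

```python
def chiral_flags(sdf_text):
    """Return the chiral flag (0 or 1) of every V2000 record in an SD-file string.

    Assumption: each record starts with three header lines followed by the
    counts line, and records are separated by a line containing only $$$$.
    """
    flags = []
    lines = sdf_text.splitlines()
    start = 0  # index of the first header line of the current record
    while start + 3 < len(lines):
        counts = lines[start + 3]
        if counts.rstrip().endswith("V2000"):
            # Fields are 3 characters wide; the fifth field is the chiral flag.
            flags.append(int(counts[12:15]))
        # Skip ahead to the line after the next $$$$ delimiter.
        end = start
        while end < len(lines) and lines[end].strip() != "$$$$":
            end += 1
        start = end + 1
    return flags
```

A result consisting only of zeros for a file that is supposed to hold enantiomerically pure conformers would point to the exporting software not setting the flag.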
If this is not the reason for your observation, please give me an example that doesn't work, so that I can find the reason.
I hope I could explain this in an understandable way...
Thomas
Re: Bug in how stereochemistry is reported [message #1245 is a reply to message #1240]
Tue, 09 March 2021 16:01
thomas
Dear Simon,
I tried to confirm the issue with CC[C@H](O)[C@@H](CC1=CC=CC=C1)C#N with the official 5.2.1 version and with the current dev version. Whether I paste the SMILES into an editor or paste a tiny table that contains the SMILES, DataWarrior always creates the correct structure with two stereo centers. Can you please give me a procedure where it doesn't work?
The issue with the chiral flag not being set in the SD-file is a V2000 problem, not a V3000 one. The fifth entry in the counts line of a V2000 molfile encodes whether the structure is a racemate (0) or not (1). For racemates DataWarrior must normalize the configuration, because DataWarrior stores canonical structure representations. Normalization here means that it determines in a reproducible way which of the two enantiomers is shown and stored. It cannot just store the input form, because then there would be two different representations encoding the same structure.
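The idea of a reproducible choice between two enantiomers can be illustrated with a toy model. This is not DataWarrior's canonicalization, which is not described in this thread. The sketch assumes a configuration is given as one 'R' or 'S' label per stereo center in a fixed atom order, and that the mirror image simply inverts every label; both function names are hypothetical.

```python
def mirror(config):
    """Return the mirror image: every stereo center label is inverted."""
    return tuple("S" if label == "R" else "R" for label in config)


def normalize_racemate(config):
    """Pick one of the two enantiomers in a reproducible way.

    Toy rule (an assumption for illustration): keep whichever of the
    configuration and its mirror image sorts first.
    """
    config = tuple(config)
    return min(config, mirror(config))


# Both input forms of the same racemate end up with the same stored form.
print(normalize_racemate(("S", "R")))
print(normalize_racemate(("R", "S")))
```

Whichever enantiomer the input file happens to contain, the stored form is the same, which is what exact matching of racemates needs.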
What I could do is provide an option when loading V2000 SD-files to consider racemic molfiles with encoded stereo centers as enantiomerically pure, simply assuming that the chiral flag was meant to be set. This, of course, is dangerous, especially if more than one stereo center is present. Depending on the source of the molfile, however, one may have good reason to believe that the software 'forgot' to set the chiral flag, e.g. when the structures represent conformers with 3D-coordinates.
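Until such an option exists, the same assumption could be applied to the file before loading it. The following is a hedged sketch of such a preprocessing step, not a DataWarrior feature: `set_chiral_flag` is a hypothetical helper that sets the chiral flag to 1 in every V2000 record, assuming the standard fixed-width counts line. It carries exactly the risk described above, so it should only be used on files known to contain single enantiomers.

```python
def set_chiral_flag(sdf_text):
    """Return a copy of an SD-file string with every V2000 chiral flag set to 1.

    Assumptions: three header lines precede the counts line of each record,
    and records are separated by a line containing only $$$$.
    Line endings are normalized to '\n' in the result.
    """
    lines = sdf_text.splitlines()
    start = 0  # index of the first header line of the current record
    while start + 3 < len(lines):
        counts = lines[start + 3]
        if counts.rstrip().endswith("V2000") and len(counts) >= 15:
            # Characters 13 to 15 hold the chiral flag; overwrite with '  1'.
            lines[start + 3] = counts[:12] + "  1" + counts[15:]
        # Skip ahead to the line after the next $$$$ delimiter.
        end = start
        while end < len(lines) and lines[end].strip() != "$$$$":
            end += 1
        start = end + 1
    return "\n".join(lines) + "\n"
```

V3000 records are left untouched, because their counts line does not end in V2000.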
What do you think?
Thomas
Re: Bug in how stereochemistry is reported [message #1250 is a reply to message #1245]
Thu, 11 March 2021 09:26
richards99
Hi Thomas,
Below is a text file with the SMILES I mentioned in the previous post. When I load it directly into DataWarrior, the carbon with the alcohol attached correctly shows a wedge bond indicating its chirality, whilst the carbon with the nitrile attached only shows flat bonds with a ? label to indicate unknown stereochemistry. I am using the Dev build of DataWarrior. Other colleagues can also reproduce the problem.
With regard to the chiral flag issue in molfiles, the solution you propose would be very useful, as there seem to be multiple tools that do not set the chiral flag, so the stereochemistry gets scrambled on import into DataWarrior. Whether it is in the dialog box for opening an SDF or a setting in the preferences, I do not mind, so long as an option is available.
Thanks,
Simon.
Attachment: temp.txt (Size: 0.04KB)
Re: Bug in how stereochemistry is reported [message #1968 is a reply to message #1966]
Thu, 27 July 2023 19:45
thomas
Dear Tom,
V3000 molfiles don't use a 'chiral' flag. If you open your molfile in DataWarrior, you will notice that it shows an absolute stereo center. Of course, DataWarrior uses the correct Y-direction for molfiles, which is indeed inverted compared to most other xy-orientations. CFG=3 is a down bond, and the coordinates are created correctly, leading to an R-isomer. ChemDraw does the same. I am afraid that ChemDoodle created the wrong CFG entry here.
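To see which CFG value a tool has written, the bond lines of a V3000 molfile can be inspected with a few lines of code. This is a minimal sketch under stated assumptions: `bond_cfg` is a hypothetical helper, it expects a single line from the bond block (between 'M  V30 BEGIN BOND' and 'M  V30 END BOND'), and the sample line in the comment is invented rather than taken from the molfile discussed here.

```python
# Bond configuration values in V3000 bond lines.
CFG_MEANING = {0: "none", 1: "up (wedge)", 2: "either", 3: "down (hash)"}


def bond_cfg(line):
    """Return (bond index, CFG value) for one V3000 bond line.

    Expected shape: 'M  V30 index type atom1 atom2 [KEY=VALUE ...]',
    for example 'M  V30 3 1 2 5 CFG=3'. A missing CFG entry counts as 0.
    """
    fields = line.split()
    cfg = 0
    for field in fields[6:]:  # optional KEY=VALUE entries follow atom2
        if field.startswith("CFG="):
            cfg = int(field[4:])
    return int(fields[2]), cfg
```

The wedge starts at the first atom listed on the bond line, so the atom order matters when comparing the output of different programs.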
Best wishes,
Thomas