openmolecules.org

 
Comparing Structure Files [message #1470] Wed, 12 January 2022 13:42
biolive31!!
Messages: 5
Registered: January 2022
Junior Member
Good morning, I'm Olivier, a new DataWarrior user!
I tried to use the "Find similar compound in other file" tool, but it doesn't work.

I know that some molecules are similar to those in the "other file", yet the tool did not find them. No molecule is reported as similar. I tried both comparison methods; even at 70% similarity, no similar molecule is found.
Could you help me?

Thank you

Olivier
Re: Comparing Structure Files [message #1471 is a reply to message #1470] Thu, 13 January 2022 23:44
nbehrnd
Messages: 204
Registered: June 2019
Senior Member
Dear Olivier,

your question contains little material to replicate your finding. Thus, it is difficult to identify a plausible cause for your observation.

On the other hand, I created two tiny libraries of drug-like molecules (chemistry -> generate random molecules) where the defaults were modified slightly (instead of 1000, I opted for 100 molecules each). Subsequently, with set_b.dwar open, I selected set_a.dwar for the comparison. With the structureFp threshold lowered to 70%, DW identified 13 molecules as similar enough. The raw data, the parameter setup, and the result files are in the attachment below. Perhaps the threshold you set was too high? The criterion «structure exact» isn't a restraint, it is meant as a constraint*): if there is any variation between the molecule to check and the reference molecule, the test assigns the two as dissimilar.

*) which may be weakened with regard to stereochemistry, tautomers, or application to the largest fragment only.

Norwid
Re: Comparing Structure Files [message #1473 is a reply to message #1471] Mon, 17 January 2022 14:39
biolive31!!
Dear Norwid,
Thank you for your help.
I tried to compare set_a and set_b. I found 16 similar compounds.

My first conclusion is that the DW I installed works well. :-)

Now I am wondering whether my files are, in some way, unanalysable by the similarity tool in DW.
These files are SDF files with structures.

I tried another tool to compare the molecules in my files (chemdiff) and, like DW, it wasn't able to identify similar structures.
Do you know which file type is required to perform a similarity analysis?

Thank you for your help.

Olivier
Re: Comparing Structure Files [message #1479 is a reply to message #1473] Mon, 24 January 2022 13:36
nbehrnd
Dear Olivier,

by Monday, 2022-01-24, Thomas identified the cause of DW's problem in applying the criterion «Structure[Exact]» in the similarity analysis of two structure files. In the development version, the code is corrected accordingly.

Norwid
Re: Comparing Structure Files [message #1535 is a reply to message #1479] Tue, 08 March 2022 15:37
biolive31!!
Dear Norwid,
Thank you for keeping me informed about this problem. Where can the development version be downloaded?
In addition, in the version I use there is another bug in the find-similarity function when you select the Structure[FragFp] method.
When you use this tool to find similar compounds in another SDF file, it works well only when you do not select columns to be copied into the file. When you select a column to copy into the file, the similarity tool doesn't work.

Thank you for your help

Olivier
Re: Comparing Structure Files [message #1536 is a reply to message #1535] Tue, 08 March 2022 23:06
nbehrnd
Dear Olivier,

after accepting the disclaimer on the download page[1], an additional text box becomes visible with links for an installation on either Linux and Mac,[2] or the Windows operating system.[3] These archives contain only the .jar of the ongoing development of DataWarrior, which you substitute for the one in an already existing installation.

So far, DataWarrior's identification of similar/identical molecules in two sets of molecules worked best for me when starting with two .dwar files, each containing only the structures of one set without any further computed data. Then, there was no ambiguity in selecting the column in question when running an instance with one set and reaching out for the other set to compare with by Chemistry -> Find Similar Compounds In File. This differs from a copy-paste of structures from one .dwar into another .dwar (in the parlance of DataWarrior: append data, without header).

Norwid

[1] https://openmolecules.org/datawarrior/download.html
[2] https://openmolecules.org/datawarrior/dw550x.zip
[3] https://openmolecules.org/datawarrior/dw550win.zip
Re: Comparing Structure Files [message #1537 is a reply to message #1536] Wed, 09 March 2022 22:52
biolive31!!
Norwid,
thank you, I am going to install the ongoing development version.
I will also try to test the similarity tool with the .dwar files.

I will keep you posted.
Thank you for your help.
Olivier
Re: Comparing Structure Files [message #1538 is a reply to message #1537] Thu, 10 March 2022 10:58
biolive31!!
Dear Norwid,

I'm sorry, but the exact-structure similarity tool is still not working with the .jar file I replaced in the DataWarrior Windows program folder, even when comparing the simple (two-column) .dwar files.

Olivier
Re: Comparing Structure Files [message #1539 is a reply to message #1538] Thu, 10 March 2022 19:37
nbehrnd
Dear Olivier,

the comparison by «Structure[Exact]» is stricter (think of identity) than the one by «Structure[FragFp]» (think of similarity), though DW allows both to be attenuated/adjusted. The set_A and set_B mentioned above yield an overlap only by similarity, and only if the threshold is less than or equal to 0.84.
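
The effect of the threshold can be illustrated with a minimal sketch. It assumes a Tanimoto-style coefficient on fragment fingerprints (shared features divided by features present in either molecule); the fingerprints, the names `tanimoto` and `find_similar`, and the two toy sets are hypothetical and not taken from DataWarrior. A threshold of 1.0 only stands in for identity here; how DataWarrior evaluates «Structure[Exact]» internally is not described in this thread.

```python
def tanimoto(fp_a: frozenset, fp_b: frozenset) -> float:
    """Shared features divided by features present in either fingerprint."""
    union = fp_a | fp_b
    if not union:
        return 1.0  # two empty fingerprints are treated as identical in this sketch
    return len(fp_a & fp_b) / len(union)


def find_similar(queries: dict, references: dict, threshold: float) -> list:
    """Return (query, reference, similarity) for every pair at or above the threshold."""
    hits = []
    for q_name, q_fp in queries.items():
        for r_name, r_fp in references.items():
            similarity = tanimoto(q_fp, r_fp)
            if similarity >= threshold:
                hits.append((q_name, r_name, round(similarity, 2)))
    return hits


# Toy fingerprints: each integer stands for one hypothetical fragment key.
set_a = {"a1": frozenset({1, 2, 3, 4, 5, 6}), "a2": frozenset({10, 11, 12})}
set_b = {"b1": frozenset({1, 2, 3, 4, 5, 7}), "b2": frozenset({10, 11, 12})}

exact = find_similar(set_a, set_b, threshold=1.0)   # only identical fingerprints pass
loose = find_similar(set_a, set_b, threshold=0.70)  # a1/b1 (5 of 7 features shared) passes too
print(exact)
print(loose)
```

With these toy sets, the pair a1/b1 appears only once the threshold drops to 5/7 (about 0.71) or lower, which mirrors how an overlap can show up only below a certain cut-off.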

Norwid
Re: Comparing Structure Files [message #1580 is a reply to message #1539] Tue, 05 April 2022 16:47
thomas
Messages: 646
Registered: June 2014
Senior Member
Dear Olivier and Norwid,

I am sorry that it took such a long time for me to realize the issue here. When comparing an open window against an SD-file, and if columns shall be copied from the SD-file into the open window, DataWarrior used to parse the SD-file to the end in order to create a list of existing columns. When it then started comparing molecules, it didn't find any, because it was already at the end of the SD-file; therefore, it never found matching molecules in this case. This should now be fixed in the current dev version.
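
The failure mode described above can be reproduced in a few lines. This is only an illustrative sketch in Python, not DataWarrior's code: the tiny SD-file text and the helper names `list_columns` and `count_records` are made up, and rewinding the stream is just one possible remedy (the thread does not say how the fix was actually implemented).

```python
import io

# Two minimal, hypothetical SD-file records; "$$$$" terminates each record
# and "> <NAME>" introduces a data field (a column).
SD_TEXT = (
    "mol_1\n  sketch\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
    "> <ID>\n1\n\n$$$$\n"
    "mol_2\n  sketch\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
    "> <ID>\n2\n\n> <Name>\nexample\n\n$$$$\n"
)


def list_columns(stream) -> list:
    """Collect data-field names; this reads the stream all the way to its end."""
    names = []
    for line in stream:
        if line.startswith("> <"):
            name = line[3:line.index(">", 3)]
            if name not in names:
                names.append(name)
    return names


def count_records(stream) -> int:
    """Count records from the stream's current position onward."""
    return sum(1 for line in stream if line.strip() == "$$$$")


stream = io.StringIO(SD_TEXT)
columns = list_columns(stream)                  # consumes the whole stream
records_without_rewind = count_records(stream)  # nothing left to read
stream.seek(0)                                  # one possible remedy: start over
records_after_rewind = count_records(stream)
print(columns, records_without_rewind, records_after_rewind)
```

Without the rewind, the second pass sees no records at all, which matches the symptom that no molecule was ever reported as similar once columns were selected for copying.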

Thomas