openmolecules.org

 
Macro for evolutionary library (EL) in DataWarrior (DW) [message #1696] Tue, 09 August 2022 14:42
Jo W
Messages: 34
Registered: July 2021
Member
I want to be able to copy any structure from an existing table in DW into an EL using a macro.

When I create a macro to do this, it works fine.

However, when I try to use this same macro on a new structure, it doesn't work. The macro just uses the previous structure and ignores the new one.
How do you get around this?

Ultimately, I want to be able to run several ELs in a "chain" using a macro, ideally starting from any structure, to help create a virtual library of new compounds.
Many thanks in advance
Jon
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1697 is a reply to message #1696] Tue, 09 August 2022 22:06
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Dear Jon,

when generating an evolutionary library, DataWarrior offers (at least) two places to use a set of molecules from a .sdf or .dwar file as reference; in the screenshot below, they are marked (A) and (B).

[Attached screenshot: /forum/index.php?t=getfile&id=581&private=0]

Should your question be understood along the lines of «instead of the molecules of one EL being similar enough to a set of, e.g., 20 molecules defined in box A, you would rather generate 20 ELs, where each of these reference molecules is the unique seed of an individual EL (more like an iteration)»?

Anyway, a .dwar file and a .dwam macro representing a minimal working example (even if it currently stops after the first reference molecule) would be a welcome complement to your question.

Norwid

[Updated on: Tue, 09 August 2022 22:19]


Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1700 is a reply to message #1697] Sun, 21 August 2022 19:31
Jo W
Messages: 34
Registered: July 2021
Member
Hi Norwid and Thomas
Thanks for trying to solve the problem, including the suggestion of using a file (which in this case did not solve the issue).

However, that's not quite what I had in mind, so please see the attached Word document (with images) for more information.

Basically, it seems that you can't paste structures (or SMILES) into an Evolutionary Library window using a macro.

If I am correct, is there any way to change this? It would be really useful to have a chain of ELs running via a macro.


Many thanks in advance

Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1701 is a reply to message #1700] Mon, 22 August 2022 15:11
thomas
Messages: 711
Registered: June 2014
Senior Member
I will do something and let you know...
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1702 is a reply to message #1701] Mon, 22 August 2022 17:57
Jo W
Messages: 34
Registered: July 2021
Member
Here is the code from the macro that I used:
<macro name="Macro EL Help X">
<task name="selectView">
viewName=Table
</task>
<task name="buildEvolutionaryLibrary">
startSet=fb}@`@@YRYVum[ehrRkNBf@BjjjjJ`@@
survivalCount=8
kind=drugs
paramConfig0=structure  similar     SkelSphere s 100   fb}@`@@YRYVum[ehrRkNBf@BjjjjJ`@@
paramCount=1
generationCount=automatic
generationSize=128
</task>
<task name="selectWindow">
viewName=Evolutionary Library
</task>
<task name="selectView">
viewName=Table
</task>
<task name="sortRows">
column=Fitness
selectedFirst=false
descending=false
</task>
<task name="selectView">
viewName=Table
</task>
<task name="buildEvolutionaryLibrary">
startSet=fbm@`@@YrJIJEJJIRKTXxOdp@UUUUQQIQTMOP@
survivalCount=8
kind=drugs
paramConfig0=structure  similar     SkelSphere s 100   fbm@`@@YrJIJEJJIRKTXxOdp@UUUUQQIQTMOP@
paramCount=1
generationCount=automatic
generationSize=128
</task>
<task name="selectWindow">
viewName=Evolutionary Library
</task>
<task name="selectView">
viewName=Table
</task>
<task name="sortRows">
column=Fitness
selectedFirst=false
descending=false
</task>
<task name="selectView">
viewName=Table
</task>
<task name="sortRows">
column=Fitness
selectedFirst=false
descending=true
</task>
<task name="selectView">
viewName=Table
</task>
<task name="buildEvolutionaryLibrary">
startSet=fbmAB@L`dLbbRaRbbTbuFNCyL@EUUUTTRTUCSt@@
survivalCount=8
kind=drugs
paramConfig0=structure  similar     SkelSphere s 100   fbmAB@L`dLbbRaRbbTbuFNCyL@EUUUTTRTUCSt@@
paramCount=1
generationCount=automatic
generationSize=128
</task>
<task name="selectWindow">
viewName=Evolutionary Library
</task>
<task name="selectView">
viewName=Table
</task>
</macro>

Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1703 is a reply to message #1702] Tue, 23 August 2022 21:58
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Dear Jon,

while converting the macro you posted as text into a .dwam file (attached below for convenience of further use), I noticed multiple pairs in which a «seed» giving a molecule's structure as an idcode (the line starts with "startSet=") is followed by another line beginning with "paramConfig0". Counting the first line as line 1, this pattern occurs three times in the present .dwam, i.e. for lines (6, 9), (29, 32), and (60, 63).

At the same time, the second line of each pair sets a high SkelSphere threshold of 100, e.g.

[Attached screenshot: /forum/index.php?t=getfile&id=584&private=0]

My speculation is that the requested similarity might be too strict to yield multiple (new) molecules, perhaps especially because (based on the screenshots you submitted) there is only one molecule defining the fitness of the molecules to retain for a report or subsequent use.

Norwid
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1704 is a reply to message #1703] Tue, 23 August 2022 23:28
Jo W
Messages: 34
Registered: July 2021
Member
Hi Norwid (and Thomas)

I have been using Notepad to edit the macro and found that DataWarrior saves the structure in the macro as an idcode.

The Evolutionary Library appears three times because I am attempting to make it recursive, for which a minimum of two evolutionary libraries is necessary; the third was just a demonstration of it not working. The SkelSpheres value may be too high, but in this example the focus is on having the macro "copy and paste" into "startSet=...". The specifics of the weighting can be addressed at a later date.

Thanks for pointing out the restriction on the molecules I am generating and the fact that the target fitness is the same as the original. However, the objective is still to get "copying and pasting" working inside the macro's view of the evolutionary window.

Even if the SkelSpheres structure similarity were at '60' and the parameters for the desired structure were different, I would still have the same issue, as I have tried configuring these in multiple ways before.
So any further help from you and Thomas on enabling copying and pasting in an EL using a macro would be appreciated.

Thanks, Jon
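
Since the macro is saved as plain text, the manual Notepad edit described above can also be scripted. What follows is only a minimal sketch in Python, assuming the macro keeps the line layout posted earlier in this thread (one "startSet=" line per "buildEvolutionaryLibrary" task). The function name set_start_structure and its arguments are hypothetical, and whether DataWarrior accepts a file edited this way has not been verified here. The "paramConfig0" line, which repeats the idcode in the posted macro, is deliberately left untouched.

```python
def set_start_structure(macro_text, new_idcode, occurrence=0):
    """Replace the idcode of the n-th 'startSet=' line (0-based) in macro text.

    Assumption: each start structure sits on its own line beginning with
    'startSet=', as in the macro posted in this thread.
    """
    prefix = "startSet="
    lines = macro_text.splitlines()
    seen = -1
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            seen += 1
            if seen == occurrence:
                lines[index] = prefix + new_idcode
                return "\n".join(lines)
    raise ValueError("macro has no startSet line number %d" % occurrence)
```

Reading the .dwam file into a string, calling set_start_structure(text, idcode, occurrence=1) and writing the result back would swap only the second start structure and leave all other tasks as recorded.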
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1706 is a reply to message #1704] Fri, 26 August 2022 15:25
thomas
Messages: 711
Registered: June 2014
Senior Member
I just released an update that allows deferred 1st generation building: in the e-lib dialog you can choose to build the 1st generation at macro execution time using various options, e.g. molecules from the clipboard, a file, or a structure column from the current window, optionally defining a subset by the selection or a row list. I have tested a lot of things, but the implementation involved a lot of code and changes. Thus, please test and let me know if you experience unexpected behaviour. Have fun, Thomas
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1707 is a reply to message #1706] Fri, 26 August 2022 15:40
thomas
Messages: 711
Registered: June 2014
Senior Member
Hi JonW,

One more remark: after sorting a generated e-lib by fitness and before feeding those compounds into a new start generation, it would probably be useful to keep just the best n compounds for the new start set. I didn't see a task to achieve that in your macro. One way of doing that would be to add row numbers after sorting and then apply a filter on the row number column, e.g. from 1-32, then save the visible rows as the file to be used for the next start set. Alternatively, you could select the table view, select all (visible) rows, copy the view content (which in this case would be the entire TAB-delimited table) and use the selection as the new 1st generation.

I noticed that a task to copy the n best molecules, given n and the column that decides what is best, would be very handy here...

Hope this was more or less clear...

Thomas
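
The "best n" selection described above can also be sketched outside DataWarrior. The following is a minimal Python example, assuming the copied view content is TAB-delimited text with a header row and a numeric column named "Fitness" (the column name is taken from the sortRows task in the macro posted earlier; the header row, the function name best_n_rows and its parameters are assumptions, not DataWarrior features).

```python
import csv
import io


def best_n_rows(tsv_text, n, column="Fitness", descending=True):
    """Return the n best rows of a TAB-delimited table as dictionaries.

    Assumptions: the first line holds the column titles and every value
    in the ranking column can be parsed as a number.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    rows.sort(key=lambda row: float(row[column]), reverse=descending)
    return rows[:n]
```

With n set to 32 this mirrors the row number filter from 1-32 mentioned above; the returned rows could then be written to a file that serves as the next start set.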

[Updated on: Fri, 26 August 2022 15:42]


Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1710 is a reply to message #1707] Sat, 27 August 2022 14:12
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Dear Thomas,

with the current update of DW (packaged 2022-08-27 12:08 for Linux/MacOS), the «self-reference» is unable to start. As a test, I first let DW generate 5 random molecules and saved the result as a .dwar file. Subsequently, still within the same file, the attempt to pick these structures as the start for an EL fails:

[Attached screenshot: /forum/index.php?t=getfile&id=586&private=0]

However, one may argue that populating the «1st generation molecules» window right away would be at least as valuable as picking from an external .sdf or .dwar file (both of which are accepted as input).

Marking every entry in the structure column by individual click (in my installation, the cell's background then darkens) and trying again to use these entries of the same, currently open file via «structure column» fails as well.

Norwid
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1711 is a reply to message #1710] Sat, 27 August 2022 15:06
thomas
Messages: 711
Registered: June 2014
Senior Member
Dear Norwid,

many thanks. I overlooked that. This check should only have been done when a list was selected. I have just fixed it.
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1712 is a reply to message #1711] Sat, 27 August 2022 17:20
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Dear Jon,

thanks to Thomas' new update of DW for Mac/Linux (packaged 2022-08-27 15:02), I generated a new (tiny) set of random molecules (file Random_Molecules.dwar) and used it to record a toy macro (1to1.dwam) that loops over the first three entries of the molecule library, picking one molecule at a time as the 1st generation molecule of an evolutionary library. For now, only a computationally affordable fitness criterion was set (molecular weight below 400 g/mol).

When I run this macro*, it generates three individual evolutionary libraries. Instead of a set of multiple molecules, only one molecule serves as the seed of each new collection.

Does this better match the intent you described on August 9th? If so, one could now think about further work in this direction: 1) an approach which doesn't stop after the third cycle, and 2) addressing the seed molecules by row number in Random_Molecules.dwar instead of the current pick by structure.

Norwid


*) DW 5.5.0 with the afternoon update 2022-08-27 15:02 on Linux Debian 12/bookworm. Both Random_Molecules.dwar and the macro 1to1.dwam were in the same folder (Desktop) during the test run. The paths for the newly written files in 1to1.dwam (here three lines), e.g.

fileName=/home/norwid/Desktop/Evolutionary_Library_01.dwar

need to be adjusted to a location accessible in your environment.
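
Because the macro is plain text, the path adjustment can be done in any text editor; as an illustration, it can also be scripted. A minimal Python sketch follows, assuming every output path sits on its own line starting with "fileName=" as in the example above. The function name retarget_file_names and the sep parameter are hypothetical, and which path separator DataWarrior expects on a given operating system is not verified here.

```python
import re


def retarget_file_names(macro_text, new_folder, sep="/"):
    """Point every 'fileName=' line at new_folder, keeping each file's own name."""
    prefix = "fileName="
    folder = new_folder.rstrip("/\\")  # avoid a doubled separator
    result = []
    for line in macro_text.splitlines():
        if line.startswith(prefix):
            # The last path component is the file name, whatever the separator.
            base_name = re.split(r"[\\/]", line[len(prefix):])[-1]
            line = prefix + folder + sep + base_name
        result.append(line)
    return "\n".join(result)
```

Lines that do not start with "fileName=" are passed through unchanged, so the rest of the recorded macro stays as it was.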
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1716 is a reply to message #1706] Tue, 30 August 2022 17:56
Jo W
Messages: 34
Registered: July 2021
Member
Hi Thomas
Thank you very much for making the changes. I would like to try the update, but I am a chemist, not a programmer.

So what do I do?

Are you saying that the latest download includes the changes you have made, i.e. I simply click the download button for Windows at https://openmolecules.org/datawarrior/download.html and use this version?

Or are you saying that I have to download a "patch"? If so, where do I get it from and how do I install it?

Many thanks
Jon
I am using a PC with Windows
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1717 is a reply to message #1716] Tue, 30 August 2022 19:56
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Dear Jon,

on https://openmolecules.org/datawarrior/download.html, confirm «I have read and understood the disclaimer» below the rectangles, which then turn green. This reveals additional text below with links to the archives for updating DW. If you use Windows, the link address for the updates is https://openmolecules.org/datawarrior/dw550win.zip. Decompress the download (ca. 30 MB) and use its content (DataWarrior.exe, jniinchi.jar, and mmtf.jar) to replace the files of the same name in your existing installation of DataWarrior. With that, you have the update and can restart DW.

Norwid
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1722 is a reply to message #1717] Sun, 04 September 2022 19:57
Jo W
Messages: 34
Registered: July 2021
Member
Dear Norwid
Many thanks for the link. A few questions:

1. Is the link you provided the same for any new updates/modifications that could be undertaken in the future or will a new link be created each time an update is made?
If it is the latter how do we know what the new link will be?

2. With regard to the download page at: https://openmolecules.org/datawarrior/download.html

This presumably has only the older "official version", i.e. DataWarrior V5.5.0 from April 2021, and does not include any updates since then. Is that correct?

3. It may be useful to keep this older version in a separate folder in case one prefers some of the older settings (e.g. for running a specific macro for which a newer update might not function correctly). My question is - may an older version and an updated version on the same hard disc cause some stability issues or other problems?

Best regards
Jon
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1723 is a reply to message #1707] Mon, 05 September 2022 01:39
Jo W
Messages: 34
Registered: July 2021
Member
Dear Thomas
Many thanks for the suggestions and for updating the software. I will try the changes this week and see how the macro performs.

Are there any published papers or further information on using the EL in DW, apart from the online manual and a brief tutorial on YouTube?
Best regards
Jon
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1724 is a reply to message #1723] Tue, 06 September 2022 08:05
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Dear Jon,

the links next to the penguin, the apple, and the Windows symbol, above «I have read and understood the disclaimer» (they turn into green rectangles after your click), are the links to the installers of DW for Linux, MacOS, and Windows, respectively. They provide the basic version of the program (at present, version 5.5.0) for major releases.

The address https://openmolecules.org/datawarrior/dw550win.zip only provides an archive with the nuts and bolts that have changed since the release of the Windows installer. This subset of files alone will not suffice to use DW. The very same address is used, unchanged, for every update of the Windows version of DW. Speaking for the MacOS/Linux updates I use, an update happens either about once every 14 days (regular interval), or once a bug has been identified and corrected. In the latter case, uploads are updated as soon as possible (on occasion, several times per day). This is why I keep a local copy of the installer and of a few updater archives.

This makes it possible to revert to an earlier minor version of DW should there be a need.[1] To do so, uninstall DW completely, use the installer to install the major version 5.5.0 again, and finally run the latest updater known to work well enough. Because the updater packages replace parts of DW's inner workings and I use only one installation directory for DW, only one version of DW is installed (and most often working well).

So far, I have had no need for multiple installations of DW (a minor release from June, another from August, etc.) to be present simultaneously. First, using them concurrently would require additional guards to keep them separate in the computer's working memory (as in individual sandboxes[2] or virtual machines[3]). Second, a user-written macro should work regardless of the minor version of DW used to write/record it, and irrespective of whether the underlying operating system is Windows, MacOS, or Linux. Though I am only aware of two video recordings,[4,5] I think Isabelle Giraud's workshop at RSC Open Chemical Sciences in 2020 set another example of exchanging DW macros freely; they are meant to ease exchange among the interested. In case they do not work on more recent versions of DW, file a bug report; in case they don't work on an older version of DW, perhaps the functionality in question wasn't implemented yet, and you may get help here. Directory addresses, if the macro uses them (many don't), may require an adjustment; yet because macros are plain text, this is a one-time edit that can be done with a text editor.

Norwid

[1] example https://openmolecules.org/forum/index.php?t=msg&th=609&start=0&
[2] https://en.wikipedia.org/wiki/Sandbox_(software_development)
[3] https://en.wikipedia.org/wiki/Virtual_machine
[4] https://www.youtube.com/watch?v=mQCf9GakQW0
[5] https://www.youtube.com/watch?v=Is2hLqqSFvM
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1726 is a reply to message #1724] Thu, 08 September 2022 16:17
Jo W
Messages: 34
Registered: July 2021
Member
Hi Norwid
If I don't use different versions of DW simultaneously, i.e. only use one at a time, can I still have two different versions on my PC?

Thanks for the links. These I already know. So my initial question remains: are there any published papers or further information on using the EL (not on using macros, but on the evolutionary library itself) in DW, apart from the online manual and a brief tutorial on YouTube?
I would like to understand the EL much better.
Many thanks
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1727 is a reply to message #1723] Thu, 08 September 2022 16:22
Jo W
Messages: 34
Registered: July 2021
Member
Hi Thomas
Many thanks - the EL now works with the macro.
Thank you so much for your efforts.

With regard to my previous question, and following Norwid's response:

Are there any published papers or further information on using the EL in DW, apart from the online manual and a brief tutorial on YouTube?

Or, failing that, is there a good paper on evolutionary libraries, or even better some simple YouTube videos that you know of, that would help, ideally specifically in understanding how the EL works in DW?
Best regards
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1728 is a reply to message #1727] Thu, 08 September 2022 20:15
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Dear Jon,

the safest approach is to run the installer and then updater A for version A. If you then want to check version B: 1) uninstall DW, 2) run the installer, and 3) run the updater for version B. This is a precaution, because I do not know whether updaters A and B contain the same libraries by name, so that the updated libraries of updater A are simply exchanged for those of updater B (like different tires on a car), or whether they introduce changes in other components of DW.

As for external literature references on evolutionary libraries and DataWarrior, my answer is no: I do not have a good, i.e. tutorial-like, literature reference at hand. Ningsih et al.[1] describe that they used it, but address readers who are already in the know:

[Attached screenshot: /forum/index.php?t=getfile&id=594&private=0]

(loc. cit. p. 18)

[1] Ningsih, E.G., Hidayat, M.F., Tambunan, U.S.F. (2019). Fragment-Based Drug Design to Discover Novel Inhibitor of Dipeptidyl Peptidase-4 (DPP-4) as a Potential Drug for Type 2 Diabetes Therapy. In: Rojas, I., Valenzuela, O., Rojas, F., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2019. Lecture Notes in Computer Science, vol 11465. Springer, Cham, pages 14–24. https://doi.org/10.1007/978-3-030-17938-0_2 (paywall).

[Updated on: Thu, 08 September 2022 22:09]


Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1731 is a reply to message #1728] Sun, 11 September 2022 15:33
thomas
Messages: 711
Registered: June 2014
Senior Member
Dear Norwid,

many thanks for answering all the questions already...

Dear Jon,

let me quickly add some comments to two questions:

If you want to keep running the newest version: install the original 5.5.0 version from 2021 once (which you have already done). Once in a while, download dw550win.zip (Mac or Linux: dw550x.zip) using the small-print link on the official DataWarrior download page. Unpack the file to obtain a few replacement files. Make a backup copy of the respective original files with the same names, then move the files from the zip archive to the DataWarrior installation folder, where they replace the original files. To move back to an earlier version, you can just copy an earlier set of files, or the saved original files, back to the DataWarrior installation folder.
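
For those who prefer to script the backup-and-replace step described above, here is a minimal Python sketch using only the standard library. It assumes the archive's files belong directly in the installation folder (as with the DataWarrior.exe, jniinchi.jar, and mmtf.jar files mentioned earlier in the thread); the function name apply_update and all three path arguments are hypothetical, DataWarrior should be closed while it runs, and writing into the installation folder may need administrator rights.

```python
import shutil
import zipfile
from pathlib import Path


def apply_update(zip_path, install_dir, backup_dir):
    """Back up files of the same name, then replace them from the archive."""
    install_dir = Path(install_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    replaced = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Assumption: files go flat into the installation folder.
            name = Path(info.filename).name
            target = install_dir / name
            if target.exists():
                shutil.copy2(target, backup_dir / name)  # keep the original
            with archive.open(info) as source, open(target, "wb") as destination:
                shutil.copyfileobj(source, destination)
            replaced.append(name)
    return replaced
```

Using a separate, dated backup folder for each update keeps several earlier file sets at hand; copying one of them back into the installation folder reverts to that version.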

I am aware that the manual does not explain well how to use the evolutionary library and how it works. In particular, the new fitness criteria PheSA and docking have a lot of potential. The best way to understand what it really does is to look at the source code, but admittedly that is not for everybody, and it still takes some time even for someone fluent in Java. I also realized that the current algorithm could be improved by reusing fragments of previous high-ranking structures. The current principle is rather simple: apply simple modifications to the best structures of the previous generation, rank the results, and use the best-ranking molecules for further modifications. Modifications are selected randomly from a complete list of available modifications. The random selection is not completely random: a modification is the more likely, the more drug-like (or natural-product-like) its product is. This ensures that the created molecules make sense in this regard. The central Java class that performs the changes is called Mutator and can be found here. It lists the kinds of changes possible and also contains the few lines of code that actually perform the change in the molecule. The rest is basically how to calculate fitness, which is a weighted sum of user-selected parameters. The more complex ones are Flexophore similarity (DOI:10.1021/ci700359j), PheSA (paper will come) and docking.
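
To make the principle concrete, here is a toy sketch in Python. It is not DataWarrior's Mutator class and does not handle molecules: structures are stand-in strings over the letters C, N and O, likeness() is a made-up substitute for a drug-likeness score, and letting parents compete with their children is an assumption. Only the overall loop follows the description above (modify the best structures, choose products with a bias towards "likeness", rank by a weighted-sum fitness, keep the best); the defaults 128 and 8 echo generationSize and survivalCount from the macro posted earlier.

```python
import random

ALPHABET = "CNO"  # toy "atoms"; real structures are molecules, not strings


def modifications(parent):
    """Return every single-step modification of a toy structure string."""
    products = set()
    for i in range(len(parent)):
        for atom in ALPHABET:
            products.add(parent[:i] + atom + parent[i + 1:])  # substitute
        if len(parent) > 1:
            products.add(parent[:i] + parent[i + 1:])  # delete
    for atom in ALPHABET:
        products.add(parent + atom)  # append
    products.discard(parent)
    return sorted(products)  # sorted so that runs are reproducible


def likeness(structure):
    """Hypothetical stand-in for a drug-likeness score; always positive."""
    return 1.0 + structure.count("C") / len(structure)


def fitness(structure, criteria):
    """Weighted sum of the selected criteria, given as (weight, function) pairs."""
    return sum(weight * score(structure) for weight, score in criteria)


def evolve(start_set, criteria, generations=10, generation_size=128,
           survival_count=8, seed=0):
    """Modify the best structures, rank the results, keep the best, repeat."""
    if not start_set:
        raise ValueError("start_set must contain at least one structure")
    rng = random.Random(seed)

    def rank(pool):
        return sorted(pool, key=lambda s: (-fitness(s, criteria), s))

    survivors = rank(set(start_set))[:survival_count]
    for _ in range(generations):
        pool = set(survivors)  # assumption: parents compete with their children
        for _ in range(generation_size):
            parent = rng.choice(survivors)
            products = modifications(parent)
            # Not uniformly random: products with higher likeness are more likely.
            weights = [likeness(p) for p in products]
            pool.add(rng.choices(products, weights=weights, k=1)[0])
        survivors = rank(pool)[:survival_count]
    return survivors


if __name__ == "__main__":
    # Two toy criteria: prefer length 6, reward "N" atoms with half the weight.
    toy_criteria = [(1.0, lambda s: -abs(len(s) - 6)),
                    (0.5, lambda s: s.count("N"))]
    print(evolve(["C"], toy_criteria))
```

Chaining ELs, as discussed in this thread, corresponds to passing the survivors of one evolve() call as the start_set of the next, possibly with different criteria.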

Hope this is somewhat useful...

Thomas
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1732 is a reply to message #1731] Mon, 12 September 2022 00:21
Jo W
Messages: 34
Registered: July 2021
Member
Dear Thomas
Thanks, and yes, PheSA is interesting and it's been fun experimenting with this feature.
I look forward to the paper on this. Any idea when it will be published?
Re: Macro for evolutionary library (EL) in DataWarrior (DW) [message #1738 is a reply to message #1732] Tue, 13 September 2022 21:45
thomas
Messages: 711
Registered: June 2014
Senior Member
It is not in my hands, but it is in preparation and I hope it will be out early next year...