handling ultra large space [message #1950] Thu, 06 July 2023 00:07
jrs11
Messages: 1
Registered: July 2023
Junior Member
Wondering if there is a plan (or an easy way to change some flags and build a local Windows version) so that DW can start to handle ultra-large libraries? So far I find the program becomes unresponsive somewhere between 0.5 and 2 million compounds, while REAL space alone is 38 billion. Or perhaps there is a way to work with ZINC22, either via API or locally? eMolecules is at 7.3 trillion...

So perhaps on that note, how about a method to do structure-based searching (i.e. substructure, not Tanimoto similarity etc.) through a database of reactions and building blocks? That way we would only enumerate the region of chemical space that contains the functionalities of interest. The search could quickly calculate the approximate number of anticipated results as a sanity check for whether the query is too broad.
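(The sanity check described above could be as simple as multiplying the per-component building-block hit counts: once each reaction component's list is filtered to the blocks carrying the functionality of interest, the product of the counts is an upper bound on the enumerated library. A minimal sketch; the function name and the `max_products` cutoff are illustrative, not part of any DataWarrior API.)

```python
# Sketch of the "anticipated results" sanity check: the worst-case size
# of an enumerated library is the product of the per-component counts of
# matching building blocks. `max_products` is an arbitrary example cutoff.

from math import prod

def estimate_enumeration(component_hit_counts, max_products=10_000_000):
    """Return (estimate, ok): the upper bound on enumerated products and
    whether it falls under the cutoff (False means the query is too broad)."""
    estimate = prod(component_hit_counts)
    return estimate, estimate <= max_products
```

For example, an amide-coupling query matching 1200 acids and 850 amines would report at most 1,020,000 products and pass, whereas 50,000 x 40,000 hits would flag the query as too broad before any enumeration starts.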

Basically I am trying to avoid paying BioSolveIT, especially since they don't do substructure searching anyway.

But more importantly, I am noticing that vendors are becoming more reticent to provide files, including reagent and reaction files, which limits our ability to enumerate their libraries locally... The cynic in me wonders whether commercial agreements are being put in place to make it harder to do this kind of work locally.