openmolecules.org

 
Methods to select diverse compounds using non-binary descriptors? [message #1330] Wed, 30 June 2021 13:37
Hi all,

I have a large library of compounds (100k) from which I'm trying to select a diverse set. The "select diverse compounds" function has worked for FragFp and Spheres fingerprints.

Selecting compounds based on conformational diversity as well may be too big a computational task for larger libraries. For this, the Flexophore descriptor worked very nicely on a smaller set of 2000 molecules: I clustered the set into 30 groups (which took 3 minutes), and the 30 "representative" molecules could then serve as a diverse subset. To me, clustering with a non-binary descriptor appears to be analogous to using the "select diverse set" command with a binary descriptor (FragFp, SpheresFp).

However, trying to cluster a library of 100k compounds into 180 groups by Flexophore gave me a waiting time of 500 hours... My setup is a Dell Precision 7920/Intel Xeon Gold 5122 4-core/NVIDIA Quadro P1000/64 GB RAM.
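
To make the "diverse subset from a non-binary descriptor" idea concrete, here is a minimal sketch of a generic greedy MaxMin selection that works with any pairwise similarity function. It is not DataWarrior's implementation: the function name `maxmin_select`, its parameters, and the toy similarity on plain numbers are assumptions made up for illustration, and a real Flexophore similarity would have to be supplied by whatever tool computes it. The point of the sketch is that picking k items this way needs roughly k × N similarity evaluations (about 18 million for 180 picks from 100,000 compounds) rather than all N × (N − 1) / 2 pairs.

```python
import math
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def maxmin_select(
    items: Sequence[T],
    k: int,
    similarity: Callable[[T, T], float],
    first: int = 0,
) -> List[int]:
    """Greedy MaxMin pick: return the indices of k mutually dissimilar items.

    `similarity` is any user-supplied function (hypothetical here); higher
    values mean "more alike". `first` is the index of the starting item.
    """
    n = len(items)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(items)")

    # nearest[i] = highest similarity between item i and any picked item.
    # Picked items are marked with +inf so they are never chosen again.
    nearest = [-math.inf] * n
    picked: List[int] = []
    nxt = first
    for _ in range(k):
        picked.append(nxt)
        nearest[nxt] = math.inf
        if len(picked) == k:
            break
        # One pass over the remaining items: about N similarity calls per pick.
        for i in range(n):
            if nearest[i] != math.inf:
                nearest[i] = max(nearest[i], similarity(items[i], items[nxt]))
        # Next pick: the item least similar to everything picked so far.
        nxt = min(range(n), key=nearest.__getitem__)
    return picked


if __name__ == "__main__":
    # Toy data: numbers stand in for molecules, and closeness for similarity.
    toy = [0.0, 0.1, 0.2, 5.0, 5.1, 10.0]
    toy_similarity = lambda a, b: 1.0 / (1.0 + abs(a - b))
    print(maxmin_select(toy, 3, toy_similarity))
```

The result depends on the starting item, and the approach gives a spread-out subset rather than cluster memberships, so it is only a substitute for clustering when the representatives are all that is needed.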

Has anyone here ever tried to do something similar?

I wondered whether there's a better approach in some manual way, but just calculating "maxsim(Flexophore)" on the set of 2000 already took 1.5 hours. I have a feeling this may be a fundamental limitation due to the complexity of calculating Flexophore similarity (relative to Tanimoto coefficients), but it would be a very useful tool for me, so I thought it worth asking.
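
As a rough back-of-the-envelope check on why this might be a scaling limit, the sketch below extrapolates the 1.5 hours measured on 2000 molecules to 100,000 molecules. It rests on two assumptions that the measurements above do not confirm: that the "maxsim" calculation is dominated by comparing every pair of molecules, and that the time per comparison stays the same on the larger set. The helper `pair_count` is a made-up name for illustration.

```python
def pair_count(n: int) -> int:
    """Number of distinct unordered pairs among n items."""
    return n * (n - 1) // 2


small_set = 2_000        # set size on which maxsim(Flexophore) was timed
large_set = 100_000      # size of the full library
hours_small = 1.5        # measured time for the small set

# Assumption: runtime grows in proportion to the number of pairs compared.
scale = pair_count(large_set) / pair_count(small_set)
hours_large = hours_small * scale

print(f"pairs (small): {pair_count(small_set):,}")
print(f"pairs (large): {pair_count(large_set):,}")
print(f"scale factor:  {scale:.0f}x")
print(f"extrapolated:  {hours_large:.0f} hours")
```

Under these assumptions, a 50-fold larger library means roughly 2500 times as many pairwise comparisons, so any method that needs all pairs becomes impractical well before 100k compounds.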

Thanks for any help :)

Best,
EJW515