openmolecules.org Forum: Functionality » molecular complexity descriptor

Home » DataWarrior » Functionality » molecular complexity descriptor (descriptor missing in the version 4.4.3)

molecular complexity descriptor [message #204]

Wed, 02 November 2016 22:37

dataviz
Messages: 5
Registered: November 2016

Junior Member

Hello
I just read a recent review in DDT 2016 (O Mendez-Lucio et al) about molecular complexity computed with different tools, including DataWarrior. I asked the authors, and they used version 4.2.2 to compute this descriptor, but it seems to be missing from version 4.4.3. Since this parameter seems valuable, would it be possible to add it back?
many thanks
Bruno

Re: molecular complexity descriptor [message #206 is a reply to message #204]

Sat, 05 November 2016 23:49

thomas
Messages: 742
Registered: June 2014

Senior Member

This was due to an unfortunate bug in version 4.4.3. The checkbox disappeared, but it will be back in an update in a few days.

Thomas

Re: molecular complexity descriptor [message #417 is a reply to message #206]

Sat, 03 November 2018 11:03

timritchie
Messages: 15
Registered: February 2015
Location: St Albans, UK

Junior Member

Hello,
How is the complexity descriptor calculated? Which structural features are included?
Thanks and regards,
Tim Ritchie.

Re: molecular complexity descriptor [message #419 is a reply to message #417]

Sun, 04 November 2018 01:13

thomas
Messages: 742
Registered: June 2014

Senior Member

Hi Tim,
the complexity calculation is conceptually very simple but computationally demanding. The original version counts the distinct structural fragments that can be constructed from a molecule just by cutting parts off. When doing this, all delocalized bonds are retained, i.e. marked as such. Each fragment is then converted into a canonical code and added to the list if it is new. In principle, the fragment count grows exponentially with the size of the molecule. We therefore normalize the absolute fragment count by taking its logarithm and dividing it by the molecule size. The more distinct fragments, the more complex the molecule. By this logic, molecules with many symmetrical (i.e. equivalent) atoms or substituents, or with many recurring sub-structures, are of low complexity.
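To make the brute-force idea above concrete, here is a minimal Python sketch on a toy molecular graph. It is not the DataWarrior code. The function names, the graph representation (a list of element symbols plus `(atom, atom, bond_type)` tuples), the naive permutation-based canonical code, the choice to count only fragments with at least one bond, and the use of the bond count as "molecule size" are all assumptions made for illustration. It is only practical for very small molecules.

```python
from itertools import combinations, permutations
from math import log


def is_connected(bonds):
    """Return True if the given bonds form one connected fragment."""
    atoms = {a for bond in bonds for a in bond[:2]}
    seen = set()
    stack = [next(iter(atoms))]
    while stack:
        atom = stack.pop()
        if atom in seen:
            continue
        seen.add(atom)
        for a, b, _ in bonds:
            if a == atom:
                stack.append(b)
            elif b == atom:
                stack.append(a)
    return seen == atoms


def canonical_code(elements, bonds):
    """Toy canonical code: the smallest relabelled form over all atom orderings."""
    atoms = sorted({a for bond in bonds for a in bond[:2]})
    best = None
    for perm in permutations(range(len(atoms))):
        new_index = dict(zip(atoms, perm))
        labels = [None] * len(atoms)
        for atom in atoms:
            labels[new_index[atom]] = elements[atom]
        relabelled = sorted(
            (min(new_index[a], new_index[b]), max(new_index[a], new_index[b]), kind)
            for a, b, kind in bonds
        )
        candidate = (tuple(labels), tuple(relabelled))
        if best is None or candidate < best:
            best = candidate
    return best


def count_distinct_fragments(elements, bonds):
    """Count distinct connected fragments (assumption: at least one bond each)."""
    codes = set()
    for size in range(1, len(bonds) + 1):
        for subset in combinations(bonds, size):
            if is_connected(subset):
                codes.add(canonical_code(elements, subset))
    return len(codes)


def brute_force_complexity(elements, bonds):
    """log(distinct fragments) / molecule size (assumption: size = bond count)."""
    return log(count_distinct_fragments(elements, bonds)) / len(bonds)


# Heavy-atom skeletons only; "single" is a hypothetical bond-type label.
chain = [(0, 1, "single"), (1, 2, "single"), (2, 3, "single"), (3, 4, "single")]
diethyl_ether = (["C", "C", "O", "C", "C"], chain)        # symmetrical
methyl_propyl_ether = (["C", "O", "C", "C", "C"], chain)  # unsymmetrical

print(count_distinct_fragments(*diethyl_ether))        # 6
print(count_distinct_fragments(*methyl_propyl_ether))  # 8
```

The symmetrical ether yields fewer distinct fragments than its unsymmetrical isomer, so it comes out as less complex, which matches the reasoning about equivalent substituents.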
For larger molecules, the complete enumeration of all sub-structures is rather demanding in terms of memory and time. DataWarrior therefore uses a fast, simplified version. We have found that limiting the number of bonds a fragment may have still gives a good estimator of the brute-force method's result. DataWarrior limits fragment generation to a maximum of 7 bonds and calculates the complexity as log(fragmentCount)/bondLimit, with bondLimit=7 unless the molecule has fewer than 14 bonds, in which case bondLimit is bondCount/2.
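The normalization step described above can be sketched in a few lines of Python. This only illustrates the formula as stated in this post, not the code in FastMolecularComplexityCalculator.java. The function names are hypothetical, the natural logarithm is an assumption (the base is not given here), and so is the use of non-rounded division for bondCount/2. The fragment count is taken as an input, since the bond-limited enumeration itself is not reproduced.

```python
from math import log

MAX_BOND_LIMIT = 7  # maximum fragment size in bonds, as stated in the post


def bond_limit(bond_count):
    """bondLimit = 7, or bondCount/2 for molecules with fewer than 14 bonds."""
    if bond_count < 1:
        raise ValueError("molecule needs at least one bond")
    if bond_count < 2 * MAX_BOND_LIMIT:
        return bond_count / 2  # assumption: no rounding applied
    return MAX_BOND_LIMIT


def fast_complexity(fragment_count, bond_count):
    """log(fragmentCount) / bondLimit (assumption: natural logarithm)."""
    if fragment_count < 1:
        raise ValueError("fragment count must be positive")
    return log(fragment_count) / bond_limit(bond_count)


# Hypothetical inputs: 1000 distinct fragments found in a molecule with 20 bonds.
print(fast_complexity(1000, 20))
```

Because the divisor is capped at 7, two large molecules are compared only by how many distinct fragments of up to 7 bonds they contain.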

You can find the source code in FastMolecularComplexityCalculator.java as part of the DataWarrior source code.

More detailed info is here:

von Korff M., Sander T. (2013) About Complexity and Self-Similarity of Chemical Structures in Drug Discovery. In: Stavrinides S., Banerjee S., Caglar S., Ozer M. (eds) Chaos and Complex Systems. Springer, Berlin, Heidelberg
