Thanks so much for developing and supporting such a great tool for free. I've found it very useful in my research.

Patrick Fitzgerald

The shape index is based on the 2D graph of the molecule and looks at the two atoms that have the longest distance between them in terms of bonds. This length and the atom count of the molecule are used to calculate the shape index.
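To make the two inputs concrete, here is a minimal Python sketch. It finds the longest bond-count distance between any two atoms by breadth-first search and then relates it to the atom count. The exact formula is not stated above, so the ratio `(longest distance + 1) / atom count` and the function names are assumptions made for illustration only, as is the plain adjacency-dictionary representation of the molecule.

```python
from collections import deque


def longest_bond_distance(adjacency):
    """Largest shortest-path length (in bonds) between any two atoms."""
    longest = 0
    for start in adjacency:
        # Breadth-first search gives bond-count distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbor in adjacency[atom]:
                if neighbor not in dist:
                    dist[neighbor] = dist[atom] + 1
                    queue.append(neighbor)
        longest = max(longest, max(dist.values()))
    return longest


def shape_index_sketch(adjacency):
    """ASSUMED formula: atoms on the longest chain divided by atom count."""
    return (longest_bond_distance(adjacency) + 1) / len(adjacency)


# Heavy-atom graphs only: n-butane is a chain, isobutane is branched.
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(shape_index_sketch(butane))     # 1.0
print(shape_index_sketch(isobutane))  # 0.75
```

Under this assumed ratio a linear chain scores 1.0 and branching lowers the value, which matches the idea that the index only sees the bond graph and not the 3D conformation.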

The globularity is calculated from atom coordinates in conformers. It basically does a singular value decomposition to rotate the molecule in the coordinate system such that its width (size in x-direction) is maximized. Then the molecule is rotated around the x-axis until its height is maximized. After that, globularity is calculated as the molecule size in z-direction divided by the size in x-direction.

Shape index and globularity have a slight negative correlation, but one can hardly be used as a substitute for the other. I have just added a globularity calculation to DataWarrior (dev version). The results correlate (0.83) with the Nature paper's glob value. This is not a perfect correlation. I use up to 128 non-energy minimized conformers and average the result. If I minimize conformers with MMFF94s+ or reduce their number and therefore overpopulate low-energy conformers, then the correlation gets worse. Probably, the conformers used in the paper were random conformers with no bias towards low energy and most likely not forcefield minimized.

Hope this helps,

Thomas

I just deployed an update, because the recent version was based on the singular values themselves and produced globularity values that were too small. Strangely, the Nature dataset's globularity values also seem too small. Now my calculation uses a singular value decomposition to determine the three orthogonal axes through the molecule that represent the largest, second largest and smallest variance of the atom coordinates. Then I determine the size of the molecule along each of these axes and divide the smallest size by the largest, which results in a reasonable globularity value. I average the individual values of up to 32 random conformers, which were not minimized, but were produced with a bias for low energy. The correlation with the 'glob' values of the Nature dataset is smaller than before, but if you explore conformers, create surfaces and visually inspect the molecules, it makes much more sense. One more thing to mention is that the conformers contain all hydrogen atoms, which gives on average slightly larger values than without hydrogens.
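The steps described above can be sketched in a few lines of Python with NumPy. This is only an illustration of the idea, not the DataWarrior code: the function names are made up, each conformer is assumed to be an N x 3 array of atom coordinates with at least three atoms, and "size" is taken to be the plain extent (max minus min) of the atom centers along each axis.

```python
import numpy as np


def globularity(coords):
    """Smallest extent divided by largest extent along the principal axes."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    # Rows of vt are the orthogonal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt.T
    # Assumption: molecule size along an axis = max - min coordinate.
    extents = projected.max(axis=0) - projected.min(axis=0)
    return extents.min() / extents.max()


def mean_globularity(conformers):
    """Average the per-conformer values, as done over the conformer set."""
    return float(np.mean([globularity(c) for c in conformers]))


# Toy "conformer": the eight corners of a 4 x 2 x 1 box.
box = [(x, y, z) for x in (0, 4) for y in (0, 2) for z in (0, 1)]
print(globularity(box))  # 0.25, up to floating-point rounding
```

A perfectly flat arrangement of atoms gives 0 and a compact, ball-like one approaches 1. Using atom centers only ignores atomic radii, so real values would depend on how the size is actually measured.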

Thomas
