openmolecules.org

 
How are compounds clustered? [message #177] Fri, 13 May 2016 17:09
mindhira
I'm using DataWarrior to cluster various sets of compounds, but it would be helpful if I could tell people how these are clustered. From the description, it sounds like hierarchical clustering. Is that accurate?
Re: How are compounds clustered? [message #178 is a reply to message #177] Fri, 13 May 2016 21:22
thomas
You are right. It is an agglomerative hierarchical clustering. It starts by calculating the complete similarity matrix between all molecules. Because it only needs pairwise similarities, it can be used with any descriptor, not just vector-based ones. At the beginning, every molecule forms its own cluster. In each step, the two most similar clusters are merged. The corresponding two rows and columns of the similarity matrix are then merged as well, by calculating a weighted mean of each pair of corresponding similarity values. Merging continues until the stop criterion is met.

Thomas
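
For readers who want to see the steps above in code, here is a minimal Python sketch of this kind of clustering. It is not DataWarrior's actual implementation. The function name `cluster_by_similarity`, the `min_similarity` stop criterion, and the choice to weight the mean by cluster size are all assumptions made for illustration; the post does not say which weights or which stop criterion are used.

```python
def cluster_by_similarity(sim, min_similarity):
    """Agglomerative clustering driven by a full pairwise similarity matrix.

    sim: symmetric list of lists, sim[i][j] = similarity of items i and j.
    min_similarity: assumed stop criterion; merging stops once no two
        clusters are at least this similar.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    n = len(sim)
    # Copy the off-diagonal similarities so the caller's matrix is untouched.
    s = {i: {j: sim[i][j] for j in range(n) if j != i} for i in range(n)}
    # At the beginning every item is its own cluster.
    members = {i: [i] for i in range(n)}

    while len(members) > 1:
        # Find the two most similar clusters.
        best, a, b = None, None, None
        keys = sorted(members)
        for x in range(len(keys)):
            for y in range(x + 1, len(keys)):
                i, j = keys[x], keys[y]
                if best is None or s[i][j] > best:
                    best, a, b = s[i][j], i, j

        # Assumed stop criterion: the best remaining pair is too dissimilar.
        if best < min_similarity:
            break

        # Merge rows/columns a and b into a, using a mean weighted by
        # cluster size (an assumption, see the lead-in).
        wa, wb = len(members[a]), len(members[b])
        for k in members:
            if k == a or k == b:
                continue
            merged = (wa * s[a][k] + wb * s[b][k]) / (wa + wb)
            s[a][k] = merged
            s[k][a] = merged
            del s[k][b]
        del s[a][b]
        del s[b]
        members[a] = members[a] + members.pop(b)

    return sorted(sorted(m) for m in members.values())


# Toy similarity matrix: items 0 and 1 are alike, as are items 2 and 3.
example = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
print(cluster_by_similarity(example, 0.5))
```

Because the procedure only ever reads and updates the similarity matrix, the sketch never needs the descriptors themselves, which matches the point that any descriptor with a similarity measure can be used.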