openmolecules.org

 
Re: Assign cluster name based on cluster size [message #1598 is a reply to message #1590] Fri, 22 April 2022 22:46
nbehrnd
Hello mcmc,

I just completed a small Python script that processes DataWarrior's structure-similarity clustering results (Chemistry -> Cluster Compounds) after export as a text file (File -> Save Special -> Textfile). It identifies the clusters, sorts them by the number of molecules in each cluster, updates the molecules' cluster labels (1, 2, 3, ...) accordingly, and writes a new .txt file that DW can open (Ctrl + O). Two sort orders are available: a) «the more molecules in the cluster, the smaller the integer used as its label», which probably matches your intent best; b) with the optional flag -r, the reverse, «the more molecules in the cluster, the greater the label».

The .zip archive attached below includes the .py script and documents first results from processing a small set of test data. The script assumes that the first column, labeled «Cluster No», contains the cluster labels assigned by DataWarrior (this is the program's default header).
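
For readers who only want the gist without opening the archive, the relabeling can be sketched roughly as follows. This is not the attached script, just a minimal illustration under a few assumptions: the exported file is tab-separated, the cluster labels sit in a column named «Cluster No», and clusters of equal size keep the order in which they first appear. The function names `relabel_clusters` and `relabel_text` are made up for this example.

```python
import csv
import io
from collections import Counter


def relabel_clusters(rows, column="Cluster No", reverse=False):
    """Renumber cluster labels by cluster size (hypothetical helper).

    Default: the largest cluster becomes 1, the next largest 2, and so on.
    With reverse=True the smallest cluster becomes 1 instead.
    """
    # Count the molecules per original cluster label.
    sizes = Counter(row[column] for row in rows)
    # sorted() is stable, so clusters of equal size keep first-seen order.
    sign = 1 if reverse else -1
    ordered = sorted(sizes, key=lambda label: sign * sizes[label])
    # Map each old label to its new rank-based label.
    new_label = {old: str(rank) for rank, old in enumerate(ordered, start=1)}
    return [{**row, column: new_label[row[column]]} for row in rows]


def relabel_text(text, column="Cluster No", reverse=False):
    """Apply relabel_clusters to tab-separated text (assumed export format)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = list(reader)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(relabel_clusters(rows, column, reverse))
    return out.getvalue()


if __name__ == "__main__":
    # Tiny invented data set: cluster 1 has one member, 2 has two, 3 has three.
    sample = "Cluster No\tName\n1\ta\n2\tb\n2\tc\n3\td\n3\te\n3\tf\n"
    print(relabel_text(sample))
    print(relabel_text(sample, reverse=True))
```

All other columns pass through unchanged; only the values in the label column are rewritten, so the result can be saved as a new .txt file and opened in DW like the original export.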

Norwid

[Updated on: Tue, 26 April 2022 11:31]
