openmolecules.org

 
Exporting a descriptor as a Textfile [message #1605] Mon, 09 May 2022 17:45
Christophe
Messages: 31
Registered: January 2022
Member
Hello everyone,

Is it possible to export descriptors such as SkelSpheres or others as a text file?

Thanks
Re: Exporting a descriptor as a Textfile [message #1610 is a reply to message #1605] Wed, 18 May 2022 14:22
Christophe
Messages: 31
Registered: January 2022
Member
Just to mention that I'd like to use these text files as part of a matrix for ordination techniques that are not available in DW, UMAP for example.
These data must be stored somewhere, since the SkelSpheres descriptor, for example, is available when setting up a t-SNE ordination.

Thanks
Re: Exporting a descriptor as a Textfile [message #1611 is a reply to message #1610] Thu, 19 May 2022 12:42
amorrison
Messages: 38
Registered: March 2016
Member
Hi,
If you use 'add calculated values' with the formula str(Descriptor Variable), a new column should appear. I think this is what you are looking for.
Angus
Re: Exporting a descriptor as a Textfile [message #1613 is a reply to message #1611] Fri, 20 May 2022 16:23
Christophe
Messages: 31
Registered: January 2022
Member
Hi Angus,

Thank you
The columns do appear, but the data are unusable (cryptic characters).
Some descriptors are encoded as a matrix (1024 bits, for example), so I suppose this is not the proper way to extract them.

Christophe
Re: Exporting a descriptor as a Textfile [message #1615 is a reply to message #1613] Sat, 21 May 2022 18:48
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Christophe,

for a small set of molecules, I think I can replicate your findings, e.g., for the assignment and subsequent display of SkelSpheres as a string:

[attachment: /forum/index.php?t=getfile&id=552&private=0]

(observation with DW 5.5.0 for Linux, including the update by May 13th).

On the other hand (now specific to SkelSpheres): if these are like the fingerprints that, e.g., Open Babel offers,[1] which programs do you intend to use that accept them as input for further computation?

E.g., Open Babel reports for DMF:

$ obabel -:"CN(C)C=O" -ofpt -xf FP2
>   5 bits set 
00000000 00000000 00000000 00000000 00000000 00000000 
00000000 00000000 00000000 00000000 00200000 00000000 
00008000 00000000 00000000 00000000 00000000 00000000 
00000000 00000000 00000000 00000400 00000000 00000000 
00000000 00000000 00000000 00000000 00000000 00040001 
00000000 00000000 
1 molecule converted
or

$ obabel -:"CN(C)C=O" -ofpt -xs -xf FP2
>
0 6 1 7 1 6 <693>
0 7 1 6 <82>
0 8 2 6 <623>
0 8 2 6 1 7 <330>
0 8 2 6 1 7 1 6 <64>
1 molecule converted
Norwid

[1] https://open-babel.readthedocs.io/en/latest/FileFormats/Fingerprint_format.html
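
As a small cross-check of the first Open Babel output above, the hexadecimal words can be read back and their set bits counted. This is only a sketch: the class and method names (FptBitCount, countSetBits) are made up for illustration, and it deliberately counts bits without assuming anything about which bit position maps to which structural fragment.

```java
final class FptBitCount {

    // Sums the set bits over all whitespace-separated hexadecimal words.
    static int countSetBits(String hexDump) {
        int total = 0;
        for (String word : hexDump.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for an empty input string
            }
            // Long.parseLong avoids overflow for words such as "ffffffff".
            total += Long.bitCount(Long.parseLong(word, 16));
        }
        return total;
    }

    public static void main(String[] args) {
        // The FP2 words reported for DMF in the post above.
        String dmf =
              "00000000 00000000 00000000 00000000 00000000 00000000 "
            + "00000000 00000000 00000000 00000000 00200000 00000000 "
            + "00008000 00000000 00000000 00000000 00000000 00000000 "
            + "00000000 00000000 00000000 00000400 00000000 00000000 "
            + "00000000 00000000 00000000 00000000 00000000 00040001 "
            + "00000000 00000000";
        System.out.println(countSetBits(dmf)); // prints 5, matching "5 bits set"
    }
}
```

The 32 words of 32 bits each correspond to a 1024-bit fingerprint, so the same loop could be extended to emit one 0/1 column per bit if a matrix is needed.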
Re: Exporting a descriptor as a Textfile [message #1616 is a reply to message #1615] Tue, 24 May 2022 15:24
Christophe
Messages: 31
Registered: January 2022
Member
Hi Norwid,

Thank you

It looks like you managed to convert the DW structure column into one of the fingerprints handled by Open Babel.

I am not sure this procedure captures the native information of the SkelSpheres descriptor. It generated an FP2 fingerprint (the default).

I know how to generate a lot of different FPs from structures. For example, PaDEL (a free Java program, http://www.yapcwsoft.com/dd/padeldescriptor/) does a very good job. It provides a matrix (.csv file) with structures in rows and bits (binary or count, depending on the FP you select) in columns.

If you know how the bits (or series of bits) are organized, you can try to trace the source of the differences in distribution with dimensionality reduction methods, in R for example.

With DW, when I apply a similarity (or activity cliff) analysis with SkelSpheres and/or OrgFunctions, I get data sets that cluster very well, but it is then quite challenging to relate the resulting clusters to the distributional differences in terms of structure (SkelSpheres) or functionalization (OrgFunctions). If these two descriptors differ from the ones I have used so far, such as ExtFP, MACCSFP, ..., maybe I could gain more information. But I need the "matrix formalism" in a text file.
Christophe
Re: Exporting a descriptor as a Textfile [message #1618 is a reply to message #1616] Thu, 02 June 2022 09:46
thomas
Messages: 716
Registered: June 2014
Senior Member
Hi Christophe,

if you use Java, you could use this line to decode the SkeletonSpheres descriptor into a byte array, which contains 1024 count values:

byte[] counts = new DescriptorHandlerSkeletonSpheres().decode(encodedSkeletonSpheres);

Then you could loop over the counts array and write the numbers wherever you want. The only dependency would be OpenChemLib, which you can find on GitHub.
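
A minimal sketch of that loop, writing one comma-separated row per molecule, might look as follows. The class name SkelSpheresCsv, the method toCsvRow and the molecule ID are hypothetical; the counts array here is stand-in data so the example runs without OpenChemLib, and treating each byte as an unsigned value is an assumption.

```java
import java.util.StringJoiner;

final class SkelSpheresCsv {

    // Builds one CSV row: the molecule ID first, then one count per descriptor bin.
    static String toCsvRow(String id, byte[] counts) {
        StringJoiner row = new StringJoiner(",");
        row.add(id);
        for (byte count : counts) {
            // Assumption: counts are non-negative; the mask reads the byte as 0..255.
            row.add(Integer.toString(count & 0xFF));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        // With OpenChemLib on the classpath, the array would come from the line above:
        // byte[] counts = new DescriptorHandlerSkeletonSpheres().decode(encodedSkeletonSpheres);
        byte[] counts = new byte[1024]; // stand-in data for illustration only
        counts[3] = 2;
        counts[700] = 1;

        String row = toCsvRow("mol-1", counts);
        System.out.println(row.split(",").length + " fields"); // ID plus 1024 counts
    }
}
```

Writing one such row per molecule gives the structures-in-rows, counts-in-columns layout that tools outside DataWarrior usually expect.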

Likewise you can decode the OrgFunctions descriptor with

int[][] pairs = new DescriptorHandlerFunctionalGroups().decode(encodedOrgFunctions);

Here you get an array of arrays, each of length 2. Every one of these small arrays contains a functional group ID and an associated count value. Thus, this is not a simple matrix, and making use of it will probably need some knowledge of the groups, i.e. the similarity tree. You may study the FunctionalGroupClassifier to understand which groups have which ID and how the tree is organized.
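
If a plain matrix is wanted anyway, the pairs of several molecules could be flattened into one column per group ID, roughly as sketched below. The class and method names are hypothetical and the group IDs in main are invented. Note that this flattening ignores the similarity tree entirely, so related groups end up in unrelated columns.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class OrgFunctionsMatrix {

    // Collects every group ID seen in any molecule, in ascending order.
    static int[] columnIds(List<int[][]> molecules) {
        TreeSet<Integer> ids = new TreeSet<>();
        for (int[][] pairs : molecules) {
            for (int[] pair : pairs) {
                ids.add(pair[0]);
            }
        }
        return ids.stream().mapToInt(Integer::intValue).toArray();
    }

    // One row per molecule, one column per group ID; absent groups stay 0.
    static int[][] denseMatrix(List<int[][]> molecules, int[] ids) {
        int[][] matrix = new int[molecules.size()][ids.length];
        for (int row = 0; row < molecules.size(); row++) {
            for (int[] pair : molecules.get(row)) {
                int col = Arrays.binarySearch(ids, pair[0]);
                if (col >= 0) {
                    matrix[row][col] += pair[1];
                }
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Invented {groupId, count} pairs standing in for decoded descriptors.
        List<int[][]> molecules = List.of(
            new int[][] {{5, 2}, {9, 1}},
            new int[][] {{9, 3}});
        int[] ids = columnIds(molecules);
        System.out.println(Arrays.toString(ids));
        System.out.println(Arrays.deepToString(denseMatrix(molecules, ids)));
    }
}
```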

By the way, UMAP support in DataWarrior is planned.

Thomas

[Updated on: Thu, 02 June 2022 09:46]


Re: Exporting a descriptor as a Textfile [message #1620 is a reply to message #1618] Thu, 02 June 2022 17:24
Christophe
Messages: 31
Registered: January 2022
Member
Hello Thomas,

Thank you for this valuable information as usual.

UMAP support in future versions of DW is very good news. Great!!!

Thanks

Christophe
Re: Exporting a descriptor as a Textfile [message #1669 is a reply to message #1620] Wed, 13 July 2022 17:29
thomas
Messages: 716
Registered: June 2014
Senior Member
Hi Christophe,

UMAP is now available...
Re: Exporting a descriptor as a Textfile [message #1682 is a reply to message #1669] Mon, 18 July 2022 13:35
Christophe
Messages: 31
Registered: January 2022
Member
Hello Thomas,

Thanks for this news.

But we are still at version 5.50, aren't we?

When I check for updates, the system tells me that I've got the latest one, i.e. v5.50.

When do you plan to release the latest version?

Christophe
Re: Exporting a descriptor as a Textfile [message #1684 is a reply to message #1682] Tue, 19 July 2022 07:54
nbehrnd
Messages: 224
Registered: June 2019
Senior Member
Christophe,

have a look at https://openmolecules.org/datawarrior/download.html. Click on «I have read and understood the disclaimer» and you will find the links to the update archives for Linux/Mac and for Windows (i.e., you need an already functional installation in the first place). Select the one suitable for you, decompress the archive, and replace the .jar of your installation with the .jar it includes.

Norwid

For Mac and Linux: https://openmolecules.org/datawarrior/dw550x.zip
For Windows: https://openmolecules.org/datawarrior/dw550win.zip
Re: Exporting a descriptor as a Textfile [message #1686 is a reply to message #1684] Tue, 19 July 2022 09:32
Christophe
Messages: 31
Registered: January 2022
Member
Hi Norwid,

Thank you.

Indeed, the whole procedure is explained on the website.
I hadn't even bothered to read it.

Christophe