openmolecules.org

 
feature suggestion "pdf2dwar" [message #573] Thu, 06 June 2019 11:37
nbehrnd
Dear Thomas,

Perhaps the following feature could be implemented in a future version
of the program.

Among the complementary .dwar files is the one about reactions
associated with US patents. Besides these, publications in scientific
journals are an additional source of information, frequently
available as .pdf files. Given Lowe's thesis "Extraction of chemical
structures and reactions from the literature", already mentioned, and
picture-to-SMILES converters like OSRA at

https://cactus.nci.nih.gov/cgi-bin/osra/index.cgi

perhaps DataWarrior could be enabled to harvest the information in
these publications, too.

I speculate that current publications, produced as .pdf from the
start, might be easier to work with than those scanned after their
publication in print (e.g., Acta Chemica Scandinavica).

Well, it may sound like a resurrection of MDL's IsisBase, seen in the
late 1990s. To some extent, it is tangential to webreactions, too.
The idea surfaced (again) while accessing my literature reference
program, Zotero. So far, however, Zotero's indexing is limited to
text-only information; adding a key reaction is limited to pasting a
figure from the publication as an annotation, which is then accessible
in its browser or in the report by entry (cf. the two example files
attached).

Harvesting this information might be eased if the relevant .pdf files
are all deposited into a dedicated partition, rather than spread over
multiple sub-folders of the webbrowser directory, which requires an
os.walk to traverse.
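
For illustration, the recursive search over nested sub-folders could be
sketched in Python roughly as follows. This is only a minimal sketch: the
function name find_pdfs is made up for this example and is not part of
Zotero or DataWarrior; the only library call used is os.walk from the
standard library.

```python
import os


def find_pdfs(root):
    """Collect the paths of all .pdf files found anywhere below root.

    find_pdfs is a hypothetical helper for illustration only.
    """
    hits = []
    # os.walk yields one (directory, sub-directories, files) triple
    # per directory in the tree, starting at root.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare case-insensitively so that "paper.PDF" matches too.
            if name.lower().endswith(".pdf"):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Calling find_pdfs() on the top-level storage folder would return one
sorted list of paths, which could then be handed to whatever extraction
step follows. With all files in a single flat folder, a plain directory
listing would do instead.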

Norwid
Re: feature suggestion "pdf2dwar" [message #585 is a reply to message #573] Thu, 27 June 2019 08:23
thomas
Doing this right would be a lot of work and is probably beyond the scope of DataWarrior.
I assume and hope that it is only a matter of time until some organization compiles all
openly available chemical reaction information into a handy database. It would be a
fantastic resource of information and might catalyse the development of some open source
retro-synthetic synthesis planning software...