openmolecules.org

 
feature suggestion "pdf2dwar" [message #573] Thu, 06 June 2019 11:37
nbehrnd
Dear Thomas,

perhaps the following feature could be implemented in a future version
of the program.

Among the complementary .dwar files is the one about reactions
associated with US patents. Besides these, publications in scientific
journals are an additional source of information, frequently
available as .pdf files. Given Lowe's thesis "Extraction of chemical
structures and reactions from the literature", already mentioned, and
picture-to-SMILES converters like OSRA at

https://cactus.nci.nih.gov/cgi-bin/osra/index.cgi

perhaps DataWarrior could be enabled to harvest their information,
too.

I speculate that current publications, produced as .pdf from the
start, might be easier to work with than those scanned after their
publication in print (e.g., Acta Chemica Scandinavica).

Well, it may sound like a resurrection of MDL's IsisBase seen in the
late 1990s. To some extent, it is tangential to webreactions, too.
The idea surfaced (again) while using my literature reference
manager, Zotero. So far, however, Zotero's indexing is limited to
text-only information; adding a key reaction is limited to pasting a
figure from the publication as an annotation, which is then accessible
in its browser or in the report by entry (cf. the two example files
attached).

Harvesting this information might be easier if the relevant .pdf files
were all deposited into a dedicated partition, rather than spread over
multiple sub-folders of the webbrowser directory, which requires an
os.walk.
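
To illustrate what I mean by the os.walk, here is a minimal sketch in
Python of how the scattered .pdf files could be collected first. The
function name find_pdfs and the idea of passing a single root folder are
only my assumptions for illustration; this is neither part of
DataWarrior nor of Zotero.

```python
import os


def find_pdfs(root):
    """Return the sorted paths of all .pdf files below root.

    Hypothetical helper: it only collects file paths and does not
    read or interpret the content of the files.
    """
    hits = []
    # os.walk visits root and every sub-folder below it.
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            # Compare case-insensitively to catch ".PDF" as well.
            if name.lower().endswith(".pdf"):
                hits.append(os.path.join(folder, name))
    return sorted(hits)
```

Calling find_pdfs on the top-level storage folder would then yield one
flat list of files to hand over to whatever extraction step follows.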

Norwid
 