The ironic proverbial saying that "a month in the lab can save you an hour in the library" is proving itself repeatedly, at a huge cost to academic and commercial institutions alike. Missed information in the literature costs time, money, and quality. Both the quality of decisions made and the quality of subsequent research output are compromised when the available information goes unnoticed. In monetary terms, incorrect decisions along the drug pipeline lifecycle in the pharmaceutical area can cost millions to billions of dollars.
[...]

Chemical information mining: A new paradigm

Computer-assisted extraction or mining of chemical structural information from the literature requires special tools that address the various ways of encoding structures. Traditionally, chemical structures in the literature are identified by textual names or by images of structures. Images of structures are generally very explicit and can convey a great deal of information to a chemist, but they cannot be read by computers. Making these images machine-readable would require a chemical image recognition capability. [...]
[...] This contextual component can be a very simple and powerful research tool that paves the way for a new paradigm in chemical information mining of the literature, using text analytical tools such as chemical named entity recognition (NER) together with natural language processing (NLP).

Conclusion

Ultimately, the use of these capabilities has to enhance the ways in which researchers in both academia and industry work. Information overload is a major driver in this shifting paradigm, along with a variety of technological advances in other key areas. [...]
[...] Based on their meticulous extraction and analysis, they identified five molecular pathways common to four different types of addictive drugs. This included discovering two new pathways and clues to the irreversible features of addiction. They did this without conducting a single experiment. A rigorous description of literature-based discovery was published by Kostoff in an earlier paper (Kostoff 2007), followed later by a series of eight papers that detailed the techniques used and demonstrated them in a variety of areas, including cataracts, Raynaud's, Parkinson's, and multiple sclerosis, as well as water purification. [...]
[...] Both the quality of decisions made and the quality of subsequent research output are compromised when the available information goes unnoticed. In monetary terms, incorrect decisions along the drug pipeline lifecycle in the pharmaceutical area can cost millions to billions of dollars. Academia has incurred substantial costs as well, in the form of missed funding opportunities caused by limited access to the information combined with the inability to find and process what is available. A variety of technological advances including the [...]
[...] The bottom line is that in the hands of creative, experienced researchers, text mining of the literature, or literature-based discovery, can only serve to increase the opportunities within drug discovery and enhance life science research. Turning the ironic proverbial phrase around to read "an hour in the library saves a month in the lab" would be a more advisable approach.

Barriers to the automation of these discoveries

Finding relevant information involves finding relevant documents, accessing those documents, and finding relevant information within those documents. [...]