Literature mining: Let the literature find you
Keeping abreast of the wealth of scientific literature is a daunting task by today's standards. It is extremely time-consuming, yet it is crucial for scientists to stay on top of their field. HubMed is a really useful interface to PubMed that lets you search for relevant articles based on keywords within the articles. Expanding the search terms produces the most common keywords within the first 500 abstracts. A Boolean option lets you create conditional relationships between these terms, such as “optic AND neuropathy”, so that an article must contain both words rather than either word alone.
I have developed programs that mine keywords and conditional phrases from Gene Ontology records and OMIM documents. For example, if you are searching for a gene involved in eye disease, you may want to know whether the candidate has previously been identified as causing some other eye-related disease or phenotype. Automating this process lets you scan numerous OMIM and Gene Ontology records for interesting terms. It can be developed further by creating conditions between the words and phrases. A scoring system can then be applied so that you can compare numerous candidates of interest. A simple word count based on the occurrences of these key phrases would be a reasonable place to start.
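The word-count scoring described above can be sketched in a few lines of Python. This is only an illustration, not the programs mentioned in this post: the function names, the `all_of` parameter, the gene labels and the record text are all made up for the example, and real OMIM or Gene Ontology records would have to be downloaded and parsed first.

```python
import re
from collections import Counter


def count_phrases(text, phrases):
    """Count case-insensitive, whole-word occurrences of each phrase in text."""
    counts = Counter()
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        counts[phrase] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


def score_candidate(text, phrases, all_of=()):
    """Return the total number of phrase hits in text.

    all_of is a hypothetical AND condition: if any term listed there
    is absent from the text, the candidate scores zero.
    """
    required = count_phrases(text, all_of)
    if any(required[term] == 0 for term in all_of):
        return 0
    return sum(count_phrases(text, phrases).values())


# Invented record text standing in for parsed OMIM / Gene Ontology entries.
records = {
    "GENE_A": "Mutations cause optic atrophy and progressive optic neuropathy of the retina.",
    "GENE_B": "Associated with hearing loss; expressed in the cochlea.",
}

phrases = ["optic", "neuropathy", "retina"]
scores = {
    gene: score_candidate(text, phrases, all_of=("optic", "neuropathy"))
    for gene, text in records.items()
}

# Rank candidates from highest to lowest score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

A raw count favours long records, so if your records vary a lot in length you may want to divide the score by the number of words in each record before comparing candidates.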
If you find yourself requiring more advanced techniques, refer to text mining.