This interactive data visualization tool allows the user to explore the 26+ million scientific articles in the MEDLINE/PubMed open dataset, hand-annotated by health experts using 16 major categories and a maximum of 13 levels of depth. In this tab, we present two tools designed to leverage the scientific knowledge in this dataset to better understand what is known and published about a given health topic.



 Move the pointer over the tag cloud and change the order of results  

Description: This tool uses text mining algorithms to help surface the information the user is looking for, avoiding the standard ranking of articles, which is biased by definition. To this end, after a search for a health-related topic, it displays the clustered keywords of the query. It supports Lucene syntax; for example, to search for all scientific articles hand-annotated with the health category Coronavirus, write: MeshHeadingList.desc:"Coronavirus". The results are reordered according to the user's choices as the pointer moves over the word cloud of subtopics.
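As an illustration of the query syntax described above, the following minimal Python sketch builds such query strings. The field name MeshHeadingList.desc comes from the text; the helper names `mesh_query` and `_escape` are hypothetical and not part of the tool, and AND/OR are the standard Lucene boolean operators.

```python
MESH_FIELD = "MeshHeadingList.desc"  # field name as given in the description


def _escape(term: str) -> str:
    """Escape backslashes and double quotes inside a quoted Lucene phrase."""
    return term.replace("\\", "\\\\").replace('"', '\\"')


def mesh_query(*topics: str, operator: str = "AND") -> str:
    """Build a Lucene query matching articles annotated with the given topics.

    Hypothetical helper: it only assembles the query string and does not
    contact any search service.
    """
    if not topics:
        raise ValueError("at least one topic is required")
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    clauses = [f'{MESH_FIELD}:"{_escape(topic)}"' for topic in topics]
    return f" {operator} ".join(clauses)


if __name__ == "__main__":
    print(mesh_query("Coronavirus"))
    print(mesh_query("Coronavirus", "Diabetes"))
```

The resulting string can be pasted into the search box as-is. Straight double quotes are used deliberately, since curly quotes are not treated as phrase delimiters by Lucene query parsers.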

Functionality: The tool is designed to improve the search engine experience: the user provides further information to the search by dragging a pointer over word clouds. These word clouds are produced by cosine similarity to an "average" centered on the topics in each abstract of the set of selected papers, clustered using the k-means algorithm. Besides the usual keyword query, you can search for the articles hand-annotated by the MEDLINE experts with a certain health topic by writing MeshHeadingList.desc:"<TOPIC>". You can also use connectors in your queries; for example, to retrieve all the articles hand-annotated with both the Coronavirus and the Diabetes health classes, write: MeshHeadingList.desc:"Coronavirus" AND MeshHeadingList.desc:"Diabetes".
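To give an intuition for the reordering of results, here is a minimal Python sketch of ranking abstracts by cosine similarity to the "average" (centroid) of a selected group of abstracts. It is not the tool's implementation: the bag-of-words features, the function names, and the sample abstracts are assumptions made for illustration, and the k-means clustering step is left out (the cluster is supplied by hand).

```python
import math
from collections import Counter


def tf_vector(text):
    """Lowercased bag-of-words counts (a toy stand-in for real text features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise average of a list of sparse vectors."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return {term: count / len(vectors) for term, count in total.items()}


def rank_by_centroid(abstracts, cluster_indices):
    """Return abstract indices ordered by similarity to the cluster centroid."""
    vectors = [tf_vector(a) for a in abstracts]
    center = centroid([vectors[i] for i in cluster_indices])
    scored = [(cosine(v, center), i) for i, v in enumerate(vectors)]
    # Highest similarity first; ties are broken by original position.
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]


if __name__ == "__main__":
    docs = [
        "coronavirus vaccine trial",
        "diabetes insulin therapy",
        "coronavirus vaccine immune response",
    ]
    # Pretend the pointer sits on a cluster made of abstracts 0 and 2.
    print(rank_by_centroid(docs, cluster_indices=[0, 2]))
```

In this sketch, moving the pointer to a different word cloud would correspond to passing a different `cluster_indices` list, which changes the centroid and therefore the order of the results.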

* At the moment, Chrome with default settings has difficulty running the visualisation tool above with full functionality. This can be solved by disabling the 'out-of-blink-cors' flag at chrome://flags/#out-of-blink-cors. If you encounter such problems and prefer not to disable the flag, please use another browser, e.g., Firefox.


MIDAS MeSH Classifier

 Copy and paste any text you want to annotate with MeSH classes 

Description: The MeSH Classifier is a tool that automatically annotates any text snippet with the MeSH health classes. The algorithm was trained on 80+ years of published biomedical articles in MEDLINE/PubMed, hand-annotated with health classes hierarchically organized in a MeSH tree with 16 major categories and a maximum of 13 levels of depth. It lists the matching categories with their position number and (cosine) similarity weight, along with a slider and a setting for the maximum number of categories shown. It is also available through an API (on request).

Functionality: It is designed to classify free text of any nature with the classes of MeSH Headings, using a tailor-made algorithm trained on 80+ years of published biomedical articles. It can classify any document of interest, including medical reports, electronic health records, and health news. At the moment it is limited to English-language text, because MEDLINE/PubMed is also only available in English.
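The shape of the classifier's output (position number, category, cosine similarity weight, capped at a maximum number of categories) can be sketched in Python as follows. This is a toy illustration only: the keyword profiles in `CATEGORY_PROFILES`, the function `classify`, and its parameters `min_weight` and `max_categories` are hypothetical, and the actual classifier is a trained model rather than a keyword lookup.

```python
import math
from collections import Counter

# Hypothetical keyword profiles standing in for learned category models.
CATEGORY_PROFILES = {
    "Coronavirus": "coronavirus sars infection virus respiratory",
    "Diabetes Mellitus": "diabetes insulin glucose blood sugar",
    "Vaccines": "vaccine immunization dose antibody virus",
}


def _vec(text):
    """Lowercased bag-of-words counts."""
    return Counter(text.lower().split())


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, min_weight=0.0, max_categories=10):
    """Return (position, category, weight) tuples, best match first.

    Categories with a weight at or below min_weight are dropped, and at
    most max_categories entries are returned.
    """
    query = _vec(text)
    scored = [
        (name, _cosine(query, _vec(profile)))
        for name, profile in CATEGORY_PROFILES.items()
    ]
    scored = [(name, w) for name, w in scored if w > min_weight]
    scored.sort(key=lambda s: (-s[1], s[0]))
    top = scored[:max_categories]
    return [(pos, name, round(w, 3)) for pos, (name, w) in enumerate(top, start=1)]


if __name__ == "__main__":
    for row in classify("insulin and glucose levels in diabetes"):
        print(row)
```

Here `min_weight` plays the role one might expect of a threshold slider and `max_categories` that of the visible-category limit; how the real interface maps its controls onto the ranking is not specified in the text above.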



 Explore the MEDLINE/MeSH dataset on Coronavirus 

Description: The MEDLINE/MeSH dataset offers the scientific community . 

Functionality: The user can explore. 

Disclaimer: The core system was developed by the AI Lab at the Jožef Stefan Institute, and refocused by Quintelligence within the MIDAS project to analyze the MEDLINE dataset. It can be deployed on premises to work with proprietary data, or used as a service. It is currently available as Open Source under the BSD license. It can be applied to any document set that can be indexed, analyzed and visualized with this approach.