Intelligent Searches in Rare Diseases

The first Rare Disease Day was celebrated in 2008 on 29 February, a ‘rare’ date that happens only once every four years. Ever since then, Rare Disease Day has taken place on the last day of February, a month known for having a ‘rare’ number of days. In Europe, a disease is considered rare when it affects fewer than 1 in 2000 people [1]. There are more than 7000 rare diseases worldwide [2]. About 80% of rare diseases have a genetic origin and approximately 75% affect children [1,2]. Although each disease is individually rare, together rare diseases are estimated to affect 350 million people globally [3,4].

The problem with rare diseases is right there in the name. Because each rare disease affects only a small population, the amount of data and information available is also small. Even so, analysing that information manually is not simple. First, clinicians need access to all available information so that they can find the relevant pieces, put them in context and tie them together into a clear overview. A deeper understanding can be gained from a review of published papers on the disease, but capturing the key information from literature databases is a slow and painstaking task. Second, when critical information about a rare disease is overlooked, diagnosis can be delayed and patient outcomes can be compromised. So, with the 14th annual Rare Disease Day, it is a pertinent time to think about how we can arm our clinicians with intelligent systems to augment their search efforts. One valuable approach to improving medical care for rare disease patients is software tools that bundle data and expertise about rare diseases so that healthcare providers can easily access, network and exchange relevant information.

COREMINE™ is a family of tools developed by the Norwegian bioinformatics company PubGene using the company’s patented text mining algorithms, together with a pipeline for integrating and analysing big biomedical data. The company’s proprietary technology enables the user to mine large repositories of information using advanced statistical text mining algorithms and specialized Natural Language Processing[1]. The knowledge derived from this mining is presented, managed and explored as intuitive graphical networks. This interface gives the user a comprehensive 360-degree view of all relevant information and allows the user to navigate these complex networks and assess interesting, and often new, associations. Our tools are used across the world to mine evidence for better diagnoses and possible treatments for every single patient.
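To make the idea of a network of associations more concrete, here is a minimal sketch of one generic text mining technique, counting how often two concepts are mentioned in the same document. It is not COREMINE's proprietary algorithm; the toy documents, the placeholder concept names (gene_X, drug_Y, symptom_Z) and the function name are all assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each set holds the concepts detected in one document.
# Apart from "progeria", the concept names are made-up placeholders.
documents = [
    {"progeria", "gene_X", "symptom_Z"},
    {"progeria", "gene_X", "drug_Y"},
    {"gene_X", "symptom_Z"},
]

def cooccurrence_network(docs):
    """Return a Counter mapping each concept pair to the number of
    documents in which both concepts appear together."""
    edges = Counter()
    for concepts in docs:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(concepts), 2):
            edges[(a, b)] += 1
    return edges

network = cooccurrence_network(documents)

# Each edge is a candidate association; a higher count means the two
# concepts are mentioned together more often in this toy corpus.
for (a, b), count in sorted(network.items()):
    print(f"{a} -- {b}: {count}")
```

In a sketch like this, the pairs are the edges of the network and the counts are a crude measure of association strength; a real system would likely use more robust statistics than raw counts.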

Here are some examples of how the COREMINE platform can be used in rare diseases:

  • COREMINE uses machine learning to find and extract patterns in data pulled from different data sources. Our search engine is powered by dictionaries and taxonomies that simplify querying (e.g. the single search term ‘Progeria’ also queries all 12 of its synonyms).
  • COREMINE can combine information from disparate medical records, pulling together highly distributed associations, and continuously calculate their probability and level of confidence for consideration in real time.
  • COREMINE explores how intelligence-powered algorithms that connect people over social media who would never otherwise meet can empower rare disease patients in unprecedented ways.

    [1] Natural Language Processing refers to systems that can understand language
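The dictionary-powered querying mentioned in the list above can be sketched roughly as follows. This is only an illustration of synonym expansion in general, not COREMINE's actual implementation: the dictionary contents, the function name `expand_query` and the OR-joined query format are assumptions, and only a few of the synonyms of ‘Progeria’ are listed.

```python
# Hypothetical synonym dictionary. The article says 'Progeria' has 12
# synonyms; only an illustrative handful is listed here.
SYNONYMS = {
    "progeria": [
        "Hutchinson-Gilford progeria syndrome",
        "Hutchinson-Gilford syndrome",
        "HGPS",
    ],
}

def expand_query(term, synonyms=SYNONYMS):
    """Expand one search term into an OR query covering its known synonyms.

    Lookup is case-insensitive; terms without a dictionary entry are
    returned as a single quoted term.
    """
    variants = [term] + synonyms.get(term.lower(), [])
    seen = set()
    unique = []
    for variant in variants:
        key = variant.lower()
        if key not in seen:  # drop case-insensitive duplicates, keep order
            seen.add(key)
            unique.append(variant)
    return " OR ".join(f'"{v}"' for v in unique)

query = expand_query("Progeria")
print(query)
```

The user types one term and the search runs over every known name for the same disease, which matters for rare diseases where the literature may use several names for one condition.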

To check out more features go to


While it’s important not to overestimate the capabilities of machine learning and intelligent systems, we’re seeing new and exciting innovations in health every day, and with every new project or setback we’re getting closer to making truly intelligent systems a reality.


It’s an exciting road ahead!

About the author

Jimita Toraskar, PhD in Medicine, works as a Scientist at PubGene AS. She plays a key role in identifying and evaluating evidence resulting in actionable insights for patients. Jimita participates in evaluation and specification of AI-assisted tools. Along with a strong background in pharmaceutical sciences, Jimita is a firm believer in the principles of user-centric design and constant learning.


1. Sernadela, P.; González-Castro, L.; Carta, C.; van der Horst, E.; Lopes, P.; Kaliyaperumal, R.; Thompson, M.; Thompson, R.; Queralt-Rosinach, N.; Lopez, E.; et al. Linked Registries: Connecting Rare Diseases Patient Registries through a Semantic Web Layer. BioMed Res. Int. 2017, 2017, 1–13.

2. Ekins, S. Industrializing rare disease therapy discovery and development. Nat. Biotechnol. 2017, 35, 117–118.

3. About Rare Diseases. Available online:

4. Ronicke, S.; Hirsch, M.C.; Turk, E.; Larionov, K.; Tientcheu, D.; Wagner, A.D. Can a decision support system accelerate rare disease diagnosis? Evaluating the potential impact of Ada DX in a retrospective study. Orphanet J. Rare Dis. 2019, 14, 69.