Health data- and model-driven Knowledge Acquisition

Director:

Sarah Zohar

Deputy Director:

Anita Burgun

Hospital information systems are used at every step of patient care and continuously collect longitudinal data, both structured and unstructured, including clinical reports, drug prescriptions, laboratory results and omics data. Unfortunately, the knowledge that could be acquired from previously collected health data is rarely taken into account in the clinical care of new patients.

The objective of our Inserm team is to develop methodologies and tools, and to apply them in clinical practice, in order to move towards a learning health system, i.e., a health system that uses the clinical data it collects to extract new medical knowledge quickly and reliably, and that uses this knowledge in turn to continuously improve healthcare. We rely on the availability of EHRs (Electronic Health Records), clinical trials, cohorts and other linked data to develop models for stratification and prediction that have the potential to improve the precision and personalization of treatments and, in turn, the quality of healthcare.

With this objective, the research activity of team 22 follows three interdependent axes: (1) Patient phenotyping and representation learning, (2) Stochastic and data-driven predictive models for guiding decisions, and (3) Designs of next-generation clinical trials.

Keywords: Data-driven medicine, Model-based medicine, Learning health system, Precision medicine, Knowledge acquisition, Representation learning, Predictive modelling, Next generation clinical trials, Small samples, Translational research, Electronic Health Records, Machine learning, Bayesian inference

Scientific Themes

Patient phenotyping and representation learning

Methods and tools for leveraging patients’ data in their wide variety and complexity.

Stochastic and data-driven predictive modelling of health trajectories

Development of original machine learning and statistical learning methods applied to clinical practice, prognosis and personalized medicine.

Designs of next generation clinical trials

Development of machine learning models and algorithms for innovative clinical trial methods.

Main publications

C-mix: A high-dimensional mixture model for censored durations, with applications to genetic data. Bussy S, Guilloux A, Gaïffas S, Jannot A-S. Statistical Methods in Medical Research. 2019;28(5):1523-1539. DOI: 10.1177/0962280218766389

Learning the Clustering of Longitudinal Shape Data Sets into a Mixture of Independent or Branching Trajectories. Debavelaere V., Durrleman S., Allassonnière S. et al. Int J Comput Vis. 2020;128:2794–2809. DOI: 10.1007/s11263-020-01337-8

MedExt: combining expert knowledge and deep learning for medication extraction from French clinical texts. Jouffroy J, Feldman SF, Lerner I, Rance B, Burgun A, Neuraz A. JMIR Medical Informatics. 20/01/2021:17934 (forthcoming/in press). DOI: 10.2196/17934

Bayesian dose-regimen assessment in early phase oncology incorporating pharmacokinetics and pharmacodynamics. Gerard E., Zohar S., Thai H.T., Lorenzato C., Riviere M.K., Ursino M. (2020). Biometrics 2021.

Natural Language Processing for Rapid Response to Emergent Diseases: Case Study of Calcium Channel Blockers and Hypertension in the COVID-19 Pandemic. Neuraz A, Lerner I, Digan W, Paris N, Tsopra R, Rogier A, Baudoin D, Cohen KB, Burgun A, Garcelon N, Rance B; AP-HP/Universities/INSERM COVID-19 Research Collaboration; AP-HP COVID CDR Initiative. J Med Internet Res. 2020;22(8):e20773. DOI: 10.2196/20773

The Adverse Drug Reactions From Patient Reports in Social Media Project: Protocol for an Evaluation Against a Gold Standard. Arnoux-Guenegou A, Girardeau Y, Chen X, Deldossi M, Aboukhamis R, Faviez C, Dahamna B, Karapetiantz P, Guillemin-Lanne S, Lillo-Le Louët A, Texier N, Burgun A, Katsahian S. JMIR Res Protoc. 2019 May 7;8(5):e11448. DOI: 10.2196/11448

Predicting the need for a reduced drug dose, at first prescription. Coulet A., Shah N.H., Wack M., Chawki M.B., Jay N., Dumontier M. Scientific Reports. 2018;8(1):1-11. DOI: 10.1038/s41598-018-33980-0

A clinician friendly data warehouse oriented toward narrative reports: Dr. Warehouse. Garcelon N, Neuraz A, Salomon R, Faour H, Benoit V, Delapalme A, Munnich A, Burgun A, Rance B. J Biomed Inform. 2018 Apr;80:52-63. DOI: 10.1016/j.jbi.2018.02.019

All publications

Job offer

Postdoctoral researcher (F/M): Discovery of subgroups of responders to chemotherapy combinations

Job description

Job offer page