*Result*: Preparing for the European Health Data Space: an open-source compiler for fast, transparent, and portable health data transformations.
*Further Information*
*Introduction: Healthcare systems generate vast amounts of data in diverse and often incompatible formats. Efficient conversion between these formats is essential to ensure interoperability and enable secondary data use, particularly in the context of the European Health Data Space (EHDS) and the proposed Austrian Health Data Donation Space (AHDDS). While standards such as HL7 FHIR aim to facilitate interoperability, inconsistencies in implementation persist. Electronic health record (EHR) providers, including Austria's ELGA, continue to face challenges in this area. The FHIR mapping language (FML) offers a promising solution for format translation, but current tools for executing FML mappings are limited, especially in terms of processing speed. Closing this gap requires a compiler that translates FML mappings into efficient, executable code.
Materials and Methods: We developed the Mapping Language Compiler for Health Data (MaLaC-HD), which compiles FML code into Python. To assess performance, we benchmarked the compiler using a large ELGA document on a typical end-user device, comparing execution speed with existing FML tools. Baseline overhead was measured using an empty mapping. Conformance was manually evaluated by comparing the output of a wide range of example mappings and input data against the Java reference implementation. Additionally, we analyzed the structure and correctness of the generated Python code to assess functional completeness.
Results: After adjusting for overhead, MaLaC-HD achieved execution speeds nearly 100 times faster than existing tools. The output closely matched that of the reference implementation, with only minor discrepancies. The generated Python code met all functional requirements and demonstrated the compiler's ability to support complex transformations. MaLaC-HD is publicly available under the LGPL.
Conclusion: MaLaC-HD can serve a wide array of use cases and has the potential to integrate with existing platforms for secondary data use to support large-scale health data research across Europe and beyond. MaLaC-HD could provide the EHR community with a powerful, efficient tool for accelerating data transformation, an essential capability for the success of the EHDS initiative.
(Copyright © 2025 Beyer, Tanjga, Kleinoscheg, Hayn, Donsa, Kreiner and Schreier.)*
*NT and GK were employed by ELGA GmbH. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.*
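
The overhead adjustment mentioned in the Materials and Methods and Results can be illustrated with a short Python sketch. It is not taken from MaLaC-HD or from the benchmark scripts used in the study. The function names `median_runtime` and `adjusted_speedup`, the use of the median, and the number of repeats are all assumptions made for illustration. The idea is only that the runtime of an empty mapping (startup and I/O cost) is subtracted from the total runtime of each tool before the two are compared.

```python
import time
from statistics import median
from typing import Callable


def median_runtime(run: Callable[[], object], repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds over `repeats` calls.

    `run` is a hypothetical zero-argument callable that executes one
    complete transformation (for example, one tool on one input document).
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append(time.perf_counter() - start)
    return median(samples)


def adjusted_speedup(
    ref_total: float,
    ref_overhead: float,
    new_total: float,
    new_overhead: float,
) -> float:
    """Speedup of the new tool over the reference tool, net of overhead.

    Each *_overhead value is assumed to be the runtime of the same tool
    with an empty mapping, so total minus overhead approximates the time
    spent on the mapping itself.
    """
    ref_net = ref_total - ref_overhead
    new_net = new_total - new_overhead
    if ref_net <= 0 or new_net <= 0:
        raise ValueError("net runtime must be positive; overhead exceeds total")
    return ref_net / new_net
```

With made-up timings of 12.0 s total and 2.0 s overhead for a reference tool, and 1.5 s total and 0.5 s overhead for a compiled mapping, `adjusted_speedup(12.0, 2.0, 1.5, 0.5)` returns 10.0. These numbers are placeholders and do not reproduce the measurements reported above.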