Metabolite and lipid levels can respond to both genetic and environmental influences. Variation may arise from host genes, disease states, lifestyle, diet, medications, and interactions with the gut microbiome. Many rare diseases have genetic origins, but their symptoms can also be affected by non-inherited causes such as infections, cancers, and other acquired conditions. Metabolomics and lipidomics analyses have been helpful in identifying inborn errors of metabolism and in characterizing acquired metabolic conditions such as diabetes and metabolic syndrome. These conditions are typically associated with a small number of metabolites and/or lipids that are significant outliers and are easily identified as abnormal. In contrast, the metabolic changes in rare and undiagnosed diseases may be more subtle, consisting of complex patterns of minor changes in a large number of analytes rather than a few significant outliers. Because these disorders are rare, the number of individuals with a given phenotype is usually limited to one or just a few, precluding the balanced study designs typically used in metabolomics. For these reasons, the use of metabolomics and lipidomics analyses in the evaluation of rare and undiagnosed diseases presents many unique challenges.
The NIH Common Fund’s Undiagnosed Diseases Network (UDN) was established to accelerate the diagnosis and clinical management of rare or previously unrecognized diseases, and to advance research into disease mechanisms. The UDN is composed of multiple clinical sites around the United States and multiple research cores, including DNA sequencing (whole exome and whole genome), model organisms (e.g., Drosophila and zebrafish), and metabolomics. As the Metabolomics Core for Phase I of the UDN, our role was to provide comprehensive untargeted measurements to identify qualitative and quantitative changes in metabolites (metabolomics) and lipids (lipidomics) in biofluids from probands (i.e., individuals with an undiagnosed disease accepted into the UDN) to assist in the evaluation and/or identification of the causes of rare and undiagnosed diseases.
FIG: Overview of the study design. Biofluid samples were collected from probands at the UDN clinical sites and then extracted for metabolomics (urine, plasma, CSF) and lipidomics (plasma and CSF) analyses using chromatography coupled to mass spectrometry (GC-MS for metabolomics and LC-MS/MS for lipidomics). Data were pre-processed (including data quality checks), normalized, and compared against data from the reference population of healthy individuals. Metabolomics and lipidomics results, in the form of a Z-score, log2 fold change, and p-value for each metabolite and lipid of the proband (and associated family members, if applicable), were reported back to the respective UDN Clinical Site for diagnostic assistance.
Here, we describe in detail the raw and processed metabolomics and lipidomics data from analyses of UDN patient samples and make the data available to the research community so that they might be useful in the diagnosis of current or future patients suffering from undiagnosed disorders. Our previous publication (Webb-Robertson et al.) described the detailed statistical approach used for processing this same underlying data set, and so we refer readers to that work for more details on the statistical analyses employed.
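For orientation only, the per-analyte summary named in the study overview (Z-score, log2 fold change, and p-value of a proband relative to a healthy reference population) can be illustrated with a minimal sketch. This is not the statistical method of Webb-Robertson et al.; it assumes log2-transformed, normalized abundances for a single analyte and a simple normal approximation for the p-value. The function name `summarize_analyte` and the example values are hypothetical.

```python
import math
import statistics


def summarize_analyte(proband_log2, reference_log2):
    """Compare one proband value to a reference population for one analyte.

    Assumption of this sketch: both inputs are log2-transformed,
    normalized abundances of the same metabolite or lipid.
    """
    ref_mean = statistics.mean(reference_log2)
    ref_sd = statistics.stdev(reference_log2)  # sample standard deviation

    # On the log2 scale, this difference is the log2 of the ratio between
    # the proband abundance and the reference geometric mean.
    log2_fc = proband_log2 - ref_mean

    # Z-score: distance from the reference mean in reference SD units.
    z_score = log2_fc / ref_sd

    # Two-sided p-value under a standard normal approximation
    # (an illustrative assumption, not the published approach).
    p_value = math.erfc(abs(z_score) / math.sqrt(2))

    return {"z_score": z_score, "log2_fc": log2_fc, "p_value": p_value}


# Hypothetical values, for illustration only.
reference = [10.0, 11.0, 12.0, 13.0, 14.0]
print(summarize_analyte(15.0, reference))
```

With only one proband per phenotype, a comparison of this kind leans entirely on how well the reference population is characterized, which is why the full analysis is deferred to the cited statistical work.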
Kyle, J.E., Stratton, K.G., Zink, E.M. et al. A resource of lipidomics and metabolomics data from individuals with undiagnosed diseases. Sci Data 8, 114 (2021). https://doi.org/10.1038/s41597-021-00894-y