Generative AI models for generating synthetic medical text, time series, and longitudinal data

$0 pledged
0% funded
$3,000 goal
21 days left

About This Project

Most deep learning models require large datasets for training and validation, but preparing such datasets with enough samples per class can be challenging, especially for medical data, where patient privacy is a major concern. Synthetic data not only lowers the risk of data abuse but also provides a governed pathway for sharing data securely. Our project reviews generative AI models for synthetic health records, focusing on modalities, models, metrics, and datasets.

What is the context of this research?

Synthetic data refers to artificially generated data that mimics the statistical properties of real-world datasets without using actual patient information. It enables researchers to access large, diverse, and balanced datasets while safeguarding patient privacy and adhering to data protection regulations. Despite substantial progress in developing generative models for synthetic medical data, the reliability of the generated data remains largely understudied. Our research presents the results of a novel scoping review of practical generative models for producing different types of synthetic health records (SHRs). Our findings can inform the ethical sharing of medical data, support collaboration, and point to alternatives to using real patient data. Beyond the medical field, this work supports building public trust in technology by safeguarding sensitive information.

What is the significance of this project?

The research has several key implications:

  1. Improved Predictive Models: We can address class imbalance and data scarcity by utilizing the right generative model.
  2. Addressing Privacy Concerns: Synthetic data offers a way to use sensitive medical information without exposing actual patient records. By safeguarding privacy, it fosters trust among patients, researchers, and the public, promoting ethical and responsible innovation.
  3. Cost-Efficient Model Training and Evaluation: Organizations can reduce development costs by (i) selecting the right performance measures for evaluating the quality of SHRs, (ii) utilizing available datasets, and (iii) using state-of-the-art generative models for creating synthetic medical data while taking their methodological limitations into account.
  4. Standardization: We can guide policymakers in understanding the utility and limitations of synthetic data, contributing to standards and regulations for its ethical use in healthcare.

What are the goals of the project?

Publish findings and share methodologies, datasets, and insights into existing challenges in generating synthetic health data to deepen understanding of generative AI and support advancements in both academic research and practical applications.


Budget

We have completed the project. The results are going to be published as a novel scoping review paper in npj Digital Medicine, a Nature Portfolio journal (ISSN 2398-6352), which has an impact factor of 12.4. We need financial support to cover the article processing charge (APC), which is about $4090.00/€3290.00 (ref: https://www.nature.com/npjdigi...).

Endorsed by

I am really excited about this project, as it tackles one of the biggest challenges in deep learning—access to diverse and privacy-safe datasets. By leveraging generative AI for synthetic health records, it offers a promising solution for secure and ethical data sharing. I believe this research will play a crucial role in advancing AI-driven healthcare by providing valuable insights into different models, metrics, and datasets while ensuring patient privacy remains protected.

Project Timeline

We have already completed the research work and are looking to publish the results. The requested budget is to cover the journal's publication fee.

Feb 13, 2025

Project Launched

Feb 21, 2025

Publishing project results

Meet the Team

Mohammad
Dr.

Team Bio

Our team combines decades of expertise in machine learning, biomedical signal processing, and medicine. Dr. Arash Gharehbaghi, who leads the research, is an associate professor at Linköping University, Sweden. The other team members in this project are:

- Mohammad Loni is a Senior AI Research Engineer at Volvo CE, Sweden.

- Fatemeh Poursalim is a medical doctor at Servicehälsan center, Västerås, Sweden.

- Mehdi Asadi is a Ph.D. student in computer science at the University of Turku, Finland.

Mohammad

I received my Ph.D. degree in computer science and engineering from MDU, Sweden, in 2022. I am a researcher and developer with over 6 years of experience in roles ranging from implementation to teaching, fundraising, and thesis supervision in deep learning, AutoML, generative AI (GANs, LLMs), and TinyML (quantization, sparsification). My primary skill and interest is developing efficient deep learning pipelines for tasks such as synthetic medical data generation. My side skills include embedded systems, agile software development, the Scrum framework, Git, and programming languages including C++ and Python. Last but most importantly, I am always enthusiastic and persistent in learning.

Lab Notes

Nothing posted yet.


Project Backers

  • 0 Backers
  • 0% Funded
  • $0 Total Donations
  • $0 Average Donation