The dataset used in this study is a high-quality clinical ECG database collected at two major medical centers: Jiangsu Provincial People's Hospital (1,051 cases) and Sun Yat-sen Memorial Hospital of Sun Yat-sen University (171 cases), for a total of 1,222 patients. To make full use of this data, we segmented the original long-term ECG recordings into shorter samples, yielding a training set of approximately 20,000 real samples. The dataset is clinically representative, covers a wide spectrum of atrial substrate states, and provides a reliable foundation for model training and validation.
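The segmentation step described above can be illustrated with a minimal sketch. It is not the exact procedure used in this study: the function name `segment_recording`, the window length, and the stride are hypothetical, and the actual segment duration and sampling rate are not stated here. It only shows the general idea of cutting one long recording into fixed-length windows.

```python
def segment_recording(signal, window_len, stride=None):
    """Split a 1-D sequence of ECG samples into fixed-length windows.

    window_len and stride are given in samples. Both values are
    illustrative assumptions, not the settings used in the study.
    Trailing samples that do not fill a whole window are dropped.
    """
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    if stride is None:
        stride = window_len  # non-overlapping windows by default
    if stride <= 0:
        raise ValueError("stride must be positive")
    return [
        list(signal[start:start + window_len])
        for start in range(0, len(signal) - window_len + 1, stride)
    ]


# Toy stand-in for a long-term recording (10 samples).
recording = list(range(10))
segments = segment_recording(recording, window_len=4, stride=3)
print(len(segments))  # 3 windows, starting at samples 0, 3 and 6
```

A stride smaller than the window length produces overlapping segments, which increases the number of samples per patient. Whichever setting is used, segments from the same patient should stay in the same split to avoid leakage between training and validation.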
To protect patient privacy and comply with strict data usage agreements, the real clinical data is not publicly available. Access can be granted for research purposes upon reasonable request; please contact the corresponding authors for details.
Because the real data cannot be shared, and to advance research in this field, we have generated and are publicly releasing a large-scale synthetic ECG dataset using our ECG-LDM model. Additional samples can be generated by following the ECG-LDM method proposed in our work.
Total Samples: 216,913
Labels: Each sample is annotated with its atrial substrate status (0 for normal, 1 for abnormal).
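
As a quick sanity check after loading the synthetic data, the class balance can be summarized from the label convention above (0 for normal, 1 for abnormal). The sketch below assumes the labels have already been read into a plain Python sequence of integers. The release file format is not described here, so no loading code is shown, and the name `summarize_labels` is hypothetical.

```python
from collections import Counter

# Label convention taken from the dataset description.
LABEL_NAMES = {0: "normal", 1: "abnormal"}


def summarize_labels(labels):
    """Count samples per atrial substrate class.

    `labels` is any iterable of integers; how it is loaded from the
    released files is left open because the format is not specified here.
    """
    counts = Counter(labels)
    unknown = set(counts) - set(LABEL_NAMES)
    if unknown:
        raise ValueError(f"unexpected label values: {sorted(unknown)}")
    return {name: counts.get(value, 0) for value, name in LABEL_NAMES.items()}


# Toy labels for illustration only, not real dataset statistics.
example = summarize_labels([0, 1, 1, 0, 1])
print(example)  # {'normal': 2, 'abnormal': 3}
```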