Heart Disease Study 🫀
We will use the full version of the Heart Disease dataset, as available on the UCI ML repository.
The database is the result of a study for the diagnosis of coronary artery disease, as described in this paper.
The dataset contains data collected from patients at four different hospitals, in 1988:
- Cleveland Clinic in Cleveland, Ohio (303 patients);
- Hungarian Institute of Cardiology in Budapest, Hungary (425 patients);
- Veterans Administration Medical Center in Long Beach, California (200 patients);
- University Hospitals in Zurich and Basel, Switzerland (143 patients).
💡 Each hospital will be mapped to a single PySyft Datasite, hosting its own version of the Heart Disease Study data.
We will pretend that these data are not public, as is most likely the case with real medical data. Therefore, our main focus in this tutorial will be to learn how to work with non-public data while maintaining privacy.
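As a rough illustration of the setup described above, the sketch below models the four hospitals as plain Python records, one per Datasite, using the patient counts listed earlier. It deliberately does not call PySyft: the `HospitalSite` class, the `datasite_name` values, and the helper functions are hypothetical names introduced only for this example, not part of the tutorial's code or of the PySyft API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HospitalSite:
    """One hospital, mapped to exactly one Datasite (hypothetical model)."""
    datasite_name: str  # hypothetical identifier, not a real PySyft name
    location: str
    n_patients: int


# Patient counts as stated in the hospital list above.
SITES = [
    HospitalSite("cleveland-clinic", "Cleveland, Ohio", 303),
    HospitalSite("hungarian-institute-of-cardiology", "Budapest, Hungary", 425),
    HospitalSite("va-long-beach", "Long Beach, California", 200),
    HospitalSite("university-hospitals-switzerland", "Zurich and Basel", 143),
]


def total_patients(sites):
    """Number of patients across all sites combined."""
    return sum(site.n_patients for site in sites)


def share_by_site(sites):
    """Fraction of the combined data that each single site holds."""
    total = total_patients(sites)
    return {site.datasite_name: site.n_patients / total for site in sites}


if __name__ == "__main__":
    print("Total patients:", total_patients(SITES))
    for name, share in share_by_site(SITES).items():
        print(f"{name}: {share:.1%} of all records")
```

Under these counts, even the largest site holds less than 40% of all records, which is one reason working across several Datasites is worthwhile.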
What you will learn 🎓
In this tutorial, you will learn how to:
- work remotely with non-public medical data;
- use PySyft to run machine learning experiments on non-public and distributed medical datasets;
- take advantage of access to multiple medical datasets for better machine learning modelling.
Materials 🧑‍💻
The tutorial is organised into multiple Jupyter notebooks that will guide you through the different steps of our machine learning experiment, using PySyft.
Ready to get started?
Everything you need to start working with the tutorial is available on GitHub! You can start by cloning the repo and following the instructions in the README file:
$ git clone https://github.com/openmined/syft-heart-disease-tutorial
Feedback? Always welcome!
If you liked this tutorial, or if you have any questions or feedback, please feel free to use one of the options below:
- Star the repository.
- Open an issue.
- Reach out in the #support channel on Slack.