The Challenge
The challenge was to predict patients’ disease status and length of hospital stay from their lab test data. These predictions would help hospital administrators ensure that enough beds are available for everyone. A similar approach was used to predict disease status from patients’ genetic variations, which would help clinicians make better decisions about medical treatments. The biggest obstacle was the large amount of missing data and the inconsistencies between databases, which required significant time to be spent on data cleaning.
Procogia’s Approach
- Our team applied data cleaning and wrangling to handle missing values and low-quality data using pandas, PySpark and Dask.
- We managed a large dataset of human genetic variants using PySpark on a Hadoop cluster.
- Machine Learning (ML) techniques such as SVM, Random Forest, and logistic regression were used for the different classification tasks.
- We used data visualization tools such as Tableau to visualize the data in a compelling and easy-to-digest way.
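As an illustration of the missing-value handling mentioned above, here is a minimal, library-free sketch. The project itself used pandas, PySpark and Dask, and its exact cleaning rules are not described, so everything below is an assumption made for the example: the records, the feature names, the `clean` helper, the row-dropping threshold and the choice of median imputation.

```python
from statistics import median

# Hypothetical lab records (illustrative only); None marks a missing measurement.
records = [
    {"patient_id": 1, "glucose": 5.4, "creatinine": 0.9},
    {"patient_id": 2, "glucose": None, "creatinine": 1.1},
    {"patient_id": 3, "glucose": 6.1, "creatinine": None},
    {"patient_id": 4, "glucose": None, "creatinine": None},
]
FEATURES = ["glucose", "creatinine"]


def clean(rows, features, max_missing=1):
    """Drop rows with more than `max_missing` gaps, then median-impute the rest.

    Assumes every feature has at least one observed value among the kept rows;
    otherwise statistics.median raises StatisticsError.
    """
    # Copy each kept row so the input records are left untouched.
    kept = [
        dict(row)
        for row in rows
        if sum(row[f] is None for f in features) <= max_missing
    ]
    for f in features:
        observed = [row[f] for row in kept if row[f] is not None]
        fill = median(observed)
        for row in kept:
            if row[f] is None:
                row[f] = fill
    return kept


cleaned = clean(records, FEATURES)
for row in cleaned:
    print(row)
```

Patient 4 is dropped because both measurements are missing, and the remaining gaps are filled with the per-feature median of the kept rows. With pandas, the same idea would typically be expressed with `DataFrame.dropna` and `DataFrame.fillna`.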
The Results
- Clinicians and scientists received more insights to assist them in making better decisions about medical treatments and underlying conditions.
- Improved predictions of patients’ disease status, with encouraging accuracy (80%), precision and recall (summarized by the F1 score).
- Improved the hospital’s operational efficiency by predicting each patient’s length of stay. Hospitals can identify patients at high risk of a long stay at the time of admission. As a result, these patients’ treatment plans can be optimized to minimize length of stay, and the predictions can aid logistics such as room and bed allocation planning.
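For readers unfamiliar with the metrics named above, the following sketch shows how accuracy, precision, recall and the F1 score relate for a binary disease-status classifier. The labels are made up for illustration and are not the project's data or results; the `classification_metrics` helper is likewise hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy alone can look good on imbalanced data, which is why precision and recall (and their harmonic mean, F1) are usually reported alongside it.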
Services Used
Data Science
We use open source technology to leverage the full potential of your data. Predictive and prescriptive results are actioned using AI and Machine Learning (ML).
BI & Analytics
We transform complex and high-volume data into BI reports using dashboards and visualizations, allowing you to make smarter decisions.
Bioinformatics
We deliver scientific results that drive clinical and translational research decisions. Our Bioinformatics team has extensive experience designing, optimizing, executing and analyzing pre-clinical and clinical research projects using next-generation sequencing technologies.