In the previous parts of this series, I discussed how data science and machine learning are becoming more turn-key and are being used to drive business outcomes in the traditional enterprise. Perhaps no market has seen more advancement in the use of data to drive results than healthcare. From improvements in patient care services to curing diseases in biotechnology, healthcare enterprises are storing, searching, and analyzing data at a massive scale.
Healthcare was a key focus for many speakers at the recent Strata+Hadoop World conference in New York City. Presentations came from biotech scientists, government healthcare specialists, and representatives from large providers such as Kaiser Permanente. All of them focused on positive results that have been achieved from data analytics so far, as well as areas of future opportunity for research and/or profitable business. One thing was abundantly clear: the digitization of healthcare data has progressed rapidly, and now is the time to unlock the value.
In the provider space, data storage has come a long way. Most hospital systems have developed a digital medical record strategy and have implemented repositories for various imaging technologies. The current focus is to acquire more data about their patients and to use analytical strategies to improve their services. According to Kaiser Permanente, 88% of patients were willing to share additional personal data with their doctors to improve healthcare services, and more than 80% showed interest in using mobile applications to manage their healthcare.
The focus at Kaiser is to “democratize” data: to acquire it regularly from patients in their system and share it widely with doctors and research professionals. This has proven challenging; although patients are willing to share data with their own doctors, they have concerns about privacy when that data is distributed more widely. This places a high emphasis on security, access controls, and encryption when collecting patient data into a research repository.
On the other side of the healthcare industry sit biotechnology and life sciences. Biotech could quite possibly be the place where data analytics can provide the most value, but progress has been slower than among providers. A key reason is that valuable biotech data can be very large, consisting of high-resolution images or the digitization of the entire human genome. Organizations must first solve the challenge of storing massive quantities of data before they can use analytics to unlock the research potential.
The challenge of storing biotech data is growing more difficult for three primary reasons:
- Data that was previously kept only in physical form (stained slides, tissue samples, etc.) is now going digital and is being stored as images, DNA sequences, and other formats.
- Regulatory standards will eventually shift towards placing lengthy retention requirements on digital data, matching existing standards for the physical samples.
- The research benefits of keeping samples in digital form are advancing rapidly.
At RoundTower, we have used our expertise in designing big data infrastructure to help biotech firms. A storage solution for these firms must provide high-performance access and direct connection to lab equipment, and must seamlessly transfer data to more cost-effective tiers for long-term retention without requiring significant interaction from lab professionals. The solution must also be designed so that researchers can quickly search for, retrieve, and analyze new and old records alike.
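To make the tiering idea concrete, here is a minimal sketch of an age-based placement policy. It is an illustration only, not a description of RoundTower's actual design: the tier names (`performance`, `capacity`, `archive`), the 30-day and 365-day thresholds, and the `LabRecord`, `choose_tier`, and `plan_moves` names are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical thresholds, chosen only for illustration.
HOT_DAYS = 30     # recently used data stays on the fast tier
WARM_DAYS = 365   # data older than this moves to long-term retention


@dataclass
class LabRecord:
    record_id: str
    last_accessed: date


def choose_tier(record: LabRecord, today: date) -> str:
    """Pick a storage tier from the number of days since last access."""
    age_days = (today - record.last_accessed).days
    if age_days <= HOT_DAYS:
        return "performance"
    if age_days <= WARM_DAYS:
        return "capacity"
    return "archive"


def plan_moves(records, current_tiers, today):
    """Return {record_id: target_tier} for records that are on the wrong tier."""
    moves = {}
    for record in records:
        target = choose_tier(record, today)
        if current_tiers.get(record.record_id) != target:
            moves[record.record_id] = target
    return moves
```

A policy like this would run on a schedule in the background, so lab staff never move files by hand. A real deployment would likely also weigh file size, retention rules, and project status rather than age alone.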
Once we have solved the significant challenge of storing biotech data, we can begin layering machine learning and data science techniques on top, with the ultimate goal of predicting or curing diseases and improving our understanding of the human body. This work requires a tight partnership between medical professionals and the experts who can help them build data storage and analytics solutions to meet their specific needs. RoundTower has already developed significant expertise in this area, and we will continue to expand our presence as our healthcare customers focus on unlocking value from their data.
I am very excited about the potential for positive outcomes in the healthcare industry via data science and analytics. It presents a tough challenge for infrastructure specialists such as myself, as we need to develop knowledge outside of the infrastructure space to fully understand how our customers need to store and consume data. It also offers the possibility of curing ailments and improving healthcare services in ways that were not possible before data could be correlated across large-scale repositories. Success will depend on researchers being able to access the data they need to find beneficial information or make valuable predictions, and on our ability to build infrastructure that can meet this demand.
Part 1: Data Analytics for the Enterprise
Providing Turn-Key Solutions
I recently attended the Strata+Hadoop World conference in New York City held from September 27-29. I have been to the conference in previous years, and in the past you could say that Hadoop World was a bit of a geek-fest. The conference focused on things like core Hadoop infrastructure, exciting new software like Spark, and new innovations in the open source world.