Hacking Your Health With Elasticsearch
My doctor recently prescribed me a Continuous Glucose Monitor (CGM). While I don't have diabetes, I'm likely at risk of developing it due to a number of factors. Health is the valid concern here, but my secondary interest is part science, part tech, and partly that I can now tell my wife I'm a cyborg.
I've been wearing the CGM since December 1st, which means I don't yet have enough data to establish trends. I also haven't experimented much with my eating habits to see how my glucose levels react to changes. That will likely come in future 'Hacking Your Health with Elasticsearch' posts. My brother, however, has been using his CGM for several months and has agreed to share his data for the purposes of this post. He has experimented more with one-meal-a-day and keto-style eating habits, although he's fallen off the wagon a time or two, so we'll see what comes out of ingesting that data.
The goal here is to show how quickly you can go from zero to hero with a tool like Elasticsearch. We will import data from a Continuous Glucose Monitor into Elasticsearch and walk through setting up your very own Machine Learning anomaly detection in Kibana. You don't have to be a techno-wizard to get rolling and learn new things from your data.
I'm using my brother's glucose monitoring data in this instance because I only just got my own monitor, but the broader point stands: ingesting and visualizing data in Elasticsearch is painless. Kibana has seen a vast number of improvements that make it a one-stop shop for many use cases that previously would have required involving a developer, someone who knew how to transform the initial dataset into JSON, or someone who understood Logstash pipelines.
To start, we can take a look at the data my brother exported for me. It is simple information consisting of only a few columns: notably a timestamp, a glucose level, and a record type. We're mostly interested in the timestamp and glucose, but I'll bring in the other columns as well because they may help differentiate the data once I load up my own. To keep the two datasets apart, I added my own 'userid' column to the CSV file and assigned the value 2.
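If you want to tag your own export before uploading it, a few lines of standard-library Python will do. This is just a sketch: the column names below are assumptions based on the export described above, not the exact headers your CGM vendor uses.

```python
import csv
import io

# A tiny stand-in for the CGM export; real exports have vendor-specific headers.
csv_text = """timestamp,glucose,record_type
2021-12-01T08:00:00,95,EGV
2021-12-01T08:05:00,97,EGV
"""

# Read the rows and tag each one with a userid for deconfliction.
rows = list(csv.DictReader(io.StringIO(csv_text)))
for row in rows:
    row["userid"] = "2"

# Write the tagged CSV back out, ready to drag and drop into Kibana.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["timestamp", "glucose", "record_type", "userid"])
writer.writeheader()
writer.writerows(rows)
tagged_csv = buf.getvalue()
```

The same trick works for any per-user column you want to add before mixing datasets in one index.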
Elasticsearch provides ample opportunity to import your own data easily without having to do any development on your end. From the initial Kibana front page, I chose 'Upload a file', and from there it was as simple as dragging and dropping the delimited glucose export into the browser. The most difficult part was making sure the timestamp was parsed accurately; I noticed that without using the advanced options to set the timestamp field, I wasn't able to progress to the machine learning features with the data.
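Behind the scenes, the upload wizard infers field types from the file and creates an index mapping for you. For this dataset, the mapping it generates should look something like the sketch below; the field names here are assumptions based on the columns described above, and the wizard infers the actual names and types from your file.

```
{
  "mappings": {
    "properties": {
      "timestamp":   { "type": "date" },
      "glucose":     { "type": "long" },
      "record_type": { "type": "keyword" },
      "userid":      { "type": "long" }
    }
  }
}
```

Getting the `timestamp` field mapped as a `date` is the part that matters; it's what unlocks the time series features later.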
With the data ingest completed, the next step was obvious: load up anomaly detection in Elasticsearch, which is great when you want to see what might be interesting in time series data with minimal effort. I created a quick machine learning job on my brother's data from the beginning of September through December, using a 1-hour bucket span and the mean of the glucose level. We were looking for anything out of the ordinary, and my hope was that this period of time would surface some interesting nuggets. The Machine Learning wizard walked me through the process of setting up this job and provided all the guidance needed to just get it done.
The decision to use a 1h bucket span came from clicking the 'Estimate bucket span' button. For those who enjoy seeing how the sausage is made, here's a screenshot of the JSON that the ML job wizard creates.
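For context, a single-metric job like this boils down to a small configuration against the Elasticsearch `_ml/anomaly_detectors` API. The sketch below shows roughly what the wizard produces; the job name and field names are my assumptions based on the columns described earlier, not the wizard's exact output.

```
PUT _ml/anomaly_detectors/glucose-mean-1h
{
  "description": "Mean glucose per 1-hour bucket",
  "analysis_config": {
    "bucket_span": "1h",
    "detectors": [
      {
        "detector_description": "mean(glucose)",
        "function": "mean",
        "field_name": "glucose"
      }
    ]
  },
  "data_description": {
    "time_field": "timestamp"
  }
}
```

The wizard also creates a datafeed that streams documents from the index into the job, which is why having the timestamp field set up correctly during upload matters so much.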
After you run the job, you get a summary graph of the quick highlights. This lets you see at a glance that 'oh yeah, there's going to be some information here!'
This overview is interesting in its own right simply because you can see the ups and downs of someone who's figuring out how the CGM works and how to maintain a "healthy" glucose level. In the beginning there were some anomalies, but those quieted down over time. Then came some data points in the November time frame that we'll look at more closely.
Before this chart there was a whole lot of uneventful data, well within "normal" glucose levels, but I highlighted two rather distinct areas. The first is a five-day fast that both my brother and I completed between November 15th and 20th. The second anomaly is easily diagnosed as American Thanksgiving. The data shows that while fasting, glucose levels were below the norm; after what can only have been a large meal of delicious, carbohydrate-laden sides and desserts, you see an incredible spike in glucose.
This understanding of the data was easy to see not only in the graph Elasticsearch presents out of the box, with its helpful markers, but also in the table below the graph. I am still sometimes amazed by how simple it is to take a dataset and import it into Elasticsearch without touching any programming language or extract-transform-load tool. Drag and drop, assign a timestamp (and possibly a timestamp format), and you're off to the races.
Elastic provides a free 30-day license to access some of the premier add-on capabilities for Elasticsearch. Machine Learning is one of these paid features available through this license, along with several other advanced features that give you the ability to tear into your data efficiently and with little effort.
I now have my own continuous glucose monitor, so stay tuned for more posts about loading the data, extracting insights, and, in time, seeing how my own body reacts to these experiments as well.