Full Disclaimer: I’ve been working at Splunk for close to 4 years now as a software engineer. For those that aren’t familiar with Splunk, it’s a tool intended to collect machine generated data (aka log files), index it, and allow you to search and report on it. Most companies use Splunk in their IT department to monitor and troubleshoot problems, but it’s really starting to branch out into all areas of businesses. After discovering Splunk, it didn’t take long for me to realize I could use it myself even though I wasn’t running a huge IT department. Of course I threw a bunch of my web server logs at it and started fixing errors, but that’s par for the course. Things started getting interesting when I decided I was going to Splunk myself.
With the rise of the Quantified Self movement, more and more biometric devices and self-tracking tools are becoming available to normal consumers. As we start using more devices that track more variables, the amount of data generated grows exponentially. Devices like the Basis allow you to track things like your heart rate or skin temperature on a minute-by-minute basis, resulting in thousands of data points every day. It’s debatable whether we can call this “Big” Data since it only amounts to a few megabytes, but it definitely scales outside the realm of manual handling very quickly.
The Basis is just one device that you can use to help track yourself. There are hundreds more. Personally I’m currently (or within the past year) tracking the following:
- Fitbit Zip – steps (also tracking my cat Bunki with a Fitbit)
- Nike Fuelband – steps and activities
- Basis B1 Band – steps, heart rate, skin temperature, air temperature, and galvanic skin response
- Withings scale – weight and fat mass
- Garmin Forerunner 610 – GPS location, heart rate (paired with a Wahoo ANT+ HRM), and pace for my runs and bike rides
- Lumoback – posture
- Zeo – brain activity during sleep (Company out of business)
- Automatic – tracking car trips, mpg, and driving habits
- Moves (iOS/Android) – steps, passive tracking of places I’ve been, cycling, running
- Openpaths (iOS/Android) – geolocation
- Google Latitude (service discontinued) – geolocation
- foursquare (iOS/Android) – active tracking of places I’ve been
- Last.fm – Music listening habits
- Biologger – a custom web based app for tracking things like sneezes, last time I cleaned the litterbox, did the dishes, headaches, dyed my hair, etc.
- Email inbox – custom script for checking number of read and unread messages in my inbox
- Twitter – tracking my tweets @edrabbit
Each of these devices or apps has its pros and cons. Battery life, ease of syncing, and of course quality of data are all factors. But the biggest issue I ran into with Splunking myself was getting my data out of all these services.
If I was really lucky, a full-featured API was available (like Fitbit’s). Other times I had to settle for an export in CSV or some other format (like Zeo). Unfortunately there are a few services/devices that just don’t offer any official way to get your data out (I’m looking at you Basis).
Over time I was able to find a number of tools written by others to help free your data. Additionally I wrote several of my own tools to extract data and transform it into human-readable and Splunk-friendly log formats. All of my tools are available on my github page. I’ve got tools to fetch and/or format data from Fitbit, Nike Fuelband, Twitter, Google Latitude, GPX files, CSV files, Zeo, foursquare, and Basis.
But wait, I said Basis didn’t have any way to get your data out! As an excellent example of users finding ways to use things the way they want, there is a way to get your data out of Basis. One Basis user (Bob Troia) discovered that while Basis doesn’t offer an API to users, they do have an internal API where JSON is passed through the browser in order to show you dashboards with your data. All it took was a little bit of scripting to grab this automatically and suddenly we all had control of our own data. You can find his Basis Data export tool here. This was great, but the data was exported in a horrible combination of JSON and CSV. Luckily one of my scripts can turn it into minute-by-minute log files as if they were generated on the fly. There are a bunch of other self-tracking hackers out there building tools like this. They’re often easy to find, but to help I’ve started to try to catalog them at wiki.biologger.com.
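To give a rough idea of what that kind of conversion involves, here is a minimal Python sketch. It is not the actual script from my github page, and the input layout (a start time, a sampling interval, and one list of values per metric) is an assumption for illustration rather than the real Basis export format. The function name `to_log_lines` and the field names are hypothetical too. It writes one timestamped line per sample as key=value pairs, a layout Splunk can generally pick up as fields without extra configuration.

```python
from datetime import datetime, timedelta, timezone


def to_log_lines(start, interval_seconds, metrics):
    """Turn parallel per-interval metric lists into one key=value log line per sample.

    `metrics` maps a (hypothetical) field name to a list of readings,
    where None marks a gap in the data.
    """
    sample_count = max((len(values) for values in metrics.values()), default=0)
    lines = []
    for i in range(sample_count):
        timestamp = start + timedelta(seconds=i * interval_seconds)
        fields = []
        for name in sorted(metrics):
            values = metrics[name]
            value = values[i] if i < len(values) else None
            if value is None:
                continue  # leave gaps out instead of logging a fake reading
            fields.append(f"{name}={value}")
        if fields:  # skip samples where every metric is missing
            stamp = timestamp.strftime("%Y-%m-%dT%H:%M:%S%z")
            lines.append(stamp + " " + " ".join(fields))
    return lines


# Made-up sample data: three minutes of readings with one missing heart rate.
example = to_log_lines(
    datetime(2013, 7, 24, 9, 0, tzinfo=timezone.utc),
    60,
    {"heartrate": [72, None, 75], "steps": [0, 12, 30]},
)
for line in example:
    print(line)
```

Putting the timestamp first and keeping one event per line keeps the output readable by both humans and Splunk, which is the whole point of the exercise.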
Armed with all these devices generating data and the appropriate tools for fetching and formatting it correctly, I was finally able to feed all of this into Splunk. Let’s take a look at some of what I’m able to get out of this.
I have around 1.5 million events logged in Splunk, taking up only a couple hundred megabytes. For anyone already familiar with Splunk, this is obviously within the 500 MB index limit of the free product, which means anyone can do this without shelling out cash for a license.
One of the most common questions when people learn I’m using all these devices is how they compare to each other. Previously I could only give a vague sort of answer about how I thought the Fitbit was the most accurate and that the other devices were a few thousand off each day. But with Splunk I’m able to generate a visualization comparing the Fitbit with the Basis and the Fuelband. It’s true, they don’t give similar numbers, but they do seem to track similarly over time.
How do I compare to Bunki in steps?
How many times have I sneezed each day?
What venues have I checked in at with foursquare the most over the past 8 years?
What kind of venues do I tend to check in at?
With the Google Maps App for Splunk 5 I can even plot all of these checkins on an interactive map. (The newest version of Splunk 6 comes with maps built in.)
I can also use this to see exactly where I went on trips, what restaurants I ate at, and what activities I did. So next time a friend asks what’s good to do in a city I’ve visited I can just pull up the map and show them exactly where to go. Here’s an example map from Splunk’s .conf in Vegas last year:
The maps above are based on foursquare data, which requires active check-ins at specific venues. However, tools like Google Latitude and Openpaths simply record latitude and longitude over time, so you get a more detailed report on my movements. Here’s Openpaths tracking me around San Francisco and my journeys to the Black Rock Desert.
Or how about the current state of my email inbox? I’m tempted to put this up on a screen at home so I can see how many unread messages I have without having to open up my inbox.
What about my favorite music based on Last.fm? This is an interesting one that also shows off how multipurpose Splunk can be. As you can see I listened to a single song (The Acoustic Hoods – Cycles of Time) several times more than all other songs. In Splunk I can click on that pie piece and it will show me every single time that song was logged as played. Turns out I left the song running on repeat on my work computer when I went home for the evening and it looped all night long. Oops. Luckily with Splunk it’s easy to just exclude those events to get a more realistic picture if that’s what I wanted.
This is all pretty simple stuff that even the newest of Splunk users can do. It’s mainly looking at just one data source. What I really wanted was to be able to see a picture of any day of my life and know what that day was like for me. So I put together a dashboard that lets me do just that. It pulls in info from various different services that have vastly different formats and lets me see it all on one page instantly. No more clicking from site to site to site. I’m still working on getting more and more relevant data in, but if you take a look at the screenshot below you’ll see exactly what July 24th, 2013 looked like for me. Based on my heart rate around 9am and 6pm it looks like I biked to work, I ate lunch at Mexico au Parc, went to the Apple store to get my iPhone replaced, sneezed 3 times in the morning, and learned how to play Top Gun on the accordion.
Keen-eyed Splunk users will notice in the Summary screenshot that some of the data sources are data dumps and not actually feeding into Splunk live. I want to work on getting everything feeding into Splunk automatically so it is all constantly up to date rather than have to take 20 minutes to sit down and get new data dumps from the various services.
I also want to start Splunking my house. I’ve got a Nest, a wifi washing machine, and more and more devices becoming internet enabled. It’s just a matter of grabbing it and feeding it into Splunk.
You can download the free version of Splunk which is available for almost every platform, or you can give our hosted (and also free) Splunk Storm a try. There’s a slight learning curve to using Splunk, but if you can use Google, you can learn to use Splunk. I’d love to hear from anyone else out there that either wants to or already has started Splunking themselves. Shoot me an email or drop me a tweet @edrabbit!
The plan was to hike from the Education Center of Point Reyes down along the Coast Trail, and spend the night at Glen Camp. Then we’d turn around and head back up, clocking in about 9 miles each way. I packed my bag, laced up my boots and put together an arsenal of tracking devices and off we went.
I brought with me my normal daily carry quantifying gadgets which included the Basis watch, a Nike Fuelband, and a Fitbit Zip. I decided to put my phone in airplane mode for the hike so I could conserve battery for taking photos. That meant I didn’t have any tracking with foursquare, Moves, openPaths, or any other apps. I also brought along my Garmin Forerunner 610 with the Wahoo ANT+ heart rate strap which turned out to be quite sweaty, but worth it for the data. And of course my Garmin Vista HCx GPSr which always goes with me on trips. I also brought a Newtrent 7000mAh USB battery pack for recharging devices as needed since just about everything is chargeable over USB. I ended up only needing to charge the Forerunner overnight.
We parked our cars at the Educational Center in Point Reyes and started down, or rather up, the Laguna trail at about 2013-09-21 13:08:27. Laguna connected up with the Fire Lane trail which took us to Coast Camp and our first glimpse of the beach only 40 minutes into our hike. After stopping for lunch under a giant eucalyptus tree we got back on the Coast Trail (14:50), stopping several miles later at Kelham Beach for a while (16:09 – 17:21).
While exploring the beaches I turned off my Garmin Forerunner watch because I wasn’t sure how long the battery would actually last: a very common problem with all these tracking devices. So I didn’t get my heart rate as much on the first day, but still got a bunch of GPX tracks. We finally made it into Glen Camp around 18:57, and set up camp as the sun was setting, but not before meeting a banana slug at 18:00 who was hanging out with some mushrooms at 37°59’36” N 122°48’21” W.
We spent the evening cooking a delicious dinner, and Tim made Bananas Foster for dessert to celebrate the motivation for this trip, Mella’s birthday! Yes, Tim hiked in a full pound of brown sugar, half a pound of butter, and a small bottle of Hennessy to make that happen. That night we also got to hear firsthand “what the fox says” and were comforted in knowing that it was foxes and not actually someone being murdered.
The next morning we awoke, packed things up, and got back on the trail around 12:08:06. We stopped off at Arch Rock for lunch around 13:15, got back on the trail at 14:19, took a side detour to Sculptured Beach at 15:42 and then got back on the trail at 16:52. For some reason, my Garmin Vista HCx stopped tracking around 18:15, but my Garmin Forerunner tracked me all the way to the parking lot, putting my feet on concrete ten minutes later at 18:25.
Here are some more screenshots from my Nike Fuelband, Fitbit and Basis showing some of my stats for the weekend:
View the entire trip in EveryTrail here.
Or grab the raw GPX files for fun:
Garmin Vista GPX file from 2013-09-21
Garmin Forerunner GPX file from 2013-09-21
Garmin Vista GPX file from 2013-09-22
Garmin Forerunner GPX file from 2013-09-22
In the end, according to Fitbit, I clocked in over 54,000 steps for the weekend. Fitbit also says almost 26 miles, but I think my stride might be a bit off, as the GPS devices put it closer to 20 miles. I was able to add 11,694 Fuel points to my account.
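To put rough numbers on that stride suspicion, here is a quick back-of-the-envelope check in Python. It treats the rounded figures above (54,000 steps, 26 miles from Fitbit, 20 miles from GPS) as exact, which they are not, so the results are ballpark only. The helper name `stride_feet` is made up for this sketch.

```python
FEET_PER_MILE = 5280


def stride_feet(miles, steps):
    """Average stride length in feet implied by a distance and a step count."""
    return miles * FEET_PER_MILE / steps


steps = 54_000      # "over 54,000" steps, rounded down
fitbit_miles = 26   # "almost 26 miles" according to Fitbit
gps_miles = 20      # "closer to 20 miles" according to the GPS devices

fitbit_stride = stride_feet(fitbit_miles, steps)
gps_stride = stride_feet(gps_miles, steps)
overestimate = fitbit_miles / gps_miles - 1

print(f"Stride implied by Fitbit distance: {fitbit_stride:.2f} ft")  # 2.54 ft
print(f"Stride implied by GPS distance:    {gps_stride:.2f} ft")     # 1.96 ft
print(f"Fitbit distance vs GPS:            {overestimate:.0%} higher")  # 30% higher
```

In other words, if the GPS distance is the one to trust, my actual stride with a pack on was around two feet rather than the roughly two and a half feet that the Fitbit distance works out to.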
I learned that I do indeed sweat a lot while hiking. The interesting thing was that I continued to sweat Sunday evening even though we were just riding in a car. Not sure if that’s accurate or if the Basis was just confused.
I also learned that I start hurting and feeling tired while hiking if my heart rate gets up over 110. Being able to glance down at my watch and monitor my heart rate was great for helping me slow down and take it easy, making sure I’d make it back to civilization. After all, it wasn’t a sprint with a 30 pound pack, but closer to a marathon with one.
If you’ve been tracking your own personal data for a while like I have, you probably found yourself at one point asking, “Great, now what do I do with all of this?” If you’re Brian House, the answer is: make a record.
Brian collected a year’s worth of geolocation data through an iOS app called OpenPaths and transformed it into a physical vinyl record. His project, titled “Quotidian Record”, is a beautiful white vinyl with each day represented with a revolution of the record. You can read more about his project in his blog post and this Wired article.
I’ve been carrying a Fitbit around for 2+ years, generating all sorts of data as part of my ongoing preoccupation with self tracking. I’m not sure where the original idea came from, but somehow I got it into my head that my cat, Bunki, should join me on this quantified self journey. Perhaps it had something to do with her slight roundness. Due to a combination of factors (losing a Fitbit, buying a new one, having that one get run over a week later, Fitbit generously replacing the run-over one, and then me finding the lost one) I ended up with an extra Fitbit Zip that was perfect for Bunki.
I imagine some cats wouldn’t be too into the idea of lugging around a Fitbit, but I figured we’d give it a try. Luckily Bunki had worn a collar in the past, and she’s a 100% indoor cat, so there was a chance this could work without annoyance and without losing another device. I started out with just the collar to make sure she was still cool with that. Then I added just the silicone skin without the actual Fitbit. She didn’t even seem to notice it after the first minute or two, so a few hours later I added in the actual device and set up a Fitbit account of her own! A couple days later and there have been no complaints.
Since Bunki doesn’t have an iPhone she couldn’t just sync over Bluetooth Smart, so I used the tiny USB receiver that comes with the Zip. It’s plugged into a Windows machine that I have running 24/7. It just happens to currently be in the same room as her litter box right now, so it all happens automatically. When we inevitably move the litter box, I’ll have to figure out a new solution. I think treats will probably be involved. One of the big benefits of the Fitbit Zip, besides its smaller size, is that the battery will last 4-6 months, so Bunki doesn’t have to worry about charging anything.
I have no idea how accurate the step count is (you try calibrating that), I don’t know if she gets to double her score since she has four legs, and I’m sure there’s some cheating perhaps going on when she’s scratching. But all in all the data appears to be pretty good as long as you use it comparatively.
You’ll notice that somebody was taking a nap from 11am-3pm while we were out of the house. Proof that yes, the cat has not even moved from that spot since we left.
Memoto presents a short (23 minute) documentary on Lifeloggers. It’s a great conversation with a number of the characters in the Quantified Self, Lifelogging, Self-Tracking, “Record All The Things!” movement. I couldn’t help but smile when I saw Dave Asprey wearing a Splunk shirt. (Disclaimer: I work for Splunk and am using Splunk Storm to track my own self.) There are also some good blog posts over on Memoto’s site including an interview with Steve Mann.