I got my first Fitbit tracker a couple of years ago and I’ve been a loyal Fitbit user ever since: I’m currently on my third tracker, and my family has them too. Very quickly I became interested in getting a bit more out of my data and charting it against other health data I have available, so I went to Fitbit’s website, as I was quite sure they must have an API, right? Yes, they do. And yes, I can use it for free. Because, as Fitbit says, “your data is yours”. Awesome.
And then it turned out it’s not so much “my data” as “my totals”. Fitbit’s public API could only give me aggregated data for each day: the total number of steps, averages, etc. I wanted a count of my steps for each 5-minute period, the same as I can see on Fitbit’s dashboard when I log in. It turned out that was not possible unless I had a commercial application, submitted a request to Fitbit, and they decided it was worth it. Boo. (Please refer to the bottom of this post for a note on the state of the API today.)

So, I did what any (I think…) developer would do. I checked out the source code! Since the website was able to show the data, it had to be getting it from somewhere. And, good news here, Fitbit’s dashboard and all its graphs not only show all the data I wanted, they also load it via AJAX, making what looked like API calls and getting results in JSON. That means I can write a scraper which will log in to Fitbit’s website using my account, make the same API calls and get the data. Perfect. And so I set about writing a simple scraper.

Login

First things first: it seemed I needed to be logged in and have a valid session cookie to be able to make requests to the API (fair enough). So, I opened the login page and checked out the form. It’s simple enough: login and password. It also has a couple more hidden inputs, but after a quick test it turned out they were not required, and I was logged in even when they were missing.

Very simple login using PHP and curl:

const USERNAME = 'xxx';
const PASSWORD = 'abc';

$curl = curl_init('https://www.fitbit.com/login');
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_COOKIEJAR, "/dev/null"); //make sure we save the cookies!
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_TIMEOUT, 1000);
curl_setopt($curl, CURLOPT_USERAGENT, '');
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, true); //keep certificate checks on, we are sending a password
curl_setopt($curl, CURLOPT_NOSIGNAL, 1);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt(
    $curl,
    CURLOPT_POSTFIELDS,
    array(
        'email'    => USERNAME,
        'password' => PASSWORD,
        'login'    => 'Log In',

    )
);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_exec($curl);

$url = curl_getinfo($curl, CURLINFO_EFFECTIVE_URL);
if ($url == "https://www.fitbit.com/login") { //terribly simple check
    throw new \Exception("Could not login to fitbit!");
}

Notice the very simple check to ensure the login was successful: just checking whether we were redirected back to the login page seems to be enough. After a successful login we are redirected to www.fitbit.com; otherwise the login page is displayed again with an error.
Now that we have the correct session cookies stored in memory (curl_setopt($curl, CURLOPT_COOKIEJAR, "/dev/null"); is an old “trick”: setting a cookie jar switches on curl’s cookie engine, so cookies are kept in memory for the lifetime of the handle, and anything written out on close goes to /dev/null), we can try getting our data.
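
The “terribly simple check” above compares the whole URL, so it would miss a redirect back to the login page that carries a query string or a trailing slash. Below is a minimal sketch of a slightly sturdier check that compares only the path. The function name loginSucceeded() is made up for this example, and whether Fitbit ever appends anything to the login URL is an assumption, not something I have observed.

```php
<?php
// Hypothetical helper, not part of the original scraper.
// Takes the URL curl ended up on after following redirects and decides
// whether we are still stuck on the login page.
function loginSucceeded(string $effectiveUrl): bool
{
    if ($effectiveUrl === '') {
        return false; // curl gave us no URL at all
    }

    // parse_url() returns false for a seriously malformed URL
    // and null when the URL has no path component.
    $path = parse_url($effectiveUrl, PHP_URL_PATH);
    if ($path === false) {
        return false;
    }

    // Treat "/login" and "/login/" the same; the query string is ignored
    // because only the path component is compared.
    return rtrim((string) $path, '/') !== '/login';
}
```

It would be used in place of the string comparison, for example: if (!loginSucceeded(curl_getinfo($curl, CURLINFO_EFFECTIVE_URL))) { throw new \Exception("Could not login to fitbit!"); }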

I was primarily interested in two pieces of information: my steps each day and my HR data each day, in as much detail as possible.
The HR data is visible on the dashboard (the main page after login), in a graph like the one below:
[Image: Fitbit HR graph]

The steps were less obvious, as the dashboard shows totals for 15-minute periods, not 5-minute ones. The detailed data can be seen inside the “Log” (www.fitbit.com/activities):
[Image: Fitbit steps graph]

Steps

After checking out the XHR calls made on the /activities page, I ended up with this code (continuing from the code above):

const USERID='xyz';
const DATE='2016-10-26';

$params = array(
    'userId'      => USERID,
    'type'        => 'intradaySteps',
    'dateFrom'    => DATE,
    'dateTo'      => DATE,
    'dataVersion' => 1450,
    'ts'          => time().'000',
    'apiFormat'   => 'json',
);
$url = 'https://www.fitbit.com/graph/getNewGraphData?'.http_build_query($params);
curl_setopt($curl, CURLOPT_URL, $url);

$output = curl_exec($curl);

$data = json_decode($output, true);
var_dump($data['graph']['dataSets']['activity']['dataPoints']);

Example partial output:

[160]=>
  array(2) {
    ["dateTime"]=>
    string(19) "2016-10-25 13:20:00"
    ["value"]=>
    float(8)
  }
  [161]=>
  array(2) {
    ["dateTime"]=>
    string(19) "2016-10-25 13:25:00"
    ["value"]=>
    float(120)
  }
  [162]=>
  array(2) {
    ["dateTime"]=>
    string(19) "2016-10-25 13:30:00"
    ["value"]=>
    float(0)
  }
  [163]=>
  array(2) {
    ["dateTime"]=>
    string(19) "2016-10-25 13:35:00"
    ["value"]=>
    float(25)
  }

Steps as numbers for each 5-minute period. Perfect.

I’m not sure if there’s a “nice” way to get your user ID for this call. I got mine by looking at the original AJAX call.
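
Once the 5-minute data points are decoded, they are easy to roll up into whatever granularity a chart needs. Here is a minimal sketch that sums them per hour. It assumes every data point has the "dateTime" and "value" keys shown in the example output; sumStepsByHour() is a name invented for this example.

```php
<?php
// Hypothetical helper: sums 5-minute step counts into hourly totals.
// Assumes each point looks like ['dateTime' => 'Y-m-d H:i:s', 'value' => float].
function sumStepsByHour(array $dataPoints): array
{
    $totals = [];
    foreach ($dataPoints as $point) {
        // "2016-10-25 13:20:00" -> "2016-10-25 13"
        $hour = substr($point['dateTime'], 0, 13);
        if (!isset($totals[$hour])) {
            $totals[$hour] = 0;
        }
        $totals[$hour] += (int) $point['value'];
    }

    return $totals;
}

// The four data points from the example output above.
$sample = [
    ['dateTime' => '2016-10-25 13:20:00', 'value' => 8.0],
    ['dateTime' => '2016-10-25 13:25:00', 'value' => 120.0],
    ['dateTime' => '2016-10-25 13:30:00', 'value' => 0.0],
    ['dateTime' => '2016-10-25 13:35:00', 'value' => 25.0],
];

$hourly = sumStepsByHour($sample);
print_r($hourly); // one entry: "2016-10-25 13" => 153
```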

Heart rate

The code:

curl_setopt($curl, CURLOPT_URL, "https://www.fitbit.com/ajaxapi");
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt(
    $curl,
    CURLOPT_POSTFIELDS,
    array(
        'request' => json_encode(
            [
                'template'     => "/ajaxTemplate.jsp",
                'serviceCalls' => [
                    [
                        'name'   => 'activityTileData',
                        'args'   => [
                            'date'      => DATE,
                            'dataTypes' => 'heart-rate',
                        ],
                        'method' => 'getIntradayData',
                    ],
                ],
            ]
        )
    )
);

$output = curl_exec($curl);

$data = json_decode($output, true);
if ($data) {
    $data = array_shift($data);
}

var_dump($data['dataSets']['activity']['dataPoints']);

The above code had been working perfectly fine since March 2014 (that’s when I started gathering the data). It stopped a couple of days ago. It turns out the page had always been riddled with CSRF tokens, which were sometimes sent with the AJAX calls and sometimes not, but they were never checked, and someone on Fitbit’s development team finally realised it (I hope the fix was intentional and not just a happy accident). So when this call stopped working, I suspected the token might be the culprit, and it was. Luckily (for me), the CSRF token appears in the page source in multiple places, so I chose one of them to extract it from. Adding a couple of lines makes it work again:

curl_setopt($curl, CURLOPT_URL, 'https://www.fitbit.com/');
$output = curl_exec($curl);

$tokenPosition = strpos($output, 'window.fitbitCsrfToken');
$tokenString = substr($output, $tokenPosition + 26, 36); //will break if the token's length changes

curl_setopt($curl, CURLOPT_URL, "https://www.fitbit.com/ajaxapi");
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt(
    $curl,
    CURLOPT_POSTFIELDS,
    array(
        'request' => json_encode(
            [
                'template'     => "/ajaxTemplate.jsp",
                'serviceCalls' => [
                    [
                        'name'   => 'activityTileData',
                        'args'   => [
                            'date'      => DATE,
                            'dataTypes' => 'heart-rate',
                        ],
                        'method' => 'getIntradayData',
                    ],
                ],
            ]
        ),
        'csrfToken' => $tokenString,
    )
);

$output = curl_exec($curl);

$data = json_decode($output, true);
if ($data) {
    $data = array_shift($data);
}

var_dump($data['dataSets']['activity']['dataPoints']);

Example output:

[258]=>
  array(6) {
    ["bpm"]=>
    int(68)
    ["confidence"]=>
    int(2)
    ["caloriesBurned"]=>
    float(6.82652)
    ["defaultZone"]=>
    NULL
    ["customZone"]=>
    NULL
    ["dateTime"]=>
    string(19) "2016-10-25 21:35:00"
  }
  [259]=>
  array(6) {
    ["bpm"]=>
    int(66)
    ["confidence"]=>
    int(2)
    ["caloriesBurned"]=>
    float(5.52145)
    ["defaultZone"]=>
    NULL
    ["customZone"]=>
    NULL
    ["dateTime"]=>
    string(19) "2016-10-25 21:40:00"
  }
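
The fixed-offset substr() extraction above breaks as soon as the token’s length or the spacing around it changes, and it silently returns garbage when strpos() finds nothing. A regular expression is a little more forgiving. This is only a sketch: it assumes the page contains an assignment of the form window.fitbitCsrfToken = "…" (which is what the +26 offset implies), and extractCsrfToken() is a name invented for this example.

```php
<?php
// Hypothetical helper: pulls the CSRF token out of the page source.
// Assumes the page contains something like:
//   window.fitbitCsrfToken = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";
// Returns null when no such assignment is found.
function extractCsrfToken(string $html): ?string
{
    // Allow any whitespace around "=" and either quote style,
    // then capture everything up to the closing quote.
    $pattern = '/window\.fitbitCsrfToken\s*=\s*["\']([^"\']+)["\']/';

    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[1];
    }

    return null;
}
```

With this in place, the scraper can stop with a clear error when the token is missing (for example, because the session has expired and the login page came back) instead of posting a meaningless string.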

CSRF tokens are important, Fitbit!

Now, as you have probably noticed, the login page and the calls on the activities page (there are more calls made on that page, not only the one for steps) still do not use the CSRF token! Not nice, Fitbit.

Side note: Fitbit API

Since I wrote this code in 2014, Fitbit has changed its API availability, so some of this data may now be available through the official API: dev.fitbit.com. The point about CSRF tokens is still valid at the time of writing.
