Details for app-related-stress-rescuetime-apple-health.ipynb

Published by gedankenstuecke

Description

Does using different applications/programs have an influence on my stress levels? I was wondering whether I could combine my computer usage data from RescueTime (which records which programs and apps I use and when I use them) with the heart rate recordings from my Apple Watch to find out, thinking that a higher heart rate while using an app might be a sign of more stress.

Tags & Data Sources

Tags: stress, heart rate, productivity
Data sources: RescueTime connection, OH Data Port for Apple Health

Notebook
Last updated 3 years, 2 months ago

Different app, different level of stress?

If you want to run this notebook and run into problems or have questions: Reach out to Bastian on Twitter or Slack

I was wondering whether I could combine my computer usage data from RescueTime (which records which programs and apps I use and when I use them) with the heart rate recordings my Apple Watch makes at semi-regular intervals. The idea is that a higher heart rate while using an app might be a sign of more stress.

Combining the data isn't fully trivial for a number of reasons:

  1. RescueTime bins the data in different intervals (it seems to have used 1-hour bins in the past, now it's 5-minute bins).
  2. The Apple Watch doesn't do continuous recordings but takes heart rate snapshots at semi-regular intervals, so the data can be spotty, especially as the watch needs little to no movement to take a record.
  3. Of course there's the fun bit of time zones (the good ol' data science nemesis).

Luckily, RescueTime saves data using just the local system time, and Apple HealthKit saves in local time too (plus the UTC offset), which means those values are easy to align (point 3: check!). Point 2 isn't too bad either, as each HR recording falls within one of the 5-minute windows that RescueTime reports, so we can fit those together nicely too.

Point 1 is a bit trickier: for each 5-minute bin RescueTime reports how many seconds you spent on a given app/program within that window. So there are some judgement calls to be made about how many seconds inside a given window are enough to justify assigning that HR record to an app. For now I decided to go with "at least 101 seconds", which comes out as just above 1/3rd of the 300 seconds available in that interval. This means that at most 2 different applications can be assigned the same heart rate record, as three apps with 101 seconds each would already need 303 seconds. But I'd be happy to hear arguments or ideas for different ways of addressing this!
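
The "at least 101 seconds" rule can be sketched in pandas like this. It is only a minimal illustration on made-up rows, not the code this notebook runs; the column names follow the RescueTime table used further down, and `MIN_SECONDS` is a name introduced just for this example.

```python
import pandas as pd

MIN_SECONDS = 101  # hypothetical name; just above one third of a 300-second bin

# Made-up usage rows for a single 5-minute bin (illustrative values only).
usage = pd.DataFrame({
    "time": pd.to_datetime(["2019-07-09 12:30:00"] * 3),
    "activity": ["editor", "chat", "browser"],
    "time_spent_seconds": [150, 110, 40],
})

# Keep only the apps that were used long enough within the bin.
eligible = usage[usage["time_spent_seconds"] >= MIN_SECONDS]

# Count how many apps pass per bin; it can never exceed two,
# because 3 * 101 = 303 seconds would not fit into 300.
apps_per_bin = eligible.groupby("time")["activity"].count()

print(eligible["activity"].tolist())
```

Raising or lowering `MIN_SECONDS` is the knob to play with here: a threshold above 150 seconds would guarantee at most one app per heart rate record.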

This notebook makes use of data from two sources. If you want to run this analysis for your own data you need both of them connected to your Open Humans account: the RescueTime connection and the OH Data Port for Apple Health.

We get started by importing the raw data for RescueTime & Apple Health into our notebook:

Step 1: Parsing RescueTime data

The code below takes the raw data from RescueTime and converts it into a dataframe/table.

For my own data RescueTime only provided a one-hour resolution before the 9th of July 2019. Since then the data has a 5-minute resolution. As the hour-long values are hard to interpret/align with heart rate recordings in a meaningful way, we limit our analysis to data points recorded during the 5-minute interval range. (You can adjust this by editing the STARTING_DATE variable below.)
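
As a rough sketch of this parsing step, the rows can also be turned into a table in one go. The sample rows below are made up, and the column layout (date, seconds, an unused third column, activity, category, productivity) is an assumption based on the positions this notebook reads from each row.

```python
import pandas as pd

STARTING_DATE = "2019-07-09"

# Made-up rows in the assumed layout:
# [date, seconds, (unused), activity, category, productivity]
sample = {"rows": [
    ["2019-07-08T11:00:00", 1200, 1, "mail", "Email", 0],
    ["2019-07-09T12:25:00", 26, 1, "Google Documents", "Writing", 2],
    ["2019-07-09T12:25:00", 15, 1, "telegram", "Instant Message", -1],
]}

columns = ["date", "time_spent_seconds", "unused",
           "activity", "category", "productivity"]
rt = pd.DataFrame(sample["rows"], columns=columns).drop(columns="unused")
rt["date"] = pd.to_datetime(rt["date"], format="%Y-%m-%dT%H:%M:%S")

# Keep only rows from the 5-minute-resolution period; .copy() makes the
# result an independent table, so adding a column is safe.
recent = rt[rt["date"] > pd.Timestamp(STARTING_DATE)].copy()
recent["time"] = recent["date"]

print(recent)
```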

Step 2: Reading the Heart Rate data

Now we import the heart rate data from Apple Health. We also remove older heart rate recordings from before the STARTING_DATE, just to keep the size of the data more manageable. We also drop the timezone information from the HR recordings as the local time is what we're after.
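
To illustrate what dropping the timezone information means here, the sketch below takes a timestamp with a UTC offset and keeps only its local wall-clock part. The sample string is made up and its format is an assumption; the notebook strips the offset in its own way.

```python
from datetime import datetime, timezone

# Made-up timestamp in local time with a UTC offset (format is an assumption).
raw = "2019-07-09T12:27:41+02:00"

aware = datetime.fromisoformat(raw)

# Dropping tzinfo keeps the local wall-clock time, which is what
# RescueTime's local timestamps can be compared against.
local_naive = aware.replace(tzinfo=None)

# Converting to UTC first would shift the clock time instead.
utc_naive = aware.astimezone(timezone.utc).replace(tzinfo=None)

print(local_naive)
print(utc_naive)
```

The two results differ by exactly the offset, which is why the local variant is the one that lines up with the RescueTime bins.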

Step 3: Merging the data

Now for the important step: merging the two tables for the RescueTime usage and the heart rate recordings. We do this by matching each RescueTime timestamp with the most recent heart rate recording taken within the 3 minutes before it.
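
How such a time-window match behaves is easiest to see on a tiny example. The sketch below uses `pd.merge_asof` with a 3-minute tolerance on made-up data; the values and the short column set are purely illustrative.

```python
import pandas as pd

# Made-up 5-minute usage bins (left table) and heart rate readings (right table).
usage = pd.DataFrame({
    "time": pd.to_datetime(["2019-07-09 12:25:00",
                            "2019-07-09 12:30:00",
                            "2019-07-09 12:35:00"]),
    "activity": ["docs", "chat", "browser"],
})
heart = pd.DataFrame({
    "time": pd.to_datetime(["2019-07-09 12:20:30",
                            "2019-07-09 12:28:10",
                            "2019-07-09 12:35:00"]),
    "heart_rate": [70.0, 82.0, 90.0],
})

# Both tables must be sorted by the key. With the default backward search,
# each usage row gets the latest earlier reading, but only if that reading
# is at most 3 minutes old; readings at the exact same timestamp are skipped.
merged = pd.merge_asof(usage, heart, on="time",
                       tolerance=pd.Timedelta("3min"),
                       allow_exact_matches=False)

# Rows without a reading inside the window end up with NaN and are dropped.
matched = merged[merged["heart_rate"].notna()]

print(matched)
```

Only the 12:30 bin finds a reading inside its window here: the 12:25 bin's nearest earlier reading is more than 3 minutes old, and the 12:35 bin's only candidate sits at exactly the same timestamp.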

Now we have our merged dataframe/table. We removed all records for which we didn't have heart rate information, making the data set even more manageable. Below are two records showing what this new joint table looks like: we have the time spent in a given activity, the activity itself, its category and how productive that application is (scored from -2 to +2). And last but not least we have the heart rate:

Out[11]:
   date                 time_spent_seconds  activity           category            productivity  time                 heart_rate  hr_normalized
4  2019-07-09 12:30:00  3                   Blank Web Browser  Internet Utilities  1             2019-07-09 12:30:00  131.0       57.0
5  2019-07-09 12:30:00  6                   telegram           Instant Message     -1            2019-07-09 12:30:00  131.0       57.0

Step 4: Plotting the data

We're using R and ggplot2 (with the ggridges extension) to make some nice visualizations of our data.

First, we load the R environment and install/load the packages we need:

As there are tons of activities in this time frame, too many to plot them all, we only highlight those activities that were used for more than 30,000 seconds in total (a bit over 8 hours of usage), which should give us a manageably sized list of activities. Similarly, for the categories we only use those with at least 40,000 seconds of usage in total (over 11 hours).
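
A minimal pandas sketch of such a cut-off, on made-up numbers, could look like the following. The name `MIN_ACTIVITY_SECONDS` is introduced only for this example, and the notebook's own filtering may be implemented differently.

```python
import pandas as pd

MIN_ACTIVITY_SECONDS = 30_000  # hypothetical name; a bit over 8 hours

# Made-up merged rows (illustrative values only).
merged = pd.DataFrame({
    "activity": ["editor", "editor", "chat", "browser"],
    "time_spent_seconds": [20_000, 15_000, 12_000, 31_000],
    "heart_rate": [72.0, 75.0, 80.0, 68.0],
})

# Sum the seconds per activity, then keep the activities above the cut-off.
totals = merged.groupby("activity")["time_spent_seconds"].sum()
frequent = totals[totals > MIN_ACTIVITY_SECONDS].index

# Only rows belonging to frequently used activities go into the plot.
plot_df = merged[merged["activity"].isin(frequent)]

print(sorted(frequent))
```

The same pattern works for categories by grouping on `category` and using the 40,000-second threshold.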

🎉

And here's our joint plot of the heart rate in relation to different applications or application types!

Normalize by daily resting HR

Our resting HR can vary over time, and in my case I know it has been dropping quite a bit over the last year and a half. So let's see if we can normalize our data by looking only at the "excess" HR: the actual HR record minus the resting HR for that day.
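
The "excess" HR idea can be sketched as follows, assuming a dictionary that maps each day to a resting heart rate. All values below are made up; days without a resting value simply get no normalized reading.

```python
import pandas as pd

# Made-up resting heart rate per day and a few heart rate readings.
resting_hrs = {"2019-07-09": 60.0, "2019-07-10": 58.0}
hr = pd.DataFrame({
    "time": pd.to_datetime(["2019-07-09 12:28:10",
                            "2019-07-10 09:15:00",
                            "2019-07-11 08:00:00"]),
    "heart_rate": [82.0, 70.0, 75.0],
})

# Look up each reading's day; days missing from the dictionary map to NaN,
# so the subtraction yields NaN for them as well.
day = hr["time"].dt.strftime("%Y-%m-%d")
hr["hr_normalized"] = hr["heart_rate"] - day.map(resting_hrs)

print(hr)
```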

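In [1]:
from ohapi import api
import os
import requests
import json
import pandas as pd
from datetime import datetime
import arrow

# Look up the member's files on Open Humans, using the access token from the environment.
member = api.exchange_oauth2_member(os.environ.get('OH_ACCESS_TOKEN'))
for f in member['data']:
    # Files from this source are read as the RescueTime export (JSON).
    if f['source'] == "direct-sharing-149":
        rescuetime_data = json.loads(requests.get(f['download_url']).content)
    # Files from this source are read as the Apple Health heart rate records (CSV without header).
    if f['source'] == 'direct-sharing-453':
        hr_df = pd.read_csv(f['download_url'],names=['heart_rate', 'time', 'type'])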

Step 1: Parsing RescueTime data

In [2]:
STARTING_DATE = '2019-07-09'
In [3]:
date = []
time_spent_seconds = []
activity = []
category = []
productivity = []
# Each row is a list of values; position 2 is not used in this analysis.
for element in rescuetime_data['rows']:
    date.append(element[0])
    time_spent_seconds.append(element[1])
    activity.append(element[3])
    category.append(element[4])
    productivity.append(element[5])
# Parse the timestamp strings into (timezone-naive) datetime objects.
date = [datetime.strptime(dt,"%Y-%m-%dT%H:%M:%S") for dt in date]

rt_df = pd.DataFrame(data={
    'date': date,
    'time_spent_seconds': time_spent_seconds,
    'activity': activity,
    'category': category,
    'productivity': productivity
})

# .copy() makes the filtered table independent of rt_df, so adding the
# 'time' column below no longer triggers pandas' SettingWithCopyWarning.
rt_df_filtered = rt_df[rt_df['date'] > datetime.fromisoformat(STARTING_DATE)].copy()
rt_df_filtered['time'] = rt_df_filtered['date']
rt_df_filtered = rt_df_filtered.sort_values(by='time')

Step 2: Reading the Heart Rate data

In [4]:
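# Cut off the last 6 characters of each timestamp (the UTC offset), so that
# the local clock time is what gets parsed.
hr_df['time'] = pd.to_datetime(hr_df['time'].str[:-6],utc=True)
In [5]:
# Rows of type 'R' are used as the resting heart rate, stored per day ('YYYY-MM-DD').
resting_hrs = {}
for index, row in hr_df[hr_df['type']=='R'].iterrows():
    day = str(row['time'])[:10]
    resting_hrs[day] = row['heart_rate']
In [6]:
# Keep the rows of type 'H' (the regular heart rate records), sort them by
# time and drop the timezone marker so they compare directly with RescueTime.
hr_df = hr_df[hr_df['type']=='H']
hr_df = hr_df.sort_values(by='time')
hr_df = hr_df.set_index(hr_df['time'])
hr_df = hr_df.tz_convert(None)
hr_df = hr_df['heart_rate']
hr_df = hr_df.reset_index()
hr_df = hr_df[hr_df['time'] > datetime.fromisoformat(STARTING_DATE)]
In [7]:
def baseline_hr(row):
    # Heart rate minus that day's resting heart rate; None if the day has no resting value.
    day = str(row['time'])[:10]
    if day in resting_hrs:
        return row['heart_rate'] - resting_hrs[day]
    else:
        return None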
In [8]:
rt_df_filtered.head()
Out[8]:
        date                 time_spent_seconds  activity           category                   productivity  time
134997  2019-07-09 12:25:00  26                  Google Documents   Writing                    2             2019-07-09 12:25:00
134998  2019-07-09 12:25:00  15                  telegram           Instant Message            -1            2019-07-09 12:25:00
134999  2019-07-09 12:25:00  10                  instagram.com      General Social Networking  -2            2019-07-09 12:25:00
135000  2019-07-09 12:25:00  1                   facebook.com       General Social Networking  0             2019-07-09 12:25:00
135013  2019-07-09 12:30:00  3                   Blank Web Browser  Internet Utilities         1             2019-07-09 12:30:00
In [9]:
hr_df['hr_normalized'] = hr_df.apply(baseline_hr, axis=1)

Step 3: Merging the data

In [10]:
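# For every RescueTime row, take the latest heart rate record from strictly
# before its timestamp, as long as it is no more than 3 minutes older.
merged_df = pd.merge_asof(rt_df_filtered,hr_df, on='time', tolerance=pd.Timedelta('3min'), allow_exact_matches=False)
# Drop the rows for which no heart rate record fell inside that window.
merged_df = merged_df[merged_df['heart_rate'].notna()]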

In [11]:
merged_df.head(2)
Out[11]:
   date                 time_spent_seconds  activity           category            productivity  time                 heart_rate  hr_normalized
4  2019-07-09 12:30:00  3                   Blank Web Browser  Internet Utilities  1             2019-07-09 12:30:00  131.0       57.0
5  2019-07-09 12:30:00  6                   telegram           Instant Message     -1            2019-07-09 12:30:00  131.0       57.0

Step 4: Plotting the data

In [12]:
%load_ext rpy2.ipython
In [13]:
%%R -i merged_df -w 10 -h 10 --units in 
library(ggplot2)
install.packages('ggridges', repos='http://cran.us.r-project.org')
install.packages('cowplot', repos='http://cran.us.r-project.org')
library(ggridges)
library(cowplot)
library(tidyverse)
library(lubridate)
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: trying URL 'http://cran.us.r-project.org/src/contrib/ggridges_0.5.3.tar.gz'

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Content type 'application/x-gzip'
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:  length 2218289 bytes (2.1 MB)

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: downloaded 2.1 MB


WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: The downloaded source packages are in
	‘/tmp/Rtmp4llS00/downloaded_packages’
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Updating HTML index of packages in '.Library'

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Making 'packages.html' ...
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:  done

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: trying URL 'http://cran.us.r-project.org/src/contrib/cowplot_1.1.1.tar.gz'

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Content type 'application/x-gzip'
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:  length 1353271 bytes (1.3 MB)

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: =
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: downloaded 1.3 MB


WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: The downloaded source packages are in
	‘/tmp/Rtmp4llS00/downloaded_packages’
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Updating HTML index of packages in '.Library'

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Making 'packages.html' ...
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]:  done

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: ✔ tibble  3.0.1     ✔ dplyr   0.8.5
✔ tidyr   1.1.0     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.5.0
✔ purrr   0.3.4     

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: 
Attaching package: ‘lubridate’


WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: The following objects are masked from ‘package:dplyr’:

    intersect, setdiff, union


WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: The following object is masked from ‘package:cowplot’:

    stamp


WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: The following objects are masked from ‘package:base’:

    date, intersect, setdiff, union


In [14]:
%%R
merged_df$hour <- hour(merged_df$time)
merged_df$minute <- minute(merged_df$time)
merged_df$time <- merged_df$hour + merged_df$minute/60

There are far too many activities in this time frame to plot them all, so we only highlight activities that were used for more than 30,000 seconds in total (a bit over 8 hours of usage), which should give us a manageably sized list. Similarly, for the categories we only keep those with more than 40,000 seconds of total usage (a bit over 11 hours).
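The two cut-offs are easy to sanity-check. The sketch below is written in Python (the notebook's kernel language) rather than R and uses made-up rows: the `records` list and the helper `frequent_items` are illustrative assumptions, not part of the notebook's data or code. It sums the seconds per activity and keeps only the activities above a threshold, which is roughly what the `aggregate()` and `subset()` calls in the next cell do.

```python
from collections import defaultdict

def frequent_items(records, threshold_seconds):
    """Return the names whose summed usage is strictly above the threshold."""
    totals = defaultdict(int)
    for name, seconds in records:
        totals[name] += seconds
    return {name for name, total in totals.items() if total > threshold_seconds}

# Hypothetical (activity, seconds within a 5-minute bin) rows, for illustration only.
records = [("Keynote", 300)] * 120 + [("Mail", 250)] * 100

frequent = frequent_items(records, threshold_seconds=30000)
hours_activity_cutoff = 30000 / 3600   # about 8.33 hours
hours_category_cutoff = 40000 / 3600   # about 11.11 hours
print(sorted(frequent), round(hours_activity_cutoff, 2), round(hours_category_cutoff, 2))
```

In this toy data only the first activity passes: it totals 36,000 seconds, while the second stays at 25,000.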

In [15]:
%%R -w 15 -h 10 --units in 
aggregated_activities <- aggregate(merged_df$time_spent_seconds, by=list(merged_df$activity), FUN=sum)
frequent_activites <- subset(aggregated_activities, aggregated_activities$x > 30000)$Group.1

merged_df_sub <- subset(merged_df, merged_df$time_spent_seconds > 101 & merged_df$activity %in% frequent_activites & merged_df$activity != 'Blank Web Browser')
app_level <- ggplot(merged_df_sub,aes(y=fct_reorder(activity,heart_rate,.fun=mean), x=heart_rate, fill=fct_reorder(activity,heart_rate,.fun=mean))) +
    geom_density_ridges_gradient(rel_min_height = 0.03) + 
    scale_fill_viridis_d() +
    geom_vline(xintercept=mean(merged_df_sub$heart_rate),color='red') + 
    scale_y_discrete('application') + 
    scale_x_continuous('heart rate') + 
    theme_minimal(base_size = 30) + 
    theme(legend.position = "none")
In [16]:
%%R -w 15 -h 10 --units in 
aggregated_categories <- aggregate(merged_df$time_spent_seconds, by=list(merged_df$category), FUN=sum)
frequent_categories <- subset(aggregated_categories, aggregated_categories$x > 40000)$Group.1
merged_df_sub <- subset(merged_df, merged_df$time_spent_seconds > 151 & category %in% frequent_categories)
category_level <- ggplot(merged_df_sub,aes(y=fct_reorder(as.character(category),heart_rate,.fun=mean), x=heart_rate, fill=fct_reorder(as.character(category),heart_rate,.fun=mean))) +
    geom_density_ridges_gradient(rel_min_height = 0.05) + 
    scale_fill_viridis_d() +
    scale_y_discrete('application category') + 
    geom_vline(xintercept=mean(merged_df_sub$heart_rate),color='red') + 
    scale_x_continuous('heart rate') + 
    theme_minimal(base_size = 25) + 
    theme(legend.position = "none")
In [17]:
%%R -w 15 -h 10 --units in 
merged_df_sub <- subset(merged_df, merged_df$time_spent_seconds > 151)
productivity_level <- ggplot(merged_df_sub,aes(y=fct_reorder(as.character(productivity),heart_rate,.fun=mean), x=heart_rate, fill=fct_reorder(as.character(productivity),heart_rate,.fun=mean))) +
    geom_density_ridges_gradient(rel_min_height = 0.04) + 
    scale_fill_viridis_d() +
    scale_y_discrete('productivity level') + 
    scale_x_continuous('heart rate') + 
    theme_minimal(base_size = 30) + 
    theme(legend.position = "none") + 
    geom_vline(xintercept=mean(merged_df_sub$heart_rate),color='red')

🎉

And here's our joint plot of the heart rate in relation to different applications or application types!

In [18]:
%%R -w 25 -h 15 --units in 
left_column <- plot_grid(productivity_level, category_level, labels = c('A', 'B'),ncol=1,rel_heights = c(2, 3.5))

plot_grid(left_column, app_level, ncol=2,rel_widths=c(1,2),labels=c('','C'))
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 2.64

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 3.32

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 3.46

Normalize by daily resting HR

Our resting HR can vary over time, and in my case I know it has been dropping quite a bit over the last 1 1/2 years. So let's see if we can normalize our data by looking only at the "excess" HR: each actual HR record minus the resting HR for that day.
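Here is a minimal sketch of that normalization, again in Python with made-up numbers. The `hr_normalized` column used in the R cells below is computed earlier in the notebook; the `resting_hr_by_day` lookup and the helper `excess_heart_rate` are assumptions for illustration and may not match how that column is actually derived.

```python
from datetime import date

# Hypothetical daily resting heart rates (beats per minute), for illustration only.
resting_hr_by_day = {date(2019, 1, 10): 62, date(2020, 6, 1): 54}

# Hypothetical heart rate records: (day, measured heart rate).
hr_records = [
    (date(2019, 1, 10), 75),
    (date(2020, 6, 1), 75),
    (date(2020, 6, 1), 60),
]

def excess_heart_rate(records, resting_by_day):
    """Subtract the same day's resting HR from each record; skip days without one."""
    return [
        (day, hr - resting_by_day[day])
        for day, hr in records
        if day in resting_by_day
    ]

excess = excess_heart_rate(hr_records, resting_hr_by_day)
print([value for _, value in excess])  # [13, 21, 6]
```

The same raw reading of 75 bpm counts as a larger excess on the day with the lower resting HR, which is the effect the normalization is meant to capture.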

In [19]:
%%R -w 25 -h 15 --units in 
aggregated_activities <- aggregate(merged_df$time_spent_seconds, by=list(merged_df$activity), FUN=sum)
frequent_activites <- subset(aggregated_activities, aggregated_activities$x > 30000)$Group.1

merged_df_sub <- subset(merged_df, merged_df$time_spent_seconds > 101 & merged_df$activity %in% frequent_activites & merged_df$activity != 'Blank Web Browser')
app_level <- ggplot(merged_df_sub,aes(y=fct_reorder(activity,hr_normalized,.fun=mean), x=hr_normalized, fill=fct_reorder(activity,hr_normalized,.fun=mean))) +
    geom_density_ridges_gradient() + 
    scale_fill_viridis_d() +
    geom_vline(xintercept=mean(merged_df_sub$hr_normalized),color='red') + 
    scale_y_discrete('application') + 
    scale_x_continuous('excess heart rate (HR - resting HR)') + 
    theme_minimal(base_size = 30) + 
    theme(legend.position = "none")

aggregated_categories <- aggregate(merged_df$time_spent_seconds, by=list(merged_df$category), FUN=sum)
frequent_categories <- subset(aggregated_categories, aggregated_categories$x > 40000)$Group.1
merged_df_sub <- subset(merged_df, merged_df$time_spent_seconds > 151 & category %in% frequent_categories)
category_level <- ggplot(merged_df_sub,aes(y=fct_reorder(as.character(category),hr_normalized,.fun=mean), x=hr_normalized, fill=fct_reorder(as.character(category),hr_normalized,.fun=mean))) +
    geom_density_ridges_gradient() + 
    scale_fill_viridis_d() +
    scale_y_discrete('application category') + 
    geom_vline(xintercept=mean(merged_df_sub$hr_normalized),color='red') + 
    scale_x_continuous('excess heart rate (HR - resting HR)') + 
    theme_minimal(base_size = 25) + 
    theme(legend.position = "none")

merged_df_sub <- subset(merged_df, merged_df$time_spent_seconds > 151)
productivity_level <- ggplot(merged_df_sub,aes(y=fct_reorder(as.character(productivity),hr_normalized,.fun=mean), x=hr_normalized, fill=fct_reorder(as.character(productivity),hr_normalized,.fun=mean))) +
    geom_density_ridges_gradient(rel_min_height = 0.04) + 
    scale_fill_viridis_d() +
    scale_y_discrete('productivity level') + 
    scale_x_continuous('excess heart rate (HR - resting HR)') + 
    theme_minimal(base_size = 30) + 
    theme(legend.position = "none") + 
    geom_vline(xintercept=mean(merged_df_sub$hr_normalized),color='red')

left_column <- plot_grid(productivity_level, category_level, labels = c('A', 'B'),ncol=1,rel_heights = c(2, 3.5))

plot_grid(left_column, app_level, ncol=2,rel_widths=c(1,2),labels=c('','C'))
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 2.63

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 3.33

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 3.43

In [20]:
%%R
merged_df$hr_category <- case_when(
merged_df$hr_normalized > mean(merged_df_sub$hr_normalized) + sd(merged_df_sub$hr_normalized) ~ 'high',
merged_df$hr_normalized < mean(merged_df_sub$hr_normalized) - sd(merged_df_sub$hr_normalized) ~ 'low',
merged_df$hr_normalized > mean(merged_df_sub$hr_normalized) - sd(merged_df_sub$hr_normalized) & merged_df$hr_normalized < mean(merged_df_sub$hr_normalized) + sd(merged_df_sub$hr_normalized)  ~ 'average',
)

merged_df$hr_category <- factor(merged_df$hr_category, levels = c("low", "average", "high"))
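The binning in the cell above can be sketched in a few lines of Python. The numbers are made up and the helper `hr_category` is a hypothetical name used only for this illustration; the rule is the same one-standard-deviation band around the mean.

```python
from statistics import mean, stdev

def hr_category(value, center, spread):
    """Label a value relative to the band center plus or minus one spread."""
    if value > center + spread:
        return "high"
    if value < center - spread:
        return "low"
    return "average"

# Hypothetical excess heart rates, for illustration only.
excess_hr = [2, 4, 4, 4, 5, 5, 7, 9]
center = mean(excess_hr)    # 5
spread = stdev(excess_hr)   # sample SD (n - 1), the same convention as R's sd()
labels = [hr_category(v, center, spread) for v in excess_hr]
print(labels)
```

With these values the band runs from roughly 2.9 to 7.1, so only the smallest value is labelled low and only the largest is labelled high.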
In [21]:
%%R -w 10 -h 15 --units in 
merged_df_sub <- subset(merged_df, merged_df$time_spent_seconds > 101 & merged_df$activity %in% frequent_activites & merged_df$activity %in% c("Keynote", "netflix.com", "notebooks.openhumans.org", "drive.google.com"))
ggplot(merged_df_sub,aes(y=hr_category, x=as.Date(date), fill=hr_category)) +
    geom_density_ridges_gradient() + 
    scale_fill_viridis_d() +
    scale_y_discrete('heart rate category') + 
    scale_x_date('date') + 
    theme_minimal(base_size = 30) + 
    facet_grid(activity ~ .) +
    theme(legend.position = "none")
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 59.8

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 55

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 17.8

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 45.5

In [22]:
%%R -w 10 -h 15 --units in 
merged_df_sub <- subset(merged_df, merged_df$time_spent_seconds > 101 & merged_df$activity %in% frequent_activites & merged_df$activity %in% c("Keynote", "netflix.com", "notebooks.openhumans.org", "drive.google.com"))
ggplot(merged_df_sub,aes(y=hr_category, x=time, fill=hr_category)) +
    geom_density_ridges_gradient() + 
    scale_fill_viridis_d() +
    scale_y_discrete('heart rate category') + 
    scale_x_continuous('hour',limits=c(0,24)) + 
    theme_minimal(base_size = 30) + 
    facet_grid(activity ~ .) +
    theme(legend.position = "none")
WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 1.08

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 1.33

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 1.82

WARNING:rpy2.rinterface_lib.callbacks:R[write to console]: Picking joint bandwidth of 1.34