Calculate DAU with the Raw Data via Python

There are many reasons why you may want to validate the raw data, especially the Daily Active Users (DAU) count. In this guide, we will go over the isSession flag, possible discrepancies, and how to properly calculate DAU.

isSession flag
In the raw data, there is a boolean isSession flag. Our backend uses this flag to determine whether to include the user in the DAU calculation. If the flag is set to false, the user is not included in the DAU count.

This flag can be set to false if:

  1. Events are fired out of order
  2. The user has gone offline
  3. You are sending us offline events via our REST API

We do not include these users in the DAU count because they may not be interacting with the app themselves.

However, we disregard the isSession flag when tracking events, because an event could come from an API call made while the user was outside of the app. In that case, the user would not have a session, but they still triggered an event.

To sum up: for DAU counts, we only include records where isSession is true. For events, and that means all events, we disregard the flag, since users do not need a full session to trigger an event.
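The difference between the two counts can be illustrated with a minimal sketch. The field names `userId` and `events`, and the shape of each record, are assumptions made for illustration; only the isSession flag is taken from this article, so adjust the names to match your actual raw data.

```python
# Hypothetical in-memory records for a single day; the real raw data layout may differ.
sessions = [
    {"userId": "a", "isSession": True, "events": ["open", "purchase"]},
    {"userId": "a", "isSession": True, "events": ["open"]},
    {"userId": "b", "isSession": False, "events": ["purchase"]},  # e.g. sent via the REST API
    {"userId": "c", "isSession": True, "events": []},
]


def count_dau(records):
    """Count distinct users that have at least one record with isSession set to true."""
    return len({r["userId"] for r in records if r.get("isSession") is True})


def count_events(records):
    """Count every event, regardless of the isSession flag."""
    return sum(len(r.get("events", [])) for r in records)


print(count_dau(sessions))     # 2 (users "a" and "c")
print(count_events(sessions))  # 4 (user "b" still contributes an event)
```

User "b" is left out of the DAU count but still contributes to the event total, which is one common source of apparent discrepancies.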

Data from Offline users
When you request data from us, we can only return data that is available at that time. Keep in mind that users on offline devices cannot send data to Leanplum until they reconnect, so there can be some trailing data. When they come back online, their data is sent to Leanplum in a batch at that moment. Depending on the user's internet connectivity and when they return to the app, their data can appear on the Leanplum server up to 7 days after the day of the event.
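If you recompute DAU on your side, one way to handle trailing data is to treat a day as settled only once the 7-day window has passed, and to recount the more recent days on each run. The helper names below are hypothetical and the approach is only a sketch, not part of any Leanplum tooling.

```python
from datetime import date, timedelta

MAX_LAG_DAYS = 7  # data can arrive up to 7 days after the day of the event


def is_day_settled(event_day, today, max_lag_days=MAX_LAG_DAYS):
    """Return True once late data for event_day is no longer expected."""
    return today - event_day > timedelta(days=max_lag_days)


def days_to_recount(today, max_lag_days=MAX_LAG_DAYS):
    """List the days (today first) whose DAU may still change."""
    return [today - timedelta(days=n) for n in range(max_lag_days + 1)]


print(is_day_settled(date(2024, 1, 1), date(2024, 1, 8)))  # False
print(is_day_settled(date(2024, 1, 1), date(2024, 1, 9)))  # True
```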

Note: if you use our automated export feature via S3 buckets, offline data is accounted for, because each export includes any new data received since the last export.

Python Script to get raw data
The following script counts DAU from the raw data. It is written assuming that you are also using the dataExport.py script, though you are free to change the input of the function.
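As a minimal sketch of such a count (not the official script), the code below reads exported raw data and returns the DAU per day. It assumes that the export is a text file with one JSON object per line, that each object has a `userId` field and a `time` field holding epoch seconds in UTC, and that isSession is a JSON boolean. These names and formats are assumptions, so adapt them to the files you actually receive.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone


def dau_by_day(lines):
    """Return {"YYYY-MM-DD": distinct user count}, counting only isSession == true."""
    users_by_day = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("isSession") is not True:
            continue  # not counted toward DAU
        # Assumption: "time" is epoch seconds; the day is bucketed in UTC.
        day = datetime.fromtimestamp(record["time"], tz=timezone.utc).date()
        users_by_day[day.isoformat()].add(record["userId"])
    return {day: len(users) for day, users in sorted(users_by_day.items())}


def dau_from_file(path):
    """Convenience wrapper for a file with one JSON record per line."""
    with open(path, encoding="utf-8") as f:
        return dau_by_day(f)
```

For example, `dau_from_file("export.json")` (a hypothetical file name) would return a dictionary mapping each day to its DAU. Bucketing by UTC is a design choice in this sketch; if your dashboard uses a different time zone, bucket the days accordingly before comparing numbers.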

 

