What does the warning mean?
Have you noticed a warning icon next to a Baresquare ticket title that looks like the one below?
This warning comes from the Google Analytics (GA) API when Baresquare pulls the data. It means one of two things: either the data is not yet final, so extracting the same data again could give slightly different results from what you see in the GA UI, or the data is sampled, i.e., GA reports on only a portion of the traffic, for example because of the volume of a website's traffic.
Baresquare's data extraction process is optimized both for speed and data accuracy. Even when data discrepancies are reported, they are usually too small to be noticed.
Why does this happen?
When Baresquare connects to your Google Analytics (GA) account and performs a request to extract data, the GA API response may contain the following keys:
'isDataGolden': indicates whether the data is complete, i.e., whether the same request would return the same results if made at a later point in time.
'samplesReadCounts' / 'samplingSpaceSizes': show whether the results are sampled.*
When data is not golden (i.e., the 'isDataGolden' key is not present in the GA API response) or when data is sampled (i.e., both the 'samplesReadCounts' and the 'samplingSpaceSizes' keys are present in the GA API response), the warning icon is shown next to the title of the ticket that refers to this data.
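The rule above can be sketched in a few lines of Python. This is only an illustration, not Baresquare's actual implementation: the function name needs_warning is hypothetical, and the sketch assumes each report in the GA API response is a dictionary with the three keys nested under "data". It also treats an explicit false value for 'isDataGolden' the same way as a missing key, which is an assumption.

```python
def needs_warning(report: dict) -> bool:
    """Return True when a report should be flagged with the warning icon."""
    # Assumption: the keys of interest sit under the report's "data" object.
    data = report.get("data", {})
    # Not golden: the key is absent (or, by assumption, present but false).
    is_golden = bool(data.get("isDataGolden", False))
    # Sampled: both sampling keys are present in the response.
    is_sampled = "samplesReadCounts" in data and "samplingSpaceSizes" in data
    return (not is_golden) or is_sampled


# Illustrative responses with made-up numbers.
complete = {"data": {"isDataGolden": True}}
not_golden = {"data": {}}
sampled = {
    "data": {
        "isDataGolden": True,
        "samplesReadCounts": ["500000"],
        "samplingSpaceSizes": ["15000000"],
    }
}

print(needs_warning(complete))    # False
print(needs_warning(not_golden))  # True
print(needs_warning(sampled))     # True
```

Either condition alone is enough to show the icon, which is why the two checks are combined with "or".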
Could this be addressed somehow?
By default, Baresquare sends requests with the 'samplingLevel' field set to LARGE to avoid sampled data as much as possible**. In addition, the extraction time is set automatically to balance data completeness against the freshness of reported anomalies: the extraction request takes place at least 4 hours after the day closes, so that GA has time to process the full day's data (learn more about GA data freshness here).
However, these measures cannot guarantee that the data in GA API responses will always be complete or unsampled. Upon request, we can delay the data extraction/processing further to improve data completeness. Feel free to contact us if you would like us to do so!
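To make the 4-hour rule concrete, here is a minimal sketch of how the earliest extraction time for a given day could be computed. The function name earliest_extraction and its parameters are hypothetical, and the sketch assumes that the day closes at midnight in the time zone passed in; it does not describe how Baresquare actually schedules extractions.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo


def earliest_extraction(report_day: date,
                        tz: tzinfo = timezone.utc,
                        delay_hours: int = 4) -> datetime:
    """Earliest time at which data for report_day would be extracted."""
    # Assumption: the day closes at midnight at the start of the next day.
    day_close = datetime.combine(report_day + timedelta(days=1), time.min, tzinfo=tz)
    # Wait at least delay_hours after the close before extracting.
    return day_close + timedelta(hours=delay_hours)


print(earliest_extraction(date(2024, 3, 1)))  # 2024-03-02 04:00:00+00:00
```

A longer delay, as offered above, would correspond to a larger delay_hours value: data arrives later but is more likely to be complete.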
* Sampling in data analysis is the practice of analyzing a subset of all data in order to infer meaningful information about the larger data set. Learn more about sampling in Google Analytics here.
** If the 'samplingLevel' field is unspecified in a request, the DEFAULT sampling level is used. The available options are:
DEFAULT: returns a response with a sample size that balances speed and accuracy.
SMALL: returns a fast response with a smaller sample size.
LARGE: returns a more accurate response using a larger sample size, which may result in slower responses.
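As an illustration of where the 'samplingLevel' field sits in a request, the sketch below builds a request body in the shape used by the GA Reporting API v4 batchGet method. The helper build_report_request is hypothetical, the view ID is a placeholder, and the single ga:sessions metric is only an example; the requests Baresquare actually sends may look different.

```python
VALID_SAMPLING_LEVELS = {"DEFAULT", "SMALL", "LARGE"}


def build_report_request(view_id: str, start_date: str, end_date: str,
                         sampling_level: str = "LARGE") -> dict:
    """Build a report request body with an explicit sampling level."""
    if sampling_level not in VALID_SAMPLING_LEVELS:
        raise ValueError(f"Unknown sampling level: {sampling_level}")
    return {
        "reportRequests": [
            {
                "viewId": view_id,  # placeholder, not a real view
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                "metrics": [{"expression": "ga:sessions"}],  # example metric
                "samplingLevel": sampling_level,
            }
        ]
    }


body = build_report_request("123456789", "2024-03-01", "2024-03-01")
print(body["reportRequests"][0]["samplingLevel"])  # LARGE
```

Leaving out 'samplingLevel' altogether would make GA fall back to DEFAULT, as described in the footnote above.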