GA4 and BigQuery: towards more complete data with Fresh daily tables?

In our previous article, we explored the behavior of intraday and interday partitioned tables in GA4 and their evolution. Aujourd’hui, nous nous intéressons à une option disponible pour les comptes GA4 360 dans BigQuery : les tables “ Fresh daily“. These offer more frequent updates and reach a relevant level of completion more quickly, positioning themselves as an alternative to intraday tables for companies seeking the freshest possible insights.

In this article, we will compare the “Fresh daily” to intraday and interday tables in terms of availability, update frequency and completeness. availability, update frequency and completeness, to better understand which of these options best meets the requirements of “real-time” performance analysis.

Our analysis focuses mainly on metrics, as we know that Google does not ensure the completion of dimensions related to session attribution (source, medium, campaign…) for this “Fresh Daily” table.

Methodology

To assess the performance of the “Fresh daily”, intraday and interday tables in GA4, we conducted an analysis over a full week (from 09/29 to 10/03).

Each day, we extracted and compared data from 5 key events:

  • Sessions
  • Purchases (transactions)
  • Begin checkout
  • Revenue
  • Reach search (custom event)

We have also programmed a query to track the evolution of these events hour by hour over three additional days (from 18/10 to 20/10), allowing us to analyze differences in data completeness as they are updated in each table type.

This monitoring helps us to understand the speed and accuracy with which each table provides information. By observing this data over several days and at different times, we have been able to measure the completeness of each table type, the variations between them and their respective usefulness for different use cases.

Results and analysis

1- Data completeness

The analysis revealed significant differences in terms of completeness between the 3 types of GA4 tables. At D+1, “Fresh daily” tables achieve an average completeness rate of 95% to 99%, while intraday tables stagnate between 53% and 76%. On the other hand, interday tables, available a little later, guarantee 100% completeness, but take several days to be fully updated.

Looking at specific events, we see that completeness varies according to the type of data collected. For example:

  • Custom metrics and other standard metrics: The completeness rate of “Fresh daily” data is close to 100% on D+1, exceeding intraday data by almost 25%.

Completion level of standard and Custom metrics on Fresh daily and Intraday tables at D+1 (compared with interday, our base 100)

  • Revenue For this event, the gap is even wider. Fresh daily data is virtually complete on D+1, while intraday table data completes much more slowly, making the latter unreliable for early reports or operational decisions.

Revenue completion level on Fresh daily and Intraday tables at D+1 (compared with interday, our base 100)

Cette 1ère analyse nous laisse à penser qu’à J+1, on ne peut pas se fier à l’intraday contrairement à la fresh daily qui se rapproche des 100%.

Toutefois, il convient d’abord d’examiner l’évolution de la donnée à J0 (notamment à des fins de détection d’anomalie) ainsi que jusqu’à J+2 pour établir les “cycles de vie” de chaque type de table.

2 – Evolution of data on the 3 types of tables from D0 to D+2

By observing the evolution of data in the Fresh daily, intraday, and interday tables from D0 to D+2, we can note the following points:

  • Intraday tables are generated a few hours before fresh daily tables, but the difference in completeness between the two quickly widens in favour of fresh daily.
  • At D+1, the Fresh daily table systematically outperforms intraday in terms of completeness, especially for indicators such as earnings, where intraday shows a much slower progression.

Another essential point is the“latency gap“: after intraday tables have been deleted and before interday data are available, there may still be a period of incompleteness, during which only Fresh daily offers data close to reality. For example, the delay between the end of intraday and the publication of interday data can leave a gap of almost 1% in the Fresh daily for certain indicators, but this gap is even more variable for intraday.

Let’s take the Reach and Revenue events and analyze what we’ve learned from our hour-by-hour programmed queries.

Hour-by-hour completion, on standard and custom metrics for Fresh daily and Intraday tables (relative to interday, our base 100)

Hour-by-hour completion on the Revenue for Fresh daily and Intraday tables (relative to interday, our base 100)

Pour les Sessions, on observe que la fresh daily est à +99% dès le début, ce qui la rend plus fiable que l’intraday, sauf sur la première moitié de la journée (mais avec un écart < à 1%).

Pour les Purchases en revanche, la fresh daily est toujours plus fiable et plus complète, avec des écarts conséquents.

Quid de l’attribution des sessions source et medium ?

Niveau de complétion de l’attribution des traffic medium sur les tables Fresh daily et Intraday à J0 (par rapport à l’interday, notre base 100, à timestamp équivalent)

Ici encore le taux de complétion des données est largement en faveur de la fresh daily à J0, l’intraday étant à des volumes globalement très bas.

Niveau de complétion de l’attribution des traffic source sur les tables Fresh daily et Intraday à J0 (par rapport à l’interday, notre base 100, à timestamp équivalent)

3) Completion time and data availability

By tracking tables over several days, we have identified completion times and availability times for each table type. Here’s what we observed:

  • Intraday These tables remain active for about two days before switching to interday tables.
  • Fresh daily These tables reach 99% completeness within a day and a half, and remain virtually complete, although they never reach 100% of interday tables. They are available around half a day before interday, offering an early preview of performance.

Our programmed queries allow us to establish the creation, deletion and completion dates of the 3 types of tables, which we can summarize with this diagram:

Time of data creation and completion for Fresh daily, interday and intraday tables

We also know that the creation time of the interday table is fairly unreliable, varying by almost 10 hours (from 8am to 6pm), and that this creation time has tended to shift sharply since the beginning of 2024.

To maximize the efficiency of real-time analyses, two methods are available to monitor the creation and completion of Fresh daily, intraday and interday tables:

1. Table metadata query : The following query identifies table creation and modification dates. However, it does not apply to intraday tables, which are temporary and can only be retrieved live.

				
					SELECT
  table_id AS table_name,
  FORMAT_TIMESTAMP('%Y-%m-%d', DATETIME(TIMESTAMP_MILLIS(creation_time), "America/New_York")) AS creation_date,
  FORMAT_TIMESTAMP('%H:%M', DATETIME(TIMESTAMP_MILLIS(creation_time), "America/New_York")) AS creation_time,
  FORMAT_TIMESTAMP('%Y-%m-%d', DATETIME(TIMESTAMP_MILLIS(last_modified_time), "America/New_York")) AS last_modified_date,
  FORMAT_TIMESTAMP('%H:%M', DATETIME(TIMESTAMP_MILLIS(last_modified_time), "America/New_York")) AS last_modified_time
FROM
  `nameofyourdataset.__TABLES__`
WHERE
  (table_id LIKE 'events_fresh_202409%' OR table_id LIKE 'events_fresh_202410%')
  AND DATETIME(TIMESTAMP_MILLIS(creation_time), "America/New_York") BETWEEN DATETIME('2024-09-19 00:00:00') AND DATETIME('2024-10-31 23:59:59')
ORDER BY
  table_name ASC;
				
			

2. Tracking via scheduled queries By creating scheduled queries every hour, we can detect when tables are being created or deleted (when the scheduled query is no longer in error), particularly intraday. This monitoring can also help avoid errors linked to queries that are too early on incomplete data.

Status of our scheduled query, until it reaches KO when the fresh daily table no longer exists in BigQuery.

Lessons learned and key questions

Une des interrogations majeures de notre analyse concerne la précision des tables Fresh daily comparée à celle des tables interday, en particulier lorsque la Fresh daily atteint un niveau de complétude proche de 100 %. Les questions suivantes se posent alors :

When the Fresh daily is complete, does it correspond to 100% of the interday data?

To answer this question, we have compared the event completion rates in the Fresh daily and interday tables. We observe that, although Fresh daily often achieves over 99% completeness, it does not always correspond exactly to the 100% guaranteed by interday tables. This slight difference could be significant in contexts where maximum accuracy is required, such as closing reports or performance audits.

At what point can it be considered reliable and replace interday?

For uses focused on rapid decision-making, the Fresh daily table can be considered reliable as soon as it approaches 99% completeness, often as early as D+1. It thus offers a faster solution than interday, which is available half a day to a day later. This early availability of Fresh daily can be particularly advantageous for continuous performance analysis and anomaly alerts, where 99% completeness is generally sufficient.

As an example, let’s look at the data for 18/10 (after the last modification date):

Si je souhaite monitorer la donnée sur la journée en cours, quelle table dois-je utiliser ?

La disponibilité anticipée peut être particulièrement avantageuse pour les analyses de performance continue et les alertes d’anomalies.

Quelle est alors la meilleure table choisir à un instant T ?

Nous avons vu précédemment que l’intraday était créé en moyenne 1h30 avant la fresh daily. Si c’est la disponibilité immédiate de la donnée qui est recherchée, il est préférable de s’appuyer sur l’intraday.

Toutefois, la donnée intraday est loin d’être précise, comme nous avons pu le voir avec les purchases notamment.

C’est pourquoi, dans la mesure du possible, il est donc préférable de se reposer sur la fresh daily, disponible quelques heures plus tard, mais plus complète et plus fiable, en attendant l’apparition de l’interday.  

Par ailleurs, si vous souhaitez analyser les canaux d’acquisition, la donnée intraday est prohibée car non fiable sur ces dimensions, à la différence du Fresh daily et de l’interday.

Conclusion

Our study shows that Fresh daily tables offer an effective alternative to intraday tables for “fast” analytical needs (you’ll still have to wait a few hours), particularly in revenue reporting. Although they don’t achieve the 100% completeness of interday tables, their proximity (often over 99%) and early availability make them valuable for companies seeking greater reactivity for their analyses.

Advantages of the Fresh daily :

  • Over 99% completeness on D+1.
  • Reliable for rapid reporting and anomaly detection.

Limits :

  • No guarantee of absolute accuracy for complex events.
  • Slightly less precise than interday, but with almost complete coverage.
  • It takes a few hours to obtain complete data above 95%.

These results show that Fresh daily is the solution of choice for companies seeking a balance between reactivity and precision, with data available earlier to facilitate informed decision-making, even if we’re still a long way from real time.

Share the Post:

Subscribe to our newsletter