Getting value out of a sunburst

Sunbursts make people look smart on a slide, but making them actionable is much harder. Additional modeling on top of the journey visualization is what turns them into real insights.

What is a sunburst?

First things first, let’s clarify the topic: although a beautiful sunset probably has more value than this whole article, that is NOT what we’re going to talk about here. Yep, sorry, this page is about data analytics.

A ring chart, also known as a sunburst chart or a multilevel pie chart, is used to visualize hierarchical data, depicted by concentric circles. 

Wikipedia

In digital analytics, and especially in CRO (Conversion Rate Optimization), it is used to describe user behavior on a website: how people navigate from one page to another. The innermost ring is basically a pie chart of your landing pages, the second ring shows where people go after that first page, and so on until they ultimately leave your website, with or without buying (and that is the question).

Note that this type of chart can also be used outside of web product optimization. For instance, you can take the same approach to describe how people ultimately land on your site through different acquisition channels, where the touchpoints are no longer web pages but categories of acquisition channels.

Sunbursts are trendy in the CRO ecosystem because they are relatively new, easy to understand, and rather good-looking. Much more so, at least, than the ugly “behavior flow” of Google Analytics, although it says the same thing.

The ugly and old “behavior flow” of Google Analytics

A sunburst in Datama. Read the highlighted path as: 0.3% of visitors saw a product page, temporarily left the site, came back in another session on a product page, and so on, until they finally left the website from a product page.

How do you get it?

There are many ways you can get a sunburst describing your user journey, and everything depends on how you collect the data. 

Some tools like Content Square provide it right in their platform, but at a significant cost (often hundreds of k€). What follows aims to give you the keys to do it much more cheaply, with the web analytics tracking tool that you’re already using (and likely paying for). For what it’s worth, typical fees at Datama for those types of assignments are below 10k€, and they come with actionable insights.

We’ll give you a glimpse of how to get it from Google Analytics because it represents more than 80% of the web analytics market. However, it’s also possible with other tools like Adobe or AT Internet. All you need is access to session-level data so that you can track the journey consistently.

GOOGLE ANALYTICS

Google 360 customers with BigQuery

Strangely, I’ve met few people who master BigQuery queries well enough to extract what they cannot get from the Google Analytics interface. This is probably because web analytics is a complicated area for general data analysts, SQL is (wrongly) not a prerequisite for being a web analyst in most companies, and nested tables are frightening at first. Yet it’s amazing how much you can do once you start playing with BigQuery on top of Google Analytics tables.

Anyhow, the idea is to write a query that ends up with a result that looks a bit like the screenshot below, with a “Journey” column that contains the pages that the user has seen in chronological order (something like “Page 1 – Page 2 – Page 3”), and the number of users who did this. 

I won’t go into the technical details here, but you need to know three things: the GA BigQuery export schema is your reference, STRING_AGG is the function you’re looking for, and you must be careful about what you count and what you don’t, because it is easy to get hit definitions wrong.

What the results of your query should look like in BigQuery
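To make the target shape concrete, here is a minimal Python sketch of the logic such a query performs: group hits by session, order them chronologically, join the pages into one “Journey” string, and count. The input rows and their field names are hypothetical and are not the actual GA export schema, and the count is per session rather than per user. In BigQuery you would express the same thing in SQL with STRING_AGG.

```python
from collections import Counter, defaultdict

# Hypothetical flattened pageview hits: (session_id, hit_number, page_type).
# These field names are illustrative, not the actual GA export schema.
hits = [
    ("s1", 1, "Home"), ("s1", 2, "Search"), ("s1", 3, "Product"),
    ("s2", 2, "Product"), ("s2", 1, "Home"),
    ("s3", 1, "Home"), ("s3", 2, "Search"), ("s3", 3, "Product"),
]

def build_journeys(hits, separator=" - "):
    """Return a Counter mapping each journey string to its number of sessions."""
    pages_by_session = defaultdict(list)
    for session_id, hit_number, page in hits:
        pages_by_session[session_id].append((hit_number, page))

    journeys = Counter()
    for pages in pages_by_session.values():
        # Sort by hit number so that pages appear in chronological order,
        # which is the role an ORDER BY inside STRING_AGG would play in SQL.
        ordered = [page for _, page in sorted(pages)]
        journeys[separator.join(ordered)] += 1
    return journeys

journey_counts = build_journeys(hits)
```

Deciding which hits go into the input (pageviews only, events, repeated views of the same page) is exactly the “what you count and what you don’t” question, and this sketch deliberately leaves it out.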

Google Analytics free customers 

Side note for people who are not Google 360 customers and don’t have access to BigQuery: you can still do it, although the result might be a bit less detailed. Google gives you access to the landing page path, the exit page path, the previous page path, and the current page path. That’s quite enough to get at least a five-level sunburst, as long as you don’t mind doing some hard thinking and you have a language to tap into the GA reporting API to avoid sampling (and if you don’t, I can certainly point you to the excellent googleAnalyticsR package by Mark Edmondson, which does an awesome job at it).

A screenshot from the Dimensions & Metrics explorer for GA. The last 4 are the ones you’re looking for!

Visualize it

Getting the dataset is the hard part. Visualizing the chart itself is much easier. There are many ways to visualize the data you’ve built as a sunburst, and most of them use D3.js under the hood. The funny part is that a former Google employee actually published the original article behind this (see here), but I’ve never seen it in Google Analytics, nor in Data Studio.

If you’re familiar with Plotly (either in R or Python), they have a great template that you can easily use with your favorite language. At Datama we instead use the R sunburstR package by Kent Russell, which does a great job of displaying sequential journeys.

Sunburst function in sunburstR package in R
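If you go the Plotly route in Python, the journey counts first have to be turned into the parallel lists of nodes that a sunburst trace expects. The sketch below is one possible way to do that with the standard library only. The function name and the " - " separator are assumptions made for illustration, and page names that themselves contain the separator would break it.

```python
from collections import defaultdict

def journeys_to_nodes(journey_counts, separator=" - "):
    """Turn {journey string: count} into ids/labels/parents/values lists."""
    totals = defaultdict(int)
    for journey, count in journey_counts.items():
        steps = journey.split(separator)
        # Every prefix of a journey is a node of the chart; its value is
        # the total traffic that passed through that prefix.
        for depth in range(1, len(steps) + 1):
            totals[separator.join(steps[:depth])] += count

    ids = sorted(totals)
    labels = [node_id.split(separator)[-1] for node_id in ids]
    # The parent of a node is the same path without its last step
    # (an empty string for first-ring nodes).
    parents = [separator.join(node_id.split(separator)[:-1]) for node_id in ids]
    values = [totals[node_id] for node_id in ids]
    return {"ids": ids, "labels": labels, "parents": parents, "values": values}

nodes = journeys_to_nodes({"Home - Search - Product": 2, "Home - Product": 1})
# With Plotly installed, these lists can be passed to a sunburst trace, e.g.
# plotly.graph_objects.Sunburst(branchvalues="total", **nodes).
```

Because each node’s value already includes the traffic of its children, the trace should be told that branch values are totals, hence the `branchvalues="total"` hint in the comment.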

Note that you can also integrate a chart generated in R, Python, or JS into your favorite reporting tool, such as Tableau, Power BI, or Data Studio.

There are also ways to create a native sunburst in Tableau (see this incredible hack here; take your time, and avoid having more than 1,000 rows given the number of nested calculations involved!) or with this extension in Power BI.

How do you get value out of it?

Well, that’s actually the only good question. Sunbursts may look good, but they are hard to read, hard to compare, and hard to make actionable. So you probably want to do a bit more modeling before using them in real life. Below are two use cases that we use extensively at Datama.

USE CASE 1 : UNDERSTANDING CUSTOMER BEHAVIORS – SIMILARITY INDEX

We developed this use case right after the COVID-19 lockdown in 2020. Traditional website KPIs were really low. In particular, clients in the travel industry were badly hit in terms of traffic and conversion. So, after a few weeks in firefighting mode, the question from our clients was: how do I know when my customers start to behave “normally” on my website again, so that I can start reinvesting in my acquisition campaigns?

One could say that you can just look at conversion (transactions / sessions), and that it is a good indicator of whether people are coming back to your website to purchase, or still only to find out how to deal with the impacts of COVID. But it may not be entirely sufficient, for two main reasons:

  • Conversion is based on transactions, which were very low at the time, so the signal is quite volatile and an increase or decrease may not be statistically significant.
  • There is some delay (typically one or two weeks) between the first time a user visits a website and the moment they convert, in particular in the travel industry. So you may have a lag between the moment people start to behave normally and the moment you see conversion going back up.

Another option would be just to look at the volume of traffic, but if you use that to trigger your acquisition investments, then it’s a self-fulfilling prophecy… no investment -> no traffic, and no traffic -> no investment…

So we ended up asking: “why not measure how similar user journeys are between two periods?” The idea is basically to compare the volumes on each journey (each “radius” of the sunburst) between one period and another, and end up with an index between 0 and 100%. 100% means that you have exactly the same split of traffic, and 0% means that the two periods have no journeys in common.

The concept of similarity index
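The exact formula is not spelled out here, so the following is only a sketch under an assumption: it measures similarity as the overlap between the two periods’ traffic shares, that is, the sum over journeys of the smaller of the two shares. That choice gives 100% for identical splits and 0% when no journey is shared, as described above, but other definitions would satisfy the same two properties.

```python
def similarity_index(journeys_a, journeys_b):
    """Overlap of journey shares between two periods, between 0.0 and 1.0.

    Assumption: similarity = sum over journeys of min(share_a, share_b).
    Inputs are {journey string: number of sessions} dictionaries.
    """
    total_a = sum(journeys_a.values())
    total_b = sum(journeys_b.values())
    if total_a == 0 or total_b == 0:
        return 0.0  # nothing to compare against

    # Journeys seen in only one period contribute nothing to the overlap.
    shared = set(journeys_a) & set(journeys_b)
    return sum(
        min(journeys_a[j] / total_a, journeys_b[j] / total_b) for j in shared
    )

# Illustrative numbers only, not real client data.
this_week = {"Home - Search - Product": 50, "Home": 50}
same_week_last_year = {"Home - Search - Product": 30, "Home": 50, "Home - Help": 20}
index = similarity_index(this_week, same_week_last_year)  # about 0.8, i.e. 80%
```

Working on shares rather than raw volumes keeps the index independent of the overall traffic level, which matters when one of the two periods has far fewer visits.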

The cool thing is that you can monitor that index over time (each week vs. the same week last year, to net out the effect of seasonality) and use it to inform your decision to reopen investments at the right time, ahead of the competition. Experience has shown that the index evolves more steadily over time than conversion, and that it is helpful for understanding when people start to behave normally again.

An example of similarity index in Tableau

During the COVID, we observed a significant difference in customer journey (type and order of web pages viewed by our web visitors) with the help of Datama Solution.
The similarity index is strongly correlated to the evolution of the crisis : shock, denial, doubt, accept, adapt.

Fanjuan Shi
Analytics & Data Science director, at Pierre et Vacances Center Parcs

USE CASE 2: VALUE ATTRIBUTION – DATAMA JOURNEY

Another interesting way to get value out of a sunburst is to create a value attribution model that allocates value to each page of your website based on how journeys convert.

For instance, if you have a converting journey “Home – Search – Product – Purchase” that brings you 100€, you can use a constant attribution model to allocate 25€ to each page. Hence your Homepage is worth 25€, your Search page 25€, your Product page 25€, and your Purchase page 25€.

After this, you can use that attribution to allocate “lost value” to non-converting journeys. For instance, continuing the example above, if you have a “Search – Product” journey that doesn’t convert, you could say that it lost a potential value of 50€ (25 + 25) because of the Product page (which is the exit page). So the Product page is responsible for bringing in 25€ of value, but also for losing 50€ of virtual value.
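The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration of the constant attribution model described above, not the implementation used in Datama Journey: the function name and the " - " separator are assumptions, and it ignores volumes by treating each page’s earned total as its value.

```python
from collections import defaultdict

def attribute_value(converting, non_converting, separator=" - "):
    """Constant attribution of earned value, then lost value by exit page.

    converting: list of (journey string, revenue) pairs.
    non_converting: list of journey strings that ended without a purchase.
    """
    earned = defaultdict(float)
    for journey, revenue in converting:
        pages = journey.split(separator)
        # Constant model: every page of the journey gets an equal share.
        for page in pages:
            earned[page] += revenue / len(pages)

    lost = defaultdict(float)
    for journey in non_converting:
        pages = journey.split(separator)
        # The potential value of the journey is the sum of its pages' values,
        # and all of it is charged to the exit page (the last page seen).
        lost[pages[-1]] += sum(earned.get(page, 0.0) for page in pages)
    return dict(earned), dict(lost)

earned, lost = attribute_value(
    converting=[("Home - Search - Product - Purchase", 100.0)],
    non_converting=["Search - Product"],
)
# earned: 25.0 for each of Home, Search, Product and Purchase
# lost: 50.0 charged to Product
```

With real data you would have many journeys of each kind, and you would need to normalize page values by the number of journeys before computing lost value, which is where it gets tricky.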

Doing this allocation and dealing with volumes can get a little bit tricky, so we’ve built an automated tool called Datama Journey that helps you do that (you can test it here). 

An exciting use of this approach in CRO is to prioritize your roadmap of A/B tests across pages based on where you gain value and where you lose it. This has proven tremendously helpful in helping product teams be data-driven, on top of other insights from customer feedback or competitive watch. We recently did a webinar on this that I encourage you to watch here!

Example of AB test prioritization matrix in Datama Journey
