Using your heatmap

The color of each tile represents the magnitude of the difference for that comparison (see "Understanding difference percentiles" below). Yellow marks the most different pair of data sets, while navy blue marks the least different pair.

This overview can help you craft a high-level narrative about competitors, products, customer segments, or geographic markets.

Clicking into a specific tile will also present you with several options:

  • View - this will take you to the comparison explorer, where you can see the full detail of the analysis and build insight cards

  • Dismiss - this icon can be used to note that you are not interested in further analysis for a particular comparison

  • Pin - this icon can be used to note that you want to do further investigation for the comparison

  • Favourite - this icon can be used to note a comparison that you have looked at and contains interesting insights

Understanding difference percentiles

The topical differences in a pair of data sets are compared to determine the difference percentile.

What are difference percentiles?

The difference percentile represents the magnitude of the differences present in a pair of data sets in comparison to a benchmark sample. The score (percentile) shown tells you how normal or abnormal a difference between the two data sets is.

For example, if the score is in the 90th percentile, only 10% of data set comparisons in the benchmark are more different. On the other hand, if it is in the 20th percentile, 80% of comparisons are more different, so you can assume that the two data sets are quite similar.

To view the difference percentile for each comparison, enable the toggle in the bottom-right of the heatmap.

How are difference percentiles calculated?

The difference percentile is derived from the correlation score between the linguistic features in your data sets.

The score for each pair of data sets is then compared against a benchmark distribution of difference scores, generated from a sample of over 5,000 comparisons, to show how different that pair is.
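As a rough illustration of the idea (not the product's actual implementation), a percentile can be read as the share of benchmark scores that fall below the score for your pair. The sketch below assumes a hypothetical function name, difference_percentile, and a made-up benchmark of ten scores standing in for the real sample of over 5,000 comparisons.

```python
from bisect import bisect_left


def difference_percentile(score, benchmark_scores):
    """Return the percentage of benchmark scores strictly below `score`.

    Hypothetical helper for illustration only; the real calculation
    may differ in how it handles ties and rounding.
    """
    ordered = sorted(benchmark_scores)
    below = bisect_left(ordered, score)  # count of scores < `score`
    return 100.0 * below / len(ordered)


# Made-up benchmark of ten difference scores (the real one is far larger).
benchmark = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50]

print(difference_percentile(0.47, benchmark))  # 90.0: only 10% are more different
print(difference_percentile(0.12, benchmark))  # 20.0: 80% are more different
```

With this reading, a higher percentile always means a more unusual difference, regardless of the scale of the underlying score.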

Why do we use difference percentiles instead of raw scores?

The difference percentile is used rather than the raw correlation score because it puts the magnitude of the difference between your two data sets in context, rather than presenting a bare decimal number.

Put simply, the colors on a heatmap are based on the scores of all the comparisons in your project, so they are relative to that project. The percentiles show how each difference compares with the wider landscape, as represented by a large benchmark sample of comparisons.
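To make the contrast concrete, here is a minimal sketch of what "relative to your project" could mean. It assumes, purely for illustration, that tile colors come from min-max scaling of the scores within one project; the function name relative_positions and the sample scores are hypothetical, and the actual color mapping is not documented here.

```python
def relative_positions(scores):
    """Scale scores to 0.0 (least different) .. 1.0 (most different).

    Assumed min-max scaling within a single project, for illustration only.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All comparisons are equally different; nothing to spread out.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Made-up difference scores for three comparisons in one project.
project_scores = [0.25, 0.5, 0.75]

print(relative_positions(project_scores))  # [0.0, 0.5, 1.0]
```

Under this assumption, the most different pair in a project is always shown in yellow, even if all pairs are fairly similar by benchmark standards. The percentile is what tells you whether that difference is large in absolute terms.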
