Generally speaking, the more language you include in each set, the better.
Relative Insight can analyse language sets as small as 200 words, but small language sets are unlikely to yield any significant differences. For practical purposes, we recommend aiming for a minimum of 1,000 words in order to receive meaningful insights.
As an example, when looking at social listening data we generally recommend 1,000 tweets per language set as a good-sized sample. This would amount to approximately 15,000 words.
Our recommended upper limit is 2,000,000 words. Beyond that point, additional language is unlikely to bring about new insights.
Keep in mind that larger datasets can take several hours to process and analyse. ⏰
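If you want to sanity-check a language set before uploading it, the size guidance above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and counting words by splitting on whitespace is an assumption that may not match exactly how Relative Insight counts words.

```python
# Thresholds taken from the guidance above.
MIN_WORDS = 200                    # smallest set that can be analysed
RECOMMENDED_MIN_WORDS = 1_000      # practical minimum for meaningful insights
RECOMMENDED_MAX_WORDS = 2_000_000  # recommended upper limit


def count_words(texts):
    """Count whitespace-separated words across an iterable of strings.

    Assumption: a simple whitespace split approximates the word count.
    """
    return sum(len(text.split()) for text in texts)


def assess_language_set(word_count):
    """Return a short description of where a word count falls in the guidance."""
    if word_count < MIN_WORDS:
        return "below the 200-word minimum"
    if word_count < RECOMMENDED_MIN_WORDS:
        return "analysable, but unlikely to yield significant differences"
    if word_count <= RECOMMENDED_MAX_WORDS:
        return "within the recommended range"
    return "above the recommended upper limit"


# The social listening example: 1,000 tweets at roughly 15 words each.
estimated_words = 1_000 * 15
tweet_set_status = assess_language_set(estimated_words)
```

Here `estimated_words` comes to 15,000, which sits comfortably inside the recommended range. For your own data, pass a list of documents to `count_words` and feed the result to `assess_language_set`.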