How can I clean my data?

Learn how Relative Insight can help clean your data to prevent noise such as duplicates, spam and retweets from distorting your analysis.

Written by Trish Pencarska
Updated this week

What is data cleaning?

Data cleaning allows you to exclude certain elements of a data set that may otherwise distort the analysis.

The available cleaning options will vary depending on the data source you are uploading, but may include the removal of:

  • Retweets

  • Spam and promotions

  • Duplicates and similar posts

  • Posts from bots, public figures and organizations
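To make the idea concrete, here is a minimal Python sketch of what removing retweets and duplicates from a list of posts could look like. It is an illustration only and not how Relative Insight implements cleaning: the function name `clean_posts`, the "RT @" prefix check and the case/whitespace normalization are all assumptions chosen for the example.

```python
import re


def clean_posts(posts, remove_retweets=True, remove_duplicates=True):
    """Return the posts with retweets and duplicate texts filtered out.

    Hypothetical helper for illustration; not part of any Relative Insight API.
    """
    seen = set()
    cleaned = []
    for text in posts:
        # Assumption: a retweet is any post whose text starts with "RT @".
        if remove_retweets and text.startswith("RT @"):
            continue
        # Normalize whitespace and case so trivially different copies match.
        key = re.sub(r"\s+", " ", text).strip().lower()
        if remove_duplicates and key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


posts = [
    "Love the new trainers!",
    "RT @shop: Love the new trainers!",
    "love the new  trainers!",
    "Delivery took two weeks.",
]
print(clean_posts(posts))
# → ['Love the new trainers!', 'Delivery took two weeks.']
```

In this sketch, repeated posts would otherwise be counted several times, which is the kind of distortion the cleaning options are meant to prevent.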

Data cleaning during upload

  1. Upload your data file and select your metadata.

  2. Once the file has been processed, the cleaning options will appear. Use the tickboxes to select what you would like to remove and click 'Continue'. The platform will then remove the selected data.

Data cleaning after upload

  1. Navigate to the relevant project folder in the Dashboard.

  2. Click the three dots next to the data set you wish to clean and select 'Clean data'.

  3. Select the desired cleaning options from the pop-up screen and click 'Okay'.

Please note: The duplicates cleaning option cannot remove duplicates that occur across data sets that have been combined.
