What is data cleaning?
Data cleaning allows you to exclude certain elements of a data set that may otherwise distort the analysis.
The available cleaning options vary with the data source you are uploading, and may include the removal of:
Retweets
Spam and promotions
Duplicates and similar posts
Posts from bots, public figures and organizations
Data cleaning can be enabled at the time of upload, or afterwards via the Data Library.
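To illustrate what options such as removing retweets and duplicates do conceptually, here is a minimal Python sketch. It is not the platform's actual implementation: the function name `clean_posts`, its parameters, and the rules used (a retweet is any post starting with "RT @"; a duplicate is any post that matches an earlier one after lowercasing and collapsing whitespace) are all assumptions made for this example.

```python
import re


def clean_posts(posts, remove_retweets=True, remove_duplicates=True):
    """Return the posts that survive the enabled cleaning options.

    Hypothetical helper for illustration only; the rules below are
    simplifying assumptions, not the platform's real logic.
    """
    cleaned = []
    seen = set()
    for text in posts:
        # Assumption: a retweet is a post whose text starts with "RT @".
        if remove_retweets and text.startswith("RT @"):
            continue
        # Normalize case and whitespace so trivially different copies match.
        key = re.sub(r"\s+", " ", text).strip().lower()
        if remove_duplicates and key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


posts = [
    "Loving the new update!",
    "RT @someone: Loving the new update!",
    "loving the  new update!",
    "Not a fan of the redesign.",
]

print(clean_posts(posts))
# ['Loving the new update!', 'Not a fan of the redesign.']
```

In this sketch, the retweet and the near-identical copy are dropped, so each opinion is counted once. Without cleaning, repeated posts like these would inflate how often a phrase appears in the analysis.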
Data cleaning at the time of upload
When creating a new project or question, select the applicable data source/format when prompted
Select the applicable cleaning options you wish to enable and click 'Save'
Enabling data cleaning after data has been uploaded
Navigate to the relevant project folder in the Library
Use the checkboxes to select the data set(s) for which you want to enable or change cleaning options
Select the washing machine icon at the top of the screen
Select the desired cleaning options and click 'Okay'
The results of any existing comparisons that include data sets for which you have changed the cleaning options will automatically be updated.