Make sure you do these while data cleaning!

Dhaval Thakur
3 min read · Mar 19, 2022

If you work in data analytics or data science, you probably already know that data cleaning is the most crucial task before you start analysing your data, whether to grow your business or to answer a question you are looking into!

Clean data the right way (credits: AnalyticsIndamag)

I won't repeat what data cleaning is and why it's important; instead, I will guide you through the things to take care of while doing it:

  • Take care of misspellings

Misspellings are mostly caused by typing errors. Wrong spellings of common words can be detected and corrected, but because databases contain a huge amount of unique data, spelling mistakes are hard to detect at the input level. Spelling mistakes in data such as names and addresses are especially difficult to identify and correct.
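One way to catch likely misspellings is to compare each value against a list of known-good spellings and accept only close matches. The snippet below is a minimal sketch using Python's standard `difflib` module; the `KNOWN_CITIES` list, the `suggest_correction` helper and the `0.8` cutoff are assumptions made for illustration, not part of any particular cleaning pipeline.

```python
import difflib

# Hypothetical reference list of valid values (an assumption for illustration).
KNOWN_CITIES = ["Berlin", "Munich", "Hamburg", "Cologne"]

def suggest_correction(value, vocabulary=KNOWN_CITIES, cutoff=0.8):
    """Return the value if it is already valid, else the closest known spelling, else None."""
    if value in vocabulary:
        return value
    # get_close_matches ranks candidates by similarity and drops those below the cutoff.
    matches = difflib.get_close_matches(value, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

raw = ["Berlin", "Berlim", "Hamburgg", "Paris"]
cleaned = {value: suggest_correction(value) for value in raw}
```

Values that come back as `None` have no close match in the reference list and are best reviewed by hand. For names and addresses there is usually no complete reference list, which is why those fields stay hard to correct automatically.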

  • Take care of Lexical Errors

Lexical errors are discrepancies between the structure of the data items and the specified format.
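One possible reading of this is a record whose number of values does not match the number of columns the format specifies. The sketch below checks for that with the standard `csv` module; the `EXPECTED_COLUMNS` schema, the `SAMPLE` data and the `find_lexical_errors` helper are all hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical schema and sample data, invented for illustration.
EXPECTED_COLUMNS = ["name", "city", "country"]
SAMPLE = (
    "name,city,country\n"
    "Anna,Berlin,Germany\n"
    "Ben,Munich\n"
    "Cara,Hamburg,Germany,extra\n"
)

def find_lexical_errors(text, expected=EXPECTED_COLUMNS):
    """Return (line_number, row) pairs whose field count differs from the schema."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [(reader.line_num, row) for row in reader if len(row) != len(expected)]

errors = find_lexical_errors(SAMPLE)
```

Here the row for Ben has too few values and the row for Cara has too many, so both are flagged along with their line numbers.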

  • Take care of Misfielded Values

A misfielded value is one that is correct as far as format is concerned but does not belong to the field.

For example: in the city field, the recorded value is Germany.
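A simple check for this kind of problem is to look for values in one field that really belong to another field's domain. The sketch below flags records whose city field holds a country name; the `COUNTRIES` set, the sample `records` and the `find_misfielded` helper are assumptions made for illustration.

```python
# Hypothetical lookup set of country names (an assumption for illustration).
COUNTRIES = {"Germany", "France", "India"}

# Hypothetical sample records.
records = [
    {"name": "Anna", "city": "Berlin"},
    {"name": "Ben", "city": "Germany"},
]

def find_misfielded(rows, field="city", foreign_values=COUNTRIES):
    """Return rows whose `field` holds a value that belongs to another field's domain."""
    return [row for row in rows if row[field] in foreign_values]

suspicious = find_misfielded(records)
```

The value "Germany" is a perfectly valid string, so a format check alone would not catch it; only a comparison against what the field is supposed to contain does.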

  • Take care of Irregularities
