Member-only story
Data Cleaning & Transformation using SQL Only!
When we talk about data cleaning, python or R are the only programming languages that come to our mind. True! these both languages have awesome libraries to clean the data before we can start out analysis or build machine learning models . But I feel that SQL is still underrated and we can do normal data cleaning directly using some of the uncommon SQL functions!
TRIM
The TRIM
function is used to remove characters from the beginning and end of a string. Here's an example:
SELECT column,
TRIM(position 'characters' FROM column)
FROM Tablename
The TRIM
function takes 3 arguments. First, the position, whether you want to remove characters from the beginning , the end , or both . Next ,we have to specify all characters to be trimmed.The last argument is the column name.
Parsing Dates
If the date is written as a string we can surely convert it to date data type using CAST
. But this only works when it is in an SQL identified format. But what if it is other formats like MM/DD/YYYY? We first need to convert it to a string in an acceptable format and then cast it. For example,
(SUBSTR(date, 7, 4) || ‘-’ || LEFT(date, 2) || ‘-’ || SUBSTR(date, 4, 2))::date AS cleaned_date