I usually write all my tutorials in Python because I find it quite easy. But few data enthusiasts know the importance of the R language, so I thought I would do a project in R. What could be better than starting with a simple project on the most common topic in text analytics: Sentiment Analysis?

Image Credit: Dhaval

As this is the voting season, I thought of analyzing the sentiments of the tweets sent by famous personalities. Thus I have kept this tutorial in a simple format. …


If you are into data science and looking for starter projects, then the SMS Spam Classification project is one you should work on! In this tutorial, we will go step by step from importing libraries to full model prediction, and finally measure the accuracy of the model.

Image by Dhaval (drawn on iPad)

About SMS Spam Classification

A good text classifier is a classifier that efficiently categorizes large sets of text documents in a reasonable time frame and with acceptable accuracy, and that provides classification rules that are humanly readable for possible fine-tuning. If the training of the classifier is also quick, this could become in some application…


Whenever we hear the term Data Science, we think about large data and machine learning algorithms that help data scientists predict values or classify outcomes into two or more classes.

But if you ask any data scientist, read a few textbooks, or work on enough projects, you will realise that data people spend 70–80 percent of their time on data cleaning.

What is Data cleaning?

It is the process of detecting and correcting corrupt or inaccurate records from a record set, table, or database, and refers to identifying incomplete, incorrect…
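To make the definition concrete, here is a minimal sketch of detecting and correcting bad records in plain Python. The sample records, the `clean_record` helper, and the 0 to 120 age range are all hypothetical choices made for illustration, not something taken from the original article.

```python
# Hypothetical sample data: each record should have a name and a plausible age.
records = [
    {"name": " Alice ", "age": "34"},   # fixable: stray whitespace, age stored as text
    {"name": "Bob", "age": "-5"},       # inaccurate: negative age
    {"name": "", "age": "41"},          # incomplete: missing name
    {"name": "Carol", "age": "abc"},    # corrupt: non-numeric age
]


def clean_record(record):
    """Return a corrected copy of the record, or None if it cannot be repaired."""
    name = record.get("name", "").strip()
    if not name:
        return None  # incomplete record
    try:
        age = int(record.get("age", ""))
    except ValueError:
        return None  # corrupt value
    if not 0 <= age <= 120:
        return None  # inaccurate value (assumed valid range)
    return {"name": name, "age": age}


# Keep only the records that could be cleaned.
cleaned = [c for c in (clean_record(r) for r in records) if c is not None]
print(cleaned)
```

In a real project you would more likely flag or repair the rejected rows than silently drop them, but the detect-then-correct shape stays the same.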


If you are new to the world of data or data analytics, you are probably familiar with the job titles you could end up with, such as Data Analyst, Data Engineer, and Data Scientist. These roles might sound very similar, and you may think there is only a small difference between them, but there is NOT. In this article I will go through two well-known job descriptions in the world of data: Data Scientist and Data Engineer. …


I recently wrote an article on text analytics where I showed how to tokenize sentences without using any Python libraries. I received a good response to that tutorial.

Tokenizing sentences into words using NLTK

In this tutorial we will make our work in the field of NLP easier by using a Python library called NLTK.

What is NLTK?

NLTK (the Natural Language Toolkit) is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English, written in the Python programming language. NLTK includes graphical demonstrations and sample data. …


The YouTube Outro is the last segment of the video, and using it well is one of the best-practised SEO techniques for leaving an impact on the viewer. A compelling and unique Outro helps increase overall viewership and encourages the audience to take a certain action on the video. An attractive Outro helps in growing the channel using strategic techniques.

What Exactly is a YouTube Outro?

Before going into specifics about making an Outro, we need to know what exactly it is and why it is essential for the channel.

The Outro video helps optimize the video, increase viewer engagement, redirect more traffic, and increase subscribers…


Scroll Depth Tracking is commonly used nowadays. Business owners want to know about the users coming to their website and how they interact with the page, even if it is a single-page website.

Google Tag Manager has new built-in Variables and Triggers functionality to track scroll depth. We have covered every step you need to follow to implement it.

4 Simple Steps To Implementing Scroll Depth Tracking in GTM

Step 1: Enable the Scrolling Data Layer Variables

In GTM, go to Variables, click Configure under Built-In Variables, and under the Scrolling label enable all the options.

What do these 3 variables mean?

  1. Scroll Depth Threshold: this is the scroll depth of a user. This is…


Ever wondered what we call those image sliders on apps and websites like Amazon, Flipkart, etc.?

Those elegant-looking image sliders are known as Carousels. There are two popular, frequently used Dart packages for Carousel widgets.

In this blog, we will discuss only the second package mentioned above. The choice is personal, but I would recommend trying this one if you are a beginner or at an intermediate level of Flutter development.

Carousel_pro 1.0.0

This Dart package is really easy to implement just like any other Widget in Flutter. …


Tokenize sentences without any library in Python. Image credit: Dhaval

For the past few months, I have been tinkering with natural language processing concepts, and I got to know about NLTK, Keras, spaCy, and many other libraries. These libraries provide direct one-line functions for tokenization.

But during my master's, we were told not to use any libraries and to tokenize the sentences using only core Python features.

So in this article, I will go through the process of tokenizing sentences from scratch.

So let's start.
Background: We will take a corpus from a text file, tokenize its sentences, and…
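The idea of tokenizing with core Python only can be sketched roughly as follows. This is a minimal illustration and not the code from the tutorial itself: the `split_sentences` and `split_words` helpers and the sample corpus are hypothetical, and an inline string stands in for the text file. It treats `.`, `!` and `?` as sentence boundaries, which is a simplifying assumption that breaks on abbreviations such as "e.g.".

```python
def split_sentences(text):
    """Split text into sentences on '.', '!' and '?' using only core Python."""
    sentences = []
    current = []
    for ch in text:
        current.append(ch)
        if ch in ".!?":
            sentence = "".join(current).strip()
            if sentence:
                sentences.append(sentence)
            current = []
    # Keep any trailing text that has no closing punctuation.
    tail = "".join(current).strip()
    if tail:
        sentences.append(tail)
    return sentences


def split_words(sentence):
    """Lowercase a sentence, replace punctuation with spaces, split on whitespace."""
    cleaned = "".join(
        ch.lower() if ch.isalnum() or ch.isspace() else " " for ch in sentence
    )
    return cleaned.split()


# Hypothetical corpus; in practice this would be read from a text file.
corpus = "I like Python. Do you like R? Both are useful!"
sentences = split_sentences(corpus)
tokens = [split_words(s) for s in sentences]
print(sentences)
print(tokens)
```

To work from a file instead, read its contents with `open(path, encoding="utf-8").read()` and pass the result to `split_sentences`.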


Learn about data splitting

Whenever you are dealing with a large amount of data or building any type of model, such as a predictive or classification model, you need to know these terms to get reliable model evaluation metrics.

BONUS: I have talked about a concept called Data wrangling in this post, and you might like it!

First, let's see why we need to do the splitting. The thing is that we want to evaluate our model’s performance.

Why split the dataset in the first place?

The main and the most important purpose of splitting data into three different categories is…
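As a rough sketch of such a three-way split, the snippet below shuffles a dataset and cuts it into three parts. It assumes the three categories are training, validation, and test sets; the 70/15/15 ratio, the `split_dataset` helper, and the fixed seed are illustrative choices, not values from the original post.

```python
import random


def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle rows and split them into train, validation, and test lists."""
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must be between 0 and 1")
    shuffled = list(rows)                   # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)   # seeded for reproducible splits
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever remains
    return train, val, test


# Hypothetical dataset of 100 examples.
data = list(range(100))
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))
```

The model is fitted on the training part, tuned on the validation part, and scored once on the test part, so the final metric reflects data the model has never seen.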

Dhaval Thakur

Data Enthusiast, Geek, part-time blogger.
