Decision Trees (DTs) are a non-parametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.

The Gini index is the impurity measure (splitting criterion) used by the decision tree…
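
A minimal sketch of how a Gini index can be computed for a set of class labels, assuming the standard definition (1 minus the sum of squared class proportions). The helper name `gini_impurity` and the sample labels are hypothetical, chosen only for illustration:

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a collection of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention for an empty node (assumption)
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Illustrative labels only, not data from the article.
pure = gini_impurity(["a", "a", "a", "a"])   # a node holding a single class
mixed = gini_impurity(["a", "a", "b", "b"])  # an evenly mixed two-class node
```

A node with a single class scores 0, and an evenly mixed two-class node scores 0.5, so a lower value means a purer node.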


Linear regression places no constraints on its predicted values (e.g., it can predict a negative house price), so we use logistic regression whenever the output is categorical.

  • we take a linear combination (a weighted sum) of the input features
  • we apply the sigmoid function to the result to obtain…
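
The two steps above can be sketched as follows. This is a minimal illustration, not the article's implementation: the names `sigmoid` and `predict_proba`, the bias term, and the sample weights are assumptions. The sigmoid maps any real number into the interval (0, 1), which logistic regression reads as a probability.

```python
import math


def sigmoid(z):
    """Numerically stable sigmoid: maps any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow for large negative z
    return ez / (1.0 + ez)


def predict_proba(features, weights, bias=0.0):
    """Weighted sum of the features, passed through the sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Illustrative numbers only: the weighted sum is 0.5*2.0 + 1.0*(-1.0) = 0.
p = predict_proba([2.0, -1.0], weights=[0.5, 1.0])
```

A weighted sum of exactly 0 gives a probability of 0.5; larger positive sums push the output toward 1 and larger negative sums toward 0.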


ML is the process of training a model to find the best possible parameters for describing the relationship between a set of features and a target. An ML model has three components:

  1. Model : How to represent the relationship between the target and the features, e.g., a linear equation, a weighted sum, a decision tree, etc.
  2. Cost Function : To…


Image Courtesy: https://studyonline.unsw.edu.au/blog/types-of-data

Descriptive Statistics

Describes our collected data

Analyzing Quantitative Data

Four aspects of analyzing Quantitative data:

  1. Measures of Center
  2. Measures of Spread
  3. Shape of the data
  4. Outliers

Histograms are used to visualize quantitative data. They show the frequency distribution of the values.

Measures of Center

  1. Mean : Average of values
  2. Median : Median splits data into 50%…
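
A short sketch of the mean and the median using Python's standard `statistics` module. The sample values are made up for illustration and include one deliberately large value to show how the two measures react to an outlier:

```python
import statistics

# Made-up sample with one large outlier (100).
values = [2, 3, 5, 8, 100]

mean = statistics.mean(values)      # sum of values divided by their count
median = statistics.median(values)  # middle value of the sorted data
```

Here the mean is 23.6 while the median is 5: the single outlier pulls the mean upward but leaves the median unchanged.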


The dataset contains simulated data that mimics customer behavior on the Starbucks rewards mobile app. Starbucks sends out offers to mobile app users. An offer can be an advertisement for a drink or an actual offer such as a discount or BOGO (buy one, get one). Not all users receive the same offer, and some users might…


Airbnb is one of the most popular options for travelers because it is convenient and affordable. While planning a stay, travelers look at multiple factors for the shortlisted properties, one of the most important being the price of the property. In this blog, we will analyze Seattle property listings.

This project provides us with…


Data scientists need software engineering skills both when implementing a solution in production and when collaborating with software engineers. One of the most important software engineering skills is writing efficient and readable code. In this article, I will cover a few coding best practices that data scientists should follow before…


Data analytics is a main driving force of change for HR professionals across industries. From hiring the right talent to increasing the employee retention rate, HR analytics can change it all.

“Today HR has a seat at the table, and in order to maintain that business partnership, you need…

Chetna Shahi
