What is Logistic Regression?

TLDR: “Logistic regression is a classification algorithm. It assigns each data point to one of a set of classes, most commonly one of two.”

For example, we can use logistic regression to classify a bird based on independent variables such as its color and size.

Suppose you have a lot of emails in your inbox, but your personal mail server has no spam filter. You have to build a system that classifies each email as spam or not spam.

You can classify an email as spam by using some parameters or variables. These are called independent variables. For example, you may decide that an email is spam if it comes from a particular email address, or from a particular IP address or server. Another variable can be a “list of words”: if these words appear in the email’s subject or body, you can identify the email as spam.
From these independent variables we can work out the dependent variable, “spam”. The algorithm that maps the independent variables to the dependent variable is called logistic regression.
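
To make the spam example concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the three yes/no features (`sender_blocked`, `server_suspicious`, `has_spam_words`), the tiny hand-made dataset, and the function names `train` and `spam_probability`. It fits the weights with simple gradient descent, which is one common way to train logistic regression, not the only one.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def spam_probability(weights, bias, email):
    """Weighted sum of the independent variables, passed through the sigmoid."""
    score = sum(w * x for w, x in zip(weights, email)) + bias
    return sigmoid(score)

def train(emails, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by gradient descent on the log loss."""
    weights = [0.0] * len(emails[0])
    bias = 0.0
    for _ in range(epochs):
        for email, label in zip(emails, labels):
            error = spam_probability(weights, bias, email) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, email)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical toy data, one row per email:
# [sender_blocked, server_suspicious, has_spam_words]
emails = [
    [1, 1, 1], [1, 0, 1], [0, 1, 1], [1, 1, 0],   # spam
    [0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],   # not spam
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # dependent variable: 1 = spam, 0 = not spam

weights, bias = train(emails, labels)
print(spam_probability(weights, bias, [1, 1, 1]))  # close to 1
print(spam_probability(weights, bias, [0, 0, 0]))  # close to 0
```

In this made-up dataset an email is spam when at least two of the three warning signs are present. A real filter would need far more emails and richer variables, but the idea is the same: independent variables go in, a spam probability comes out.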

When to use Logistic regression:

If your question can be answered with YES or NO, then you can apply logistic regression.

For example:

Is the email spam? Yes / No;

Do the symptoms (independent variables) indicate a risk of heart attack? Yes / No;

Is the transaction fraudulent? Yes / No;

In this way, you can decide when to use logistic regression.
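
Behind each Yes / No answer, logistic regression actually produces a probability between 0 and 1, and you turn it into an answer by comparing it to a cut-off. The sketch below shows that last step only. The function name `yes_or_no` and the default cut-off of 0.5 are assumptions for illustration; `score` stands for the weighted sum of the independent variables that a trained model would compute.

```python
import math

def yes_or_no(score, threshold=0.5):
    """Convert a raw model score into a Yes / No answer."""
    # The logistic (sigmoid) function maps the score to a probability.
    probability = 1.0 / (1.0 + math.exp(-score))
    return "Yes" if probability >= threshold else "No"

print(yes_or_no(2.0))                 # probability is about 0.88
print(yes_or_no(-2.0))                # probability is about 0.12
print(yes_or_no(2.0, threshold=0.9))  # a stricter cut-off for the same score
```

Raising the cut-off makes the model say "Yes" less often, which can be useful when a false alarm is costly, for example when a genuine email would be thrown away as spam.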

Photo Credit: Wikipedia