Probability and Likelihood

Probability deals with the exact outcome of an event: you know in advance what the possible outcomes of the event are and how likely each one is. With likelihood, you are not so certain about the outcomes; they depend on various factors. For example, say a toss is held between two teams before a match. The outcome of the toss will be either Heads or Tails. Suppose one team chooses Heads and wins the toss; the probability that the winning team chooses to bat first is then 0.5, or ½. Here we know the exact outcome. Likelihood does not give such a straight answer. We know the probability that the winning team chooses to bat is 0.5, but are we certain, or sure, that the winning team will bat first? This is where likelihood comes in: what is the likelihood that the team will bat first? It depends on many factors, such as the condition of the pitch (dry or wet), the weather, which players are in the opposing team, and their strengths and weaknesses, among many other things.

If these conditions are not in favor of the team that won the toss, they will opt not to bat first; in other words, the likelihood of them choosing to bat first will be very low. If the conditions are in their favor, the likelihood of them batting first will be higher. So while probability gives you a straight answer, likelihood depends on various conditions and parameters. A probability value ranges from 0 to 1, but there is no such fixed guarantee about the outcome with likelihood; likelihood simply tells you how likely an event is to occur.

Binomial Distribution

The coin toss above is an example of a binomial setting, which has the following properties:

1) A random variable has only two possible outcomes.

· Success or failure (result of an outcome)

· Head or Tail (tossing of coin)

· Fraudulent claim or Genuine claim (insurance fraud detection)

· Default or No Default (Loan repayment default)

2) The objective is to find the probability of getting x successes out of n trials.

3) The probability of success is p and that of failure is (1-p).

4) The probability p is constant and does not change with trials.

The Probability Mass Function (PMF) of the binomial distribution (the probability that the number of successes will be exactly x out of n trials) is given by:

P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}

The Cumulative Distribution Function (CDF) of the binomial distribution (the probability that the number of successes will be x or fewer out of n trials) is given by:

F(x) = P(X \le x) = \sum_{k=0}^{x} \binom{n}{k} p^{k} (1 - p)^{n - k}
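As a quick numerical check of these two formulas, here is a small Python sketch using scipy.stats; the values n = 10 and p = 0.5 simply anticipate the coin example below and are not from the original derivation.

```python
from scipy.stats import binom

n, p = 10, 0.5   # 10 trials, probability of success 0.5

# PMF: probability of exactly 7 successes in 10 trials
print(binom.pmf(7, n, p))   # ~0.117

# CDF: probability of 7 or fewer successes in 10 trials
print(binom.cdf(7, n, p))   # ~0.945
```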

Now, back to our case. Likelihood is a conditional probability. We know that the outcome of tossing a coin will be either Heads or Tails, each with probability 0.5. Suppose we toss the same coin 10 times and get Heads 7 times and Tails 3 times. From the binomial distribution, we can calculate the likelihood:

The likelihood of getting Heads 7 times, given that the probability of Heads on each toss is 0.5, is

L(\theta \mid D) = P(D \mid \theta) = \binom{10}{7} (0.5)^{7} (0.5)^{3} \approx 0.117

where D is the observed data and θ (theta) is the parameter of the likelihood function.

Likelihood is the probability that an event which has already occurred would give a specific outcome, whereas probability is for an event that will occur in the future. If a fair coin is tossed a certain number of times, what is the probability that every toss lands Heads? Probability is used when describing a function of the outcome given a fixed parameter: here the function is over the outcome (getting Heads on every toss) given the fixed parameter that the coin is fair.

Likelihood is the exact opposite: it describes a function of the parameter given a fixed outcome. If the coin is flipped a certain number of times and lands Heads every time, what is the likelihood that the coin is fair? Here the fixed quantity is the observed outcome of the flips, and the function is over the parameter: whether the coin is fair or not.
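A small Python sketch can make this distinction concrete (the particular numbers are my own illustration): probability fixes the parameter and varies the outcome, while likelihood fixes the observed outcome and varies the parameter.

```python
from scipy.stats import binom

n = 10  # number of coin flips

# Probability: the parameter is fixed (a fair coin, p = 0.5);
# we ask how probable different outcomes x are.
p_fixed = 0.5
for x in (3, 5, 7, 10):
    print(f"P(X = {x} | p = {p_fixed}) = {binom.pmf(x, n, p_fixed):.4f}")

# Likelihood: the outcome is fixed (10 Heads observed in 10 flips);
# we ask how well different parameter values p explain it.
x_fixed = 10
for p in (0.5, 0.7, 0.9, 1.0):
    print(f"L(p = {p} | x = {x_fixed}) = {binom.pmf(x_fixed, n, p):.4f}")
```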


Bayesian or Frequentist

Now, back to the coin problem. Suppose we toss the coin and hold it covered in our hand, or close our eyes, so that we do not see how the coin landed. What is the probability that the coin landed Heads? There are two cases here:

Case I — We can say that the probability of getting Heads is 0.5.

Case II — We can say that, since the event has already taken place even though we have not seen how the coin landed, the probability of getting Heads is either 100 percent or zero (zero in case it is actually Tails); a Schrödinger's coin, we might say.

Case II is the frequentist view. Our opinion does not matter; we follow the truth, and the truth is that the coin has already landed. Whatever we say, the outcome is already written: if it is Heads, the probability is 100 percent, and if it is Tails, the probability is zero.

Remember, up to this point we have not seen the outcome; it is hidden. The Bayesian will say that the probability of getting Heads is 50 percent, while the frequentist will say that the coin has already landed, so it does not matter what we say; the truth is that if it is Heads the probability is 100 percent, otherwise it is zero.

We have seen both the Bayesian and the frequentist views. Now the question is: which one is right, which one is wrong, and which one should we follow? The point here is not which method is right or wrong; both are equally right from their own perspectives.

Bayesians follow their opinion; they do not seek a single fixed truth. For them there will always be a result assigned to an event, so in that sense they can never be wrong.

But the frequentist goes by the quality of the outcome, the true value of the outcome. If the event is repeated many times, the share of repetitions in which the "100 percent" case (Heads) occurs may turn out to be larger than the share of "zero percent" (Tails) repetitions, or vice versa.
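One way to read this long-run view, sketched here under my own interpretation of the paragraph above: in any single repetition the hidden coin is definitely Heads (100 percent) or Tails (zero), and only across many repetitions do the proportions of the two cases become meaningful.

```python
import random

random.seed(0)
trials = 10_000

# In each repetition the coin has already landed: it is either Heads or Tails.
heads_runs = sum(random.random() < 0.5 for _ in range(trials))

# Across many repetitions, the share of "Heads" runs settles near 0.5.
print(heads_runs / trials)
```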

If we want to perform an experiment based on our opinions and then see how the experiment performed, we can go the Bayesian way. If, instead, we stay ignorant of the result and care about the quality of the experiment being right in the long run, we go the frequentist way. If our experiment turns out right, that is fine; if we fail, we can change the scenario and the actions we performed earlier based on the evidence. Here we go for the quality of the experiment.

In short, Bayesian statistics works with conditional probability, as the likelihood does, whereas the probability model of the data belongs to the frequentist approach.

Parameter Estimation

Why do we need to estimate parameters? Estimators help estimate the values of the parameters associated with the independent features in a dataset, both to understand the relationship between the independent features and the response (target variable) and to minimize the cost function so as to increase the accuracy of the model.

There are many types of estimators. Here, I'll discuss Maximum Likelihood Estimation and Bayes Estimation.

Maximum Likelihood Estimation

Maximum Likelihood Estimation (MLE) estimates the parameters of a probability function by maximizing the likelihood function, that is, by finding the parameter values under which the observed data are most probable. For normally (Gaussian) distributed data, we take the mean and variance as the parameters, differentiate the Gaussian log-likelihood, and maximize it to obtain formulas for the mean and variance. Maximum likelihood estimation uses a probability model for the data.

Maximum Likelihood for Binomial Distribution

Let's say we randomly selected a group of people and asked them whether they prefer Marvel comics or DCEU comics. Some preferred Marvel and the rest preferred DCEU.

What is the likelihood that p = 0.5?

The likelihood of p (the probability of picking Marvel), given n = 10 (the number of people asked) and x = 7 (the number who chose Marvel), is

L(p \mid n = 10, x = 7) = \binom{10}{7} p^{7} (1 - p)^{3}

so L(p = 0.5) = \binom{10}{7} (0.5)^{7} (0.5)^{3} \approx 0.117.

We find the maximum likelihood by taking the derivative of the log-likelihood with respect to p and setting it to zero:

\frac{d}{dp} \ln L(p) = \frac{x}{p} - \frac{n - x}{1 - p} = 0

Solving this gives the maximum likelihood estimate for p as the sample average:

\hat{p} = \frac{x}{n} = \frac{7}{10} = 0.7
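As a rough check of this result, here is a small Python sketch that evaluates the likelihood on a grid of p values and compares the grid maximum with the closed-form estimate x/n; the grid resolution is an arbitrary choice of mine.

```python
import numpy as np
from scipy.stats import binom

n, x = 10, 7                       # 10 people asked, 7 chose Marvel

# Evaluate the likelihood L(p) = C(10, 7) * p^7 * (1 - p)^3 on a grid of p values.
p_grid = np.linspace(0.01, 0.99, 99)
likelihood = binom.pmf(x, n, p_grid)

p_hat_grid = p_grid[np.argmax(likelihood)]
print(f"grid maximum:      p = {p_hat_grid:.2f}")   # ~0.70
print(f"closed-form (x/n): p = {x / n:.2f}")        # 0.70
```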

Maximum Likelihood for Normal Distribution

Differentiating the log-likelihood function with respect to μ (the mean):

\frac{\partial}{\partial \mu} \ln L(\mu, \sigma \mid x_1, \ldots, x_n) = \frac{1}{\sigma^{2}} \sum_{i=1}^{n} (x_i - \mu)

Differentiating the log-likelihood function with respect to σ (the standard deviation):

\frac{\partial}{\partial \sigma} \ln L(\mu, \sigma \mid x_1, \ldots, x_n) = -\frac{n}{\sigma} + \frac{1}{\sigma^{3}} \sum_{i=1}^{n} (x_i - \mu)^{2}

Setting the first derivative to zero gives the maximum likelihood estimate for where the center of the Gaussian distribution goes:

\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i

Setting the second derivative to zero gives the maximum likelihood estimate for how wide the Gaussian curve is:

\hat{\sigma}^{2} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})^{2}
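A minimal Python sketch of these two estimates, using a made-up sample (the data values are purely illustrative); note that the MLE for the variance divides by n, which is what ddof=0 does in NumPy.

```python
import numpy as np

# A small made-up sample, assumed to come from a normal distribution.
x = np.array([4.2, 5.1, 3.8, 5.6, 4.9, 5.3, 4.4])
n = len(x)

mu_hat = x.sum() / n                          # MLE for the mean
sigma2_hat = ((x - mu_hat) ** 2).sum() / n    # MLE for the variance (divide by n, not n - 1)

print(mu_hat, np.sqrt(sigma2_hat))
print(np.mean(x), np.std(x, ddof=0))          # same values via NumPy helpers
```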

Bayes Estimation

Bayes estimation, or Maximum A Posteriori (MAP) estimation, minimizes the posterior expected value of a loss function. It works on the posterior distribution, not only the likelihood. We can obtain the posterior as the product of the likelihood and the prior. Bayes' theorem works on the concept of an already given prior: when two events A and B occur, it gives the probability of one event given that the other event has already occurred,

P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

and, in terms of a parameter θ and data D, the posterior is proportional to the likelihood times the prior:

P(\theta \mid D) \propto P(D \mid \theta)\, P(\theta)
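To make this concrete, here is a minimal sketch (my own illustration, not from the original article) of a MAP estimate for the coin example, using a Beta(α, β) prior on p; with a Beta prior the posterior is Beta(α + x, β + n - x), and its mode gives the MAP estimate.

```python
# MAP estimate for the probability of Heads, with a Beta prior on p.
# The prior hyperparameters alpha and beta below are assumptions for illustration.
x, n = 7, 10                 # 7 Heads observed in 10 tosses
alpha, beta = 2.0, 2.0       # Beta(2, 2) prior: a mild belief that the coin is fair

# Posterior is Beta(alpha + x, beta + n - x); its mode is the MAP estimate.
p_map = (alpha + x - 1) / (alpha + beta + n - 2)
p_mle = x / n                # MLE for comparison (ignores the prior)

print(f"MAP estimate: {p_map:.3f}")   # 0.667, pulled toward 0.5 by the prior
print(f"MLE estimate: {p_mle:.3f}")   # 0.700
```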

If something here is written incorrectly, or you want to add something to it, please help me correct it. I am still a learner, and this is just what I have learned so far; there is much more to learn on this topic. If you know more about Bayesian statistics, or a simpler way to explain it, please let me know. Constructive criticism is highly encouraged.
