Quantities of Information: KL Divergence


4. Kullback–Leibler divergence (information gain): The Kullback–Leibler divergence (or information divergence, information gain, or relative entropy) is a way of comparing two distributions: a "true" probability distribution p(X) and an arbitrary probability distribution q(X). If we compress data in a manner that assumes q(X) is the distribution underlying the data when, in reality, p(X) is the correct distribution, the Kullback–Leibler divergence is the average number of additional bits per datum needed for compression. It is thus defined as

D_{KL}(p(X) \| q(X)) = \sum_{x \in X} p(x) \log_2 \frac{p(x)}{q(x)}
Although it is sometimes used as a 'distance metric', KL divergence is not a true metric since it is not symmetric and does not satisfy the triangle inequality (making it a semi-quasimetric).
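As a quick illustration of the definition and of the asymmetry just mentioned, here is a minimal Python sketch. The distributions p and q below are made-up toy values chosen for the example, not anything prescribed by the definition itself.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) in bits for discrete distributions given as dicts
    mapping outcomes to probabilities. Assumes q(x) > 0 wherever p(x) > 0."""
    return sum(px * math.log2(px / q[x]) for x, px in p.items() if px > 0)

# Toy example distributions over the outcomes {'a', 'b', 'c'} (assumed values)
p = {'a': 0.5, 'b': 0.25, 'c': 0.25}   # the "true" distribution
q = {'a': 0.8, 'b': 0.1,  'c': 0.1}    # the assumed (wrong) distribution

print(kl_divergence(p, q))  # extra bits per datum if we code assuming q
print(kl_divergence(q, p))  # a different number: KL divergence is not symmetric
```

Printing both directions makes the lack of symmetry concrete: D_KL(p ‖ q) and D_KL(q ‖ p) generally differ.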


5. Kullback–Leibler divergence of a prior from the truth: Another interpretation of KL divergence is this: suppose a number X is about to be drawn randomly from a discrete set with probability distribution p(x). If Alice knows the true distribution p(x), while Bob believes (has a prior) that the distribution is q(x), then Bob will be more surprised than Alice, on average, upon seeing the value of X. The KL divergence is the (objective) expected value of Bob's (subjective) surprisal minus Alice's surprisal, measured in bits if the log is in base 2. In this way, the extent to which Bob's prior is "wrong" can be quantified in terms of how "unnecessarily surprised" it is expected to make him.
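To connect this interpretation back to the formula, here is a small simulation sketch in Python. The distributions for Alice and Bob and the sample size are assumptions made up for illustration: it draws values of X from p, averages Bob's surprisal minus Alice's surprisal over those draws, and compares the result to the analytic D_KL(p ‖ q).

```python
import math
import random

# Assumed toy distributions for illustration (not from the post):
p = {'a': 0.5, 'b': 0.25, 'c': 0.25}   # the true distribution Alice knows
q = {'a': 0.8, 'b': 0.1,  'c': 0.1}    # Bob's prior

def surprisal(d, x):
    """Surprisal of outcome x under distribution d, in bits: -log2 d(x)."""
    return -math.log2(d[x])

outcomes, weights = zip(*p.items())
random.seed(0)
samples = random.choices(outcomes, weights=weights, k=200_000)  # draws from p

# Average of (Bob's surprisal - Alice's surprisal) over the observed draws
avg_extra_surprise = sum(surprisal(q, x) - surprisal(p, x) for x in samples) / len(samples)

# Analytic KL divergence D_KL(p || q) in bits
kl = sum(px * math.log2(px / q[x]) for px, x in zip(p.values(), p.keys()))

print(avg_extra_surprise, kl)  # the two numbers should be close
```

With enough samples the empirical average of Bob's excess surprisal converges to D_KL(p ‖ q), which is exactly the "expected extra surprise" reading described above.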
