What happens if we use absolute value? (3/365)

Yesterday, I looked at how the mean minimizes the variance, and how this follows from defining the variance as the mean of the squared differences (also known as the L_2 norm). Replacing the square with higher powers does not lead to a simple minimization problem. But what if we instead define the variance as the mean of the absolute value of the differences (also known as the L_1 norm)?

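Keep in mind that the derivative of |x| is 1 when x > 0 and -1 when x < 0. It is undefined at x = 0, but for a continuous density f a single point does not change the integrals that follow.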

\begin{aligned}  &\underset{\mu}{\arg \min} \int_{x \in X} | x-\mu | f(x) \, dx \\  0 &= \int_{x - \mu < 0} f(x) \, dx - \int_{x - \mu \ge 0} f(x) \, dx \quad \text{(after taking the derivative with respect to } \mu \text{ and setting it to 0)}  \end{aligned}
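
For anyone who wants the intermediate step spelled out, here is a rough sketch of the derivative. It assumes f is a continuous density on the real line and writes F for its cumulative distribution function (F is my notation here, not something defined earlier). The boundary terms from differentiating the integration limits vanish because the integrand is 0 at x = \mu.

\begin{aligned}  \int | x-\mu | f(x) \, dx &= \int_{-\infty}^{\mu} (\mu - x) f(x) \, dx + \int_{\mu}^{\infty} (x - \mu) f(x) \, dx \\  \frac{d}{d\mu} \int | x-\mu | f(x) \, dx &= \int_{-\infty}^{\mu} f(x) \, dx - \int_{\mu}^{\infty} f(x) \, dx = F(\mu) - (1 - F(\mu)) = 2F(\mu) - 1 \\  0 &= 2F(\mu) - 1 \implies F(\mu) = \tfrac{1}{2}  \end{aligned}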

The \mu that satisfies the above expression is exactly the median: the two integrals are equal and together sum to 1, so the total of f(x) over x - \mu \ge 0 must be 0.5. As an example, compute the L_1 variance for the numbers 1, 2, 12. The mean is 15/3 = 5, which gives absolute deviations of 4 + 3 + 7 = 14 (an average of 14/3). The median is 2, which gives 1 + 0 + 10 = 11 (an average of 11/3), which is smaller.
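
The 1, 2, 12 example can also be checked numerically. The following is only a small sketch, not a proof: the helper name `mean_abs_deviation` and the grid of candidate centers (0 to 13 in steps of 0.01) are my own choices for illustration.

```python
from statistics import mean, median


def mean_abs_deviation(data, center):
    """Mean of |x - center| over the data (the L_1 'variance' about center)."""
    return sum(abs(x - center) for x in data) / len(data)


data = [1, 2, 12]

# L_1 variance measured about the mean (5) and about the median (2).
mad_at_mean = mean_abs_deviation(data, mean(data))
mad_at_median = mean_abs_deviation(data, median(data))

# Brute-force check: scan candidate centers 0.00, 0.01, ..., 13.00
# and keep the one with the smallest mean absolute deviation.
candidates = [i / 100 for i in range(0, 1301)]
best = min(candidates, key=lambda c: mean_abs_deviation(data, c))

print("about the mean:  ", mad_at_mean)
print("about the median:", mad_at_median)
print("best center on the grid:", best)
```

On this data the grid search should land on the median, with the value about the median smaller than the value about the mean. With an even number of points, any center between the two middle values would tie, so the minimizer is not unique in general.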
