Interesting blog and theories. Some of them I might never have come across otherwise.

But to get and stay onboard, I will have to pretend that Statistics and Economics are not so prevalent on your blog 😀

And may I add, nice and orderly structured blog!

Rachana.

Thanks for your message, Heart. I am glad you like my blog.

Interesting blog. Funny I missed it so far.

Thank you, vasvas, and you are welcome.

Hi there Epanechnikov,

I’m giving a presentation on kernel smoothing in my survival analysis class on Thursday. One of the things I’m supposed to provide advice on is how to pick an optimal bandwidth for the smoothing function. I learned from the SAS documentation that it uses an approach based on minimizing the mean integrated squared error (MISE); I admit I don’t understand this very well, or why minimizing the MISE results in an optimal bandwidth. Do you have any words of wisdom that I could share with my class?

Thanks so much. 🙂 Keep up the good work.

Cheers,

Mike

Hi Mike.

The MISE measures how close the kernel estimator is to its target parameter globally (over the whole real line): the smaller the MISE, the closer the estimate. As expected, the MISE depends heavily on the bandwidth parameter. Hence the optimal bandwidth is the one that minimises the MISE (asymptotically).
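To make this a bit more concrete, here is a sketch of the usual formulas for the kernel density estimation case (the hazard-smoothing case is analogous but not identical). The notation is my own choice for this comment: f is the unknown target, \hat f_h is the kernel estimate with bandwidth h, K is the kernel, n is the sample size, R(g) = \int g(u)^2 du and \mu_2(K) = \int u^2 K(u) du. The asymptotic expressions assume that f is smooth enough and that K is a symmetric density.

```latex
% MISE and its variance / squared-bias decomposition
\mathrm{MISE}(h)
  = \mathbb{E}\int \bigl(\hat f_h(x) - f(x)\bigr)^2 \, dx
  = \int \operatorname{Var}\hat f_h(x)\, dx
  + \int \bigl(\mathbb{E}\hat f_h(x) - f(x)\bigr)^2 \, dx

% Large-sample approximation (AMISE) and the bandwidth that minimises it
\mathrm{AMISE}(h) = \frac{R(K)}{n h} + \frac{h^4}{4}\,\mu_2(K)^2\, R(f'')
\qquad
h_{\mathrm{AMISE}} = \left[\frac{R(K)}{\mu_2(K)^2\, R(f'')\, n}\right]^{1/5}
```

The first term of the AMISE (variance) shrinks as h grows, while the second (squared bias) grows with h, so the minimiser balances the two. Note that R(f'') involves the unknown f.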

Note that alternative measures of kernel performance also exist, but the MISE is probably the easiest to work with due to its mathematical simplicity. One such alternative is the mean integrated absolute error (MIAE). For more details you can refer to the books of Silverman (1986) and Wand and Jones (1995).

Hi! Thanks for the quick response. What I wonder is, where does the “target parameter” come from? If the target parameter is our understanding of the proper shape of the curve, why would we even need to go through the effort to find a bandwidth that produced a curve closest to the target? It would seem like we already knew the answer that we were trying to solve for.

Hope you don’t mind this one follow-up. I’ve had a quick look at the Wand and Jones book and will try to find a copy of Silverman’s.

Regards,

Mike

Yes, you are right. The MISE (and hence the optimal bandwidth) depends on the unknown target parameter, and therefore it is not directly attainable. But there are ways around it. A popular one is “Least Squares Cross Validation” (LSCV). This approach gives us an unbiased estimator of the part of the MISE that depends on the bandwidth h (the remaining term is a constant that does not involve h). Hence the h which we eventually choose is the one which minimises LSCV(h). But of course there are more ways to tackle this problem. See Chapter 3 of Wand & Jones for more.
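In case it is useful for the class, here is a minimal Python sketch of the LSCV idea for plain kernel density estimation. It is only an illustration under stated assumptions, not what SAS does internally: I assume a Gaussian kernel (which gives a closed form for the integral of the squared estimate), a simple grid search over h, and simulated data. The function names `normal_pdf`, `lscv` and `lscv_bandwidth` are made up for this example.

```python
import numpy as np


def normal_pdf(u, s):
    """Density of N(0, s^2) evaluated at u (the Gaussian kernel scaled by s)."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))


def lscv(h, x):
    """LSCV(h) = integral of fhat^2 minus (2/n) * sum_i fhat_{-i}(x_i).

    Assumes a Gaussian kernel, so the integral has a closed form:
    the convolution of two N(0, h^2) kernels is an N(0, 2 h^2) density.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x[:, None] - x[None, :]  # all pairwise differences x_i - x_j

    # Integral of the squared density estimate.
    int_fhat_sq = normal_pdf(d, h * np.sqrt(2.0)).sum() / n**2

    # Leave-one-out estimate at each x_i: drop the j == i term.
    k = normal_pdf(d, h)
    loo = (k.sum(axis=1) - normal_pdf(0.0, h)) / (n - 1)

    return int_fhat_sq - 2.0 * loo.mean()


def lscv_bandwidth(x, grid):
    """Return the grid value of h with the smallest LSCV score, plus all scores."""
    scores = np.array([lscv(h, x) for h in grid])
    return grid[int(np.argmin(scores))], scores


# Illustrative data only: 200 draws from a standard normal.
rng = np.random.default_rng(0)
sample = rng.normal(size=200)
grid = np.linspace(0.05, 1.5, 60)
h_best, scores = lscv_bandwidth(sample, grid)
print(h_best)
```

Nothing in `lscv` requires knowing the target: it uses only the data and h, which is exactly why it can stand in for the MISE. The grid search is the simplest choice; a numerical optimiser could be used instead, but the LSCV curve can have several local minima, so a grid is a safer starting point.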

I hope that helps. 😉

Regards

Well, I’m still working through understanding MISE… but for now I got an A on my in-class presentation, so life is pretty good! 🙂

Congratulations, Mike! Don’t hesitate to contact me if you need any help.

Are you V. A. Epanechnikov, the discoverer of the Epanechnikov kernel? If so, where and what do you teach and research?

No, I am not, John. I am just a regular user of the kernel 🙂

Nice blog!

Thank you Ein Steppenwolf