Friday, February 23, 2018


06/23/2018: Xiaoyi Yin (尹肖贻) has translated this post into Chinese (中文). Thanks Xiaoyi!

Once upon a time, there was a machine learning researcher who tried to teach a child what a "teacup" was.

"Hullo mister. What do you do?" inquires the child.

"Hi there, child! I'm a machine learning scientist. My life ambition is to create 'Artificial General Intelligence', which is a computer that can do everything a human --"

The child completely disregards this remark, as children often do, and asks a question that has been troubling him all day:

"Mister, what's a teacup? My teacher Ms. Johnson used that word today but I don't know it."

The scientist is appalled that a fellow British citizen does not know what a teacup is, so he pulls out his phone and shows the child a few pictures:

"Oh..." says the child. "A teacup is anything that's got flowers on it, right? Like this?"

The child is alarmingly proficient at using a smartphone.

"No, that's not a teacup," says the scientist. "Here are some more teacups, this time without the flowers."

The child's face crinkles up with thought, then un-crinkles almost immediately - he's found a new pattern.

"Ok, a teacup is anything where there is an ear-shaped hole facing to the right - after all, there is something like that in every one of the images!"

He pulls up a new image to display what he thinks a teacup is, giggling because he thinks ears are funny.

"No, that's an ear. A teacup and an ear are mutually exclusive concepts. Let's do some data augmentation. These are all teacups too!"

The scientist rambles on:

"Now I am going to show you some things that are not teacups! This should force your decision boundary to ignore features that teacups and other junk have in common ... does this help?"

"Okay, I think I get it now. A teacup is anything that has a holder thing and is also empty. So these are not teacups:"

"Not quite, the first two are teacups too. And teacups are actually supposed to contain tea."

The child is now confused.

"But what happens if a teacup doesn't have tea but has fizzy drink in it? What if ... what if ... you cut a teacup in halfsies, so it can't hold tea anymore?" His eyes go wide as saucers as he says this, as if cutting teacups is the most scandalous thing he has ever heard of.

"Err... hopefully most of your training data doesn't have teacups like that. Or chowder bowls with one handle, for that matter."

The scientist also mutters something about "stochastic gradient descent being Bayesian" but fortunately the kid doesn't hear him say this.

The child thinks long and hard, iterating over the images again and again.

"I got it! There is no pattern, a teacup is merely any one of the following pictures:"

"Well... if you knew nothing about the world, I could see how you arrived at that conclusion... but what if I said that you had some prior about how object classes ought to vary across form and rendering style and --"

"But Mister, what's a prior?"

"A prior is whatever you know beforehand about the distribution over random teacups ... err... never mind. Can you find an explanation for teacups that doesn't require memorizing 14 images? The smaller the explanation, the better."

"But how should I know how small the explanation of teacup ought to be?" asks the child.

"Oh," says the scientist. He slinks away, defeated.