In this guide, we're going to be talking about a common way of categorizing classification algorithms, and that's by dividing them between generative and discriminative models.

When push comes to shove, it doesn't really matter what classifier we're talking about. Whether it's multi-class, binary, generative, or discriminative, they all work towards the same goal of grouping observations by establishing a decision boundary. What really differentiates classification algorithms are the steps they take to get to the result.
Before we get into some of the finer points, an example of a generative model that we've already covered is Naive Bayes. And like Naive Bayes, generative models are generally fairly simple to implement and usually quick to run. Because of their efficiency, they can also scale really well when you're working with a large data set. They also don't need much training data, which gives them another advantage.
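To make the "simple to implement and quick to run" point concrete, here's a rough sketch of a count-based Naive Bayes classifier. Training is just a single pass of counting, which is why it's so fast and scales well. The `fur`/`meows` features and the toy examples are made up for illustration; they aren't from the guide.

```python
from collections import Counter, defaultdict

def fit_naive_bayes(examples):
    """examples: list of (feature_dict, label). Fitting is one counting pass."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    for features, label in examples:
        label_counts[label] += 1
        for name, value in features.items():
            feature_counts[(label, name)][value] += 1
    return label_counts, feature_counts

def predict(label_counts, feature_counts, features):
    total = sum(label_counts.values())
    best_label, best_score = None, 0.0
    for label, count in label_counts.items():
        score = count / total  # the class prior P(y)
        for name, value in features.items():
            seen = feature_counts[(label, name)]
            # Laplace smoothing (assuming ~2 values per feature in this sketch)
            # so an unseen value doesn't zero out the whole product
            score *= (seen[value] + 1) / (count + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

training = [
    ({"fur": "short", "meows": "yes"}, "cat"),
    ({"fur": "long", "meows": "yes"}, "cat"),
    ({"fur": "short", "meows": "no"}, "dog"),
    ({"fur": "long", "meows": "no"}, "dog"),
]
model = fit_naive_bayes(training)
print(predict(*model, {"fur": "long", "meows": "no"}))  # → dog
```

Notice there's no iterative optimization loop anywhere; that's the efficiency advantage mentioned above.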
Now, to get a bit more technical, generative models work by trying to actually model how the data was generated, and use that to categorize a new observation. On the other hand, discriminative models don't really care how the data was generated; they just want to categorize a new observation.
To help explain this a little bit better, here's a pretty good analogy. I have two nephews, Carter and Jackson. One afternoon, my sister called to ask if she could drop them off at my place for a couple of hours. While they were over, they spent most of their time playing with Myles, my cat, and Sam, my dog.

Later in the evening, when they were back home, Carter and Jackson were reading a book with my sister when they came across a picture of a dog. So my sister asked both Carter and Jackson whether it was a picture of Sammy or Myles.

Carter, the generative classifier, loves to draw. So he grabbed his box of crayons and, based on what he remembered, drew a picture of both Sam and Myles. He compared his drawings to the picture in the book and decided the picture was probably a dog, just like Sammy. Jackson, on the other hand, is only a year old and can't draw yet. But he was still able to figure out that the picture in the book was a dog, based strictly on his observations.

So while both of my nephews were able to successfully determine the type of animal, the way they came up with their answers turned out to be completely different.
Now let's say we're given this training set. What a discriminative model is going to do is try to separate the two classes by using a straight line. The first iteration might use a boundary that looks something like this, but as the parameters are optimized and more iterations are run, it will begin to look more and more like this.
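That iterative process can be sketched with a perceptron, one of the simplest discriminative learners: it starts from an arbitrary line and nudges the parameters each pass until the classes are separated. This is a minimal illustration, not the specific algorithm from the guide, and the 2-D points are made up.

```python
def train_perceptron(points, labels, epochs=20, lr=0.1):
    """points: list of (x1, x2); labels: +1 or -1. Returns (w1, w2, b)."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            # Misclassified point? Shift the boundary toward it.
            if y * (w1 * x1 + w2 * x2 + b) <= 0:
                w1 += lr * y * x1
                w2 += lr * y * x2
                b += lr * y
    return w1, w2, b

def classify(params, point):
    w1, w2, b = params
    return 1 if w1 * point[0] + w2 * point[1] + b > 0 else -1

# Say -1 is "cat" and +1 is "dog" in the guide's running example.
points = [(1.0, 1.0), (1.5, 0.5), (4.0, 4.0), (4.5, 3.5)]
labels = [-1, -1, 1, 1]
params = train_perceptron(points, labels)
```

Each early boundary misclassifies some points; after enough iterations the line settles where it separates the two groups, which mirrors the "first iteration versus final boundary" picture described above.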
In contrast, rather than looking at both classes and trying to figure out how to separate them, a generative model will look at one class, like the cat training set, and try to build a model encapsulating all of its features. Then, once it has the first model figured out, it moves over to the second class and tries to build a model of what a dog might look like. So if a new observation comes in and, based on its features, falls within this boundary, it will be classified as a dog.
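A rough sketch of that one-class-at-a-time approach: fit a simple Gaussian model to each class separately, then classify a new observation by asking which class's model makes it more likely. The `weight`/`ear length` features and the numbers are invented for illustration.

```python
import math

def fit_class_model(samples):
    """Fit a per-feature mean and variance to one class's samples."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [
        sum((s[d] - means[d]) ** 2 for s in samples) / n + 1e-6  # avoid zero variance
        for d in range(dims)
    ]
    return means, variances

def log_likelihood(model, x):
    """Log probability of observation x under a diagonal Gaussian model."""
    means, variances = model
    total = 0.0
    for xi, mu, var in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
    return total

# Features: (weight in kg, ear length in cm) -- made-up toy data
cats = [(4.0, 6.0), (3.5, 6.5), (4.5, 5.5)]
dogs = [(20.0, 10.0), (25.0, 12.0), (22.0, 11.0)]
cat_model = fit_class_model(cats)  # built from cats alone
dog_model = fit_class_model(dogs)  # built from dogs alone

def most_likely_class(x):
    return "cat" if log_likelihood(cat_model, x) > log_likelihood(dog_model, x) else "dog"
```

Note that each class model is built without ever looking at the other class; the "boundary" only emerges when the two models are compared at prediction time.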
Internally, what a discriminative model is attempting to do is learn the probability of Y given X directly, where Y is the class label and X represents all of the features. On the other hand, a generative algorithm tries to learn the probability of X given Y, which makes sense when you think about the analogy we just used. Before any observations came in, Carter already had a cat model and a dog model built in his own head. So when a new observation came in, he already knew what each class label should look like, and all he had to do was compare the features.
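The two learning targets can be written side by side. This is just standard Bayes' rule, not anything specific to this guide: it's how a generative model's P(X|Y), combined with a class prior P(Y), recovers the P(Y|X) that classification ultimately needs.

```latex
% Discriminative models learn the posterior directly:
%   P(Y \mid X)
% Generative models learn P(X \mid Y) and the prior P(Y),
% then classify via Bayes' rule:
P(Y \mid X) = \frac{P(X \mid Y)\, P(Y)}{P(X)} \propto P(X \mid Y)\, P(Y)
```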
Now, I know this wasn't the most exciting guide, but it's an aspect of machine learning that you definitely need to be aware of. And as we introduce new algorithms throughout the course, we'll break down how each of them works to determine whether they're generative or discriminative. But for now, I'll wrap things up, and I'll see you in the next guide.