Independent and Dependent Variables in Machine Learning

In this guide we're going to briefly discuss the difference between independent and dependent variables in machine learning. This is a topic that has a lot of crossover into other areas, so there's a fairly solid chance you've had some experience with this in the past.

Using a very simple example, let's apply the definition that independent variables, or feature variables, are the inputs to the process that's being analyzed. In contrast, dependent variables are the output of the process. So, if we have a small data frame that tracks distance and time, we consider time to be the independent variable and distance the dependent variable. That's because distance traveled is dependent on how much time has passed.
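To make that concrete, here's a minimal sketch of the time and distance example, assuming we're working in pandas. The column names and the numbers are made up for illustration, and the names X and y are just the usual convention for features and target.

```python
import pandas as pd

# Illustrative placeholder data: hours elapsed and miles traveled
df = pd.DataFrame({
    "time": [1, 2, 3, 4],
    "distance": [60, 120, 180, 240],
})

# Independent variable (feature): kept as a one-column DataFrame
X = df[["time"]]

# Dependent variable (target): a single Series
y = df["distance"]
```

Selecting with double brackets keeps X two-dimensional, which is the shape most machine learning libraries expect for features, even when there's only one of them.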

In machine learning the same basic idea applies, but we're usually working with more than the two variables we just had with time and distance. As an example, let's use the dynamic learning data frame that we looked at before. Here our independent variables are school year, semester, professor, course, course title, and dynamic learning. The dependent variable is average grade. In this example we have multiple independent variables, but still just one dependent variable. There are also cases where you can have multiple dependent variables. And if you've taken a vector calculus class, this is where partial derivatives come into play, because now we're working in multiple dimensions and need to look at how the output changes with respect to each input separately.
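The same kind of split for a data frame with several independent variables might look something like the sketch below. The column names are guesses based on the description above, not the actual names in the dynamic learning data frame, and the rows are placeholder values rather than real data.

```python
import pandas as pd

# Placeholder rows; column names are assumed, not taken from the real data frame
df = pd.DataFrame({
    "school_year": [2016, 2016, 2017],
    "semester": ["Fall", "Spring", "Fall"],
    "professor": ["Prof A", "Prof B", "Prof A"],
    "course": ["MATH101", "MATH101", "PHYS201"],
    "course_title": ["Algebra", "Algebra", "Mechanics"],
    "dynamic_learning": [0, 1, 1],
    "average_grade": [78.5, 84.0, 81.2],
})

target = "average_grade"

# Independent variables: every column except the target
X = df.drop(columns=[target])

# Dependent variable: the single target column
y = df[target]
```

Dropping the target column, instead of listing every feature by hand, means the split keeps working if more feature columns are added later.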

I was planning on making this just a quick overview, so that's going to do it for now. But in the next guide we're actually going to take this data frame and break it down into independent and dependent variables and convert them into usable components.