Learn Matlab Episode #9: Data Analysis Plots

What are the best plots for data analysis in MATLAB?

In this tutorial we're going to talk about different types of plots that are useful for data analysis. If you've ever used Excel or watched a PowerPoint presentation, you've probably seen most of these. The first type is the bar plot. Suppose we have some sequence of data, let's just say x equals 1 to 10, and we want to plot it on a bar chart. We just call the function bar, pass in x, and we see the bar chart. This does pretty much the same thing as the plot function, except that instead of connecting the points with a line, it gives each point its own individual bar. Just for reference, let's do our sine example again. We call bar(x, y), and this is a bar chart of the sine function.

The next type of plot I want to talk about is the histogram. If you've ever studied probability, you know about probability distributions, and given a data set we might want to figure out what kind of distribution it follows. So let's generate some data where we know the distribution: I'm going to create a thousand points that are normally distributed. If I just plot this, it looks like noise, so I can't tell much from that. What you want to do instead is call a function called hist, which automatically creates a histogram out of your data. I'm going to call the simplest version, which is hist of just the data, and you can see that it's pretty much normally distributed. Of the two bars in the center, there's a little more weight on the right one than on the left one, and the resolution of this chart is not that great. What you can do is set the number of bins. If you have a lot of data, in our case a thousand points, you can set the bins to, say, 50, and this gives us a much more granular representation of the data.
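
As a rough sketch of the bar chart and histogram steps above, something like the following could be typed at the MATLAB prompt. The variable names, the 100-point grid for the sine curve, and the separate figure windows are assumptions for illustration and are not taken from the video. Note also that hist is the function the video uses; newer MATLAB releases recommend histogram instead, although hist still works.

```matlab
% Bar chart of a simple sequence
x = 1:10;
figure; bar(x);

% Bar chart of the sine function (100 sample points is an assumed choice)
xs = linspace(0, 2*pi, 100);
ys = sin(xs);
figure; bar(xs, ys);

% 1000 normally distributed points (mean 0, standard deviation 1)
data = randn(1, 1000);
figure; plot(data);        % line plot of the raw samples: looks like noise
figure; hist(data);        % simplest call: default of 10 bins
figure; hist(data, 50);    % 50 bins for a more granular view
```

Because the data is random, the exact bar heights will differ from run to run, which is why small asymmetries around the center are nothing to worry about.
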
Now we can see that it still looks normal. There's a little less data than expected right around zero, and a little more to the right of that, but generally speaking this plot is pretty much what we expect.

The next type of plot, which again you've probably seen in Excel or PowerPoint, is the pie chart. We're going to work with some simple data again, let's say x is 1 to 5. All I do is call a function called pie and pass in the data, and it automatically splits the pie into slices in proportion to each value. The dark blue is the first element, light blue is the second element, green is the third element, and so on. There are ways to label the slices to make your visualization better, so that when you show it to people they know what everything stands for; there's a lot of documentation on the MATLAB website for that.

The last type of plot I want to talk about, probably the most useful for data analysis, is the scatterplot. We're going to return to our sine example again, so x = linspace(0,2*pi,1000). Now I'm going to do something a little different: I'm going to set y = 10*sin(x) + randn(1,1000), which adds random noise with variance 1 centered at 0. I did that wrong at first: randn(0,1000) produces an empty 0-by-1000 array, and what I want is randn(1,1000), which matches the size of x. The reason I scale the sine by 10 is so that the sine wave is big compared to the noise; otherwise the noise would just take over the shape of the sine wave. Now if I just plot x and y, you can see a noisy sine wave, which is OK, and if I do it as a scatterplot I can see the pattern pretty much just as well.

The nice thing about our sine example was that the data was ordered. Now suppose my data is not ordered, so instead of using linspace, my x is just a bunch of random points. x is normally distributed with a standard deviation of 2, and y is going to be, let's say, 5*sin(x) plus some noise.
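
The pie chart and the noisy sine wave described above might look like this in code. This is a minimal sketch: the variable names and separate figure windows are assumptions, and the noise term is written as randn(1,1000) so that it matches the 1-by-1000 vector returned by linspace.

```matlab
% Pie chart: each slice is proportional to its share of sum(p)
p = 1:5;
figure; pie(p);

% Noisy sine wave: amplitude 10, noise with mean 0 and variance 1
x = linspace(0, 2*pi, 1000);
y = 10*sin(x) + randn(1, 1000);   % noise is 1-by-1000 to match x

figure; plot(x, y);      % line plot of the noisy sine wave
figure; scatter(x, y);   % same data drawn as individual points
```
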
If I plot x and y now, I don't really see anything; it just looks like a bunch of craziness. That's because plot connects the points in the order they appear in the data, and these points are in random order. A scatterplot, on the other hand, just plots the individual data points as dots, so I can now very clearly see the sine wave I'm supposed to see. So if you're doing data analysis and your points are not ordered, you want to use a scatterplot.
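
To see the difference on unordered data, a sketch along these lines could be used. The number of points (1000) and the unit-variance noise term are assumptions, since the transcript does not state them for this example. The final sorting step is also not from the video; it is included only to show that ordering the data is what makes a line plot readable.

```matlab
% Unordered x values: normally distributed with standard deviation 2
n = 1000;                     % assumed sample size
x = 2*randn(1, n);
y = 5*sin(x) + randn(1, n);   % assumed noise: mean 0, variance 1

figure; plot(x, y);      % lines join points in data order: a tangle
figure; scatter(x, y);   % individual points: the sine shape is visible

% Not in the video: sorting by x first lets plot draw a sensible line
[xs, idx] = sort(x);
figure; plot(xs, y(idx));
```
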