How do you open and process sound files in MATLAB?
In this tutorial we are going to talk about sound processing in MATLAB. No toolboxes are needed for this tutorial. The first thing we’re going to talk about is what sound is. Sound, as you may have heard, is a wave, and it’s a wave of air particles. When we think of a wave, we usually think of something like this, a sine wave; we call that a transverse wave. Sound, though, is a variation in pressure that bounces off your eardrums, so a wave like this doesn’t really make a lot of sense. There’s another kind of wave called a longitudinal wave, and this is the kind of wave that includes sound. If we look at this visualization of a spring, it describes what is happening to air as sound travels through it.
Now, interestingly enough, when you record sound on a microphone and convert it into an electrical signal, that signal can be viewed as a wave like this, so we again get a transverse wave. When we look at sound in MATLAB we’re going to be looking at something shaped like a transverse wave, even though in reality sound is a longitudinal wave.
One thing you have to remember is that an electrical signal is an analog wave, whereas what we store in matrices on computers are digital signals. For example, you could have a sound recording that is two seconds long, and that translates into a matrix of length 1000, which would mean you have 500 samples per second. This process of sampling is what we call discretization: sampling is the discretization of data points along the x-axis, or in this case time. The other kind of discretization we have is along the y-axis. This is because numbers in MATLAB, or in any programming language, can only take on certain values, so you can’t have an infinitely precise number.
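The two kinds of discretization can be sketched in a few lines of MATLAB. This is only an illustration: the 500 samples per second and the two-second length come from the example above, while the 5 Hz test tone and the coarse 4-bit quantization are assumptions chosen so the effect is easy to see, not values from the video.

```matlab
% Sampling: 2 seconds at 500 samples per second gives 1000 samples.
fs = 500;                        % sampling rate in samples per second
duration = 2;                    % length of the recording in seconds
t = (0:fs*duration - 1)' / fs;   % sample times as a 1000-by-1 column vector
f0 = 5;                          % tone frequency in Hz (illustrative assumption)
x = sin(2*pi*f0*t);              % the wave evaluated only at the sample times

% Quantization: snap each amplitude to the nearest allowed level.
nbits = 4;                       % deliberately coarse (assumption) so steps are visible
step = 2 / 2^nbits;              % spacing between adjacent levels over [-1, 1]
xq = step * round(x / step);     % quantized copy of the signal

numSamples = numel(x);           % number of stored data points
maxError = max(abs(x - xq));     % rounding error, at most half a step
```

Plotting `x` and `xq` against `t` (for instance with `plot` and `stairs`) shows the staircase that quantization introduces. With many more bits per sample the steps become too small to see or hear.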
Numbers are stored in a fixed number of bits, say 32 or 64, so we not only have to discretize the signal in time along the x-axis, but we also have to quantize the signal on the y-axis. Both of these lead to some error, but in general, say you’re playing a WAV file or an MP3, you don’t really notice these differences; they are imperceptible to the human ear.
So now that we’ve covered a little bit of the theory behind what sound is and how it’s translated into a digital signal that we can read in MATLAB, we’re going to go ahead and open a sound file in MATLAB. But before I do that, I’m going to show you how I created the sound file. There’s a program called Audacity that runs on all the major platforms: Linux, Windows, and Mac. It’s free and open-source, and you can find the download just by searching on Google. If you have a microphone connected to your computer, you can record things with it. So I’ve recorded myself saying “hello world, hello world,” and I’ve saved it to a WAV file.
I have this WAV file in my workspace, and there is an older MATLAB function called wavread that you can use to extract data from a WAV file. Now notice I get a warning when I use wavread. When you search on Google for how to open a sound file, wavread is the thing that’s going to come up, so you’re going to get this warning and wonder what’s going on. wavread is being deprecated, and there is a function called audioread that is meant to replace it, but audioread is currently not showing up in Google results.
So let’s look at wavread first. It only reads WAV files, and in case you don’t know, a WAV file is essentially just raw digital sound data. We sample the sound, we quantize it, and then we save those data points into the file, essentially as an array. There are different call signatures that we can use with wavread: we can get just the data, which is what I’ve done, or we can get the data and the sampling rate.
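The two call signatures look roughly like this. The sketch is hedged in a few ways: it uses audioread, the replacement named above, because wavread has since been removed from newer MATLAB releases; and so that it runs on its own, it first writes a short stereo test tone to a temporary file with audiowrite. The file name `hello_world_demo.wav` and the tone are made up for illustration and are not the recording from the video.

```matlab
% Create a small stereo WAV file so the example is self-contained.
fsOut = 8000;                                   % sampling rate for the demo file (assumption)
t = (0:fsOut - 1)' / fsOut;                     % one second of sample times
tone = 0.5 * [sin(2*pi*440*t), sin(2*pi*660*t)];% two columns: left and right channel
filename = fullfile(tempdir, 'hello_world_demo.wav');  % hypothetical file name
audiowrite(filename, tone, fsOut);

% Signature 1: data only.
% (The older equivalent in the video is: data = wavread(filename);)
data = audioread(filename);

% Signature 2: data plus the sampling rate.
% (The older equivalent in the video is: [data, Fs] = wavread(filename);)
[data, Fs] = audioread(filename);
```

The second form is the one to reach for whenever the timing matters, for example when playing the sound back or labeling a time axis.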
We commonly use Fs to refer to the sampling rate: the F stands for frequency and the s stands for sampling, so it’s the sampling frequency. Now there are some interesting things we can do once we have the data. I can plot the data (not sure where the plot went; there we go). You can see that it looks pretty much the same as what I had in Audacity.
Also check out the size of the matrix. There are 84,480 samples for just that second or so of data, and notice that the other dimension here is two. If you think about why that is, let’s go back to Audacity: there are two channels. I was recording in stereo, so there’s the left side and the right side, and that’s why the matrix has two columns.
Another interesting thing you can do is play sound. There’s a function in MATLAB called sound that you can use to play this file back, but for that we need to use a different call signature for wavread: we need to get the sampling rate. So I’m going to call wavread on the hello world file and return both the data and the sampling rate. Then I call sound, pass in the data, and pass in the sampling rate: “hello world.” It plays back the sound.
Now you might be wondering what will happen if I try to read an MP3 file. I’m going to save the exact same recording as an MP3. Keep in mind that wavread, of course, only works with WAV files. If you want to read MP3s or any other kind of compressed sound file, you should use audioread, the new function for reading audio in MATLAB, which can also read WAV files, by the way. Okay, so now let’s play this one: “hello world.” It sounds exactly the same.
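The plot, size check, and playback steps can be sketched as follows. This is a minimal, hedged example rather than the exact commands from the video: it uses audioread throughout, and it generates its own two-second stereo tone in a temporary file (the name `stereo_demo.wav` is hypothetical) so that it runs without the original recording. The call to `sound` assumes the machine has an audio output device.

```matlab
% Write a short stereo test file so the example is self-contained.
fsOut = 8000;                                    % demo sampling rate (assumption)
t = (0:2*fsOut - 1)' / fsOut;                    % two seconds of sample times
tone = 0.5 * [sin(2*pi*440*t), sin(2*pi*440*t)]; % same tone on left and right
filename = fullfile(tempdir, 'stereo_demo.wav'); % hypothetical file name
audiowrite(filename, tone, fsOut);

% Read the samples and the sampling rate back in.
[data, Fs] = audioread(filename);

% Rows are samples, columns are channels (2 columns for stereo).
[numSamples, numChannels] = size(data);
durationSeconds = numSamples / Fs;               % length of the recording in seconds

% Plot both channels against time instead of sample index.
timeAxis = (0:numSamples - 1)' / Fs;
plot(timeAxis, data);
xlabel('Time (s)');
ylabel('Amplitude');
legend('Left', 'Right');

% Play the recording; the sampling rate sets the playback speed.
sound(data, Fs);
```

Dividing the number of samples by the sampling rate gives the length of a recording. As a rough check on the figures above, if the recording was made at the common rate of 44,100 samples per second (an assumption; the rate is not stated), 84,480 samples would be about 1.9 seconds. For an MP3, the same `[data, Fs] = audioread(...)` call applies with the MP3 file name in place of the WAV one.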