The Discrete Fourier Transform: Most Important Algorithm Ever?




Contents (6 segments)

Segment 1 (00:00 - 05:00)

The problem we're going to dive into today is best understood through the lens of music. Suppose we had three notes played together as a chord. One natural way to represent this chord is a plot of the amplitude of the sound signal over time, but in a practical sense, a representation that's much more useful when working with audio processing is just the underlying notes that make up the chord. Each note is a distinct frequency, and having granularity on what frequencies are present in the original audio lets you unlock a lot of powerful tools. One especially useful tool in the world of sound engineering is an equalizer, which allows you to bring down specific frequencies, giving a new signal with a different sound profile. Conversely, we can also enhance frequencies that we'd like to bring out. An equalizer is just one of many tools that rely on a frequency representation of a signal.

What you and I are going to talk about today is a small but essential piece of how this fundamental idea of representing a signal with its frequencies works. The core concept involves arguably one of the most important algorithms in modern technology: the discrete Fourier transform, or DFT. Most people who learn about the DFT for the first time find it initially difficult to grasp; it's a complex topic that's hard to teach. The goal of today's video is to see if we can reinvent the clever ideas behind the discrete Fourier transform from first principles. Along the way, we'll naturally interact with some of the fundamental concepts in digital signal processing.

When discussing signals, most mathematical representations model them in a continuous-time setting: a signal has a value for any real-number time input. But in a practical sense, no computing machine can ever represent signals in this manner, so naturally what we have to do is sample points from the signal. The most common scheme is to uniformly sample the signal, giving a sequence of points that represents the original continuous signal.

There's some nuance involved in sampling continuous signals. Here's a fun thought experiment: say I give you a cosine wave with a frequency of 7 cycles per second, or 7 Hz. How many points do you think we need to accurately capture the signal? It's not an easy question by any means, so let's start by ruling out possibilities. Can we get away with representing the signal with seven evenly sampled points? One way to quickly see the issue is that if our sampling scheme happens to unfortunately sample every peak of this wave, the resulting samples look just like a constant signal. This, by the way, is commonly called aliasing: if we sample too few points, there are many possible signals that could produce the same samples we obtained. One way to think about this is that we probably need to sample both the peaks and the troughs, so can we get away with sampling 14 points instead of 7? Funnily enough, there's a way to break even 14 points: if you happen to sample evenly in just the wrong places, once again you get a constant signal. It turns out the correct answer is 15 points. If you sample 15 points in any uniform manner, you're guaranteed to accurately represent the 7 Hz signal.

This is an example of a quite famous theorem in signal processing called the Shannon-Nyquist sampling theorem. There's a lot more material here that I'm purposely glossing over, but the important takeaway is that for any signal of frequency f, we strictly need more than two times that frequency in sampled points to accurately represent the signal. An equivalent statement is that given n samples of a signal, the highest frequency that can be accurately captured is strictly less than n/2; any higher frequencies are not guaranteed to be accurately represented. This will become important later, so keep this fact in the back of your mind. One cool example of this in practice: most audio sources have a sample rate of at least 44.1 kHz, and the reason why is that the human ear generally can't catch frequencies above 20 kHz.

These samples taken from the original signal give us a clear understanding of how the signal changes over time. Now, what we showed with equalizers is another representation of the signal, in terms of its frequencies. Fundamentally, these two representations give us the exact same signal; they're just two different ways of communicating that information. What this frequency representation tells us is that this particular signal is composed of three different waves.
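The aliasing thought experiment above is easy to reproduce numerically. Here's a minimal sketch (the `sample` helper, the one-second window, and the sampling offsets are assumptions for illustration): 7 samples of a 7 Hz cosine can land exactly on the peaks, 14 evenly spaced samples can land on the zero crossings, and both collapse into a constant signal, while 15 samples always reveal the oscillation.

```python
import numpy as np

F = 7  # signal frequency in Hz

def sample(n, offset=0.0):
    """n uniform samples of cos(2*pi*F*t) over one second, shifted by `offset` seconds."""
    t = np.arange(n) / n + offset
    return np.cos(2 * np.pi * F * t)

# 7 samples can land exactly on the peaks: indistinguishable from a constant.
print(sample(7))                 # every value is 1.0
# 14 samples can land on the zero crossings: again a constant signal.
print(sample(14, offset=1/28))   # every value is ~0.0
# 15 samples (> 2*F) always distinguish the wave from a constant.
print(sample(15))
```

The 1/28-second offset is what "sampling evenly in just the wrong places" means here: it shifts every one of the 14 samples onto a zero crossing of the 7 Hz wave.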

Segment 2 (05:00 - 10:00)

Those waves have frequencies of 2, 5, and 10 Hz. The idea of this frequency representation is that if we take these three waves, scale them by their relative strengths, and add them together, we get back the original signal. This is the core idea of what's called a frequency-domain representation, and it's extraordinarily important to understand that all the DFT involves is a conversion from a time-domain representation to a frequency-domain representation.

Whenever we want to understand how we might rediscover a complex idea, it really helps to know how we would like it to behave for some simple examples. Suppose we have a cosine wave of frequency 2 Hz. Ideally, the frequency-domain representation of such a signal would be a peak at frequency 2 and zeros for all other frequencies. So in the context of this simplified example, we're looking for some operation that produces a high value for an input frequency of 2 and zeros for all other frequencies. What that means is that the operation we're looking for is fundamentally a measure of similarity: when the frequencies match, we should get a high similarity score, but when they differ, we ideally want the similarity to be 0.

Given a sequence of input points from a time signal, we can represent the sequence as a vector, and one of the most common ways to measure similarity between vectors is the dot product. So let's start our analysis by seeing how we can construct a similarity measure between our original time-domain signal and cosine waves of different frequencies. In order to construct a dot product, we need a set of samples from the cosine waves we want to compare with. As a first attempt, it seems reasonable to sample the comparison cosine wave at the same sampling rate as the original signal, and this does work well when the cosine waves match one to one, as they do in this simplified example. But what about other frequencies? There are theoretically infinitely many cosine waves we could compare to in this time segment, and we can't evaluate all of them. So how do we go about picking a set of frequencies to compare with?

To be clear, there are inherently two frequencies involved: one is the frequency of the original cosine wave, and the other is what we'll formally call an analysis frequency. For any analysis frequency f-hat, we'll currently define it as a cosine wave with amplitude 1 and frequency f-hat. When picking frequencies to compare against, remember that a key property we're looking for in our frequency-domain representation is that all frequencies other than the intrinsic frequency should have zero contribution. So let's see what happens when we take the dot product of this cosine wave with cosine waves of other frequencies. Here we'll slowly adjust the analysis frequency and observe what happens to our similarity measure. One thing to notice is that if you take a random cosine wave of some real-number frequency, the contribution is almost always nonzero, but there is a specific set of frequencies where things perfectly cancel out and we end up with a contribution of zero, as we desire. What's interesting from this particular example is that when the analysis frequency is defined such that there is an integer number of cycles within the time range provided, the similarity measure seems to always end up being zero. This looks promising, so let's dive a little deeper into this behavior.

So far we've been quite hand-wavy in our treatment of what we're looking for in the DFT, so let's formalize things a little more carefully. Right now we have a time-domain signal, which, as mentioned, can be represented as a vector of sampled points. We want to define some transformation that takes this time-domain signal and gives us a frequency-based representation, where a certain set of frequencies is mapped to a value representing the contribution of that frequency to the original signal. We already asserted how we want this transform to behave for a simple cosine wave with frequency f: ideally, a high value at that frequency f and zeros for all other frequencies. We saw some promising signs of this behavior as we compared our cosine wave against different analysis frequencies. Currently we have a similarity measure defined as the dot product between our original signal and sample points from an analysis frequency, and what we're looking for is to perform this operation across a set of analysis frequencies. What sort of transformation expresses this really well? What we're essentially trying to define is a matrix-vector product.
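The dot-product similarity measure described above can be prototyped in a few lines. A minimal sketch (the 8-sample window and the 2 Hz test signal are assumptions for illustration): integer analysis frequencies either match the signal or cancel exactly, while non-integer frequencies generally leave a nonzero residue.

```python
import numpy as np

N = 8                                # number of samples over a one-second window
t = np.arange(N) / N                 # uniform sample times
signal = np.cos(2 * np.pi * 2 * t)   # the original 2 Hz cosine

def similarity(f_hat):
    """Dot product between the signal and an amplitude-1 analysis cosine."""
    return float(np.dot(signal, np.cos(2 * np.pi * f_hat * t)))

print(similarity(2))    # matching frequency: a large positive value (4.0)
print(similarity(3))    # different integer frequency: cancels to ~0
print(similarity(2.5))  # non-integer analysis frequency: generally nonzero
```

This is exactly the behavior observed in the video: an integer number of cycles over the window gives perfect cancellation, anything else does not.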

Segment 3 (10:00 - 15:00)

By representing samples from cosine waves at different analysis frequencies as the rows of this matrix, we get a compact representation of our dot-product similarity measure applied across all analysis frequencies. What we're ideally looking for is for three fundamental requirements to be satisfied in this specific example of a single-frequency cosine wave. First, when the analysis frequency is exactly the same as our original signal's frequency, the respective frequency-domain value should be greater than zero. Second, when we compare against analysis frequencies with any value other than the original signal's frequency, we should get a dot product of 0, signifying that those frequencies contribute nothing to the signal. And lastly, we would like this transformation to be reversible: given a frequency representation of the signal, we'd like to exactly reconstruct the original time-domain signal.

It turns out these requirements translate to fundamental mathematical properties of matrices. Starting from the bottom: to make this operation reversible, this mysterious matrix must be invertible, and to fit the first two constraints, what we're looking for is an orthogonal matrix. So immediately off the bat, we know that our eventual definition of the Fourier transform needs to have the same number of analysis frequencies as the number of samples in the signal; otherwise, there's no way the matrix can be invertible.

Let's see if our earlier definition of analysis frequencies as cosine waves fits the criteria we're looking for. As we saw earlier, the nice property of these sample points from analysis-frequency cosine waves is that when the cosine waves have the same integer frequency, the dot product is nonzero, and when they have differing integer frequencies, the dot product is exactly zero. Or at least, that's what I implied; some of you may have noticed that this is actually not accurate. It turns out there are particular pairs of analysis frequencies that have a nonzero dot product even when the underlying frequencies are different. In fact, if we compute every combination of dot products between these analysis frequencies, we see an interesting pattern: for every nonzero analysis frequency, there exists exactly one other frequency that results in a positive dot product. What's going on here? If we pull it apart, we see the problem: even though these particular frequencies have different underlying cosine waves, the samples themselves are identical. This is essentially a byproduct of the Shannon-Nyquist sampling theorem. Since we're only sampling eight points, any frequency of 4 or above cannot be accurately represented with eight samples, so we end up with overlap.

One thing you might notice is that a subset of this matrix is nicely defined, and one way we can sort of get past this sampling problem is to only consider the first half of any frequency spectrum. When we use this transform on any nonzero-frequency cosine wave, there will always be some redundant information as a result of our current definition of the transform. The bigger problem is that because multiple rows of our analysis-frequency matrix are identical, the matrix is not only non-orthogonal, it's also non-invertible. This is a serious problem; invertibility is an essential requirement so that we can actually reverse the transform. So, minor spoiler here: this is not the actual Fourier transform. But sometimes, from an engineering perspective, it's worth running with something we know is wrong and seeing where things break down.
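The aliasing collisions and the resulting loss of invertibility are easy to check numerically. A minimal sketch of the 8-point setup described above, with each integer analysis frequency as a row of a cosine matrix:

```python
import numpy as np

N = 8
t = np.arange(N) / N
# Row k holds samples of an amplitude-1 cosine at analysis frequency k.
C = np.array([np.cos(2 * np.pi * k * t) for k in range(N)])

# Aliasing: frequencies k and N - k produce identical samples...
print(np.allclose(C[3], C[5]))    # True
# ...so rows repeat, and the matrix cannot be inverted.
print(np.linalg.matrix_rank(C))   # 5 instead of 8
```

Rows 1 and 7, 2 and 6, and 3 and 5 are pairwise identical, which is exactly the "one other frequency with a positive dot product" pattern, and it leaves only 5 linearly independent rows out of 8.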
Before we dive into how frequency decomposition via the DFT really works, I want to take a quick aside to talk about a similar kind of decomposition that happens on a day-to-day basis. In some countries it is legal to monitor your internet traffic and build up an online profile of you, basically taking in a signal that's a function of your web activity and decomposing it into a fairly accurate advertising profile, one that can perhaps also be sold and used maliciously. This is not something any of us would like to think about when browsing online, and that's why I recommend NordVPN, the sponsor of today's video. NordVPN prevents outside parties from intrusive tracking and from building up such a profile. It's also great in areas where internet service providers can purposely slow down your connection. In addition to these essential features, one of my favorite uses of NordVPN is while traveling: NordVPN can give you a spoofed location in any of 60 countries, allowing you to access shows and entertainment that you may not usually get. The ability to connect six devices simultaneously lets you take these features anywhere you go. NordVPN has been in this business for years, making it easier than ever to get set up, and they've built a reputation that

Segment 4 (15:00 - 20:00)

values trust and cybersecurity. If you're interested, go to nordvpn.com/reducible to get the two-year plan with an exclusive deal plus one bonus month. It's risk-free with NordVPN's 30-day money-back guarantee. Thanks to NordVPN for sponsoring this video.

Going back to our previous question: we know our original transform has an issue with invertibility. To see where things break down, let's observe what happens when we apply this fake Fourier transform to a variety of signals. First, some ground rules around our assumptions: all the transforms from here on out will be taken from 16 points of the original signal, and we'll simply display half of the frequency spectrum. By the Shannon-Nyquist theorem, this allows us to capture frequencies of up to 7.

Let's start with what we've been doing so far: cosine waves. The first test case is that any change to the amplitude of the signal should be appropriately reflected in the strength of the respective frequency in the frequency domain. When we gradually change the amplitude of the original signal, this works as expected; amplitude changes are properly captured by our fake Fourier transform. The next test case is to change the frequency of the cosine wave. What we expect is that our test transform's peak will be correspondingly updated as the frequency changes, and indeed that is the case: the peak moves in the frequency domain as the frequency of the original signal changes. Let's also see what happens when we shift the y-intercept of our cosine waves. When we shift these waves up, we are essentially adding a signal of zero frequency, and our fake Fourier transform does capture that information in the contribution of the zeroth analysis frequency.

So far nothing has clearly broken, so let's make our test cases a bit more complex. Here's a combination of different cosine waves; let's see if our test transform properly extracts all the frequency components. We want an ideal Fourier transform to extract exactly these frequencies, and when we pass the sampled points from the signal into our test transform, we do indeed get them. The fundamental reason this works is that our transform is just a matrix multiplication, so it's linear. What that means is that the output of the transform on the aggregate signal is exactly the same as if we performed the transform on the individual signals separately and then summed the outputs.

So far we've focused on cosine waves; let's now see what happens when we use sine waves. What we expect is that if we pass in a sine wave of some frequency, we should ideally get that same frequency displayed in our frequency spectrum. So let's try it. What we get is a bit surprising: it seems like all the frequency components are zero. If we do some further digging, it turns out that the dot product between the cosine-wave analysis frequency and the matching sine wave is indeed zero, and interestingly enough, this is true for any frequency: sine waves and cosine waves of matching frequencies are orthogonal to each other. Our test transform as currently constructed is unable to extract frequency information from sine waves.

To understand what's going on, we have to go back to our test cases. We tested changing amplitudes, we tested changing frequencies, and we tested shifting the signals up and down. But one other component of a signal that we can change is its phase. What we expect to happen when we change the phase is that our frequency component strengths should stay the same; after all, it's the same frequency still present in the original signal, just shifted across time. But with our test transform, the frequency strength is unfortunately impacted by the phase. The sine wave example is just the specific case where the phase is offset by 90 degrees. So fundamentally, this is where our fake Fourier transform breaks; let's now see how we can fix it.

In the case where we originally had a sine wave, one kind of hacky way to fix the problem is to use analysis frequencies that correspond to sine waves instead of cosine waves. With the sine-wave version of the transform, every frequency of sine wave properly gets a frequency contribution after going through our test transform. But as you may expect, this breaks our original cosine-wave comparisons, where we now end up with the same problem. There seems to be some kind of balance that needs to be struck here, but how exactly can we do it? Let's play around with these separate cosine and sine wave analysis frequencies and see what happens.
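The phase problem can be demonstrated directly. A minimal sketch (the 16-sample window and the frequency 3 are assumptions for illustration): a same-frequency sine is invisible to the cosine analysis row, and more generally, the cosine-only similarity decays as the phase increases.

```python
import numpy as np

N = 16
t = np.arange(N) / N
f = 3

cos_row = np.cos(2 * np.pi * f * t)   # cosine analysis row at frequency f
sin_wave = np.sin(2 * np.pi * f * t)  # sine wave of the same frequency

# Same-frequency sampled sine and cosine are orthogonal:
print(np.dot(cos_row, sin_wave))      # ~0

# The cosine-only similarity is not phase-invariant:
for phase in (0.0, np.pi / 4, np.pi / 2):
    shifted = np.cos(2 * np.pi * f * t + phase)
    # Falls from 8.0 toward 0 as the phase approaches 90 degrees.
    print(np.round(np.dot(cos_row, shifted), 3))
```

The similarity here works out to (N/2)·cos(phase), which is why a 90-degree shift, i.e. a sine wave, scores exactly zero.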

Segment 5 (20:00 - 25:00)

Given a signal, let's modify its phase and capture values from both the cosine and sine versions of our fake transforms. What's going on here is that we are capturing the dot product of the matching analysis frequency for our original signal, for both versions of the transform. As you can see, when the underlying wave is a cosine wave, the cosine component dominates; when the phase shifts it toward a sine wave, the sine component takes over. One interesting thing is that these values are never simultaneously zero, and there's a fun sort of dance going on between them. One idea that seems promising is to see what happens when we plot these values on a 2D coordinate plane, and what's pretty cool is that these coordinates trace a circle as we change the phase of the original signal.

But the question remains: how do we put this together in a way that solves our phase problem? As a reminder, the original issue was that as we changed the phase of our signal, we got different frequency contributions from the test transform. What we ideally want is for the frequency contribution to stay the same regardless of phase. Can we use a feature of these circle-tracing coordinates to get a value that stays the same regardless of the phase? As some of you might have already guessed, it's the radius, or more concretely, the magnitude of these coordinates.

So let's put this together into a new proposed solution. We originally had a matrix containing analysis frequencies of cosine waves; what we'll now add is a similar matrix, but for sine waves of different analysis frequencies. The big idea is that we can apply the cosine-wave analysis-frequency matrix to give us x-coordinates, apply the sine-wave analysis-frequency matrix to give us y-coordinates, and form the final frequency-domain representation from the magnitude of these coordinate pairs. Here's the final combined version of the transform applied to our previous test cases. This combined version of sine and cosine analysis frequencies, by the way, is what most people refer to when they talk about frequency-domain representations of the true discrete Fourier transform.

But the DFT uses one more trick to simplify things. After all, generating these two matrices, performing the transformation twice, and then extracting the magnitude is a bit convoluted. The key insight is that instead of keeping track of two matrices of sample points from cosine and sine waves, we can jointly represent the two coordinates as sample points from a unit circle in a single matrix. Let's see how this works for a signal with 10 sampled points; breaking down an example is one of the best ways to see the underlying pattern. For the analysis frequency 0, all the points are simply the same coordinate; it's a zero-frequency signal, so this is expected. For the analysis frequency 1, things get more interesting: if we plot the corresponding points from our cosine and sine samples, we get 10 points, and intuitively, we're taking a single cycle through the unit circle and uniformly sampling points until we have 10 of them. For the analysis frequency 2, the points from the sine and cosine waves align similarly, and once again a nice way to interpret this is that we are regularly sampling points from the unit circle, but now over two cycles. Let's do analysis frequency 3 to make sure the pattern holds: we plot the sine and cosine points as before, and again they are uniformly sampled points from the unit circle, now over three cycles. Across all analysis frequencies, the key interpretation is that each analysis frequency samples points from the unit circle uniformly, with an angular frequency that matches the analysis frequency.

The final part of the simplification to the true DFT is that when we discuss points on a unit circle, it's quite convenient to represent the coordinates as complex numbers. The real components of these complex numbers correspond to points on the original cosine wave, while the imaginary components map to points on the original sine wave. In practice, these points are compactly represented as complex exponentials via Euler's formula. The final DFT matrix is composed of complex exponentials, where every row corresponds to sample points from the unit circle at a distinct integer analysis frequency. One of the most mind-blowing facts about this DFT matrix is that by defining it as complex exponentials regularly sampled from the unit circle, it's actually an orthogonal matrix.
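The whole construction can be sketched in a few lines (the 16-point size and the 3 Hz test signal are assumptions for illustration): a cosine matrix for x-coordinates, a sine matrix for y-coordinates, a magnitude that no longer depends on phase, and finally the observation that the complex matrix built via Euler's formula packages both matrices at once and has mutually orthogonal rows.

```python
import numpy as np

N = 16
t = np.arange(N) / N
k = np.arange(N)

C = np.cos(2 * np.pi * np.outer(k, t))   # cosine analysis rows (x-coordinates)
S = np.sin(2 * np.pi * np.outer(k, t))   # sine analysis rows (y-coordinates)

# Combining both transforms via the magnitude removes the phase dependence:
f = 3
for phase in (0.0, np.pi / 3, np.pi / 2):
    x = np.cos(2 * np.pi * f * t + phase)
    mags = np.hypot(C @ x, S @ x)
    # A peak of 8 at index 3 (and its mirror at 13), for every phase.
    print(np.round(mags, 3))

# Euler's formula packages the two real matrices into one complex matrix:
W = C - 1j * S
print(np.allclose(W, np.exp(-2j * np.pi * np.outer(k, k) / N)))  # True

# Its rows are mutually orthogonal: W @ conj(W).T is N times the identity.
print(np.allclose(W @ W.conj().T, N * np.eye(N)))                # True
```

The last check is the orthogonality claim made above: distinct rows of the complex exponential matrix have zero (complex) dot product, and each row has squared norm N.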

Segment 6 (25:00 - 29:00)

This matrix, as a result, ends up satisfying all the requirements we were looking for when we started this journey. Taking a step back: the DFT is fundamentally just a matrix-vector product of a time-domain signal with the DFT matrix. When you multiply this matrix with your time-domain signal, you get a sequence of complex numbers. The magnitude of each of these complex numbers tells you the contribution of the respective analysis frequency to the original signal, and the angle of each complex number is a measure of the phase of the signal. The best way to see this is with our previously failing test case: when we perform the DFT and plot the magnitudes of the frequency components, the magnitudes of the peaks do not change as the phase of the signal changes. What's really cool is that the angles of the corresponding peaks, when plotted on the complex plane, correspond exactly to the phase of the signal. That's how the DFT uniquely captures both frequency contributions and phase in a signal.

Earlier we talked briefly about how there are two peaks in the full spectrum of a cosine wave when using our fake transforms. The true DFT also has this property for real-valued cosine waves, but now the interpretation is a bit different. The DFT decomposes a signal into a set of complex exponentials. The first peak corresponds to a complex exponential whose underlying frequency matches the original cosine-wave frequency, while the second peak corresponds to a complex exponential whose underlying frequency moves in the opposite direction and perfectly cancels out the imaginary component of the first. This will always be the case for real-valued signals: the symmetry is required to cancel out the imaginary components and leave a real signal.

A final example that's also helpful for intuition: notice how, as we change the frequency of our test signal, whenever the test signal matches one of the analysis frequencies in our DFT matrix, the values from the respective analysis frequency shoot out, while all the other frequencies stay centered around zero. This is another way to visualize how our earlier measure of similarity materializes, now in the complex plane.

From an implementation perspective, the DFT is fundamentally just a matrix-vector product, and the inverse DFT, which recovers the original signal, is also a matrix-vector product: the inverse DFT matrix multiplied by the frequency-domain representation. Most of you could probably implement this matrix-vector product yourselves, but in practice the DFT is computed using an algorithm called the fast Fourier transform. I have an entire video on this really elegant algorithm, which I approached from a completely different perspective. Part of the reason I wanted to make this video is that there were so many great comments on the previous video about wanting to understand the DFT in a signal processing context, and I hope this video delivers on that.

So even though the DFT gets a bad rep for being quite complex (pun intended), at the end of the day, by going through a kind of playful process of trying different ideas, even with a lot of hand-holding, I hope you come away from this video with the sense that there's a logical reason for why it's defined the way it is. Most of the time in computer science, there are so many abstraction layers to deal with. For example, you can use the DFT with a single call to an FFT function and somewhat get away with not really understanding what's going on underneath the hood, and honestly, that's completely fine; abstraction is so important for being able to focus on what you need to accomplish. But I do think that peeling away the abstraction layers behind how some of these concepts work reveals some of the most beautiful ideas in computer science. Thanks for watching, and I'll see you all in the next one.
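To close the loop, here's a minimal sketch (the signal parameters are assumptions for illustration) showing that the matrix-vector product described above matches `numpy.fft.fft`, that magnitude and angle recover the frequency contribution and the phase, and that the inverse DFT matrix recovers the original signal:

```python
import numpy as np

N = 16
t = np.arange(N) / N
x = np.cos(2 * np.pi * 3 * t + np.pi / 4)   # a 3 Hz cosine with a phase offset

k = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(k, k) / N)  # the DFT matrix

X = W @ x                             # the DFT: one matrix-vector product
print(np.allclose(X, np.fft.fft(x)))  # True: matches the library FFT
print(np.abs(X[3]))                   # contribution of frequency 3 (8.0)
print(np.angle(X[3]))                 # phase of the signal (~pi/4)

x_back = np.linalg.inv(W) @ X         # the inverse DFT matrix reverses the transform
print(np.allclose(x_back, x))         # True: the original signal is recovered
```

In practice you would call an FFT routine rather than invert the matrix explicitly; the explicit product and inverse are just the clearest way to see the structure the video builds up.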

Other videos by this author: Reducible
