And, with your own drawing: https://gofigure.impara.ai
The real world is somewhere in between. It must involve quantum mechanics (in a way I don't really understand), as maximum bandwidth/minimum wavelength bump up against limits such as the Planck length and virtual particles in a vacuum.
An interesting anecdote from Lanczos[1] claims that Michelson (of interferometer fame) observed Gibbs ringing when he tried to reconstruct a square wave on what amounted to a steampunk Fourier analyzer [2]. He reportedly blamed the hardware for lacking the necessary precision.
1: https://math.univ-lyon1.fr/wikis/rouge/lib/exe/fetch.php?med...
2: https://engineerguy.com/fourier/pdfs/albert-michelsons-harmo...
For example, one viewpoint is that "Gibbs ringing" is always present whenever the bandwidth is limited; it's just that in the "non-aliased" case the sampling points have been chosen to coincide with the zero-crossings of the ringing.
I find that my brain explodes each time I pick up the Fourier Transform, and it takes a few days of exposure to simultaneously get all the subtle details back into my head.
No amount of precision, no number of coefficients, no degree of lowpass filtering can get around the fact that sin(x)/x never decays all the way to zero. So if you don't have an infinitely-long (or seamlessly repeating) input signal, you must apply something besides a rectangular window to it or you will get Gibbs ringing.
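If you want to see this for yourself, here's a rough numpy sketch comparing a rectangular window with a Hann window on a tone that doesn't line up with the window (the 52.3 Hz tone, sample rate, and 150 Hz cutoff are arbitrary picks for illustration):

```python
import numpy as np

fs = 1000                            # sample rate in Hz (arbitrary)
t = np.arange(0, 0.5, 1 / fs)        # half a second of signal
x = np.sin(2 * np.pi * 52.3 * t)     # a tone that doesn't fit the window exactly

rect = np.abs(np.fft.rfft(x))                        # rectangular window (no taper)
hann = np.abs(np.fft.rfft(x * np.hanning(len(x))))   # Hann-tapered copy

freqs = np.fft.rfftfreq(len(x), 1 / fs)
far = freqs > 150                    # energy far away from the 52.3 Hz tone
print("far-band leakage, rectangular:", rect[far].sum())
print("far-band leakage, Hann:       ", hann[far].sum())
```

The rectangular window's sinc-shaped sidelobes decay slowly, so its far-band leakage comes out much larger than the Hann window's.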
There is always more than one way to look at these phenomena, of course. But I don't think the case can be made that bandlimiting has anything to do with Gibbs.
Essentially it's just projection in infinite-dimensional vector spaces.
Turns out... they are not! You can do the same thing using a different set of functions, like Legendre polynomials, or wavelets.
Yup, any set of orthogonal functions! The special thing about sines is that they form an exceptionally easy-to-understand orthogonal basis, with a bunch of other nice properties to boot.
Which to your point: You're absolutely correct that you can use a bunch of different sets of functions for your decomposition. Linear algebra just says that you might as well use the most convenient one!
For someone reading this with only a calculus background, an example of this is that you get back a sine (times a constant) if you differentiate it twice, i.e. d^2/dt^2 sin(nt) = -n^2 sin(nt). Put technically, sines/cosines are eigenfunctions of the second derivative operator. This turns out to be really convenient for a lot of physical problems (e.g. wave/diffusion equations).
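If you have sympy handy, you can confirm that identity directly (just a check of the statement above, nothing more):

```python
import sympy as sp

n, t = sp.symbols("n t")
print(sp.diff(sp.sin(n * t), t, 2))   # prints -n**2*sin(n*t)
```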
Like you can make any vector in R^3 `<x,y,z>` by adding together a linear combination of `<1,0,0>`, `<0,1,0>`, `<0,0,1>`, it turns out you can also do it using `<exp(j2pi*0*0/3), exp(j2pi*0*1/3), exp(j2pi*0*2/3)>`, `<exp(j2pi*1*0/3), exp(j2pi*1*1/3), exp(j2pi*1*2/3)>`, and `<exp(j2pi*2*0/3), exp(j2pi*2*1/3), exp(j2pi*2*2/3)>`.
You can actually do it with a lot of different bases. You just need them to be linearly independent.
For the continuous case, it isn't all that different from how you can use a linear combination of polynomials 1,x,x^2,x^3,... to approximate functions (like Taylor series).
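A quick numpy sanity check of the 3-point DFT basis from the comment above (the vector [1, 2, 3] is just an arbitrary example):

```python
import numpy as np

N = 3
k = np.arange(N)
B = np.exp(2j * np.pi * np.outer(k, k) / N)   # columns: the 3-point DFT basis vectors

v = np.array([1.0, 2.0, 3.0])                 # an arbitrary vector
coeffs = np.linalg.solve(B, v)                # coordinates of v in that basis
print(np.allclose(B @ coeffs, v))             # True: v is reconstructed exactly
```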
To be extra clear: it’s a very good video and you should watch it if you don’t have a feel for the Fourier transform. I’m just trying to proactively instil a tiny bit of dissatisfaction with what you will know at the end of it, so that you will then go looking for more.
(Fractional fourier transform on the top face of the cube)
And for the short-time Fourier transform, showing how a filter kernel is shifted across the signal. [2]
Yes, the simplest way to think of it is to raise the DFT matrix to an exponent between 0 and 1 (1 being the classic DFT). But then the runtime complexity is O(n^2) (vector multiplied by a precomputed matrix) or O(n^3), as opposed to the O(n log n) of the fast Fourier transform. There are tricks to do a fast fractional Fourier transform by multiplying and convolving with a chirp signal. My implementation is in Rust [1], compiled to WebAssembly, but it is based on the MATLAB code of [2], who gladly answered all my emails asking many questions despite already being retired.
[1]: https://github.com/laszlokorte/svelte-rust-fft/tree/master/s...
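For anyone curious what "exponentiate the DFT matrix" looks like concretely, here's a naive numpy sketch (not the linked Rust/chirp implementation; also note the DFT's repeated eigenvalues mean this fractional power isn't unique, so proper FrFT code pins down a specific eigenbasis):

```python
import numpy as np

n = 8
k = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)   # unitary DFT matrix

a = 0.5                                        # "halfway" between signal and spectrum
w, V = np.linalg.eig(F)                        # eigenvalues are +1, -1, +j, -j
F_a = V @ np.diag(w ** a) @ np.linalg.inv(V)   # F raised to the fractional power a

x = np.random.randn(n)
print(np.allclose(F_a @ (F_a @ x), F @ x))     # two half-transforms equal one full DFT
```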
This took me down a very fascinating and intricate rabbit hole years ago, and is still one of my favorite hobbies. Application of Fourier, Laplace, and z transforms is (famously) useful in an incredibly wide variety of fields. I mostly use it for signal processing and analog electronics.
Image, video, and audio processing and synthesis, compression algorithms, 2D and 3D graphics and geometry, physics simulation, not to mention the entire current AI boom that's nothing but linear algebra… yeah, definitely algebra isn't useful in anything programming related.
gingerBill – Tools of the Trade – BSC 2025 https://youtu.be/YNtoDGS4uak
Recently I was buying a Chromecast dongle thing and one of the listings had some kind of "Amazon recommends" badge on it, from the platform. It had hundreds of 5-star reviews, but if you read them they were all for a jar of guava jam from Mexico.
I'm baffled why Amazon permits and even seemingly endorses this kind of rating farming.
Jokes aside, I graduated as a "Computer Engineer" (BSc) and then also did a "Master in Computer Science"; I was (young and) angry at the universe about why I had to sit through soooo many classical engineering classes and so much theory (control theory, electrical engineering, physics), while we never learned about the cool design patterns etc. etc.
Today I see that those formative years helped me a lot with how I develop intuition when looking at large (software) systems, and I also understand that those ever-changing best design patterns are something I can just look up, learn, and practice in my free time (and could have back then, too).
I wish today-me could have told yesterday-me all this.
After graduating I joined an academic research team based mainly in an EE department who were mainly control engineers - we were mainly doing stuff around qualitative reasoning and using it for fault diagnosis, training, etc.
Rather than talking about sine and cosine waves, they motivate the Fourier transform entirely in terms of polynomials. Imagine you want to multiply two polynomials (p(x) and q(x)). The key is to recognize that there are two ways to represent each polynomial:
1. "Coefficient form," as a set of coefficients [p_0, p_1, p_2, ..., p_d] where p(x) = p_0 + p_1x + p_2x^2 + ... + p_dx^d, OR
2. "Sample form," as a set of sampled points from each polynomial, like [(0, p(0)), (1, p(1)), (2, p(2)), ..., (d, p(d))]
Now, naive multiplication of p(x) and q(x) in coefficient form takes O(d^2) scalar multiplications to get the coefficients of p(x)q(x). But if you have p(x) and q(x) in sample form, it's clear that the sample form of p(x)q(x) is just [(0, p(0)q(0)), (1, p(1)q(1)), ...], which requires only O(d) multiplications!
As long as you have enough sample points relative to the degree, these two representations are equivalent (two points uniquely define a line, three a quadratic, four a cubic, etc.). The (inverse) Fourier transform is just a function that witnesses this equivalence, i.e., maps from representation (1) to representation (2) (and vice versa). If the sample points are chosen cleverly (not just 1/2/3/...) it actually becomes possible to compute the Fourier transform in O(d log d) time with a DP-style algorithm (the FFT).
So, long story short, if you want to multiply p(x) and q(x), it's best to first convert them to "sample" form (O(d log d) time using the FFT), then multiply the sample forms pointwise to get the sample form of p(x)q(x) (O(d) time), and then finally convert them back to the "coefficient" form (O(d log d) using the inverse FFT).
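In numpy the whole pipeline is a few lines (the tiny polynomials here are just for illustration):

```python
import numpy as np

p = np.array([1.0, 2.0, 3.0])      # p(x) = 1 + 2x + 3x^2
q = np.array([4.0, 5.0])           # q(x) = 4 + 5x

n = len(p) + len(q) - 1            # number of coefficients in the product
P = np.fft.fft(p, n)               # "sample form" of p at n cleverly chosen (complex) points
Q = np.fft.fft(q, n)
coeffs = np.fft.ifft(P * Q).real   # pointwise product, then back to coefficient form

print(np.round(coeffs, 10))        # [4, 13, 22, 15], same as np.convolve(p, q)
```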
This lecture by Dennis Freeman from MIT 6.003 "Signals and Systems" gives an intuitive explanation of the connections between the four popular Fourier transforms (the Fourier transform, the discrete Fourier transform, the Fourier series, and the discrete-time Fourier transform):
https://ocw.mit.edu/courses/6-003-signals-and-systems-fall-2...
Also, a property of wavelets is they're non-parametric, which limits their utility in knowledge discovery applications.
For ML applications, my opinion is that they're somewhat superseded by deep learning methods that apply a less restrictive inductive bias. As data grows, the restrictive prior assumptions of wavelets will hurt, sort of like how CNNs are being abandoned for ViTs, even though CNNs can outperform in situations where data is scarce.
So overall, they have a pretty small set of use cases where they're better suited than alternative tools.
Has anyone ever seen something like that?
The Fourier Transform equation essentially maps a signal from the time domain onto orthogonal complex sinusoidal basis functions through projection.
And the article does not even mention this. =)
Explaining how to use the Laplace transform to solve differential equations with analogies is a bit more... complicated.
I slightly object to this. Removing small details = blurring the image, which is actually quite noticeable.
For some reason everyone really wants to assume this is true, so for the longest time people would invent new codecs that were prone to this (in particular wavelet-based ones like JPEG-2000 and Dirac) and then nobody would use them because they were blurry. I think this is because it's easy to give up on actually looking at the results of your work and instead use a statistic like PSNR, which turns out to be easy to cheat.
For vision we are much more sensitive to large scale detail (corresponding to low frequency FFT components) than fine scale detail (corresponding to high frequency components), so given the goal of minimizing reduction in perceived quality this is an obvious place to start - throw away some of that fine detail (highest frequency FFT components), and it may not even be noticeable at all if you are throwing away detail at a higher level of resolution than we are able to perceive.
It turns out that human vision is more sensitive to brightness than color (due to the numbers of retinal rods vs cones, etc), so compression can also take advantage of that to minimize perceptual degradation, which is what JPEG does: first convert the image from RGB to YUV color space, where the Y component corresponds to brightness and the U,V components carry the color information, then compress the color information more heavily than the brightness by separately applying the FFT (actually the DCT) to each of the Y, U, V components and throwing away more high-frequency (fine-detail) color information than brightness information.
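Here's a toy numpy/scipy sketch of just the "throw away fine detail" step on a single 8x8 patch - not real JPEG (no tiling, quantization tables, or chroma handling), just the idea:

```python
import numpy as np
from scipy.fft import dctn, idctn

x = np.linspace(0, 1, 8)
block = np.outer(np.sin(np.pi * x), np.cos(np.pi * x))   # a smooth 8x8 "image" patch

C = dctn(block, norm="ortho")                            # 2-D DCT of the patch
keep = np.add.outer(np.arange(8), np.arange(8)) < 6      # keep only the low-frequency corner
approx = idctn(C * keep, norm="ortho")                   # rebuild from the kept coefficients

print("coefficients kept:", int(keep.sum()), "of 64")
print("max pixel error:  ", float(np.abs(block - approx).max()))
```

For a smooth patch the discarded high-frequency coefficients are tiny, so the reconstruction error stays small even though most coefficients are gone.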
But, yeah, there is no magic and lossy compression is certainly going to be increasingly noticeable the more heavily you compress.
1. You start with a signal fluctuating up and down, and it's on a strip of little LEDs labeled from -1 to +1.
2. You mount that strip to a motor, and spin it at a certain rate. After a while the afterimages make a blob-shape.
3. For each rotation rate, measure how much the shape appears off-center.
In this way you can figure out how much the underlying signal does (or doesn't) harmonize with a given rotation hertz.
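In code, the "how off-center is the blob" measurement is just the magnitude of the signal wrapped around a circle at the trial rotation rate (a sketch of the analogy, not of any particular gadget; the 7 Hz + 13 Hz test signal is made up):

```python
import numpy as np

fs = 1000
t = np.arange(0, 1, 1 / fs)
signal = np.sin(2 * np.pi * 7 * t) + 0.5 * np.sin(2 * np.pi * 13 * t)

def off_center(f_hz):
    # wrap the signal around a circle at f_hz rotations per second,
    # then measure how far the centroid of the "blob" sits from the origin
    return np.abs(np.mean(signal * np.exp(-2j * np.pi * f_hz * t)))

for f in (5, 7, 10, 13):
    print(f, "Hz ->", round(off_center(f), 3))   # large only at 7 Hz and 13 Hz
```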
The Wikipedia FFT article (https://en.wikipedia.org/wiki/Fast_Fourier_transform) credits Gauss with originating the FFT idea later expanded on by others, and correctly describes Cooley and Tukey's work as a "rediscovery."
The FT, like many other transforms, is 1-1, so, in theory, there's no information lost or gained. In many real-world conditions, looking at a function in frequency space greatly reduces the problem. Why? Pet theory: because many functions that look complex are actually composed of simpler building blocks in the transformed space.
Take the sound wave of a fly and it looks horribly complex. Pump it through the FT and you find a main driver of the wings beating at a single frequency. Take the sum of two sine waves and it looks a mess. Take the FT and you see the signal neatly broken into two peaks. Etc.
The use of the FT (or DCT or whatever) for JPEG, MP3 or the like is basically exploiting this fact: the frequency response of human hearing and vision is not uniform, so the signal can be "compressed" by throwing away frequencies we don't care about.
The "magic" of the FT, and other transforms, isn't so much that it transforms the signal into a set of orthogonal basis but that many signals we care about are actually formed from a small set of these signals, allowing the FT and cousins to notice and separate them out more easily.
And there's another good reason why so many real-world signals are sparse (as you say) in the FT domain in particular: because so many real-world systems involve periodic motion (rotating motors, fly's wings as you noted, etc). When the system is periodic, the FT will compress the signal very effectively because every component has to be a harmonic of the fundamental frequency.
Well, stable systems can be either stationary or oscillatory. If the world didn't contain so many stable systems, or equivalently if the laws of physics didn't allow them, then likely life would not have existed. All life is made of complex chemical structures, and they require stability to function. Ergo, by this anthropic argument there must be many oscillatory systems.
I would say that it's very difficult to imagine a world that would not be governed by differential equations. So it's not just that life wouldn't exist, it's that there wouldn't be anything like the laws of physics.
As a side note, chaotic systems are often better analysed in the FT domain, so even in a world of chaotic systems (and there are many in our world, and I'd argue that if there weren't, life would not exist either) the FT remains a powerful tool.
We have discovered a method (calculus) to mathematically describe continuous functions of various sorts, and within calculus there is a particular toolbox (differential and partial differential equations) we have built to mathematically describe systems that are changing, by describing that change.
The fact that systems which change are well-described by the thing we have made to describe systems which change shouldn’t be at all surprising. We have been working on this since the 18th century and Euler and many other of the smartest humans ever devoted considerable effort to making it this good.
When you look at things like the chaotic behaviour of a double pendulum, you see how the real world is extremely difficult to capture precisely and as good as our system is, it still has shortcomings even in very simple cases.
A learning that describes chaos well enough may not want to be associated with "calculus", or even "math" (ask a friendly reverse mathematician about that)
https://www.johndcook.com/blog/2021/04/09/period-three-impli...
Somewhat tangentially, if Ptolemy I had responded (to Euclid) with anything less specific ---but much more personal--- than "check your postulate", we wouldn't have had to wait one millennium.
(Fermat did the best he could given margin & ego, so that took only a century or so (for that country to come up with a workable strategy))
Less tangentially, I'd generalize Quigley by mentioning that groups of hominids stymie themselves with a kind of emergent narcissism. After all, heuristics, rules, and even values informed by experience & intuition are a sort of arrogance. "Tautology" should be outlawed in favour of "Narcissism" as a prosocial gaslighting term :)
Ordinary differential equations can describe any system with a finite number of state variables that change continuously (as opposed to instantaneously jumping from one state to another without going through states in between) and as a function of the system's current state (as opposed to nondeterministically or under the influence of the past or future or some kind of supernatural entity).
Partial differential equations extend this to systems with infinite numbers of variables as long as the variables are organized in the form of continuous "fields" whose behavior is locally determined in a certain sense—things like the temperature that Fourier was investigating, which has an infinite number of different values along the length of an iron rod, or density, or pressure, or voltage.
It turns out that a pretty large fraction of the phenomena we experience do behave this way. It might be tempting to claim that it's obvious that the universe works this way, but that's only because you've grown up with the idea and never seriously questioned it. Consider that it isn't obvious to anyone who believes in an afterlife, or to Stephen Wolfram (who thinks continuity may be an illusion), or to anyone who bets on the lottery or believes in astrology.
But it is at least an excellent approximation that covers all phenomena that can be predicted by classical physics and most of quantum mechanics as well.
As a result, the Fourier and Laplace transforms are extremely broadly applicable, at least with respect to the physical world. In an engineering curriculum, the class that focuses most intensively on these applications is usually given the grandiose title "Signals and Systems".
For example, the concept of homeostasis in biology is like this. Lots of things are happening inside the living body, but it's still effectively at a functional equilibrium.
Similarly, lots of dynamic things are happening inside the Sun (or any star), but from the perspective of Earth, it is more or less stationary, because the behavior of the sun won't escape some bounds for billions of years.
(And cases where that isn't true can still be instructive - a Taylor series expansion for air resistance gives a linear term representing the viscosity of the air and a quadratic term representing displacement of volumes of air. For ordinary air the linear component will have a small coefficient compared to the quadratic component.)
They do a really good job at breaking down the fundamental knowledge needed to build an understanding.
https://www.youtube.com/watch?v=spUNpyF58BY&list=PL4VT47y1w7...
This setup then allowed Fourier to take the limit of the Fourier series using a complex exponential expression. But keep in mind, this is all in the context of 19th-century deterministic thinking (see Fourier's analysis of the heat equation in 1807, and Maxwell's treatise 'Theory of Heat' c. 1870s), and it ran into real-world limits in the late 19th century, first with Poincare and later with sensitive dependence on initial conditions. Poincare showed that just because you have a deterministic system, you don't automatically get predictability. Regardless, this Fourier transform approach worked well in astronomy (at least at solar-system scale) because the underlying system really was at least quasiperiodic - essentially this led to the prediction of new planets like Neptune.
But what if you apply Fourier transform analytics to data that is essentially chaotic? This applies to certain aspects of climate science too, e.g. efforts to predict the next big El Nino based on the historical record - since the underlying system is significantly chaotic, not strictly harmonic, prediction is poor (tides, in contrast, are predictable as they are mostly harmonic). How to treat such systems is an ongoing question, but Fourier transforms aren't abandoned, more like modified.
Also, the time-energy quantum mechanics relation is interesting, though not really a pure QM uncertainty principle - more like a classical example of a Fourier bandwidth relation: squeeze it on one axis and it spreads out on the other. A nice quote on this is "nothing that lives only for a while can be monochromatic in energy." Which sort of leads on to virtual particles and quantum tunneling (which places a size limit of about 1 nm on chip circuitry).
The bottom line is that if you're applying elegant and complex mathematical treatments to real-world physical problems, don't forget that nature doesn't necessarily follow in the footsteps of your mathematical model.
That sounds complicated, but if you look at the integral, the exponential kernel is essentially a continuous "matrix" and the function you are integrating over that kernel is a continuous vector.
This observation can be a guide to better understanding infinite-dimensional Hilbert spaces (inner product space + stuff) and is one of the core observations in quantum mechanics, where it's part of the particle-wave concept as it transforms location space -> momentum space.
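The discretized version of that picture is literal: the DFT is just a matrix-vector product with the kernel exp(-j2pi*k*n/N), which you can check against numpy's FFT (a sketch; N = 16 and the random vector are arbitrary):

```python
import numpy as np

N = 16
n = np.arange(N)
kernel = np.exp(-2j * np.pi * np.outer(n, n) / N)   # the discrete "matrix" version of the kernel

x = np.random.randn(N)                              # the "vector" being transformed
print(np.allclose(kernel @ x, np.fft.fft(x)))       # True: the DFT is this matrix product
```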
Cauchy had just proved that the limit of the sum of an infinite set of continuous functions was itself continuous, and then along came Fourier with "are you sure about that bro?" and showed that you could take the infinite sum of very clearly continuous functions (just sine and cosine) and approximate something like a sawtooth function (which was very obviously discontinuous) as closely as you like.
[1] by which I mean making obvious the fact that they had been proceeding for 100+ years using calculus without a rigorous basis.
It would be very sad to lose those treasures by passing them by because you thought you already had them. As the song says:
> When you're young, you're always looking
> On the far side of the hill;
> You might miss the fairest flower
> Standing by your side very still.
And the flowers of Fourier analysis are as fair as the fairest flowers in this universe, or any universe.
https://news.ycombinator.com/item?id=45134843 may be a clue to the hidden beauty, for those who are puzzled as to what I might be talking about.
Also a related fact is that the rate of change of a sine wave is itself shifted by pi/2, while the exponential curves need no shifting. I'm not a professional in these matters, but I guess there is a deeper connection between the rate of change and the static world.
How do you even begin to think of such things? Some people are wired differently.
A common mistake I see in people reading mathematics (or even computer science papers) is to think the proof set out in the paper is the thought process that led to the interesting insight. It is almost always an ex post facto formalisation.
Other mathematicians before Fourier had used trigonometric series to study waves, and physicists already understood harmonic superposition on eg a vibrating string. I don't have the source but I believe Gauss even noted that trigonometric series were a solution to the heat equation. Fourier's contribution was discovering that almost any function, including the general solution to the heat equation, could be modelled this way, and he provided machinery that let mathematicians apply the idea to an enormous range of problems.
I think he could have explained how the Gaussian filter almost kills all the details / high-frequency features, then rounding completely destroys them, and then they cannot be magically reconstructed to "unblur". He gives some hints about this, but they are too few and too general.
PS: There are some non-linear tricks to "enhance" the "unblurred" version, like minimizing the L1 norm of the gradient (or something like that). It helps with some images. I'm not sure what the current state of the art is.
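A small numpy sketch of why the naive "divide the blur back out" fails (the Gaussian width, signal, and 8-bit rounding here are arbitrary choices for illustration):

```python
import numpy as np

N = 256
x = np.linspace(0, 1, N, endpoint=False)
signal = (x > 0.5).astype(float)                  # a sharp edge: plenty of high frequencies

freqs = np.fft.fftfreq(N, d=1 / N)                # integer frequencies
H = np.exp(-0.5 * (freqs / 4.0) ** 2)             # Gaussian blur response: ~0 at high frequencies

blurred = np.fft.ifft(np.fft.fft(signal) * H).real
stored = np.round(blurred * 255) / 255            # 8-bit rounding wipes out the tiny survivors

# naive "unblur": divide the spectrum by H again - the destroyed frequencies
# come back as hugely amplified rounding noise instead of the original edge
deblurred = np.fft.ifft(np.fft.fft(stored) / np.maximum(H, 1e-12)).real
print("max error after blur -> round -> unblur:", float(np.abs(deblurred - signal).max()))
```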
https://youtu.be/spUNpyF58BY?feature=shared
Edit: In fact it was already mentioned in the comments, but I hadn't noticed.
Until I read this article I didn't properly understand Fourier transforms (I didn't know how image compression bitmaps were derived); now it's opened a whole new world - toying with my own compression and with anything continuous being represented by its constituent parts.
I can possibly use it for colour quantisation too, to determine main and averaged RGB constituents with respect to hue, allowing colour reduction akin to dithering, but spreading the error over the wave instead and removing the less frequent elements.
It may not work but it'll be fun trying and learning!
And the window of course creates a latency, which is sometimes relevant for realtime audio filtering by FFT.
This is the "Fourier transform" that is most often used in practice in computing.
A longer window gives finer frequency resolution (and resolves lower frequencies), but longer latency.
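For example (a rough numpy sketch; the 440/442 Hz pair and the window lengths are arbitrary): two tones 2 Hz apart blur together with a 0.25 s window but separate cleanly with a 2 s window, at the cost of waiting 2 s of audio for each frame.

```python
import numpy as np

fs = 8000

def midpoint_dip(window_seconds):
    t = np.arange(0, window_seconds, 1 / fs)
    x = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 442 * t)
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    band = (freqs > 435) & (freqs < 447)
    mid = np.argmin(np.abs(freqs - 441.0))          # halfway between the two tones
    return spectrum[mid] / spectrum[band].max()     # well below 1 => tones resolved

print("0.25 s window:", round(midpoint_dip(0.25), 2))   # close to 1: tones merge
print("2.00 s window:", round(midpoint_dip(2.00), 2))   # small: tones clearly separate
```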
I guess what I want to know is: in the examples it always shows like 3 or 4 constituent frequencies as output, but why not hundreds or a million? Is that decided upfront / parametrizable?
The article isn't helpful, it just says something like "all possible frequencies".
But in the continuous Fourier transform, the output you get is a continuous function that's defined on the entire real line.
The bins do not actually measure a specific frequency, more like an average of the power around that frequency.
As you take more and more samples, the bin spacing gets finer and the range of frequencies going into the average becomes tighter (kind of). By collecting enough samples (at an appropriate rate), you can get as precise a measurement as you need around particular frequencies. And by employing other tricks (signal processing).
If you graph the magnitude of the DFT, signals that are a combination of power at just a few frequencies show just a few peaks, around the related bins. Eg a major chord would show 3 fundamental peaks corresponding to the 3 tones (then a bunch of harmonics)
https://en.wikipedia.org/wiki/Fourier_transform#/media/File:...
So you detect these peaks to find out what frequency components are present. (though peak detection itself is complicated)
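A minimal sketch of that peak-detection step with scipy (the note frequencies are approximate, and real peak picking needs more care - thresholds, interpolation, harmonics):

```python
import numpy as np
from scipy.signal import find_peaks

fs = 8000
t = np.arange(0, 1, 1 / fs)
chord = sum(np.sin(2 * np.pi * f * t) for f in (261.6, 329.6, 392.0))   # ~C4, E4, G4

mag = np.abs(np.fft.rfft(chord * np.hanning(len(chord))))
freqs = np.fft.rfftfreq(len(chord), 1 / fs)

peaks, _ = find_peaks(mag, height=mag.max() * 0.1)   # ignore small sidelobe bumps
print(np.round(freqs[peaks], 1))                     # the three chord tones, to the nearest 1 Hz bin
```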
Suppose you are blind, and are on one side of a black picket fence. Let's say the fence uprights are 1" wide, and 1" apart.
If you aim a light meter at the fence, it will sum up all the light of the visible slices of the background. (This meter reads its values out loud!)
Suppose the background is all white - the meter sees white slices, and gets a 50% reading (because the fence stripes are black). That's as high a reading as you can get.
Now move the picket fence half an inch to the left - still all white, still a 50% reading. Now you know the background is 100% white, no other colors.
But if when you move it, black stripes are exposed, then the meter reads 0%, and you know that the background is 50/50 white and black, in stripes.
Congratulations! You've just done a Fourier transform that can detect the difference between a 0-frequency pattern and a 1"-striped pattern. In reality, you're going to continuously move that fence to the left, taking readings all the time, but this is the simple version.
But instead, imagine you got readings of 40% and 10%. What kind of pattern is that? Gray stripes? White stripes that aren't as wide as the fence posts? You'll have to build another fence, with another spacing - say 1/2" fence posts 1/2" apart. Repeat.
"The LCT generalizes the Fourier, fractional Fourier, Laplace, Gauss–Weierstrass, Bargmann and the Fresnel transforms as particular cases."
https://en.wikipedia.org/wiki/Linear_canonical_transformatio...