Justus Polzin

Why are polynomials so fundamental?

I was recently reminded of this question when I saw this tweet:

Is there an intuitive rationale for the necessity of the complex numbers to exist? Saying "we needed to solve x^2=-1" is a bit short, why not "x+1=x"?

— François Fleuret (@francoisfleuret) January 21, 2024

I will explain at the end how "the necessity of the complex numbers" ultimately stems from "the necessity of polynomials" but maybe you can already see how.

I think the property of polynomials that I will explain is so obvious in hindsight that people who have seen it often don't even realize that others may not know it.

I remember reading a Reddit post about this topic years ago where the top comment was about eigenvalues and the characteristic polynomial of a matrix, and some other comments were about function approximation and other applications. While I agree that polynomials would be interesting for those reasons alone, I think the reason why we force everyone to study them in school is even more fundamental. I could not find that post, but while looking for it I did find answers that get to the bottom of what I want to discuss. For example, this post from 5 years ago mentions this MathOverflow question where the first sentence of the top answer is this:

Polynomials are, essentially by definition, precisely the operations one can write down starting from addition and multiplication.

The rest of the answer is about commutative algebra and relates to category theory, which is maybe in some sense even more fundamental, but probably not what most people are looking for when asking this question. (In fact, the reply on Reddit says exactly this.)

There is also this r/math post from 2 years ago where the top answer is:

Polynomials are exactly what you can get by adding and multiplying known and unknown things, so they necessarily occur at everything that involves basic algebra.

So this post is of course not a new perspective, but I will try to explain these answers in a bit more detail. This post will be very non-technical so that (hopefully) someone learning about polynomials in school for the first time could understand it, at least up to the point where I start talking about complex numbers.

A Tale of Addition, Subtraction, Multiplication, and Division

If you want to understand why polynomials are so important, you will have to accept that addition, subtraction, and multiplication are important. Given that they are taught very early on in elementary school and used by almost everyone, I think this is a reasonable expectation.

The fact that polynomials are important is now almost an immediate consequence. We will restrict ourselves to the case of a single variable $x$. Everything remains true with multiple variables, but multiple variables would make everything more complicated to write down and unnecessarily distract from the ideas.

Theorem of the Polynomial Normal Form
Any expression involving only
  • addition ($+$)
  • subtraction ($-$)
  • multiplication ($\cdot$)
  • numbers
  • the variable $x$

can be written as a polynomial expression. That is an expression of the form $a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0$, where the $a_i$ are numbers called coefficients.

I don't know if this theorem has an actual name, which is strange because I find it so fundamental. I have called it "Theorem of the Polynomial Normal Form", because polynomials act as a normal form for such expressions. I think once you see this theorem, it is clear that it is true by using associativity, commutativity and distributivity. (This can be proven pretty easily by induction on the syntax tree of the expression.) Just in case someone in high school who googled this question is actually reading this, let's see an example. Take the following expression:

By using associativity, commutativity, and distributivity we can transform this expression into the usual polynomial form. First, let's simplify the second bracket.

Now we can distribute with the first bracket.

I am not yet done, but I think at this point it is also clear that it is very sensible to introduce the notation $x^n$, where $n$ is an integer greater than or equal to 0. Sometimes you see the question "Why don't we 'allow' any (i.e. negative, fractional, or even irrational) exponents in polynomials?". I think the real reason is that we want "polynomial expressions" to act as a normal form for expressions using only these three operations. We just use the notation $x^n$ as a textual replacement to avoid having to write long chains of multiplications by the same variable. So a variable in the exponent is also not allowed, because no expression using only addition, subtraction, and multiplication is equivalent to one with the variable in the exponent.

Having said all this, the expression can now be written as

Now we can combine the terms of equal degree and sort the terms by degree, and we have the form of a polynomial you are used to.
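The induction on the syntax tree mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch: the tuple encoding of expressions, the function names, and the example expression are assumptions made here, not something taken from the post.

```python
from itertools import zip_longest

# A polynomial is stored as a list of coefficients, lowest degree first:
# [a_0, a_1, ..., a_n] stands for a_0 + a_1*x + ... + a_n*x^n.

def trim(coeffs):
    """Drop trailing zero coefficients so equal polynomials get equal lists."""
    coeffs = list(coeffs)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs

def add(p, q):
    return trim(a + b for a, b in zip_longest(p, q, fillvalue=0))

def sub(p, q):
    return trim(a - b for a, b in zip_longest(p, q, fillvalue=0))

def mul(p, q):
    # Distributivity: every term of p meets every term of q.
    result = [0] * (len(p) + len(q))
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return trim(result)

def normalize(expr):
    """Turn a syntax tree into a coefficient list by structural induction.

    A tree is the string "x", a number, or a tuple (op, left, right)
    with op one of "+", "-", "*" (a hypothetical encoding).
    """
    if expr == "x":
        return [0, 1]
    if isinstance(expr, (int, float)):
        return trim([expr])
    op, left, right = expr
    p, q = normalize(left), normalize(right)
    return {"+": add, "-": sub, "*": mul}[op](p, q)

# Hypothetical example (not the expression from the text):
# (x + 1) * (x - 2) - 3 * x
tree = ("-", ("*", ("+", "x", 1), ("-", "x", 2)), ("*", 3, "x"))
print(normalize(tree))  # [-2, -4, 1], i.e. x^2 - 4x - 2
```

The base cases are a number and the variable, and each operation maps two coefficient lists to a new coefficient list, so every expression built from these pieces ends up as a polynomial.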

What about division?

One immediate question is why division is disallowed. It is one of the four basic arithmetic operations after all. There are many reasons you could give here, but I think the most fundamental is that division is not defined for all divisors: you cannot divide by 0. So when you write down such an expression, e.g. one that divides by $x + 3$, you can't plug in $-3$ for $x$, i.e. the function is not defined everywhere. Even a fraction whose numerator and denominator cancel completely is not really equivalent to the cancelled expression, because the cancelled expression is defined everywhere whereas the fraction is not defined where its denominator is zero. So for our simplest class of functions (polynomials), we just want to avoid all that trouble altogether, by only being able to even write down expressions that can be evaluated everywhere. But still, there actually is a similar theorem to the one for the case without division.

Theorem of the Rational "Normal" Form
Any expression involving only
  • addition ($+$)
  • subtraction ($-$)
  • multiplication ($\cdot$)
  • division ($/$)
  • numbers
  • the variable $x$

can be written as a quotient of two polynomials, i.e. as $\frac{p(x)}{q(x)}$, where $p$ and $q$ are polynomials in the variable $x$.

Maybe you have had to deal with rational functions in school and have wondered why anyone cares about those. It is (in my opinion) because they are the normal form of expressions that allow all four basic arithmetic operations. Moreover, this actually gives us another answer to the original question of why polynomials are so important:

If you want to know where the original expression is undefined, you have to find the zeros of the denominator polynomial. If you want to know where the original expression evaluates to zero, you have to find the zeros of the numerator polynomial. So even if you include division, the properties of the expressions without division (polynomials) are still the basis of your understanding.

We have to be a bit careful here with the normal form. The difficulty is again that the function doesn't have to be defined everywhere, and to be truly equivalent we have to preserve the locations where the function is undefined, so a fraction is not equivalent to the expression you get by cancelling a factor that can be zero. I don't really want to go into the details here or prove anything, but basically you have to make sure that the final function is still undefined at the same places as before. At the same time, to be a normal form it has to be reduced as much as possible, such that two expressions that compute the same function and are undefined at the same places have the same normal form. So a common factor that can be zero may only be cancelled as long as it still remains in the denominator afterwards. Alternatively, you can just allow your normal form to be defined in more places than before, as long as it is equal on all previously defined inputs.
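To make the quotient form concrete, here is a minimal Python sketch that applies the usual fraction rules to pairs of polynomials. The representation, the function names, and the example expression are assumptions chosen for illustration. The sketch deliberately does not cancel common factors, and it does not keep track of where an intermediate division was undefined, so it sidesteps rather than solves the subtleties just described.

```python
# A rational expression is kept as a pair (numerator, denominator) of
# coefficient lists, lowest degree first: [3, 1] stands for 3 + x.

def padd(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def pneg(p):
    return [-a for a in p]

def pmul(p, q):
    if not p or not q:
        return []
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

def peval(p, x):
    """Evaluate a polynomial with Horner's rule."""
    value = 0
    for a in reversed(p):
        value = value * x + a
    return value

# The four operations on pairs follow the usual fraction rules.
def radd(r, s):
    (a, b), (c, d) = r, s
    return (padd(pmul(a, d), pmul(c, b)), pmul(b, d))

def rsub(r, s):
    (a, b), (c, d) = r, s
    return (padd(pmul(a, d), pneg(pmul(c, b))), pmul(b, d))

def rmul(r, s):
    (a, b), (c, d) = r, s
    return (pmul(a, c), pmul(b, d))

def rdiv(r, s):
    (a, b), (c, d) = r, s
    return (pmul(a, d), pmul(b, c))

X = ([0, 1], [1])  # the variable x, written as x / 1

def const(c):
    return ([c], [1])

# Hypothetical example: 1 / (x + 3) + x
num, den = radd(rdiv(const(1), radd(X, const(3))), X)
print(num, den)        # [1, 3, 1] [3, 1], i.e. (x^2 + 3x + 1) / (x + 3)
print(peval(den, -3))  # 0, so the expression is undefined at x = -3
```

Finding where the example is undefined comes down to finding the zeros of the denominator polynomial, and finding where it evaluates to zero comes down to the zeros of the numerator polynomial.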

Why not other operations?

Why don't we allow exponentiation, or logarithms, or sine and cosine? I mean obviously people do consider more complicated expressions than just polynomials, but I think there is some sense in which addition, multiplication, etc. really are more fundamental than e.g. sine.

I have an exercise for you. (Don't actually do it.) On a piece of paper, without a calculator or computer, compute (say) 5 digits of the sine of some number. How do you even start? How does a calculator do it? On the other hand: calculate the first 5 digits of an expression that only uses the basic arithmetic operations, say a quotient of two whole numbers. You can probably do this on a piece of paper. So the basic arithmetic operations we can somehow do on a piece of paper, but sine is not as easy.

One way to define sine is as an "infinite sum" (also called a series). In this case it is actually a power series, which is like an infinite polynomial. Expressions have to have finite length, so the complete infinite sum isn't actually an expression. This is the beginning of that sum:

$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$$

This is called the Taylor series of sine. It only involves the basic arithmetic operations, but infinitely many of them! But when you start evaluating it for any particular value of $x$, you will notice that the terms of this sum start getting smaller and smaller at some point. So when you want to know the result to any particular precision, you can just stop when those digits don't change anymore. This is a way to calculate the approximate value of sine for any number. Your calculator would also do it in a similar way, potentially using more complicated algorithms, but I think you would be hard pressed to find an algorithm that doesn't use the basic arithmetic operations somewhere. In fact, you'd have a hard time even defining sine formally without first defining basic arithmetic.
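As a rough sketch of this idea, the following Python function adds up terms of the Taylor series using only addition, subtraction, multiplication, and division. The function name and the stopping rule are assumptions made here: it stops once a term is smaller than the requested precision, which is a simplification of "stop when the digits don't change anymore", and it is not how real calculators or math libraries compute sine.

```python
import math

def sine_taylor(x, digits=5):
    """Sum x - x^3/3! + x^5/5! - ... until the terms drop below the precision."""
    tolerance = 10 ** -(digits + 1)
    term = x      # current term, starting with x^1 / 1!
    total = 0.0
    n = 1         # exponent of the current term
    while abs(term) >= tolerance:
        total += term
        # Next term: multiply by -x^2 and divide by (n + 1)(n + 2).
        term *= -x * x / ((n + 1) * (n + 2))
        n += 2
    return total

# Compare against the library implementation.
print(round(sine_taylor(1.0), 5), round(math.sin(1.0), 5))
```

For small inputs only a handful of terms are needed. For large inputs the terms first grow before they shrink, which is one reason practical implementations do more work than this.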

I don't wanna spend too much time on this, but basically when you actually formally define most other functions you know, they are defined in terms of the basic arithmetic operations, so in that sense the basic arithmetic operations really are more fundamental.

Wrapping up

I hope you can see how polynomials naturally arise from the four basic arithmetic operations. Any such expression is equivalent to a quotient of two polynomials and any such expression without division is equivalent to a polynomial.

For those who know what complex numbers are, we will now discuss how complex numbers naturally arise from trying to solve polynomial equations.

Complex Numbers

As the tweet at the beginning of the post suggests, complex numbers arise when you try to find solutions to the equation $x^2 = -1$, which doesn't have any solutions in the real numbers. But why $x^2 = -1$ and not $x + 1 = x$, as the tweet suggests?

The first step to make things easier is to rephrase those equations as root finding problems of polynomials, by just subtracting one side from both sides. In our case that gives us $x^2 + 1 = 0$ and $1 = 0$.

First things first: if you have ever wondered why we care about the roots of polynomials so much, it is just because of the simple observation that when trying to solve $p(x) = q(x)$, where $p$ and $q$ are polynomials, we can instead solve $p(x) - q(x) = 0$, i.e. find the roots of $p - q$, where $p - q$ is also still a polynomial.

Secondly, if we accept the second equation, i.e. $1 = 0$, then every number will just be equal to zero, because we can multiply both sides by that number $a$ and get $a = 0$. One way of looking at this is to say that we get a very boring number system with only 0, but you could also just say this is false, because we started with the real numbers and it is false there. If you define the polynomial $f(x) = 1$, then in any kind of sensible number system you define, no matter what value you plug in for $x$, you will still get 1. For example, $f(i)$ is still 1 in the complex numbers. So non-zero constant polynomials still have no roots, or in other words: equations where the left and right hand sides differ only by a non-zero constant still have no solution.

But okay, still, why $x^2 + 1 = 0$ and not some other polynomial equation that also doesn't have a solution in the real numbers? It turns out that it doesn't matter! Given any non-constant polynomial $p$ that doesn't have a root in the real numbers, if you adjoin a hypothetical root, say $\alpha$, i.e. a new number such that $p(\alpha) = 0$, then your new number system will also contain a solution to $x^2 + 1 = 0$, which we could call $i$, so essentially you will still get the complex numbers.

Moreover, every non-constant polynomial has a root in the complex numbers, which means that any equation where the left and right hand sides are expressions using only basic arithmetic (and don't differ by a non-zero constant) has a solution. (It will have exactly as many solutions as the degree of the non-constant polynomial when counting with multiplicity.) An even stronger claim is true: the polynomial (or equation) can itself contain complex numbers, i.e. the polynomial can have complex coefficients, and it will still always have a root in the complex numbers. This fact is known as the fundamental theorem of algebra.
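To see in one case why the choice of polynomial doesn't matter, here is a small worked example. The polynomial $x^2 + x + 1$ and the name $\omega$ are chosen here purely for illustration. This polynomial has no real root, because $x^2 + x + 1 = (x + \tfrac{1}{2})^2 + \tfrac{3}{4}$ is always positive. Suppose we adjoin a number $\omega$ with $\omega^2 + \omega + 1 = 0$. Then:

```latex
\begin{align*}
(2\omega + 1)^2 &= 4\omega^2 + 4\omega + 1 \\
                &= 4(\omega^2 + \omega + 1) - 3 \\
                &= -3, \\
\left(\frac{2\omega + 1}{\sqrt{3}}\right)^2 &= \frac{-3}{3} = -1.
\end{align*}
```

So the number $\frac{2\omega + 1}{\sqrt{3}}$, built from $\omega$ and real numbers using only basic arithmetic, is a solution to $x^2 + 1 = 0$ and can play the role of $i$.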

I hope this makes it clear that the complex numbers arise naturally when you try to solve equations where the left and right hand sides use only basic arithmetic, which is the same as finding the roots of polynomials.