Solving the cubic equation

While developing a graphical presentation of the Elliptic Curve function, I needed to find the roots of a cubic equation. There are two major solutions: one by Gerolamo Cardano, from 1545, and another based on trigonometry and de Moivre's theorem. Which of the two applies depends on the values of the coefficients in the equation.

Cardano's solution

This exposition is based on Cardano and the Solution of the Cubic, by Bryan Dorsey, Kerry-Lyn Downie, and Marcus Huber (University of Kentucky), and on Wikipedia, but shows the working that they left out.

Given an initial cubic equation:

$$a x^3 + b x^2 + c x + d = 0$$

we substitute $x = t - \frac{b}{3a}$ to get a reduced (or depressed) equation of the form: $t^3 + p t + q = 0$, where $p = \frac{3 a c - b^2}{3a^2}$ and $q = \frac{2 b^3 - 9 a b c + 27 a^2 d}{27 a^3}$.

$$ a \left( t - \frac{b}{3a}\right)^3 + b \left( t - \frac{b}{3a}\right)^2 + c \left( t - \frac{b}{3a}\right) + d = 0 \\ a \left( t^3 - 3 t^2 \left(\frac{b}{3a}\right) + 3 t \left(\frac{b}{3a}\right)^2 - \left(\frac{b}{3a}\right)^3 \right) + b \left( t^2 - 2 \frac{b}{3a}t + \left(\frac{b}{3a} \right)^2 \right) + c \left(t - \left(\frac{b}{3a}\right) \right) + d = 0 \\ a t^3 - b t^2 + \frac {b^2}{3 a} t - \frac{b^3}{27a^2} + b t^2 -2 \frac{b^2 t}{3a} +\frac{b^3}{9a^2} + c t - \frac{b c}{3a} + d = 0 \\ a t^3 + \left( \frac{b^2}{3 a} -2\frac{b^2}{3 a} + c \right) t - \frac{b^3}{27a^2} +\frac{b^3}{9a^2} - \frac{b c}{3 a} + d = 0 \\ t^3 + \left( \frac{b^2}{3 a^2} -2\frac{b^2}{3 a^2} + \frac{c}{a} \right) t - \frac{b^3}{27a^3} +\frac{b^3}{9a^3} - \frac{b c}{3 a^2} + \frac{d}{a} = 0 \\ t^3 + \frac{3 a c - b^2}{3 a^2} t + \frac{2 b^3 - 9 a b c +27 a^2 d}{27a^3} = 0 \\ \boxed{ t^3 + p t + q = 0 } $$
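The reduction above can be checked numerically. The following is a minimal Python sketch, not part of the original sources; the function name `depress` and the returned `shift` value are my own illustrative choices.

```python
def depress(a, b, c, d):
    """Return (p, q, shift) such that x = t - shift maps
    a*x^3 + b*x^2 + c*x + d = 0 onto t^3 + p*t + q = 0."""
    if a == 0:
        raise ValueError("not a cubic: a must be non-zero")
    p = (3 * a * c - b ** 2) / (3 * a ** 2)
    q = (2 * b ** 3 - 9 * a * b * c + 27 * a ** 2 * d) / (27 * a ** 3)
    shift = b / (3 * a)
    return p, q, shift


# Example: 2x^3 - 12x^2 + 22x - 12 = 2(x - 1)(x - 2)(x - 3)
p, q, shift = depress(2, -12, 22, -12)
for x in (1, 2, 3):
    t = x + shift                      # invert x = t - shift
    print(x, t, t ** 3 + p * t + q)    # the last value is the residual
```

For this example the reduced equation is $t^3 - t = 0$, so each known root $x$ of the original cubic should give a residual of zero.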

Then substitute $t = u + v$ to get: $$\begin{align} (u + v)^3 + p (u + v) + q &= 0 \\ u^3 +3 u^2 v + 3 u v^2 + v^3 + p (u + v) + q &= 0 \\ u^3 + v^3 + (3 u v + p)(u + v) + q &= 0 \end{align}$$

Now set $3 u v + p = 0$, giving $u v = -p/3$ and $u^3 + v^3 + q = 0$, to get:

$$\begin{align} u^3 + v^3 &= -q \\ u^3 v^3 &= - \left( \frac{p}{3} \right)^3 \end{align} $$

Substitute for $u^3$: $$ u^3 = -q - v^3 \\ (-q - v^3) v^3 = - \left( \frac{p}{3} \right)^3 \\ (q + v^3) v^3 - \left( \frac{p}{3} \right)^3 = 0 \\ v^6 + q v^3 - \left( \frac{p}{3} \right)^3 = 0 $$

Although this is a sextic equation in $v$, it is actually only quadratic in $v^3$, so we can use the standard quadratic solution: $$\begin{align} v^3 &= \frac{ - q \pm \sqrt{q^2 - 4 \left( - \frac{p}{3} \right)^3 }}{2} \\ &= -\frac{q}{2} \pm \sqrt{ \left( \frac{q}{2} \right)^2 + \left( \frac{p}{3} \right)^3 } \\ &= -\frac{q}{2} \pm \sqrt{ \Delta } \\ & \text{where } \Delta = \left( \frac{q}{2} \right) ^2 + \left( \frac{p}{3} \right) ^3 \end{align}$$

Since $u^3 + v^3 = - q$, we also have:

$$\begin{align} u^3 &= - q - v^3 \\ &= -q - \left( - \frac{q}{2} \pm \sqrt{ \Delta } \right) \\ &= -\frac{q}{2} \mp \sqrt{ \Delta } \end{align}$$

Since $t = u + v$, Cardano's solution is now:

$$\ \boxed{ t_0 = \sqrt[3]{- \frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{ - \frac{q}{2}-\sqrt{\Delta}}} $$

In real arithmetic, this result is only valid when the square root of the discriminant can be taken; that is, when $\Delta \geq 0$.
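As a rough illustration, Cardano's formula can be evaluated directly in Python. This is a sketch under the assumption that $p$ and $q$ are real and $\Delta \geq 0$; the helper names `real_cbrt` and `cardano_real_root` are hypothetical, and the real cube root is built by hand so that negative arguments are handled.

```python
import math


def real_cbrt(x):
    """Real cube root that also works for negative x."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def cardano_real_root(p, q):
    """Real root t0 of t^3 + p*t + q = 0, assuming Delta >= 0."""
    delta = (q / 2) ** 2 + (p / 3) ** 3
    if delta < 0:
        raise ValueError("Delta < 0: the real square root does not exist")
    s = math.sqrt(delta)
    return real_cbrt(-q / 2 + s) + real_cbrt(-q / 2 - s)


# t^3 + 3t - 4 = 0 has the real root t = 1 (here Delta = 5)
t0 = cardano_real_root(3, -4)
print(t0, t0 ** 3 + 3 * t0 - 4)   # root and residual
```

Taking the real cube root of both terms keeps the product $u v$ equal to $-p/3$, which is the constraint imposed in the derivation.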

Show that Cardano's solution is a root

When we substitute the above result for $t_0$ into the original (reduced) equation, we get:

$$\begin{align} t_0^3 + p t_0 + q = &\left( \sqrt[3]{ -\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{ -\frac{q}{2}-\sqrt{\Delta}} \right)^3 + p \left( \sqrt[3]{ -\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{ -\frac{q}{2}-\sqrt{\Delta}} \right) + q \\ \text{Expand using the binomial expansion: } (A + B)^3 = A^3 + 3 A^2 B + 3 A B^2 + B^3 \\ = &\left( -\frac{q}{2}+\sqrt{\Delta} \right) + 3 \left( \sqrt[3]{-\frac{q}{2}+\sqrt{\Delta} } \right)^2 \left( \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta} } \right) + 3 \left( \sqrt[3]{-\frac{q}{2}+\sqrt{\Delta} } \right) \left( \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta} } \right)^2 \\ &+ \left( -\frac{q}{2}-\sqrt{\Delta} \right) + p \left( \sqrt[3]{-\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta}} \right) + q \\ \text{The unrooted terms in } q \text{ cancel out.} \\ = & 3 \left( \sqrt[3]{-\frac{q}{2}+\sqrt{\Delta} } \right)^2 \left( \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta} } \right) + 3 \left( \sqrt[3]{-\frac{q}{2}+\sqrt{\Delta} } \right) \left( \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta} } \right)^2 + p \left( \sqrt[3]{-\frac{q}{2}+\sqrt{\Delta} } + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta} } \right) \\ \text{Bring the power terms inside the roots: } (\sqrt[3]{A})^2 \sqrt[3]{B} = \sqrt[3]{A^2 B} \\ = & 3 \sqrt[3]{\left(-\frac{q}{2}+\sqrt{\Delta} \right)^2 \left(-\frac{q}{2}-\sqrt{\Delta} \right) } + 3 \sqrt[3]{\left(-\frac{q}{2}+\sqrt{\Delta} \right) \left(-\frac{q}{2}-\sqrt{\Delta} \right)^2 } + p \left(\sqrt[3]{-\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta}} \right) \\ \text{Using the “difference of squares” identity: } (A + B)(A - B) = A^2 - B^2 \\ = & 3 \sqrt[3]{\left(-\frac{q}{2}+\sqrt{\Delta} \right) \left( \left( \frac{q}{2} \right)^2 -\Delta \right) } + 3 \sqrt[3]{\left( \left( \frac{q}{2} \right)^2 -\Delta \right) \left(-\frac{q}{2}-\sqrt{\Delta} \right) } + p \left(\sqrt[3]{-\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta}} \right) \\ \text{By definition of } \Delta \text{: } (q/2)^2 - \Delta =(-p/3)^3 \\ = & 3 
\sqrt[3]{\left(-\frac{q}{2}+\sqrt{\Delta} \right) \left( \frac{-p}{3} \right)^3 } + 3 \sqrt[3]{ \left( \frac{-p}{3} \right)^3 \left(-\frac{q}{2}-\sqrt{\Delta} \right) } + p \left(\sqrt[3]{-\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta}} \right) \\ \text{Bring the cubed terms outside the roots: } \\ = & 3 \left(\frac{-p}{3} \right) \sqrt[3]{\left(-\frac{q}{2}+\sqrt{\Delta} \right) } + 3 \left(\frac{-p}{3} \right) \sqrt[3]{\left(-\frac{q}{2}-\sqrt{\Delta} \right) } + p \left(\sqrt[3]{-\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta}} \right) \\ = & - p \sqrt[3]{\left(-\frac{q}{2}+\sqrt{\Delta} \right) } - p \sqrt[3]{\left(-\frac{q}{2}-\sqrt{\Delta} \right) } + p \left(\sqrt[3]{-\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta}} \right) \\ = & - p \left(\sqrt[3]{-\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta}} \right) + p \left(\sqrt[3]{-\frac{q}{2}+\sqrt{\Delta}} + \sqrt[3]{-\frac{q}{2}-\sqrt{\Delta}} \right) \\ = & 0 \end{align}$$

This confirms that Cardano's formula does give a root of the reduced equation.

The other two roots

The above process obtains one real root of the cubic equation, but there are two other roots. These are real if $\Delta \leq 0$, but form a complex-conjugate pair if $\Delta > 0$. When $\Delta = 0$ at least two of the real roots are equal. When all three roots are real, the easiest way to find them is with a trigonometric identity and de Moivre's theorem. This is described in Solving the Cubic Equation and Roots of Polynomials. Solution of Cubic Equation with Three Real Roots by Henry Baker explains the trigonometry and has a nice animated GIF to show it graphically.
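For the case $\Delta > 0$, where the trigonometric method does not apply, one alternative (not taken from the sources above) is to divide the reduced cubic by $(t - t_0)$, which leaves the quadratic $t^2 + t_0 t + (t_0^2 + p) = 0$. A minimal sketch, with the hypothetical name `remaining_roots`, assuming $t_0$ is already known to be a root:

```python
import cmath


def remaining_roots(p, t0):
    """Given one root t0 of t^3 + p*t + q = 0, return the other two.

    Dividing the cubic by (t - t0) leaves t^2 + t0*t + (t0^2 + p) = 0,
    which is solved here with the quadratic formula in complex arithmetic.
    """
    disc = t0 ** 2 - 4 * (t0 ** 2 + p)    # equals -3*t0^2 - 4*p
    root = cmath.sqrt(disc)
    return (-t0 + root) / 2, (-t0 - root) / 2


# t^3 + 3t - 4 = (t - 1)(t^2 + t + 4), so t0 = 1
t1, t2 = remaining_roots(3, 1.0)
print(t1, t2)   # a complex-conjugate pair
```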

The trigonometric solution

The trigonometric solution depends on the triple angle trigonometric identity:

$$\cos ( 3 \theta) = 4 \cos^3 \theta - 3 \cos \theta \text{, or } \\ 4 \cos^3 \theta - 3 \cos \theta - \cos(3 \theta) = 0 $$

As the Henry Baker paper states, this is a bit of a deus ex machina, or something up my sleeve, which we will have to take on trust for now (it is proved later). If we set $t = \lambda \cos \theta$ in the reduced equation: $$\begin{align} t^3 + p t + q &= 0 \\ (\lambda \cos \theta)^3 + p (\lambda \cos \theta) + q &= 0 \\ \lambda^3 \cos^3 \theta + p \lambda \cos \theta + q &= 0 \\ 4 \lambda^3 \cos^3 \theta + 4 p \lambda \cos \theta + 4 q &= 0 \\ 4 \cos^3 \theta + \frac{4 p}{\lambda^2} \cos \theta + \frac{4 q} {\lambda^3} &= 0 \\ \text{Now set } 4 p/\lambda^2 = - 3 \text{, or } \lambda = \sqrt{- \frac{4 p}{3}} \text{ (real only when } p < 0 \text{), to get:} \\ 4 \cos^3 \theta - 3 \cos \theta + \frac{4 q} {\left(\sqrt{- \frac{4 p}{3}} \right)^3} &= 0 \\ 4 \cos^3 \theta - 3 \cos \theta + \frac{4 q} {\left(- \frac{4 p}{3}\right) \sqrt{- \frac{4 p}{3}}} &= 0 \\ 4 \cos^3 \theta - 3 \cos \theta - \frac{4q} {(\frac{4 p}{3}) \sqrt{- (\frac{4 p}{3}) }} &= 0 \\ 4 \cos^3 \theta - 3 \cos \theta - \frac{q} {( \frac{p}{3}) \sqrt{- (\frac{4 p}{3}) }} &= 0 \\ 4 \cos^3 \theta - 3 \cos \theta - \frac{q} {( \frac{2 p}{3}) \sqrt{- \frac{p}{3} }} &= 0 \\ 4 \cos^3 \theta - 3 \cos \theta - \frac{3 q}{2 p}\sqrt{-\frac{3}{p}} &= 0 \end{align}$$

This is equivalent to the triple angle identity if:

$$\begin{align} \cos ( 3 \theta ) & = \frac{3 q}{2 p} \sqrt{ - \frac{3}{p}} \end{align}$$

Now, for any angle $\alpha$,

$$\begin{align} \cos ( 3 \alpha ) & = \cos (3 \alpha + \tau ) = \cos (3 ( \alpha + \tau / 3)) \\ & = \cos (3 \alpha + 2 \tau ) = \cos (3 ( \alpha + 2 \tau / 3)) \\ & \quad \text{ where } \tau \text{ is } \href{http://www.tauday.com/tau-manifesto}{\text{one turn}} \text{, also known as } 2 \pi . \end{align}$$

So we can say: $$\begin{align} 3 \theta & = \arccos \left( \frac{3 q}{2 p} \sqrt{ - \frac{3}{p}} \right) - k \tau \text{, for } k = 0,1,2 \\ \theta & = \frac{1}{3} \arccos \left( \frac{3 q}{2 p} \sqrt{ - \frac{3}{p}} \right) - k \tau / 3 \text{, for } k = 0,1,2 \\ \end{align}$$

Recall that we had originally set $t = \lambda \cos \theta$. Substituting for $\lambda$ and $\theta$ gives the three roots: $$ \boxed{t_k = \sqrt{- \frac{4 p}{3}} \cos \left( \frac{1}{3} \arccos \left( \frac{3 q}{2 p} \sqrt{ - \frac{3}{p}} \right) - \frac{k \tau}{3} \right) \text{ for } k = 0,1,2 } $$
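The boxed formula translates almost line for line into Python. The sketch below is illustrative only: the name `trig_roots` is hypothetical, it assumes real $p < 0$ and $\Delta \leq 0$, and the small tolerance and clamping of the $\arccos$ argument are my own guard against floating-point rounding rather than part of the derivation.

```python
import math


def trig_roots(p, q):
    """Three real roots of t^3 + p*t + q = 0, assuming p < 0 and Delta <= 0."""
    if p >= 0:
        raise ValueError("the trigonometric form needs p < 0")
    arg = (3 * q) / (2 * p) * math.sqrt(-3 / p)
    if abs(arg) > 1 + 1e-12:
        raise ValueError("arccos argument outside [-1, 1], i.e. Delta > 0")
    arg = max(-1.0, min(1.0, arg))     # clamp rounding noise (assumption)
    lam = math.sqrt(-4 * p / 3)
    theta = math.acos(arg) / 3
    # math.tau is one turn, 2*pi
    return [lam * math.cos(theta - k * math.tau / 3) for k in range(3)]


# t^3 - 7t + 6 = (t - 1)(t - 2)(t + 3)
roots = trig_roots(-7, 6)
print(sorted(roots))   # approximately -3, 1 and 2
```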

This is only valid when the argument of the $\arccos$ function lies within the range $ [ -1, +1 ] $: $$ -1 \leq \frac{3 q}{2 p} \sqrt{ - \frac{3}{p}} \leq + 1 $$

The value within the square root must be positive, so $p$ (and $p^3$) must be negative. The argument itself can take either sign, so square it to get an equivalent condition: $$ 0 \leq \left( \frac{3 q}{2 p} \right)^2 \left( - \frac{3}{p} \right) = - \frac{27 q^2 }{4 p^3} \leq 1 $$

Multiply by $-4 p^3$, which is positive, so the direction of the inequalities is preserved:

$$ 0 \leq 27 q^2 \leq - 4 p^3 $$

Divide by $27 \times 4$:

$$ 0 \leq \frac{q^2}{4} \leq -\frac{p^3}{27} $$

Add $p^3/27$:

$$ \frac{p^3}{27} \leq \frac{q^2}{4} + \frac{p^3}{27} \leq 0 \\ \left( \frac{p}{3} \right)^3 \leq \left( \frac{q}{2} \right)^2 + \left( \frac{p}{3} \right)^3 \leq 0 $$

The final inequality is just a restatement of the discriminant condition: $\Delta \leq 0$: $$ \left( \frac{q}{2} \right)^2 + \left( \frac{p}{3} \right)^3 \leq 0 $$
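As a worked check with values of my own choosing, take $p = -7$ and $q = 6$ (the reduced cubic $t^3 - 7 t + 6 = 0$, whose roots are $1$, $2$ and $-3$):

$$ \frac{3 q}{2 p} \sqrt{ - \frac{3}{p}} = -\frac{9}{7} \sqrt{\frac{3}{7}} \approx -0.8417 \\ - \frac{27 q^2 }{4 p^3} = \frac{972}{1372} = \frac{243}{343} \approx 0.7085 \leq 1 \\ \Delta = \left( \frac{q}{2} \right)^2 + \left( \frac{p}{3} \right)^3 = 9 - \frac{343}{27} = -\frac{100}{27} \approx -3.70 \leq 0 $$

The $\arccos$ argument lies within $[-1, +1]$ and $\Delta$ is negative, so both forms of the condition agree for this example.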

Derivation of the triple angle identity

The triple angle identity can be obtained from de Moivre's theorem, using complex algebra:

$$\begin{align} \cos ( 3 \theta ) &= \frac{1}{2} ( e^{3 i \theta} + e^{-3 i \theta}) \\ &= \frac{1}{2} (( \cos \theta + i \sin \theta )^3 + ( \cos \theta - i \sin \theta )^3 ) \\ &= \frac{1}{2} ( ( \cos^3 \theta + 3 i \cos^2 \theta \sin \theta + 3 i^2 \cos \theta \sin^2 \theta + i^3 \sin^3 \theta ) + ( \cos^3 \theta - 3 i \cos^2 \theta \sin \theta + 3 i^2 \cos \theta \sin^2 \theta - i^3 \sin^3 \theta ) ) \\ &= \frac{1}{2} ( ( \cos^3 \theta + 3 i \cos^2 \theta \sin \theta - 3 \cos \theta \sin^2 \theta - i \sin^3 \theta ) + ( \cos^3 \theta - 3 i \cos^2 \theta \sin \theta - 3 \cos \theta \sin^2 \theta + i \sin^3 \theta ) ) \\ &= \frac{1}{2} ( 2 \cos^3 \theta - 6 \cos \theta \sin^2 \theta ) \\ &= \cos^3 \theta - 3 \cos \theta \sin^2 \theta \\ &= 4 \cos^3 \theta - 3 ( \cos^3 \theta + \cos \theta \sin^2 \theta ) \\ &= 4 \cos^3 \theta - 3 ( \cos \theta (\cos^2 \theta + \sin^2 \theta )) \\ &= 4 \cos^3 \theta - 3 \cos \theta \end{align}$$
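A quick numerical sanity check of the identity, as a sketch only: it cubes $\cos \theta + i \sin \theta$ using Python's built-in complex type and compares the real part with both sides of the identity. The function name and the sample angles are arbitrary choices of mine.

```python
import math


def cos3_via_de_moivre(theta):
    """Real part of (cos(theta) + i*sin(theta))^3, which is cos(3*theta)."""
    z = complex(math.cos(theta), math.sin(theta))
    return (z ** 3).real


for theta in (0.0, 0.3, 1.0, 2.5):
    direct = math.cos(3 * theta)
    via_power = cos3_via_de_moivre(theta)
    identity = 4 * math.cos(theta) ** 3 - 3 * math.cos(theta)
    # All three values should agree to within rounding error
    print(theta, direct, via_power, identity)
```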