Solving Kepler's Equation with Halley's Method
Edmond Halley is best known for the comet that was named after him, but he was also a brilliant mathematician and astronomer. In 1694, he published Methodus nova accurata & facilis inveniendi radices æquationum quarumcumque generaliter, sine prævia reductione. In this work, he described a method for solving equations that is now known as Halley’s method.
It is similar to Newton’s method, but converges faster.
Solving Equations in One Variable
Solving an equation is one of the most basic problems in mathematics. Given a function \(f\), we want to find the value of \(x\) such that \(f(x) = 0\). Some equations can be solved analytically, but most cannot. In these cases, we must resort to numerical methods. These methods are not new to the computer era. Many of them date back centuries.
The most basic method is the bisection method. It is simple and robust, but slow. It also requires that you have an interval that is known to contain a solution. The method works by repeatedly dividing the interval and checking which half contains the solution.
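To make this concrete, here is a minimal bisection sketch in plain C# (independent of any library), using \(x^2 - 2 = 0\) as a stand-in example:

```csharp
using System;

// Bisection: f must change sign over [a, b]. Each step halves the
// interval, gaining one binary digit of accuracy per iteration.
static double Bisect(Func<double, double> f, double a, double b, double tolerance)
{
    double fa = f(a);
    while (b - a > tolerance)
    {
        double mid = 0.5 * (a + b);
        double fmid = f(mid);
        if (fa * fmid <= 0)
            b = mid;        // the sign change, and so the root, is in the left half
        else
        {
            a = mid;        // the root is in the right half
            fa = fmid;
        }
    }
    return 0.5 * (a + b);
}

// Example: solve x^2 - 2 = 0 on the interval [1, 2].
double root = Bisect(x => x * x - 2, 1, 2, 1e-12);
Console.WriteLine(root);
```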
Newton’s method, which was put into its modern form by Joseph Raphson in 1690, starts from only an initial guess and uses the derivative of the function to home in on the solution.
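A minimal sketch of that iteration in plain C#, again using \(x^2 - 2 = 0\) as a stand-in example:

```csharp
using System;

// Newton's method: repeatedly replace x by x - f(x)/f'(x).
// Near the root, each step roughly doubles the number of correct digits.
static double Newton(Func<double, double> f, Func<double, double> fPrime,
    double x, double tolerance)
{
    for (int i = 0; i < 100; i++)
    {
        double dx = f(x) / fPrime(x);
        x -= dx;
        if (Math.Abs(dx) < tolerance) break;
    }
    return x;
}

// Example: solve x^2 - 2 = 0 starting from the guess x = 1.
double root = Newton(x => x * x - 2, x => 2 * x, 1.0, 1e-14);
Console.WriteLine(root);
```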
Order of Convergence
The order of convergence of an algorithm is a measure of how quickly it converges to the solution. The higher the order, the more quickly the algorithm converges once it is close enough to the solution.
Newton’s method is a second-order algorithm. This means that with each iteration, the number of correct digits in the solution roughly doubles: the new error is roughly proportional to the square of the previous error. The algorithm is also said to converge quadratically.
By contrast, the bisection method is a first-order algorithm. In each iteration we gain one binary digit. Other methods exist that do better. Two such methods, which show super-linear convergence, are the secant method (order 1.618) and the Dekker-Brent method (order 1.839).
Neither method reaches the second-order speed of Newton’s method. Newton’s method achieves its quadratic convergence by taking advantage of the first derivative of the function. Which raises the question: can we do better?
Of course we can.
Halley’s method is a third-order method that is similar to Newton’s method, but it takes advantage of the first and second derivatives of a function. This allows it to converge even faster than Newton’s method. At each iteration, the number of correct digits in the solution roughly triples.
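Concretely, where Newton’s method uses the update \(x_{n+1} = x_n - f(x_n)/f'(x_n)\), Halley’s iteration is
\[x_{n+1} = x_n - \frac{2 f(x_n)\, f'(x_n)}{2 f'(x_n)^2 - f(x_n)\, f''(x_n)},\]
which reduces to Newton’s rule when the second derivative term vanishes.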
Trade-offs
At first glance, it might seem that Halley’s method is always the best choice. After all, who wouldn’t want a method that converges faster? However, there are trade-offs.
First, in nearly all cases, we don’t need to calculate that many digits of the solution. The cubic convergence only really kicks in when the current approximation is close enough to the real solution. So the gains aren’t necessarily that great in practice.
Second, Halley’s method requires the second derivative of the function. This can be computationally expensive to calculate, especially for complicated functions. Any remaining gains can be easily wiped out by this extra cost.
Even the cost of computing the first derivative can be prohibitive. Other methods like Brent’s method don’t require derivatives at all, and are often more efficient.
Where Halley’s method really shines is when the function is well-behaved and the second derivative can be easily computed. For example, many so-called special functions, like the Bessel functions, satisfy a second-order differential equation. Therefore, the second derivative is a simple expression of the function itself and its first derivative and can be computed very cheaply.
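For instance, a Bessel function \(y = J_\nu(x)\) satisfies Bessel’s differential equation
\[x^2 y'' + x y' + (x^2 - \nu^2) y = 0,\]
so that
\[y'' = -\frac{y'}{x} - \left(1 - \frac{\nu^2}{x^2}\right) y.\]
Once \(y\) and \(y'\) are in hand, the second derivative costs only a handful of arithmetic operations.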
Solving Kepler’s Equation with Numerics.NET
Another example of a suitable function is Kepler’s equation. This is an equation that arises in celestial mechanics, and is used to calculate the position of a planet, comet, or asteroid in its orbit.
Kepler’s Equation
The equation reads:
\[M = E - e \sin E\]
where \(M\) is the mean anomaly in radians, \(E\) is the eccentric anomaly in radians, and \(e\) is the eccentricity.
The exact meaning of these numbers isn’t important for our purposes. For orbits of objects in the solar system, the eccentricity ranges from close to zero to almost 1. For example, the eccentricity of the Earth’s orbit is about 0.0167. The eccentricity of a comet’s orbit can be much higher. For Halley’s comet, it is about 0.967!
\(M\) and \(e\) are known, so the equation must be solved for \(E\). It looks like we can use \(M\) as an initial guess, especially when \(e\) is small.
Formally, we want to solve \(f(E) = 0\), where
\[f(E) = E - e \sin E - M.\]
Applying Halley’s Method
Taking the derivative of \(f\) with respect to \(E\), we quickly get
\[f'(E) = 1 - e \cos E\]
and
\[f''(E) = e \sin E.\]
This is perfect! The most expensive part of the calculation is the sine and cosine functions, and we only need to calculate them once per iteration. Compared to Newton’s method, we’re doing a little bit more algebra, but that’s relatively minor.
Solving the Equation with Numerics.NET
Here’s how we can solve Kepler’s equation using Halley’s method in Numerics.NET. First, we get the orbital parameters for Halley’s comet from the Jet Propulsion Laboratory. Then we calculate the mean anomaly \(M\) for the date of the comet’s most recent closest approach to Earth, now almost 40 years ago, on April 10, 1986:
double e = 0.9679221169240834;
DateTime epoch = new DateTime(1968, 02, 21, 0, 0, 0, DateTimeKind.Utc);
DateTime closest = new DateTime(1986, 4, 10);
double daysSinceEpoch = (closest - epoch).TotalDays;
double Mdegrees = 274.8113481510388 + 0.01298172464034513 * daysSinceEpoch;
double M = Mdegrees * Constants.DegreesToRadians;
Halley’s method is implemented by the HalleySolver class in Numerics.NET. We create an instance of this class:
using Numerics.NET;
using Numerics.NET.EquationSolvers;
HalleySolver halley = new HalleySolver();
Now we’re ready to define Kepler’s equation. We have a few options here. We can define the function and its first and second derivatives as separate delegates, or we can specify one delegate that returns all three values at once in a tuple. Because we want to optimize our calculations, we use the latter method:
Func<double, (double, double, double)> keplerEquation = E =>
{
var (sinE, cosE) = Math.SinCos(E);
var f = E - e * sinE - M;
var fPrime = 1 - e * cosE;
var fDoublePrime = e * sinE;
return (f, fPrime, fDoublePrime);
};
halley.TargetFunctionWithDerivatives = keplerEquation;
By default, equation solvers in Numerics.NET use a tolerance of around \(10^{-8}\). This means that the solver will stop when the error, estimated by its last correction, is less than that value. In practice, however, a less strict tolerance is often sufficient. Here’s why.
Remember that Halley’s method converges cubically: the correction at each iteration is roughly proportional to the cube of the previous correction. This gives us a prediction of the size of the correction in the next step. If this predicted correction is smaller than the tolerance, we can stop the iteration early.
So, in this case, we’ll set the tolerance to the surprisingly effective value of \(10^{-3}\), which should give us at least 8 digits of accuracy:
halley.RelativeTolerance = 1e-3;
Finally, we call the Solve method of the HalleySolver class and specify the mean anomaly as the initial guess:
halley.InitialGuess = M;
double EHalley = halley.Solve();
Console.WriteLine($"E = {EHalley} ({halley.IterationsNeeded} iterations)");
We can compare the performance of Halley’s method to Newton’s method, which is implemented by the NewtonRaphsonSolver class:
NewtonRaphsonSolver newton = new NewtonRaphsonSolver();
newton.TargetFunctionWithDerivative = E =>
{
var (sinE, cosE) = Math.SinCos(E);
var f = E - e * sinE - M;
var fPrime = 1 - e * cosE;
return (f, fPrime);
};
newton.RelativeTolerance = 3e-5;
newton.InitialGuess = M;
double ENewton = newton.Solve();
Console.WriteLine($"E = {ENewton} ({newton.IterationsNeeded} iterations)");
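The same comparison can be made without any library at all. Here is a self-contained sketch of both iterations applied to Kepler’s equation; the values of \(e\) and \(M\) below are illustrative, not the Halley’s-comet parameters used above:

```csharp
using System;

// Kepler's equation f(E) = E - e sin E - M, with f' and f''.
double e = 0.967, M = 3.0;

(double f, double fp, double fpp) Kepler(double E)
{
    double sinE = Math.Sin(E), cosE = Math.Cos(E);
    return (E - e * sinE - M, 1 - e * cosE, e * sinE);
}

// Newton: E <- E - f / f'
double En = M;
int newtonSteps = 0;
while (true)
{
    var (f, fp, _) = Kepler(En);
    double dx = f / fp;
    En -= dx;
    newtonSteps++;
    if (Math.Abs(dx) < 1e-12 || newtonSteps >= 50) break;
}

// Halley: E <- E - 2 f f' / (2 f'^2 - f f'')
double Eh = M;
int halleySteps = 0;
while (true)
{
    var (f, fp, fpp) = Kepler(Eh);
    double dx = 2 * f * fp / (2 * fp * fp - f * fpp);
    Eh -= dx;
    halleySteps++;
    if (Math.Abs(dx) < 1e-12 || halleySteps >= 50) break;
}

Console.WriteLine($"Newton: E = {En} in {newtonSteps} iterations");
Console.WriteLine($"Halley: E = {Eh} in {halleySteps} iterations");
```

Both iterations land on the same root; Halley’s method simply gets there in fewer steps.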
The initial guess
There is one final twist. It’s a lesson that is useful to keep in mind any time we use an iterative algorithm to find a solution to a problem, whether it’s solving an equation, fitting a curve, or some other optimization problem.
In nearly all cases, of all the factors that determine the effectiveness of a method, the most important one is the quality of the initial guess.
Using the mean anomaly as the initial guess for Kepler’s equation works fine in most situations, but it can produce poor results when the eccentricity is very close to 1. In such cases, the successive approximations can bounce around chaotically for some time until they finally stumble upon a value that is close enough to start converging.
Many techniques have been proposed to improve the initial guess for Kepler’s equation, some better than others. In 1987, the Finnish astronomer Seppo Mikkola came up with a relatively short method. We will give the code here without much further comment.
double alpha = (1 - e) / (4 * e + 0.5);
double beta = 0.5 * M / (4 * e + 0.5);
double sqrt = Math.Sqrt(Elementary.Pow(beta, 2) + Elementary.Pow(alpha, 3));
double z = Math.Cbrt(beta + Math.CopySign(sqrt, beta));
double s = z - alpha / z;
s -= 0.078 * Elementary.Pow(s, 5) / (1 + e);
double E0 = M + e * s * (3 - 4 * s * s);
With this initial guess, Halley’s method only needs one iteration!
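As a quick sanity check, here is a plain C# version of the same starting value followed by a single hand-written Halley step, independent of Numerics.NET (Math.Pow stands in for Elementary.Pow, and the mean anomaly here is an illustrative value, not the one computed above):

```csharp
using System;

double e = 0.9679221169240834;
double M = 0.3;  // illustrative mean anomaly in radians

// Mikkola's starting value, as in the snippet above.
double alpha = (1 - e) / (4 * e + 0.5);
double beta = 0.5 * M / (4 * e + 0.5);
double sqrt = Math.Sqrt(beta * beta + alpha * alpha * alpha);
double z = Math.Cbrt(beta + Math.CopySign(sqrt, beta));
double s = z - alpha / z;
s -= 0.078 * Math.Pow(s, 5) / (1 + e);
double E0 = M + e * s * (3 - 4 * s * s);

// One Halley step from E0.
var (sinE, cosE) = Math.SinCos(E0);
double f = E0 - e * sinE - M;
double fp = 1 - e * cosE;
double fpp = e * sinE;
double E1 = E0 - 2 * f * fp / (2 * fp * fp - f * fpp);

Console.WriteLine($"E0 = {E0}, after one Halley step: E1 = {E1}");
```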
Conclusion
Halley’s method as implemented in Numerics.NET offers another approach to solving equations in one variable, one that is particularly well suited to problems where the first and second derivatives of the function can be computed efficiently.