Nonlinear Curve Fitting

Nonlinear curve fitting extends linear curve fitting to curves whose parameters appear in the function expression in arbitrary ways, not just linearly. Almost any function that can be expressed in closed form can be used for nonlinear curve fitting. With this increased power comes the drawback that it is more difficult to estimate the parameters. It may not be possible to guarantee the best solution is found.

Mathematical background

A nonlinear curve with parameters a1, a2..., an is a curve of the form

f(x) = f(a1, a2, ..., an; x).

Nonlinear least squares fitting follows the same general method as linear least squares. However, no closed form algebraic expressions exist for the solution. The solution must be found by using a general multidimensional optimization algorithm to find the minimum of the sum of the squares of the residuals.

Even though the objective function is not as simple as in the linear case, the fact that it equals a sum of squares can be exploited to obtain more efficient methods. The Extreme Optimization Mathematics Library for .NET offers two algorithms: the Levenberg-Marquardt method and the trust region reflective algorithm.

Nonlinear Curves

The optimization algorithms require that the partial derivatives of the fitted curve with respect to the curve parameters be available. However, a numerical approximation may be used if the closed form is not available. Moreover, in .NET 4.0 and higher, it may be possible to compute the derivatives using Automatic Differentiation.

It may also be advantageous to compute initial values for the parameters based on the data.

The NonlinearCurve class defines two methods, FillPartialDerivatives and SetInitialValues, that enable this functionality.

The Extreme.Mathematics.Curves.Nonlinear namespace defines a number of nonlinear curves. This includes exponential, rational and logistic curves. Details are discussed later.

You can define your own non-linear curve in two ways.

The first is available only in .NET 4.0 and above. The NonlinearCurve class has a static FromExpression method that uses Automatic Differentiation to calculate the derivative of the function and the partial derivatives with respect to the curve parameters. The first argument of the lambda is the x value. The remaining arguments correspond to the function parameters. The resulting curve object can be used directly for nonlinear curve fitting or other purposes.

The following code constructs a 3-parameter logistic curve:

C#
NonlinearCurve curve = NonlinearCurve.FromExpression(
    (x, a, b, c) => c / (1 + a * Math.Exp(b * x)));

The GetGeneratorFromExpression method is similar to The FromExpression(Expression<Func<Double, Double>>), but returns a function that creates instances of the curve object. This is useful in situations where more than one instance of the curve is required.

The second method gives you more control. You can define a class that inherits from NonlinearCurve and implement the ValueAt and FillPartialDerivatives methods. The latter takes two arguments. The first is the x value where the partial derivatives should be evaluated. The second is a Vector whose elements are, on exit, the partial derivatives with respect to each of the curve parameters. If you don't override FillPartialDerivatives, then a numerical approximation is used.

The example below defines a hyperbola 1 / (x + c), that takes one argument c.

C#
public class Hyperbola : NonlinearCurve
{
    public Hyperbola() : base(1)
    {
        Parameters[0] = 1;
    }

    override public double ValueAt(double x)
    {
        return 1 / (x + Parameters[0]);
    }

    override public double SlopeAt(double x)
    {
        x += Parameters[0];
        return -1 / (x * x);
    }

    public override void FillPartialDerivatives(double x, DenseVector<double> derivatives)
    {
        x += Parameters[0];
        derivatives[0] = -1 / (x * x);
    }
}

The curve constructor must call the base constructor with as its only argument the number of parameters of the curve. For calculations, the curve parameters are accessed through the Parameters collection.

Depending on the application, it may be useful to override other methods of the Curve class. For example, for the above curve, the Integral is easily calculated as Math.Log(x+c).

The NonlinearCurveFitter class

The NonlinearCurveFitter class performs a nonlinear least squares fit. To perform the fit, a NonlinearCurveFitter needs data points, and a curve to fit. You must set the Curve property to an instance of a NonlinearCurve object. This can be an instance of any of the classes in the Extreme.Mathematics.Curves.Nonlinear namespace, or you can supply your own, as discussed in the previous section.

The data is supplied as Vector objects. The XValues and YValues properties specify the X and Y values of the data points, respectively.

By default, the fit is unweighted. All weights are set equal to 1. Weights can be specified in one of two ways. The first way is to set the WeightVector property to a vector with as many elements as there are data points. Each component of the vector specifies the weight of the corresponding data point.

Alternatively, the WeightFunction can be used to calculate the weights based on the data points. The WeightFunctions class contains a number of predefined weight functions listed in the table below. In the descriptions, x and y stand for the x and y values of each data point.

Field

Description

OneOverX

The weight equals 1/x.

OneOverXSquared

The weight equals 1/x2.

OneOverY

The weight equals 1/y.

OneOverYSquared

The weight equals 1/y2.

If error estimates for the Y values are available, then the GetWeightVectorFromErrors of the WeightFunctions class can be used to construct a vector with the weights that correspond to the errors. Its only argument is a Vector that contains the error for each observation.

The algorithm that is used to compute the least squares solution can be selected by setting the the Method. By default, the Levenberg-Marquardt method is used, but the trust region reflective algorithm is also available. The Optimizer property gives access to the corresponding multidimensional optimizer object. You can set tolerances and control convergence as set out in the chapter on optimization.

The Fit method performs the actual least squares calculation, and adjusts the parameters of the Curve to correspond to the least squares fit. The BestFitParameters property returns a Vector containing the parameters of the curve. The GetStandardDeviations returns a Vector containing the standard deviations of each parameter as estimated in the least squares calculation.

The Optimizer property also gives access to details of the calculation. It is useful to inspect its Status property to make sure the algorithm ended normally. A value other than Converged indicates some kind of problem with the algorithm.

Example: Fitting a Dose Response Curve

In this example, we fit dose response data to a four parameter logistic curve.

C#
Vector<double> dose = Vector.Create<double>(/*...*/);
Vector<double> response = Vector.Create<double>(/*...*/);
Vector<double> error = Vector.Create<double>(/*...*/);
FourParameterLogisticCurve doseResponseCurve
    = new FourParameterLogisticCurve();
NonlinearCurveFitter fitter = new NonlinearCurveFitter();
fitter.Curve = doseResponseCurve;
fitter.XValues = dose;
fitter.YValues = response;
fitter.WeightVector = WeightFunctions.GetWeightVectorFromErrors(error);
fitter.Fit();
Vector<double> s = fitter.GetStandardDeviations();
Console.WriteLine("Initial value: {0,10:F6} +/- {1:F4}",
    doseResponseCurve.InitialValue, s[0]);
Console.WriteLine("Final value:   {0,10:F6} +/- {1:F4}",
    doseResponseCurve.FinalValue, s[1]);
Console.WriteLine("Center:        {0,10:F6} +/- {1:F4}",
    doseResponseCurve.Center, s[2]);
Console.WriteLine("Hill slope:    {0,10:F6} +/- {1:F4}",
    doseResponseCurve.HillSlope, s[3]);

Two things are worth noting. One is the use of the GetWeightVectorFromErrors method to transform the errors of the observations into their corresponding weights. Also, the FourParameterLogisticCurve class exposes properties that name the curve properties. Many of the predefined curves define such named parameters.