Fix some small positive . Say we want to approximate (i.e. the th derivative of at ), using the distinct samples , , . (Usually all will be integers, but this isn’t critical.) Specifically, we’d like to find coefficients so that
Each sample can be expanded in a Taylor series about .
So the linear combination of samples can be written as
The only term of the outer sum that contains is . To ensure that its coefficient is one, we impose
In an ideal world, for all , we could impose
This would make our approximator perfect (i.e. not an approximation at all). But we can’t expect to satisfy an infinite number of equations with only unknowns, so we have to restrict the that we apply it to. In particular, as long as , we can impose this constraint for all (non-negative) such that and (since then we have unknown coefficients and equations). Then our approximation expands as
where
At first glance, this remainder appears to be order in . But the coefficients we solved for could have factors of that change this. Indeed, they’ll usually each have a factor of . So the remainder ends up being order , which is at least one (recall that so that the system isn’t overdetermined). More often than you might expect, we get lucky and for some , which makes the order greater than . (Try using and .) To guarantee that we get a higher order approximation, we can always add more samples (thus increasing , and the number of constraints we can impose).
The constraints that we imposed form a system of linear equations that we can write as
where , are the unknown coefficients as defined above, and
Note that is a matrix, and both and are vectors of length .
The rows of will always be linearly independent. To prove this, consider the polynomial
As long as at least one is nonzero, is a nonzero polynomial of order at most , and thus has at most distinct zeros. Thus there is no set of coefficients such that for all samples. In other words,
for at least one sample , as long as some is nonzero. This is the definition of linear indedence for the rows of . As such the system will always be consistent. It can be solved using an LU decomposition.
The same basic theory works in the multivariate case, the algebra just gets a little more complex. Let be a function of variables. Given a multi-index in , define
Then the multivariate Taylor expansion of a sample about (where denotes the element-wise or Hadamard product) is
We can restrict this to a certain order by restricting the order of .
So the linear combination of samples becomes
And to approximate a particular mixed partial derivative we impose the constraints
and, for as many as possible,
Finally, if instead of approximating a single mixed partial derivative, we’d like to approximate a linear combination of them, we can encode this directly into our scheme. This is useful, for example, for approximating the Laplacian of a function.
At the end of the day, we’re still solving a system of linear equations that we can write as
where are the unknown coefficients. But defining and is a little trickier.
In particular, in the univariate case, to constrain derivatives zero through , we needed to satisfy equations. But in the multivariate case, the number of mixed partial derivatives of order grows nonlinearly. In fact, the number of multi-indices in of order is the binomial coefficient
And the number of multi-indices with order less than or equal to is
Additionally, the multi-index exponentiation introduces a new way for the rows of to be linearly dependent.
To get around these issues, we manually choose multi-indices corresponding to the mixed partial derivatives that we want to constrain. One of these should be , certainly (the derivative we want to approximate). And the approximation is useless unless all mixed partials of order less than or equal to are constrained to be zero, so they need to be in the mix as well.
Then
and
If the resulting matrix is singular, just swap out one constraint for another. Note that you should also pay attention to your : if the values are the same for two linearly dependent rows of , then that constraint will still be satisfied even after it is removed (the system was underdetermined). But if the values of are different, then your system is inconsistent. This likely means you’re not respecting some symmetry of the problem (assuming you haven’t made a typo).
Luckily, beyond this annoyance, having be singular is actually helpful for us. In the general case, we’d need choose samples in order to constrain all derivatives up to order . But if these equations end up linearly dependent, it means they’ll all be satisfied and we only need to include a subset of them in . This means we can include some higher order derivatives as well, or reduce the number of samples.
To stay general, I will only consider the multivariate case.
First, an obvious point that’s worth stating explicitly. The coefficients in an approximation
do not depend on , so we can move arbitrarily. This is essential, since it means we can apply the same approximation to every point in a discretized PDE.
Second, recall that the expansion of an approximation can be written as
So the order of the approximation can be found by finding the lowest order such that
However this doesn’t give you the order of the error directly, since the coefficients will have powers of in their denominators. So you need to divide it out to find the actual order of the error.
The leading error term is (again, consider the powers of in the coefficients)
Note that there may be multiple of the order of the error; you may want to sum over them.