Evaluating and Solving Kernel Operators with Eigen

Creating Dense Operators
Creating Sparse Operators
Creating Chebyshev Operators
Creating Fast Multipole Method Operators
Creating Hierarchical Matrix Operators
Block Operators
Iterative Solvers

Given that Aboria can describe non-linear operators, this naturally covers linear operators (i.e. matrices) as well. Consider the summation operator given by the kernel function $K(x_i,x_j)$, over a set of $N$ particles

$$ a_i = \sum_j^N b_j K(x_i,x_j) \text{ for } i=1..N $$

This is a common enough operator that can be used in many areas. If $K(x_i,x_j) = 1/(x_j-x_i)$, then the operator might be calculating the force on a set of charged particles via a Coulomb force. If $K(x_i,x_j) = \sqrt{(x_j-x_i)^2 + c^2}$, then the operator might be used for function interpolation using the multiquadric basis function.

One way to evaluate this operator is to use a matrix to store the values of $K(x_i,x_j)$ for each particle pair, leading to a matrix $\mathbf{K}$ with storage size $N^2$. Then the summation operator above is equivalent to the matrix-vector product

$$ \mathbf{a} = \mathbf{K} \mathbf{b}. $$

However, $\mathbf{K}$ could be too large to fit in memory, or the values of $x_i$ and $x_j$ might change too frequently for this to be useful. Or you may wish to take advantage of the fact that $K(x_i,x_j)$ is a continuous function and use method such as Chebyshev interpolation or Fast Multipole methods to efficiently calculate the action of the operator on the vector $\mathbf{b}$. For any of these reasons and many others, Aboria can help you out.

Aboria provides functionality to describe linear operators arising from the evaluation of kernel functions at a set of points (i.e. particles in Aboria's terminology) in $N$ dimensional space. From now on we will refer to these types of operators as kernel operators. To provide the concept and API of a matrix or linear operator, we will use the C++ linear algebra library Eigen. Aboria provides functionality to wrap kernel operators as Aboria::MatrixReplacement, so that Eigen can treat them as normal dense or sparse matrices.

Using this wrapping, we can take advantage of Eigen to:

calculate the action of a kernel operator on a vector (i.e. a matrix-vector multiplication) # use Eigen's iterative solvers to solve a large linear system of equations arising from a kernel operator.

Creating Dense Operators

The most general case involves a kernel operator $K$ that is non-zero for every possible particle pair. The kernel function can depend on the particle positions and/or the variables assigned to each particle. For example, say we had a particle set with particle positions $\mathbf{x}_i$ for $i=1..N$, and a single variable $a_i$. We wish to create a summation operator using the kernel function

$$ K(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j) = \frac{a_i a_j}{||\mathbf{x}_j-\mathbf{x}_i|| + \epsilon} $$

were $||.||$ refers to the 2-norm, or magnitude of a vector.

First we need a particle set to apply the operator to. We will create a particle set containing $N=100$ particles with a single additional variable $a$.

const size_t N = 100;
const double epsilon = 0.1;
ABORIA_VARIABLE(a, double, "a");
typedef Particles<std::tuple<a>> particle_type;
typedef particle_type::position position;
particle_type particles(N);
std::default_random_engine gen;
std::uniform_real_distribution<double> uniform(0, 1);
for (size_t i = 0; i < N; ++i) {
  get<position>(particles)[i] =
      vdouble3(uniform(gen), uniform(gen), uniform(gen));
  get<a>(particles)[i] = uniform(gen);
}

For convenience, we will also define a constant reference type to refer to each particle in the container

typedef particle_type::const_reference const_particle_reference;

We then create a dense kernel operator using the Aboria::create_dense_operator function

auto K = create_dense_operator(
    particles, particles,
    [epsilon](const_particle_reference i, const_particle_reference j) {
      const auto dx = get<position>(j) - get<position>(i);
      return (get<a>(i) * get<a>(j)) / (dx.norm() + epsilon);
    });

Note that Aboria::create_dense_operator takes three arguments. The first two are particle containers which give the two particle sets involved in the operator. The first container holds the particles indexed by $i$ in the kernel function, and the second holds the particles indexed by $j$. For a matrix representation, you might say that these refer to the rows and columns of the matrix.

The third argument to Aboria::create_dense_operator can be a function object, or C++ lambda expression. Basically any valid C++ object that can be called with two arguments, the first of type const_particle_reference (i.e. a constant reference to a particle in the set indexed by $i$), and the second of type const_particle_reference (i.e. a constant reference to a particle in the set indexed by $j$). Note that in this case the particle sets indexed by $i$ and $j$ is the same particle set particles. However, in other cases you may want $i$ and $j$ to index different particle sets, in which case the types for arguments 1 and 2 could be different.

In the code above, we are using a lambda expression as our function object, and create one that returns the particular kernel function $K(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j)$.

Once we have created the operator K, we can use it within Eigen as if it were a normal matrix. For example, to apply K to a vector b, we could write the following

Eigen::VectorXd b = Eigen::VectorXd::LinSpaced(N, 0, 1.0);
Eigen::VectorXd c_1 = K * b;

This line of code calculates the following

$$ c_i = \sum_j^N b_j K(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j) \text{ for } i=1..N $$

Note that rather then storing all the values of K that are needed for this summation, Aboria will instead evaluate these values as they are needed. Therefore the memory requirements are only $\mathcal{O}(N)$, rather than $\mathcal{O}(N^2)$ for a traditional matrix. However, this requires evaluating the kernel function for each pair of particles at a cost of $\mathcal{O}(N^2)$. If you wish to calculate this operation approximately, then Aboria can perform the same operation using the Fast Multipole Method or Hierarchical Matrices, please see subsequent sections for more details on how to do this.

	Caution
	the `Aboria::MatrixReplacement` operator `K` cannot be used, for example, in multiplications or additions with other Eigen matrices. Thus far, it has only really been tested with matrix-vector multiplication and Eigen's iterative solvers

If we wish to perform the same operator, but using a traditional matrix, we can use K's Aboria::MatrixReplacement::assemble member function to fill in a normal Eigen matrix with the values of the kernel function $K(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j)$. This might be useful if you wish to perform the same operation repeatedly, or in one of Eigen's direct solvers.

Eigen::MatrixXd K_eigen(N, N);
K.assemble(K_eigen);

Eigen::VectorXd c_2 = K_eigen * b;

Creating Sparse Operators

Is is common in particle-based methods that the kernel function $K$ be non-zero only for particle pairs separated by less than a certain radius. In this case we have a summation operation like so

$$ c_i = \sum_j^N b_j K_s(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j) \text{ for } i=1..N $$

where $K_s$ is a truncated version of $K$ that is only non-zero for $||\mathbf{dx}_{ij}||<r$, where $\mathbf{dx}_{ij}$ is the shortest vector between particles $i$ and $j$. Note that for non-periodic systems, this will be $\mathbf{dx}_{ij}=\mathbf{x}_j-\mathbf{x}_i$.

$$ K_s(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j) = \begin{cases} K(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j), & \text{for } ||\mathbf{dx}_{ij}||<r \\0 & \text{otherwise}. \end{cases} $$

Since the summation is only non-zero for $||\mathbf{dx}_{ij}||<r$, we wish to aim for better than $\mathcal{O}(N^2)$ time and combine the sum with a spatial search of radius $r$.

Lets assume that we wish a similar kernel function as before

$$ K(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j) = \frac{a_i a_j}{||\mathbf{dx}_{ij}|| + \epsilon} $$

We can create the operator K_s in Aboria like so (setting $r=0.1$ in this case)

const double r = 0.1;
auto K_s = create_sparse_operator(
    particles, particles, r,
    [epsilon](const vdouble3 &dx, const_particle_reference i,
              const_particle_reference j) {
      return (get<a>(i) * get<a>(j)) / (dx.norm() + epsilon);
    });

Sparse operators support periodic domains, and therefore we now need the dx argument in the kernel function object. This is a distance value that gives the shortest vector from particle i to particle j.

When applied to a vector, this operator will use the neighbour search of the particles container to perform a neighbour search for all particle pairs where $||\mathbf{dx}_{ij}||<r$.

Before we can use this operator, we need to make sure that the neighbour search for particles is initialised. By default, the particle container was created using three spatial dimensions, so we need to set up a domain from $(0,0,0)$ to $(1,1,1)$ which is not periodic in all three directions.

vdouble3 min = vdouble3::Constant(0);
vdouble3 max = vdouble3::Constant(1);
vbool3 periodic = vbool3::Constant(false);
particles.init_neighbour_search(min, max, periodic);

Once this is done, we can then apply the operator to the vector b from before

Eigen::VectorXd c_3 = K_s * b;

Once again, we can write out K_s to a traditional matrix. This time, we will write out the values of K_s to a sparse matrix, so we can still obtain an efficient operator

Eigen::SparseMatrix<double> K_s_eigen(N, N);
K_s.assemble(K_s_eigen);

Eigen::VectorXd c_4 = K_s_eigen * b;

Creating Chebyshev Operators

Lets assume that the kernel function of interest depends only on the particle positions

$$ c_i = \sum_j^N b_j K(\mathbf{x}_i,\mathbf{x}_j) \text{ for } i=1..N $$

and that we can approximate $K(\mathbf{x}_i,\mathbf{x}_j)$ using interpolation. That is, if $w_l(x)$ denotes a set of interpolating functions, then

$$ K(\mathbf{x}_i,\mathbf{x}_j) \approx \sum_l \sum_m K(\mathbf{x}_l,\mathbf{x}_m) w_l(\mathbf{x}_i) w_m(\mathbf{x}_j) $$

Using this idea, we can use chebyshev interpolation to interpolate $K(\mathbf{x}_i,\mathbf{x}_j)$ at the chebyshev nodes, leading to an operator that can be applied to a vector at a reduced cost. How reduced depends on the number of chebyshev nodes that are chosen. A small number of nodes means that the error in the approximation grows, while a larger number of nodes means increased computational cost.

Note that this idea has been published in the following paper, which was used as a reference for Aboria's implementation:

Fong, William, and Eric Darve. "The black-box fast multipole method." Journal of Computational Physics 228.23 (2009): 8712-8725.

One restriction on the kernel function in this context is that it must be a smooth function of only position. With this in mind, we will define the following kernel function

$$ K(\mathbf{x}_i,\mathbf{x}_j) = \sqrt{||\mathbf{dx}_{ij}|| + \epsilon} $$

which is the multiquadric radial basis function.

A kernel operator using this function and chebyshev interpolation can be created in Aboria using the Aboria::create_chebyshev_operator function

const unsigned int n = 10;
auto K_c = create_chebyshev_operator(
    particles, particles, n,
    [epsilon](const vdouble3 &i, const vdouble3 &j) {
      return std::sqrt((j - i).norm() + epsilon);
    });

where n sets the number of chebyshev nodes in each dimension. Here the function object given to Aboria::create_chebyshev_operator must have two arguments representing the positions $\mathbf{x}_i$ and $\mathbf{x}_j$.

	Caution
	Once `K_c` is created, it assumes that the positions of the particles in `particles` do not change. If this is not true, then you can use the function of the underlying kernel class `Aboria::KernelChebyshev` to update the positions.

	Note
	Aboria's neighbourhood searching does not need to be initialized in this case, as no neighbourhood queries are used.

As before, this operator can be applied to an Eigen vector

Eigen::VectorXd c_5 = K_c * b;

Creating Fast Multipole Method Operators

One disadvantage of using chebyshev interpolation to construct an operator is that it is not valid if the kernel function is discontinuous. This is violated by many kernels which contain singularities when $\mathbf{x}_i = \mathbf{x}_j$. Also, often large numbers of chebyshev nodes are required, which does not reduce the computational cost sufficiently while retaining accuracy.

Another powerful class of methods are based on the Fast Multipole Method, which has been also implemented in Aboria. These are valid for kernel functions with singularities and which boasts $\mathcal{O}(N)$ complexity.

Once again, we will use chebyshev interpolation, but now will construct a tree data structure using one of Aboria's tree data structures (kd-tree or oct-tree). When the kernel operator is applied to a vector, the fast multiplole algorithm will be invoked, which leads to traversals up and down the tree, using chebyshev interpolation to compress the kernel function for clusters of particles that are well separated. The interaction of particle clusters that are not well separated are calculated directly. The details of this are in the following article, which was used as a reference for Aboria's implementation:

Fong, William, and Eric Darve. "The black-box fast multipole method." Journal of Computational Physics 228.23 (2009): 8712-8725.

A kernel operator using the fast multipole method can be created in Aboria using the Aboria::create_fmm_operator function

const unsigned int n2 = 4;
auto K_f = create_fmm_operator<n2>(
    particles, particles,
    [epsilon](const vdouble3 &i, const vdouble3 &j) {
      return std::sqrt((j - i).norm() + epsilon);
    },
    [epsilon](const_particle_reference i, const_particle_reference j) {
      return std::sqrt(
          ((get<position>(j) - get<position>(i)).norm() + epsilon));
    });

where n2 sets the number of chebyshev nodes in each dimension. Note that this is typically much less than what is required by the chebyshev operator discussed in the previous section. Here we need to pass two function objects to Aboria::create_fmm_operator, both of which represent the kernel applied in different situations. The first takes two positions that are guarenteed to be not equal, and which in FMM terminology are used to create the M2L (multipole to local) operators. The second function object takes two particle references from the row and column particle sets, and is used to create the P2P (particle to particle) operator. Note that the position of i and j, and/or the particles themselves, might be identical. Therefore this function object must handle any kernel singularities.

	Note
	the `Aboria::create_fmm_operator` function creates a kernel operator in which the fast multipole algorithm is applied completely "on the fly". The fmm can be significantly speeded up for static particle positions by storing a hierarchical matrix. Please see the next section for more details of how to do this in Aboria.

As before, this operator can be applied to an Eigen vector

Eigen::VectorXd c_6 = K_f * b;

Creating Hierarchical Matrix Operators

The fast multipole method (fmm) results in multiple tree traversals while applying linear operators to the multipole vectors. Rather than calculating these linear operators during the algorithm, it is also possible to pre-calculate them and store them in a hierarchical H2 matrix. The advantage of this is that applying the operator to a vector is much faster, the downside is that it assumes that the particle positions do not change.

A kernel operator which does this can be created using the Aboria::create_h2_operator function

auto K_h = create_h2_operator(
    particles, particles, n2,
    [epsilon](const vdouble3 &i, const vdouble3 &j) {
      return std::sqrt((j - i).norm() + epsilon);
    },
    [epsilon](const_particle_reference i, const_particle_reference j) {
      return std::sqrt(
          ((get<position>(j) - get<position>(i)).norm() + epsilon));
    });

As before, this operator can be applied to an Eigen vector

Eigen::VectorXd c_7 = K_h * b;

Block Operators

It is common that you would like to compose operators in a tiled or block format, and Aboria provides a functionality to do this using the Aboria::create_block_operator.

Let us assume that we wish to compose the two operators K and K_s from before, and want to perform the following combined operator

$$ \begin{align} e_i &= \sum_j^N d_j K_s(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j) \text{ for } i=1...N \\ e_{i+N} &= \sum_j^N d_{j+N} K(\mathbf{x}_i,a_i,\mathbf{x}_j,a_j) \text{ for } i=1...N \end{align} $$

where $e_i$ and $d_j$ are elements of vectors $\mathbf{e}$ and $\mathbf{d}$ of size $2N$. Using matrix notation, and using $\mathbf{K}$ and $\mathbf{K}_s$ to represent the operators K and K_s, this is equivalent to

$$ \mathbf{e} = \begin{pmatrix} \mathbf{K}_s & 0 \\\ 0 & \mathbf{K} \end{pmatrix} \mathbf{d} $$

We first need operators representing the zero matrices in the upper right and lower left corners of the block operator. We create these in Aboria like so

auto Zero = create_zero_operator(particles, particles);

and then we can create the block operator Full like so

auto Full = create_block_operator<2, 2>(K_s, Zero, Zero, K);

Finally we can create vectors e and d and apply the block operator

Eigen::VectorXd d(2 * N);
d.head(N) = Eigen::VectorXd::LinSpaced(N, 0, 1.0);
d.tail(N) = Eigen::VectorXd::LinSpaced(N, 0, 1.0);

Eigen::VectorXd e = Full * d;

Iterative Solvers

The Aboria::MatrixReplacement class can multiply other Eigen vectors, and can be used in Eigen's iterative solvers. Both Eigen::IdentityPreconditioner and Eigen::DiagonalPreconditioner preconditioners are supported. Below is an example of how to use Eigen's GMRES iterative solver to solve the equation

$$\mathbf{c} = \mathbf{K} \mathbf{h}$$

for input vector $\mathbf{h}$.

We can simply pass the dense operator K to Eigen's GMRES iterative solver like so

Eigen::GMRES<decltype(K), Eigen::DiagonalPreconditioner<double>>
    gmres_matrix_free;
gmres_matrix_free.setMaxIterations(2 * N);
gmres_matrix_free.set_restart(2 * N + 1);
gmres_matrix_free.compute(K);
Eigen::VectorXd h_1 = gmres_matrix_free.solve(c_1);

This will solve the equation in a matrix-free fashion. Alternatively, we can use the normal matrix K_eigen that we assembled previously to solve the equation

Eigen::GMRES<decltype(K_eigen), Eigen::DiagonalPreconditioner<double>>
    gmres_matrix;
gmres_matrix.setMaxIterations(2 * N);
gmres_matrix.set_restart(2 * N + 1);
gmres_matrix.compute(K_eigen);
Eigen::VectorXd h_2 = gmres_matrix.solve(c_1);