
CMSC878R/AMSC698R/MAIT 627

  • WIKI for the joint course project

FMM Solver & Data Structures: Benjamin Silbaugh

Objective

The main objective of this task is to develop a generalized MLFMM solver, with a supporting <math>2^d</math>-tree data structure, that can be seamlessly integrated into the Matlab computing environment. The MLFMM solver will be integrated with the other Matlab-FMM library components described below to yield a comprehensive Matlab FMM toolbox. The solver supports generalized kernel definitions and variable spatial dimension.

The MatFMM Framework

The Matlab Fast Multipole Solver (MatFMM) has been designed as a generalized framework for exploring expansion and translation techniques for a wide range of kernels, and as a tool for accelerating computations which require the evaluation of matrix-vector products with FMM-compatible properties. An Object Oriented Programming (OOP) design philosophy has been followed to achieve a code base that is easy to extend, and easy to introduce as a component in other Matlab solvers. The main component of MatFMM is the MLFMM object, which is directly used by the user to evaluate matrix-vector products. Two key subcomponents of the MLFMM object are the Tree object (an abstraction of a <math>2^d</math>-Tree) and the kernel library. Note that the new Matlab OOP syntax introduced in version 7.5 has not been used, for portability reasons; however, the new capability to subclass from the Matlab handle class and pass objects by reference should, in principle, improve the run-time performance. Further details regarding the implementation and usage of these components are provided in the sections that follow. A more comprehensive review of the implementation is provided in the report MatFMM.

Tree Class

The MatFMM Tree class represents an abstraction of a <math>2^d</math>-tree data structure. The construction of a Tree object is quite simple:

>> T = Tree(Lmax, d);

or

>> T = Tree(tag, var, d, Xset);

In the first case a <math>2^d</math>-tree with Lmax levels is constructed with the child, neighbor, and E4-neighbor sets fully populated. In the second case, the source/target point set is automatically parsed with respect to each box at Lmax, and the non-empty child, neighbor, and E4-neighbor sets for each box are constructed. Note that if tag = 'Lmax' then var is interpreted as the maximum level. If tag = 's' then var is interpreted as the grouping parameter, and Lmax is computed to ensure that there are no more than s points in each box at the finest level.
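For example, a tree over a random 2-D point set might be built as follows (the point set is hypothetical; per the release notes below, Xset is stored as an N-by-d array):

 >> Xset = rand(1000, 2);        % 1000 random points in 2-D
 >> T = Tree('s', 10, 2, Xset);  % at most 10 points per box at the finest level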

Data associated with box n at level l is accessed through subscript reference operators:

>> p = T(n,l,'parent');  % returns parent box

>> ch = T(n,l,'children');  % returns set of children boxes

>> ne = T(n,l,'neighbors'); % returns set of neighbor boxes

>> E4 = T(n,l,'E4');  % returns set of E4-neighbors

>> xc = T(n,l,'center');  % returns box center

If a Tree expression is evaluated without a terminating ';', the display method of the Tree is invoked and the entire contents of the tree are displayed.


Kernel Library

The kernel library consists of a set of directories, each of which contains a set of functions that construct and evaluate the S and R expansions and the (S|S), (S|R), and (R|R) translations. Since Matlab is dynamically typed, and the construction and evaluation of S, R, (S|S), (S|R), and (R|R) are performed by user-defined functions, the user is free to choose an appropriate data structure to represent the respective expansions and translations (e.g. 1D vector storage instead of a 2D array).

For example, when the solver needs to construct the multipole-to-multipole (S|S) translation operator, it calls a buildSS function and receives a data object (e.g. array, structure, ...):

>> SS = buildSS(t,p); % Here SS could be a 1D vector, multi-dimensional array, structure, etc

To evaluate the (S|S)(t) operator constructed by buildSS, the companion function, applySS, is called to construct the translated vector of coefficients:

>> C = applySS(SS,C); % Computes translated S-coefficient vector
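As an illustration, a minimal sketch of such a pair for the 1-D Cauchy kernel is given below, assuming dense p-by-p matrix storage and the (S|S) coefficients derived in the Translations section; the functions actually shipped in the kernel library may use different storage. (Each function would live in its own .m file inside the kernel directory.)

 function SS = buildSS(t, p)
 % Dense (S|S) operator for the 1-D Cauchy kernel:
 % (S|S)_{nm}(t) = n!/(m!(n-m)!) (-t)^(n-m) for n >= m, and 0 otherwise.
 SS = zeros(p);
 for n = 0:p-1
     for m = 0:n
         SS(n+1, m+1) = nchoosek(n, m) * (-t)^(n - m);
     end
 end

 function C = applySS(SS, C)
 % With dense storage, applying the translation is a matrix-vector product.
 C = SS * C;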

Translation Class

A number of numerical algorithms require multiple matrix-vector product evaluations (e.g. iterative linear solvers). For cases where FMM may be used, a significant portion of the total CPU time is consumed by re-computing translation operators. Even if only one matrix-vector product evaluation is needed, only a handful of the SS, SR, and RR translation operators are unique. Hence, it is desirable to introduce a kind of memory into the computation of translation operators. This is the purpose of the Translation class. In essence, a Translation object serves as a wrapper around the kernel library and dynamically generates a look-up table of translation matrices. Thus, when a translation is repeated, an instance of the Translation class will simply re-use the previously computed value.

An instance of the Translation class may be created using the following constructor

>> T = Translation(type, p)

where type = 'SS', 'SR' or 'RR' and p is the truncation number. To evaluate a translation, simply invoke the translate method; e.g. an SS translation would be evaluated as

>> [C, SS] = translate(SS,t,C)

A new value for SS will be computed only if the translation vector does not match a previously encountered value.
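For example (the values of t, p, and C are assumed to be defined):

 >> TSS = Translation('SS', p);
 >> [C1, TSS] = translate(TSS, t, C);  % first call: computes and stores (S|S)(t)
 >> [C2, TSS] = translate(TSS, t, C);  % same t: the stored operator is re-used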

MLFMM Class

To use the MLFMM, begin by placing the main MatFMM directory in the Matlab path:

>> addpath /user/name/somedirectory/MatFMM

Then create an instance of the MLFMM object

>> A = MLFMMsolver(Xset, Yset, s, p, kernelName);

where for the time being kernelName = 'Cauchy'. To evaluate a matrix-vector product, simply use the conventional matrix-vector product operator

>> V = A*U;

Thus, the MLFMM object may be embedded into another Matlab solver which uses the * operator. Note that the * operator simply invokes the evalProduct(A,U) method of the MLFMM object. So, if a solver requires a function handle, wrapping evalProduct as shown below should suffice.
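A minimal sketch of such a wrapper (nothing beyond the constructor arguments above is assumed):

 >> afun = @(u) evalProduct(A, u);  % afun(u) returns A*u via the FMM
 >> V = afun(U);                    % equivalent to V = A*U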

A demonstration using the Cauchy kernel is provided in MatFMM/demo.

Note that the MLFMM object is currently named MLFMMsolver. This is a bit of a misnomer, and will be changed in the future to MLFMM.

Current Status

The current release of MatFMM should be fully functional and ready for use with arbitrary kernels. However, as of the time of writing (05/21/2008), the Laplace3D and Helmholtz2D kernels were still under development. These kernels have been included in the latest distribution for archival purposes.

Downloads

Release Date   Description                                                                             Download
05/21/2008     Renamed MLFMMsolver class to MLFMM                                                      MatFMM_20080521
05/14/2008     Introduced Laplace 3D kernel and Translation class                                      MatFMM_20080514
05/02/2008     Minor bug fixes in Tree object                                                          MatFMM_20080502
04/28/2008     Bug fixes (mostly due to Xset and Yset transposition) + addition of Laplace 2D kernel   MatFMM_20080428
04/24/2008     Switched Xset and Yset from dxN and dxM arrays to Nxd and Mxd arrays                    MatFMM_20080424
04/19/2008     First MatFMM release                                                                    MatFMM_20080419
04/10/2008     Original Fortran 95 implementation - Cauchy 1D only                                     FMMlib_20080410

Adaptive FMM: Gordon Rubin

Computing and storing coefficients for translation and evaluation on an empty box will not impact the result of the FMM. Furthermore, empty hyperboxes at one level of the FMM will always yield empty children boxes. The adaptive FMM tries to reduce these costs for a particular data distribution.

Click here for more details on this.

Incorporating FMM in MATLAB: Sebastian Thomas

FMM can be incorporated into the following MATLAB functions:

PCG:

The function X = pcg(A,b) solves the linear system A*x = b. A can be a function handle Afun such that Afun(x) returns A*x. It is in the efficient computation of this matrix-vector product that FMM can be introduced, as sketched below.
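A hedged sketch, with A an MLFMM object as in the previous section (pcg additionally requires the underlying matrix to be symmetric positive definite):

 >> afun = @(x) evalProduct(A, x);  % Afun(x) returns A*x via the FMM
 >> x = pcg(afun, b, 1e-6, 200);    % example tolerance and iteration cap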

GMRES:

gmres(A,b) attempts to solve A*x = b using the Generalized Minimum Residual Method. Again, A can be a function handle that returns the matrix-vector product.
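Correspondingly (the restart value is illustrative):

 >> x = gmres(afun, b, 20, 1e-6, 200);  % restarted GMRES with the FMM product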

Shown below is a representative plot illustrating the variation of CPU time with Matrix Dimension for the MATLAB function GMRES. The blue dashed line indicates the straightforward Matrix Vector Product while the green solid line shows the variation of CPU time for the method in which FMM is incorporated. The break-even point is close to 460.

Variation of CPU time with Matrix Dimension (GMRES).

EIGS:

eigs(A) returns a vector of A's six largest-magnitude eigenvalues.
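With a function handle, the problem size n must be passed explicitly; a sketch:

 >> d = eigs(@(x) evalProduct(A, x), n, 6);  % six largest-magnitude eigenvalues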

Shown below is a representative plot illustrating the variation of CPU time with Matrix Dimension for the MATLAB function eigs. The blue dashed line indicates the straightforward Matrix Vector Product while the green solid line shows the variation of CPU time for the method in which FMM is incorporated. The break-even point is close to 390.

Variation of CPU time with Matrix Dimension (EIGS).

Time Integrators

For particle dynamics related problems (Vortex Blob Methods, Stellar Dynamics), the velocity of each member of the particle set is a function of time and the spatial location of the remaining members of the set:

<math> \frac {d[x]}{dt} = f([x],t) </math>

The right hand side of the above equation is usually a matrix vector product, where the matrix is a function of the spatial location of the particle set and the vector is a function of the particle strengths. It is in this step that FMM can be incorporated.

Euler Method:

Let an initial value problem be specified as follows.

<math> y' = f(t, y), \quad y(t_0) = y_0. </math>

Then, the Euler method for this problem is given by the following equations:

<math> y_{n+1} = y_n + h k_1 </math>
<math> t_{n+1} = t_n + h. </math>

where <math> y_{n+1}</math> is the Euler approximation of <math> y(t_{n+1}) </math>, and

<math> k_1 = f \left( t_n, y_n \right) </math>
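A hedged sketch of how such a step might look for a particle set, assuming the interaction matrix must be rebuilt each step because the particle positions change (X is the N-by-d position array, U the strength vector; the variable names are illustrative):

 for step = 1:nsteps
     A = MLFMMsolver(X, X, s, p, 'Cauchy');  % positions have moved, so rebuild
     k1 = A * U;                             % k1 = f(t_n, y_n) via the FMM product
     X = X + h * k1;                         % Euler update
     t = t + h;
 end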


The gallery below illustrates the evolution of two vortex sheets having vorticity distributions of opposite signs. This was done using the Euler Integration Method with the Matrix Vector Product executed using FMM.

Gallery: snapshots of the two vortex sheets at T = 0 through T = 0.104, in increments of 0.008.

The figure below shows a variation of runtime with Matrix Size for the Euler Method.

CPU time vs N (Euler)

Fourth Order Runge Kutta

Let an initial value problem be specified as follows.

<math> y' = f(t, y), \quad y(t_0) = y_0. </math>

Then, the RK4 method for this problem is given by the following equations:

<math> y_{n+1} = y_n + {h \over 6} \left(k_1 + 2k_2 + 2k_3 + k_4 \right) </math>
<math> t_{n+1} = t_n + h </math>

where <math> y_{n+1}</math> is the RK4 approximation of <math> y(t_{n+1}) </math>, and

<math> k_1 = f \left( t_n, y_n \right) </math>
<math> k_2 = f \left( t_n + {h \over 2}, y_n + {h \over 2} k_1 \right) </math>
<math> k_3 = f \left( t_n + {h \over 2}, y_n + {h \over 2} k_2 \right) </math>
<math> k_4 = f \left( t_n + h, y_n + h k_3 \right) </math>
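In Matlab, one step reads directly off the equations above; here f is assumed to be a handle wrapping the FMM-evaluated right-hand side (a sketch, not MatFMM code):

 k1 = f(t, y);
 k2 = f(t + h/2, y + (h/2)*k1);
 k3 = f(t + h/2, y + (h/2)*k2);
 k4 = f(t + h,   y + h*k3);
 y  = y + (h/6)*(k1 + 2*k2 + 2*k3 + k4);
 t  = t + h;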



The same initial value problem as above was solved using the RK-4 method. Shown below is a variation of runtime with Matrix Size.

CPU time vs N (RK-4)

Approximate Inverse as a Preconditioner

For kernels that have an <math>R^{-1}</math> dependence, the contribution to the potential at a point decays with distance, and the locations of the 'large' entries in the matrix exhibit some structure. In addition, only a few entries have relatively large magnitudes (particles in the immediate neighborhood make the largest contributions - the same particles that we sum over directly in the last step of the FMM). The figure below illustrates this.

Magnitude Distribution in a Typical Cauchy Matrix. The highest magnitudes clearly appear along the diagonal

This means that a very sparse matrix is likely to retain the most relevant contributions to the actual inverse. However, it must be remembered that as the number of particles increases, the ratio of the number of large entries to the number of smaller entries decreases, and our approximation may no longer provide a good preconditioner.

Procedure

Each row of the matrix in an FMM matrix-vector product can be associated with a single evaluation point. The idea behind the approximate preconditioner is to take a small group of points (for the test case below, I have considered points within a neighborhood of chessboard radius 1) and solve a small linear system, at <math>O(q^3)</math> cost. We do this for all N points, and create an approximate inverse matrix by extending each q-vector to an N-vector by adding zeroes at the locations not considered.
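A sketch of this construction is given below; the helpers neighborIdx and buildLocalBlock are hypothetical stand-ins for the neighborhood search and the local q-by-q interaction matrix:

 N = size(X, 1);
 M = sparse(N, N);                   % sparse approximate inverse
 for i = 1:N
     idx = neighborIdx(X, i);        % q points within chessboard radius 1 (hypothetical)
     Phi = buildLocalBlock(X, idx);  % q-by-q interaction block (hypothetical)
     e = zeros(length(idx), 1);
     e(idx == i) = 1;                % unit vector picking out point i
     m = Phi \ e;                    % the small O(q^3) solve
     M(i, idx) = m.';                % extend the q-vector to an N-vector row
 end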

For the test case, a random source set was created. This is the same as the receiver set. The effect of a source on itself was assumed to be a constant positive number. The plot below shows the variation of the reciprocal of the condition numbers of the left-preconditioned and right-preconditioned matrices compared with that of the original matrix. Clearly, the modified matrices are much better conditioned.

Variation of condition number with Matrix Size

But how expensive is this approximate-inverse construction? The plot below shows the variation of CPU time with matrix size. It was found that the time taken by the FMM-based method scales as <math>N</math>, while for the straightforward method the scaling was <math>N^3</math>. (It must be noted, though, that over the range of matrix sizes investigated, the straightforward method is still faster than the FMM-based approximation method.)

Variation of CPU time with Matrix Size

Improved Fast Gauss Transform (IFGT): Balaji Vasan Srinivasan

In this project, an easy-to-use Matlab handle was developed to access the MEX-files available for performing fast summations with Gaussian kernels. Further, the IFGT was used to improve the performance of Kernel Density Estimation and Gaussian Process Regression. Finally, a Matlab-based GUI (Graphical User Interface) was developed to provide a tutorial and demo of the IFGT and its applications.

Click here for more details on this.

Translations: Kevin Brensike

The goal of this portion of the project is to provide S and R expansions along with S|S, R|R, and S|R translations for use by the MLFMM routine. The following kernels have been identified for implementation: Cauchy (1-D), Laplace (2-D and 3-D), and Helmholtz (2-D).

Problem Statement

First let <math>X=\{x_i\}_{i=1}^N</math> be the set of source points and <math>Y=\{y_i\}_{i=1}^M</math> be the set of target points.
As before, we use the FMM to implement a fast matrix-vector product of the form <math> v = \Phi u </math> where
<math> \mathbf{\Phi} = \begin{bmatrix} \Phi_{11} & \Phi_{12} & \cdots & \Phi_{1N} \\ \Phi_{21} & \Phi_{22} & \cdots & \Phi_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \Phi_{M1} & \Phi_{M2} & \cdots & \Phi_{MN} \end{bmatrix} </math>, <math>u \in C^N</math>, <math>v \in C^M</math>, and <math>\Phi_{ij}=\Phi (y_i-x_j)</math>.

In other words, we seek to compute the following:
<math>v_i=\sum_{j=1}^N \Phi_{ij}u_j</math>
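For reference, the direct O(MN) evaluation that the FMM accelerates can be written as follows for the 1-D case (x and y are column vectors of source and target points, and Phi is a kernel function handle, e.g. Phi = @(r) 1./r for the Cauchy kernel; a sketch, not MatFMM code):

 v = zeros(M, 1);
 for i = 1:M
     for j = 1:N
         v(i) = v(i) + Phi(y(i) - x(j)) * u(j);
     end
 end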

Kernels

Below is the list of kernels we were tasked to investigate in this project.
Cauchy: <math>\Phi (y-x)={{1} \over {y-x}}</math>
Laplace 2D: <math>\Phi (y-x)={{1} \over {2\pi}}{ln{{1} \over {|y-x|}}}</math>
Laplace 3D: <math>\Phi (y-x)={{1} \over {4\pi |y-x|}}</math>
Helmholtz 2D: <math>\Phi (y-x)={{i \over 4}H_0^{(1)}(k|y-x|)}</math>
Where <math>H_0^{(1)}(k|y-x|)</math> is the Hankel function of the first kind of order 0.

Cauchy Kernel 1D

S and R Expansions

S Expansion
Let <math>|y-x_*|>|x_i-x_*|</math>. Now
<math>\Phi (y-x_i)={1 \over {(y-x_*)-(x_i-x_*)}}={1 \over {(y-x_*)}}{1 \over {1-{(x_i-x_*) \over {(y-x_*)}}}}=\sum_{k=0}^\infty {(x_i-x_*)^k \over {(y-x_*)^{k+1}}}=\sum_{k=0}^\infty b_k(x_i,x_*) S_k(y-x_*)</math>
Where
<math>b_k(x_i,x_*)={(x_i-x_*)^k} \mbox{ for }k \ge 0</math>
<math>S_k(y-x_*)={1 \over {(y-x_*)^{k+1}}} \mbox{ for }k \ge 0</math>
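A quick numerical check of the truncated expansion is straightforward (p is the truncation number; xi, xstar, and y are scalar points; all names illustrative):

 k = (0:p-1).';
 b = (xi - xstar).^k;            % b_k(x_i, x_*)
 S = 1 ./ (y - xstar).^(k + 1);  % S_k(y - x_*)
 phiApprox = sum(b .* S);        % approaches 1/(y - xi) as p grows, when |y-x*| > |xi-x*|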


R Expansion
Let <math>|y-x_*|<|x_i-x_*|</math>. Now
<math>\Phi (y-x_i)={1 \over {(y-x_*)-(x_i-x_*)}}=-{1 \over {(x_i-x_*)}}{1 \over {1-{(y-x_*) \over {(x_i-x_*)}}}}=\sum_{k=0}^\infty- {{(y-x_*)^k} \over {(x_i-x_*)^{k+1}}}=\sum_{k=0}^\infty a_k(x_i,x_*) R_k(y-x_*)</math>
Where
<math>a_k(x_i,x_*)=-{1 \over {(x_i-x_*)^{k+1}}} \mbox{ for }k \ge 0</math>
<math>R_k(y-x_*)={(y-x_*)^k} \mbox{ for }k \ge 0</math>

S|S and R|R Translations

R|R Translation
<math>R_m(y-x_{*1})=((y-x_{*2})+(x_{*2}-x_{*1}))^m=\sum_{n=0}^m {{ m!} \over {n!(m-n)!}}{(x_{*2}-x_{*1})^{m-n}{(y-x_{*2})^n}}=\sum_{n=0}^m {(R|R)_{nm}(x_{*2}-x_{*1})}{R_n(y-x_{*2})}</math>
Thus we have
<math>(R|R)_{nm}(t)=\begin{cases}\frac{ m!}{n!(m-n)!}{t^{m-n}} & \mbox{ for } m \ge n \\ 0 & \mbox { for } m<n \end{cases}</math>
where <math>t=x_{*2}-x_{*1}</math>
S|S Translation
<math>{S_m(y-x_{*1}) }= {1 \over {(y-x_{*1})^{m+1}}} = {1 \over {((y-x_{*2})-(x_{*1}-x_{*2}))^{m+1}}}={1 \over {(y-x_{*2})^{m+1}}}{1 \over (1- {{{(x_{*1}-x_{*2})}} \over {(y-x_{*2})}})^{m+1}}=... </math>
<math>= {1 \over {(y-x_{*2})^{m+1}}} {\sum_{n=0}^\infty}{{(-1)^n(m+n)!{(x_{*2}-x_{*1})^n}} \over {m!n!(y-x_{*2})^n}}= {\sum_{n=0}^\infty}{{(-1)^n(m+n)!{(x_{*2}-x_{*1})^n}} \over {m!n!(y-x_{*2})^{m+n+1}}}=...</math>
<math> ={\sum_{n=m}^\infty}{{(-1)^{n-m}n!{(x_{*2}-x_{*1})^{n-m}}} \over {m!(n-m)!(y-x_{*2})^{n+1}}}=\sum_{n=m}^\infty {(S|S)_{nm}(x_{*2}-x_{*1})}{S_n(y-x_{*2})}</math>
Thus we have
<math>(S|S)_{nm}(t)=\begin{cases}\frac{ (-1)^{n-m} n!}{m!(n-m)!}{t^{n-m}} & \mbox{ for } m \le n \\ 0 & \mbox { for } m>n \end{cases}</math>
where <math>t=x_{*2}-x_{*1}</math>

S|R Translation

For <math>|x_{*2}-x_{*1}| > |y-x_{*2}|</math>
<math>{S_m(y-x_{*1}) }= {1 \over {(y-x_{*1})^{m+1}}} = {1 \over {((x_{*2}-x_{*1})-(x_{*2}-y))^{m+1}}}={1 \over {(x_{*2}-x_{*1})^{m+1}}}{1 \over (1- {{{(x_{*2}-y)}} \over {(x_{*2}-x_{*1})}})^{m+1}}=... </math>
<math>= {1 \over {(x_{*2}-x_{*1})^{m+1}}} {\sum_{n=0}^\infty}{{(-1)^n(m+n)!{(y-x_{*2})^n}} \over {m!n!(x_{*2}-x_{*1})^n}}= {\sum_{n=0}^\infty}{{(-1)^n(m+n)!{(y-x_{*2})^n}} \over {m!n!(x_{*2}-x_{*1})^{m+n+1}}}=...</math>
<math> =\sum_{n=0}^\infty {(S|R)_{nm}(x_{*2}-x_{*1})}{R_n(y-x_{*2})}</math>
Thus we have
<math>(S|R)_{nm}(t)={{(-1)^{n}(m+n)!}\over {m!n!{t^{m+n+1}}}}</math>
where <math>t=x_{*2}-x_{*1}</math>

Laplace's Equation and Green's Function in 2D

In n-dimensional space the Laplace Equation is defined:
<math>\nabla^2\varphi=0</math> where <math>\nabla^2 f = \sum_{i=1}^n \frac {\partial^2 f}{\partial x^2_i}</math>
The Laplace equation is a special case of the Helmholtz Equation
<math>\nabla^2\varphi+k^2\varphi=0</math>
where k = 0. This will be discussed in detail in the following sections. Before we derive the operators needed to perform the FMM, we note that the complex logarithm is defined by the following:
<math>\ln(x)=\ln|x|+i\,\mathrm{Arg}(x)</math>. Thus if we compute <math>\ln(x)</math> and take the real part, we obtain the Laplace kernel in 2D (up to the <math>1/(2\pi)</math> factor).
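With points represented as complex numbers, this observation is a one-liner (a sketch; the 1/(2*pi) factor matches the kernel table above):

 >> G = real(log(1 ./ (y - x))) / (2*pi);  % Re ln(1/(y-x)) = ln(1/|y-x|)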

S and R Expansions

S Expansion
Let <math>|y-x_*|>|x_i-x_*|</math>. Now
<math>G(y-x_i)=ln{1 \over {y-x_i}}=ln{{1 \over {y-x_*}}{1 \over {1-{{x_i-x_*} \over{y-x_*}}}}}=ln{1 \over {y-x_*}}+ln{1 \over {1-{{x_i-x_*} \over{y-x_*}}}}=ln{1 \over {y-x_*}}-ln({{1-{{x_i-x_*} \over{y-x_*}}}})=ln{1 \over {y-x_*}}+\sum_{k=1}^\infty{1 \over k}{{(x_i-x_*)^k} \over{(y-x_*)^k}}=\sum_{k=0}^\infty b_k(x_i,x_*) S_k(y-x_*)</math>
Where
<math>b_k(x_i,x_*)=\frac{(x_i-x_*)^k}{k!} \mbox{ for }k \ge 0</math>
<math>S_k(y-x_*)=\begin{cases}ln{1 \over {y-x_*}} & \mbox{ for } k=0 \\ \frac{(k-1)!}{(y-x_*)^k} & \mbox { for } k>0 \end{cases}</math>
R Expansion
Let <math>|y-x_*|<|x_i-x_*|</math>
<math>G(y-x_i)=ln{1 \over {y-x_i}}=ln\left({-1 \over {x_i-x_*}}{1 \over {1-{(y-x_*) \over {(x_i-x_*)}}}}\right)=ln{-1 \over {x_i-x_*}}-ln\left(1-{{(y-x_*)} \over {(x_i-x_*)}}\right)=ln{-1 \over {x_i-x_*}}+\sum_{k=1}^\infty {1\over k}{{(y-x_*)^k} \over{(x_i-x_*)^k}}=\sum_{k=0}^\infty a_k(x_i,x_*) R_k(y-x_*)</math>
Where
<math>a_k(x_i,x_*)=\begin{cases}ln{\frac{-1}{x_i-x_*}} & \mbox{ for } k=0 \\ \frac{(k-1)!}{(x_i-x_*)^k} & \mbox { for } k>0 \end{cases}</math>
<math>R_k(y-x_*)={{1} \over {k!}}(y-x_*)^k \mbox{ for }k \ge 0</math>

S|S and R|R Translations

R|R Translation
<math>R_m(y-x_{*1})={{1}\over {m!}}((y-x_{*2})+(x_{*2}-x_{*1}))^m=\sum_{n=0}^m {{1} \over {n!(m-n)!}}{(x_{*2}-x_{*1})^{m-n}{(y-x_{*2})^n}}=\sum_{n=0}^m (R|R)_{nm}(x_{*2}-x_{*1})R_n(y-x_{*2})</math>
Thus we have
<math>(R|R)_{nm}(t)=\begin{cases}\frac{1}{(m-n)!}{t^{m-n}} & \mbox{ for } m \ge n \\ 0 & \mbox { for } m<n \end{cases}</math>
where <math>t=x_{*2}-x_{*1}</math>
S|S Translation
<math>m = 0</math>
<math>S_0(y-x_{*1})=ln{1 \over {(y-x_{*2})-(x_{*1}-x_{*2})}}=ln{1 \over {(y-x_{*2})}}{1 \over {1-({{x_{*1}-x_{*2}} \over {y-x_{*2}}}})}=ln{1 \over {(y-x_{*2})}}-ln({{1-({{x_{*1}-x_{*2}} \over {y-x_{*2}}}})})=ln{1 \over {(y-x_{*2})}}+ \sum_{n=1}^\infty {(-1)^n\over n}{{(x_{*2}-x_{*1})^n} \over {(y-x_{*2})^n}}</math>
Thus we have
<math>(S|S)_{n0}(t)= {{(-1)^nt^n} \over n!}\mbox{ for } n \ge 0</math>
<math>m > 0</math>
<math>S_m(y-x_{*1})={(m-1)! \over{((y-x_{*2})-(x_{*1}-x_{*2}))^m}}={(m-1)! \over {(y-x_{*2})^m}}{1 \over {\left(1-{{(x_{*1}-x_{*2})} \over {(y-x_{*2})}}\right)^m}}={1 \over {(y-x_{*2})^m}}{\sum_{n=0}^\infty}{{(-1)^n(m+n-1)!(x_{*2}-x_{*1})^n} \over {n!(y-x_{*2})^n}}=...</math>
<math>={\sum_{n=0}^\infty}{{(-1)^n(m+n-1)!(x_{*2}-x_{*1})^n} \over {n!(y-x_{*2})^{n+m}}}={\sum_{n=m}^\infty}{{(-1)^{n-m}(n-1)!(x_{*2}-x_{*1})^{n-m}} \over {(n-m)!(y-x_{*2})^{n}}}=\sum_{n=m}^\infty (S|S)_{nm}(x_{*2}-x_{*1})S_n(y-x_{*2})</math>
Thus we have for all m
<math>(S|S)_{nm}(t)=\begin{cases}\frac{(-1)^{n-m}}{(n-m)!}t^{n-m} & \mbox{ for } n \ge m \\ 0 & \mbox { for } n < m \end{cases}</math>

S|R Translation

We use the S|S operator derived above to find this translation operator. Note that <math>|y-x_{*2}|<|x_{*2}-x_{*1}|</math>.
<math>S_m(y-x_{*1})=S_m((y-x_{*2})+(x_{*2}-x_{*1}))=\sum_{n=0}^\infty (-1)^nS_{n+m}(x_{*2}-x_{*1})R_{n}(y-x_{*2})=\sum_{n=0}^\infty (S|R)_{nm}(x_{*2}-x_{*1})R_{n}(y-x_{*2})</math>
Thus we have for all m
<math>(S|R)_{nm}(t)=\begin{cases}ln{1 \over {t}} & \mbox{ for } m+n=0 \\ \frac{(-1)^n(m+n-1)!}{(t)^{m+n}} & \mbox { for } m+n>0 \end{cases}</math>

Helmholtz Equation 2D

We will use the following Hankel addition theorem in our derivations:
If <math>|\rho'|<|\rho|</math>
<math>H_m^{(1)}(k|\rho -\rho '|)e^{im\theta_{\rho -\rho '}}=\sum_{n=-\infty}^\infty J_{n-m}(k|\rho'|)e^{-i(n-m)\theta_{\rho'}} H_n^{(1)}(k|\rho|)e^{in\theta_\rho}</math>
If <math>|\rho'|>|\rho|</math>
<math>H_m^{(1)}(k|\rho -\rho '|)e^{im\theta_{\rho -\rho '}}=\sum_{n=-\infty}^\infty H_{n-m}^{(1)}(k|\rho'|)e^{-i(n-m)\theta_{\rho'}} J_n(k|\rho|)e^{in\theta_\rho}</math>

S and R Expansions

Let <math>|y-x_*|>|x_i-x_*|</math>. Now
<math>\phi(y-x_i)={i \over 4} H_0^{(1)}(k|y-x_i|)= {i \over 4}H_0^{(1)}(k|(y-x_*)-(x_i-x_*)|)={i \over 4}\sum_{n=-\infty}^{\infty}J_n(k|x_i-x_*|)e^{-in\theta_{x_i-x_*}}H_n^{(1)}(k|y-x_*|)e^{in\theta_{y-x_*}}</math>
Thus we have:
<math>S_n(y-x_*)=H_n^{(1)}(k|y-x_*|)e^{in\theta_{y-x_*}}</math>
<math>b_n(x_i,x_*)={i\over4}J_n(k|x_i-x_*|)e^{-in\theta_{x_i-x_*}}</math>
Similarly we can find the R expansion using the addition theorem.
Thus we have:
<math>R_n(y-x_*)=J_n(k|y-x_*|)e^{in\theta_{y-x_*}}</math>
<math>a_n(x_i,x_*)={i\over4}H_n^{(1)}(k|x_i-x_*|)e^{-in\theta_{x_i-x_*}}</math>
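These basis functions map directly onto Matlab's built-in Bessel routines; a sketch, with rho = y - x_* represented as a complex number and k the wavenumber:

 r  = abs(rho);  th = angle(rho);
 Sn = besselh(n, 1, k*r) .* exp(1i*n*th);  % S_n: Hankel function of the first kind
 Rn = besselj(n, k*r)    .* exp(1i*n*th);  % R_n: Bessel function of the first kind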

S|S, R|R, and S|R Translations

The following translation derivations can be found in Fast Multipole Method and Multilevel Fast Multipole Algorithm in 2D [Song and Chew]
<math>(S|S)_{nm}(t)=(R|R)_{nm}(t)=J_{m-n}(k|t|)e^{i(m-n)\theta_t}=R_{m-n}(t)</math>
<math>(S|R)_{nm}(t)=H_{m-n}^{(1)}(k|t|)e^{i(m-n)\theta_t}=S_{m-n}(t)</math>

Laplace Equation 3D

S and R Expansions

Let <math>|y-x_*|>|x_i-x_*|</math>. Now
<math>\phi(y-x_i)={1 \over{4\pi|y-x_i|}}={1 \over{4\pi|(y-x_*)-(x_i-x_*)|}}=...</math>
<math>={1 \over {4\pi \sqrt{|y-x_*|^2-2|y-x_*||x_i-x_*|cos(\theta_{y-x_i})+|x_i-x_*|^2}}}=...</math>
<math>={1 \over {4\pi|y-x_*|}}{1 \over {\sqrt{1-2{{|x_i-x_*|} \over {|y-x_*|}}cos(\theta_{y-x_i})+{{|x_i-x_*|^2} \over {|y-x_*|^2}}}}}=...</math>
Now using the generating function for the Legendre polynomials, we see that:
<math>{1 \over {4\pi|y-x_*|}}{1 \over {\sqrt{1-2{{|x_i-x_*|} \over {|y-x_*|}}cos(\theta_{y-x_i})+{{|x_i-x_*|^2} \over {|y-x_*|^2}}}}}={1\over{4\pi}}{\sum_{n=0}^\infty P_n(cos(\theta_{y-x_i})){{|x_i-x_*|^n} \over {|y-x_*|^{n+1}}}}</math>
Using the spherical harmonic addition theorem, we know that:
<math>P_n(cos(\theta_{y-x_i}))={{4\pi} \over {2n+1}} \sum_{m=-n}^n Y_n^{-m}(\theta_{x_i-x_*},\phi_{x_i-x_*})Y_n^m(\theta_{y-x_*},\phi_{y-x_*})</math>
Putting this all together, we see that
<math>\phi(y-x_i)={\sum_{n=0}^\infty \sum_{m=-n}^n {{1} \over {2n+1}}Y_n^{-m}(\theta_{x_i-x_*},\phi_{x_i-x_*})Y_n^m(\theta_{y-x_*},\phi_{y-x_*}){{|x_i-x_*|^n} \over {|y-x_*|^{n+1}}}}</math>
Thus we have,
<math>S_n^m(y-x_*)={1 \over {|y-x_*|^{n+1}}}Y_n^m(\theta_{y-x_*},\phi_{y-x_*})</math>
<math>b_n^m(x_i,x_*)={{|x_i-x_*|^n} \over {2n+1}}Y_n^{-m}(\theta_{x_i-x_*},\phi_{x_i-x_*})</math>
Similarly,
<math>R_n^m(y-x_*)={|y-x_*|^n}Y_n^m(\theta_{y-x_*},\phi_{y-x_*})</math>
<math>a_n^m(x_i,x_*)={{1} \over {(2n+1)|x_i-x_*|^{n+1}}}Y_n^{-m}(\theta_{x_i-x_*},\phi_{x_i-x_*})</math>
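The spherical harmonics above can be evaluated with Matlab's associated Legendre routine. A sketch for a single order <math>0 \le m \le n</math>, assuming the orthonormal convention <math>Y_n^m=\sqrt{{(2n+1)(n-m)!} \over {4\pi(n+m)!}}P_n^m(\cos\theta)e^{im\phi}</math> (conventions vary, so this normalization is an assumption):

 P = legendre(n, cos(theta));      % P_n^m(cos(theta)) for m = 0..n
 Ynm = sqrt((2*n+1)*factorial(n-m)/(4*pi*factorial(n+m))) ...
       * P(m+1) * exp(1i*m*phi);   % Y_n^m
 % Negative orders follow from Y_n^{-m} = (-1)^m * conj(Y_n^m).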

S|S, R|R, and S|R Translations

The following translations were obtained from Comparison of the efficiency of translation operators used in the fast multipole method for the 3D Laplace equation [Gumerov and Duraiswami].
<math>(R|R)_{n'n}^{m'm}(t)={{{\alpha_{(1)n'}^{m'}}\alpha_{(1)n-n'}^{m-m'}} \over {\alpha_{(1)n}^{m}}}R_{n-n'}^{m-m'}(t)</math>
<math>(S|R)_{n'n}^{m'm}(t)={{{\alpha_{(1)n'}^{m'}}\beta_{(1)n+n'}^{m-m'}} \over {\beta_{(1)n}^{m}}}S_{n+n'}^{m-m'}(t)</math>
<math>(S|S)_{n'n}^{m'm}(t)={{{\beta_{(1)n'}^{m'}}\beta_{(1)n-n'}^{m-m'}} \over {\beta_{(1)n}^{m}}}R_{n-n'}^{m-m'}(t)</math>
Where
<math>\alpha_{(1)n}^m=(-1)^n i^{-|m|} \sqrt{{4\pi} \over {(2n+1)(n-m)!(n+m)!}}</math>
<math>\beta_{(1)n}^m=i^{|m|} \sqrt{{4\pi(n-m)!(n+m)!} \over {(2n+1)}}</math>