Data merging for cultural heritage imaging based on the Chebfun approach

Cultural heritage imaging has specific needs with regard to the analysis of images: it requires the manipulation of a single digital object that combines the images obtained from different instruments probing different scales at different wavelengths, with the further possibility of selecting two- or three-dimensional representations. We propose a unified imaging data processing approach based on "Chebyshev technology", using the open-source software Chebfun, which, by mapping data processing to simple polynomial transformations, brings considerable improvements over existing procedures. Within that same data processing framework we may further investigate how to merge images originating from different acquisition devices, since all images are expressed in the same basis (an approximate Chebyshev polynomial basis) before being merged. In the end, we hope to map all imaging data processing to simple polynomial operations. Our massive data-sets required parallelizing some Chebfun functions on GPUs, allowing about 100 times faster polynomial evaluation, and up to 12 times faster on CPUs when parallelizing the whole algorithm.


Introduction
The Institut de Recherche de Chimie Paris (IRCP) and the Centre de Recherche et de Restauration des Musées de France (C2RMF) have set up a mixed research team, Physico-Chimie des Matériaux Témoins de l'Histoire (PCMTH), in order to bring new solutions to the challenges facing the analysis, conservation and restoration of cultural heritage objects. One of the team's ambitious research projects concerns the latter's digital images: how to best acquire, store, analyze and combine them. The C2RMF tools of the trade include optical and X-ray photography [16], and, thanks to the expertise of the team's IRCP component in Electron Paramagnetic Resonance (EPR), which pioneered the application of EPR imaging (EPR-I) to exobiology [14], the team has decided to use EPR-I in order to image specific chemical species which X-rays and other traditional techniques are unable to specifically target, like the different layers of carbon-related material found inside paintings.
However, this EPR-I information needs to be merged with the information gathered from other sources, be it X-rays or optical photography, in order to provide a single object to which we may attach an interface for subsequent manipulations. This merging would be greatly simplified if we could unify the different pipelines that take raw data and transform them into images. At the moment, there are different pipelines (programs and associated algorithms) for each imaging technique, and even the data formats are different. This actually stands as a major global challenge: many domain-specific techniques exist to solve particular problems in imaging, yet the final results obtained after applying each technique are difficult if not impossible to combine. The PCMTH team, in association with the C2RMF imaging team, is thus developing a generic approach which would allow a single pipeline to process all the different kinds of data, with a single unifying data structure and mathematical model founded on Chebyshev technology [9], as championed by the Oxford University Numerical Analysis Group through the development of Chebfun [13], an open-source software system for numerical computing with functions. The mathematical basis of Chebfun is piecewise polynomial interpolation. In this paper, we first describe the role Chebfun plays in our unifying approach to image data processing, and more specifically how we managed to fit the key tomographic process of backprojection into the Chebfun paradigm; we show that this approach provides a promising unification of imaging data processing without having to pay a performance cost: it is a zero-cost abstraction. Secondly, we describe how we managed, by working on both the algorithmic and hardware aspects, to accelerate our application of the Chebfun paradigm: the execution time was scaled down by orders of magnitude compared to our original straightforward implementation on general-purpose hardware.

The mathematical model of EPR imaging
EPR is a technique that basically finds the value of a magnetic field at which chemical species absorb oncoming microwave radiation, as we can see in figure 1: this is the signature of the chemical species.

Figure 1. An EPR spectrum with its Chebfun representation. The absorption (arbitrary units) is measured as the magnetic field is varied (abscissa, in Gauss) and the microwave frequency of the oncoming radiation kept constant (around 9.5 GHz for X band EPR); EPR traditionally measures the derivative of that signal, explaining the bumps and troughs. The maximum absorption is here around 3500 G.

Now, the method of EPR imaging is similar to the well-known one of Magnetic Resonance Imaging (MRI): a magnetic field gradient maps the different positions of atoms sensitive to EPR to different positions on a spectrum. So if a species has a resonance spectrum s, then, given a magnetic gradient G and global magnetic field B, the linear density c of the species is mapped to a spectrum r in the following way:

    r(t) = ∫ s(t − t′) c(t′) dt′,    (1)

where the integral has the form of a convolution applied on the sample. The linear density c is itself related to the volume density ρ by a surface integral: c(x) is the integral of ρ on the plane orthogonal to the direction of the magnetic field gradient at the position x on that direction. In order to simplify the problem, we shall consider our sample as being a 2D flat surface, thus allowing us to represent the surface integral as a line integral, yielding the Radon transform R representation of c as c = Rρ, which allows us to rewrite equation 1 more abstractly as

    r = s ∗ Rρ.    (2)

If we accept that simplification, then the process of EPR imaging is depicted in figure 2.
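To make the forward model concrete, here is a small numpy sketch of equation 1, in which a measured spectrum is the convolution of the reference lineshape with the linear density. This is purely illustrative (our own code, not part of the paper's pipeline): the Gaussian-derivative lineshape and the point-like sample are made-up toy data.

```python
import numpy as np

def epr_spectrum(lineshape, linear_density):
    """Equation 1: the measured spectrum r is the convolution of the
    reference lineshape s with the linear density c of the species."""
    return np.convolve(linear_density, lineshape, mode="same")

# toy lineshape: derivative of a narrow Gaussian, since EPR traditionally
# measures the derivative of the absorption signal
t = np.linspace(-1.0, 1.0, 201)
s = -t * np.exp(-t**2 / 0.005)

# toy linear density: two point-like concentrations of the species
c = np.zeros_like(t)
c[60], c[140] = 1.0, 0.5

r = epr_spectrum(s, c)   # blurred spectrum, same length as c
```

Because the model is linear in c, the blurring can be undone by a deconvolution at a stage of our choosing, which is the property exploited later in the paper.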
Figure 2. Radon transform geometry: OC is the direction of the magnetic gradient; every spectrum is given by the choice of θ, and on a given spectrum every data point has an abscissa t; for each t, the ordinate on the spectrum is the integrated density along the direction BC. That geometry is used in our backprojection equations starting with equation 5, and underlies the more qualitative description given in figure 3.

Figure 3. An EPR spectrum is the Radon transform of the density of EPR-sensitive chemical species computed for all (sampled) lines orthogonal to the magnetic field gradient. The information is blurred by a convolution with the absorption spectrum of the specific species under study, the reference lineshape, to which is added some unavoidable noise.

The problem is now to find ρ, a flat surface density which thus depends on the variables (x, y), given r and s: it is a typical inverse problem, involving the well-known Fredholm integral equation of the first kind, of which the convolution equation 1 is an example, together with the Radon transform, thus requiring two inversions, the second one being the backprojection operator B.
As for the convolutional part, this blurring of the density is traditionally dealt with using Fourier transforms directly on r, but it turns out that the deconvolution can be performed on the final image thanks to the linearity of the backprojection and convolution operators:

    B(s ∗ Rρ) = (B s) ∗ ρ,    (3)

where the convolution on the right-hand side is two-dimensional. This commutativity property allows us to postpone the deconvolution and apply it only to the blurred image, instead of applying it to the blurred linear density. But for this paper, we shall simplify the problem even further and suppose that the EPR spectra are a direct mapping of the linear density, a simplification fully justified by the property expressed in equation 3, and we shall thus simply invert the simplified equation

    r = Rρ,    (4)

but with the requirement that the inversion be compatible with the Chebfun approach: we would like to find an inverse transform that maps to simple operations on the chebfuns, without having to mediate those transformations through sampling.
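For readers who wish to experiment with the simplified model r = Rρ of equation 4, the Radon transform itself can be sketched in a few lines of numpy. This toy nearest-neighbour implementation (our own, not part of the paper's Matlab pipeline) sums the image along lines orthogonal to the gradient direction for each angle:

```python
import numpy as np

def radon_nn(img, thetas):
    """Toy discrete Radon transform r = R rho: for each angle theta, bin
    every pixel by its signed abscissa t along the gradient direction and
    accumulate pixel values, approximating the line integrals."""
    n = img.shape[0]
    rows, cols = np.indices(img.shape)
    x = cols - (n - 1) / 2.0          # centred pixel coordinates
    y = rows - (n - 1) / 2.0
    sino = np.zeros((len(thetas), n))
    for i, th in enumerate(thetas):
        t = x * np.cos(th) + y * np.sin(th)          # abscissa along gradient
        bins = np.clip(np.round(t + (n - 1) / 2.0).astype(int), 0, n - 1)
        np.add.at(sino[i], bins.ravel(), img.ravel())  # accumulate line sums
    return sino

# toy phantom: a bright square, projected over 16 angles in [0, pi)
phantom = np.zeros((32, 32))
phantom[10:20, 12:22] = 1.0
sinogram = radon_nn(phantom, np.linspace(0, np.pi, 16, endpoint=False))
```

Each row of the sinogram plays the role of one EPR spectrum in the simplified model; note that the total mass of the image is conserved in every projection, a basic sanity check on any Radon implementation.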

The Chebfun compatible backprojection
In order to invert the Radon transform, the common practice is to filter the backprojection, which we describe by the operator B, and in the end we can retrieve the density ρ by using the well-known Fourier form of the filtered backprojection [19]

    ρ(x, y) = ∫₀^π ∫ |ω| r̂(ω, θ) e^{2πiω(x cos θ + y sin θ)} dω dθ,    (5)

which uses the parametrization described in figure 2. There exists an equivalent formulation that uses the Hilbert transform H [19],

    ρ(x, y) = (1/2π) ∫₀^π H[∂ₜ r(t, θ)](x cos θ + y sin θ) dθ,    (6)

which, thanks to the commutativity of the Hilbert transform with the derivative, becomes

    ρ(x, y) = (1/2π) ∫₀^π ∂ₜ H[r(t, θ)](x cos θ + y sin θ) dθ.    (7)

If we now recall that the r(·, θ) = Rρ are the EPR spectra, we can consider each of those to be a chebfun, which we shall call P_θ, and if we define

    Q_θ = H[P_θ],    (8)

equation (7) becomes

    ρ(x, y) = (1/2π) ∫₀^π Q_θ′(x cos θ + y sin θ) dθ.    (9)

We can now express Q_θ in the basis of the Chebyshev polynomials of the first kind,

    Q_θ = Σₙ aₙ^θ Tₙ,    (10)

and from equations (9) and (10), using Tₙ′ = n Uₙ₋₁ with Uₙ₋₁ the Chebyshev polynomials of the second kind, we obtain

    ρ(x, y) = (1/2π) ∫₀^π Σₙ n aₙ^θ Uₙ₋₁(x cos θ + y sin θ) dθ.    (11)

Because we study physical objects which have a finite extension, and under the hypothesis that the corresponding chebfun will also be of finite support, our Hilbert transform becomes a Tricomi transform [22] T, which allows us to rewrite [17] equation (11) through

    Q_θ = T[P_θ].    (12)

We now use a very useful property of the Tricomi transform, which is crucial in understanding how the filtered backprojection can indeed be mapped to simple operations on chebfuns (with the convention T f(x) = (1/π) p.v. ∫₋₁¹ f(t)/(x − t) dt):

    T[√(1 − t²) Uₙ₋₁(t)](x) = Tₙ(x),    (13)

which allows us to rewrite equation (12) as

    Q_θ = Σₙ bₙ^θ Tₙ,  with  P_θ(t) = √(1 − t²) Σₙ bₙ^θ Uₙ₋₁(t),    (14)

i.e. the first-kind Chebyshev coefficients of Q_θ are the second-kind coefficients of the prefactored spectrum P_θ(t)/√(1 − t²); and finally equation (7) becomes

    ρ(x, y) = (1/2π) ∫₀^π [Σₙ n bₙ^θ Uₙ₋₁](x cos θ + y sin θ) dθ.    (15)

This last expression reveals the simple relationship that exists between the spectrum chebfuns, obtained directly from the raw (sampled) spectra, and the reconstructed image. It is quite straightforward to transform equation 15 into an algorithm amenable to a Chebfun compatible implementation:
1. Prefactor the raw spectra r(t) by 1/√(1 − t²);
2. transform each prefactored spectrum into a chebfun with the FUNQUI function;
3. apply the square-bracket part of equation 15;
4. apply the naive backprojection B that constitutes the remaining part of equation 15.
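To make step 3 concrete in coefficient space, here is a minimal numpy sketch, under our reading of the Tricomi-based derivation; it is not the authors' Chebfun code, and the function names and conversion layout are our own. The idea: take the first-kind (T) coefficients of the prefactored spectrum, convert them to second-kind (U) coefficients using T₀ = U₀, T₁ = U₁/2, Tₙ = (Uₙ − Uₙ₋₂)/2, then weight the m-th U coefficient by (m + 1) to account for the differentiation (Tₙ′ = n Uₙ₋₁).

```python
import numpy as np

def t_to_u_coeffs(a):
    """Convert coefficients of sum a_n T_n into coefficients of sum b_m U_m,
    using T_0 = U_0, T_1 = U_1 / 2, T_n = (U_n - U_{n-2}) / 2."""
    a = np.asarray(a, dtype=float)
    ap = np.concatenate([a, [0.0, 0.0]])   # pad so a_{k+2} always exists
    b = np.zeros_like(a)
    b[0] = ap[0] - ap[2] / 2.0
    for k in range(1, len(a)):
        b[k] = (ap[k] - ap[k + 2]) / 2.0
    return b

def eval_u_series(b, x):
    """Evaluate sum b_m U_m(x) with the recurrence U_{m+1} = 2x U_m - U_{m-1}."""
    x = np.asarray(x, dtype=float)
    u_prev, u = np.zeros_like(x), np.ones_like(x)   # U_{-1} = 0, U_0 = 1
    total = b[0] * u
    for m in range(1, len(b)):
        u_prev, u = u, 2.0 * x * u - u_prev
        total = total + b[m] * u
    return total

def bracket_filter(a_cheb_T):
    """Coefficient-space sketch of the square-bracket part of equation 15
    (under our reconstruction): U-convert the prefactored spectrum's
    T-coefficients, then weight the m-th U coefficient by (m + 1)."""
    c = t_to_u_coeffs(a_cheb_T)
    return (np.arange(len(c)) + 1.0) * c

# toy prefactored spectrum given by its T-coefficients
a = np.array([0.0, 1.0, 0.5, -0.25])
filtered = bracket_filter(a)   # U-coefficients ready for backprojection
```

The appeal of this formulation is that the "filter" of filtered backprojection reduces to a diagonal scaling of coefficients plus a sparse basis change, exactly the kind of simple polynomial operation the Chebfun paradigm targets.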
Part 4 of our algorithmic implementation above requires the computation of the naive backprojection, and here also we took advantage of the Chebfun approach.
Figure 4 shows that our reconstruction is at least as good as the standard procedure, namely the vanilla Matlab imaging toolbox solution that uses the IRADON function.
Figure 4. We tested our approach on the Shepp-Logan phantom (top). The other two figures show the phantom reconstruction using only the information provided by the Radon transform of the phantom sampled on the angles, cf. figure 2. We can see that our approach is at least as good as the standard one.

We have also applied our approach to real data [18], as we see in figure 5: our approach does not yet proceed to any noise filtering, yet the result is already an improvement on the traditional approach using a black-box Fourier method as provided in the software suite that comes with the EPR imaging spectrometer [1].
Standard Fourier reconstruction using the BRUKER Xepr software. The reconstruction itself took a few seconds, but only after heavy manual work that took a few hours.

Accelerating code execution time
Because we target real-time imaging, we need to decrease the processing time. The accelerations currently rely only on the vectorization of the algorithms, and in order to reap the full benefits we decided to buy Graphics Processing Unit (GPU) hardware (Figure 7), which we describe in sections 3.2 to 3.4. The implementation of the most important part of the code is described in section 3.1, and the whole code will be made available through a GitLab repository as a literate program [20] written with the in-house developed Vim literate programming plugin called Kosmogram [6]. Further improvements are expected with the use of a novel programming language, Pony [10], which puts actor programming at the forefront, as we shall explain in section 3.3.

Backprojection implementation
The key step in the imaging pipeline is the backprojection [19], which we express in the following form:

    (B h)(x, y) = ∫₀^π h(x cos θ + y sin θ, θ) dθ.

So we need to integrate on θ, and we must therefore have the function that appears in the integrand, i.e. h as a function of t = x · cos(θ) + y · sin(θ), which we have thanks to having transformed each spectrum into a chebfun: this means that for each θᵢ we have a chebfun hᵢ which we evaluate at the value t, and we therefore need to construct the matrix of values of t. Once this is done, we obtain a matrix in which each line is the evaluation of a particular polynomial (chebfun) hᵢ on each t corresponding to a particular θᵢ and all values of x and y in the image (Figure 8). Therefore, if we look at the matrix column-wise (Figure 8), we have a set of values obtained for a particular t this time and all values of θ: we can now compute the chebfun for each column of values (by using POLYFIT), thus building a matrix of column chebfuns. Each column chebfun is then summed to compute the backprojected value at all x and y: summing (with the overloaded Chebfun SUM) each chebfun, we obtain a vector of size x · y which is the flattened image (Figure 9), and we can just RESHAPE it to an image.
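The row-wise evaluation described above can be sketched in Python as follows. This is an illustrative numpy analogue of the Matlab/Chebfun code (our own, with `chebval` standing in for chebfun evaluation); for simplicity, the θ-integral is approximated here by a plain Riemann sum rather than by the column-chebfun construction used in the paper.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebval

def backproject(coeff_per_angle, thetas, xs, ys):
    """Naive backprojection: for each angle theta_i, evaluate the filtered
    spectrum polynomial h_i at t = x cos(theta_i) + y sin(theta_i) for
    every pixel, then sum over angles with weight pi / N_theta."""
    X, Y = np.meshgrid(xs, ys)                 # full pixel grid
    img = np.zeros_like(X, dtype=float)
    for c, th in zip(coeff_per_angle, thetas):
        t = X * np.cos(th) + Y * np.sin(th)    # one row of the matrix t
        # polynomials are defined on [-1, 1]; clamp corner values for this toy
        img += chebval(np.clip(t, -1.0, 1.0), c)
    return img * (np.pi / len(thetas))         # discretized theta-integral

# toy run: 8 angles, each "spectrum" a constant polynomial
thetas = np.linspace(0, np.pi, 8, endpoint=False)
grid = np.linspace(-1, 1, 16)
image = backproject([[1.0]] * 8, thetas, grid, grid)
```

With constant projections the backprojected image is the constant π, which makes a convenient correctness check for the quadrature weight.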
For each line i of the matrix t we apply the polynomial hᵢ to obtain the matrix of integrand values:

Algorithm 1 Construction of the matrix t

    for i = 1:N_theta
        t(i,:) = h_i(t(i,:));
    end

We then transform each column of the matrix t into a chebfun object, and on each column we apply the function SUM, which computes the integral of the chebfun.

MATLAB Distributed Computing Server
The image construction is done pixel by pixel, which is an embarrassingly parallel process; this quite naturally led us to use the MATLAB Distributed Computing Server [12] on the HPU4Science cluster, which we designed accordingly as we shall see in section 3.4. The construction of the pixels is distributed on the appropriate target machines, selected thanks to the MATLAB Job Scheduler (MJS); Table 1 gives the MJS configuration for each construction. We then replace each FOR loop by a PARFOR loop.
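The embarrassingly parallel structure that PARFOR exploits can also be illustrated with an open-source analogue. This sketch (ours; it does not reproduce the MDCS/MJS setup) distributes image rows, rather than individual pixels, over a thread pool, since each row can be backprojected with no shared state:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from numpy.polynomial.chebyshev import chebval

def backproject_row(args):
    """Backproject one image row: evaluate each angle's polynomial at
    t = x cos(theta) + y sin(theta), then sum over angles."""
    y, xs, thetas, coeffs = args
    t = xs[None, :] * np.cos(thetas)[:, None] + y * np.sin(thetas)[:, None]
    vals = np.stack([chebval(np.clip(t[i], -1.0, 1.0), c)
                     for i, c in enumerate(coeffs)])
    return vals.sum(axis=0) * np.pi / len(thetas)

def backproject_parallel(xs, ys, thetas, coeffs, workers=4):
    """Map independent rows onto a pool of workers (a parfor-style split)."""
    jobs = [(y, xs, thetas, coeffs) for y in ys]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = pool.map(backproject_row, jobs)
    return np.stack(list(rows))

# toy run: constant projections again give the constant image pi
grid = np.linspace(-1, 1, 8)
thetas = np.linspace(0, np.pi, 6, endpoint=False)
image = backproject_parallel(grid, grid, thetas, [[1.0]] * 6, workers=2)
```

Threads suffice here because numpy releases the interpreter lock inside its kernels; a process pool or a cluster scheduler such as MJS follows the same map-over-independent-chunks pattern.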

Speedup results
In the beginning, the execution time was between 8 and 12 hours (depending on the sample under study); thanks to the code vectorization (explained in subsection 3.1) we managed to reduce it significantly, going down to 6.6 hours on CPU. Then, using the CPU multi-core power, we further reduced the computation time to 33 minutes (about 14 times faster), and finally reached 4 minutes on GPUs in double precision (more than 100 times faster, or "orders of magnitude" as we claimed at the beginning of the paper). Table 1 gives the MJS configuration used to obtain those results.
Anyone should be able to assemble a similar, albeit scaled-down, system and freely use the software and code developed by the HPU4Science team [4] to perform high-end scientific computing.
The HPU4Science cluster's construction went through multiple stages. We started by sharing an existing cluster room at Chimie-ParisTech (Figure 10, left) but, because of electrical power constraints and heat control, we had to move to another room that belongs to the PCMTH team (Figure 10). We have recently (second half of 2016) added new nodes to the cluster, with the goal of testing the actor programming framework as championed by the Pony programming language.

HPU4Science in the beginning
For the operating system we use Linux for its stability, the ability to pick from a wide variety of file systems, the large existing code base for high-performance computing, and the ease of tailoring the OS to the specific hardware requirements. The availability of open-source projects with code that can be configured for highly parallelized processing was also an important advantage.
The master and all workers run the Ubuntu server edition, installed with the minimal OpenSSH server profile. All machines are headless to minimize the OS memory footprint, so all interaction with the cluster happens at the command prompt through SSH.
The only closed-source software used on the cluster is Matlab, kept for the power of the Chebfun toolbox; we are re-coding some of its functions using open-source languages, as we explain in subsection 3.3.

Authors' contributions
Mariem El Afrit is a PhD candidate working on the development of novel approaches based on Chebfun for (tomographic) imaging and data fusion, and she worked specifically on the implementation of the algorithms and on the details of the mathematical derivations and their algorithmic formulation, together with the subsequent parallelization, while at the same time being responsible for most of the cluster assembling and management.
Yann Le Du is a CNRS research scientist at Chimie-ParisTech specialized in Electron Paramagnetic Resonance imaging; he initiated the use of Chebfun for imaging, devised the Tricomi transform approach to backprojection and worked on the whole mathematical framework, while also contributing to the code and launching the work on the re-coding of the pipeline in the Pony language.
Rafaël Del Pino, now a PhD candidate in cryptography, was a summer intern in the team under the guidance of Yann Le Du, and worked on the algorithm implementation, helped with the mathematical derivations and specifically with the finalization of the mathematical form of the Tricomi expression used in the pipeline.
Guolin Zhang was a summer intern in the team, and worked on the acceleration of the Matlab source code of the imaging pipeline under the guidance of Mariem El Afrit.
As for the paper itself, it was written by Mariem El Afrit and Yann Le Du.

Conclusion
Our goal is to unify image processing, especially tomographic imaging, by relying on a powerful data representation based on Chebfun [9]. This requires the adaptation of existing algorithms to this data structure, and we have succeeded in doing so with the most fundamental of the tomographic processes, the backprojection. By reformulating it in terms of the Tricomi transform, and applying the Chebfun paradigm at each of the underlying algorithmic steps, we have managed to vectorize it and not only obtain results of the same quality as with the traditional approach, but with speeds that make it possible to use our approach on real-time data generation; further work is thus underway to make the parallel processing concurrent.

Figure 5. "C(entre) N(ational) d'E(tudes) S(patiales)" (the French space institute) was laser-printed (heated toner is very responsive to EPR) on a piece of paper (1 cm by 5 cm), hidden in an EPR tube filled with sand (EPR neutral) positioned in a Bruker EPR imaging spectrometer. The reconstructions do not (yet) include noise filtering.

Figure 6. Our approach to tomography uses two mathematically equivalent yet implementation-wise different paths: in this paper, we describe the three nodes which are highlighted in blue and with double-line contours. The key steps are the transformation of spectra into chebfuns, and the formulation of the inverse problem solution using the Tricomi transform.

Figure 7. The cluster HPU4Science is composed of four servers (Equestria, Master, Worker01, Worker02) and two desktops (Lasfede and Desktop01) with a total of 52 CPU cores and nine GPUs.

Figure 8. Construction of pixels corresponding to each t and θ

Figure 9. Flattened picture. In our code, wherever it was possible, we replaced the FOR loops by the function BSXFUN in the accelerated version.

Figure 10. The HPU4Science's current installation as of 2016, with 20 GPUs (4 NVIDIA GPU cards per node, for a total of 20 GPUs thanks to double-GPU cards) and another node in the form of a 28-core blade (specifically for developing and testing a concurrent imaging pipeline), located in a dedicated server room at Chimie-ParisTech.