3x Acceleration of the QMR Algorithm

Boosting Performance of Geophysical
Inversion using Intel Xeon Phi:
3x Acceleration of the QMR Algorithm
Dr. Alastair McKinley, Engineering Manager, Analytics Engines
Dr. Lucy MacGregor, CTO, Rock Solid Imaging
“Acceleration of EMGeo on Xeon Phi has the
potential to dramatically reduce turnaround
times for 3D inversion projects, allowing
more efficient and robust analysis of our
clients' EM data.”
Lucy MacGregor, CTO, RSI
Customer Profile
RSI is an independent geoscience consulting
firm offering quantitative reservoir characterization with the goal of reducing exploration
drilling risk and optimizing reservoir appraisal
and development plans.
“Accelerating the EMGeo algorithm using the
Xeon Phi processor resulted in a 3x performance
improvement and a 2x reduction in IT
infrastructure.”
Dr. Ben Greene, Chief Technology Officer
Benefits
Client reports produced in weeks instead of months
Performance can be scaled up with a very small hardware footprint
3x performance improvement
2x reduction in scalable hardware infrastructure costs
The Challenge
In recent years the oil and gas industry has been de-risking exploration and making better
drilling decisions by obtaining greater intelligence on selected sites. The data for these decisions
is derived from new techniques based on electromagnetic data, known as Controlled Source
Electromagnetic (CSEM) and Magnetotelluric (MT) data. Electromagnetic studies are better than
seismic studies at distinguishing between types of liquids underground. Used in conjunction with
traditional seismic studies, this enables more accurate drilling decisions. RSI was carrying out
electromagnetic inversion for both CSEM and MT data using a software package called
Electro Magnetic Geological Mapper (EMGeo), which runs on a cluster of general-purpose CPUs
(Xeons) using MPI.
Electromagnetic inversion is a computationally intensive and therefore time-consuming task. As RSI
is responsible for carrying out the project on behalf of its client, a key goal in technology
development has been to reduce turnaround times. RSI began working with Analytics Engines (AE)
to identify the bottlenecks in the process whose removal would help achieve this goal.
For more information contact [email protected]
The Solution
To achieve the desired performance improvement, Analytics Engines
evaluated both the existing software and the hardware platforms.
The engineering team conducted extensive profiling and testing of
multiple EM inversion data sets from RSI. This identified the main
performance bottleneck in the EMGeo software as the sparse
matrix-vector multiplication (SpMV) operations inside the
quasi-minimal residual (QMR) subroutine. To address this, Analytics
Engines selected the Intel Xeon Phi coprocessor, which is well suited to
high-density compute tasks. SpMV works well on Xeon Phi because its
highly parallel memory architecture supports an efficient implementation.
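To illustrate the operation identified as the bottleneck, the following is a minimal sketch of SpMV in Python. It assumes the common compressed sparse row (CSR) layout; the storage format that EMGeo actually uses is not described here, and the function name `csr_spmv` and the small example matrix are purely illustrative. It is not the tuned Xeon Phi kernel.

```python
def csr_spmv(indptr, indices, data, x):
    """Compute y = A @ x for a sparse matrix A held in CSR form.

    indptr  : row r owns entries indptr[r] .. indptr[r + 1] - 1
    indices : column index of each stored entry
    data    : value of each stored entry
    (CSR is an assumed layout chosen for illustration.)
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Each row is independent of every other row, so rows can be
        # shared out across many cores.
        for k in range(indptr[row], indptr[row + 1]):
            # Indirect read of x: this access pattern tends to make
            # SpMV limited by memory bandwidth rather than arithmetic.
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# Illustrative 3x3 matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(indptr, indices, data, x)
print(y)  # [7.0, 6.0, 17.0]
```

Because each output row depends only on its own stored entries and on reads from `x`, the work parallelizes naturally across rows, which is one reason a many-core device with high memory bandwidth can be a good match for it.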
To achieve maximum performance on the Xeon Phi, AE built a custom
kernel specifically tuned for the EMGeo QMR data structure and
memory access patterns. The kernel was integrated into the existing
QMR subroutine within EMGeo, leaving the bulk of the original Fortran code unchanged.
Additionally, in order to use EMGeo effectively on Xeon Phi, it was necessary to implement
hybrid MPI/OpenMP support in the QMR kernel, which allowed memory usage to be optimized.
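The idea behind a hybrid MPI/OpenMP decomposition can be sketched as a two-level split of the matrix rows: first across MPI ranks, then across the threads inside each rank. The Python sketch below only models that bookkeeping. The helper names (`split_range`, `hybrid_layout`, `replicated_copies`) and all the numbers are hypothetical, and the real kernel's partitioning scheme is not described in this document.

```python
def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous, near-equal chunks."""
    base, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for p in range(parts):
        # The first `extra` chunks take one additional row each.
        hi = lo + base + (1 if p < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def hybrid_layout(n_rows, n_ranks, threads_per_rank):
    """Rows per rank (MPI level), then rows per thread (OpenMP level)."""
    return [
        split_range(lo, hi, threads_per_rank)
        for lo, hi in split_range(0, n_rows, n_ranks)
    ]


def replicated_copies(total_workers, threads_per_rank):
    """Copies of per-rank data held in memory.

    Threads in one rank share an address space, so data that each rank
    must hold is stored once per rank, not once per worker.
    """
    return total_workers // threads_per_rank


# Hypothetical sizes: 10 matrix rows, 2 ranks, 2 threads per rank.
layout = hybrid_layout(10, 2, 2)
print(layout)  # [[(0, 3), (3, 5)], [(5, 8), (8, 10)]]

# Hypothetical device with 8 workers.
print(replicated_copies(8, 1))  # pure MPI, one rank per worker: 8
print(replicated_copies(8, 4))  # hybrid, 4 threads per rank: 2
```

With the same number of workers, fewer ranks means fewer copies of any data that every rank has to hold, which matters on a coprocessor with limited on-board memory.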
The Result
As a result of this implementation, project times using the EMGeo inversion code could be reduced
from months to weeks. The solution allows performance to be scaled up with multiple Xeon Phi
devices for a larger performance gain (3x improvement over cost-equivalent hardware), while also
achieving a smaller physical footprint (2x reduction) for hardware infrastructure.
Meeting the core goal with an architecture that can be scaled at lower cost illustrates the
benefits of optimizing the code for the Xeon Phi architecture. This optimization enables RSI to
accomplish its technological goals and overcome a challenging computational task.
Results Overview
Project times reduced from months to weeks
3x performance increase over original implementation
3x improvement over cost-equivalent hardware
The graph shows that iterations achieve an approximately 3x
performance improvement.