Performance analysis of the FDTD method applied to holographic volume gratings: multi-core CPU versus GPU computing

Francés, Jorge; Bleda, Sergio; Neipp, Cristian; Márquez, Andrés; Pascual, Inmaculada; Beléndez, Augusto

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/26220

Item information
Title:	Performance analysis of the FDTD method applied to holographic volume gratings: multi-core CPU versus GPU computing
Author(s):	Francés, Jorge | Bleda, Sergio | Neipp, Cristian | Márquez, Andrés | Pascual, Inmaculada | Beléndez, Augusto
Research group(s) or GITE:	Holografía y Procesado Óptico
Center, Department or Service:	Universidad de Alicante. Departamento de Física, Ingeniería de Sistemas y Teoría de la Señal | Universidad de Alicante. Departamento de Óptica, Farmacología y Anatomía | Universidad de Alicante. Instituto Universitario de Física Aplicada a las Ciencias y las Tecnologías
Keywords:	CUDA | GPU Computing | Gratings | Holography | OpenMP | SSE | SIMD | Speed up
Knowledge area(s):	Applied Physics | Optics
Date created:	11-Apr-2011
Publication date:	1-Mar-2013
Publisher:	Elsevier
Citation:	FRANCÉS MONLLOR, Jorge, et al. "Performance analysis of the FDTD method applied to holographic volume gratings: multi-core CPU versus GPU computing". Computer Physics Communications. Vol. 184, No. 3 (2013). ISSN 0010-4655, pp. 469-479
Abstract:	The finite-difference time-domain (FDTD) method allows the electromagnetic field distribution to be analyzed as a function of time and space. The method is applied to analyze the near-field distribution of holographic volume gratings (HVGs) at optical wavelengths. This application usually requires the simulation of wide areas, which implies more memory and processing time. In this work, we propose a specific implementation of the FDTD method including several add-ons for the precise simulation of optical diffractive elements. Values in the near-field region are computed considering the illumination of the grating by a plane wave at different angles of incidence, with absorbing boundaries included. We compare the results obtained by FDTD with those obtained using a matrix method (MM) applied to diffraction gratings. In addition, we have developed two optimized versions of the algorithm, for CPU and GPU, in order to analyze the improvement of using the new NVIDIA Fermi GPU architecture versus a highly tuned multi-core CPU as a function of the simulation size. In particular, the optimized CPU implementation takes advantage of the arithmetic and data transfer streaming SIMD (single instruction, multiple data) extensions (SSE) included explicitly in the code, and of multi-threading by means of OpenMP directives. Good agreement is found between the results of the FDTD and MM methods, thus validating our methodology. Moreover, the performance of the GPU is compared to the SSE+OpenMP CPU implementation, and it is quantitatively determined that a highly optimized CPU program can be competitive for a wider range of simulation sizes, whereas GPU computing becomes more powerful for large-scale simulations.
Sponsor(s):	This work was supported by the “Ministerio de Economía y Competitividad” of Spain under projects FIS2011-29803-C02-01 and FIS2011-29803-C02-02, and by the “Generalitat Valenciana” of Spain under projects PROMETEO/2011/021, ISIC/2012/013, and GV/2012/099.
URI:	http://hdl.handle.net/10045/26220
ISSN:	0010-4655 (Print) | 1879-2944 (Online)
DOI:	10.1016/j.cpc.2012.09.025
Language:	eng
Type:	info:eu-repo/semantics/article
Peer reviewed:	yes
Publisher's version:	http://dx.doi.org/10.1016/j.cpc.2012.09.025
Appears in collections:	INV - GHPO - Artículos de Revistas | INV - Acústica Aplicada - Artículos de Revistas

Files in this item:
File	Description	Size	Format
CPC_v184_n3_p469_2013.pdf	Final version (restricted access)	1.39 MB	Adobe PDF
