Performance analysis of the FDTD method applied to holographic volume gratings: multi-core CPU versus GPU computing
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10045/26220
Title: | Performance analysis of the FDTD method applied to holographic volume gratings: multi-core CPU versus GPU computing |
---|---|
Author(s): | Francés, Jorge | Bleda, Sergio | Neipp, Cristian | Márquez, Andrés | Pascual, Inmaculada | Beléndez, Augusto |
Research group(s) or GITE: | Holografía y Procesado Óptico |
Center, Department or Service: | Universidad de Alicante. Departamento de Física, Ingeniería de Sistemas y Teoría de la Señal | Universidad de Alicante. Departamento de Óptica, Farmacología y Anatomía | Universidad de Alicante. Instituto Universitario de Física Aplicada a las Ciencias y las Tecnologías |
Keywords: | CUDA | GPU Computing | Gratings | Holography | OpenMP | SSE | SIMD | Speed up |
Knowledge area(s): | Applied Physics | Optics |
Creation date: | 11-Apr-2011 |
Publication date: | 1-Mar-2013 |
Publisher: | Elsevier |
Bibliographic citation: | FRANCÉS MONLLOR, Jorge, et al. "Performance analysis of the FDTD method applied to holographic volume gratings: multi-core CPU versus GPU computing". Computer Physics Communications. Vol. 184, No. 3 (2013). ISSN 0010-4655, pp. 469-479 |
Abstract: | The finite-difference time-domain (FDTD) method allows the electromagnetic field distribution to be analyzed as a function of time and space. The method is applied to analyze holographic volume gratings (HVGs) for the near-field distribution at optical wavelengths. This application usually requires the simulation of wide areas, which demands more memory and processing time. In this work, we propose a specific implementation of the FDTD method that includes several add-ons for the precise simulation of optical diffractive elements. Values in the near-field region are computed with the grating illuminated by a plane wave at different angles of incidence, and with absorbing boundaries included. We compare the results obtained by FDTD with those obtained using a matrix method (MM) applied to diffraction gratings. In addition, we have developed two optimized versions of the algorithm, one for CPU and one for GPU, in order to analyze the improvement offered by the new NVIDIA Fermi GPU architecture over a highly tuned multi-core CPU as a function of the simulation size. In particular, the optimized CPU implementation takes advantage of the arithmetic and data transfer streaming SIMD (single instruction multiple data) extensions (SSE), included explicitly in the code, and of multi-threading by means of OpenMP directives. Good agreement is obtained between the FDTD and MM results, thus validating our methodology. Moreover, the performance of the GPU is compared to the SSE+OpenMP CPU implementation, and it is quantitatively determined that a highly optimized CPU program can be competitive for a wider range of simulation sizes, whereas GPU computing becomes more powerful for large-scale simulations. |
Sponsor(s): | This work was supported by the “Ministerio de Economía y Competitividad” of Spain under projects FIS2011-29803-C02-01 and FIS2011-29803-C02-02, and by the “Generalitat Valenciana” of Spain under projects PROMETEO/2011/021, ISIC/2012/013, and GV/2012/099. |
URI: | http://hdl.handle.net/10045/26220 |
ISSN: | 0010-4655 (Print) | 1879-2944 (Online) |
DOI: | 10.1016/j.cpc.2012.09.025 |
Language: | eng |
Type: | info:eu-repo/semantics/article |
Peer reviewed: | yes |
Publisher's version: | http://dx.doi.org/10.1016/j.cpc.2012.09.025 |
Appears in collections: | INV - GHPO - Artículos de Revistas | INV - Acústica Aplicada - Artículos de Revistas |
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
CPC_v184_n3_p469_2013.pdf | Final version (restricted access) | 1.39 MB | Adobe PDF | Open / Request a copy |
All documents in RUA are protected by copyright. Some rights reserved.