BEGIN:VCALENDAR
VERSION:2.0
X-WR-TIMEZONE:America/Chicago
PRODID:-//Apple Inc.//iCal 3.0//EN
CALSCALE:GREGORIAN
X-WR-CALNAME:Parallel Fast Gauss Transform
METHOD:PUBLISH
BEGIN:VTIMEZONE
TZID:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
TZNAME:CDT
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
TZNAME:CST
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
SEQUENCE:2
DTSTART;TZID=America/Chicago:20101116T110000
DESCRIPTION:ABSTRACT: We present fast adaptive parallel algorithms to compute the sum of N Gaussians at N points. Direct sequential computation of this sum would take O(N^2) time. The parallel time complexity estimates for our algorithms are O(N/n_p) for uniform point distributions and O((N/n_p) log(N/n_p) + n_p log n_p) for non-uniform distributions using n_p CPUs. We incorporate a plane-wave representation of the Gaussian kernel that permits "diagonal translation". We use parallel octrees and a new scheme for translating the plane waves to efficiently handle non-uniform distributions. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 seconds using 4096 cores on the Jaguar supercomputer. Our implementation is "kernel-independent" and can handle other "Gaussian-type" kernels even when an explicit analytic expression for the kernel is not known. These algorithms form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.
UID:pap294@sc10.supercomputing.org
SUMMARY:Parallel Fast Gauss Transform
DTEND;TZID=America/Chicago:20101116T113000
LOCATION:393
END:VEVENT
END:VCALENDAR
