BEGIN:VCALENDAR
VERSION:2.0
X-WR-TIMEZONE:America/Chicago
PRODID:-//Apple Inc.//iCal 3.0//EN
CALSCALE:GREGORIAN
X-WR-CALNAME:Monitoring of High Performance Computing Systems
METHOD:PUBLISH
BEGIN:VTIMEZONE
TZID:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
TZNAME:CDT
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
TZNAME:CST
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
SEQUENCE:2
DTSTART;TZID=America/Chicago:20101117T121500
DESCRIPTION:ABSTRACT: As HPC systems grow to petascale and exascale proportions\, the complexity of monitoring them grows as well. Many supercomputing sites have evolved over time using a mixture of task-specific monitoring applications\, monitoring protocols\, and many home-brewed scripts. While these methods often get the job done\, there is no "one size fits all" solution to monitoring HPC systems\, and development efforts to fill in the gaps are often duplicated across sites. The purpose of this BOF is to bring together HPC system administrators to discuss system monitoring in all aspects of the cluster. These monitoring needs include hardware health and hardware failures\, environmentals\, node and cluster status\, queue status\, and job performance. We will share ideas that work at the various sites to help spread best practices among the group\, and then focus on what is lacking in our tools and methods. Moving beyond the BOF\, a group of HPC site administrators organized this year to share best practices and collaborate on building better monitoring systems. We will be looking to expand this collaboration to interested attendees.
UID:bof120@sc10.supercomputing.org
SUMMARY:Monitoring of High Performance Computing Systems
DTEND;TZID=America/Chicago:20101117T131500
LOCATION:398
END:VEVENT
END:VCALENDAR
