SC is the International Conference for High Performance Computing, Networking, Storage and Analysis

SC10 Disruptive Technologies

Each year, the SC Conference seeks out new technologies with the potential to disrupt the HPC landscape as we know it. Generally speaking, “disruptive technology” refers to innovations drastic enough to completely transform the high-performance computing field as it currently exists, ultimately overtaking the incumbent technologies or software tools in the marketplace. For SC10, Disruptive Technologies examines new computing architectures and interfaces that will significantly impact the high-performance computing field over the next five to 15 years but have not yet emerged in current systems. The Disruptive Technologies exhibits, located in the SC10 exhibit hall, will showcase technologies spanning storage, programming, cooling and productivity software through presentations, demonstrations and an exhibit showcase.

Selected technologies for SC10 will be on display during regular exhibit hall hours. Please stop by the booth for more information on the presentations and demonstrations schedule.

SC10 Disruptive Technologies Participants

Ateji: A language technology makes HPC accessible to all applications developers

Brocade: Virtual cluster switching revolutionizes Layer 2 Ethernet

Cloud Era Ltd.: Elastic-R, a ubiquitous Google-Docs-like environment for scientific computing in the Cloud

Convey Computer: Application specific computing: navigating an inflection point in HPC system architectures

Cray, Inc.: Advanced data analytics on the Cray XMT

Green Revolution Cooling: Fluid submersion cooling for HPC

Luxtera: Silicon photonics: displaces current solutions to support 100Gbps

MBA Sciences, Inc.: SPM.Python: scalable, parallel version of the Python language

MIT and Bentley University: Solar-powered supercomputing

SGI: Essential technology for productivity at exascale

Virident: Novel Flash memory card system ensuring sustainable high performance and high capacity

VMware Inc., System Research Team (SRT) at Oak Ridge National Laboratory, Ohio State University, UnivaUD, Inc., and Deopli, Inc.: Virtualization for HPC

Exhibit Descriptions

Ateji - A Language Technology Makes HPC Accessible to All Applications Developers

Exhibit Description:
"Ateji PX is an extension of the Java language with parallel primitives inspired by pi-calculus. It makes parallel programming simple and intuitive, efficient, provably correct, and accessible to the vast majority of application developers who are not experts in HPC. The technology is easy to adopt and will readily replace more complex approaches such as threads, which are a nightmare for most Java developers. Ateji PX requires minimal learning, and is compatible with existing source code and development environments. Ateji PX does not address the perceived needs of today's HPC experts, who are usually fighting for the last few percent of performance. But it appeals to mainstream application developers who will be more than happy with a solution that scales reasonably well on recent hardware. Ateji PX works today on multicore SMP architectures, and is expected to target most parallel hardware architectures in the coming years, as well as other base languages such as C# or JavaScript. Together with the generalization of multicore PCs and cloud offerings, Ateji PX has the potential to change HPC by making it accessible to a whole new community, orders of magnitude larger than the existing one."

Brocade - Virtual Cluster Switching Revolutionizes Layer 2 Ethernet

Exhibit Description:
"Brocade Virtual Cluster Switching (VCS) is a revolutionary Layer 2 Ethernet technology that improves network utilization, maximizes application availability, increases scalability, and dramatically simplifies the network architecture in next-generation virtualized data centers. With VCS, Brocade is bringing its proven fabric technology, deployed in 90% of the Global 1000 data centers, to Ethernet. The Ethernet fabrics will flatten network architectures and efficiently scale low-latency switched Layer 2 environments. With the elimination of Spanning Tree Protocol (STP) in a Layer 2 network, VCS ensures all links are active, provides sub-250ms link reconvergence, and minimizes broadcast traffic as the network scales. The fabric does not dictate any specific topology, placing no restriction on subscription ratios or desired network architectures. To minimize operational overhead, the fabric is self-forming and behaves, and is managed, as a single Logical Chassis. As switches are added into the VCS cluster, the cluster configuration is automatically sent from one of the corresponding switches in the fabric. The first switching platform delivering VCS technology will be demonstrated in the Brocade booth at SC10. Combining the value of a VCS fabric with a ground-up data center design, wire-speed switching, and port-to-port latency of 600ns, the Ethernet fabrics created using these switches are ideal for high-performance computing workloads that run on hundreds of compute nodes."

Cloud Era Ltd - Elastic-R, a Ubiquitous Google-Docs-like Environment for Scientific Computing in the Cloud

"Elastic-R makes it possible to use mainstream computing environments such as R, Scilab, Sage, SciPy, etc. as a service in the cloud. The full capabilities of the environments are exposed to the end user from within a simple browser. The scientist/educator/student can, for example, issue commands to a stateful remote R engine, install and use new packages, generate and interact with graphics, upload and process files, download results, etc. using high-capacity virtual machines that he/she starts and stops on demand. Features include real-time collaboration, sharing and re-using virtual machines, sessions, data, functions, spreadsheets, dashboards, and automatically generated Word documents and Excel workbooks which can be synchronized in real time with R engines on the cloud. Computationally intensive algorithms can easily be run on any number of virtual machines that are controlled from within a standard R session. Elastic-R is also an applications platform that allows anyone to assemble statistical/numerical methods and data with interactive user interfaces for the end user. These interfaces and dashboards are created visually, and are automatically published and delivered from a simple URL. Elastic-R makes HPC and the cloud accessible to everyone. It delivers a combined e-Learning/e-Research environment with collaboration and social networking capabilities that may become 'the Facebook of computational sciences'. Elastic-R will accelerate the current shift towards a more distributed, more open, and more participatory science. By lowering the barrier for everyone to create and publish services, it defines a new set of Web artifacts." Portal URL: accounts are created upon request. Slides and various documentation:

Convey Computer - Application Specific Computing: Navigating an Inflection Point in HPC System Architectures

"The path to exascale computing necessarily requires one or more disruptive hardware and software technologies. The limitations of commodity processors are well exposed, and building larger and more power-hungry data centers is impractical. The Convey disruptive technologies exhibit will propose that the industry is at an architectural inflection point: more x86 cores (regardless of packaging) are not the answer. Some form of heterogeneous computing is inevitable; we submit that application-specific logic is the only way to overcome the power/performance barrier imposed by the use of commodity instruction set architectures (ISAs). Because different HPC algorithms require different computer architectures for optimal efficiency, the ultimate solution is a system that is reconfigurable, from CPU logic gates to memory access methods, with supporting OS and compiler optimizations. The Convey disruptive technologies exhibit will feature a hypothetical solution to on-the-fly memory reconfiguration in an HPC system. This solution is already partially implemented in current production Convey systems; this exhibit will extend the notion to future memory architectures."

Cray, Inc. - Advanced Data Analytics on the Cray XMT

"There is growing interest in the data analysis community in graph-oriented databases. The interactions between proteins in the human body, linked blog posts on the Internet, and intelligence data about the communications and movements of potential adversaries are all examples of data sets that have been represented graphically, because revealing the relationship pattern between several data items may be at least as important as the data items themselves. Scalability has been a critical problem for graph-oriented databases, and has prevented their broad market acceptance. The Cray XMT changes that picture. Its large shared memory and ability to perform well in spite of a large percentage of remote data references make it ideal for large-scale graph analytics. We will demonstrate graph-oriented queries against large databases, performing an order of magnitude faster than they do on comparable-scale conventional MPPs. If the Cray XMT is accepted by the database market as the preferred platform for graph-oriented databases, it will transform both the database industry, which will now have a compelling reason to use supercomputers, and HPC, which will be entering a new application domain."

Green Revolution Cooling - Fluid Submersion Cooling for HPC

"Fluid submersion cooling offers four advantages: 90% reduction in cooling energy, superior cooling, vastly reduced infrastructure costs, and flexibility.
• 90% less cooling energy: Cooling energy is typically 1/3 or more of the power of a data center, and server fans take 10%+ of a server’s power. Both the cooling energy and server fan power can be nearly eliminated to reduce overall power by 40%.
• Superior cooling: Allows high rack power densities up to 100 kW/rack and high socket power dissipation. Thanks to a recently awarded NSF SBIR grant, GR Cooling is testing over-clocking of current generation Intel Nehalem server CPUs. Our superior cooling allows CPU frequencies 30-50% higher than manufacturer specs while still keeping the CPU temperature within operating specs.
• Low infrastructure costs: Because of the simple design (e.g. not sealed, low part count), our system costs are lower than competitive air cooling designs, even before counting energy, power infrastructure, or other savings.
• Flexible: Our system requires neither custom motherboards nor SSDs in place of hard drives, unlike other submersion systems."

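The 40% figure quoted in the Green Revolution Cooling description can be checked with a short calculation. The sketch below is only illustrative: it assumes that everything in the data center other than cooling is server power, and the function and parameter names are hypothetical, not taken from the exhibitor.

```python
def overall_power_savings(cooling_share=1 / 3,
                          fan_share_of_server=0.10,
                          cooling_cut=1.0,
                          fan_cut=1.0):
    """Fraction of total data center power saved.

    cooling_share: share of total power spent on cooling (1/3 in the description).
    fan_share_of_server: share of server power spent on fans (10% in the description).
    cooling_cut, fan_cut: fraction of each that is eliminated (1.0 = fully eliminated).
    Assumption: all non-cooling power is server power.
    """
    server_share = 1.0 - cooling_share
    saved_cooling = cooling_share * cooling_cut
    saved_fans = server_share * fan_share_of_server * fan_cut
    return saved_cooling + saved_fans


# Cooling and fan power both "nearly eliminated" (treated here as fully eliminated).
full_elimination = overall_power_savings()

# Cooling energy reduced by 90%, fans fully eliminated.
ninety_percent_cooling = overall_power_savings(cooling_cut=0.9)

print(f"Full elimination:      {full_elimination:.1%}")
print(f"90% cooling reduction: {ninety_percent_cooling:.1%}")
```

Under these assumptions, one third of total power plus 10% of the remaining two thirds comes to 40%, which matches the quoted figure; with a 90% rather than total cooling reduction the saving is somewhat lower.
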
Luxtera - Silicon Photonics: Displaces Current Solutions to Support 100Gbps

"As HPC applications require more processor computing power and processor performance increases, large-scale computing clusters are faced with an impending interconnect performance bottleneck. To meet these needs, InfiniBand is transitioning to 100G EDR rates and Ethernet is transitioning to 100 Gigabit Ethernet, surpassing the performance of conventional technologies. Due to technological limitations, directly modulated Vertical-Cavity Surface-Emitting Lasers (VCSELs) and multimode fiber will be challenged to deliver cost-effective and reliable solutions past 14Gbps. Copper is low cost but reach-limited at 10Gbps and will only enable a few meters of interconnect at 25Gbps. Silicon Photonics, however, utilizes on-chip waveguide modulation and photo-detection with single-mode fiber to offer practically unlimited reach and performance at 25Gbps and costs per gigabit lower than current generation 10Gbps. Already disruptive at 10G by delivering the lowest cost 4x10G optical interconnect, Silicon Photonics is potentially the only solution for 4x25G connectivity, moving the $1/Gbps price performance of 4x10G to sub-$0.50/Gbps at 25G. Luxtera has already successfully demonstrated Silicon Photonics-based 30G transmitter/receiver technology, paving the way for new Silicon Photonics commercial applications and wider deployment of low cost optical connectivity for EDR InfiniBand, 100G Ethernet and 28G Fibre Channel. Upon entering the market, Luxtera’s 30G transceivers/receivers will be disruptive by displacing existing VCSEL optics and short reach copper interconnects. The technology will offer low power consumption, low bit error rate, high reliability and longer interconnect reach than conventional solutions. Luxtera will demonstrate its 30G transceiver/receivers, detailing Silicon Photonics support for 100Gbps."

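The price-performance claim in the Luxtera description can be turned into a per-link comparison. This is a rough sketch that uses only the figures quoted above ($1/Gbps for 4x10G, under $0.50/Gbps for 4x25G); the function name is hypothetical, and the 25G result is an upper bound because the description says "sub-$0.50".

```python
LANES = 4  # both interconnects in the description are four-lane (4x10G, 4x25G)


def link_cost(lane_gbps, dollars_per_gbps, lanes=LANES):
    """Cost of one multi-lane link at a given price per Gbps."""
    return lanes * lane_gbps * dollars_per_gbps


cost_4x10g = link_cost(10, 1.00)          # quoted $1/Gbps
cost_4x25g_ceiling = link_cost(25, 0.50)  # upper bound from "sub-$0.50/Gbps"
bandwidth_ratio = (LANES * 25) / (LANES * 10)

print(f"4x10G link: ${cost_4x10g:.2f} for {LANES * 10} Gbps")
print(f"4x25G link: under ${cost_4x25g_ceiling:.2f} for {LANES * 25} Gbps")
print(f"Bandwidth ratio: {bandwidth_ratio:.1f}x")
```

Read this way, the claim is that a link with 2.5 times the bandwidth would cost at most 25% more than the current 4x10G link.
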
MBA Sciences, Inc. - SPM.Python: Scalable, Parallel Version of the Python Language

"MBA Sciences has introduced SPM.Python, a commercial product based on patent-pending SPM technology that enables users to exploit parallelism in Python. Simply put, SPM.Python augments the traditional serial Python language with powerful parallel primitives for managing both coarse and fine grained tasks. This robust solution can be leveraged across a broad range of problem domains, without requiring the developers to become proficient in parallel programming. Just as we all drive our cars without worrying about the fine points of automotive engineering, SPM.Python lets developers program for modern hardware without worrying about the fine points of parallel programming. The parallel primitives are designed around one underlying narrative arc: developers should be able to limit their focus to, and therefore author and maintain, the serial components of their application without having to write a single line of parallel code. The underlying technology in SPM.Python incorporates solutions to several previously open problems. It extends the traditional notion of exception handling (within the serial context) across many compute resources. Parallel primitives are designed so that serial (and hence customizable) components are delineated from parallel (and hence centralized) components. Thanks to the robust assumption tracking infrastructure and the coupling of parallel managers with communication primitives, the root causes of the vast majority of parallel deadlocks are simply eliminated. And, finally, the traditional notion of a (serial) sequence point is enhanced so that the declaration and definition of parallel primitives may be done in a deterministic, and hence race-free, manner."

MIT and Bentley University - Solar-powered Supercomputing

"We will demonstrate state-of-the-art low-power supercomputing clusters. One is a cluster running LINPACK on Android. This has been in the field for some time and the numbers are impressive for an architecture that constrained. We will demonstrate a Linux cluster built out of Marvell PlugComputers. This is the next generation in a line of ultra-low-power embedded Linux devices designed for general applications. They are energy-efficient and well suited to specific tasks, especially at single precision. We will demo a Sony PS3 cluster, where we have made modifications that improve energy efficiency and cooling. This is modeled after the well-known UMass PS3 cluster. These designs will move the S-curve from where HPC is now, with incumbent legacy technologies and predictable growth patterns, to another paradigm where performance improvements will outpace what is predicted by Moore's Law. They will further reduce the cost of HPC to open new markets in the low-level cloud. This technology can be realized in five years if cell phone manufacturers keep up their pace towards device convergence. Many developments in ARM processor technology have come from the smartphone industry, and the computational cluster business has benefited from this commodity processor market. Tensilica, in particular, has processors targeted at the cell-phone business that could bring about a number of interesting HPEC breakthroughs. The Power Wall problem has forced HPC users to reassess every aspect of their data center design. Now they take power consumption into account, and many applications can be run on these disruptive architectures more efficiently."

SeaMicro Inc. - Ultra-efficient 12U Server Using 512 Intel Atom Embedded Cores

"SeaMicro will demonstrate an Internet-optimized x86 server that reduces by 75 percent the power and space used by servers. In development for three years, the SM10000 is the ultimate re-think of the volume server. Specifically optimized for the workloads and traffic patterns of the Internet, SeaMicro’s SM10000 integrates 512 Intel Atom processors with Ethernet switching, server management and application load-balancing to create a “plug and play” standards-based server that dramatically reduces power draw and footprint without requiring any modifications to existing software. The key benefits of the SM10000 include:

• using one-quarter of the power and taking one-quarter of the space to do the same work as the best-in-class volume server,
• industry leading density: 2,048 central processing units (CPUs) per standard rack,
• drop-in adoption by running off-the-shelf OSs and applications without change,
• flexible architecture that can support any CPU.

Reports from Google show that if current power trends continue, the cost of energy consumed by a server during its lifetime could surpass the initial purchase cost. In addition, the Environmental Protection Agency reports that volume servers consume more than one percent of the total electricity in the US—representing billions of dollars in wasted operating expense each year.”

SGI - Essential Technology for Productivity at Exascale

"The top research priorities to achieve Exascale this decade include power/cooling and resilience. SGI is actively working on both. However, there are two other critical priorities that SGI can offer as disruptive and essential to a productive Exascale solution: communications across a giga-thread system and usability. In the communications area, we believe the interconnect architecture needs to be smarter for an Exascale system to work productively. With the classical architectural approaches, shared memory will be too inclusive and distributed memory too exclusive for Exascale. Therefore, one solution is a Global Address Space architecture that can exhibit ultra lightweight synchronization and expedient data movement, yet with giga-thread scalability. In the usability area, we envision the next generations of Altix UV as the front end of every Exascale system that’s built. Consider a system running across millions of cores and generating massive datasets. The analysis of outputs from such systems could be as demanding as running the Exascale jobs themselves, requiring a system with extreme data-intensive handling capability."

University of South Wales – BioFab

Exhibit Description:
"The realization of photonic integrated circuits from discrete components is often hampered by incompatibilities between materials and by the lack of a universal substrate for achieving monolithic integration. Here we report a novel method for substrate independent integration of dissimilar optical components by way of biological recognition directed assembly, dubbed "Biofab". Bonding in this scheme is achieved by locally modifying the substrate with a biological protein receptor and the optical component with a biomolecular ligand, or vice versa. The key features of this new technology include: substrate independent assembly, cross-platform vertical scale integration and selective registration of components based on complementary biomolecular interactions. At the booth we will demonstrate two concept devices based on Biofab: (i) a light emitting diode which incorporates III-V nitrides, silicon & II-VI compounds and (ii) silicon photonic structures assembled on a range of disparate substrates."

Virident - Novel Flash Memory Card System Ensuring Sustainable High Performance and High Capacity

Exhibit Description:
"The technology we would like to exhibit is embodied in our recently launched tachIOn drive product, which is a high performance NAND Flash-based storage device packaged in the PCIe low profile form-factor. By virtue of its (1) low power and space footprint, (2) high sustained performance on random small-block accesses in the presence of continuous flash management operations (300K IOPS on 4K reads, and 200K IOPS on 70:30 mixed workloads), and (3) large capacity (up to 800 GB of usable capacity using current generation devices), the tachIOn technology promises to fundamentally reshape the memory and storage hierarchy of HPC server architectures. The current tachIOn product is the first in a roadmap of novel hybrid memory and storage devices, which aims to bring together multiple non-volatile memory technologies to create a new "persistent memory" tier with multi-TB capacities. The disruptive nature of the technology lies in the impact of integrating such a tier on computer systems architecture and operating systems, on the programming languages and application data structures needed to effectively harness the multi-millions of IOPS this tier is capable of, and on HPC scale-out architectures. One can imagine that within five to ten years, the fundamental server building block will look quite different, with a many-core processor subsystem directly interacting with a persistent memory tier, and be programmed very differently, with programming languages that combine high levels of parallelism with explicit management of random access vs. block access state and treat persistence as a first-class construct."

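The IOPS figures in the Virident description translate into sustained bandwidth once a block size is fixed. The sketch below is a back-of-the-envelope conversion, not a vendor specification: it assumes "4K" means 4096 bytes and, since the description gives no block size for the 70:30 mixed workload, that the same block size applies there. The function name is hypothetical.

```python
BLOCK_BYTES = 4096  # assumption: "4K" in the description means 4 KiB blocks


def throughput_mb_per_s(iops, block_bytes=BLOCK_BYTES):
    """Sustained throughput in decimal megabytes per second."""
    return iops * block_bytes / 1e6


read_mb_s = throughput_mb_per_s(300_000)   # 300K IOPS on 4K reads
mixed_mb_s = throughput_mb_per_s(200_000)  # 200K IOPS, block size assumed to be 4K

print(f"4K random reads: {read_mb_s:.1f} MB/s")
print(f"70:30 mixed:     {mixed_mb_s:.1f} MB/s")
```

Under these assumptions the quoted read rate corresponds to a little over 1.2 GB/s of random small-block traffic from a single card.
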
VMware Inc., System Research Team (SRT) at Oak Ridge National Laboratory, Ohio State University, UnivaUD, Inc., and Deopli, Inc. - Virtualization for HPC

Exhibit Description:
"Server virtualization -- the ability to run multiple virtual machines on a single physical system -- has the potential to radically change the HPC landscape as we know it. These changes will span the range from incremental mitigation of current pain points to entirely new ways of thinking about distributed, parallel computing. It is live migration -- transferring running virtual machines and their applications from one physical system to another -- that will underpin the disruptive aspects of virtualized HPC. In the near term this will enable dynamic migration of workload for power management and the ability to move beyond current distributed resource management (DRM) capabilities by allowing workload placement decisions to be revisited and updated in light of subsequent job arrivals. Longer term, live migration can be used to proactively shift pieces of parallel, distributed applications away from failing nodes. According to Christensen, disruptors typically appeal to new, emerging customers because those customers value aspects of the new technology that current customers do not. In the case of virtualization, this new capability has several facets which are all related to dynamic resource utilization, including power management, efficient use of capital resources, and application resilience. To accrue these benefits, these customers are willing to give up some performance, a tradeoff not traditionally embraced within HPC -- though the previous shift to clustered systems does share some similarities with this case. Virtualization has radically reshaped the enterprise datacenter and it has the capacity to do so for HPC as well over the coming years."


Sponsors: IEEE, ACM