Agenda

The 2015 PBS Works User Group includes two days of meetings and presentations on Tuesday and Wednesday, Sept. 15-16 (at the Computer History Museum in Mountain View, CA), plus optional hands-on training workshops on Monday and Thursday, Sept. 14 & 17 (at Altair’s office in Sunnyvale, CA).

Watch presentation videos and review the slides from PBSUG here.

Monday, September 14, 2015

Location: Altair office in Sunnyvale, CA

Join us for our Welcome Reception!

Attend our Welcome Reception on Monday, September 14, 2015 at the Altair office in Sunnyvale from 5:00 – 7:00 p.m. Meet and get to know fellow users, Altair engineers and partners.


Tuesday, September 15, 2015

Location: The Computer History Museum

Join us for our Birthday Celebration!

Attend our Altair/PBS Birthday Party on Tuesday, September 15, 2015 at the Computer History Museum from 5:00 – 8:30 p.m. Help us celebrate 20 years of PBS and 30 years of Altair.


Wednesday, September 16, 2015

Location: The Computer History Museum


Thursday, September 17, 2015

Location: Altair office in Sunnyvale, CA

Abstract:

When looking to buy a used car, you kick the tires, make sure the radio works, check underneath for leaks, etc. You should be just as careful when deciding which nodes to use to run job scripts.

At the NASA Advanced Supercomputing Facility (NAS), our prologue and epilogue have grown almost into an extension of the O/S to make sure resources that are nominally capable of running jobs are, in fact, able to run the jobs. This presentation describes the issues and solutions used by the NAS for this purpose.

Abstract:

Computer-based analyses are widely adopted throughout the product development cycle in industry. A challenging requirement from HPC users is the ability to support a large number of simultaneous users while reducing the mean waiting time. This talk discusses a resource allocation strategy that prioritizes the number of users served. Queue parameters such as service rate, queue size, and typical resource load are discussed. Next, an adaptive control strategy aimed at avoiding idleness is presented. Parameters supporting the reduction of the waiting time are analyzed. The talk concludes that the ability to handle a large number of job submissions, as well as adaptation to CPU load deviation, was achieved.

Abstract:

The Tokyo Tech. TSUBAME2 supercomputer is one of the world’s leading supercomputers, ranked as high as #4 in the world on the Top500 and recognized as the “greenest supercomputer in the world” on the Green 500. With the GPU upgrade in 2013, it still sustains high performance (5.7 Petaflops Peak) and high usage (nearly 2000 registered users). However, such performance levels have been achieved through pioneering adoption of the latest technologies, such as GPUs and SSDs, which necessitated non-traditional strategies in resource scheduling. Moreover, unlike some mission-oriented supercomputers such as those in DoE or certain commercial sectors, the TSUBAME2 usage portfolio is tremendously diverse, not only in the application portfolio, but also in the usage patterns, expertise of users, etc., aggravating the scheduling challenge. Furthermore, external mandates such as the power shortage after the 3/11 Fukushima accident resulted in a societal mandate to control power usage, so strict power-aware scheduling strategies had to be brought into production. All of these combined were technical adventures that perhaps no other supercomputers had deployed, but solutions were made possible with PBS Professional as the underlying resource scheduler.

Abstract:

This talk gives an overview of how PBS has been and is currently being used at NCI, Australia's national research computing facility. NCI hosts Australia's first petaflop system. We will discuss the progression of scheduling at NCI, from the OpenPBS-based custom PBS scheduler on NCI's previous supercomputers to PBS Pro on the current peak system, Raijin. We will also discuss the features and changes that have been made in PBS over this time and what we are working on for the future of PBS and scheduling at NCI, including suspend/resume-based scheduling, cgroups and more.

Abstract:

As HPC resource requirements continue to increase, so does the need for economical ways to meet them. There are numerous ways to approach this challenge, each of which has a varying return on investment (ROI); unfortunately, some options that offer a higher ROI are often unknown or overlooked. For example, leveraging existing equipment, adding new or used equipment, and handling uncommon peak usage dynamically through cloud solutions managed by a central job management system can prove to be highly available and resource rich, while remaining economical. In this presentation we will discuss how Wayne State University implemented a combination of these approaches to dramatically increase our compute resources for the equivalent cost of only a few new servers.

Abstract:

At the Center for Computational Sciences, University of Tsukuba, we have been operating HA-PACS, a large-scale GPU cluster with 332 computation nodes equipped with 1,328 GPUs, managed by the PBS Professional scheduler. The users are spread across a wide variety of computational science fields, with job sizes ranging from a single node to full-scale parallel processing. There are also several categories of user groups, with paid and free scientific projects. Operating such a large system is challenging: we must keep the system utilization rate high while maintaining fairness across these user groups. We have successfully kept job utilization above 85%-90% under multiple constraints. In this talk, I will provide a case study of how PBS Pro works in our operation, as well as some proposals for more flexible and valuable system operation.

Abstract:

The National Supercomputing Center for Energy and the Environment (NSCEE) supports research projects at the University of Nevada, Las Vegas by providing a full-service supercomputing facility, plus available training and services, to academic and research institutions, government and private industry. NSCEE’s focus is on R&D related to energy, the environment, medical informatics and health care delivery. In this presentation, Joe Lombardo will discuss specific NSCEE projects that leverage HPC workload management to increase speed and performance for compute-intensive work. Lombardo will highlight results from an Alzheimer’s research project that benefited from using PBS Professional. He will then describe the NSCEE’s new system at the Supernap and how this system can be used to advance research for HPC users in both academia/R&D and commercial industry. Lombardo will also highlight two emerging projects: the new School of Medicine and the new Technology Park.

Abstract:

A university environment can be a challenge in many ways, with a wide variety of differing demands from more than a hundred different research groups, so how can a High Performance Computing group hope to meet the requirements of everyone? In this presentation I’ll explore some of the drivers for the HPC services we run at Imperial College London and how this maps onto our PBS Professional configuration. My talk will also cover how we use different features of PBS Pro and what advantages and benefits they give to us.

Abstract:

Orbital ATK, an Aerospace & Defense Manufacturing & Technologies company, has consolidated several localized HPC systems across the US and now provides a centralized HPC service available to any of its US sites. While there are cultural and technical challenges in consolidation, there have also been significant advantages to the business. This presentation will address both, especially from the viewpoint of a medium-sized business (SMB) seeking to improve its level of service and system utilization while providing value to the business.

Abstract:

Historically, cyber security in HPC has been limited to detecting intrusions rather than designing security in from the beginning with a holistic, layered approach to protect the system. SELinux has provided the needed framework to address cyber security issues for a decade, but the lack of an HPC and data analysis eco-system based on SELinux, and the perception that the resulting configuration is “hard” to use, have prevented SELinux configurations from being widely accepted. This presentation will discuss the eco-system that has been developed and certified, debunk the “hard” perception, and illustrate approaches for both government and commercial applications. The presentation includes discussions on SELinux architecture and features, Altair PBS Professional Queuing System, Scale-out Lustre Storage, Applications Performance on SELinux (Vectorization and Parallelization), Relational Databases, and Security Functions (Auditing and other Security Administration actions).

Abstract:

The commercial world uses significant HPC resources for simulation and product design. An increasing number of HPC systems are deployed in the commercial space and their scale is getting larger and larger. These advanced systems push limits in every aspect of Enterprise IT. Accommodating such systems within the enterprise is a challenge, and there have been many recent changes to enterprise IT infrastructures and architectures resulting from the need to support HPC.

Although Commercial HPC is part of the Supercomputing family, Commercial HPC has unique focus areas and challenges. Cloud, Data Lake, Industrial Internet, Internet of Things and Big Data have created a huge impact on HPC systems. Each of these initiatives offers new opportunities to both HPC itself and to the enterprise. HPC systems must change themselves to accept and accommodate these changes.

This presentation covers how HPC transforms science in a corporate information technology ecosystem. It also explains the differences between Research HPC and Commercial HPC in terms of architecture, usage, resource planning and challenges.

Abstract:

In this session, we will open a roundtable to let attendees discuss needs and requirements for HPC on the cloud. The following topics can be discussed:

  • Performance
  • Security
  • Cost optimization
  • Eligible workloads
  • Internal process
  • Public Cloud (AWS, Azure, …)
  • Private Cloud (OpenStack, VMware, …)
  • Cloud bursting

By better understanding your needs for the Cloud, you will help us to make PBS better on the Cloud.

© Copyright 2015 Altair Engineering, Inc. All Rights Reserved. PBS Works is a division of Altair.

Cray

Supercomputing leader Cray builds innovative systems and solutions enabling researchers in any discipline to meet existing and future simulation and analytics challenges. Leveraging 40 years of experience developing and servicing the world’s most advanced supercomputers, Cray brings you a comprehensive portfolio of high performance computing (HPC), storage and data analytics solutions delivering unrivaled performance, efficiency and scalability. With a solution for every budget and need, Cray makes it easy to take advantage of HPC advancements.

Visit www.cray.com

HP

HP offers a full portfolio of Servers that helps customers reduce cost and complexity, accelerate IT service delivery and enable business growth. With HP technology features and workload-optimized design, HP’s server portfolio advances HP’s vision for compute and the future of data center technology.

Visit www.hp.com/go/servers

Intel

Intel® is a world leader in computing innovation. The company designs and builds the essential technologies that serve as the foundation for the world’s High Performance Computing solutions. From workstations to the world’s most powerful supercomputers, Intel provides ever-higher performance for your technical computing applications to speed time to results, handle today’s unprecedented growth in data volumes and improve the accuracy & precision of modeling and simulation applications.

Visit www.intel.com/content/www/us/en/high-performance-computing/server-reliability.html

SGI

SGI is a global leader in large-scale clustered computing, high performance storage, HPC and data center enablement and services. SGI is focused on helping customers solve their most demanding business and technology challenges.

Visit www.sgi.com

insideHPC

Founded on December 28, 2006, insideHPC is a blog that distills news and events in the world of HPC and presents them in bite-sized nuggets of helpfulness as a resource for supercomputing professionals. Written and edited by supercomputing professionals with the help of readers and occasional contributors, insideHPC sifts through all the news so you don’t have to!

Visit www.insidehpc.com

HPCwire

HPCwire is the leader in world-class journalism for HPC. With a legacy dating back to 1986, HPCwire is recognized worldwide for its breakthrough coverage of the fastest computers in the world and the people who run them. For topics ranging from the latest trends and emerging technologies, to expert commentary, in-depth analysis, and original feature coverage, HPCwire delivers it all, as the industry’s leading news authority and most reliable and trusted resource.

Visit www.hpcwire.com today!