Collaboration with Silicon Graphics Delivers Breakthrough Parallel Scaling for High-End CFD Computations

 

by Stan Posey, Technical Marketing Engineer, CAE Applications, SGI, and Mark Kremenetsky, Ph.D., CAE Applications Specialist, SGI

For several years, Silicon Graphics Inc. (SGI) and Fluent have pursued a cooperative development strategy to ensure top performance of all Fluent software on SGI/Cray Research systems. In a recent collaborative effort, we have achieved breakthrough levels of parallel scalability for FLUENT/UNS.

Parallel Design Issues

Parallel execution of FLUENT/UNS is based on domain decomposition, in which the flow field is divided into multiple partitions of roughly equal size in terms of required computational work. Each partition is solved on an independent processor, with information transferred between partitions through explicit message passing in order to maintain the coherency of the global solution.
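To make the idea concrete, here is a minimal single-process sketch of domain decomposition on a one-dimensional toy problem. It is not FLUENT/UNS code: the function names (`partition`, `jacobi_step`, `solve_partitioned`), the simple averaging update, and the assumption that every cell costs the same amount of work are all illustrative choices made for this example. The "messages" are simulated by copying edge values between neighbouring partitions before each sweep.

```python
def partition(n_cells, n_parts):
    """Split cells 0..n_cells-1 into contiguous ranges of near-equal size.

    Assumes every cell costs the same work and that n_parts <= n_cells.
    """
    base, extra = divmod(n_cells, n_parts)
    bounds, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def jacobi_step(u, left, right):
    """One averaging sweep over a chunk, given the neighbour value at each end."""
    padded = [left] + u + [right]
    return [0.5 * (padded[i - 1] + padded[i + 1])
            for i in range(1, len(padded) - 1)]


def solve_serial(u, left_bc, right_bc, n_iters):
    """Reference solution computed on the undivided domain."""
    for _ in range(n_iters):
        u = jacobi_step(u, left_bc, right_bc)
    return u


def solve_partitioned(u, left_bc, right_bc, n_iters, n_parts):
    """Same sweeps, but each partition only sees its own cells plus edge values."""
    chunks = [u[a:b] for a, b in partition(len(u), n_parts)]
    for _ in range(n_iters):
        # Simulated message passing: gather the edge values each partition
        # needs from its neighbours before anyone updates.
        halos = []
        for p in range(n_parts):
            left = chunks[p - 1][-1] if p > 0 else left_bc
            right = chunks[p + 1][0] if p < n_parts - 1 else right_bc
            halos.append((left, right))
        # Each partition now updates independently, as it would on its own processor.
        chunks = [jacobi_step(c, l, r) for c, (l, r) in zip(chunks, halos)]
    return [v for c in chunks for v in c]


u0 = [0.0] * 10
serial = solve_serial(u0, 1.0, 0.0, 50)
parallel = solve_partitioned(u0, 1.0, 0.0, 50, 4)
print(serial == parallel)
```

Because the edge values are exchanged before every sweep, the partitioned run reproduces the undivided solution exactly. In a real parallel code the exchange would be carried out by a message passing library, and its cost grows with the size of the partition boundaries.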

The term "parallel scaling" describes how parallel speedup behaves as the number of processors increases: whether it stays close to linear or degrades. Several factors can inhibit a high degree of parallel scaling. Solver and model algorithms, and their implementation, determine how often and how much information must be shared across partitions. Efficient planning of what data to share (the message content), and when to share it, is equally important. Domain decomposition tools are critical, because they control the load balance between processors and determine the size of partition boundaries and the resulting message passing requirements. Finally, the computer hardware and communications subsystems, along with the message passing tools used to access them, affect parallel scaling. Both Fluent and SGI have made significant investments to address these software and hardware issues. Note also that individual CFD applications scale differently, depending on model size and flow physics.
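Speedup and parallel efficiency can be computed directly from wall-clock times, as the short sketch below shows. It uses the usual definitions: speedup is the one-processor time divided by the N-processor time, and efficiency is speedup divided by N. The timings in the example are hypothetical values chosen for illustration and are not measurements from the study described in this article.

```python
def speedup(t_serial, t_parallel):
    """Speedup relative to the one-processor run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Fraction of ideal (linear) speedup achieved on n_procs processors."""
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical wall-clock times in seconds, keyed by processor count.
# These are made-up numbers, NOT benchmark results.
timings = {1: 640.0, 16: 40.0, 64: 12.5}

t1 = timings[1]
for n_procs, t in sorted(timings.items()):
    s = speedup(t1, t)
    e = efficiency(t1, t, n_procs)
    print(f"{n_procs:3d} processors: speedup {s:6.1f}, efficiency {e:5.2f}")
```

With these made-up numbers the 16-processor run scales linearly (efficiency 1.0), while the 64-processor run reaches a speedup of 51.2, or 80 percent efficiency. Scaling is "nearly linear" as long as efficiency stays close to 1.0.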

FLUENT/UNS Performance Enhancements

Initial benchmarking by SGI indicated that FLUENT/UNS scaled linearly on up to 16 processors for typical test cases, confirming that the issues noted above were handled well in the code design. Beyond 16 processors, it became apparent that the message passing system was the limiting factor. The team addressed this by implementing the latest SGI proprietary message passing interface, called MPI3.0, in FLUENT/UNS. This greatly improved performance compared to the public domain message passing library already incorporated into Fluent's code design.

Additional enhancements were required for scaling on the non-uniform memory access (NUMA) architecture of the Origin 2000. These enhancements to MPI3.0 enforce "processor-memory affinity", ensuring that data reside in memory that is local to the process using them. FLUENT/UNS served as a test program during the MPI tuning project at SGI, a good example of development collaboration between the two companies.

"One of our goals with application of CFD at Chrysler is to develop a rapid assessment tool for the early design and development phase. The scaling capability demonstrated by FLUENT/UNS on the Origin 2000 system is a very relevant step for us to achieve that goal."

Dr. Richard Sun, Supervisor of Core CFD, Chrysler

Industrial Testing

As part of the parallel scaling study, Chrysler Corporation provided an automotive underhood thermal management model that includes a coarse treatment of external aerodynamics and contains more than 1M cells. Calculations were run on an Origin 2000 system with 128 processors and 4 GBytes of memory. FLUENT/UNS achieved high parallel efficiency and a remarkable level of scaling -- nearly linear up to 64 processors (see Figure 1).

Figure 1. Parallel Speedup on 1M Cell Underhood Study

Towards Higher Resolution and More Complex Modeling

These recent performance achievements set the stage for CFD to expand beyond current modeling practices. Solution turnaround times have been reduced to the point where CFD can influence the design process. Model sizes can be increased to include higher resolution, yielding simulations of increasing complexity, realism, and accuracy. Finally, today's "grand challenges" to CFD, including applications like transient external aerodynamics or large eddy simulation (LES) turbulence modeling that have been limited by solution turnaround times, are more approachable.

Figure 2. Color-coded parallel partitions in the one-million-cell underhood thermal management benchmark study. FLUENT/UNS achieved outstanding parallel scalability.
