Parallel Computing on a Linux Cluster

 

By Diana L. Collier, Fluent Inc.

In the Spring 2003 issue of Fluent News, the Support Corner featured a section on Parallel Computing on a Windows® Cluster. Since that time, many calls have come in with questions about how to run FLUENT on a Linux cluster. Answers to some of the most frequently asked questions are summarized in this article.

Q. What are the benefits of running FLUENT in parallel?

A. The primary goals of running FLUENT in parallel are to reduce calculation turnaround time for complex problems by using multiple processors (CPUs), and to run large cases that cannot run on a single 32-bit processor. (A 32-bit processor can handle cases of up to 2 to 3 million cells.)
There are two ways to run FLUENT in parallel in the Linux environment: on multiple processors in the same machine (a shared memory machine), or on multiple machines in a cluster (distributed memory).

Q. What is a shared memory machine?

A. In a multi-processor, or shared memory, machine, every processor accesses and shares the same memory.

Q. What is a distributed memory machine?

A. A distributed memory machine, or cluster, is a group of two or more machines that are interconnected for network processing. Each processor, or node, in the cluster has access to its own local memory. To access another processor’s memory, it must send messages through the network.

Q. I have a dual processor computer (shared memory machine). How do I launch FLUENT to run on both processors?

A. Once FLUENT is installed, open a terminal window and change to your working directory. Then type: fluent 2d -t2 (or substitute 3d, 2ddp, or 3ddp for 2d). The flag -t2 starts the two-process version of FLUENT. The versions 2ddp and 3ddp use the double precision solvers.
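
As a minimal sketch of the launch line above, the following shell function assembles the command for a shared memory run. The function name fluent_cmd is hypothetical, and defaulting the process count to the number of processors reported by getconf is an assumption made for illustration, not something FLUENT does itself. The function only prints the command, so it can be tried on a machine without FLUENT installed.

```shell
# fluent_cmd VERSION [NPROCS]
# Prints (does not run) the launch command for a shared memory run.
# fluent_cmd is a hypothetical helper, not part of FLUENT.
fluent_cmd() {
    version=$1
    # Assumption: default to the number of processors currently online.
    nprocs=${2:-$(getconf _NPROCESSORS_ONLN)}
    case "$version" in
        2d|3d|2ddp|3ddp) ;;
        *) echo "unknown solver version: $version" >&2; return 1 ;;
    esac
    echo "fluent $version -t$nprocs"
}

fluent_cmd 3ddp 2    # prints: fluent 3ddp -t2
```

On a dual processor machine, calling fluent_cmd 3ddp with no second argument should print the same line.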

Running FLUENT on a cluster

Q. How do I configure my cluster so that I can run FLUENT on it?

A. The following steps are required before you can run FLUENT on a cluster:

  1. Make sure that all nodes in the cluster have access to FLUENT. This is usually (and most efficiently) done by installing FLUENT on one system and exporting the file system to the other nodes.
  2. Make sure that you can use the commands rsh or ssh to communicate between nodes without using a password. These commands are used to open a remote shell (rsh) or a secure remote shell (ssh) on another computer on the network.
  3. Make sure that your home or working directory is shared by all nodes in the cluster.

(Contact your local System Administrator if you need assistance with any of these tasks.)
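
The three requirements above can be spot-checked from the host node before the first parallel run. The sketch below is one possible check, not a FLUENT tool: the function name check_nodes is hypothetical, and it assumes a text file with one host name per line and the use of ssh (the BatchMode option is specific to ssh and would not work with rsh).

```shell
# check_nodes HOSTS_FILE [REMOTE_SHELL]
# For each host listed one per line in HOSTS_FILE, reports whether a
# password-free login works and whether the current directory and the
# fluent command are visible there. REMOTE_SHELL defaults to ssh.
# check_nodes is a hypothetical helper, not part of FLUENT.
check_nodes() {
    hosts_file=$1
    remote_shell=${2:-ssh}
    workdir=$(pwd)
    failures=0
    while read -r host; do
        [ -z "$host" ] && continue
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        # stdin comes from /dev/null so ssh does not consume the host list.
        if ! "$remote_shell" -o BatchMode=yes "$host" true \
                </dev/null >/dev/null 2>&1; then
            echo "$host: password-free login failed"
            failures=$((failures + 1))
        elif ! "$remote_shell" -o BatchMode=yes "$host" \
                "cd '$workdir' && command -v fluent" \
                </dev/null >/dev/null 2>&1; then
            echo "$host: working directory or fluent not visible"
            failures=$((failures + 1))
        else
            echo "$host: ok"
        fi
    done < "$hosts_file"
    # Succeed only if every host passed both checks.
    [ "$failures" -eq 0 ]
}

# Example (run from your working directory on the host node):
#   check_nodes hosts.txt
```

A host that fails the second check has either not mounted the shared working directory or cannot find fluent on its path; the sketch does not distinguish between the two.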

Q. Once my network is configured properly, how do I launch FLUENT to run on the cluster?

A. A multi-processor FLUENT job is launched from a “host node,” which may or may not be used as a “compute node.” The graphical user interface (GUI) is run on the host node, and you can start a parallel FLUENT job by opening up a terminal window and typing the following:

fluent version -tnprocs -pcomm -cnf=hosts.txt

where:

version specifies the version of FLUENT you want to run (2d, 3d, 2ddp, or 3ddp);
-tnprocs specifies the number of processors on which to run the FLUENT job; -t4 indicates that you want to use four processors, for example;
-pcomm specifies the network communicator you are using; see the table on page 42 for the correct pcomm command based on your installed software; and
hosts.txt specifies the name of the hosts (text) file that lists the compute nodes on the cluster.

For example, you would type the following to start a 3D job on four processors using the net communicator and the hosts file hosts.txt:

fluent 3d -t4 -pnet -cnf=hosts.txt

See Chapter 30 of the FLUENT User’s Guide for additional instructions.
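
To make the example concrete, the sketch below writes a hosts file and derives the -t value from it. The node names are hypothetical, and the one-host-per-line layout is an assumption; consult the User’s Guide for the exact format your version expects. The script only prints the launch command rather than running it.

```shell
# Write a hosts file listing the compute nodes.
# Assumptions: one host name per line; node01..node04 are placeholder names.
cat > hosts.txt <<'EOF'
node01
node02
node03
node04
EOF

# Count the non-empty lines so the -t value matches the number of nodes listed.
nprocs=$(grep -c . hosts.txt)

# Assemble and show the launch command from the example above.
launch_cmd="fluent 3d -t$nprocs -pnet -cnf=hosts.txt"
echo "$launch_cmd"    # prints: fluent 3d -t4 -pnet -cnf=hosts.txt
```

Deriving the count from the file keeps the two values in step when nodes are added or removed, assuming one process per listed node.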

Q. What are network communicators and which ones are available for use?

A. Network communicators are used to pass data among processors. Four communicators are supplied when you install FLUENT software. One of these (net) was developed at Fluent, and the other three make use of the publicly available MPICH communication libraries. The communicator nmpi is a network message-passing interface, and so is designed for use on a cluster. The communicator smpi, on the other hand, is used only on shared memory machines. The fourth communicator offered with FLUENT, ssh, is for secure shell message passing. Three vendor-supplied communicators, Myrinet, Scampi, and Scyld, can also be used with FLUENT.

In general, one of these seven options will be chosen as the Default communicator in the File/Run/Select Solver panel mentioned above, so that you will have the best overall performance on the hardware you are using. (If you have purchased and installed one of the vendor-supplied systems, it will become the Default communicator.) Alternatively, you can select one of these using the -pcomm qualifier when you start a parallel FLUENT job from a terminal window.
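
As a rough sketch of selecting a communicator from a terminal window, the helper below turns a communicator name into a -pcomm qualifier. The function name pcomm_flag is hypothetical. Only -pnet appears in this article; forming the qualifier for nmpi and smpi by prefixing -p to the name is an assumption, so check the table for the exact qualifier that matches your installed software. The remaining communicators are deliberately left out because their qualifiers are not given here.

```shell
# pcomm_flag COMMUNICATOR
# Prints the -pcomm qualifier for a communicator name.
# pcomm_flag is a hypothetical helper, not part of FLUENT.
# Assumption: the qualifier is "-p" followed by the name, as with -pnet.
pcomm_flag() {
    case "$1" in
        net|nmpi|smpi) echo "-p$1" ;;
        *) echo "unsupported communicator: $1" >&2; return 1 ;;
    esac
}

comm=net
# Show (do not run) the resulting launch command.
echo "fluent 3d -t4 $(pcomm_flag "$comm") -cnf=hosts.txt"
# prints: fluent 3d -t4 -pnet -cnf=hosts.txt
```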

For comprehensive documentation on running FLUENT in parallel please visit the FLUENT User’s Service Center by logging onto www.fluentusers.com. If you have questions or encounter any difficulties, please do not hesitate to contact your local installation support engineer.
