
3. Environment

3.1. Storage

In TSUBAME4.0, a home directory and two types of group disks (a high-speed storage area and a large-scale storage area) are available.

The home directory and the high-speed storage area are built on shared SSD storage, and the large-scale storage area is built on shared HDD storage.

TSUBAME4.0 storage | Mount point | Capacity | Filesystem
High-speed storage area / Home directory (SSD) | /gs/fs, /home | 372TB | Lustre
Large-scale storage area / Shared application deployment (HDD) | /gs/bs, /apps | 44.2PB | Lustre
Local scratch area (SSD) | /local | 1.92TB/node | xfs
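
The shared areas in the table above can be inspected from a login node; for example, df reports the capacity and current usage at each mount point listed in the table:

$ df -h /home /gs/fs /gs/bs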

The local scratch area is located on the NVMe SSD of each compute node and can be used for temporary files during computation.
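
As an illustration, the body of a job script can stage temporary files on the local SSD and copy only the final results back to shared storage. The sketch below assumes the scheduler provides a per-job temporary directory on the local SSD through the TMPDIR environment variable (see Appendix 4. Storage for the exact variable and path); the program name and group-disk paths are placeholders:

# Work in the per-job scratch directory, assumed here to be $TMPDIR on the local NVMe SSD
cd "$TMPDIR"

# Run the application (placeholder name); intermediate files land on the fast local SSD
/gs/bs/your-group/bin/myprog > result.out

# Copy only the final result back to the group disk before the job ends
cp result.out /gs/bs/your-group/results/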

Info

The capacity of the available local scratch area is determined by the resource type acquired.
The shared scratch area (BeeOND) that was available in TSUBAME3.0 has been discontinued. For details, see Appendix 4. Storage.

Resource type | Local scratch area (GB)
node_f | 1920
node_h | 960
node_q | 480
node_o | 240
gpu_1 | 240
gpu_h | 120
cpu_160 | 96
cpu_80 | 48
cpu_40 | 24
cpu_16 | 9.6
cpu_8 | 4.8
cpu_4 | 2.4

3.2. Compute nodes

The compute nodes of TSUBAME4.0 use the 4th-generation AMD EPYC 9654 (Zen4 architecture), with more than six times as many cores per node as TSUBAME3.0.

Each compute node also has four NVIDIA H100 Tensor Core GPUs.

Item | TSUBAME3.0 | TSUBAME4.0
Computing unit | Compute node: HPE SGI ICE-XA, 540 nodes | Compute node: HPE Cray XD665, 240 nodes
Components (per node) | |
CPU | Intel Xeon E5-2680 v4 2.4GHz x 2 sockets | AMD EPYC 9654 2.4GHz x 2 sockets
Cores/Threads | 14 cores / 28 threads x 2 CPUs | 96 cores / 192 threads x 2 CPUs
Memory | 256GiB | 768GiB (DDR5-4800)
GPU | NVIDIA Tesla P100 for NVLink-Optimized Servers x 4 | NVIDIA H100 SXM5 94GB HBM2e x 4
SSD | 2TB | 1.92TB NVMe U.2 SSD
Interconnect | Intel Omni-Path HFI 100Gbps x 4 | InfiniBand NDR200 200Gbps x 4
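
Inside a job, the per-node figures above can be checked with standard tools; the commands below simply report the CPUs, memory, and GPUs visible to that job:

$ nproc          # logical CPUs visible to the job
$ free -g        # memory in GiB
$ nvidia-smi -L  # list the visible NVIDIA GPUs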

Info

TSUBAME4.0 compute nodes are named r1n1 through r23n11 (r: 1 to 23, n: 1 to 10 or 11).

3.3. Job Scheduler

TSUBAME4.0 uses Altair Grid Engine (AGE), the successor to the Univa Grid Engine (UGE) used on TSUBAME3.0.

The resource types available in TSUBAME4.0 are as follows.

Compared to TSUBAME3.0, both the number of resource types and the number of cores available per resource type have increased; an example job submission using these resource types is shown after the table.

Resource type | Physical CPU cores | Memory (GB) | GPUs | Local scratch area (GB)
node_f | 192 | 768 | 4 | 1920
node_h | 96 | 384 | 2 | 960
node_q | 48 | 192 | 1 | 480
node_o | 24 | 96 | 1/2 | 240
gpu_1 | 8 | 96 | 1 | 240
gpu_h | 4 | 48 | 1/2 | 120
cpu_160 | 160 | 368 | 0 | 96
cpu_80 | 80 | 184 | 0 | 48
cpu_40 | 40 | 92 | 0 | 24
cpu_16 | 16 | 36.8 | 0 | 9.6
cpu_8 | 8 | 18.4 | 0 | 4.8
cpu_4 | 4 | 9.2 | 0 | 2.4
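
As an example, a pay-as-you-go job requesting one of these resource types can be written as a job script and submitted with qsub. The sketch below assumes that, as in TSUBAME3.0, a resource type is requested with the -l option; the chosen resource type, run time, group name, and program are placeholders:

#!/bin/bash
#$ -cwd                # run the job in the submission directory
#$ -l gpu_1=1          # request one gpu_1 chunk (8 cores, 96GB memory, 1 GPU)
#$ -l h_rt=2:00:00     # wall-clock time limit

./myprog               # placeholder application

The script (saved here as job.sh, a placeholder name) is then submitted with:

$ qsub -g [TSUBAME group] job.sh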

3.3.1. Subscription Job

TSUBAME4.0 introduces a "subscription" system that allows quasi-exclusive use of compute nodes on a monthly basis.

Only intramural users and joint-use (academic) users can use this service.

To submit a job under the subscription system, add the -q prior option. The other options are the same as for the pay-as-you-go system.

$ qsub -q prior -g [TSUBAME group] SCRIPTFILE
Option | Description
-g | Specify the TSUBAME group name. Add this as a qsub command-line option, not in the script.
-q prior | Submit as a subscription job. The job may wait up to one hour before execution starts.

For more details about compute node subscription, check here.

Warning

Note that even for a job submitted by a subscription group, if -q prior is not specified, the job will be processed as a pay-as-you-go job.
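
After submission, the Grid Engine qstat command can be used to confirm where a job went; it lists your jobs and, once they are running, the queue instance they were assigned to (the exact queue name shown for subscription jobs is not documented here and may differ):

$ qstat           # job state and, for running jobs, the assigned queue
$ qstat -j JOBID  # detailed information for one job (JOBID is a placeholder)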

3.4. Software

3.4.1. Commercial application

The differences between commercial applications available in TSUBAME4.0 and TSUBAME3.0 can be found here.

A separate application fee is required for the use of some commercial applications. For more details, please refer to the Fee Overview and Commercial Applications (partially charged in TSUBAME4.0) pages.

3.4.2. Free software

The difference between the free software available for TSUBAME4.0 and TSUBAME3.0 can be found here.

3.4.3. Applications used in TSUBAME3.0

TSUBAME4.0 and TSUBAME3.0 use different compilers, MPI implementations, and libraries, so programs built on TSUBAME3.0 cannot run as they are. They must be recompiled on TSUBAME4.0.
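
As an illustration only, a typical rebuild with the TSUBAME4.0 toolchain looks like the following; the module names are placeholders, and the modules actually installed should be checked with module avail:

$ module avail                    # list the compiler/MPI modules provided on TSUBAME4.0
$ module purge                    # start from a clean environment
$ module load gcc openmpi         # placeholder module names; pick ones shown by module avail
$ mpicc -O2 -o myprog myprog.c    # recompile the application from source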