Applications¶
Commercial applications (Mathematica, Gaussian, etc.) do not work. Command not found.¶
TSUBAME4.0 requires a fee to use some commercial applications.
Details: Fare overview
Until an application is activated, its commands are not available, so it appears as if the application is not installed.
$ module load gaussian
$ g16 inputfile
-bash: g16: command not found
To use the target commercial application, go to the TSUBAME Portal and perform Application Activation.
After application activation, run it again.
$ module load gaussian
$ g16 inputfile
Please note that app purchases must be made on an account-by-account basis.
Please also see "I purchased a commercial application, but it is not available."
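To tell whether the command has become available after activation, you can check for it on PATH once the module is loaded. The following is only a minimal sketch: the helper name check_app_cmd is hypothetical, and it tests only for the command, not for the purchase status.

```shell
# Hypothetical helper: report whether a command is on PATH.
check_app_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: command not found (check Application Activation on the TSUBAME Portal)"
        return 1
    fi
}

# Example (after "module load gaussian"):
#   check_app_cmd g16
```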
When using Mathematica 14.1 or later¶
The name of the Mathematica GUI command changed in version 14.1.
Version | Execution command name |
---|---|
14.0 | mathematica |
14.1 and later | WolframNB |
See Mathematica for details.
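If a script has to work with both naming schemes, the command name can be chosen from the version number. This is a sketch under assumptions: the helper name mathematica_gui_cmd is hypothetical, versions earlier than 14.0 are assumed to use the old name, and sort -V requires GNU coreutils.

```shell
# Hypothetical helper: print the GUI command name for a Mathematica version.
mathematica_gui_cmd() {
    ver="$1"
    # sort -V orders version strings; if 14.1 sorts first (or ties),
    # the given version is 14.1 or later.
    oldest=$(printf '%s\n' "14.1" "$ver" | sort -V | head -n 1)
    if [ "$oldest" = "14.1" ]; then
        echo "WolframNB"
    else
        echo "mathematica"
    fi
}

# Example:
#   mathematica_gui_cmd 14.1
```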
I purchased a commercial application, but it is not available.¶
Commercial applications must be purchased by account, not by group.
For example, if a group admin makes a purchase, only the group admin's account can use it.
Please complete the purchase process separately for all accounts you wish to use.
A commercial application I purchased suddenly becomes unusable.¶
Purchased commercial applications are available from the time of purchase until the end of the specified month.
The purchased period may have expired. See Application Activation and check the status.
How to use Python packages on PyPI (e.g. Theano)¶
You can install packages into your home directory with the --user option of pip. For example, to install Theano:
$ pip install --user theano
X application (GUI) doesn't work¶
On this page, an X application means a GUI application that is installed on TSUBAME4.0 and runs in the X environment.
Please check the troubleshooting below.
X server application is installed and active on the client PC¶
Windows
There are many X server applications for Windows.
Please confirm that one of them is installed and running.
Mac
Please confirm XQuartz is installed and configured.
https://support.apple.com/en-us/100724
Linux
Please confirm that both an X11 server application and its libraries are installed.
The X transfer option in the terminal is enabled.¶
A Terminal on Windows (Except for Cygwin)
The setting method differs depending on your terminal and X server application.
Please check the manual of each application.
Linux/Mac/Windows(Cygwin)
Please confirm that the ssh command includes the options -Y and -C (-Y enables trusted X11 forwarding, -C enables compression).
$ ssh <account_name>@login.t4.gsic.titech.ac.jp -i <key> -YC
$ ssh gsic_user@login.t4.gsic.titech.ac.jp -i ~/.ssh/t4-key -YC
$ man ssh
Error reproduces in another terminal/Xserver¶
There are various free terminal and X server applications for Windows.
Please check whether the same error occurs with another terminal/X server.
The problem may be caused by the compatibility between the terminal and the X server.
A commercial terminal or X server application may be more compatible.
If the error does not reproduce with other applications, it may be an application-specific problem.
In that case, please understand that we cannot help even if you contact us.
In addition, some X applications require command options.
Please check the manual of the X application you want to use.
Some GL applications that do not work with normal X forwarding or a VNC connection may work with VirtualGL, so please try it if needed.
For details on VirtualGL, please refer to the User's Guide.
Operation check¶
On an interactive node, the following command starts xterm, the standard terminal emulator of the X Window System. Please confirm that it starts.
$ xterm
Example of failure
xterm: Xt error: Can't open display:
xterm: DISPLAY is not set
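Both failure messages above relate to the DISPLAY environment variable, which ssh sets when X forwarding is active. A minimal sketch of a pre-check follows; the helper name check_display is hypothetical.

```shell
# Hypothetical helper: check that X forwarding has set DISPLAY.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set: reconnect with ssh -Y and check the local X server"
        return 1
    fi
    echo "DISPLAY=$DISPLAY"
}

# Example:
#   check_display && xterm
```

A non-empty DISPLAY only shows that forwarding was requested; xterm can still fail if the X server on the client PC is not running.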
Application use¶
Do not run CPU-intensive programs on the login nodes. Please use compute nodes for full-scale use, including visualization.
Please refer to the FAQ below for information on using a GUI application on a compute node. Reference: FAQ "How to transfer X with qrsh"
When using node_f, X transfer can be performed with the ssh -Y command.
Please inform us of the following when you inquire
- Operating System you use (Example: Windows10,Debian12,macOS 14.4.1)
- Terminal environment that the error occurs (Cygwin, PuTTY/VcXsrv, Rlogin/Xming)
- Version
For Windows, the versions of both the terminal and the X server application.
See the manual of each application for how to check its version.
For Linux/Mac, please tell us the SSH version shown by the command below.
$ ssh -V
- Please tell us what you have tried so far, and if you get an error, please include the error message.
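On a Linux or Mac client, part of this information can be collected in one step. The following is a sketch only; the helper name print_client_info is hypothetical, and the terminal and X server versions on Windows still have to be checked by hand.

```shell
# Hypothetical helper: print client-side details to paste into an inquiry.
print_client_info() {
    echo "OS: $(uname -sr)"
    if command -v ssh >/dev/null 2>&1; then
        # ssh -V writes its version string to stderr, so merge it into stdout
        echo "SSH: $(ssh -V 2>&1)"
    else
        echo "SSH: not found"
    fi
}

# Example:
#   print_client_info
```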
I would like to use an application not provided by TSUBAME 4.0¶
Installing applications not provided by TSUBAME 4.0¶
Please check whether the application meets the following conditions.
If it does, you can install it freely at your own risk.
Please check the installation manual and the license agreement of the application.
- Works with OS installed in TSUBAME4.0(Red Hat Enterprise Linux 9.3). Software requiring Windows or Mac OS won't work.
- Not requiring administrator privilege (root) to install it.
- Possible to install it to your own home directory or group disk. (It is not allowed to install it to any specified nodes' local disk.)
- With a valid license.
- Not requiring the change to the settings for the kernel, libraries or the system itself.
- Only when all of these conditions are met can you install and use it, on your own responsibility.
- No need for CII support.
0.Notes
As described above, CII does not provide support for applications brought in by users, because we have no knowledge of them.
If a problem occurs, users must determine for themselves whether it comes from the application itself or from a general TSUBAME issue, and must ask the application vendor about application-specific problems.
The versions of libraries and drivers may be changed at the time of the regular maintenance of TSUBAME etc. In that case, you might need to reconfigure the application you had used. Please be aware of the risk of losing compatibility in the future.
1.Installation directory
You can install in the following two places.
Please choose the one that suits your use.
If you need to share the application within a TSUBAME group, such as the members of a laboratory, please use the high speed storage area.
Files in the home directory cannot be shared, even if you change their permissions with chmod or similar commands.
- Home directory
- Group disk
Reference: TSUBAME 4.0 User's Guide "Storage system"
2.Installation method
Please install application according to the manual or README or community forum of the application to be installed.
Depending on the application, you may need to compile libraries or modules from source yourself.
Below are some notes and a typical installation.
- Package management software such as zypper cannot be used. In general, you have to compile from source.
- If CUDA is used, the application must be compiled on a compute node, since the login nodes do not have GPUs.
- The TSUBAME4.0 OS is not a Debian distribution, so the apt command is not available.
- dnf commands cannot be used because they require root privileges. If you need them, please use containers (see "Use Containers").
Example 1) running the configure script to generate a Makefile, then make, make test and make install:
$ ./configure --prefix=$HOME/install
$ make && make test
$ make install
Example 2) using cmake:
$ mkdir build && cd build
$ cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/install
$ make install
Example 3) running an installer script provided with the application:
$ ./install.sh
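After installing under a prefix in the home directory, the shell and the dynamic linker usually need to be told where to look. The following is a minimal sketch, assuming the prefix $HOME/install used in the examples and the usual bin/ and lib/ layout; the actual layout depends on the application.

```shell
# Assumption: the package was installed with the prefix $HOME/install
# and follows the usual bin/ and lib/ layout.
PREFIX="$HOME/install"
export PATH="$PREFIX/bin:$PATH"
# Append the previous value only if LD_LIBRARY_PATH was already set
export LD_LIBRARY_PATH="$PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

To keep the setting across logins and jobs, the same lines can be placed in ~/.bashrc or at the top of a batch script.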
There is a problem with the operation of the distributed application¶
Since most problems with the distributed applications are caused by the client environment, we do not support them individually. Please resolve them yourself, as you agreed at the time of application.
Even if you contact us, we cannot respond.
Pre/post processing of commercial applications¶
When using the commercial application on TSUBAME4.0, there are the following two cases.
- Perform all processing of pre / solver / post in TSUBAME4.0
- Perform pre / post processing on client and perform solver processing with TSUBAME4.0
1.When performing processing of pre / solver / post in TSUBAME4.0
In TSUBAME 4.0, basically all the pre / solver / post functions are installed, so when running on an interactive node you can perform all of the "pre, solver, post" processing.
How to run interactive jobs and how to use each function depends on the commercial application. Please check the manual and user's guide of each application.
2.When performing pre / post processing on client and perform solver processing with TSUBAME4.0
Operation on TSUBAME may be unstable because of X server compatibility. This problem can be avoided by performing pre / post processing on the client, so we distribute client software.
The software is provided for convenience. Please note that distribution may be discontinued depending on the situation.
The following procedure is necessary to perform pre / post processing on the client and solver processing with TSUBAME4.0.
Step 1: Apply for software usage and obtain it
Step 2: Install the software on the client
Step 3: Perform pre processing with software on the client
Step 4: Transfer the data created in Step 3 to TSUBAME
Step 5: Create a batch script for submission to the job scheduler
Step 6: Execute the qsub command on TSUBAME to submit the batch script created in Step 5
Step 7: Transfer the result data of Step 6 to the client
Step 8: Perform post processing with software installed on the client
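Steps 4 to 7 can be sketched as follows. Everything specific here is an assumption for illustration: the helper name write_job_script, the file names (input.dat, job.sh, result.*), the directory work/, the resource directives and the solver command are all hypothetical, so please take the real values from the User's Guide and the application's manual.

```shell
# Step 5: write a batch script. The directives and the solver command are
# placeholders, not the settings of any particular application.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
#$ -cwd
#$ -l node_f=1
#$ -l h_rt=1:00:00
# Hypothetical solver invocation; replace with the application's command.
solver_command input.dat
EOF
}

# Step 4 (on the client): transfer the input data and the script
#   scp -i <key> input.dat job.sh <account_name>@login.t4.gsic.titech.ac.jp:work/
# Step 6 (on TSUBAME): submit the batch script
#   qsub -g <group> job.sh
# Step 7 (on the client): fetch the results
#   scp -i <key> '<account_name>@login.t4.gsic.titech.ac.jp:work/result.*' .
```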
The range of support by T4 Helpdesk about the program error such as segmentation fault¶
General
Please check the following related FAQ first
About common errors in Linux
"Disk quota exceeded" error is output
Error handling for each commercial application
1.For commercial applications
Supported, except for ABAQUS/ABAQUS CAE. Please provide the following information in your inquiry.
- Application name
  Eg) Abaqus/Explicit
- Error message
  Eg) buffer overflow detected
- JOB_ID
  Eg) 181938
- Host name where the error occurred
  Eg) r6n5
- The situation in detail
  Eg) The error occurred after logging into r6n5 interactively with qrsh and executing the following command. Details are as follows:
$ module load abaqus intel
$ abq2017 interactive job=TEST input=Job1 cpus=6 scratch=$TMPDIR mp_mode=mpi
#Error#
Run package
*** buffer overflow detected ***:
/pathto/package terminated
======= Backtrace: =========
/lib64/libc.so.6(+0x721af)[0x2aaab0c001af]
...
(The rest is omitted)
For ABAQUS/ABAQUS CAE, you need to register on the SIMULIA documentation site and resolve the problem yourself.
For information on the documentation site, please contact us from "Contact Us".
2.For the application compiled yourself
Not supported. Please resolve it yourself. See "I would like to use an application not provided by TSUBAME 4.0".
If you compile with the traceback option, error information is output when the error occurs.
Error handling for each commercial application¶
General
The following error occurs immediately after the program runs:
unable to connect to forwarded X server: Network error: Connection refused
Error: Can't open display: localhost:13.0.
Application name: Xt error: Can't open display:
Application name: DISPLAY is not set
The GUI program suddenly terminates
Please check the keep alive setting in the terminal you use. See the FAQ "Session suddenly disconnected while working on TSUBAME4.0."
A job abruptly aborted
There are various possible causes. Please check the following.
- Check the batch error file (usually script_name.e.$JOBID )
- Check the program-specific log file
- Check the free space of the directory
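The first and last checks can be combined in a small helper. This is a sketch: the name inspect_job and the example file name are hypothetical. Note that df reports the free space of the file system, not your quota.

```shell
# Hypothetical helper: show the tail of a job's error file and the free
# space (in KiB) of the file system that holds it.
inspect_job() {
    errfile="$1"
    dir=$(dirname "$errfile")
    echo "--- last lines of $errfile ---"
    tail -n 20 "$errfile"
    echo "--- free space (KiB) in $dir ---"
    # -P forces one line per file system; field 4 is the available space
    df -Pk "$dir" | awk 'NR == 2 { print $4 }'
}

# Example:
#   inspect_job ./job.sh.e181938
```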
reference: FAQ
About common errors in Linux
"Disk quota exceeded" error is output
The range of support by T4 Helpdesk about the program error such as segmentation fault
How to install numpy, mpi4py, chainer, tensorflow, cupy etc. using python/3.9.18¶
If you want to install numpy, mpi4py, chainer etc. using python/3.9.18, do as follows.
$ module load intel cuda openmpi
$ python3 -m pip install --user python_modules
$ python3 -m pip install --user python_modules==version
License restriction on commercial application usage¶
Please refer to "License restriction on commercial application usage".
I want to install and use my library in R¶
In TSUBAME4.0, R-4.3.2 is available.
In addition to the basic package, the libraries available as default are as follows.
Rmpi, rpud
If you wish to use a library other than the above, you need to install it yourself.
Because you do not have write permission to the R installation directory, specify your own library path and install and manage your libraries there. The procedure is as follows.
The following assumes that the library path is $HOME/Rlib, the library name is testlib, and the source package is testlib.tar.gz.
Load modules:
$ module load cuda openmpi R
$ mkdir ~/Rlib
$ wget https://cran.r-project.org/src/contrib/testlib.tar.gz
$ R CMD INSTALL -l $HOME/Rlib testlib.tar.gz
$ export R_LIBS_USER=$HOME/Rlib
$ R
> library(testlib)
An error occurs when mpi4py.futures.MPIPoolExecutor with openmpi is called¶
An error like the following sometimes occurs when mpi4py.futures.MPIPoolExecutor is called with openmpi.
[r5n2:26205] [[60041,0],0] ORTE_ERROR_LOG: Not found in file orted/pmix/pmix_server_dyn.c at line 87
1. mpirun -np
- use mpi4py with intel MPI
Port forward configuration for each terminal software¶
Port forwarding is configured in each terminal software as follows.
Please try the following after allocating a compute node with qrsh/qsub.
As an example, suppose a compute node r7n7 is allocated, and connect local PC port 5901 to r7n7 port 5901.
1. MobaXterm
Tunneling -> New SSH Tunnel -> My computer with MobaXterm: input 5901 into "Forwarded port". Under SSH server, input login.t4.gsic.titech.ac.jp into "SSH server", your user name into "defaultuser", and 22 into "SSH port". Under Remote server, input r7n7 into "Remote server" and 5901 into "Remote port", then save. Choose the key icon under the Settings tab, and start the configured tunnel.
2. OpenSSH/WSL
$ ssh -L 5901:r7n7:5901 -i <private key> -f -N <username>@login.t4.gsic.titech.ac.jp
3. PuTTY
PuTTY Configuration -> Connection -> SSH -> Tunnels, input 5901 into "Source Port", r7n7:5901 into "Destination" and click "Add" and Open
4. Tera Term
Setup->SSH forwarding->Add->input 5901 into "Forward local port", input r7n7 into "to remote machine", and input 5901 into "port" then click "OK"
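For the OpenSSH case, the node name and port change with every allocation, so it can help to build the command from parameters. The following is a sketch; the helper name tunnel_cmd is hypothetical and it only prints the command so that you can review it before running it.

```shell
# Hypothetical helper: print the ssh command that forwards a local port to
# the same port on a compute node via the login node.
#   -L  local port forwarding (local_port:host:remote_port)
#   -f  go to the background after authentication
#   -N  do not run a remote command (forwarding only)
tunnel_cmd() {
    node="$1"; port="$2"; user="$3"; key="$4"
    echo "ssh -L ${port}:${node}:${port} -i ${key} -f -N ${user}@login.t4.gsic.titech.ac.jp"
}

# Example:
#   tunnel_cmd r7n7 5901 <username> <private key>
```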
file output stops by doing mpirun ... >& log.txt & with intel MPI¶
With Intel MPI, output to the file might stop when mpirun is executed in the background as follows.
mpirun ... ./a.out >& log.txt &
In that case, redirect standard input from /dev/null as follows.
mpirun ... ./a.out < /dev/null >& log.txt &
I want to link Intel oneAPI Math Kernel Library, what are the link options?¶
If you want to link Intel oneAPI Math Kernel Library, please fill in the appropriate items in the Intel oneAPI Math Kernel Library Link Line Advisor, and get the link options from "Use this link line".
How to use VNC from MobaXterm¶
If you have a GUI application on TSUBAME and X forwarding fails to draw or the performance is insufficient, TurboVNC may improve the situation.
Since MobaXterm has a built-in VNC client function, it is relatively easy to use.
Please refer to the User's Guide for how to start a VNC server on a compute node and how to connect to it from MobaXterm.
An error such as "X fatal error. ***ABAQUS/ABQcaeG rank 0 terminated by signal 6" occurs at modeling¶
This error seems to occur when ABAQUS CAE is started and modeling is performed over X forwarding with MobaXterm or similar.
You can avoid this error by using VNC+VirtualGL, so please use it.
Please refer here for more information on how to use VNC from MobaXterm.
Please refer here for more information on how to use VNC via noVNC.
Please refer here for more information on how to use VirtualGL from VNC.
icc, icpc commands cannot be used with Intel oneAPI¶
Starting with Intel oneAPI 2024, the icc and icpc commands are no longer available.
Please use the icx and icpx commands.
We also recommend using the ifx command instead of the ifort command.
https://www.intel.com/content/www/us/en/developer/articles/release-notes/oneapi-fortran-compiler-release-notes.html
C++17 related errors occur when using Intel oneAPI.¶
Since Intel oneAPI 2024, the default standard for C++ has changed from C++14 to C++17.
As a result, errors such as the following may occur:
- error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
- error: ISO C++17 does not allow dynamic exception specifications [-Wdynamic-exception-spec]
For more information, see here.
If this error occurs, do one of the following:
- Modify the source code to conform to the C++17 standard.
- Specify the option -std=c++14 at compile time to compile with the C++14 standard.
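Before choosing between the two, it can help to see how many places in the source are affected. The following is a coarse sketch: the helper name find_cxx17_removed is hypothetical, and the text search will produce some false positives (for example, it also reports throw(), which the two errors above do not cover).

```shell
# Hypothetical helper: list lines that appear to use the 'register' storage
# class or a dynamic exception specification such as ") throw(X)".
find_cxx17_removed() {
    grep -rnE '(^|[^[:alnum:]_])register[[:space:]]|\)[[:space:]]*throw[[:space:]]*\(' "$@"
}

# Example:
#   find_cxx17_removed src/
```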
I'm having trouble building and running VASP.¶
Even if you inquire about VASP, we may not be able to respond because of licensing restrictions.
The VASP Forum has been established as the official user support channel.
Please use the VASP Forum as well.
We have also posted some FAQs for your reference.
I would like to know the procedure to build a VASP with TSUBAME4.0.
"UCX ERROR failed to insert region" error occurs when running VASP
C++17 related errors occur when using Intel oneAPI.
I would like to know the procedure to build a VASP with TSUBAME4.0.¶
This is the procedure when VASP6.4.2 was built on TSUBAME4.0.
- module load
$ module load nvhpc openmpi intel
- Copy the arch/makefile.include.nvhpc_ompi_mkl_omp_acc file and modify it as follows
$ cp arch/makefile.include.nvhpc_ompi_mkl_omp_acc makefile.include
$ vi makefile.include
$ diff arch/makefile.include.nvhpc_ompi_mkl_omp_acc makefile.include
20,21c20,21
< FC = mpif90 -acc -gpu=cc60,cc70,cc80,cuda11.0 -mp
< FCL = mpif90 -acc -gpu=cc60,cc70,cc80,cuda11.0 -mp -c++libs
---
> FC = mpif90 -acc -gpu=cc60,cc70,cc80,cc90,cuda12.3 -mp
> FCL = mpif90 -acc -gpu=cc60,cc70,cc80,cc90,cuda12.3 -mp -c++libs
81c81
< MKLROOT ?= /path/to/your/mkl/installation
---
> MKLROOT ?= /apps/t4/rhel9/isv/intel/mkl/2024.0/
87,88c87,88
< SCALAPACK_ROOT ?= /path/to/your/scalapack/installation
< LLIBS_MKL = -L$(SCALAPACK_ROOT)/lib -lscalapack -Mmkl
---
> #SCALAPACK_ROOT ?= /path/to/your/scalapack/installation
> #LLIBS_MKL = -L$(SCALAPACK_ROOT)/lib -lscalapack -Mmkl
- make
$ make DEPS=1 -j12 2>&1 |tee make.log
Disable HCOLL when executing.
$ export OMPI_MCA_coll=^hcoll
$ make test
"UCX ERROR failed to insert region" error occurs when running VASP¶
Add runtime option -mca coll_hcoll_enable 0.
Example:
mpirun -mca coll_hcoll_enable 0 vasp_std >vasp.log
Unable to module load python¶
Python 3 is available by default in TSUBAME4.0.
No module load required.
When running OpenMPI/Intel MPI, an hcoll-related error or segmentation fault occurs.¶
This may be improved by specifying the following environment variables.
OpenMPI: export OMPI_MCA_coll=^hcoll
Intel MPI: export I_MPI_COLL_EXTERNAL=0
Binaries compiled with the -lblas option in Intel Compiler terminate abnormally at runtime¶
Intel Compiler provides a numerical calculation library called Intel MKL.
When using BLAS with Intel Compiler, please change the option to link the Intel MKL library instead of -lblas. For details, please refer to How to link Intel MKL.
Note that a wide variety of numerical calculation libraries exist, including dedicated libraries such as Intel MKL. Some of them depend on or conflict with other libraries.
If an error occurs in a numerical calculation library, please consider checking for conflicts or using another library.
GPU not used when running TensorFlow¶
When running TensorFlow on a GPU, there are requirements for the combination of TensorFlow, python, cudnn, and cuda.
Please refer to Tensorflow for details.
How to specify CPU number (%cpu) when using Gaussian?¶
In TSUBAME4.0, one node is used by multiple users. Therefore, when using anything other than node_f, the CPU number may start with a number other than 0.
Please be careful when specifying CPU numbers.
If you use the Gaussian provided by TSUBAME4.0 ( module load gaussian ) and do not need to control the CPU numbers¶
When using the Gaussian provided by TSUBAME4.0 ( module load gaussian ), the CPU numbers are automatically set in the environment variable GAUSS_CDEF according to the specified resource type. Therefore, please do not specify %cpu or the environment variable GAUSS_CDEF.
If you want to specify the CPU numbers yourself, or if you use a Gaussian that you prepared yourself¶
There are three ways to set the CPU number appropriately.
Info
In all procedures, be sure to perform unset GAUSS_CDEF first. The CPU number setting must be initialized.
- Use node_f. Because the job occupies a whole node, CPU numbers always start at 0.
- Specify the "number of CPUs" to be used instead of the CPU numbers. Specify the "number of CPUs" you want to use in %NProcShared or in the environment variable GAUSS_PDEF. In this case, you cannot choose which CPU numbers are used.
- Check the dynamically assigned core numbers and specify them as CPU numbers. The allocated core numbers can be confirmed in one of the following ways:
  - See the /sys/fs/cgroup/cpuset/AGE/{JOB_ID}.1/master/cpuset.cpus file
  - See the physcpubind line in the output of the numactl -s command
Here is an example using the numactl -s command.
For example, if 8 cores are allocated for cpu_8 and the following is displayed when the numactl -s command is executed, specify %CPU=80-87.
(We recommend not specifying cores 272 to 279: they are logical cores provided by hyperthreading, and specifying them reduces calculation efficiency.)
$ numactl -s | grep physcpubind
physcpubind: 80 81 82 83 84 85 86 87 272 273 274 275 276 277 278 279
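The range for %CPU can also be derived from the numactl output with a short filter. This is a sketch under assumptions taken from the example above: the first half of the physcpubind list holds the physical cores, the second half holds their hyperthreading siblings, and the physical core numbers are contiguous. The function name physcpubind_to_cpu_range is hypothetical.

```shell
# Hypothetical helper: read "numactl -s" output on stdin and print the
# range of the first half of the physcpubind list (the physical cores in
# the example above).
physcpubind_to_cpu_range() {
    awk '/^physcpubind:/ {
        n = (NF - 1) / 2                  # number of cores in the first half
        printf "%d-%d\n", $2, $(1 + n)    # first and last core of that half
    }'
}

# Example:
#   numactl -s | physcpubind_to_cpu_range
```

If the allocation does not match these assumptions, please read the core numbers from the output directly.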
"No space left on device" error when running Gaussian¶
The free space in the scratch directory specified by Gaussian may have been exhausted.
Please change the scratch directory using the environment variable GAUSS_SCRDIR, with reference to "Environment variable GAUSS_SCRDIR".
Info
The example scripts in the TSUBAME3 and early TSUBAME4.0 user's guides specified the following
export GAUSS_SCRDIR=$TMPDIR
This setting is no longer necessary, because a larger area (the local scratch area) is now used by default.
If this setting remains in your scripts, please delete it, as it increases the risk of running out of free space.
I want to use Alphafold related databases¶
We have prepared databases on TSUBAME for use with the following software. These database files are large in size, so please avoid downloading them individually if at all possible.
Please refer to the respective links for details on how to use them.
- Alphafold2 database
- Alphafold3 database
- LocalColabfold database