
Job Execution (Scheduler)

I get an error when submitting a job, but I do not know which option is bad

The remedy depends on the error message. Typical cases are described below.

qsub: Unknown option

The "qsub: Unknown option" error also occurs when there is an error in the line description starting with "#$" in the job script, besides the option of the qsub command. A common mistake is putting a space before and after the character "=". Please try deleting the space around "=".

Job is rejected, h_rt can not be longer than 10 mins with this group

If you do not specify the TSUBAME group with the -g option or newgrp, the job is treated as a "Trial run".
A "Trial run" is limited to 10 minutes, so this error occurs when the h_rt option specifies more than 10 minutes.

For a "Trial run", please set the h_rt option to 0:10:0 (or shorter).
If you want to execute the job as something other than a "Trial run", specify the TSUBAME group with the -g option or newgrp, as in the example below.
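For example, a job can be submitted with the TSUBAME group specified as follows (GSICGROUP is a placeholder for your group name):

$ qsub -g GSICGROUP job.sh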

Info

In this case, please confirm that you are participating in the appropriate TSUBAME group and that the TSUBAME group has points. For details about TSUBAME groups, please check the TSUBAME portal usage guide.

Unable to run job: Job is rejected. h_rt must be specified.

The job cannot be executed because the h_rt option is not specified. Please specify the execution time and submit the job again.
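For example, adding the following line to the job script specifies an execution time of one hour (the value is only an illustration):

#$ -l h_rt=1:00:00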

Unable to run job: the job duration is longer than duration of the advance reservation id AR-ID.

This error occurs because you specified a time longer than the reserved time.
Please refer to the reservation-related FAQ below.

Related FAQ: About specification of batch job scheduler

error: commlib error: can't set CA chain

This error occurs when the home directory of TSUBAME3.0 is migrated (copied) to TSUBAME4.0 as it is. It is caused by a certificate remaining in the ~/.sge directory used by the TSUBAME3.0 scheduler, which conflicts with the certificate used by the TSUBAME4.0 scheduler.

Since ~/.sge directory is not used by TSUBAME4.0, please delete it as follows.

$ cd $HOME
$ mv .sge .sge.back
If jobs still cannot be submitted even though the ~/.sge directory no longer exists, please contact us.

Unable to run job: job rejected: Only nnnn jobs are allowed per user

nnnn is a number.
This error occurs when the number of simultaneously submitted jobs reaches the limit "Number of jobs submitted at the same time per user".
Either wait for the submitted jobs to finish or delete some of them with the qdel command so that the number of submitted jobs falls within the limit.

Info

"Number of jobs submitted at the same time per user" is a setting to prevent system failure, and does not mean that jobs can be submitted up to this number.
Please help reduce the load on the scheduler by consolidating jobs to the extent possible.

Unable to run job: job rejected: Only nnnnn jobs are allowed per cluster

nnnnn is a number.
This error occurs when the number of simultaneously submitted jobs across the whole system reaches the limit "Number of jobs submitted at the same time per cluster".
If you see this error, we apologize for the inconvenience, but please contact us via Contact Us. The administrator will check the situation.

The job status is "Eqw" and it is not executed.

This may be caused by a system failure, but it is often due to a mistake in the job script.

Please confirm with the following command.

$ qstat -j <job ID> | grep error
Please check the following points. After checking, delete the jobs in "Eqw" status with the qdel command.

Example)
When there is a problem with file permission.

error reason    1:           time of occurrence [5226:17074]: error: can't open stdout output file "<File of the cause>": Permission denied

When there is a line feed code problem, the directory does not exist, or the job script is invalid.

error reason    1:          time of occurrence [5378:990988]: execvp(/var/spool/age/<hostname>/job_scripts/<jobID>, "/var/spool/age/<hostname>/job_scripts/<jobID>") failed: No such file or directory

1. The line feed code of the job script is not in UNIX format (LF).

This also occurs when the line feed code is set to CR+LF on Windows, so please check the actual script as well.
You can check the line feed code with the file command.

$ file <script file name>

  • Output in case of CR + LF

    <Script file name>: ASCII text, with CRLF line terminators
    

  • Output in case of LF

    <Script file name>: ASCII text
    

You can also check with the cat command.

$ cat -e <script file name>

  • Output in case of CR + LF

The end of line is displayed as ^M$

#!/bin/bash^M$
#$ -cwd^M$
#$ -l node_f=1^M$
#$ -l h_rt=0:10:00^M$
module load intel^M$

  • Output in case of LF

The end of line is displayed as $

#!/bin/bash$
#$ -cwd$
#$ -l node_f=1$
#$ -l h_rt=0:10:00$
module load intel$

The measures for the line feed code are as follows:

  • Do not edit scripts on Windows.
  • When editing a script on Windows, use an editor that can display and convert line feed codes, and check the line feed code before submitting.
  • Convert the line feed code to LF with the nkf command.

If the line feed code is not LF, execute the following command.

$ nkf -Lu file1.sh > file2.sh

Warning

file1.sh is the original file (before conversion) and file2.sh is the converted file. The two file names must be different; if they are identical, the file will be corrupted.
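After conversion, you can confirm that the line feed code is now LF with the file command described above (file2.sh is the converted file from the example):

$ file file2.sh
file2.sh: ASCII text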

2. The execution directory does not exist

This occurs when the execution directory described in the job script does not exist.
Please check with the following command.

$ qstat -j  <Job ID> | grep ^error

error reason    1:          09/13/2024 12:00:00 [2222:19999]: error: can't chdir to /gs/bs/test-g/user00/no-dir: No such file or directory
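In this case, create the missing directory or correct the path written in the job script. With the path from the example error above, a fix could be:

$ mkdir -p /gs/bs/test-g/user00/no-dir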

3. The program in the job script is run as a background job (with "&")

The job will not run as expected if the program is started as a background job (with "&") and the script ends without a wait, as shown below.
Example)

#!/bin/sh
#$ -cwd
#$ -l node_f=1
#$ -l h_rt=1:00:00
#$ -N test

module load intel

./a.out &
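A minimal corrected sketch of the same script: keep the background execution but add wait at the end so the job does not finish before a.out completes (see also "Want to execute multiple calculations at once in a batch job" below).

#!/bin/sh
#$ -cwd
#$ -l node_f=1
#$ -l h_rt=1:00:00
#$ -N test

module load intel

./a.out &
wait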

4. When the file does not have the necessary permissions

Please set the permissions appropriately.
Example) Grant read and execute permission to the owner (yourself)

$ chmod u+rx <script_file>

5. Disk Quota

Please check the group disk quota.
As a rough guide, the inode limit is about 2 million inodes per 1 TB.

Please refer to FAQ below.

Checking the usage of group disks with command

How to terminate a job submitted to the batch job scheduler

See "How to terminate the programs executed accidentally" for how to kill processes running on the login nodes.

1. When the job-ID is known

Terminate the job with qdel command as follows.

Example : If job-ID is 10056

$ qdel 10056

2. When the job-ID is unknown

Check the job-ID with the qstat command; the unfinished jobs of the user are displayed.

Example: When the user GSIC checks unfinished jobs, they are displayed as follows.

$ qstat
job-ID  prior  name user  state submit/start at     queue jclass slots ja-task-ID 
------------------------------------------------------------------------------------------
10053 0.555 ts1     GSIC   r     08/28/2024 22:53:44 all.q          28
10054 0.555 ts2     GSIC  qw     08/28/2024 22:53:44 all.q         112
10055 0.555 ts3     GSIC  hqw    08/28/2024 22:53:45 all.q          56
10056 0.555 eq1     GSIC  Eqw    08/28/2024 22:58:42 all.q           7

Tips

Delete jobs with Eqw by yourself. See here for the cause of it.
Refer to here if you want to change the status of a job to hqw.

State  Explanation
r      Running
qw     Waiting in order
hqw    Waiting for other jobs to finish because of the dependency
Eqw    Error for some reason
t      In transition from qw to r

I'd like to check the congestion status of compute node

Please check a stacked line chart in Job monitoring page. To check whether there are free compute nodes, see the green area on the chart.

The chart legend distinguishes four node states, each shown in its own color: Idle Nodes, Running Nodes, Reserved Waiting Nodes, and Reserved Running Nodes.

How to use the scratch area

TSUBAME 4.0 provides the following scratch areas. For details, please refer to Storage use on Compute Nodes in the TSUBAME4.0 User's Guide.

1. Local scratch area

A local scratch area allocated only on the compute node.

For more information, please see here.

Info

You can not write directly under /local.

2. /tmp directory

For /tmp directory, please see here.

SSH login to compute nodes

SSH login to compute nodes is possible only for node_f.
Please use node_f when executing applications that use SSH for MPI communication.

For details, please check the SSH login section of the TSUBAME 4.0 User's Guide.

Submission of dependent job

If you want to execute batch job A-2 as soon as the batch job named A-1 finishes, use the -hold_jid option to submit the jobs as shown below.

$ qsub -N A-1 MM.sh
$ qsub -N A-2 -hold_jid A-1 MD.sh

If you issue the qstat command after submission, the status of A-2 will be "hqw".

Want to execute multiple calculations at once in a batch job

If you want to run multiple calculations in one batch job, for example executing the four commands exec1, exec2, exec3, and exec4 at once, write the batch script as follows.

#!/bin/sh
#$ -cwd
#$ -l node_f=1
#$ -l h_rt=1:00:00

module load cuda
module load intel

exec1 &
exec2 &
exec3 &
exec4 &
wait
The above is only an example.

If you want to execute programs located in different directories at once, you need to specify each executable file with its path.
For example, to directly execute a.out in folder1 of the home directory, specify it as below.

~/folder1/a.out &
Alternatively, move to the directory of the executable file and then execute it:
cd ~/folder1
./a.out &
Or, on a single line:
cd ~/folder1 ; ./a.out &
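As a sketch combining the above, the following script runs a.out in folder1 and b.out in folder2 at the same time (folder2 and b.out are placeholder names used only for illustration):

#!/bin/sh
#$ -cwd
#$ -l node_f=1
#$ -l h_rt=1:00:00

~/folder1/a.out &
(cd ~/folder2 && ./b.out) &
wait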

Warning

If the last line of the script file ends with "&", the job will not run.
Do not forget to write the last wait command of the script.

Calculation starts on the login node before executing the qsub command

If you type the commands described in the manual directly at the shell prompt, as shown below, the calculation starts on the login node before the qsub command is ever executed.

GSICUSER@login1:~> #!/bin/bash
GSICUSER@login1:~> #$ -cwd
GSICUSER@login1:~> #$ -l node_f=2
GSICUSER@login1:~> #$ -l h_rt=0:30:0
GSICUSER@login1:~> module load matlab
GSICUSER@login1:~> matlab -nodisplay -r AlignMultipleSequencesExample
This is because you are directly executing commands on the shell that you need to write in the batch script.
Instead of executing them directly, create a batch script file and specify it with the qsub command.
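For example, the commands shown above would be saved in a script file (the file name job.sh and the group name GSICGROUP below are placeholders) and submitted like this:

GSICUSER@login1:~> cat job.sh
#!/bin/bash
#$ -cwd
#$ -l node_f=2
#$ -l h_rt=0:30:0
module load matlab
matlab -nodisplay -r AlignMultipleSequencesExample
GSICUSER@login1:~> qsub -g GSICGROUP job.sh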

If you are not familiar with terms such as batch script and shell, please see "1.Beginners of UNIX/LINUX" at "I'm a beginner, I don't know what to do."

Errors when executing the qrsh command

This section explains errors that occur when running qrsh.

1.Your "qrsh" request could not be scheduled, try again later.

The error above indicates that there are no vacant resources available for an interactive job.
Please retry after resources become available.

See "I'd like to check the congestion status of compute node" for the status of the use of compute node.

2.Job is rejected. You do NOT have enough point to finish this job

This error indicates that you do not have enough TSUBAME points to secure the nodes.
Please check your point balance.

Reference: FAQ "How long will it take for TSUBAME points to be returned?"

3.Unable to run job: unable to send message to qmaster using port 6444 on host "jobconX": got send error. Exiting.

This error occurs when the AGE server side is under heavy load.
Please wait a while and try again.

Checking the details of an error message printed in the log file

The following message may be printed in the log file in some cases.

/var/spool/age/hostname/job_scripts/JOB-ID: line XX: Process-ID Killed  Program_Name

In this case, type the qacct command to check the job in detail.

$ qacct -j JOB-ID

The following is an output example of the qacct command (excerpt).
Please check the details of each field using the man command.


1. Example when the memory resource is exceeded

$ qacct -j 4500000
qname        all.q               
hostname     r1n1              
group        GSIC          
owner        GSICUSER00            
project      NONE                
department   defaultdepartment   
jobname      SAMPLE.sh
jobnumber    4500000             
taskid       undefined
account      0 0 1 0 0 0 3600 0 0 0 0 0 0
priority     0      
cwd          /path-to-current
submit_host  login1 or login2    
submit_cmd   qsub -A GSICGROUP SAMPLE.sh
qsub_time    %M/%D/%Y %H:%M:%S.%3N
start_time   %M/%D/%Y %H:%M:%S.%3N
end_time     %M/%D/%Y %H:%M:%S.%3N
granted_pe   node_o          
slots        7                   
failed       0    
deleted_by   NONE
exit_status  137                              
maxvmem      120.000G
maxrss       0.000
maxpss       0.000
arid         undefined
jc_name      NONE

You need to pay attention to exit_status, account, and maxvmem in this example.
exit_status shows the cause of the error as an exit code. An exit_status of 137 means 128 + 9 (killed by signal 9, SIGKILL), but since this status appears for various problems, it alone may not identify the cause.

Then check granted_pe and maxvmem.

granted_pe is the granted resource type and maxvmem is the maximum memory usage, respectively.
It can be estimated that about 120 GB of memory was about to be used, although up to 96 GB is available for node_o according to the User's Guide.

In TSUBAME, a job is automatically killed if it uses more memory than assigned.
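In such a case, one remedy (a sketch; please check the available resource types and their memory sizes in the User's Guide) is to request a resource type with more memory, for example a full node:

#$ -l node_f=1
(instead of #$ -l node_o=1 in the original script)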

2. Example when the secured execution time is exceeded

$ qacct -j 50000000
qname        all.q               
hostname     r1n1              
group        GSIC          
owner        GSICUSER00            
project      NONE                
department   defaultdepartment   
jobname      SAMPLE.sh
jobnumber    50000000             
taskid       undefined
account      0 0 1 0 0 0 600 0 0 0 0 0 0
priority     0      
cwd          /path-to-current
submit_host  login0 or login1    
submit_cmd   qsub -A GSICGROUP SAMPLE.sh
granted_pe   node_f          
slots        7                   
failed       0    
deleted_by   NONE
exit_status  137
wallclock    614.711                               
maxvmem      12.000G
maxrss       0.000
maxpss       0.000
arid         undefined
jc_name      NONE

You need to pay attention to exit_status and wallclock in this example.
exit_status shows the cause of the error as an exit code. An exit_status of 137 means 128 + 9 (killed by signal 9, SIGKILL), but since this status appears for various problems, it alone may not identify the cause.

So focus on account and wallclock.
The seventh space-separated field of account indicates the time (in seconds) for which the resources were secured.
In this example it is 600 seconds.

wallclock shows the elapsed time, which is about 614 seconds in this example.

Since the calculation did not finish within the time for which the resources were secured, it can be inferred that the job was forcibly terminated.
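A minimal remedy sketch for this example: specify an h_rt longer than the time the job actually needs. The secured 600 seconds correspond to h_rt=0:10:00 and the job needed about 614 seconds, so for instance:

#$ -l h_rt=0:20:00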

How to transfer X with qrsh

This FAQ explains how to forward X with qrsh.
With this method, you can use GUI applications on resource types other than node_f as well.
Please follow the procedure below.

(Preliminary Work)
Enable X forwarding and ssh to the login node.
Reference: FAQ "X application (GUI) doesn't work" section 1 and 2.

1. After logging in to the login node, execute the following command.
In the example below, GSICUSER uses node_o from login1 for one hour.

GSICUSER@login1:~> qrsh -g GSICGROUP -l node_o=1,h_rt=1:00:00

2. Run the X application you want to use.
The following is an example with ImageMagick.

GSICUSER@r1n3:~> module load imagemagick
GSICUSER@r1n3:~> display

(The ImageMagick display window opens.)

Warning

Depending on the GUI application, some applications cannot be launched or cannot perform calculations because of memory or SSH restrictions.

Info

  • For memory, please use an appropriate resource type.
  • ANSYS Fluent cannot be launched due to the SSH restriction. To avoid this, use the -ncheck option (not supported by the manufacturer).
  • Schrodinger can be launched but cannot compute due to the SSH restriction. You can use it on node_f only.
  • For OpenGL applications, export __GLX_VENDOR_LIBRARY_NAME=mesa is required.

"Warning: Permanently added ECDSA host key for IP address 'XXX.XXX.XXX.XXX' to the list of known hosts." in the error log

This message indicates that the system added the certificate of IP address XXX.XXX.XXX.XXX to the known_hosts file (the list of SSH server certificates) because the node was connected to for the first time or the certificate of a previously connected host had changed. This is normal operation; it does not affect the calculation results and can be ignored.

Errors and remedies of qsub command execution

This section explains the error messages that occur after executing the qsub command and their remedies.

Unable to run job: Job is rejected because too few parameters are specified.
A required parameter is not specified. You need to specify the resource type, the number of resources, and the execution time.

qsub: Unknown option
There is an error in the qsub option specification. Please refer to this.

Unable to run job: Job is rejected. core must be between 1 and 2.
3 or more resources per job can not be used for trial execution. Specify 1 or 2 as the number of resources.

Unable to run job: Job is rejected, h_rt can not be longer than 10 mins with this group.
For trial execution, you can not submit jobs whose execution time exceeds 10 minutes. Please refer to this.

Unable to run job: Job is rejected. You do NOT have enough point to finish this job.
Points to secure the specified resources and time are insufficient.
Please check the point status from the TSUBAME portal page.

Unable to run job: failed receiving gdi request response for mid=1 (got syncron message receive timeout error).
or
Unable to run job: got no response from JSV script"/apps/t4/rhel9/age/customize/jsv.pl".
If the management node is under high load because a large number of jobs were submitted in a short time, communication with the job scheduler may time out and the above error messages may be displayed. The high-load state is temporary; please wait a while and try again.

About specification of batch job scheduler

TSUBAME 4.0 uses AGE (Altair Grid Engine) as its batch job scheduler.

Resource Type

See here for more information on resource types.

Job submission method

See here for more information on job submission method.

Also, please check the related FAQs below.

  • How to use the scratch area
  • Submission of dependent job
  • How to transfer X with qrsh

About job limit

Please check "Various limit value list" about the current limit. If the submitted job exceeds the per-user limit, it will be kept in wait state "qw" even though there are enough idle nodes in TSUBAME4.0. Once the other jobs terminate and the job fits in the per-user limit, it becomes running state "r", if there's enough idel nodes.

About reservation

Reservations can be made in units of one hour, and the nodes can be used until 5 minutes before the reservation end time.
When submitting a job, execute the following command. The AR ID can be confirmed on the portal.

$ qsub -g [TSUBAME Group] -ar [AR ID] <YOURSCRIPTNAME>

Since the nodes can be used only until 5 minutes before the reservation end time, you need to adjust the h_rt value in the -l option of the job script accordingly.
Example) Resource specification when the reservation period is 2 days

#$ -l h_rt=47:55:00
"Reservation" does not apply to the above "Job limits", and has the "Reservation" restriction. Please check "Various limit value list" about the current limit.

Coping with error

Please check the related FAQs below for how to cope with errors.

About troubleshooting at reservation execution

This section summarizes troubleshooting steps for when jobs cannot be submitted during reservation execution.
The following commands are examples where the GSIC group uses AR number 20190108, which is reserved for 2 days.

1. Forgot to add the AR ID
If the -ar option is not attached, the job will be executed as a normal job
(points will be consumed as usual).

Example of NG
When the following command is executed, it is executed as a normal job.

$ qsub -g GSIC hoge.sh

OK example
Be sure to use the -ar option when submitting a job to a reservation.

$ qsub -g GSIC -ar 20190108 hoge.sh

2. h_rt longer than the reserved time
If the time specified with the h_rt option is longer than the reserved time, the job will not run.
Also, because the nodes can be used only until 5 minutes before the reservation end time, please make the specified time 5 minutes shorter than the reservation period.

Example of NG
It is not executed because the full reservation time is specified.

$ grep h_rt hoge.sh
#$ -l h_rt=48:00:00
$ qsub -g GSIC -ar 20190108 hoge.sh

OK example (5 minutes shorter than the reservation end time)

$ grep h_rt hoge.sh
#$ -l h_rt=47:55:00
$ qsub -g GSIC -ar 20190108 hoge.sh

When submitting after the reservation start time, for example because a program terminated abnormally or a job could not be submitted before the reservation started, you need to take the elapsed time into account.
For example, if you submit a job 2 hours after the reservation start time, the script will look like the following (assuming one minute of internal processing time from qsub command execution to allocation of the compute nodes).

$ grep h_rt hoge.sh
#$ -l h_rt=45:54:00
$ qsub -g GSIC -ar 20190108 hoge.sh

Related URLs
  • TSUBAME4.0 User's Guide "Reserve compute nodes"
  • TSUBAME portal User's Guide "Reserving compute nodes"
  • About specification of batch job scheduler

About the difference between h_rt and actual execution time

The time specified by h_rt also includes the time for the preparation processing needed to execute the submitted job. Therefore, the time specified by h_rt is not the same as the actual job execution time.
The points consumed are calculated based on the job execution time excluding the preparation processing time. The preparation time is not constant; it varies depending on the status of the node where the job is executed.

A "command not found" error occured in qrsh/job script

Info

In TSUBAME 3.0, the following description was required in batch scripts, but is no longer necessary in TSUBAME 4.0.
. /etc/profile.d/modules.sh

If "command not found" error occurred when executing a command which is installed by external installer such as pip, please try the following on the login node:

$ type <command>
<command> is hasehd (/path/to/<command>)
Then you can confirm the path, and add the following to the job script:
export PATH=$PATH:/path/to
Here, /path/to is the directory where the command is located.
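For example, commands installed with pip install --user are typically placed under ~/.local/bin on Linux, so the corresponding line in the job script would be (please confirm the actual location with the type command above):

export PATH=$PATH:$HOME/.local/bin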


Related URLs
About common errors in Linux

Run some programs on different CPUs/GPUs in a job script

It is possible to run several programs on different CPUs/GPUs as follows.
In this example, a.out uses CPUs 0-47 and GPU 0, b.out uses CPUs 48-95 and GPU 1, c.out uses CPUs 96-143 and GPU 2, and d.out uses CPUs 144-191 and GPU 3.

#!/bin/sh
#$ -cwd
#$ -V
#$ -l node_f=1
#$ -l h_rt=00:30:00

a[0]=./a.out
a[1]=./b.out
a[2]=./c.out
a[3]=./d.out

for i in $(seq 0 3)
do
    export CUDA_VISIBLE_DEVICES=$i
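    # GPU $i is the only GPU visible to the program; numactl below pins it to CPU cores i*48 to i*48+47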
    numactl -C $((i*48))-$((i*48+47)) ${a[$i]} &
done
wait

Submitted job takes a long time to execute

If the utilization rate of the compute nodes is high, it may take a long time for a submitted job to start because compute node resources cannot be allocated to it. The utilization status of the compute nodes can be viewed from "Job Scheduler Node Status" in Monitoring.
If a submitted job takes a long time to start, please consider reviewing the job from the following perspectives.

  • Are the "resource type" and "execution time" too large?
    The execution order of submitted jobs is automatically determined by the job scheduling system. Basically, it is FIFO (First In First Out), but there are cases where backfilling is performed to maximize the use of resources and the order of execution is switched. Since smaller resource types and execution times are more likely to be subject to backfilling, please double-check that you have not specified more than necessary.
    ( In particular, node_f is very unlikely to be the target of backfilling because it occupies a node. )
  • Use of compute node reservation
    TSUBAME4.0 allows you to reserve compute nodes. By reserving compute nodes, you can ensure that jobs are executed at the specified date and time. For information on how to reserve a compute node, please refer to Compute Nodes Reservation.
    However, please note that if you reserve a compute node, the amount of TSUBAME points consumed will be approximately 1.25 to 10 times as much as if you run it with normal priority via the job scheduler (this will vary depending on the reservation period and season). For details, please refer to "Charging rules of TSUBAME4.0 supercomputer, Institute of Science Tokyo" in Regulations.
  • Using the Premium Option
    TSUBAME4.0 provides a "Premium Option (-p)". When this option is specified, the execution priority of the corresponding job in the job scheduling system is changed. Please refer to Job Script for more information on the premium option.
    There are two precautions when using the Premium Option.
    • The Premium Option may not be effective when there are no compute nodes available, because the job scheduling system is automated.
    • When using the Premium Option, the TSUBAME point consumption will be larger than when executing with normal priority. For details, please refer to "Charging rules of TSUBAME4.0 supercomputer, Institute of Science Tokyo" in Regulations.