
Overview

If your non-interactive job exceeds the combined size limit of 1 GB for its standard output and error streams, it will be killed and you may receive a message like the following in the job's .e file:

Job 1234.gadi-pbs has exceeded size limit of stdout and stderr on node gadi-cpu-clx-0001. Limit: 1.0GB, Used: 1.23GB. Job killed!
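When a non-interactive job finishes, PBS copies the spooled streams back to the directory the job was submitted from, as <jobname>.o<jobid> and <jobname>.e<jobid> files by default. A quick way to inspect them is shown below; the script name myjob.sh and job id 1234 are illustrative only.

# List the .o and .e files PBS copied back to the submission directory
ls -lh myjob.sh.o1234 myjob.sh.e1234

# Look for the "size limit" message in the error stream
cat myjob.sh.e1234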

Why is there a size limit for stdout and stderr streams?

For a non-interactive job, PBS spools the stdout and stderr streams to a dedicated area on the local disk of the node running the job. Space on the local disk is limited and is shared by all jobs running on that node. If a job fills this space, no further output can be captured, and other jobs cannot run on the node until a manual clean-up is done. For this reason, a combined size limit of 1 GB is applied to each job's stdout and stderr streams.
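Before submitting a long run, you can get a rough idea of how much output your program produces by running it briefly, or on a small input, and counting the bytes it writes. A minimal sketch, assuming the same ./myprogram used in the examples below:

# Count the combined bytes written to stdout and stderr by a short test run
./myprogram 2>&1 | wc -c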

What should I do?

For most jobs, a 1 GB limit on the standard output and error streams is more than sufficient; a job that exceeds it is usually doing so unintentionally. An easy fix is to modify your job script so that the program's output is redirected to a file, for instance in your /scratch space.

Example:

# Redirect stdout and stderr to two separate files
./myprogram > /scratch/proj1/abc123/prog.out.${PBS_JOBID} 2> /scratch/proj1/abc123/prog.err.${PBS_JOBID}

# Combine stdout and stderr and redirect to a file
./myprogram &> /scratch/proj1/abc123/prog.out.${PBS_JOBID}
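Note that &> is a bash shorthand; in a plain POSIX shell, use > file 2>&1 instead. In the context of a complete Gadi job script, the redirection might look like the following sketch; the project code proj1, the username abc123 and the resource requests are placeholders to adapt to your own job:

#!/bin/bash
#PBS -P proj1
#PBS -q normal
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=01:00:00
#PBS -l storage=scratch/proj1
#PBS -l wd

# Write the program's output to /scratch rather than the PBS spool area,
# so the job's own stdout and stderr streams stay well under the 1 GB limit
./myprogram > /scratch/proj1/abc123/prog.out.${PBS_JOBID} 2> /scratch/proj1/abc123/prog.err.${PBS_JOBID}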