External Job Submission and Execution Controls
This document describes the use of external job submission and execution controls called esub and eexec. These site-specific, user-written executables are used to validate, modify, and reject job submissions, to pass data to job execution environments, and to modify those environments.
Understanding External Executables
About esub and eexec
LSF provides the ability to validate, modify, or reject job submissions, modify execution environments, and pass data from the submission host directly to the execution host through the use of the esub and eexec executables. Both are site-specific and user-written, and must be located in LSF_SERVERDIR.
Validate, modify, or reject a job
To validate, modify, or reject a job, an esub needs to be written. See Using esub.
Modifying execution environments
To modify the execution environment on the execution host, an eexec needs to be written. See Working with eexec.
To pass data directly to the execution host, both an esub and an eexec need to be written. See Using esub and eexec to pass data to execution environments.
Interactive remote execution
Interactive remote execution also runs esub and eexec if they are found in LSF_SERVERDIR. For example, lsrun invokes esub, and RES runs eexec before starting the task. esub is invoked at the time of the ls_connect(3) call, and RES invokes eexec each time a remote task is executed. RES runs eexec only at task startup time.
DCE credentials and AFS tokens
esub and eexec are also used for processing DCE credentials and AFS tokens. See the following documents on the Platform Web site for more information:
Using esub
esub, short for external submission, is a user-written executable (binary or script) that can be used to validate, modify, or reject jobs. The esub is put into LSF_SERVERDIR (defined in lsf.conf), where LSF checks for its existence when a job is submitted, restarted, or modified. If LSF finds an esub, LSF runs it. Whether the job is submitted, modified, or rejected depends on the logic built into the esub.
Any messages that need to be provided to the user should be directed to the standard error (stderr) stream and not the standard output (stdout) stream.
In this section
- Environment variables to bridge esub and LSF
- General esub logic
- Rejecting jobs
- Validating job submission parameters
- Modifying job submission parameters
- Using bmod and brestart commands with mesub
- Use multiple esub (mesub)
- How master esub invokes application-specific esubs
- Configure master esub and your application-specific esub
Environment variables to bridge esub and LSF
LSF provides the following environment variables in the esub execution environment:
LSB_SUB_PARM_FILE
This variable points to a temporary file containing the job parameters that esub reads when the job is submitted. The submission parameters are a set of name-value pairs on separate lines in the format "option_name=value".
The following option names are supported:
Option | Description
LSB_SUB_ADDITIONAL | String format parameter containing the value of the -a option to bsub. The value of -a is passed to esub, but it does not directly affect the other bsub parameters or behavior. The value of -a must correspond to an actual esub file. For example, to use bsub -a fluent, the file esub.fluent must exist in LSF_SERVERDIR. LSB_SUB_ADDITIONAL cannot be changed in or added to LSB_SUB_MODIFY_FILE.
LSB_SUB_BEGIN_TIME | Begin time, in seconds since 00:00:00 GMT, Jan. 1, 1970
LSB_SUB_CHKPNT_DIR | Checkpoint directory. The file path of the checkpoint directory can contain up to 4000 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.
LSB_SUB_COMMAND_LINE | bsub job command argument. LSB_SUB_COMMANDNAME must be set in lsf.conf to enable esub to use this variable.
LSB_SUB_CHKPNT_PERIOD | Checkpoint period in seconds
LSB_SUB_DEPEND_COND | Dependency condition
LSB_SUB_ERR_FILE | Standard error file name
LSB_SUB_EXCEPTION | Exception condition
LSB_SUB_EXCLUSIVE | "Y" specifies exclusive execution
LSB_SUB_EXTSCHED_PARAM | Validate or modify the bsub -extsched option
LSB_SUB_HOLD | Hold job (bsub -H option)
LSB_SUB_HOSTS | List of execution host names
LSB_SUB_HOST_SPEC | Host specifier
LSB_SUB_IN_FILE | Standard input file name
LSB_SUB_INTERACTIVE | "Y" specifies an interactive job
LSB_SUB_LOGIN_SHELL | Login shell
LSB_SUB_JOB_DESCRIPTION | Job description
LSB_SUB_JOB_NAME | Job name
LSB_SUB_JOB_WARNING_ACTION | Job warning action specified by bsub -wa
LSB_SUB_JOB_ACTION_WARNING_TIME | Job warning time period specified by bsub -wt
LSB_SUB_MAIL_USER | Email address used by LSF for sending job email
LSB_SUB_MAX_NUM_PROCESSORS | Maximum number of processors requested
LSB_SUB_MODIFY | "Y" specifies a modification request
LSB_SUB_MODIFY_ONCE | "Y" specifies a modification-once request
LSB_SUB_NOTIFY_BEGIN | "Y" specifies email notification when job begins
LSB_SUB_NOTIFY_END | "Y" specifies email notification when job ends
LSB_SUB_NUM_PROCESSORS | Minimum number of processors requested
LSB_SUB_OTHER_FILES | The value is SUB_RESET if defined, to indicate that a bmod is being performed to reset the number of files to be transferred. The file path can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.
LSB_SUB_OTHER_FILES_number | number is an index number indicating the particular file transfer. The value is the specified file transfer expression. For example, for bsub -f "a > b" -f "c < d", the following would be defined: LSB_SUB_OTHER_FILES_0="a > b" and LSB_SUB_OTHER_FILES_1="c < d"
LSB_SUB_OUT_FILE | Standard output file name
LSB_SUB_PRE_EXEC | Pre-execution command. The command path can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.
LSB_SUB_PROJECT_NAME | Project name
LSB_SUB_PTY | "Y" specifies an interactive job with PTY support
LSB_SUB_PTY_SHELL | "Y" specifies an interactive job with PTY shell support
LSB_SUB_QUEUE | Submission queue name
LSB_SUB_RERUNNABLE | "Y" specifies a rerunnable job. "N" specifies a nonrerunnable job (specified with bsub -rn); the job is not rerunnable even if it was submitted to a rerunnable queue or application profile. For bmod -rn, the value is SUB_RESET.
LSB_SUB_RES_REQ | Resource requirement string. Does not support multiple resource requirement strings.
LSB_SUB_RESTART | "Y" specifies a restart job
LSB_SUB_RESTART_FORCE | "Y" specifies forced restart job
LSB_SUB_RLIMIT_CORE | Core file size limit
LSB_SUB_RLIMIT_CPU | CPU limit
LSB_SUB_RLIMIT_DATA | Data size limit
LSB_SUB_RLIMIT_FSIZE | File size limit
LSB_SUB_RLIMIT_PROCESS | Process limit
LSB_SUB_RLIMIT_RSS | Resident size limit
LSB_SUB_RLIMIT_RUN | Wall-clock run limit
LSB_SUB_RLIMIT_STACK | Stack size limit
LSB_SUB_RLIMIT_SWAP | Virtual memory limit (swap space)
LSB_SUB_RLIMIT_THREAD | Thread limit
LSB_SUB_TERM_TIME | Termination time, in seconds since 00:00:00 GMT, Jan. 1, 1970
LSB_SUB_TIME_EVENT | Time event expression
LSB_SUB_USER_GROUP | User group name
LSB_SUB_WINDOW_SIG | Window signal number
LSB_SUB2_JOB_GROUP | Options specified by bsub -g
LSB_SUB2_LICENSE_PROJECT | LSF License Scheduler project name specified by bsub -Lp
LSB_SUB2_IN_FILE_SPOOL | Spooled input file (bsub -is)
LSB_SUB2_JOB_CMD_SPOOL | Spooled job command file (bsub -Zs)
LSB_SUB2_JOB_PRIORITY | Job priority (bsub -sp and bmod -sp). For bmod -spn, the value is SUB_RESET.
LSB_SUB2_SLA | SLA scheduling options
LSB_SUB2_USE_RSV | Advance reservation ID specified by bsub -U
LSB_SUB3_ABSOLUTE_PRIORITY | For bmod -aps, the value is equal to the APS string given with bmod -aps. For bmod -apsn, the value is SUB_RESET.
LSB_SUB3_APP | Options specified by bsub -app and bmod -app. For bmod -appn, the value is SUB_RESET.
LSB_SUB3_AUTO_RESIZABLE | Defines the job autoresizable attribute. LSB_SUB3_AUTO_RESIZABLE=Y if bmod -ar is specified. LSB_SUB3_AUTO_RESIZABLE=SUB_RESET if bmod -arn is used.
LSB_SUB3_RESIZE_NOTIFY_CMD | Defines the job resize notification command. LSB_SUB3_RESIZE_NOTIFY_CMD=<cmd> if bmod -rnc is specified. LSB_SUB3_RESIZE_NOTIFY_CMD=SUB_RESET if bmod -rncn is used.
LSB_SUB3_JOB_REQUEUE | String format parameter containing the value of the -Q option to bsub. For bmod -Qn, the value is SUB_RESET.
LSB_SUB3_CWD | Current working directory specified on the command line with bsub -cwd
LSB_SUB_INTERACTIVE, LSB_SUB3_INTERACTIVE_SSH | If both are set to "Y", the session of the interactive job is encrypted with SSH.
LSB_SUB_INTERACTIVE, LSB_SUB_PTY, LSB_SUB3_INTERACTIVE_SSH | If all three are set to "Y", the session of the interactive job with PTY support is encrypted with SSH (bsub -ISp).
LSB_SUB_INTERACTIVE, LSB_SUB_PTY, LSB_SUB_PTY_SHELL, LSB_SUB3_INTERACTIVE_SSH | If all four are set to "Y", the session of the interactive job with PTY shell support is encrypted with SSH (bsub -ISs).
LSB_SUB3_POST_EXEC | Run the specified post-execution command on the execution host after the job finishes. Specified by bsub -Ep. The command path can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.
LSB_SUB3_RUNTIME_ESTIMATION | Runtime estimate specified by bsub -We
LSB_SUB3_USER_SHELL_LIMITS | Pass user shell limits to the execution host. Specified by bsub -ul.
LSB_SUB_INTERACTIVE, LSB_SUB3_XJOB_SSH | If both are set to "Y", the session between the X-client and X-server as well as the session between the execution host and submission host are encrypted with SSH (bsub -IX).
Example submission parameter file
If a user submits the following job:
bsub -q normal -x -P my_project -R "r1m rusage[dummy=1]" -n 90 sleep 10
The contents of the LSB_SUB_PARM_FILE will be:
    LSB_SUB_QUEUE="normal"
    LSB_SUB_EXCLUSIVE=Y
    LSB_SUB_RES_REQ="r1m rusage[dummy=1]"
    LSB_SUB_PROJECT_NAME="my_project"
    LSB_SUB_COMMAND_LINE="sleep 10"
    LSB_SUB_NUM_PROCESSORS=90
    LSB_SUB_MAX_NUM_PROCESSORS=90
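Because the parameter file consists of shell-style name-value pairs, an esub written in Bourne shell can read it by sourcing it. The following is a minimal sketch of that idea, not a definitive implementation. The helper name read_sub_param and the temporary file standing in for LSB_SUB_PARM_FILE are assumptions made for illustration, so that the example runs without LSF.

```shell
#!/bin/sh
# Sketch only: read_sub_param is a hypothetical helper, not part of LSF.
# It sources a parameter file in a subshell (so the caller's variables
# are left untouched) and prints the value of one option, or an empty
# line if the option is not set.
read_sub_param() {
    # $1 = path to the parameter file, $2 = option name
    (
        . "$1"
        eval "printf '%s\n' \"\${$2:-}\""
    )
}

# Demonstration: a temporary file stands in for LSB_SUB_PARM_FILE,
# which LSF would normally create and export before running esub.
demo_parm_file=$(mktemp)
cat > "$demo_parm_file" <<'EOF'
LSB_SUB_QUEUE="normal"
LSB_SUB_EXCLUSIVE=Y
LSB_SUB_RES_REQ="r1m rusage[dummy=1]"
LSB_SUB_PROJECT_NAME="my_project"
LSB_SUB_COMMAND_LINE="sleep 10"
LSB_SUB_NUM_PROCESSORS=90
LSB_SUB_MAX_NUM_PROCESSORS=90
EOF

queue=$(read_sub_param "$demo_parm_file" LSB_SUB_QUEUE)
res_req=$(read_sub_param "$demo_parm_file" LSB_SUB_RES_REQ)
hold=$(read_sub_param "$demo_parm_file" LSB_SUB_HOLD)

# Messages go to stderr, as recommended for esub.
echo "queue=$queue res_req=$res_req hold=${hold:-unset}" >&2
rm -f "$demo_parm_file"
```

In a real esub the first argument would be "$LSB_SUB_PARM_FILE". Reading in a subshell is a design choice that keeps submission options from overwriting variables the script already uses.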
LSB_SUB_ABORT_VALUE
This variable indicates the value esub should exit with if LSF is to reject the job submission.
LSB_SUB_MODIFY_ENVFILE
The file in which esub should write any changes to the job environment variables.
esub writes the variables to be modified to this file in the same format used in LSB_SUB_PARM_FILE. The order of the variables does not matter.
After esub runs, LSF checks LSB_SUB_MODIFY_ENVFILE for changes and, if any are found, applies them to the job environment variables.
LSB_SUB_MODIFY_FILE
The file in which esub should write any submission parameter changes.
esub writes the job options to be modified to this file in the same format used in LSB_SUB_PARM_FILE. The order of the options does not matter. After esub runs, LSF checks LSB_SUB_MODIFY_FILE for changes and, if any are found, applies them to the job.
Tip: LSB_SUB_ADDITIONAL cannot be changed in or added to LSB_SUB_MODIFY_FILE.
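As an illustration of the file format, the following sketch appends option changes as name-value pairs and refuses to touch LSB_SUB_ADDITIONAL. The helper name queue_modification is hypothetical, and a temporary file stands in for the LSB_SUB_MODIFY_FILE path that LSF would normally provide. Appending rather than overwriting is an assumption of this sketch, chosen so that several changes can be recorded in one run.

```shell
#!/bin/sh
# Sketch only: queue_modification is a hypothetical helper, not an LSF command.
# It appends one NAME="value" line to the file named by LSB_SUB_MODIFY_FILE.
queue_modification() {
    # $1 = option name, $2 = new value
    if [ "$1" = "LSB_SUB_ADDITIONAL" ]; then
        # The documentation states this option cannot be changed or added.
        echo "LSB_SUB_ADDITIONAL cannot be modified" >&2
        return 1
    fi
    printf '%s="%s"\n' "$1" "$2" >> "$LSB_SUB_MODIFY_FILE"
}

# Demonstration: LSF normally sets LSB_SUB_MODIFY_FILE; a temp file stands in.
LSB_SUB_MODIFY_FILE=$(mktemp)

queue_modification LSB_SUB_QUEUE "queueA"
queue_modification LSB_SUB_PROJECT_NAME "proj1"
queue_modification LSB_SUB_ADDITIONAL "fluent" 2>/dev/null ||
    echo "skipped LSB_SUB_ADDITIONAL" >&2

# Show the result on stderr, since esub reserves stdout for eexec data.
cat "$LSB_SUB_MODIFY_FILE" >&2
```

The same pattern applies to LSB_SUB_MODIFY_ENVFILE for environment variable changes.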
LSF_INVOKE_CMD
Indicates the name of the last LSF command that invoked an external executable (for example, esub or eexec).
External executables get called by several LSF commands (bsub, bmod, lsrun). This variable contains the name of the last LSF command to call the executable.
General esub logic
After esub runs, LSF checks:
1. Is the esub exit value LSB_SUB_ABORT_VALUE?
   - Yes, step 2
   - No, step 4
2. Reject the job
3. Go to step 5
4. Does LSB_SUB_MODIFY_FILE or LSB_SUB_MODIFY_ENVFILE exist? If so, apply the changes.
5. Done
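The checks above can be modeled as a small shell function. This is a sketch of the decision flow as described here, not LSF source code. The function name after_esub, its result strings, and the abort value 97 used in the demonstration are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch only: after_esub is a hypothetical model of the documented checks.
after_esub() {
    # $1 = esub exit value
    # $2 = value of LSB_SUB_ABORT_VALUE
    # $3 = path of LSB_SUB_MODIFY_FILE
    # $4 = path of LSB_SUB_MODIFY_ENVFILE
    if [ "$1" -eq "$2" ]; then
        echo "reject"                 # esub asked for rejection
    elif [ -f "$3" ] || [ -f "$4" ]; then
        echo "apply-changes"          # at least one modify file exists
    else
        echo "submit-unchanged"       # nothing to apply
    fi
}

# Demonstration with stand-in paths; 97 is an arbitrary demo abort value.
demo_dir=$(mktemp -d)
after_esub 97 97 "$demo_dir/modify" "$demo_dir/modify_env" >&2
after_esub 0 97 "$demo_dir/modify" "$demo_dir/modify_env" >&2
echo 'LSB_SUB_QUEUE="queueA"' > "$demo_dir/modify"
after_esub 0 97 "$demo_dir/modify" "$demo_dir/modify_env" >&2
rm -rf "$demo_dir"
```

The rejection check comes first, which mirrors the rule that a rejecting esub should not write to either modify file.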
Rejecting jobs
Depending on your policies, you may choose to reject a job. To do so, have esub exit with LSB_SUB_ABORT_VALUE.
If esub rejects the job, it should not write to either LSB_SUB_MODIFY_FILE or LSB_SUB_MODIFY_ENVFILE.
The following Bourne shell esub rejects all job submissions by exiting with LSB_SUB_ABORT_VALUE:
    #!/bin/sh
    # Redirect stdout to stderr so echo can be used for
    # error messages
    exec 1>&2
    # Reject the submission
    echo "LSF is Rejecting your job submission..."
    exit $LSB_SUB_ABORT_VALUE
Validating job submission parameters
One use of validation is to support project-based accounting. The user can request that the resources used by a job be charged to a particular project. Projects are associated with a job at job submission time, so LSF will accept any arbitrary string for a project name. To ensure that only valid projects are entered and that the user is eligible to charge to that project, you can write an esub.
The following Bourne shell esub validates job submission parameters:
    #!/bin/sh
    . $LSB_SUB_PARM_FILE
    # Redirect stdout to stderr so echo can be used for error messages
    exec 1>&2
    # Check valid projects: reject anything that is neither proj1 nor proj2
    if [ "$LSB_SUB_PROJECT_NAME" != "proj1" -a "$LSB_SUB_PROJECT_NAME" != "proj2" ]; then
        echo "Incorrect project name specified"
        exit $LSB_SUB_ABORT_VALUE
    fi
    USER=`whoami`
    if [ "$LSB_SUB_PROJECT_NAME" = "proj1" ]; then
        # Only user1 and user2 can charge to proj1
        if [ "$USER" != "user1" -a "$USER" != "user2" ]; then
            echo "You are not allowed to charge to this project"
            exit $LSB_SUB_ABORT_VALUE
        fi
    fi
Modifying job submission parameters
esub can be used to modify submission parameters and the job environment before the job is actually submitted.
The following example writes a modification of the LSB_SUB_QUEUE parameter to LSB_SUB_MODIFY_FILE and a modification of the SHELL environment variable to LSB_SUB_MODIFY_ENVFILE.
In the example, user userA can only submit jobs to queue queueA, user userB must use Bourne shell (/bin/sh), and user userC should never be able to submit a job.
    #!/bin/sh
    . $LSB_SUB_PARM_FILE
    # Redirect stdout to stderr so echo can be used for error messages
    exec 1>&2
    USER=`whoami`
    # Ensure userA is using the right queue queueA
    if [ "$USER" = "userA" -a "$LSB_SUB_QUEUE" != "queueA" ]; then
        echo "userA has submitted a job to an incorrect queue"
        echo "...submitting to queueA"
        echo 'LSB_SUB_QUEUE="queueA"' > $LSB_SUB_MODIFY_FILE
    fi
    # Ensure userB is using the right shell (/bin/sh)
    if [ "$USER" = "userB" -a "$SHELL" != "/bin/sh" ]; then
        echo "userB has submitted a job using $SHELL"
        echo "...using /bin/sh instead"
        echo 'SHELL="/bin/sh"' > $LSB_SUB_MODIFY_ENVFILE
    fi
    # Deny userC the ability to submit a job
    if [ "$USER" = "userC" ]; then
        echo "You are not permitted to submit a job."
        exit $LSB_SUB_ABORT_VALUE
    fi
Using bmod and brestart commands with mesub
You can use the bmod command to modify job submission parameters, and brestart to restart checkpointed jobs. Like bsub, bmod and brestart also call mesub, which in turn invokes any existing esub executables in LSF_SERVERDIR.
bmod and brestart cannot make changes to the job environment through mesub and esub. Environment changes only occur when mesub is called by the original job submission with bsub.
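An esub that writes environment changes may want to skip them when it is running for a modification or restart request, since they would have no effect. The sketch below uses the LSB_SUB_MODIFY and LSB_SUB_RESTART options from the option table to detect those cases. It assumes the options have already been read from LSB_SUB_PARM_FILE (for example, by sourcing it). The helper names is_original_submission and maybe_write_env_change are hypothetical, and a temporary file stands in for LSB_SUB_MODIFY_ENVFILE.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of LSF.

# Succeeds when neither a modification ("Y" in LSB_SUB_MODIFY) nor a
# restart ("Y" in LSB_SUB_RESTART) request is being processed.
is_original_submission() {
    [ "${LSB_SUB_MODIFY:-}" != "Y" ] && [ "${LSB_SUB_RESTART:-}" != "Y" ]
}

# Record an environment change only for an original submission.
maybe_write_env_change() {
    # $1 = variable name, $2 = value
    if is_original_submission; then
        printf '%s="%s"\n' "$1" "$2" >> "$LSB_SUB_MODIFY_ENVFILE"
    else
        echo "skipping environment change for $1" >&2
    fi
}

# Demonstration: a temp file stands in for LSB_SUB_MODIFY_ENVFILE.
LSB_SUB_MODIFY_ENVFILE=$(mktemp)

unset LSB_SUB_MODIFY LSB_SUB_RESTART
maybe_write_env_change SHELL /bin/sh          # written

LSB_SUB_MODIFY=Y
maybe_write_env_change SHELL /bin/csh         # skipped, message on stderr

cat "$LSB_SUB_MODIFY_ENVFILE" >&2
```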
Use multiple esub (mesub)
LSF provides a master esub (LSF_SERVERDIR/mesub) to handle the invocation of individual application-specific esub executables and the job submission requirements of your applications.
- Use the -a option of bsub to specify the application you are running through LSF.
For example, to submit a FLUENT job:
bsub -a fluent
The method name fluent uses the esub for FLUENT jobs (LSF_SERVERDIR/esub.fluent), which sets the checkpointing method LSB_ECHKPNT_METHOD="fluent" to use the echkpnt.fluent and erestart.fluent programs.
To specify a mandatory esub method that applies to all job submissions, you can configure LSB_ESUB_METHOD in lsf.conf.
LSB_ESUB_METHOD specifies the name of the esub method used in addition to any methods specified in the bsub -a option.
For example, LSB_ESUB_METHOD="dce fluent" defines DCE as the mandatory security system, and FLUENT as the mandatory application used on all jobs.
Restriction: After LSF version 5.1, the value of -a and LSB_ESUB_METHOD must correspond to an actual esub file in LSF_SERVERDIR. For example, to use bsub -a fluent, the file esub.fluent must exist in LSF_SERVERDIR.
How master esub invokes application-specific esubs
bsub invokes mesub at job submission, which calls esub programs in this order:
- Mandatory esub programs defined by LSB_ESUB_METHOD
- Any existing executable named esub in LSF_SERVERDIR
- Application-specific esub programs in the order specified in the bsub -a option
In this example:
- esub.dce is defined as the only mandatory esub
- An executable named esub already exists in LSF_SERVERDIR
- Executables named esub.fluent and esub.license exist in LSF_SERVERDIR
- bsub -a fluent license submits the job as a FLUENT job, and mesub invokes the following esub executables in LSF_SERVERDIR in this order:
  - esub.dce
  - esub
  - esub.fluent
  - esub.license
- bsub without the -a option submits the job, and mesub invokes only the mandatory esub.dce and the existing esub in LSF_SERVERDIR, not the application-specific esub programs.
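The ordering rules can be expressed as a short function that lists the executables mesub would call. This is a sketch of the documented order under stated assumptions, not the mesub implementation. The function name esub_order and its arguments are hypothetical, and whether a plain esub exists is passed in as a flag rather than checked on disk.

```shell
#!/bin/sh
# Sketch only: esub_order is a hypothetical helper that prints, one per
# line, the esub executables in the order described above.
esub_order() {
    # $1 = value of LSB_ESUB_METHOD (space-separated, may be empty)
    # $2 = "yes" if an executable named esub exists in LSF_SERVERDIR
    # $3 = methods given with bsub -a (space-separated, may be empty)
    for method in $1; do
        echo "esub.$method"           # mandatory methods first
    done
    if [ "$2" = "yes" ]; then
        echo "esub"                   # then the existing plain esub
    fi
    for method in $3; do
        echo "esub.$method"           # then -a methods, in the order given
    done
}

# Demonstration using the example above (output sent to stderr).
esub_order "dce" yes "fluent license" >&2
echo "--- without -a ---" >&2
esub_order "dce" yes "" >&2
```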
Configure master esub and your application-specific esub
The master esub is installed as LSF_SERVERDIR/mesub. After installation:
- Create your own application-specific esub.
- Optional. Configure LSB_ESUB_METHOD in lsf.conf to specify a mandatory esub for all job submissions.
Name your esub
- Use the following naming conventions:
- On UNIX, LSF_SERVERDIR/esub.application
- On Windows,
For FLUENT jobs, for example:
- UNIX: esub.fluent
The name of the esub program must be a valid file name. It can contain only alphanumeric characters, underscore (_) and hyphen (-).
Your existing esub does not need to follow this convention and does not need to be renamed. However, since mesub invokes any esub that follows this convention, you should move any backup copies of your esubs out of LSF_SERVERDIR or choose a name that does not follow the convention (for example, use esub_bak instead of esub.bak).
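To see which files in a directory match the UNIX naming convention, and would therefore be picked up, a check along the following lines can help. It is a sketch under stated assumptions: the function name follows_esub_convention is hypothetical, it tests only the UNIX form (esub. followed by alphanumeric characters, underscores, or hyphens), and it does not cover Windows file names.

```shell
#!/bin/sh
# Sketch only: follows_esub_convention is a hypothetical helper.
# It succeeds when a file name has the form esub.<name>, where <name>
# is non-empty and contains only alphanumerics, underscore, or hyphen.
follows_esub_convention() {
    # $1 = file name without directory
    case "$1" in
        esub.*) suffix=${1#esub.} ;;
        *) return 1 ;;
    esac
    case "$suffix" in
        ""|*[!A-Za-z0-9_-]*) return 1 ;;
    esac
    return 0
}

# Demonstration (results reported on stderr).
for name in esub.fluent esub.bak esub_bak esub; do
    if follows_esub_convention "$name"; then
        echo "$name: follows the convention" >&2
    else
        echo "$name: does not follow the convention" >&2
    fi
done
```

Note that esub.bak passes the check, which is why a backup copy should be renamed (for example, to esub_bak) or moved out of LSF_SERVERDIR.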
Working with eexec
The eexec program runs on the execution host at job start-up and completion time and when checkpointing is initiated. It is run as the user after the job environment variables have been set. The environment variable LS_EXEC_T is set to START, END, and CHKPNT, respectively, to indicate when eexec is invoked.
If you need to run eexec as a different user, such as root, you must properly define LSF_EEXEC_USER in the file /etc/lsf.sudoers. See the Platform LSF Configuration Reference for information about the lsf.sudoers file.
eexec is expected to finish running because the parent job process waits for eexec to finish running before proceeding. The environment variable LS_JOBPID stores the process ID of the process that invoked eexec. If eexec is intended to monitor the execution of the job, eexec must fork a child and then have the parent eexec process exit. The eexec child should periodically test that the job process is still alive using the LS_JOBPID variable.
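The following sketch shows one way an eexec written in Bourne shell might dispatch on LS_EXEC_T and start a background monitor that polls LS_JOBPID. It is an illustration under stated assumptions rather than a reference implementation. The function names monitor_job and eexec_main and the EEXEC_MONITOR_LOG variable are hypothetical. Only LS_EXEC_T and LS_JOBPID come from the description above.

```shell
#!/bin/sh
# Sketch only: monitor_job, eexec_main, and EEXEC_MONITOR_LOG are
# hypothetical names introduced for this example.

# Poll until the process whose ID is in LS_JOBPID no longer exists.
monitor_job() {
    # $1 = polling interval in seconds
    while kill -0 "$LS_JOBPID" 2>/dev/null; do
        sleep "$1"
    done
    echo "job $LS_JOBPID has exited"
}

eexec_main() {
    case "${LS_EXEC_T:-}" in
        START)
            # Start the monitor as a background child with its output
            # detached, so the parent eexec can exit right away and the
            # waiting job process is not held up.
            monitor_job 5 > "${EEXEC_MONITOR_LOG:-/dev/null}" 2>&1 &
            ;;
        END)
            : # site-specific clean-up would go here
            ;;
        CHKPNT)
            : # site-specific checkpoint handling would go here
            ;;
    esac
}

# When LS_EXEC_T is not set (for example, outside LSF), this does nothing.
eexec_main
```

Running the monitor in the background and returning immediately follows the requirement that the parent eexec process exit while the child keeps checking the job.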
Using esub and eexec to pass data to execution environments
If esub needs to pass some data to eexec, it can write the data to its standard output for eexec to read from its standard input (stdin). LSF effectively acts as the pipe between esub and eexec.
Standard output (stdout) from any esub is automatically sent to eexec. Since eexec cannot handle more than one standard output stream, only one esub can use standard output to generate data as standard input to eexec.
For example, the esub for AFS (esub.afs) sends its authentication tokens as standard output to eexec. If you use AFS, no other esub can use standard output.
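The data path can be tried out locally with an ordinary shell pipe standing in for LSF. In this sketch the function names esub_emit and eexec_receive, the SITE_TOKEN key, and the token value are all hypothetical. The only behavior taken from the text is that esub writes data to stdout, sends user messages to stderr, and eexec reads the data from stdin.

```shell
#!/bin/sh
# Sketch only: esub_emit and eexec_receive are hypothetical stand-ins
# for the data-passing parts of an esub and an eexec.

# esub side: user messages go to stderr, data for eexec goes to stdout.
esub_emit() {
    # $1 = value to pass to the execution host
    echo "esub: passing site data to eexec" >&2
    printf 'SITE_TOKEN=%s\n' "$1"
}

# eexec side: read KEY=value lines from stdin and act on known keys.
eexec_receive() {
    while IFS='=' read -r key value; do
        case "$key" in
            SITE_TOKEN) echo "eexec received token: $value" ;;
        esac
    done
}

# Demonstration: a local pipe plays the role LSF plays between the
# submission host and the execution host.
esub_emit "abc123" | eexec_receive
```

Keeping messages on stderr matters here: anything an esub prints on stdout becomes part of the stream that eexec has to parse.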
Platform Computing Inc.