Source documentation

 I put this text here in case you feel like hacking the source code
 but need some help getting started. Feel free to send questions to
 kjems@bond.imm.dtu.dk.
    

 Here is a brief description of the life of a queued job. Line
 numbers refer to jobserv-1.59-2; I hope you can extrapolate the
 line numbers to newer versions.

 1. The user runs the qr program on any node. The qr program parses
 jobd.conf to get the name of the master (which runs jobd). qr
 connects to jobd on the master and delivers a REQQUEUE command. This
 command string contains parameters describing the options given by
 the user. The master acknowledges this, and qr sends the script line
 by line. After the last line, a line containing the string "VERIFY"
 is sent. (You can watch this traffic by running qr with the -d
 option.)
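
 To make the handshake concrete, here is a minimal sketch of the qr
 side of the exchange. The host, port and the REQQUEUE argument
 syntax are illustrative assumptions, not the real wire format:

        use IO::Socket::INET;

        my $sock = IO::Socket::INET->new(
            PeerAddr => 'master',      # master host, as found in jobd.conf
            PeerPort => 9000,          # assumed port number
            Proto    => 'tcp',
        ) or die "cannot connect to jobd: $!";

        print $sock "REQQUEUE user=joe queue=def_joe\n";  # assumed option syntax
        my $ack = <$sock>;                                # master acknowledges
        print $sock $_ while <STDIN>;                     # the script, line by line
        print $sock "VERIFY\n";                           # end-of-script marker
        close $sock;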

 2. The main loop of jobd (lines 127-234) collects incoming commands
 for the jobd server. The main loop is capable of handling many
 simultaneous connections with good performance, so it looks somewhat
 complicated, but the functionality is simple: whenever someone
 submits a command, the function "handle_request" is called (line
 225). This function parses the command and in this case calls the
 function "handle_reqqueue" to take care of the queue request. Note
 that the socket (global variable $sock) is still open and that so
 far only a single line with the REQQUEUE command has been
 transferred, not the actual script.
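
 The pattern is the classic select-based server. A stripped-down
 sketch of the idea (the real loop also handles partial reads and
 timeouts; the port number and the stub handler are illustrative):

        use IO::Select;
        use IO::Socket::INET;

        sub handle_request { my ($sock, $cmd) = @_; print $sock "OK\n" }  # stub

        my $listen = IO::Socket::INET->new(LocalPort => 9000, Listen => 5,
                                           Reuse => 1, Proto => 'tcp')
            or die "listen: $!";
        my $sel = IO::Select->new($listen);

        while (1) {
            for my $sock ($sel->can_read) {
                if ($sock == $listen) {
                    $sel->add($listen->accept);       # new client connection
                } elsif (defined(my $line = <$sock>)) {
                    handle_request($sock, $line);     # parse and dispatch
                } else {
                    $sel->remove($sock);              # client closed the connection
                    close $sock;
                }
            }
        }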

 3. The job of "handle_reqqueue" is to parse the arguments, check
 for a lot of possible error conditions, build a hash ("%rq")
 describing the queue parameters (user id, queue name, mail options
 etc.) and call the function "add_queued_resource" which takes care
 of actually downloading the script and creating the entry in the
 global queue.
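
 As an illustration, %rq might end up holding fields along these
 lines (the key names here are assumptions, not copied from the
 source):

        my %rq = (
            userid  => 356,               # uid of the submitting user
            groupid => 100,               # gid of the submitting user
            qname   => 'def_joe',         # queue name, from -n or the default
            workdir => '/home/joe/here',  # where the job should run
            mailwhendone => 1,            # mail options given to qr
        );
        # %rq is then handed to add_queued_resource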

 4. "add_queued_resource" calls "new_queue_entry" to make a new queue
 entry and possible construct a new queue (a queue entry can go into
 its own queue (call qr with the -n option to give a queue name) or
 wait at the end of an already existing one).

 "new_queue_entry" invents a unique key-the letter q followed by
 digits- ($qkey, e.g. "q17") which is used as the hash key for this
 queue entry (and also in the future as the job hash key when the job
 eventually runs). The key is added to a hash list of pending keys
 (%pendingscripts, line 1915), which means that the entry is pending
 to be down-(or should I say up?)-loaded to the jobd. At the same
 time, a unique file name is generated (line 1917) for temporary
 script storage in $TMPDIR (e.g. /var/spool/jobd).
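
 The bookkeeping in "new_queue_entry" amounts to something like this
 sketch (the counter variable and the temporary file name pattern are
 illustrative assumptions):

        my %pendingscripts;
        my $nextqid = 17;                        # assumed key counter
        my $TMPDIR  = '/var/spool/jobd';

        my $qkey = "q" . $nextqid++;             # unique key, e.g. "q17"
        $pendingscripts{$qkey} = 1;              # script not yet transferred (line 1915)
        my $tmpfile = "$TMPDIR/script.$qkey";    # temporary script storage (line 1917)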

 Line 1929 checks if the queue already exists. The global hash "%qn"
 lists all current queue names and contains information about the
 owner and a list of keys for the entries in the queue. The hash is
 indexed by the queue name (which defaults to "def_joe" for user joe
 if he did not supply one), i.e. $qn{"def_joe"}. The value of the
 hash is a reference to a hash holding information about the queue,
 e.g. $qn{def_joe}->{ownedby} is the userid of the owner of the
 queue. The list of the queue entries is in $qn{def_joe}->{inqueue},
 which is a perl list of the queue keys. The information for each
 queue entry is stored in the global "%queue" hash. Line 1962 inserts
 this information as a reference to a hash, e.g. $queue{q17}{workdir}
 is the workdir for queue entry q17.
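
 In code, creating a queue and inserting its first entry boils down
 to something like this sketch (values taken from the examples in
 this text):

        $qn{def_joe} = {                    # line 1929: create the queue if new
            ownedby => 356,                 # uid of the owner
            inqueue => [],                  # keys of the entries in the queue
        } unless exists $qn{def_joe};
        push @{ $qn{def_joe}{inqueue} }, 'q17';

        $queue{q17} = {                     # line 1962: per-entry information
            qname   => 'def_joe',
            workdir => '/home/joe/here',
        };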

 Back in add_queued_resource: the queue entry has been inserted in
 the database and marked as "pending", i.e. awaiting the upload of
 the script. The daemon cannot afford to sit and wait for the script
 to arrive, so the process forks (the parent returns to the main loop
 to service further requests) and the child takes care of the actual
 transfer (lines 2029-2112). Once the script has been successfully
 uploaded, the child process connects to the jobd and submits a
 TRANSFERMADE command, telling jobd that the queue entry is no longer
 pending.
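
 Schematically, the forking part of add_queued_resource looks like
 this sketch (the TRANSFERMADE wire format and the port are
 assumptions; $sock, $tmpfile and $qkey come from the surrounding
 code):

        use IO::Socket::INET;

        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        return if $pid;                     # parent: back to the main loop

        # child: receive the script over the still-open client socket
        open my $out, '>', $tmpfile or exit 1;
        my $nscripts = 1;
        while (my $line = <$sock>) {
            last if $line =~ /^VERIFY\s*$/;                 # end-of-script marker
            $nscripts++ if $line =~ /^#NEXT(SCRIPT)?\s*$/;  # job separator
            print $out $line;
        }
        close $out;

        # report back to jobd: this entry is no longer pending
        my $jobd = IO::Socket::INET->new(PeerAddr => 'master', PeerPort => 9000,
                                         Proto => 'tcp') or exit 1;
        print $jobd "TRANSFERMADE $qkey $nscripts\n";
        exit 0;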

 The jobd process (which has returned to the main loop while the
 child receives the script) gets the TRANSFERMADE command from the
 forked child and calls "handle_transfermade". Recall at this point
 that a user may submit many jobs in one script file (the jobs are
 separated by lines containing "#NEXT" or "#NEXTSCRIPT"). TRANSFERMADE
 takes two arguments: the first is the key, and the second is the
 number of scripts that the child counted in the file. The function
 "add_multiqueue_entries" takes care of creating the extra entries in
 the queue (only the first one was allocated in "new_queue_entry"!)
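
 The extra entries can be made by cloning the parameters of the first
 one; a sketch of the idea (the key allocation details are
 assumptions):

        my $base = $queue{q17};                  # entry made by new_queue_entry
        for my $i (2 .. $nscripts) {
            my $qkey = "q" . $nextqid++;         # fresh key for each extra script
            $queue{$qkey} = { %$base };          # copy of the queue parameters
            push @{ $qn{ $base->{qname} }{inqueue} }, $qkey;
        }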




 DESCRIPTION OF GLOBAL VARIABLES
-------------------------------------------------------------------
 (run "jobmgr dbg | less" as root to see a debugging dump the the
 variables):

 The "%qn" hash holds an entry for each existing queue, it is indexed
 bye the queue name. A queue exists as long as there are jobs in the
 queue or runnning jobs, i.e. the queue is deleted when the last job
 finishes. It holds information about owner, queue activity state
 etc.


%qn             Hash of hashes          $qn{def_joe}{inqueue}[2] is
                                        the third (first is 0) queue
                                        entry (e.g. "q17")

        $qn{def_joe}{submitted} = 970641657             # Time when submitted
        $qn{def_joe}{workdir} = /home/joe/here          # Workdir
        $qn{def_joe}{priority} = 10                     # queue priority
        $qn{def_joe}{ownedby} = 356                     # Uid of owner
        $qn{def_joe}{njob} = 0                          #
        $qn{def_joe}{groupid} = 100                     # gid of owner
        $qn{def_joe}{nicevalue} = 7                     # nice level to run job as
        $qn{def_joe}{inqueue} = ARRAY(0x837a038)        # list of queue keys (refer to %queue)
        $qn{def_joe}{active} = 1                        # State of queue (can be stopped or run-nights-only)
        $qn{def_joe}{zapthisjob} = 1                    # Whether to delete script and script output on
                                                        # node after computation finishes
        $qn{def_joe}{nsubmitted} = 0                    # Number of jobs that have been started
        $qn{def_joe}{nrunning} = 0                      # Number of jobs running (number started minus number finished)
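
 For example, walking the queue database and printing each queue with
 its pending entries could look like:

        for my $name (sort keys %qn) {
            my $q = $qn{$name};
            printf "%-20s owner uid %-6d %d entries\n",
                   $name, $q->{ownedby}, scalar @{ $q->{inqueue} };
            print "    $_\n" for @{ $q->{inqueue} };   # e.g. "q17"
        }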




The "%queue" hash holds descriptions of jobs waiting in the queue. 

   

        $queue{q17}->{qname}            #  Name of the queue this entry belongs to
        $queue{q17}->{groupid}          #  groupid of owner
        $queue{q17}->{userid}           #  userid of owner
        $queue{q17}->{need}             #  string specifying requirements for node
        $queue{q17}->{zapthisjob}       #  non-zero if the script and process output files on the
                                        #  node should be deleted when job finishes running
        $queue{q17}->{doneoutput}       #  Optional filename to redirect job output (stdout & stderr)
        $queue{q17}->{soleaccess}       #  flag for soleaccess (only job on node)
        $queue{q17}->{mem}              #  mem requirement (in MB)
        $queue{q17}->{load}             #  load requirement (in CPU's)
        $queue{q17}->{workdir}          #
        $queue{q17}->{mailfinished}     #  nonzero if mail should be sent when last job in queue finishes computation
        $queue{q17}->{mailoutput}       #
        $queue{q17}->{mailwhendone}     #


  %job is a hash of hashes indexed by the job keys. It stores
 information about currently running jobs, including info about reserved
 cpu, mem and actual load. Example:

        $job{j20}{mem_min} = 1                           # Min reserved mem (MB). Attempt to let the process get at least this much.
        $job{j20}{mem_max} = 6.8046875                   # Max reserved mem (MB). Process is reniced if it uses more than this.
        $job{j20}{load_min} = 0                          # Min reserved load (CPUs). Attempt to let the process get at least this much.
        $job{j20}{load_max} = 1                          # Max reserved load (CPUs). Process is reniced if it uses more than this.
        $job{j20}{userid} = 113                          # User id
        $job{j20}{groupid} = 300                         # Group id
        $job{j20}{start} = "Thu12:10"                    # String with time job was started/discovered
        $job{j20}{cmd} = "xv"                            # cmd string for biggest process (%load+%mem)
        $job{j20}{stat_cnt} = 224                        # Number of stat. samples collected
        $job{j20}{host} = "bond001"                      # Where the job is running
        $job{j20}{pid} = "8922 8924 9100 9104 9105"      # String with pids of procs in job
        $job{j20}{stat_mem_current} = 6.8046875          # Last reported memory use (MB)
        $job{j20}{stat_mem} = ARRAY(0x83a5f74)           # LIFO buffer of last samples of mem
        $job{j20}{stat_mem_max} = 6.8046875              # Max mem used over last samples
        $job{j20}{stat_mem_long} = ARRAY(0x83a5450)      # Longer-term LIFO buffer of mem samples
        $job{j20}{stat_load_current} = 0                 # Last reported CPU use (units of cpu load)
        $job{j20}{stat_load_max} = 0                     # Max load reported over last samples
        $job{j20}{stat_load} = ARRAY(0x83a6608)          # LIFO buffer of last samples of load
        $job{j20}{stat_load_long} = ARRAY(0x83a5cd4)     # Longer-term LIFO buffer of load samples
        $job{j20}{stoppable} =                           # not used
        $job{j20}{soleaccess} = 0                        # True if this job has sole access to node
        $job{j20}{timestamp} = 972036537                 # time() in seconds, when job was discovered
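
 Updating the stat buffers with a fresh measurement $mem_sample
 amounts to keeping a sliding window over the last samples; a sketch
 (the window length of 32 is an assumed value):

        my $j = $job{j20};
        unshift @{ $j->{stat_mem} }, $mem_sample;             # newest sample in front
        pop @{ $j->{stat_mem} } if @{ $j->{stat_mem} } > 32;  # keep a fixed window
        $j->{stat_mem_current} = $mem_sample;
        $j->{stat_mem_max} = (sort { $b <=> $a } @{ $j->{stat_mem} })[0];
        $j->{stat_cnt}++;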


To appear: description of the life of a "jr" job.


Ulrik Kjems
Last modified: Fri Oct 20 13:43:54 CEST 2000