Accounting - What happened with that job?
On genepool there are three options for accessing information on your past jobs:
- Genepool completed jobs webpage (genepool only)
- The UGE provided tool: qacct (genepool or phoebe)
- The NERSC provided tool: qqacct - Query Queue Accounting data (genepool or phoebe)
Every time a job completes, whether it failed or succeeded, the UGE batch system writes an entry into its accounting logs. These accounting logs contain a great deal of useful information about the job: the requested resources, UGE's best estimate of the resources used, the submission, start, and end times, and many other important details. In many cases, these log entries can help debug jobs that fail unexpectedly. The data written to the accounting logs are equivalent to the data used by the UGE system as inputs for the FairShare calculations when determining relative job priority. Thus, when a job fails for reasons that UGE assumes or determines to be system related, it can enter misleading values (e.g. zeros) into the accounting logs, to reflect that the user/project wasn't "billed" for that job.
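As a rough illustration of how these fields can be used when debugging, the sketch below derives queue wait and run time from one accounting record and flags records that look zeroed out. The field names follow the labels that qacct prints, but the record values are invented, the times are assumed to be epoch seconds, and the "looks zeroed" test is only a heuristic, not a rule UGE documents.

```python
# Hypothetical accounting record: field names mirror qacct's labels,
# all values are made up for illustration.
record = {
    "jobnumber": "1234567",
    "submission_time": 1400000000,  # assumed to be epoch seconds
    "start_time": 1400000300,
    "end_time": 1400003900,
    "failed": 0,
    "exit_status": 0,
    "ru_wallclock": 3600,
}


def summarize(rec):
    """Return queue wait, run time, and a flag for records that look zeroed."""
    wait_s = rec["start_time"] - rec["submission_time"]
    run_s = rec["end_time"] - rec["start_time"]
    # Heuristic (an assumption): a failed job with zero recorded wallclock
    # may be one that was not "billed", so its usage numbers are suspect.
    looks_zeroed = rec["failed"] != 0 and rec["ru_wallclock"] == 0
    return {"wait_s": wait_s, "run_s": run_s, "looks_zeroed": looks_zeroed}


summary = summarize(record)
# Same record, altered to mimic a system-related failure with zeroed usage.
zeroed_summary = summarize(dict(record, failed=1, ru_wallclock=0))
```

A record flagged this way should be read with care: the zeros describe what was charged, not necessarily what the job actually consumed.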
The actual accounting log data are rotated nightly, so the UGE-provided qacct tool can easily access only the current day's worth of accounting information. The past 90 days of accounting information are retained on disk; you can use qqacct to access all the accounting data on disk. qqacct also supports expressive querying of the logged data to help you find exactly the data you are looking for, and for a few common cases, its partner tool qqplot.py can generate plots of the queried data.
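For scripted analysis, the text that "qacct -j" prints can be turned into dictionaries. The sketch below assumes the usual layout of that output: records separated by a line of "=" characters, each followed by one "name value" pair per line. The sample text, the job details in it, and the function name parse_qacct are all hypothetical. It does not cover qqacct, whose output format is not described here.

```python
# Invented sample in the layout qacct -j normally prints (values are made up).
SAMPLE = """\
==============================================================
qname        normal.q
hostname     node01
owner        alice
jobnumber    1234567
exit_status  0
==============================================================
qname        normal.q
hostname     node02
owner        alice
jobnumber    1234568
exit_status  1
"""


def parse_qacct(text):
    """Split qacct-style text into a list of {field: value} dictionaries."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("====="):
            # A separator line starts a new record; store the previous one.
            if current:
                records.append(current)
            current = {}
            continue
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        # Fields with no value are kept with an empty string.
        current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    if current:
        records.append(current)
    return records


records = parse_qacct(SAMPLE)
failed_jobs = [r["jobnumber"] for r in records if r["exit_status"] != "0"]
```

In real use the text could come from running qacct -j with a job ID, for example through Python's subprocess module. All values stay as strings here, so convert them before doing arithmetic.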
The Genepool completed jobs page provides a web interface that lets users query and retrieve much of the accounting information. These data are stored in a database, which contains a complete record of all the accounting log data recorded for genepool (not just the past 90 days).