Cron jobs used for the Run7 Minimum Bias Data Reconstruction

http://www.hep.vanderbilt.edu/~maguirc/Run7/phenixVanderbiltMinBias.html

Contact Person: Charles F. Maguire
Creation Date: May 30, 2007

Database transfer and updating

The database updating sequence starts with the gridFTP transfer of three restore files from the rftpexp01 server machine at RCF. The rftpexp01 node is the only one of the rftpexp nodes from which the firebird node accepts transfers. The cron jobs are as follows:

  1. 05 0 * * * /phenix/data11/maguire/Run7/gridUpdateDataBase.csh >& /phenix/data11/maguire/Run7/sshCommand.log &

    This cron job runs at 5 minutes past midnight EDT (11:05 PM CDT) in the maguire account on rftpexp01. It sends the three restore files to the /mnt/eon0/rhic/dataBaseUpdating/dbRun7 directory of the rhic account on the firebird node.

  2. 35 23 * * * /mnt/eon0/rhic/dataBaseUpdating/fdtUpdateServer.csh >& /mnt/eon0/rhic/dataBaseUpdating/fdtUpdateServer.log &

    This cron job, run from the rhic account on the firebird node, sets up an FDT server process at 11:35 PM CDT. This server process allows FDT transfers to the vpac04 node only.

  3. 35 23 * * * /rhic2/pgsql/updateRun7FromRCF/fdtUpdateClient.csh >& /rhic2/pgsql/updateRun7FromRCF/fdtUpdateClient.log &

    This cron job, run from the postgres account on the vpac04 node, sets up an FDT client process 30 seconds after 11:35 PM CDT to copy the three restore files from the firebird node to the local /rhic2/pgsql/dbRun7 disk area on vpac04.

  4. 05 1 * * * /phenix/data11/maguire/Run7/failureToSendDB.csh >& /phenix/data11/maguire/Run7/failureToSendDB.log &

    If the gridFTP job fails to send the three restore files to the firebird node, then this cron job, which runs at 1:05 AM EDT in the maguire account on rftpexp01, sends a failure notice to the maguire e-mail address at Vanderbilt.

  5. #0 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17 /var/lib/pgsql/update/startRestoreFromDumpsABC.csh >& /var/lib/pgsql/startRestoreFromDumpsABC.log &

    The above cron job is not yet operational, nor has it been tested to work. It is a prototype of how to start the 3-cycle database update on vpac15. Since the 3-cycle database system may be changed to vpac04 instead, this cron job may be moved to vpac04 in the near future.
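
The four active entries above run on two different clocks (EDT at RCF, CDT at Vanderbilt), so their real order is easier to see on a single time axis. The following is only an illustrative sketch: it assumes fixed daylight-time offsets, an arbitrary pair of dates (June 1-2, 2007), and job labels that are shorthand rather than names used by the scripts.

```python
from datetime import datetime, timedelta, timezone

# Assumption: daylight time is in effect at both sites, so fixed offsets suffice.
EDT = timezone(timedelta(hours=-4), "EDT")  # clock used by the RCF cron jobs
CDT = timezone(timedelta(hours=-5), "CDT")  # clock used by the Vanderbilt cron jobs

# (label, start time on the host's own clock); the dates are arbitrary examples.
JOBS = [
    ("1. gridFTP send from rftpexp01", datetime(2007, 6, 2, 0, 5, tzinfo=EDT)),
    ("2. FDT server on firebird", datetime(2007, 6, 1, 23, 35, tzinfo=CDT)),
    ("3. FDT client on vpac04", datetime(2007, 6, 1, 23, 35, tzinfo=CDT)),
    ("4. failure check on rftpexp01", datetime(2007, 6, 2, 1, 5, tzinfo=EDT)),
]


def timeline_in_cdt(jobs):
    """Return (HH:MM in CDT, label) pairs ordered by absolute start time."""
    ordered = sorted(jobs, key=lambda job: job[1])
    return [(start.astimezone(CDT).strftime("%H:%M"), label)
            for label, start in ordered]


if __name__ == "__main__":
    for clock, label in timeline_in_cdt(JOBS):
        print(clock, "CDT", label)
```

On the CDT axis the gridFTP send starts at 23:05, the FDT server and client start at 23:35, and the failure check runs at 00:05, one hour after the send.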

FDT for incoming PRDF files (all of these scripts run on the vpac15 node using the postgres account)

  1. 15 * * * * /var/lib/pgsql/monitoring/inputFireBirdOccupancyEon0.pl >& /var/lib/pgsql/monitoring/inputFireBirdOccupancyPerlEon0.log &

    Script to check the disk occupancy of /mnt/eon0 on the firebird node

  2. 20 * * * * /var/lib/pgsql/monitoring/inputFireBirdOccupancyEon1.pl >& /var/lib/pgsql/monitoring/inputFireBirdOccupancyPerlEon1.log &

    Script to check the disk occupancy of /mnt/eon1 on the firebird node

  3. 25 * * * * /var/lib/pgsql/monitoring/inputFireBirdNewFilesEon0.csh >& /var/lib/pgsql/monitoring/inputFireBirdNewFilesEon0.log &

    Script to check for the arrival of new input PRDF files in the /mnt/eon0 area on the firebird node

  4. 30 * * * * /var/lib/pgsql/monitoring/inputFireBirdNewFilesEon1.csh >& /var/lib/pgsql/monitoring/inputFireBirdNewFilesEon1.log &

    Script to check for the arrival of new input PRDF files in the /mnt/eon1 area on the firebird node
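
The two kinds of checks listed above (disk occupancy and arrival of new files) could look roughly like the sketch below. This is not the content of the actual .pl and .csh scripts, which is not shown here. It assumes the area being checked is visible as a local path, whereas the real scripts run on vpac15 and query the firebird node. The function names are hypothetical.

```python
import os
import shutil


def occupancy_percent(path):
    """Percentage of the filesystem holding `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:  # guard against pseudo-filesystems reporting zero size
        return 0.0
    return 100.0 * usage.used / usage.total


def new_files(directory, previously_seen):
    """Names present in `directory` now that were absent from the last snapshot."""
    current = set(os.listdir(directory))
    return sorted(current - set(previously_seen))


if __name__ == "__main__":
    area = "."  # stand-in for an input area such as /mnt/eon0
    snapshot = os.listdir(area)
    print("occupancy: %.1f%%" % occupancy_percent(area))
    print("new since snapshot:", new_files(area, snapshot))
```

An hourly job of this kind would have to keep the previous listing between runs, for example in a small text file next to its log.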

Job submission and monitoring

  1. 5,35 * * * * /gpfs3/RUN7PRDF/prod/run7/cautoRun7 >& /gpfs3/RUN7PRDF/prod/run7/cautoRun7.log

    Script runs on the vmps18 node using the phnxreco account. The script checks to see if the previous set of jobs has finished before launching a new set of jobs. Various conditions are checked to see if the script should exit immediately.

There are presently no cron jobs which monitor the progress of the reconstruction jobs submitted to PBS. Such a cron job will be needed for the global monitoring WWW page.
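
The launch guard described for cautoRun7 can be pictured as a short chain of early exits. The sketch below is an assumption about the general shape only. The one check taken from the description is that the previous set of jobs must have finished. The other two conditions (a stop request and the absence of input files) are hypothetical stand-ins for the unspecified "various conditions".

```python
def launch_decision(jobs_still_running, stop_requested, inputs_waiting):
    """Return 'launch', or a string giving the reason for exiting immediately.

    jobs_still_running: number of jobs from the previous set not yet finished
    stop_requested:     hypothetical manual stop flag
    inputs_waiting:     hypothetical count of input files ready for processing
    """
    if jobs_still_running > 0:
        return "exit: previous set of jobs has not finished"
    if stop_requested:
        return "exit: stop requested"
    if inputs_waiting == 0:
        return "exit: no input files waiting"
    return "launch"


if __name__ == "__main__":
    print(launch_decision(3, False, 10))
    print(launch_decision(0, False, 10))
```

Because the cron entry fires at 5 and 35 minutes past every hour, an invocation that exits early simply leaves the decision to the next one 30 minutes later.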

FDT and gridFTP for outgoing nanoDST files

  1. 10,40 * * * * /var/lib/pgsql/monitoring/gridFtpNanoMonitor.csh >& /var/lib/pgsql/monitoring/gridFtpNanoMonitor.log &

    Script to check if the gridFTP transfer rate to RCF has dropped below 15 MBytes/second. This script runs on the vpac15 node and uses the postgres account.

  2. 01,31 * * * * perl -w /phenix/data11/maguire/Run7/confirmAndEraseData63.pl >& /phenix/data11/maguire/Run7/confirmAndEraseData63Perl.log &

    Script to check if all the nanoDST files arrive correctly on the data63 disk at RCF. The script runs on the rftpexp01 node at RCF using the maguire account since it uses gridFTP to send a file to the firebird node. The script needs to be made more general so that it can be told which data area at RCF should be checked.
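
The 15 MBytes/second threshold used by the gridFTP monitor translates into a simple rate comparison. How the real script measures the rate is not stated here. The sketch below assumes two byte counts sampled a known number of seconds apart, and it assumes decimal megabytes (1 MByte = 1,000,000 bytes). The function names are hypothetical.

```python
THRESHOLD_MB_PER_S = 15.0   # threshold quoted in the description above
BYTES_PER_MB = 1_000_000    # assumption: decimal megabytes


def transfer_rate_mb_per_s(bytes_start, bytes_end, seconds):
    """Average transfer rate in MBytes/second over the sampling interval."""
    if seconds <= 0:
        raise ValueError("sampling interval must be positive")
    return (bytes_end - bytes_start) / BYTES_PER_MB / seconds


def rate_is_low(bytes_start, bytes_end, seconds, threshold=THRESHOLD_MB_PER_S):
    """True if the average rate has dropped below the threshold."""
    return transfer_rate_mb_per_s(bytes_start, bytes_end, seconds) < threshold


if __name__ == "__main__":
    interval = 30 * 60  # the cron entry runs every 30 minutes
    print(rate_is_low(0, 20_000_000_000, interval))  # 20 GB in 30 min
    print(rate_is_low(0, 30_000_000_000, interval))  # 30 GB in 30 min
```

With these assumptions, sustaining 15 MBytes/second over one 30-minute cron interval corresponds to 27 GB transferred, so 20 GB counts as low and 30 GB does not.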