Documentation for Run7 Lambda-HYDJET++ Simulations

Contact Person: Charles F. Maguire
WWW location http://www.hep.vanderbilt.edu/~maguirc/Run7/run7LambdaHYDJET++Simulations.html
Creation Date: July 30, 2009
Last Revision: August 1, 2009. Note that all the input event file segments have been produced for PISA hits file production.

Introduction

Dillon Roach in our group is doing the analysis of Lambda and anti-Lambda production in Run7. This analysis makes use of the new capabilities of the TOF-West detector to extend the reach of the analysis in pair transverse momentum. A Lambda candidate is identified by a proton detected in the TOF-West, and that proton is combined with any negative particle to calculate an invariant pair mass. While one could restrict the negative particles to identified pions, it is believed that this added mass identification would not substantially reduce the combinatoric background, which is dominated by actual pions.

In the course of his analysis Dillon uncovered some anomalies in the reconstructed pair mass spectrum which appeared to arise from correlated particles. An example of such an anomaly is shown in the mass spectrum below, which is restricted to the 7-9 GeV/c transverse momentum bin.



Anomaly just below 1.1 GeV mass in the reconstructed Lambda pair mass spectrum for the 7-9 GeV/c transverse momentum bin. The true Lambda mass region is indicated by the dotted line in the right-side, background-subtracted histogram.

The cause of this anomaly could be some real particle decay masquerading as a proton-pion combination. Or it could be some systematic mistake in the analysis procedure for the identification of the particles. It is hoped that simulating global events, such as those produced by the HYDJET++ generator, will reveal the explanation for this anomaly.

Event sources

The HYDJET++ generator is used as the source of global events in this simulation. The output from this code has been tested against several types of measurements published from the RHIC data. In particular, the PHOBOS dN/dEta charged particle multiplicity dependence as a function of multiplicity has been reproduced. Similarly, the multiplicities and transverse momentum spectra of various identified particles are also predicted correctly. The code is designed to work at the LHC collision energies, making it a point of comparison between RHIC and LHC experiments.

For purposes of this simulation project, 100K HYDJET++ events of 0-10% centrality will be used. The HYDJET events, which are initially produced in ROOT files, will be "filtered" with an eta cut of +/- 0.7 in order not to waste computing time on regions from which particles are unlikely to be tracked into the TOF-West. The HYDJET events are generated with weak decays turned off, such that long-lived particles like the Lambda can be properly decayed in the magnetic field of the PHENIX detector by the PISA simulation program. The filtering is done by the ROOT macro hydjetToOscarConvert.C, which uses a shared library libcglTrkPairs.so built in the PHENIX software framework. In order to access the libcglTrkPairs.so library, the $LD_LIBRARY_PATH environment variable must include the subdirectory /phenix/gh/maguireRoachLambdaRun7/install/lib at RACF.
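The eta cut described above can be sketched as follows. This is a minimal illustration of the pseudorapidity selection, not the actual hydjetToOscarConvert.C code, which operates on the HYDJET++ ROOT tree; the Particle struct and function names here are illustrative.

```cpp
#include <cmath>
#include <vector>

// Illustrative particle record; the real filter reads the HYDJET++ ROOT tree.
struct Particle { double px, py, pz; };

// Pseudorapidity: eta = 0.5 * ln((p + pz) / (p - pz))
double pseudorapidity(const Particle& p) {
    double mom = std::sqrt(p.px*p.px + p.py*p.py + p.pz*p.pz);
    return 0.5 * std::log((mom + p.pz) / (mom - p.pz));
}

// Keep only particles within |eta| < etaMax (0.7 for this project)
std::vector<Particle> filterEta(const std::vector<Particle>& in,
                                double etaMax = 0.7) {
    std::vector<Particle> out;
    for (const auto& p : in)
        if (std::fabs(pseudorapidity(p)) < etaMax) out.push_back(p);
    return out;
}
```

A transverse particle (pz = 0) has eta = 0 and survives the cut, while a strongly forward particle does not.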

The output of the filtering program is a standard OSCAR ROOT file used in PHENIX. The filtering code also does some conversions of PDG particle numbers to the older GEANT3 particle numbers which PISA recognizes. Finally, the filtering program will embed a separate stream of Lambda particles into the HYDJET events in order to ensure that there are a sufficient number of Lambda on which to check the simulation analysis cuts. The separate Lambda events, in the form of their own OSCAR ROOT files, are given a uniform transverse momentum distribution from 0 to 10 GeV/c, and are focused into the acceptance of the TOF-West detector. The Lambda events are generated by the PHENIX EXODUS single particle generator.
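The PDG-to-GEANT3 particle number conversion mentioned above can be illustrated with a small lookup table. These code pairs are the standard PDG and GEANT3 assignments for a few relevant species; the actual conversion in hydjetToOscarConvert.C presumably covers more particles.

```cpp
#include <map>

// Representative PDG -> GEANT3 particle code pairs (standard assignments).
// Returns 0 for a code not in this illustrative table.
int pdgToGeant3(int pdg) {
    static const std::map<int, int> table = {
        {  211,  8},  // pi+
        { -211,  9},  // pi-
        {  321, 11},  // K+
        { -321, 12},  // K-
        { 2112, 13},  // neutron
        { 2212, 14},  // proton
        {-2212, 15},  // antiproton
        { 3122, 18},  // Lambda
        {-3122, 26}   // anti-Lambda
    };
    auto it = table.find(pdg);
    return (it != table.end()) ? it->second : 0;
}
```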

Event file segmentation

The original HYDJET++ files were generated by Dillon in ten 10K groups, with names like HYDJETnew01.root to HYDJETnew10.root. The "new" refers to the fact that the Vanderbilt group discovered a "repeated particles" bug in the HYDJET++ 2.0 version, and that bug was subsequently fixed by the authors in a private communication to us. As it turned out, 10K HYDJET events in 0-10% centrality occupy more than the 1.9 GByte upper limit for a ROOT file. So after about 7000 events, ROOT closed the originally named file and opened an overflow file, with names like HYDJETnew01_1.root to HYDJETnew10_1.root.

For purposes of PISA simulation, the input event files need to be much smaller. This project uses HYDJET files in 250-event segments, meaning that there will be 400 such segments to make up the 100K events. The process of splitting an originally named HYDJET++ file into 250-event segments is accomplished with the ROOT hydjetFileSegmentProducer.C macro. This is a stand-alone ROOT macro which does not call any shared object library. The segmented files carry names such as Segment14MainFileHYDJETnew07.root or Segment9OverFlowHYDJETnew03_1.root. In either case, the actual parent original file is indicated in the name of the segment file. Because the break at about 7000 events resulted in an uneven multiple of 250 events, this segmentation procedure in its simplest form yielded only 39 segments per original set of 10K HYDJET events, or 97500 events total. Hence, an eleventh file of 2500 HYDJET events will be produced (HYDJETnew11.root), and then segmented into 250-event groups like the others. All told, there are 400 HYDJET file segments of 250 events each.
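The segment bookkeeping above can be checked with a little arithmetic. The break point of 7100 events used here is only illustrative (the text says "about 7000"); any break that is not a multiple of 250 leaves a partial remainder in both the main and overflow files, giving 39 rather than 40 segments per 10K set.

```cpp
// Number of whole 250-event segments in a file; integer division
// drops the partial remainder, exactly as the simplest splitting does.
int segments(int nEvents, int segSize = 250) { return nEvents / segSize; }

int totalSegments() {
    const int breakPoint = 7100;                 // illustrative main-file size
    const int overflow   = 10000 - breakPoint;   // overflow-file size
    int perSet   = segments(breakPoint) + segments(overflow);  // 28 + 11 = 39
    int tenSets  = 10 * perSet;                  // 390 segments = 97500 events
    int eleventh = segments(2500);               // 10 segments, HYDJETnew11.root
    return tenSets + eleventh;                   // 400 segments in all
}
```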

Event file embedding

It has been decided to embed 2 Lambda events into each central HYDJET event in order to ensure enough statistics at high pair momentum. A file of 200K Lambda events was initially generated, with a uniform rapidity distribution of +/-0.4 and a uniform azimuthal distribution from -15 to +35 degrees to provide good efficiency into the TOF-West acceptance. This large file was likewise divided into 400 file segments of 250 events each, with 2 Lambda per event, for subsequent embedding into the HYDJET events. This segmentation is accomplished by the lambdaFileSegmentProducer.C stand-alone ROOT macro. Although this macro and the hydjetFileSegmentProducer.C macro do qualitatively the same thing, there is one technical difference: the hydjetFileSegmentProducer.C macro reads and writes the ROOT file format of the HYDJET++ program, while the lambdaFileSegmentProducer.C macro reads and writes PHENIX's OSCAR ROOT file format.
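The single-Lambda kinematic ranges quoted above can be sketched as a uniform throw. The real events come from the PHENIX EXODUS generator, not from this code; the struct and function names are illustrative.

```cpp
#include <random>

// Illustrative single-Lambda kinematics; the real events come from EXODUS.
struct LambdaKinematics { double pt, y, phiDeg; };

// Throw one Lambda uniformly in the ranges used by this project:
// pT in [0, 10] GeV/c, rapidity in [-0.4, 0.4], phi in [-15, 35] degrees
// (the azimuthal window aimed at the TOF-West acceptance).
LambdaKinematics throwLambda(std::mt19937& rng) {
    std::uniform_real_distribution<double> ptDist(0.0, 10.0);
    std::uniform_real_distribution<double> yDist(-0.4, 0.4);
    std::uniform_real_distribution<double> phiDist(-15.0, 35.0);
    return { ptDist(rng), yDist(rng), phiDist(rng) };
}
```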

The OSCAR ROOT files which contain the 250 filtered HYDJET events along with embedded Lambda particles (at 2 Lambda per HYDJET event) have names such as FilteredWithEmbeddedLambdaSegment3OverFlowHYDJETnew01_1.root. Again, the long name is intended to be fully descriptive of the origin and content of the file. As you can deduce, the hydjetToOscarConvert.C ROOT macro, which uses the shared library libcglTrkPairs.so, reads one input file stream in HYDJET++ format and possibly a second input file stream in OSCAR format, and then writes the output in OSCAR format.

NOTE: As of August 1, 2009, all the input event file segments needed for PISA hits file production have been produced at RACF in the locations described below. The details on how they can be produced are included here for information purposes in case a repeat or altered event production becomes necessary.

Simulation project files locations at RACF

This simulation project is anticipated to produce as much as 500 GBytes of output files. Space for this project has been reserved on the /phenix/gh disks at RACF. A top directory /phenix/gh/maguireRoachLambdaRun7/ has been created with protection code drwxrwxr-x. This protection code enables either me or Dillon to delete files in the area. All the subdirectories and files in those subdirectories should have the same protection code.

The following subdirectories are in the top directory /phenix/gh/maguireRoachLambdaRun7/

drwxrwxr-x   8 maguire rhphenix  2048 Jul 30 17:09 .
drwxrwxrwx  18 root    root      2048 Jul 27 11:59 ..
drwxrwxr-x   2 maguire rhphenix 10240 Jul 30 10:18 analysis
drwxrwxr-x   2 maguire rhphenix  8192 Jul 30 01:38 condorScripts
drwxrwxr-x   5 maguire rhphenix  2048 Jul 27 15:11 exodusEvents
drwxrwxr-x   5 maguire rhphenix  2048 Jun 14 11:32 install
drwxrwxr-x   5 maguire rhphenix  2048 Jul 28 17:10 pisaEvents
drwxrwxr-x   5 maguire rhphenix  2048 Jul 29 13:54 pisaToDSTEvents

The three subdirectories exodusEvents, pisaEvents, and pisaToDSTEvents follow the standard simulation project decomposition into input event files, PISA hits files, and simDST reconstruction files. The analysis directory contains preliminary analysis output of the simDST reconstruction files, to be explained below. The condorScripts directory, as the name implies, contains the production files needed to launch Condor jobs for this project. The install directory contains a frozen copy of the PISA libraries to ensure stability and reproducibility of the project. The simulation reconstruction for the simDST files is being done with the ana.148 frozen library set.

exodusEvents subdirectory

The exodusEvents subdirectory itself contains three subdirectories:

drwxrwxr-x  2 maguire rhphenix 149504 Jul 29 20:45 filteredEvents
drwxrwxr-x  2 maguire rhphenix  75776 Jul 28 17:33 hydjetOriginalEvents
drwxrwxr-x  2 maguire rhphenix  77824 Jul 27 12:57 lambdaOriginalEvents
According to the description above, these three subdirectories contain copies of the original HYDJET files both in their large (~7000 and ~3000 events) and segmented (250 event) formats, and correspondingly for the Lambda events. The filteredEvents subdirectory contains the FilteredWithEmbeddedLambdaSegment... files which serve as the actual event input files in OSCAR ROOT format for the PISA jobs.

pisaEvents subdirectory

Similarly, the pisaEvents subdirectory contains three subdirectories:

drwxrwxr-x  2 maguire rhphenix 2048 Jul 29 20:43 pisaInputFiles
drwxrwxr-x  2 maguire rhphenix 2048 Jul 29 17:41 pisaInputFilesTest
drwxrwxr-x  2 maguire rhphenix 4096 Jul 30 00:48 results
The pisaInputFiles directory contains a standard set of PISA input files, including a version of the phnx.par file which has the HBD in the East Arm only. We learned on July 30 that there is a defect in the HBD simulation detector model: the HBD support structures are not correctly included. These support structures are believed to be a source of significant backgrounds in the Central Arm, even in the West Arm where the HBD itself is not present. The correction of this defect is not anticipated to be available before the end of August 2009. The files in pisaInputFiles are soft-linked as copies for use by the production scripts.

The pisaInputFilesTest directory contains another copy of the files in the pisaInputFiles area, for PISA testing purposes. This test area is not used by the production scripts.

The results directory serves as an output files destination area used by the production scripts. This particular area is the destination for the PISA hits files, and for the PISA job log files.

pisaToDSTEvents subdirectory

Likewise, the pisaToDSTEvents subdirectory contains three subdirectories:

drwxr-xr-x  2 maguire rhphenix 2048 Jul 29 13:58 afsFiles
drwxrwxr-x  2 maguire rhphenix 2048 Jul 29 13:44 pisaToDSTInputFiles
drwxr-xr-x  2 maguire rhphenix 4096 Jul 30 01:35 results
The afsFiles area contains copies of files that reside on AFS at BNL and are used by the event reconstruction modules. Normally, reconstruction jobs run at RCF can access these AFS files directly, but at the Vanderbilt ACCRE farm AFS is not supported on the worker nodes. The simulation reconstruction scripts will use the direct AFS file versions; the copies in the afsFiles area are to be used for test comparison purposes.

The pisaToDSTInputFiles area serves the same purpose in reconstruction as the pisaInputFiles area does in PISA hits files generation. There are three files needed to run a pisaToDST reconstruction, in addition to the AFS files.

The results directory here serves as an output files destination area used by the production scripts. This particular area is the destination for the simDST.root files which are the output of the reconstruction process, and for the pisaToDST job log files.

condorScripts subdirectory

The simulation jobs are launched into the Condor batch queues by four files contained in the condorScripts subdirectory. In turn, these four files are actually symbolic links to other locations, three of them in CVS.

lrwxrwxrwx  1 maguire rhphenix   93 Jul 29 17:23 condorMultipleHYDJETLambdaSimulation -> /phenix/u/maguire/cvs/offline/analysis/simDSTCheck/macro/condorMultipleHYDJETLambdaSimulation
lrwxrwxrwx  1 maguire rhphenix   88 Jul 29 20:45 hydjetLambdaSimulation.txt -> /phenix/gh/maguireRoachLambdaRun7/exodusEvents/filteredEvents/hydjetLambdaSimulation.txt
lrwxrwxrwx  1 maguire rhphenix   88 Jul 28 17:48 simMultipleHYDJETLambda_exe.csh -> /phenix/u/maguire/cvs/offline/analysis/simDSTCheck/macro/simMultipleHYDJETLambda_exe.csh
lrwxrwxrwx  1 maguire rhphenix   94 Jul 28 20:52 submitCondorHYDJETLambdaSimulation.pl -> /phenix/u/maguire/cvs/offline/analysis/simDSTCheck/macro/submitCondorHYDJETLambdaSimulation.pl
drwxr-xr-x  2 maguire rhphenix 8192 Jul 30 17:45 testOutput

The PERL script which does the actual launching of the Condor jobs is submitCondorHYDJETLambdaSimulation.pl. The comment lines in the beginning of this script describe its operations. In particular, the script must be started in the /phenix/gh/maguireRoachLambdaRun7/condorScripts directory. In that directory there should be a file named hydjetLambdaSimulation.txt which contains the names of all the input event files to be processed. From the above description there could be as many as 400 such input event files. The PERL script has an internal check to stop after 400 jobs are submitted.

As shown above, the hydjetLambdaSimulation.txt file list used by the PERL script will typically be a symbolic link to a file in the ../exodusEvents/filteredEvents directory. This file list can be generated in that directory with a simple shell ls command such as

ls -c1 Filtered*.root > hydjetLambdaSimulation.txt

The -1 option ensures that there will be one file name per line in the redirected output file.

The PERL script has two modes of operation. The default mode is that the jobs are submitted to the Condor batch processing system. That batch processing system will look first at the condorMultipleHYDJETLambdaSimulation Condor control file. This control file in turn references the simMultipleHYDJETLambda_exe.csh C-shell script which does the low-level job processing commands. These processing commands encompass three stages of operations, and all the operation stages work in a dedicated /tmp/maguireRoach# area on the batch node. The "#" corresponds to the Condor job sequence number, such that two or more jobs running on the same batch node will not interfere with one another. The three stages in sequence are the PISA hits file production, the pisaToDST reconstruction into a simDST file, and the analysis of the simDST file into single track and pair NTUPLE files.

The second mode of operation for the PERL script is to submit a test job directly on the interactive node. This mode of operation tests the simMultipleHYDJETLambda_exe.csh low-level job script, and in particular the file manipulations on the local /tmp area. The two modes of operation are controlled by the hard-coded $testCshellScript local variable in the PERL script. In this mode of operation, one wants to be testing only a single job, not 400.

analysis subdirectory

The analysis subdirectory at present has the following contents:

-rw-rw-r--  1 maguire rhphenix  5278 Jul 29 10:31 massCutsSimTFW.root
-rw-r--r--  1 maguire rhphenix  1092 Jul 30 10:15 ntupleChainPairTracks.C
-rw-r--r--  1 maguire rhphenix  1113 Jul 30 10:15 ntupleChainSingleTracks.C
-rw-r--r--  1 maguire rhphenix   805 Jul 30 10:14 pairsFileList.txt
drwxr-xr-x  2 maguire rhphenix 10240 Jul 30 17:51 results
lrwxrwxrwx  1 maguire rhphenix    79 Jul 30 10:09 simDSTAnalysisChain.pl -> /phenix/u/maguire/cvs/offline/analysis/simDSTCheck/macro/simDSTAnalysisChain.pl
-rw-r--r--  1 maguire rhphenix   827 Jul 30 10:13 singlesFileList.txt
The analysis procedure currently consists of two steps. The first step processes a simDST.root file with the libcglTrkPairs.so library using the analyze_simDST.C ROOT macro. This library is also located in the /phenix/gh/maguireRoachLambdaRun7/install/lib directory at RACF. The analyze_simDST.C macro will produce an output NTUPLE file simDSTCheck.root which consists of track candidates with their associated subdetector information. A second, stand-alone ROOT macro, analyze_simDSTPairs.C, will process the single track simDSTCheck.root file into a pairs output file simDSTCheckPairs.root, which in principle can be used for any pairs analysis. The analyze_simDSTPairs.C macro has been upgraded to include options specifically designed for the Lambda analysis, such as the condition that protons (or anti-protons) are detected in the TOF-West; these are combined with any opposite sign particle in the West to generate an invariant mass assuming the opposite sign particle is a pion.
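The invariant mass calculation under the proton plus pion mass hypothesis can be written out explicitly. This is a generic sketch, not the analyze_simDSTPairs.C code itself; the function name is illustrative, and the masses are the standard PDG values in GeV/c^2.

```cpp
#include <cmath>

// Invariant pair mass under the proton + pion mass hypothesis.
// Momentum components in GeV/c; default masses are the PDG proton
// and charged pion masses in GeV/c^2.
double pairMass(const double p1[3], const double p2[3],
                double m1 = 0.938272, double m2 = 0.139570) {
    double e1 = std::sqrt(m1*m1 + p1[0]*p1[0] + p1[1]*p1[1] + p1[2]*p1[2]);
    double e2 = std::sqrt(m2*m2 + p2[0]*p2[0] + p2[1]*p2[1] + p2[2]*p2[2]);
    double px = p1[0] + p2[0], py = p1[1] + p2[1], pz = p1[2] + p2[2];
    double mSq = (e1 + e2)*(e1 + e2) - (px*px + py*py + pz*pz);
    return std::sqrt(mSq > 0.0 ? mSq : 0.0);  // guard against rounding
}
```

For two particles at rest the pair mass reduces to the sum of the two masses, a quick sanity check on the formula.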

Since the analysis procedure is also subject to improvement, the above contents are not all considered permanent. At least two of the contents are more stable than the others. First, the massCutsSimTFW.root file is used by the pairs analysis macro to define a first, graphical cut on a proton track candidate in the TOF-West detector. This is intended to be the first step in improving on a simple, fixed-window mass cut such as 0.6 to 1.2 units independent of proton momentum. Second, the simDSTAnalysisChain.pl PERL script is a utility script to chain together the analysis output singles and pairs ROOT files which are copied into the results subdirectory by the production scripts. This script uses the file lists contained in singlesFileList.txt and pairsFileList.txt to produce the stand-alone ROOT macros ntupleChainSingleTracks.C and ntupleChainPairTracks.C which do the actual chaining in ROOT.

Simulation production operation at RACF

The production operation should be simple in principle. One first generates a set of input event files using the hydjetToOscarConvert.C ROOT macro. A copy of this macro is linked in the /phenix/gh/maguireRoachLambdaRun7/exodusEvents/filteredEvents directory. By default this macro requires a file list hydjetInputFileList.txt for the set of HYDJET file segments, and a file list lambdaInputFileList.txt if Lambda events are being merged. At the beginning of each file list there should be a number indicating how many files in the list should be processed; the number should match in both file lists. This extra line of input is used in case one wants to generate a limited set of files. Once all 400 files are generated, and this has to be done only once, one can make the hydjetLambdaSimulation.txt file which is used by the Condor submission PERL script.
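The file list format described above, a leading count line followed by one file name per line, can be read with a few lines of code. This is a sketch of the format only, not the macro's actual input handling.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read a file list in the format described above: the first line holds
// the number of files to process, followed by one file name per line.
// Names beyond the requested count are ignored, allowing a limited run.
std::vector<std::string> readFileList(std::istream& in) {
    int nToProcess = 0;
    in >> nToProcess;
    std::vector<std::string> files;
    std::string name;
    while (static_cast<int>(files.size()) < nToProcess && (in >> name))
        files.push_back(name);
    return files;
}
```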

Condor links

The Condor batch job submission system is already adequately described at the PHENIX wiki site https://www.phenix.bnl.gov/WWW/offline/wikioffline/index.php/Condor , so there is no need to have an extended documentation of Condor details at this site. After the PERL Condor submission script is run, a command such as

condor_q | grep username

will show all the Condor jobs under the account username. The jobs will show up initially in the input wait state, and then in the run state when execution starts on a particular batch node. The Condor jobs are set up to provide three job output files, with the final extensions .error, .log, and .out. The prefixes of these three sets of files will have a name like FilteredWithEmbeddedLambdaSegment8OverFlowHYDJETnew01_1.root+-, according to the naming convention described above. An additional detail on the naming convention is that the files carry a +- tag, meaning that the +- field was used in the PISA jobs. It is assumed that a second generation of simulations will use the -+ field configuration in PISA to correspond to the second half of Run7 data acquisition. The +- field map tag is hard-coded into the current PERL Condor submission script.

Analysis testing

As of this date, there has been a submission of 11 Condor test jobs, meaning that 11 sets of 250-event input files have been processed through the three stages, for a total of 2750 events. The 11 sets of single track and pair track NTUPLE files were chained as described above, in order to produce an aggregated pair tracks NTUPLE whose results could be plotted. One such plot, as produced with the stand-alone plotPairMassAndPairPt.C ROOT macro, is shown below. So far the analysis of the simulation output does not indicate anything seriously wrong, at least at the PISA hits and reconstruction stages, which consume the most time.


The explanations for the histograms contained in the above figure are contained in the e-mail at the group's elog server.

Typical time and size statistics from this test run are as follows: