Offline SNR Optimizer/Coinc Generation


Overview

The SNR Optimizer in offline mode can be used to produce coinc files. This is useful either for following up an online event, i.e. creating a coinc file to upload with different settings (e.g. adding a detector), or for creating coinc files that contain SNR timeseries and PSD information for an offline analysis. The SNR Optimizer requires an input coinc file (i.e. the uploaded online coinc file, or a coinc file without SNR timeseries/PSD extracted from the offline database) and the template bank used by the original analysis (the online or offline GstLAL analysis) in order to produce an output coinc.

Setup

If you’re using an online container, it will already have the manifold repo installed. Otherwise, clone the manifold repo, cd into it, run singularity in writable mode (singularity run --writable /path/to/container), and install manifold. This can be done using

python3 setup.py install

If you want to replace a previous manifold install, it’s best to erase the previous install and then install the new one, by doing

rm -rf /usr/local/lib/python3.6/site-packages/manifold* && python3 setup.py install

Running the SNR Optimizer in offline mode

The command you want to use is

/usr/local/bin/manifold_cbc_bank_snr_optimizer_online <channels> --data-source frames --psd-fft-length 4 --seed-bank <bank> --min-mismatch 0.001 --sample-rate 2048 --max-duration 256 --algorithm rectangle <input_coinc> --output-xml-file <output_coinc> --gps-start <start> --gps-end <end> --verbose --frame-segments-file <segments_file> --frame-segments-name datasegments --ht-gate-threshold 15 --original-template-only --frame-cache <frame_cache>

where

  • <channels> is the list of channels you want to analyze. It will look something like --channel-name H1=GDS-CALIB_STRAIN_CLEAN --channel-name L1=GDS-CALIB_STRAIN_CLEAN
  • <bank> is the location of the GstLAL template bank used by the original analysis (must be a manifold bank)
  • <input_coinc> is the path to the input coinc file
  • <output_coinc> is the file in which you want the output coinc to be written
  • <start> is the gps start time of the data you want the optimizer to analyze. It can generally be set to 300 seconds before the time of the event you are analyzing.
  • <end> is the gps end time of the data you want the optimizer to analyze. It can generally be set to 60 seconds after the time of the event you are analyzing.
  • <segments_file> is the relevant segments file for that time period. It is optional but recommended, since it helps prevent triggering on glitches. Use it in conjunction with --frame-segments-name datasegments
  • --original-template-only is the flag that prevents optimization over templates. Remove it if you want the optimizer to search over templates for a higher SNR than the input coinc.
  • --frame-cache <frame_cache> can be changed to a datafind server and frame type if you don’t want to create a frame cache.
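To make the placeholders above concrete, here is a sketch of how the full command line could be assembled in Python. Every path, channel name, and GPS time below is a hypothetical placeholder, not a value from a real analysis; it also shows how <channels> expands to one --channel-name flag per detector and how the start/end times relate to the event time:

```python
import subprocess

# Hypothetical inputs -- substitute your own paths and event time.
event_time = 1368150000          # GPS time of the event being followed up
channels = {"H1": "GDS-CALIB_STRAIN_CLEAN", "L1": "GDS-CALIB_STRAIN_CLEAN"}

cmd = ["/usr/local/bin/manifold_cbc_bank_snr_optimizer_online"]
# <channels> expands to one --channel-name flag per detector
for ifo, chan in channels.items():
    cmd += ["--channel-name", f"{ifo}={chan}"]
cmd += [
    "--data-source", "frames",
    "--psd-fft-length", "4",
    "--seed-bank", "bank.xml.gz",               # <bank>: manifold template bank
    "--min-mismatch", "0.001",
    "--sample-rate", "2048",
    "--max-duration", "256",
    "--algorithm", "rectangle",
    "input_coinc.xml",                          # <input_coinc> (positional)
    "--output-xml-file", "output_coinc.xml",    # <output_coinc>
    "--gps-start", str(event_time - 300),       # ~300 s before the event
    "--gps-end", str(event_time + 60),          # ~60 s after the event
    "--verbose",
    "--frame-segments-file", "segments.xml.gz",
    "--frame-segments-name", "datasegments",
    "--ht-gate-threshold", "15",
    "--original-template-only",
    "--frame-cache", "frames.cache",
]
# subprocess.run(cmd, check=True)  # uncomment to actually run the optimizer
```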

Additional settings you can use if you want and know what you’re doing:

  • --reference-psd <filename> --track-psd: This reads in a PSD from the file provided, uses it as the initial PSD estimate, and then updates the PSD as more data is analyzed. The input coinc, if taken from an online upload, contains a PSD that can be used for this purpose.
  • --template-psd <filename>: This disables dynamic template whitening and instead uses the PSD set here to whiten templates. It is used to mimic GstLAL. You probably never want to use this setting.

Running multiple jobs in a dag

To run multiple jobs in a dag, you’ll need to create a directory containing all the input coinc files (make sure that directory doesn’t contain any other files). Grab the write_dag.py script from here: https://git.ligo.org/chad-hanna/manifold/-/tree/main/scripts/snr_optimizer?ref_type=heads. You’ll need to modify three variables at the top of the script:

  • path: The path to the directory containing the input coincs
  • output_dir: The directory to which the output coincs will be written
  • channel_dict: The dictionary of channels you want the optimizer jobs to analyze. This can be modified to also incorporate other ifo-specific variables, such as the state vector channel
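As an illustration, the top of write_dag.py might be edited along these lines. The paths and channel names below are placeholders, and the exact variable layout in the script may differ:

```python
# Edit these three variables at the top of write_dag.py.
# All values below are illustrative placeholders.
path = "/home/albert.einstein/input_coincs"   # directory containing ONLY input coinc files
output_dir = "/home/albert.einstein/output"   # where the output coincs will be written
channel_dict = {
    # per-ifo strain channels; other ifo-specific options (e.g. the
    # state vector channel) can be incorporated here as well
    "H1": "GDS-CALIB_STRAIN_CLEAN",
    "L1": "GDS-CALIB_STRAIN_CLEAN",
    "V1": "Hrec_hoft_16384Hz",
}
```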

After you run the script, grab the snr_optimizer.sub file from the same location as before. Make sure to modify the variables in the sub file to match the exact command you want to run. After that, create a logs directory and the output_dir. You can then submit the dag.
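Those last steps can be sketched as follows. The dag filename snr_optimizer.dag and the directory names are assumptions here; substitute whatever write_dag.py actually produced:

```python
import os
import subprocess

def prepare_and_submit(dag_path, dirs=("logs", "output"), submit=False):
    """Create the directories condor expects, then build (and optionally
    run) the condor_submit_dag command."""
    for d in dirs:
        os.makedirs(d, exist_ok=True)
    cmd = ["condor_submit_dag", dag_path]
    if submit:
        subprocess.run(cmd, check=True)
    return cmd

# prepare_and_submit("snr_optimizer.dag", dirs=("logs", "my_output_dir"), submit=True)
```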

Known problems

  • The SNR Optimizer can sometimes trigger on glitches that are nearby in time to the input coinc. Using an --ht-gate-threshold of 15 and providing a segments file (with vetoes subtracted from it) helps. Nevertheless, be sure to compare the output SNR to the input SNR and make sure the output makes sense. Additional checks, such as comparing the end times and chisqs of the input and output coincs, can also be done.
  • Since the offline SNR Optimizer dag does not use condor file transfer, if a job fails, the logs are sometimes written to the file where the output coinc was supposed to be written. Make sure the outputs are coinc files!
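A quick check for the second problem is to verify that every output file begins with an XML declaration rather than condor log text. A minimal sketch, assuming uncompressed .xml outputs (gzipped .xml.gz files would instead start with the gzip magic bytes) and a placeholder directory name:

```python
import glob
import os

def looks_like_coinc(first_bytes):
    """Return True if the file's opening bytes look like an XML document
    rather than stray condor log output."""
    return first_bytes.lstrip().startswith(b"<?xml")

def find_bad_outputs(output_dir):
    """Return the files in output_dir that do not look like coinc XML files."""
    bad = []
    for fname in sorted(glob.glob(os.path.join(output_dir, "*"))):
        with open(fname, "rb") as f:
            if not looks_like_coinc(f.read(64)):
                bad.append(fname)
    return bad

# print(find_bad_outputs("my_output_dir"))  # should print an empty list
```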

Running Bayestar and generating fits files

To run Bayestar, grab the write_skymap_dag.py, calculate_fits_and_searched_area.py, calculate_fits_and_searched_area_wrapper.sh, and skymap.sub files from https://git.ligo.org/chad-hanna/manifold/-/tree/main/scripts/snr_optimizer?ref_type=heads. In the write_skymap_dag.py, modify the paths variable at the top to point to the directory(ies) contianing the optimizer’s output coinc files. After executing it, it will create a skymap.dag file. The skymap.sub file points to the calculate_fits_and_searched_area_wrapper.sh executable. This is a wrapper script that activates a conda environment, and runs the calculate_fits_and_searched_area.py executable with the correct arguments. Make sure the conda environment is available on the cluster you’re running on. Create the logs and fits_files directories, and then launch the dag.