Online Operations Manual
Table of Contents
- Introduction
- Analysis diagram
- Programs used in this analysis
- gstlal_inspiral
- gstlal_inspiral_marginalize_likelihoods_online
- gstlal_ll_dq
- gstlal_ll_inspiral_event_plotter
- gstlal_ll_inspiral_event_uploader
- gstlal_ll_inspiral_pastro_uploader
- gstlal_ll_inspiral_trigger_counter
- scald_event_collector
- scald_metric_collector
- Kafka topics
- On disk layout
- HTTP traffic
- Important References and Resources
- Software and service stack
Introduction
The low-latency GstLAL-based compact binary analysis implements a mixed time-domain, frequency-domain filtering scheme to provide extremely low-latency gravitational-wave detection. Its purpose is to discover gravitational waves from merging neutron stars and black holes within seconds of the waves arriving at Earth.
There is an initial configuration procedure that must be executed first. During this stage, the template bank is decomposed into SVD bins and initial dist stats are computed. Once set up, the analysis is designed to run continuously throughout an observing period. This page provides an ‘as built’ overview of the entire analysis. If you are simply looking to start an analysis from scratch, there are step-by-step instructions in the rest of this manual starting with Configuration.
Analysis diagram
Below is a diagram relating the various workflows (dashed line boxes) and communication layers (HTTP, Kafka, File I/O) for a functioning low-latency compact binary search. You can click on the diagram to learn more about each component.
Programs used in this analysis
FIXME
- config
- doc
- source
- Disk I/O
SVD bank files. Multiple SVD bank files can be given per job, in order to analyze data from multiple IFOs. These should each correspond to the same SVD bin.
A reference PSD file. We use a file checked into the repo as a starting point, but always use the `--track-psd` option so that the PSD is periodically updated to reflect the current state of the detector noise. What reference PSD do we use? What time range does it correspond to? Is it updated throughout the run at all?
Ranking statistic input file. This contains likelihood-ratio ranking statistic data according to our signal and noise models, created by the `create_prior_dist_stats` jobs in the set-up stage. These files are kept in the `dist_stats` directory and follow the naming convention `{IFOs}-{SVD_GROUP_NUM}_GSTLAL_DIST_STATS-0-0.xml.gz` (the file-naming conventions in this list are collected in the sketch below). For injection jobs, this input file is taken from the one used by the non-injection job of the same SVD bin, so that the noise model is consistent between the non-injection/injection twin jobs. The rankingstat data is not updated or overwritten in this case (technically, counts are added internally, but they are overwritten by the next snapshot of the rankingstat file, which is what gets used for ranking-statistic evaluation).
Ranking stat PDF file. This is used to compute the FAP and FAR of triggers. This file is made by the `gstlal_inspiral_marginalize_likelihoods_online` job. It is kept in the `dist_stat_pdfs` directory and is named `{IFOs}-GSTLAL_DIST_STAT_PDFS-0-0.xml.gz`.
Time slide xml file. This is made in the Makefile using `lalapps_gen_timeslide`.
Filename to write ranking statistic data out to. When the output filename is set to be the same as the input filename (which is usually the case for the online analysis), this overwrites/updates the same file that was given as the ranking statistic input (i.e. the one made by `create_prior_dist_stats` in the `dist_stats` directory). For injection jobs, the rankingstat data is not written out anywhere because the collected background is meaningless.
Zero-lag ranking stat PDF. This is a histogram of the likelihood ratios of zero-lag triggers collected during filtering. It gets written at start-up and updated as the job runs. These go in the `zerolag_dist_stat_pdfs` directory and are named `{IFOs}-{SVD_GROUP_NUM}_GSTLAL_ZEROLAG_DIST_STAT_PDFS-0-0.xml.gz`.
Trigger files get written out to directories named `gracedb_uploads/{GPS_TIME_STAMP}/`. FIXME LINK TO TABLE DEFINITIONS
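To make the naming conventions above concrete, here is an illustrative helper (not part of the pipeline; the IFO string and SVD bin number are just example values) that maps one SVD bin to the paths described in this list, relative to the run directory:

```python
def per_bin_paths(ifos="H1L1V1", svd_bin="0918"):
    """Illustrative only: collect the file-naming conventions listed above."""
    return {
        # ranking statistic input/output (overwritten in place by the job)
        "dist stats": f"dist_stats/{ifos}-{svd_bin}_GSTLAL_DIST_STATS-0-0.xml.gz",
        # marginalized ranking stat PDF, shared by all bins
        "dist stat PDF": f"dist_stat_pdfs/{ifos}-GSTLAL_DIST_STAT_PDFS-0-0.xml.gz",
        # zero-lag likelihood-ratio histogram for this bin
        "zerolag PDF": (
            f"zerolag_dist_stat_pdfs/"
            f"{ifos}-{svd_bin}_GSTLAL_ZEROLAG_DIST_STAT_PDFS-0-0.xml.gz"
        ),
        # trigger files, grouped by GPS time stamp
        "triggers": "gracedb_uploads/{GPS_TIME_STAMP}/",
    }
```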
- HTTP requests
The job advertises its URL in a registry file in the top level of the analysis run directory. Data can be requested from the job via this URL; the requests are served using bottle.
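A toy sketch of that mechanism is below. It is not the job's actual web interface: the port, the served resource name, and the job number in the registry filename are placeholders (the registry naming convention `{JOB_NUM}_noninj_registry.txt` is described under the marginalize likelihoods job).

```python
import socket
import bottle

app = bottle.Bottle()

@app.route("/ranking_data.xml")                    # resource name is illustrative
def ranking_data():
    bottle.response.content_type = "application/xml"
    with open("ranking_data.xml", "rb") as f:      # current rankingstat snapshot
        return f.read()

host, port = socket.gethostname(), 8080            # placeholder port

# Advertise the URL in a registry file at the top of the run directory so that
# other jobs (e.g. the marginalize likelihoods job) can find this server.
with open("0000_noninj_registry.txt", "w") as f:   # JOB_NUM "0000" is a placeholder
    f.write(f"http://{host}:{port}\n")

bottle.run(app, host=host, port=port)
```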
- Kafka topics
Output topics: all scientific metric topics, data quality metric topics, latency metric topics, and monitoring topics. See https://gwsci.org/ops/diagrams#kafka.
Notes:
- Data source - what are all the different options for this? FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- Channel names FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- State channel names FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- DQ channel names FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- State vector on and off bits FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- Shared memory partition FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- FAR threshold for uploads FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- Group, pipeline, search FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- Labels FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- Service URL (GraceDb, playground, test, etc.) FIXME LINK TO OPS PAGE SOURCE OF TRUTH
- config
- doc
- source
- Disk I/O
Input: text files containing the URL of a web server from which to retrieve likelihood data from a particular job. These are named `{JOB_NUM}_noninj_registry.txt` (there are also `{JOB_NUM}_inj_registry.txt` files for the injection gstlal_inspiral jobs, but these are not used by the marginalize likelihoods job) and are kept in the top level of the run directory.
Output: name of an xml file to write marginalized ranking statistic PDFs out to. This file is written to the `dist_stat_pdfs` directory with the naming convention `{IFOs}-GSTLAL_DIST_STAT_PDFS-0-0.xml.gz`. This histogram contains noise and zerolag triggers collected from all the inspiral jobs. These zerolag counts are used to apply the extinction model (simulate the clustering effect on the ranking statistic distribution based on the zerolag counts and apply that to the noise counts) when assigning FAP/FAR in the online configuration.
- HTTP requests
Registry files for all gstlal_inspiral (non-injection) jobs provide a URL to request data from. The ranking_data.xml files are requested from these URLs.
Notes:
- Ranking stats from each gstlal inspiral job are gathered via HTTP request.
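The collection step can be pictured with the following sketch (standard library only; the registry layout follows the conventions above, but the helper itself and the exact resource path are illustrative, not the program's code):

```python
import glob
from urllib.request import urlopen

def fetch_ranking_data(run_dir="."):
    """Read every non-injection registry file and fetch each job's ranking data."""
    payloads = {}
    for registry in sorted(glob.glob(f"{run_dir}/*_noninj_registry.txt")):
        with open(registry) as f:
            url = f.read().strip()                 # the registry holds the job's URL
        # The manual says the ranking_data.xml files are requested from these URLs;
        # the exact resource path is an assumption here.
        with urlopen(f"{url}/ranking_data.xml", timeout=10) as resp:
            payloads[registry] = resp.read()
    return payloads
```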
Ingest messages from the uploads and ranking stat Kafka topics and store the event info in a dictionary keyed by event GPS time and SVD bank bin. Handle the stored event messages: upload all of the auxiliary files and plots to the event on GraceDb. This includes:
- The ranking statistic data file (ranking_data.xml.gz)
- Ranking statistic plots (background (noise) PDF, injection (signal) PDF, zero lag (candidates) PDF, LR, likelihood ratio CCDF, horizon distance vs. time, rates). These are made by the functions in plots/far.py
- PSD plots. These are made by the functions in plots/psd.py
- SNR timeseries plots
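The bookkeeping described above can be sketched as follows (the topic and field names here are assumptions, not the program's actual message schema):

```python
from collections import defaultdict

events = defaultdict(dict)                # keyed by (event GPS time, SVD bank bin)

def ingest(topic, payload):
    """Store one Kafka message and act once a candidate's messages are complete."""
    key = (payload["time"], payload["bin"])
    events[key][topic] = payload
    if {"uploads", "ranking_stat"} <= set(events[key]):
        handle(key, events.pop(key))      # make plots, upload files to GraceDb

def handle(key, data):
    """Placeholder for the real work: ranking data file, FAR/PSD plots, SNR series."""
```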
Notes:
- Kafka URL
- GraceDb group, pipeline, search and service URL to use
- No file outputs, unless `output-path` is set by the user, in which case plots are saved to disk in addition to being uploaded to GraceDb.
Aggregate candidate events received from the inspiral jobs and select a favored event for each event window (composite event aggregation is also supported). Send the favored event information (time, SNR, FAR, PSD and coinc files) to a Kafka topic. Upload the event to GraceDb and send a message with the GraceDB event ID, coinc file, and event time to the Kafka uploads topic.
Notes:
- Kafka URL and topic to consume messages from. We consume messages from the events topic which are sent by the inspiral jobs.
- GraceDb group, pipeline, search and service URL to use in uploading events.
- Trials factor on the FAR. The effective FAR threshold is the FAR threshold divided by the trials factor, where the trials factor usually corresponds to the number of independent online pipelines, i.e. 5 (CWB, GstLAL, MBTA, PYCBC, SPIIR). FIXME not used.
- Upload cadence - determines how long to wait between sending multiple events for the same event window.
- No file outputs.
- Produces Kafka messages to the `favored_events` and `uploads` topics.
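As a rough illustration of the upload-and-announce step, the sketch below creates a GraceDb event from a coinc file and publishes the resulting event ID to the uploads topic. The group/pipeline/search/service-URL values and the message field names are placeholders; the real values come from the analysis configuration.

```python
import json
from ligo.gracedb.rest import GraceDb     # GraceDb REST client
from kafka import KafkaProducer           # kafka-python

def upload_favored_event(coinc_path, event_time, analysis_tag, kafka_server,
                         group="CBC", pipeline="gstlal", search="AllSky",
                         service_url="https://gracedb-playground.ligo.org/api/"):
    """Illustrative only: upload a coinc file and announce it on the uploads topic."""
    client = GraceDb(service_url)
    resp = client.createEvent(group, pipeline, coinc_path, search=search)
    graceid = resp.json()["graceid"]

    producer = KafkaProducer(
        bootstrap_servers=kafka_server,
        value_serializer=lambda d: json.dumps(d).encode(),
    )
    with open(coinc_path) as f:
        coinc = f.read()
    # Field names are assumptions; the manual only says the message carries the
    # GraceDB event ID, the coinc file, and the event time.
    producer.send(f"gstlal.inspiral_{analysis_tag}.uploads",
                  {"gid": graceid, "time": event_time, "coinc": coinc})
    producer.flush()
    return graceid
```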
- config
- doc
- source
- Disk I/O
Input file: p(astro) model file, including some pre-computed data necessary to compute p(astro) values.
Input file: marginalized ranking stat PDF - this file is read in every four hours so that the p(astro) model can be updated with the latest ranking stat signal and noise model.
Output file: p(astro) model file - written out to disk every time the ranking stat information is updated.
- Kafka topics
Input topics: gstlal.<analysis_tag>.inj_uploads OR gstlal.<analysis_tag>.uploads
For each event consumed from these topics, upload `gstlal.p_astro.json` to the event on GraceDB and apply the label `PASTRO_READY` to the event.
Notes:
- Kafka URL
- GraceDb group, pipeline, search and service URL to use
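A minimal sketch of that GraceDb interaction, using the ligo.gracedb REST client (the log message and the example probabilities are illustrative; the filename and label come from the description above):

```python
import json
from ligo.gracedb.rest import GraceDb

def upload_p_astro(graceid, p_astro,
                   service_url="https://gracedb-playground.ligo.org/api/"):
    """Attach p(astro) to an existing GraceDb event and label it (illustrative)."""
    client = GraceDb(service_url)
    # Upload the probabilities as gstlal.p_astro.json with a short log message.
    client.writeLog(graceid, "GstLAL p(astro)",
                    filename="gstlal.p_astro.json",
                    filecontents=json.dumps(p_astro).encode())
    # Apply the PASTRO_READY label so downstream consumers know the file exists.
    client.writeLabel(graceid, "PASTRO_READY")

# e.g. upload_p_astro("G123456",
#                     {"BNS": 0.9, "NSBH": 0.05, "BBH": 0.04, "Terrestrial": 0.01})
```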
Notes:
- Kafka URL and topic to consume messages from.
- Specify the output period, how often to write out the zero-lag counts histogram to disk.
- Bootstrap file. The program first tries to load the specified output file to get initial counts; if that file doesn't exist, it uses the bootstrap file to start up.
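That start-up choice amounts to something like this (a sketch; the real program then loads the chosen zero-lag counts file):

```python
import os

def initial_counts_path(output_path, bootstrap_path):
    """Pick which file to seed the zero-lag counts from: prefer the job's own
    previous output so a restart resumes where it left off, otherwise fall back
    to the bootstrap file."""
    return output_path if os.path.exists(output_path) else bootstrap_path
```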
Notes:
- YAML configuration file from the `web` directory; sets dashboard and plotting options, the data backend (InfluxDB, where metrics are stored), and schemas
- Kafka URI to get messages from
- Data type. Always triggers
- topics to subscribe to
- one schema per topic indicating metrics to aggregate
- No file outputs.
The metrics consumed here are produced to Kafka by the gstlal_inspiral jobs; see the update function in EyeCandy (part of the LLOIDTracker).
Notes:
- YAML configuration file from the `web` directory; sets dashboard and plotting options, the data backend (InfluxDB, where metrics are stored), and schemas (metrics to aggregate, e.g. FAR history or SNR history, along with the aggregation type (min, max), etc.)
- Kafka URI to get messages from, topics to subscribe to, one schema per topic indicating metrics to aggregate
- No file outputs.
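To illustrate the kind of reduction a schema describes (e.g. the max of the SNR history over fixed time spans), here is a standalone sketch; the real aggregation is performed by the scald collectors themselves, not by this function:

```python
from collections import defaultdict

def aggregate(samples, span=1.0, mode="max"):
    """Reduce (gps_time, value) samples to one value per `span`-second bucket."""
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[int(t // span)].append(v)
    reducer = {"min": min, "max": max}[mode]
    return {k * span: reducer(vals) for k, vals in sorted(buckets.items())}

# aggregate([(0.1, 5.0), (0.7, 9.0), (1.2, 3.0)], mode="max") -> {0.0: 9.0, 1.0: 3.0}
```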
Kafka topics
Notes:
- <analysis tag> is a user-provided string, e.g. "mario_MDC04", which gives topic prefixes like "gstlal.inspiral_mario_MDC04"
- <inj> is either the optional string "inj_" or nothing, which distinguishes an injection run from a non-injection run
- <ifo> is a particular detector, e.g., “L1”, or “K1”
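Putting those placeholders together, a topic name for the common "gstlal.inspiral_" topics can be built like this (illustrative helper; it does not cover the testsuite topics):

```python
def topic(metric, analysis_tag, ifo=None, inj=False):
    """Build a topic name such as gstlal.inspiral_mario_MDC04.inj_L1_snr_history."""
    parts = []
    if inj:
        parts.append("inj")       # injection-run topics carry the inj_ prefix
    if ifo is not None:
        parts.append(ifo)         # per-detector topics carry the IFO, e.g. L1
    parts.append(metric)
    return f"gstlal.inspiral_{analysis_tag}." + "_".join(parts)

# topic("snr_history", "mario_MDC04", ifo="L1", inj=True)
#   -> "gstlal.inspiral_mario_MDC04.inj_L1_snr_history"
```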
Scientific metric topics:
- gstlal.inspiral_<analysis tag>.far_history:
- gstlal.inspiral_<analysis tag>.likelihood_history:
- gstlal.inspiral_<analysis tag>.inj_likelihood_history:
- gstlal.inspiral_<analysis tag>.snr_history:
- gstlal.inspiral_<analysis tag>.inj_snr_history:
- gstlal.inspiral_<analysis tag>.<ifo>_snr_history:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_snr_history:
- gstlal.testsuite_<analysis tag>.<ifo>_psd:
- gstlal.inspiral_<analysis tag>.coinc:
Data quality metric topics:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_inj_dqvectorsegments:
- gstlal.inspiral_<analysis tag>.<ifo>_dqvectorsegments:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_dqvectorsegments:
- gstlal.testsuite_<analysis tag>.<ifo>_dqvectorsegments:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_whitehtsegments:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_inj_whitehtsegments:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_inj_statevectorsegments:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_statevectorsegments:
- gstlal.inspiral_<analysis tag>.<ifo>_statevectorsegments:
- gstlal.testsuite_<analysis tag>.<ifo>_statevectorsegments:
- gstlal.inspiral_<analysis tag>.<ifo>_strain_dropped:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_strain_dropped:
- gstlal.inspiral_<analysis tag>.<ifo>_noise:
Latency metric topics:
- gstlal.inspiral_<analysis tag>.<ifo>_snrSlice_latency:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_snrSlice_latency:
- gstlal.inspiral_<analysis tag>.<ifo>_datasource_latency:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_datasource_latency:
- gstlal.inspiral_<analysis tag>.inj_latency_history:
- gstlal.inspiral_<analysis tag>.latency_history:
- gstlal.inspiral_<analysis tag>.<ifo>_whitening_latency:
- gstlal.inspiral_<analysis tag>.inj_<ifo>_whitening_latency:
- gstlal.inspiral_<analysis tag>.inj_all_itacac_latency:
- gstlal.inspiral_<analysis tag>.all_itacac_latency:
Event topics:
- gstlal.inspiral_<analysis tag>.favored_events:
- gstlal.inspiral_<analysis tag>.inj_events:
- gstlal.inspiral_<analysis tag>.uploads:
- gstlal.inspiral_<analysis tag>.events:
- gstlal.inspiral_<analysis tag>.p_astro:
- gstlal.inspiral_<analysis tag>.ranking_stat:
Monitoring topics:
- gstlal.inspiral_<analysis tag>.inj_ram_history:
- gstlal.inspiral_<analysis tag>.ram_history:
- gstlal.inspiral_<analysis tag>.inj_uptime:
- gstlal.inspiral_<analysis tag>.uptime:
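Any of the topics above can be monitored with a standard Kafka client. This sketch (using kafka-python; the broker address and analysis tag are placeholders) tails the latency history of a non-injection analysis:

```python
import json
from kafka import KafkaConsumer

ANALYSIS_TAG = "mario_MDC04"                      # placeholder analysis tag
TOPIC = f"gstlal.inspiral_{ANALYSIS_TAG}.latency_history"

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers="kafka.example.org:9092",   # placeholder broker
    value_deserializer=lambda m: json.loads(m.decode()),
    auto_offset_reset="latest",
)
for message in consumer:
    # The payload schema is set by the producing gstlal_inspiral job and is not
    # reproduced here, so just print whatever arrives.
    print(message.topic, message.value)
```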
On disk layout
A shared file system is used to store configuration data, archives of trigger outputs, and occasionally to pass information between running jobs (though the low-latency information is typically passed via HTTP or Kafka).
- archive: empty?
- dtdphi: Makefile only?
- mass_model: H1L1V1-GSTLAL_MASS_MODEL-0-0.xml.gz: wrong file extension??? and Makefile
- profiles: empty?
- svd:
- cit_mario_online.yml
- ics_online.yml
- psd: H1L1V1-GSTLAL_REFERENCE_PSD-0-0.xml.gz Makefile
- influx_creds.sh
- bank: bbh_low_q.xml.gz bns.xml.gz imbh_low_q.xml.gz Makefile mario_bros_offline.xml.gz nsbh.xml.gz other_bbh.xml.gz
- svd_bank: empty???
- H1L1V1-GSTLAL_REFERENCE_PSD-0-0.xml.gz
- Makefile
- env.sh
- H1L1V1-GSTLAL_SVD_MANIFEST-0-0.json
- tisi.xml
- filter: contains e.g., svd_bank/H1-0358_GSTLAL_SVD_BANK-0-0.xml.gz
- nohup.out
- split_bank: contains e.g., H1L1V1-0191_GSTLAL_SPLIT_BANK_0577-0-0.xml.gz
- aggregator: contains e.g., /1/3/3/6/2/7/V1-PSD-1336279900-100.hdf5 do we need these at all?
- 13362: contains e.g., H1L1V1-0918_inj_mdc04_LLOID-1336273048-14933.xml.gz H1L1V1-0918_inj_mdc04_LLOID-1336273048-60.xml.gz H1L1V1-0918_inj_mdc04_SEGMENTS-1336273048-9028.xml.gz H1L1V1-0918_inj_mdc04_SEGMENTS-1336273108-0.xml.gz H1L1V1-0918_noninj_LLOID-1336273064-14864.xml.gz H1L1V1-0918_noninj_LLOID-1336273064-9.xml.gz H1L1V1-0918_noninj_LLOID-1336287922-14406.xml.gz H1L1V1-0918_noninj_LLOID_DISTSTATS-1336273064-10.xml.gz H1L1V1-0918_noninj_LLOID_DISTSTATS-1336273064-14865.xml.gz H1L1V1-0918_noninj_LLOID_DISTSTATS-1336287922-14407.xml.gz H1L1V1-0918_noninj_SEGMENTS-1336273064-9012.xml.gz H1L1V1-0918_noninj_SEGMENTS-1336273074-0.xml.gz H1L1V1-0918_noninj_SEGMENTS-1336287922-14399.xml.gz
- plots: contains e.g., COMBINED-GSTLAL_INSPIRAL_PLOT_BACKGROUND_ALL_NOISE_LIKELIHOOD_RATIO_CCDF_CLOSED_BOX-1336272983-114190.png
- config.yml
- web: contains e.g., inspiral.yml online_dashboard.json
- test-suite: is this the test suite dag? Is this the preferred way to run it? Can we point to test-suite specific documentation?
- dist_stat_pdfs: contains e.g., H1L1V1-GSTLAL_DIST_STAT_PDFS-0-0.xml.gz
- gracedb_uploads: contains e.g., 13372/H1L1V1-GSTLAL_0621_inj_mdc04_7_945_CBC_AllSky_0621_RankingData-1337299900-1.xml.gz 13372/H1L1V1-GSTLAL_0621_inj_mdc04_7_945_CBC_AllSky-1337299900-1.xml Where is pastro??
- dist_stats: contains e.g., H1L1V1-0915_GSTLAL_DIST_STATS-0-0.xml.gz
- logs
- zerolag_dist_stat_pdfs: contains e.g., H1L1V1-0406_GSTLAL_ZEROLAG_DIST_STAT_PDFS-0-0.xml.gz
HTTP traffic
FIXME
Important References and Resources
Software and service stack
The GstLAL online analysis relies on several open-source software libraries and services. Some of these are available in the gwsci container, but some are not; the following services are required beyond the software in the gwsci container:
- Kafka - used to stream data products for I/O between different processes
- InfluxDB - used to store metric data
- Grafana - used to visualize metric data
Additionally, there is an implicit assumption that you are deploying this analysis on an LDG-compatible site running HTCondor, with low-latency data services available (a few different flavors are supported).