Online Operations Manual
Common Errors and Known Solutions
Node issue
When running your setup DAG, online_setup_dag.dag, you may find that some of the gstlal_inspiral_svd_bank jobs fail with an error like:
Traceback (most recent call last):
...
...
ImportError: libmkl_rt.so: cannot open shared object file: No such file or directory
...
...
ModuleNotFoundError: No module named '_lal'
This is possibly due to an incompatibility between some of the cluster nodes and the gstlal jobs. You can try condor_release job_ID and let the jobs land on other, working nodes.
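Before releasing jobs, it can help to confirm which ones actually hit this error. The following is a minimal sketch that searches job error files for the two messages shown above. The function name find_node_errors and the logs directory are hypothetical; point it at wherever your DAG writes its stderr files.

```shell
# Hypothetical helper: list files under a log directory that contain either
# of the two import errors shown above. The directory layout is an assumption.
find_node_errors() {
    log_dir="${1:-logs}"
    grep -rlE "libmkl_rt\.so: cannot open shared object file|No module named '_lal'" "$log_dir" | sort
}

# Usage (not run here):
#   find_node_errors logs     # files whose jobs hit the node issue
#   condor_q -hold            # list held jobs and their hold reasons
#   condor_release job_ID     # release a held job so it can be rescheduled
```

Matching on the error text rather than on exit codes keeps unrelated failures out of the list.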
If condor_release does not help, try touch ~/.nointel and log out of the cluster. Then log back in, source env.sh, and resubmit online_setup_dag.dag. The ~/.nointel file tells the cluster to avoid running the gstlal jobs on the Intel nodes.
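The marker-file step can be sketched as a small helper. The function name ensure_nointel is hypothetical, and the use of condor_submit_dag in the trailing comments is an assumption about how the DAG was originally submitted.

```shell
# Create the marker file only if it is missing. The home directory is an
# optional argument so the helper can be tried out on a scratch directory.
ensure_nointel() {
    home_dir="${1:-$HOME}"
    marker="$home_dir/.nointel"
    if [ -e "$marker" ]; then
        echo "already present: $marker"
    else
        touch "$marker" && echo "created: $marker"
    fi
}

# After running ensure_nointel, log out, log back in, and then (not run here):
#   source env.sh
#   condor_submit_dag online_setup_dag.dag   # assumption: the DAG was submitted this way
```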
File corruption
Errors of the following form often indicate file corruption:
xml.sax._exceptions.SAXParseException: <unknown>:1:0: syntax error
This can happen if Git Large File Storage (LFS) is not installed when large files are downloaded from a repository. Try running git lfs install, and then re-download any files that were checked into the repo using LFS. These files can be identified by an LFS icon next to the file name in the GitLab web interface.
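Without Git LFS, the working tree holds small text pointer files in place of the real data, and a pointer file begins with the line version https://git-lfs.github.com/spec/v1. An XML parser given such a file fails on the first character, which is consistent with the 1:0 position in the error above. The sketch below checks whether a file is still a pointer; the function name is_lfs_pointer is hypothetical.

```shell
# Return success if the first line of the file looks like a Git LFS pointer.
is_lfs_pointer() {
    head -n 1 "$1" 2>/dev/null | grep -q '^version https://git-lfs\.github\.com/spec/'
}

# Usage (not run here; some_file.xml is a placeholder):
#   is_lfs_pointer some_file.xml && echo "still a pointer, content not downloaded"
#   git lfs install    # set up the LFS hooks for your user
#   git lfs pull       # download the real content for the current checkout
#   git lfs ls-files   # list the files tracked by LFS in this repository
```

Running git lfs pull inside an existing clone is one way to re-download the affected files without cloning again.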
Environment variables
GSTLAL_CHECK_TIMESTAMPS
The check_timestamps element is used for debugging purposes. If the environment variable GSTLAL_CHECK_TIMESTAMPS is not set, the check_timestamps element is skipped. If GSTLAL_CHECK_TIMESTAMPS is exported, the check_timestamps element is called.
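As a minimal sketch, the variable can be toggled from the shell that launches the job. The text above only says the variable must be exported, so the value 1 is an assumption, and check_timestamps_enabled is a hypothetical helper that only reports whether the variable is set in the current shell.

```shell
# Succeeds if the variable is set at all, even to an empty string.
check_timestamps_enabled() {
    [ -n "${GSTLAL_CHECK_TIMESTAMPS+set}" ]
}

# Enable the element for processes started from this shell.
export GSTLAL_CHECK_TIMESTAMPS=1   # assumption: the value itself is arbitrary
check_timestamps_enabled && echo "check_timestamps will be called"

# Disable it again.
unset GSTLAL_CHECK_TIMESTAMPS
check_timestamps_enabled || echo "check_timestamps will be skipped"
```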
GST_DEBUG
Sets the GStreamer logging level, from 0 (no output) up to 9 (most verbose).
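A short sketch of typical settings follows. GST_DEBUG and GST_DEBUG_FILE are standard GStreamer environment variables; the category name and the log path in the comments are placeholders, not names taken from gstlal.

```shell
# GStreamer debug levels: 0 none, 1 ERROR, 2 WARNING, 3 FIXME, 4 INFO,
# 5 DEBUG, 6 LOG, 7 TRACE, 9 MEMDUMP.
export GST_DEBUG=2    # errors and warnings from every category

# Per-category form: a comma-separated list of category:level pairs, with *
# as a wildcard. "some_category" is a placeholder, not a real category name.
#   export GST_DEBUG='*:2,some_category:5'

# Optionally send the log to a file instead of stderr (path is hypothetical).
#   export GST_DEBUG_FILE=/tmp/gst_debug.log
```

Higher levels produce very large logs, so it is usually better to raise the level for one category than for all of them.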