Online Operations Manual


Common Errors and Known Solutions

Node issue

When running the setup DAG online_setup_dag.dag, you may find that some of the gstlal_inspiral_svd_bank jobs fail with an error like the following:

Traceback (most recent call last):
	...
	...
	ImportError: libmkl_rt.so: cannot open shared object file: No such file or directory
	...
	...
	ModuleNotFoundError: No module named '_lal'

This is possibly due to an incompatibility between some of the cluster nodes and the gstlal jobs. Try running condor_release job_ID so that the jobs can land on other, working nodes.

If condor_release does not help, run touch ~/.nointel and log out of the cluster. Then log back in, source env.sh, and resubmit online_setup_dag.dag. The ~/.nointel file tells the cluster to avoid running the gstlal jobs on the Intel nodes.
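
To decide which jobs are worth releasing, it can help to check which job logs actually contain the errors quoted above. The following is a minimal sketch, not part of gstlal: the helper names, the log directory, and the "*.err" file pattern are all assumptions and should be adapted to wherever your DAG writes job stderr.

```python
from pathlib import Path

# The two error signatures quoted in the traceback above.
NODE_ISSUE_SIGNATURES = (
    "ImportError: libmkl_rt.so: cannot open shared object file",
    "ModuleNotFoundError: No module named '_lal'",
)


def has_node_issue(log_text):
    """Return True if the text contains either node-issue signature."""
    return any(signature in log_text for signature in NODE_ISSUE_SIGNATURES)


def find_affected_logs(log_dir, pattern="*.err"):
    """List log files under log_dir whose contents show the node issue.

    Both log_dir and the '*.err' pattern are hypothetical defaults;
    they are not taken from the gstlal DAG layout.
    """
    affected = []
    for path in sorted(Path(log_dir).rglob(pattern)):
        # errors="replace" keeps the scan going on logs with odd bytes.
        if has_node_issue(path.read_text(errors="replace")):
            affected.append(path)
    return affected
```

Calling find_affected_logs on your log directory returns the matching files, which you can then map back to job IDs before running condor_release.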

File corruption

Errors of the following form often indicate file corruption:

xml.sax._exceptions.SAXParseException: <unknown>:1:0: syntax error

This can happen if Git Large File Storage (LFS) is not installed when large files are downloaded from a repository. Try running git lfs install, and then redownload any files that were checked into the repository using LFS. These files can be identified by an LFS icon next to the file name in the GitLab web interface.
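
When LFS is not installed, git leaves a small text pointer in place of each large file, and that pointer begins with a fixed version line. A parser expecting XML would then fail on the first character, which is consistent with the 1:0 position in the error above. As a rough sketch (the function names are illustrative, not part of gstlal or git), a suspect file can be checked like this:

```python
# First line of every Git LFS pointer file (Git LFS pointer format, v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(head):
    """Return True if the leading bytes of a file match an LFS pointer."""
    return head.startswith(LFS_POINTER_PREFIX)


def is_lfs_pointer_file(path):
    """Read just enough of the file to apply looks_like_lfs_pointer."""
    with open(path, "rb") as handle:
        return looks_like_lfs_pointer(handle.read(len(LFS_POINTER_PREFIX)))
```

If is_lfs_pointer_file returns True for a file you expected to be XML, the real contents were never downloaded. After git lfs install, running git lfs pull inside the repository is one way to fetch them.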

Environment variables

GSTLAL_CHECK_TIMESTAMPS

The check_timestamps element is used for debugging purposes. If the environment variable GSTLAL_CHECK_TIMESTAMPS is not set, the check_timestamps element is skipped. If GSTLAL_CHECK_TIMESTAMPS is exported, the check_timestamps element is called.
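
The set/unset behaviour described above can be illustrated with a short sketch. This is not the gstlal implementation: the function name is hypothetical, and it assumes that only the presence of the variable matters, so that any value, even an empty one, counts as set.

```python
import os


def check_timestamps_enabled(environ=None):
    """Return True if GSTLAL_CHECK_TIMESTAMPS is present in the environment.

    Assumption: presence alone enables the element; the value is ignored.
    """
    env = os.environ if environ is None else environ
    return "GSTLAL_CHECK_TIMESTAMPS" in env
```

Passing a plain dict instead of the real environment makes it easy to see both cases: a dict containing the key returns True and an empty dict returns False.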

GST_DEBUG

Sets the GStreamer logging level, from 0 (no output) to 9 (most verbose).
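
GST_DEBUG accepts a default level optionally followed by comma-separated category:level pairs, for example 2,basesrc:5. The sketch below builds such a value and an environment for a child process. The helper names are illustrative only, and basesrc is used purely as an example category.

```python
import os


def gst_debug_value(default_level, categories=None):
    """Build a GST_DEBUG string such as '2,basesrc:5'."""
    parts = [str(default_level)]
    for name, level in sorted((categories or {}).items()):
        parts.append(f"{name}:{level}")
    return ",".join(parts)


def env_with_gst_debug(value, base_env=None):
    """Return a copy of the environment with GST_DEBUG set.

    The copy can be passed as env= to subprocess.run without touching
    the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GST_DEBUG"] = value
    return env
```

Building a copy of the environment rather than modifying os.environ keeps the verbose logging confined to the one process being debugged.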