资 源 简 介
io-watchdog is a facility for monitoring user applications and
parallel jobs for "hangs" which typically have a side effect
of ceasing all IO in a cyclic application (i.e. one that
writes something to a log or data file during each cycle of
computation). The io-watchdog attempts to watch all IO coming
from an application and triggers a set of user-defined actions
when IO has stopped for a configurable timeout period.
Read the full IO Watchdog README
NEWS
io-watchdog v0.8 released
2010-08-23
This version fixes a critical bug in the io-watchdog plugin for SLURM
which caused the --io-watchdog option to srun to fail to work for
all jobs unless the rank option was used (e.g. --io-watchdog rank=0).
All users of the io-watchdog plugin for SLURM should update to this release
immediately.
io-watchdog v0.7 r