Pullers do not recover after a database operation failure
If a data pull fails, for example because the database is temporarily stopped, the puller stops scheduling future pulls. This is judging from the logs, or rather the lack of them.
Two types of failures are logged, and either one may (or may not) need to be handled in order to keep triggering pulls:
- "Errors were encountered while pulling data from apps" in periodicexecutor.go:167
- "Problem getting interval" in periodicexecutor.go:181
Comments in the code suggest that a periodic re-check is intended:
// Interval is used while the puller is inactive to check if it was re-enabled.
const InactiveInterval int64 = 60
However, that re-check does not work in this case, or is designed not to. It makes sense to treat failures the same way as intentional disabling.
This sketchy patch fixes the problem, but it is too invasive in one part and probably incomplete in others.
diff --git a/backend/server/agentcomm/puller.go b/backend/server/agentcomm/puller.go
index c48348d1..7b38e5b0 100644
--- a/backend/server/agentcomm/puller.go
+++ b/backend/server/agentcomm/puller.go
@@ -51,5 +51,2 @@ func NewPeriodicPuller(db *dbops.PgDB, agents ConnectedAgents, pullerName, inter
)
- if err != nil {
- return nil, err
- }
@@ -64,3 +61,3 @@ func NewPeriodicPuller(db *dbops.PgDB, agents ConnectedAgents, pullerName, inter
- return periodicPuller, nil
+ return periodicPuller, err
}
diff --git a/backend/util/periodicexecutor.go b/backend/util/periodicexecutor.go
index 731d8bb1..188c453c 100644
--- a/backend/util/periodicexecutor.go
+++ b/backend/util/periodicexecutor.go
@@ -183,3 +183,2 @@ func (executor *PeriodicExecutor) executorLoop() {
log.Errorf("Problem getting interval: %+v", err)
- return
}
@@ -190,9 +189,3 @@ func (executor *PeriodicExecutor) executorLoop() {
- if interval <= 0 && executor.active {
- // if executor should be disabled but it is active then
- if executorInterval != InactiveInterval {
- executor.Reset(InactiveInterval)
- }
- executor.active = false
- } else if interval > 0 && interval != executorInterval {
+ if interval > 0 && interval != executorInterval {
// if executor interval is changed and is not 0 (disabled)