Record job failures with application context.
AT-AT needs to be able to track which user tasks failed and why. To accomplish this we:

- Enabled Celery's results backend, which logs task results to a data store; a Postgres table, in our case. (https://docs.celeryproject.org/en/latest/userguide/tasks.html#result-backends)
- Created tables to track the relationships between the relevant models (Environment, EnvironmentRole) and their task failures.
- Added an `on_failure` hook that tasks can use. The hook adds records to the job failure tables.

Now a resource like an `Environment` has access to its task failures through the corresponding failure table.

Notes:

- It might prove useful to use a real foreign key to the Celery results table eventually. I did not do it here because it requires that we explicitly store the Celery results table schema as a migration and add a model for it. In the current implementation, AT-AT can be agnostic about where the results live.
- We store the task results indefinitely, so it is important to specify tasks for which we do not care about the results (like `send_mail`) via the `ignore_result` kwarg.
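The `on_failure` hook described above can be sketched roughly as follows. This is not the code from `atst.jobs`, which this diff does not show. To stay runnable without a broker, it uses a small stand-in class instead of `celery.Task` and an in-memory SQLite table instead of Postgres. The table name `environment_job_failures`, the classes `StandInTask`, `RecordEnvironmentFailureSketch`, and `FailHard`, and the helper `failures_for` are all hypothetical. Only the `on_failure(self, exc, task_id, args, kwargs, einfo)` parameter list mirrors Celery's hook.

```python
import sqlite3

# In-memory stand-in for a job failure table (hypothetical schema).
# task_id is stored as a plain value, not a foreign key, so the sketch
# stays agnostic about where the task results themselves live.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE environment_job_failures ("
    "id INTEGER PRIMARY KEY, "
    "environment_id TEXT NOT NULL, "
    "task_id TEXT NOT NULL)"
)


class StandInTask:
    """Tiny stand-in for celery.Task: runs the body, calls on_failure on error."""

    def run(self, *args, **kwargs):
        raise NotImplementedError

    def on_failure(self, exc, task_id, args, kwargs, einfo):
        """Same parameter list as Celery's hook; does nothing by default."""

    def apply(self, task_id, args=(), kwargs=None):
        kwargs = kwargs or {}
        try:
            return self.run(*args, **kwargs)
        except Exception as exc:
            self.on_failure(exc, task_id, args, kwargs, einfo=None)
            raise


class RecordEnvironmentFailureSketch(StandInTask):
    """Adds a failure row when a task called with environment_id fails."""

    def on_failure(self, exc, task_id, args, kwargs, einfo):
        if "environment_id" in kwargs:
            db.execute(
                "INSERT INTO environment_job_failures (environment_id, task_id) "
                "VALUES (?, ?)",
                (kwargs["environment_id"], task_id),
            )
            db.commit()


class FailHard(RecordEnvironmentFailureSketch):
    def run(self, environment_id=None):
        raise ValueError("something bad happened")


def failures_for(environment_id):
    """Return the task ids of the failures recorded for one environment."""
    rows = db.execute(
        "SELECT task_id FROM environment_job_failures WHERE environment_id = ?",
        (environment_id,),
    )
    return [row[0] for row in rows]


try:
    FailHard().apply(task_id="task-1", kwargs={"environment_id": "env-1"})
except ValueError:
    pass  # the original exception still reaches the caller

print(failures_for("env-1"))  # ['task-1']
```

Because the hook reads the resource id from the task's keyword arguments, a task in this sketch has to be invoked with `environment_id` as a kwarg for its failure to be recorded.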
41
tests/test_jobs.py
Normal file
@@ -0,0 +1,41 @@
import pytest

from atst.jobs import RecordEnvironmentFailure, RecordEnvironmentRoleFailure

from tests.factories import EnvironmentFactory, EnvironmentRoleFactory


def test_environment_job_failure(celery_app, celery_worker):
    @celery_app.task(bind=True, base=RecordEnvironmentFailure)
    def _fail_hard(self, environment_id=None):
        raise ValueError("something bad happened")

    environment = EnvironmentFactory.create()
    celery_worker.reload()

    # Use apply instead of delay since we are testing the on_failure hook only
    task = _fail_hard.apply(kwargs={"environment_id": environment.id})
    with pytest.raises(ValueError):
        task.get()

    assert environment.job_failures
    job_failure = environment.job_failures[0]
    assert job_failure.task == task


def test_environment_role_job_failure(celery_app, celery_worker):
    @celery_app.task(bind=True, base=RecordEnvironmentRoleFailure)
    def _fail_hard(self, environment_role_id=None):
        raise ValueError("something bad happened")

    role = EnvironmentRoleFactory.create()
    celery_worker.reload()

    # Use apply instead of delay since we are testing the on_failure hook only
    task = _fail_hard.apply(kwargs={"environment_role_id": role.id})
    with pytest.raises(ValueError):
        task.get()

    assert role.job_failures
    job_failure = role.job_failures[0]
    assert job_failure.task == task