Failed Tasks in Nornir

Sometimes tasks fail. Let's see how to deal with failed tasks in nornir.

As usual, let's start with the necessary boilerplate:

import logging

from nornir import InitNornir
from nornir.core.task import Task, Result
from nornir_utils.plugins.functions import print_result

# instantiate the nr object
nr = InitNornir(config_file="config.yaml")
# let's filter it down to simplify the output
cmh = nr.filter(site="cmh", type="host")

def count(task: Task, number: int) -> Result:
    return Result(
        host=task.host,
        result=f"{[n for n in range(0, number)]}"
    )

def say(task: Task, text: str) -> Result:
    if task.host.name == "host2.cmh":
        raise Exception("I can't say anything right now")
    return Result(
        host=task.host,
        result=f"{task.host.name} says {text}"
    )

Now, as an example, we will use a grouped task similar to the one from the previous tutorial:

def greet_and_count(task: Task, number: int):
    task.run(
        name="Greeting is the polite thing to do",
        severity_level=logging.DEBUG,
        task=say,
        text="hi!",
    )

    task.run(
        name="Counting beans",
        task=count,
        number=number,
    )
    task.run(
        name="We should say bye too",
        severity_level=logging.DEBUG,
        task=say,
        text="bye!",
    )

    # let's inform if we counted even or odd times
    even_or_odds = "even" if number % 2 == 1 else "odd"
    return Result(
        host=task.host,
        result=f"{task.host} counted {even_or_odds} times!",
    )

Remember that the say task has a hardcoded error for host2.cmh. Let's see what happens when we run the task:

result = cmh.run(
    task=greet_and_count,
    number=5,
)

Let's inspect the result object:

result.failed
True
result.failed_hosts
{'host2.cmh': MultiResult: [Result: "greet_and_count", Result: "Greeting is the polite thing to do"]}
result['host2.cmh'].exception
nornir.core.exceptions.NornirSubTaskError()
result['host2.cmh'][1].exception
Exception("I can't say anything right now")

As you can see, the result object knows that something went wrong, and you can inspect the errors as needed.
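
To act on these errors programmatically, you can walk the failed hosts and collect the exception of every failed sub-result. The following is only a minimal sketch: the helper name summarize_failures is hypothetical, and it assumes that each Result exposes name, failed, and exception. The demo uses stand-in objects rather than real nornir classes so that it runs without an inventory.

```python
from types import SimpleNamespace


def summarize_failures(agg_result):
    """Return (host, task name, exception repr) for every failed Result.

    Hypothetical helper: it only relies on `failed_hosts` mapping a host
    name to a list-like MultiResult of objects with `name`, `failed`,
    and `exception` attributes.
    """
    rows = []
    for host_name, multi_result in agg_result.failed_hosts.items():
        for r in multi_result:
            if r.failed:
                rows.append((host_name, r.name, repr(r.exception)))
    return rows


# Stand-in objects (NOT nornir classes) that mimic the session above.
_parent = SimpleNamespace(
    name="greet_and_count",
    failed=True,
    exception=RuntimeError("subtask failed"),
)
_sub = SimpleNamespace(
    name="Greeting is the polite thing to do",
    failed=True,
    exception=Exception("I can't say anything right now"),
)
fake_result = SimpleNamespace(failed_hosts={"host2.cmh": [_parent, _sub]})

for row in summarize_failures(fake_result):
    print(row)
```

With a real run, passing the result returned by cmh.run(...) in place of fake_result should work the same way, as long as the assumption about those attributes holds.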

You can also use the print_result function on it:

print_result(result)
greet_and_count*****************************************************************
* host1.cmh ** changed : False *************************************************
vvvv greet_and_count ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
host1.cmh counted even times!
---- Counting beans ** changed : False ----------------------------------------- INFO
[0, 1, 2, 3, 4]
^^^^ END greet_and_count ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* host2.cmh ** changed : False *************************************************
vvvv greet_and_count ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv ERROR
Subtask: Greeting is the polite thing to do (failed)

---- Greeting is the polite thing to do ** changed : False --------------------- ERROR
Traceback (most recent call last):
  File "/home/dbarroso/workspace/dbarrosop/nornir/nornir/core/task.py", line 98, in start
    r = self.task(self, **self.params)
  File "<ipython-input-1-3ab8433d31a3>", line 20, in say
    raise Exception("I can't say anything right now")
Exception: I can't say anything right now

^^^^ END greet_and_count ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

There is also a method that raises an exception if the task had an error:

from nornir.core.exceptions import NornirExecutionError
try:
    result.raise_on_error()
except NornirExecutionError:
    print("ERROR!!!")
ERROR!!!

Skipping hosts

Nornir keeps track of hosts that failed and won't run future tasks on them:

from nornir.core.task import Result

def hi(task: Task) -> Result:
    return Result(host=task.host, result=f"{task.host.name}: Hi, I am still here!")

result = cmh.run(task=hi)
print_result(result)
hi******************************************************************************
* host1.cmh ** changed : False *************************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
host1.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can force a task to run on failed hosts by passing the argument on_failed=True:

result = cmh.run(task=hi, on_failed=True)
print_result(result)
hi******************************************************************************
* host1.cmh ** changed : False *************************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
host1.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* host2.cmh ** changed : False *************************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
host2.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can also exclude the "good" hosts by using the on_good flag:

result = cmh.run(task=hi, on_failed=True, on_good=False)
print_result(result)
hi******************************************************************************
* host2.cmh ** changed : False *************************************************
vvvv hi ** changed : False vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv INFO
host2.cmh: Hi, I am still here!
^^^^ END hi ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
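
The way the two flags combine can be summarized with a small pure-Python model. This is a simplified illustration of the behavior shown in the three runs above, not nornir's actual implementation; the function name select_hosts is hypothetical, and the defaults mirror what the runs imply (on_good=True, on_failed=False).

```python
def select_hosts(all_hosts, failed_hosts, on_good=True, on_failed=False):
    """Simplified model of which hosts a run() call would target."""
    selected = []
    for name in all_hosts:
        is_failed = name in failed_hosts
        # A failed host runs only with on_failed; a good host only with on_good.
        if (is_failed and on_failed) or (not is_failed and on_good):
            selected.append(name)
    return selected


hosts = ["host1.cmh", "host2.cmh"]
failed = {"host2.cmh"}

print(select_hosts(hosts, failed))
# ['host1.cmh']
print(select_hosts(hosts, failed, on_failed=True))
# ['host1.cmh', 'host2.cmh']
print(select_hosts(hosts, failed, on_failed=True, on_good=False))
# ['host2.cmh']
```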

To achieve this, nornir keeps a set of failed hosts in its shared data object:

nr.data.failed_hosts
{'host2.cmh'}

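If you want to mark some hosts as succeeded and make them eligible for future tasks again, you can do so individually per host with the function recover_host, or reset the list completely with reset_failed_hosts: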

nr.data.reset_failed_hosts()
nr.data.failed_hosts
set()

Raising errors automatically

Alternatively, you can configure nornir to raise an exception automatically when an error occurs by using the raise_on_error configuration option:

nr = InitNornir(config_file="config.yaml", core={"raise_on_error": True})
cmh = nr.filter(site="cmh", type="host")
try:
    result = cmh.run(
        task=greet_and_count,
        number=5,
    )
except NornirExecutionError:
    print("ERROR!!!")
ERROR!!!

Workflows

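The default workflow should work for most use cases: hosts with errors are skipped, and print_result should give you enough information to understand what happened. For more complex workflows, the framework should give you enough room to implement them easily, regardless of their complexity.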
