Retry Safety

Retry safety is an advanced topic, and most developers using Flawless don't need to be aware of it. Flawless will always do the right thing to guarantee that each line of code is executed exactly once. If this guarantee is not possible, Flawless will mark the workflow as failed. This document describes how developers can additionally harden workflows so that they recover automatically from even the rarest edge-case failures.

Retry safety is a property specific to the implementation of the Flawless side effect log. If a workflow is abruptly stopped, Flawless retries it, using the existing side effect log to ensure that no side effect is executed twice. However, retries are tricky and not always safe. Whether a retry is safe depends on which external systems the workflow talks to and whether their operations are idempotent.
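The replay idea can be illustrated with a toy model. The sketch below is not Flawless's implementation. SideEffectLog, restart and run_step are hypothetical names, and it assumes that every side effect produces a String result that can be recorded.

```rust
// Toy model of a side effect log. This is NOT Flawless's implementation;
// `SideEffectLog`, `restart` and `run_step` are hypothetical names.
struct SideEffectLog {
    // Recorded results of completed side effects, in execution order.
    results: Vec<String>,
    // Index of the next side effect within the current run.
    cursor: usize,
}

impl SideEffectLog {
    fn new() -> Self {
        SideEffectLog { results: Vec::new(), cursor: 0 }
    }

    // Begin a retry: keep the recorded results and rewind the cursor.
    fn restart(&mut self) {
        self.cursor = 0;
    }

    // Replay the recorded result if this step already completed in an
    // earlier run; otherwise execute the effect and record its result.
    fn run_step<F: FnOnce() -> String>(&mut self, effect: F) -> String {
        if let Some(recorded) = self.results.get(self.cursor).cloned() {
            self.cursor += 1;
            return recorded;
        }
        let fresh = effect();
        self.results.push(fresh.clone());
        self.cursor += 1;
        fresh
    }
}
```

In this model, calling run_step, then restart, then run_step again returns the recorded result the second time without invoking the closure. The model only covers effects that completed. An effect that was started but never recorded is the hard case.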

Flawless follows Rust's philosophy here: it always fails when it can't guarantee complete safety, but it offers an escape hatch for developers who know that an operation is safe to repeat. This escape hatch comes in the form of the .idempotent() function.

Let's look at an example and a failure scenario from which Flawless can't recover.

use flawless::workflow;
use flawless_http::post;

#[workflow("http-call")]
fn http_call() {
    let response = post("https://example.com/safe-to-repeat")
        .set_header("Accept", "application/json")
        .send()
        .unwrap();
    let response_txt = String::from_utf8(response.body()).unwrap();
    log::info!("{}", response_txt.trim());
}

If this workflow is interrupted before or after the .send() call, Flawless will always be able to resume it. If the workflow stops exactly when the request has been sent but no response has been received yet, Flawless can't tell whether the external endpoint observed the call. To preserve the called-exactly-once guarantee, it will not repeat the request and will mark the workflow as failed.

It's also important to mention that Flawless doesn't care if the endpoint returns an error. It's the developer's duty to handle this case. Flawless is only concerned with failure scenarios that the developer can't handle directly in code. If the machine is shut down while some code is running, no amount of if-elses is going to help you. But once the machine is back up, Flawless will make it seem like the code just continued running from where it stopped.

By marking the request as .idempotent(), we tell Flawless that this endpoint is idempotent and can safely be called multiple times. Flawless can then repeat the request after an interruption, so the workflow can always run to completion.

use flawless::workflow;
use flawless_http::post;

#[workflow("http-call")]
fn http_call() {
    let response = post("https://example.com/safe-to-repeat")
        .set_header("Accept", "application/json")
        // Tell Flawless this request is safe to repeat after an interruption.
        .idempotent()
        .send()
        .unwrap();
    let response_txt = String::from_utf8(response.body()).unwrap();
    log::info!("{}", response_txt.trim());
}
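The three interruption points and the effect of the idempotent marker can be summarized as a small decision function. This is a minimal sketch of the behavior described in this document, not Flawless internals. CallState, Resume and on_resume are hypothetical names.

```rust
// Toy model of the resume decision. All names are hypothetical and do not
// mirror Flawless internals.
#[derive(Debug, PartialEq)]
enum CallState {
    NotStarted,        // interrupted before the request was sent
    InFlight,          // request sent, no response recorded
    Completed(String), // response recorded in the side effect log
}

#[derive(Debug, PartialEq)]
enum Resume {
    Execute,        // perform the call
    Replay(String), // reuse the recorded response, do not call again
    Fail,           // exactly-once can't be guaranteed, fail the workflow
}

fn on_resume(state: CallState, idempotent: bool) -> Resume {
    match (state, idempotent) {
        // Nothing was sent yet, so sending now is still the first call.
        (CallState::NotStarted, _) => Resume::Execute,
        // The response is already known, so the call is never repeated.
        (CallState::Completed(body), _) => Resume::Replay(body),
        // The endpoint may or may not have seen the request. Repeating it
        // is only acceptable if the developer marked it idempotent.
        (CallState::InFlight, true) => Resume::Execute,
        (CallState::InFlight, false) => Resume::Fail,
    }
}
```

Only the InFlight row depends on the marker. For the other two states the outcome is the same with or without it.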

Idempotence is not limited to the flawless_http crate. It's a general concept, and the Idempotence trait is part of the flawless crate.
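To illustrate why a trait is a natural fit for this, here is a sketch of how a marker trait could let unrelated operation types opt in to being repeated. The signature of the real Idempotence trait is not shown in this document, so everything below is an assumption. RepeatSafe, HttpCall, QueuePublish and may_repeat_in_flight are made-up names and deliberately differ from the flawless API.

```rust
// Hypothetical sketch: `RepeatSafe` is NOT the trait from the flawless crate.
trait RepeatSafe: Sized {
    // Consume the operation and return it flagged as safe to repeat.
    fn mark_repeat_safe(self) -> Self;
    // Report whether the operation has been flagged.
    fn is_repeat_safe(&self) -> bool;
}

// Two made-up operation types that share nothing except the marker.
struct HttpCall {
    repeat_safe: bool,
}

struct QueuePublish {
    repeat_safe: bool,
}

impl RepeatSafe for HttpCall {
    fn mark_repeat_safe(mut self) -> Self {
        self.repeat_safe = true;
        self
    }
    fn is_repeat_safe(&self) -> bool {
        self.repeat_safe
    }
}

impl RepeatSafe for QueuePublish {
    fn mark_repeat_safe(mut self) -> Self {
        self.repeat_safe = true;
        self
    }
    fn is_repeat_safe(&self) -> bool {
        self.repeat_safe
    }
}

// Generic over any operation: the caller needs only the trait, not the
// concrete type, to decide whether an in-flight operation may be repeated.
fn may_repeat_in_flight<T: RepeatSafe>(op: &T) -> bool {
    op.is_repeat_safe()
}
```

With this shape, any crate that defines a new kind of side effect can implement the trait for its own types, and the retry logic stays unchanged.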