Support automatic retry of failed builds
under review
Aaron
It would be great if CircleCI allowed you to configure an automatic retry of a build upon failure. Ideally, you would be able to specify the max number of times you would like it to retry.
CCI-I-935
Nathan Fish
under review
Would folks expect this to work as a job configuration option, or would they rather see it as a project setting where CircleCI tries to auto-retry based on some inspection of the job?
Maxime Lapointe
Nathan Fish Configuration would be more beneficial for us: different kinds of backoff options and retry limits, plus the possibility of outputting metadata during/after retries (logging, further triggers in the workflow DAG, etc.).
Grandi Andrea
Nathan Fish both would be fine. In case of config, something like RETRY_ON_FAILURE=true and RETRY_MAX_ATTEMPTS=5 could work. Same for Project settings.
The benefit of being in the config is that in many cases devs could change these values with a PR.
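To make the suggestion concrete, here is a minimal sketch of how settings named RETRY_ON_FAILURE and RETRY_MAX_ATTEMPTS could be read and validated. This is not an existing CircleCI feature; the `RetrySettings` class, the `load_retry_settings` function, and the defaults are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RetrySettings:
    """Hypothetical container for the two proposed settings."""
    retry_on_failure: bool = False
    max_attempts: int = 1


def load_retry_settings(env: Mapping[str, str]) -> RetrySettings:
    """Parse the proposed settings from an environment-like mapping.

    Assumed defaults: retries disabled, a single attempt.
    """
    enabled = env.get("RETRY_ON_FAILURE", "false").strip().lower() == "true"
    max_attempts = int(env.get("RETRY_MAX_ATTEMPTS", "1"))
    if max_attempts < 1:
        raise ValueError("RETRY_MAX_ATTEMPTS must be at least 1")
    return RetrySettings(retry_on_failure=enabled, max_attempts=max_attempts)
```

Passing a plain mapping rather than reading `os.environ` directly keeps the parsing easy to test; in a real job one could call `load_retry_settings(os.environ)`.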
Iain Beeston
Nathan Fish Per configuration would be best for me. For me, most jobs aren't flakey and there's no point in retrying. If I could turn on auto retry for specific jobs I could have fine-grained control
J. Casalino
Nathan Fish Per configuration option would also be my preference; I would not want to set it as a global option for the entire server. I would want to set the backoff and retry limits as Maxime Lapointe mentioned.
Cody Smith
Nathan Fish Job config option would make the most sense to me. I've got some jobs that run on a flaky resource class, while others use one of the reliable/builtin resource classes. So only the former need retries configured.
Mark Gibaud
Nathan Fish Per configuration. Really only for niche/specific jobs (like flakey tests) so job scope makes sense.
Nathan Fish
While maybe not exactly what folks are looking for here, would https://circleci.canny.io/api-feature-requests/p/allow-re-run-failed-tests-to-be-triggered-via-api at least give you some flexibility to create your own automation rules for this? I'm thinking it would be handy for known flaky tests and such.
Grandi Andrea
Nathan Fish hi, honestly not much.
To know that a CI job failed, I would have to build a dedicated service which either polls your API (something neither of us wants) or listens for a webhook you send and acts on it (by parsing the response, getting the ID of the job which failed, calling an API, etc.).
It would be much easier for you to know which jobs failed and just retrigger them (if the user has allowed such an option), without asking users to implement their own service.
Lud
Grandi Andrea 100% agree. I don't want to provision a server, maintain custom code, figure out access in a secure way, etc.
Daniel Janicek
Nathan Fish I was looking for this feature today too. I think the ask (at least from my perspective) is more of a convenience/ease-of-use thing. Auto-retries are totally possible right now, either through on_fail or, as we do, with a retry.sh script that retries individual commands. It would make things easier though if I didn't have to sprinkle retry.sh across every command that might fail with a transient network issue.
It would be neat to have things auto-retry whenever a job fails with an HTTP 503 (or related) response anywhere in the workflow. IDK how realistic that is if CircleCI only sees the script exit code, but it would be cool!
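For readers unfamiliar with the per-command wrapper approach, a minimal sketch of what such a script might do is below. It is not the retry.sh mentioned above, whose contents are not shown in this thread; the function name, parameters, and defaults are assumptions. Like a CI runner, it only sees the command's exit code.

```python
import subprocess
import sys
import time
from typing import Callable, Sequence


def run_with_retry(
    command: Sequence[str],
    max_attempts: int = 3,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run a command, retrying on a non-zero exit code.

    Returns 0 on the first success, otherwise the exit code of the
    final failed attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    returncode = 1
    for attempt in range(1, max_attempts + 1):
        returncode = subprocess.run(command).returncode
        if returncode == 0:
            return 0
        print(f"attempt {attempt}/{max_attempts} exited with {returncode}",
              file=sys.stderr)
        if attempt < max_attempts:
            # Wait only between attempts, not after the final failure.
            sleep(delay_seconds)
    return returncode
```

The drawback described in the comment still applies: every flaky command has to be wrapped individually, which a built-in job-level retry would avoid.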
J. Casalino
+1, we also have occasional transient failures which cause the workflow to fail when a simple retry would resolve it. There's no reason to involve a human when a script can do the same thing. I would envision two settings for this: 1. Max number of retries, and 2. The delay between retries. If the delay is not specified, a progressive backoff should occur between retries (e.g. wait 1 minute after first failure, 2 minutes after second, 4 minutes after third, etc).
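The two settings and the default progressive backoff described above can be sketched as a small function that computes the wait before each retry. The function and parameter names are hypothetical; only the 1, 2, 4 minute doubling pattern comes from the comment.

```python
from typing import List, Optional


def retry_delays(
    max_retries: int,
    delay_minutes: Optional[float] = None,
    base_minutes: float = 1.0,
) -> List[float]:
    """Return the wait (in minutes) before each retry.

    If delay_minutes is given, every retry waits that long. Otherwise the
    wait doubles after each failure, starting at base_minutes.
    """
    if max_retries < 0:
        raise ValueError("max_retries must not be negative")
    if delay_minutes is not None:
        return [delay_minutes] * max_retries
    return [base_minutes * 2 ** attempt for attempt in range(max_retries)]
```

For example, `retry_delays(3)` yields waits of 1, 2, and 4 minutes, while `retry_delays(2, delay_minutes=5)` yields two fixed 5-minute waits.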
Craig Hawkes
Agreed, being able to automate the rerun of a single job if it fails would be useful.
It's possible to do this via the UI, and it makes so much more sense to have it as an auto option.
One note - it would need to wait for all other jobs to complete, and then rerun all failed jobs
So basically, a way to automate the option provided in the UI when there are any failed jobs (maybe optionally based on which jobs failed?).
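The "wait for everything to finish, then rerun only the failures" rule could be sketched as follows. The status strings `"running"` and `"failed"` and the function name are assumptions for illustration, not CircleCI's actual job states or API.

```python
from typing import Dict, List


def jobs_to_rerun(statuses: Dict[str, str]) -> List[str]:
    """Pick the jobs to rerun once the whole workflow has settled.

    statuses maps a job name to a hypothetical status string.
    Returns an empty list while any job is still running.
    """
    if any(status == "running" for status in statuses.values()):
        return []
    return sorted(name for name, status in statuses.items() if status == "failed")
```

Filtering the returned list against an allow-list of job names would cover the optional "based on which jobs failed" variant.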
Andrei R
Allowing automatic retries of failed jobs and workflows is good for CircleCI business. More retries = more compute time spent = more $$$
Amalia Nostalgia
€5,000 approved; please, I need money. Thank you. Regards, Amalia
Conner Babb
+1, this would be a great feature
Lester DeKay
Please!
romw314
+1; I also want this feature.
Steve Mena
Ironically, we need this to retry transient failures in orb publishing due to a "Something unexpected happened" error.