Rerun job from the job detail page | Feature Requests

complete

sinsvend@gmail.com

Right now you can "Rerun job with SSH" from the job detail URL.
But to rerun the failed job without SSH, I need to go to the workflow tag and then find "rerun failed jobs".
I would like a small improvement in the UI: maybe the button could be a dropdown, so that I could select a rerun without debug.
(The reason I do not want debug is that it takes one of the agents hostage, so we can't run as much in parallel as we want to.)
CCI-I-446

June 4, 2018

Liya Ai

This feature has shipped! Please feel free to comment if there are any additional questions.

Rafael Ruiz

A little weird but we have found the Rerun buttons are only available to folks with write access to the repo. Is there a way to configure this such that anyone who can open a PR which triggers the job can rerun their own jobs?

andrew.rosca@dominodatalab.com

+1 for this.

Nico Ritsche

Would really like to see the possibility to rerun individual pipelines in a parallel workflow step. Coming from Codeship, this is something I miss in CircleCI.

Edward Anderson

Ok, no problem if individual jobs can’t be restarted arbitrarily. To be clear, all we want is to not have to navigate to the workflow page to take the “Rerun from failed” action. It’s just a UX problem. The failed job knows its associated workflow (when there is one), and you should be able to determine whether the “Rerun from failed” action is available. When that action is available on the job’s workflow, add that action to the Rebuild dropdown on the failed job page, and we’re golden! This should be doable, right?

Nathan Dintenfass

"Rerun from failed" continues where the previous workflow left off, restoring its state and then continuing on. In that case, all failed jobs rerun, and the graph of the original workflow stays intact and behaves as expected. For a completed, successful workflow there's no "place where it left off" to restart from, so the integrity of the graph is lost. In addition, we don't keep the state of a successful, completed workflow around for as long. More importantly, it would not be clear what exactly should happen to downstream jobs when restarting an arbitrary job in a workflow graph. We are exploring more options than what we currently have (which is to rerun the entire workflow) that would keep the simple case simple (a single job in a single workflow, for instance) without creating non-deterministic or unexpected behaviors.
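To make the distinction above concrete, here is a minimal Python sketch. It is an illustration only, not CircleCI's actual implementation: the job names, the `REQUIRES` graph, and the status strings are all hypothetical. It shows why rerunning from failed has an obvious answer (everything that has not succeeded yet), while restarting an arbitrary job raises the question of what to do with everything downstream of it.

```python
from typing import Dict, List, Set

# Hypothetical workflow graph: each job maps to the jobs it requires.
REQUIRES: Dict[str, List[str]] = {
    "build": [],
    "test": ["build"],
    "lint": ["build"],
    "deploy": ["test", "lint"],
}


def jobs_to_rerun_from_failed(
    requires: Dict[str, List[str]], status: Dict[str, str]
) -> Set[str]:
    """Jobs that have not succeeded: the failed ones plus those that never ran."""
    return {job for job in requires if status.get(job) != "success"}


def downstream_of(requires: Dict[str, List[str]], job: str) -> Set[str]:
    """All jobs that transitively require `job`."""
    result: Set[str] = set()
    frontier = [job]
    while frontier:
        current = frontier.pop()
        for candidate, deps in requires.items():
            if current in deps and candidate not in result:
                result.add(candidate)
                frontier.append(candidate)
    return result


# Hypothetical state of a workflow in which "test" failed.
status = {"build": "success", "test": "failed", "lint": "success", "deploy": "not_run"}

print(sorted(jobs_to_rerun_from_failed(REQUIRES, status)))  # ['deploy', 'test']
print(sorted(downstream_of(REQUIRES, "build")))  # ['deploy', 'lint', 'test']
```

In the failed case the set of jobs to run is unambiguous. Restarting "build" in an already successful workflow would instead touch three downstream jobs that have all completed, and the sketch has no principled way to say whether they should run again.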

Jonas T

Why don't you have the same problems right now with "Rerun failed jobs"? As far as I can tell, if I have three jobs running concurrently, it makes no difference whether one of them is already running for the second time because it failed once. Is there a difference? Does it have access to the workspace of its previous failed execution? Or to workspaces of other concurrent jobs that are still running? 🧐

Nathan Dintenfass

@Jonas - it's not that there is intrinsic interdependency as much as that there could be, depending on how you configure your build. For instance, if you're using workspaces (which share data across jobs) or approval jobs (which change the effective actor on downstream jobs), we run into either stale state or ambiguous permissions in some cases. That said, we definitely hear that people would like better rerun control over individual jobs, so we're looking into the feasibility of more nuanced rerun machinery and/or a way to better control the long-term state of a workflow (or possibly making rerun capabilities on individual jobs something that times out, so we're not holding the workflow state indefinitely). TL;DR: jobs execute independently unless you use techniques that make them dependent, and our rerun machinery is not (yet) sensitive to that distinction, so we have fairly coarse capabilities on that front. We're also looking at letting you string workflows together, which would allow breaking out logical units more effectively into separate workflows, making the "cost" of a workflow-level rerun lower in many cases. I don't want to overpromise because none of that is prioritized on a backlog yet, but we are keenly aware that some folks would like to have more granular control over execution, and we're tackling that broader issue on a variety of fronts.
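As a rough illustration of the distinction described above, the following Python sketch flags when a single-job rerun might hit stale state or ambiguous permissions. Everything here is an assumption made for the example: the `Job` fields, the job names, and the concern strings are hypothetical and do not reflect how CircleCI models workflows internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Job:
    """Hypothetical model of a workflow job, for illustration only."""
    requires: List[str] = field(default_factory=list)
    attaches_workspace: bool = False   # reads data shared by earlier jobs
    persists_workspace: bool = False   # writes data for later jobs
    is_approval: bool = False          # manual approval step


def upstream_of(jobs: Dict[str, Job], name: str) -> Set[str]:
    """All jobs that `name` transitively requires."""
    seen: Set[str] = set()
    stack = list(jobs[name].requires)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(jobs[current].requires)
    return seen


def rerun_concerns(jobs: Dict[str, Job], name: str) -> List[str]:
    """Reasons why rerunning only `name` could behave unexpectedly."""
    concerns: List[str] = []
    job = jobs[name]
    if job.attaches_workspace or job.persists_workspace:
        concerns.append("stale workspace state")
    if any(jobs[u].is_approval for u in upstream_of(jobs, name)):
        concerns.append("ambiguous permissions")
    return concerns


# Hypothetical workflow: build -> hold (approval) -> deploy, plus a standalone lint job.
jobs = {
    "build": Job(persists_workspace=True),
    "hold": Job(requires=["build"], is_approval=True),
    "deploy": Job(requires=["hold"], attaches_workspace=True),
    "lint": Job(),
}

print(rerun_concerns(jobs, "lint"))    # []
print(rerun_concerns(jobs, "deploy"))  # ['stale workspace state', 'ambiguous permissions']
```

A job like "lint" that shares nothing with the rest of the workflow is the simple case. Rerun machinery that cannot tell it apart from "deploy" has to treat every job as potentially dependent, which is the coarse behavior described above.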

Jonas T

Aren't concurrently running jobs completely independent? I'm still wondering why I can't just rerun a single container out of a several-container job, if only that one container failed. There must be more interdependency than I'm realizing, so could you elaborate a little more on what those interdependencies are?

Nathan Dintenfass

Edward - if you click through to the workflow you can run from the failed jobs again. The reason we couldn't put that link directly on the job is that it could create a race condition with the rest of the running workflow. Re-running a single failed job in a multi-job workflow is something we'll be looking at, but it gets tricky in terms of how you'd actually want things to play out given the relationships among jobs in that situation.
