Commit 7458b52

barbados-clemens, nx-cloud[bot], and graphite-app[bot] authored
docs(nx-cloud): document new flaky tasks analytics (#33253)
Co-authored-by: nx-cloud[bot] <71083854+nx-cloud[bot]@users.noreply.github.com> Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>
1 parent 4170c2f commit 7458b52

File tree

9 files changed

+65
-5
lines changed

Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.

astro-docs/src/content/docs/features/CI Features/flaky-tasks.mdoc

Lines changed: 53 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -39,12 +39,60 @@ In this image, the `e2e-ci--src/e2e/app.cy.ts` task is a flaky task that has bee
3939

4040
When a flaky task fails in CI with [distributed task execution](/docs/features/ci-features/distribute-task-execution) enabled, Nx will **automatically send that task to a different agent** and run it again (up to 2 tries in total). It's important to run the task on a different agent to ensure that neither the agent itself nor the other tasks that ran on that agent are the reason for the flakiness.
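The retry behavior described above can be pictured with a small sketch. This is not Nx Cloud's scheduler; `run_flaky_task`, the `run_on` callback, and the agent names are hypothetical, and the only facts taken from the text are that each retry goes to a different agent and that there are at most 2 tries in total.

```python
def run_flaky_task(agents, run_on, max_tries=2):
    """Run a task on distinct agents until it passes or tries run out.

    agents: ordered list of agent names (hypothetical).
    run_on: callback returning True when the task passes on that agent.
    Returns (passed, agents_tried).
    """
    tried = []
    for agent in agents:
        if len(tried) == max_tries:
            break  # the text caps the total number of tries at 2
        tried.append(agent)  # each attempt uses an agent not tried before
        if run_on(agent):
            return True, tried
    return False, tried
```

Because every attempt uses a fresh agent, a pass on the second try suggests the first failure came from the agent or its co-located tasks rather than from the task's inputs.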
4141

42-
## Manually Mark a Task as Flaky or Not Flaky
42+
## Flaky Task Analytics
4343

44-
If you suspect that a task is flaky, but Nx has not confirmed it yet, you can manually **mark it as likely flaky** from the run details screen. Failed tasks that are not flaky will have a button that says **"Mark task as likely flaky"**.
44+
{% aside type="note" title="Enterprise Feature" %}
45+
Workspace flaky task analytics is currently available for organizations on the Enterprise plan. Reach out if your organization is [interested in Nx Enterprise](https://nx.dev/enterprise?utm_source=nx.dev&utm_medium=callout&utm_campaign=flaky-task-analytics).
46+
{% /aside %}
4547

46-
![Mark task as likely flaky button](../../../../assets/features/ci-features/mark-task-as-likely-flaky.png)
48+
Nx Cloud provides analytics to help you understand and manage flaky tasks across your workspace. The analytics dashboard gives you insights into which tasks are flaky, how often they fail, and how much time is being wasted on reruns.
4749

48-
Once you've resolved the issue that caused a task to be flaky, you can immediately mark the task as not flaky by clicking on **"Mark task as no longer flaky"** on the same run details screen.
50+
![Flaky Tasks dashboard](../../../../assets/features/ci-features/nx-cloud-flaky-tasks-metrics-chart.avif)
4951

50-
![Mark task as no longer flaky button](../../../../assets/features/ci-features/mark-task-as-no-longer-flaky.png)
52+
The dashboard displays key metrics over the selected time range (7 days or 30 days) to give you a quick overview of your workspace health.
53+
54+
- **Active flaky tasks** - The total number of tasks in your workspace that have a flake rate greater than 0 within the selected time window.
55+
- **Average flake rate** - A weighted average flake rate across all tasks in your workspace. This metric weights each task's flake rate by its sample size, so a task that ran 1,000 times with a 5% flake rate has more impact than one that ran 10 times with a 50% flake rate.
56+
- **High risk tasks** - The number of tasks with a flake rate higher than 20%, indicating severe reliability issues that need immediate attention.
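As a rough illustration of the three metrics above, the following sketch computes them from per-task statistics. The `TaskStats` type and `summarize` function are hypothetical names, and the sketch assumes that every task in the window (including tasks with a flake rate of 0) contributes to the weighted average; the actual Nx Cloud computation may differ.

```python
from dataclasses import dataclass


@dataclass
class TaskStats:
    name: str
    flake_rate: float  # fraction between 0 and 1
    sample_size: int   # number of runs in the selected time window


def summarize(tasks, high_risk_threshold=0.20):
    """Compute the dashboard-style summary metrics for a list of tasks."""
    total_samples = sum(t.sample_size for t in tasks)
    # Weight each flake rate by its sample size (assumption: all tasks count).
    weighted_sum = sum(t.flake_rate * t.sample_size for t in tasks)
    return {
        "active_flaky_tasks": sum(1 for t in tasks if t.flake_rate > 0),
        "average_flake_rate": weighted_sum / total_samples if total_samples else 0.0,
        "high_risk_tasks": sum(1 for t in tasks if t.flake_rate > high_risk_threshold),
    }
```

With only the two tasks from the example in the list (1,000 runs at 5% and 10 runs at 50%), the weighted average is (50 + 5) / 1010, or about 5.4%, much closer to the frequently run task than the unweighted mean of 27.5%.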
57+
58+
The chart shown provides a visual representation of your flaky tasks, helping you quickly identify which tasks need the most attention.
59+
60+
Tasks are plotted based on their **impact score**, which is calculated as `flake_rate × sample_size`. This means frequently-run flaky tasks are weighted higher than rarely-run flaky tasks.
61+
62+
Priority levels are determined using percentile-based thresholds that scale across organizations of any size:
63+
64+
- **High priority** (red) - Top 10% of tasks by impact score (90th percentile and above). These tasks have severe flakiness and should be addressed immediately.
65+
- **Medium priority** (yellow) - Next 23% of tasks by impact score (67th-90th percentile). These tasks have moderate flakiness with sufficient data.
66+
- **Low priority** (gray) - Bottom 67% of tasks by impact score. These tasks have minor flakiness or not enough data to be concerning.
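The impact score and the percentile thresholds above can be sketched as follows. The function names are hypothetical, and the percentile-rank method (the share of tasks with a strictly lower score) is an assumption chosen for illustration; the text specifies only the formula `flake_rate × sample_size` and the 67th and 90th percentile cut-offs.

```python
from bisect import bisect_left


def impact_score(flake_rate, sample_size):
    """Impact score as defined in the text: flake_rate x sample_size."""
    return flake_rate * sample_size


def assign_priorities(scores):
    """Map each task name to 'high', 'medium', or 'low'.

    scores: dict of task name -> impact score.
    Percentile rank here is the share of tasks with a strictly lower
    score (an assumed definition, not confirmed by the source).
    """
    ordered = sorted(scores.values())
    n = len(ordered)
    priorities = {}
    for name, score in scores.items():
        below = bisect_left(ordered, score)  # tasks scoring strictly lower
        # Integer comparisons avoid floating-point rounding at the cut-offs.
        if below * 100 >= 90 * n:
            priorities[name] = "high"
        elif below * 100 >= 67 * n:
            priorities[name] = "medium"
        else:
            priorities[name] = "low"
    return priorities
```

Because the thresholds are relative ranks rather than absolute scores, the same split applies whether a workspace has ten flaky tasks or ten thousand.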
67+
68+
Tasks on the right side of the chart typically represent the highest priority items that need attention. The scatter plot shows up to 50 results, sorted by the most recently flaked tasks.
69+
70+
### Flaky Task Table
71+
72+
The table provides detailed information about each flaky task in your workspace:
73+
74+
![Flaky Tasks Analytics Table](../../../../assets/features/ci-features/nx-cloud-flaky-tasks-table.avif)
75+
76+
By default, the table loads your most recent flaky tasks. Each row includes:
77+
78+
- **Task** - The project and target combination (e.g., `my-app:test`)
79+
- **Flake rate** - Measures how often a task succeeds due to flakiness. Specifically, it represents the percentage of total successes that came from unreliable (flaky) task hashes: `flaky_successes / (flaky_successes + non_flaky_successes)`. This tells you: "Of all the times this task succeeded, how many successes came from unreliable code?"
80+
- **Total reruns** - The number of times a task was executed more than once due to flakiness. This counts the "extra" executions that happened because the task failed and needed to be retried. Calculated as: `total_executions - unique_hash_count`
81+
- **Time wasted** - An estimate of the total time spent on reruns, calculated by multiplying the total reruns by the average task duration
82+
- **Last failure** - The timestamp of the most recent failure across all contributing task hashes
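The three calculated columns above follow directly from the formulas given. The sketch below restates them as functions; the function and parameter names are hypothetical, and returning 0 for a task with no successes is an assumption, since the text does not cover that case.

```python
def flake_rate(flaky_successes, non_flaky_successes):
    """Share of all successes that came from flaky task hashes."""
    total = flaky_successes + non_flaky_successes
    # Assumption: a task with no successes is reported as 0 rather than undefined.
    return flaky_successes / total if total else 0.0


def total_reruns(total_executions, unique_hash_count):
    """Extra executions beyond one run per unique task hash."""
    return total_executions - unique_hash_count


def time_wasted_seconds(reruns, average_duration_seconds):
    """Estimated time spent on reruns: reruns x average task duration."""
    return reruns * average_duration_seconds
```

For example, a task with 120 executions across 100 unique hashes has 20 reruns; at an average duration of 90 seconds, that is an estimated 1,800 seconds (30 minutes) wasted.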
83+
84+
#### Flaky Task Detail View
85+
86+
Click on any row in the table to view detailed information about a specific flaky task.
87+
88+
The **Overview** tab shows summary statistics and trends for the selected task, such as flake rate, time wasted, and automatic deflake counts.
89+
90+
![Flaky Task Detail Overview](../../../../assets/features/ci-features/nx-cloud-flaky-tasks-details.avif)
91+
92+
The **Activity** tab displays a timeline of all executions, showing when the task failed and succeeded so you can jump directly into the corresponding runs.
93+
94+
![Flaky Task Detail Activity](../../../../assets/features/ci-features/nx-cloud-flaky-tasks-detail-activity.avif)
95+
96+
The **Environments** tab provides insights into the different environments where the task was executed, helping identify if certain environments contribute to flakiness.
97+
98+
![Flaky Task Detail Environments](../../../../assets/features/ci-features/nx-cloud-flaky-tasks-detail-environment.avif)

astro-docs/src/content/docs/reference/glossary.mdoc

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -92,6 +92,12 @@ A script that performs some action on your code. This can include building, lint
9292

9393
> See: [Executors and Configurations](/docs/concepts/executors-and-configurations)
9494

95+
### Flake Rate
96+
97+
A metric that measures how often a task succeeds due to flakiness rather than reliable code. It represents the percentage of total successes that came from unreliable (flaky) task hashes: `flaky_successes / (flaky_successes + non_flaky_successes)`. This answers the question: "Of all the times this task succeeded, how many successes came from unreliable code?"
98+
99+
> See: [Flaky Task Analytics](/docs/features/ci-features/flaky-tasks#flaky-task-analytics)
100+
95101
### Flaky Tasks
96102

97103
Tasks that will sometimes succeed and sometimes fail without any change to the inputs. These tasks are often e2e tests and are particularly problematic in CI. Nx Cloud automatically detects flaky tasks and re-runs them.
@@ -110,6 +116,12 @@ A computer science concept that consists of nodes connected by edges. In the Nx
110116

111117
> See: [Explore the Graph](/docs/features/explore-graph)
112118

119+
### Impact Score
120+
121+
A metric used in [Flaky Task Analytics](#flaky-tasks) to prioritize which flaky tasks need attention. Calculated as `flake_rate × sample_size`, it weights frequently-run flaky tasks higher than rarely-run flaky tasks. For example, a task with a 50% flake rate and 100 runs has an impact score of 50, the same as a task with a 10% flake rate and 500 runs.
122+
123+
> See: [Flaky Task Analytics](/docs/features/ci-features/flaky-tasks#flaky-task-analytics)
124+
113125
### Launch Template
114126

115127
Launch Templates are used to set up an agent machine. They specify a resource class, an image and a series of set up steps before tasks are executed on that machine.
