[BlackboxBenchmarking] Add daily cron job to aggregate fuzzer stats #5265
Open
dylanjew wants to merge 10 commits into dylanj/builtin-index from
Adds a cron job to aggregate fuzzer stats into a daily BigQuery table, `fuzzer_stats.daily_stats`.

Context
We will use this to benchmark our blackbox fuzzers. Previously we couldn't easily join the fuzzing hours from BigQuery with the bugs filed by ClusterFuzz in our dashboards. We need a separate aggregated table because the `fuzzer_stats` `JobRun` tables all live in separate datasets per fuzzer, and we can't simply query across all of those datasets in BigQuery or Plx.

The cron job defaults to yesterday's stats so we can run it after the stats are loaded into BigQuery, but it takes a date flag so we can backfill days as necessary.
Idempotency
Whenever a date is inserted, the job uses `WRITE_TRUNCATE` with a date partition to overwrite all of the rows for that date. So if the job runs multiple times for the same day, it will not add duplicate rows but will overwrite any previous rows for that date.

This simplifies edge cases where the job fails or runs multiple times: we just need the last run of the job to succeed and the data will be correct. Each run pulls in the latest data from the fuzzers' `JobRun` tables.
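The idempotency mechanism above relies on targeting a single date partition. A rough sketch of how that target is formed, assuming standard BigQuery partition-decorator syntax (`table$YYYYMMDD`); the helper name is hypothetical:

```python
from datetime import date

# Table name from the PR description.
DAILY_STATS_TABLE = "fuzzer_stats.daily_stats"


def partition_decorator(table: str, day: date) -> str:
    """Build a partition-decorated table name, e.g. daily_stats$20240101.

    Writing to a single partition with WRITE_TRUNCATE replaces only that
    partition's rows, which is what makes reruns idempotent.
    """
    return f"{table}${day.strftime('%Y%m%d')}"


# With the google-cloud-bigquery client this would look roughly like
# (untested sketch, not the PR's actual code):
#
#   job_config = bigquery.LoadJobConfig(
#       write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE)
#   client.load_table_from_json(
#       rows,
#       partition_decorator(DAILY_STATS_TABLE, day),
#       job_config=job_config)
```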
Example query:
The remaining work here is to set up the cron job configuration. This PR only adds the logic for the job. crbug.com/501066151
Related PRs:
These migrate the BigQuery and Datastore schemas to support the new fields:
#5264
#5263
Testing
Ran this against the dev data and verified that the fuzzer stats BigQuery table is populated.
Logs from dev: https://paste.googleplex.com/4884361662038016
After the job inserted the aggregated rows into BigQuery, I was able to compare the aggregated testcase stats and fuzzing hours between fuzzers for a given date range.