Fix performance regression in the block-level custom CSS feature by mukeshpanchal27 · Pull Request #11686 · WordPress/wordpress-develop

mukeshpanchal27 · 2026-04-30T10:34:29Z

Performance regression for #10777

Checking $block['attrs']['className'] first ensures that, for the 90% of blocks that don't have custom CSS, the function returns in microseconds without ever triggering the preg_match.

Use of AI Tools

N/A

This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

Add early return for empty custom CSS class string

mukeshpanchal27 · 2026-04-30T10:38:31Z

Based on @westonruter’s findings (https://gist.github.com/westonruter/5ae4155059d2135197976f324d00645c#file-spx-claude-analysis-md), it looks like this runs for every block.

Since most blocks don’t have a custom class, we should return early so we don’t end up calling preg_match() on an empty string.
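A minimal sketch of the early-return pattern described above. The function name `example_render_custom_css_class_name` and the class-name pattern are hypothetical stand-ins chosen for illustration, not the code in `custom-css.php`; only the shape of the guard is taken from this PR.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a render_block filter callback.
 * The regular expression below is an assumption made for illustration.
 */
function example_render_custom_css_class_name( string $block_content, array $block ): string {
	$class_name = $block['attrs']['className'] ?? '';

	// Cheap guard: blocks without a className never reach the regex.
	if ( '' === $class_name ) {
		return $block_content;
	}

	// Only blocks that carry a className pay for preg_match().
	if ( ! preg_match( '/\bwp-custom-css-[a-z0-9]+\b/', $class_name, $matches ) ) {
		return $block_content;
	}

	// Illustrative only: tag the rendered markup with the matched class.
	return '<!-- ' . $matches[0] . ' -->' . $block_content;
}
```

A block whose className is something like `is-style-default` still runs the regex in this sketch; only a missing or empty className takes the fast path.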

github-actions · 2026-04-30T10:38:49Z

Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

All changes will be lost when closing a tab with a Playground instance.
All changes will be lost when refreshing the page.
A fresh instance is created each time the link below is clicked.
Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance, it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

github-actions · 2026-04-30T16:22:55Z

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props mukesh27, westonruter.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

westonruter · 2026-04-30T19:54:36Z

I asked Claude Opus 4.7 to review this change with the SPX profile reports loaded as context:

Looking at the PR's actual change vs. my profile data — it's a real micro-optimization but the gain is tiny in absolute terms. Here's the math:

What the PR does

Adds an early return when $block['attrs']['className'] === '', skipping the preg_match for blocks with no className at all. The PR description says "90% of blocks don't have custom CSS" but note it's actually checking for no className — blocks with is-style-default, alignment classes, etc., still run preg_match. So the early-return likely fires on a smaller fraction than 90%, but still a majority.

What the profile said about this function

From my three runs:

| Page | Calls | Exclusive total | Per-call avg |
| --- | --- | --- | --- |
| Hello World (124 blocks) | 124 | 32.9µs | 0.27µs |
| Homepage v1 (934 blocks) | 934 | 202.8µs | 0.22µs |
| Homepage v2 (1,590 blocks) | 1,590 | 202.9µs | 0.13µs |

The preg_match is the dominant cost inside the function — probably ~60% of the per-call time. Saving it for, say, 70% of calls cuts roughly half the total.
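The estimate above can be reproduced as plain arithmetic. This is a sketch only: the call counts and exclusive totals are copied from the table, while the 60% share and 70% skip rate are the guesses stated in the comment, not measurements. The helper name `estimate_savings_us` is made up for this example.

```php
<?php
declare(strict_types=1);

// Call counts and exclusive totals (µs) copied from the profile table.
$runs = array(
	'Hello World' => array( 'calls' => 124, 'total_us' => 32.9 ),
	'Homepage v1' => array( 'calls' => 934, 'total_us' => 202.8 ),
	'Homepage v2' => array( 'calls' => 1590, 'total_us' => 202.9 ),
);

// Assumptions quoted in the comment, not measured values.
$preg_match_share = 0.60; // Share of per-call time spent in preg_match().
$skipped_fraction = 0.70; // Share of calls that take the early return.

// Estimated saving = exclusive total x preg_match share x skipped fraction.
function estimate_savings_us( float $total_us, float $share, float $skipped ): float {
	return $total_us * $share * $skipped;
}

foreach ( $runs as $page => $run ) {
	$per_call = $run['total_us'] / $run['calls'];
	$saved    = estimate_savings_us( $run['total_us'], $preg_match_share, $skipped_fraction );
	printf( "%s: %.2f µs/call, ~%.0f µs saved\n", $page, $per_call, $saved );
}
```

Under these assumptions the combined factor is 0.6 × 0.7 = 0.42, so the 1,590-block page comes out near 85µs, the same order of magnitude as the per-request figures quoted for the homepage.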

Realistic per-request savings

| Page | Total regression | Savings from this PR | % of total |
| --- | --- | --- | --- |
| Hello World | 8,000µs | ~10–20µs | 0.1–0.3% |
| Homepage 1,590 blocks | 23,000µs | ~100–150µs | ~0.5% |

So on the order of 100µs per page request in the homepage case, more like 10–20µs on a small page. Well under 1% of the wp70 regression, and probably less than that in production (profiler overhead inflates per-call costs).

Verdict

It's a legitimate, zero-risk fix — six added lines, no behavior change since preg_match on '' would have returned no match and taken the same code path. But on its own it does not materially close the wp7.0-vs-6.9 gap.

The bigger lesson: the regression in this code path lives one level up. wp_render_custom_css_class_name is registered as a render_block filter — so it fires on every single block, full stop. Even with the PR, you're paying:

one filter dispatch per block (~150ns × 1,590 = 240µs)
one extra hashtable lookup ($block['attrs']['className']) per block
the PHP function call overhead

The structural fix would be to not register the filter at all unless there's at least one block with custom CSS support enabled (which is rare — it's an opt-in feature). A check at init to count custom-CSS-supporting block types and conditionally register the filter would eliminate ~99% of these calls entirely. That'd save the full ~200µs on the homepage rather than ~100µs.

Same pattern applies to wp_render_custom_css_support_styles, block_core_gallery_render_context, and the anchor support filters. Each PR like 11686 is a small win; the bigger wins are at the registration layer.

Worth landing this PR as-is, but I'd encourage the contributor (or you) to also propose conditional filter registration as the larger follow-up.
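The conditional-registration idea could be sketched as a predicate over the registered block types. Everything here is hypothetical: block types are reduced to plain `supports` arrays, and `example_any_block_supports_custom_css` is a made-up helper, not a WordPress function.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns true when at least one block type supports customCSS.
 *
 * @param array $block_types   Map of block name => supports array (simplified stand-in).
 * @param bool  $default_value Value assumed when a block type does not declare customCSS.
 */
function example_any_block_supports_custom_css( array $block_types, bool $default_value ): bool {
	foreach ( $block_types as $supports ) {
		if ( (bool) ( $supports['customCSS'] ?? $default_value ) ) {
			return true;
		}
	}
	return false;
}

// Made-up block types for illustration.
$block_types = array(
	'example/paragraph' => array(),                       // Does not declare customCSS.
	'example/html'      => array( 'customCSS' => false ), // Explicitly opts out.
);

var_dump( example_any_block_supports_custom_css( $block_types, false ) );
var_dump( example_any_block_supports_custom_css( $block_types, true ) );
```

In WordPress the boolean would presumably gate the `add_filter()` registration. Note that the `block_has_support( $block_type, 'customCSS', true )` call quoted later in this thread passes `true` as the fallback, so with that default the check succeeds as soon as any block type omits the key; how much conditional registration saves depends on how support is actually declared.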

Add check for empty custom CSS before block support validation.

mukeshpanchal27 · 2026-05-01T06:13:18Z

@westonruter Thanks for the report 😄

While we’re figuring out the best way to register the filter, I updated the PR with a change that avoids calling the Block Registry: a331593

westonruter · 2026-05-05T17:31:36Z

While we’re figuring out the best way to register the filter, I updated the PR with a change that avoids calling the Block Registry: a331593

Here's Claude's analysis with this addition:

The new revision adds a second, more impactful optimization. Now there are two changes:

Change 1 (unchanged): `wp_render_custom_css_class_name`

Same early-return on empty className we already analyzed. ~100µs/page on the homepage.

Change 2 (new): `wp_render_custom_css_support_styles` — much bigger win

The reorder is actually the meaningful one. Original ordering was:

$block_type = WP_Block_Type_Registry::get_instance()->get_registered(...); // hash lookup
if ( ! block_has_support( $block_type, 'customCSS', true ) ) return ...;   // walks supports tree
$custom_css = trim( $parsed_block['attrs']['style']['css'] ?? '' );
if ( empty( $custom_css ) ) return ...;

New ordering:

$custom_css = $parsed_block['attrs']['style']['css'] ?? '';                // O(1) array lookup
if ( empty( trim( $custom_css ) ) ) return ...;
$block_type = WP_Block_Type_Registry::get_instance()->get_registered(...); // only if needed
if ( ! block_has_support( $block_type, 'customCSS', true ) ) return ...;

This is exactly the structural improvement I called out in my last response on the previous revision — moving the cheapest check first. From my round 2 profile (1,590 blocks):

Old path per block: array lookup + registry singleton call + get_registered() hashtable lookup + block_has_support() walking the supports tree = ~0.5µs each, ~750µs total across all blocks
New path per block (no CSS, the ~99% case): single array lookup + trim of '' + empty() = ~0.05µs each
Savings: ~675–800µs per page request

That's roughly 5–8× the savings of the className change alone. The two changes together save ~800µs–1ms per homepage request — still under 1% of the 123ms total regression, but a meaningful "death by a thousand cuts" reduction in the per-block tax.

Small nit on the new code

if ( empty( trim( $custom_css ) ) ) return ...;

trim() runs on every call, including when $custom_css === '' (hot path). A micro-tweak:

if ( '' === $custom_css || '' === trim( $custom_css ) ) return ...;

…short-circuits the trim for the dominant empty case. Almost certainly noise-level (~10ns × 1,500 calls ≈ 15µs), so not worth blocking on. Mention it as an inline comment if anything.
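A small sketch comparing the two predicates on a few sample strings. The function names are invented for this example. One thing it surfaces: PHP's `empty()` treats the string `'0'` as empty, so the two forms disagree on that single input, which is unlikely to matter for real custom CSS but is worth knowing before swapping one for the other.

```php
<?php
declare(strict_types=1);

// Predicate as written in the revision under review.
function is_blank_via_empty( string $css ): bool {
	return empty( trim( $css ) );
}

// Suggested variant: skip trim() when the string is already empty.
function is_blank_short_circuit( string $css ): bool {
	return '' === $css || '' === trim( $css );
}

$samples = array( '', '   ', "\n\t", 'color: red;', '0' );

foreach ( $samples as $css ) {
	printf(
		"%-16s empty(trim)=%s short-circuit=%s\n",
		var_export( $css, true ),
		var_export( is_blank_via_empty( $css ), true ),
		var_export( is_blank_short_circuit( $css ), true )
	);
}
```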

Behavior change check

The reorder changes the order of two early-return paths but produces the same outcome for every block:

Block with no style.css AND no support → both versions early-return (just by different paths)
Block with no style.css AND has support → both versions early-return
Block with style.css AND no support → both versions early-return
Block with style.css AND has support → both versions proceed

So no semantic change. Pure win.
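The four cases can be checked mechanically. In this sketch `Example_Support_Lookup` is a hypothetical stand-in for the registry and `block_has_support()` calls that simply counts how often it is invoked, and the block arrays carry a made-up `supports_custom_css` flag in place of a real block type.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the registry + support lookups; counts its own calls.
final class Example_Support_Lookup {
	public static int $calls = 0;

	public static function has_custom_css_support( array $block ): bool {
		++self::$calls;
		return $block['supports_custom_css'];
	}
}

// Original ordering: support lookup first, then the attribute check.
function should_render_old( array $block ): bool {
	if ( ! Example_Support_Lookup::has_custom_css_support( $block ) ) {
		return false;
	}
	return ! empty( trim( $block['attrs']['style']['css'] ?? '' ) );
}

// New ordering: cheap attribute check first, lookup only when CSS is present.
function should_render_new( array $block ): bool {
	if ( empty( trim( $block['attrs']['style']['css'] ?? '' ) ) ) {
		return false;
	}
	return Example_Support_Lookup::has_custom_css_support( $block );
}

// Builds a simplified parsed block for one of the four cases.
function example_block( string $css, bool $supported ): array {
	return array(
		'supports_custom_css' => $supported,
		'attrs'               => '' === $css ? array() : array( 'style' => array( 'css' => $css ) ),
	);
}

$cases = array(
	example_block( '', false ),
	example_block( '', true ),
	example_block( 'color: red;', false ),
	example_block( 'color: red;', true ),
);

foreach ( $cases as $block ) {
	// Both orderings should agree for every combination.
	var_dump( should_render_old( $block ) === should_render_new( $block ) );
}
```

Across these four cases the old ordering performs four support lookups and the new one performs two; on a page where almost no block has `style.css`, almost all of the lookups disappear.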

What I'd say on the PR

Nice follow-up — moving the cheap attribute check ahead of the registry/support lookups is the more impactful change of the two. On a 1,590-block test page this saves ~700–800µs, vs. ~100µs from the className check alone. Solid micro-optimization with no behavior change.

One small nit: empty( trim( $custom_css ) ) runs trim() even on the empty hot path. '' === $custom_css || '' === trim( $custom_css ) would skip it. Probably ~15µs/page, so optional.

Also: PR description still describes only the className change — worth updating it to mention the support_styles reorder too, since that's where most of the benefit comes from now.

The bigger structural win (only registering these render_block / render_block_data filters when at least one block type actually opts into customCSS) is still on the table as a follow-up, but this PR is a clear improvement and worth landing as-is.

westonruter · 2026-05-05T18:40:20Z

I obtained the SPX report before and after.

trunk: call_count: 744269 → 744.27K
PR: call_count: 740707 → 740.71K
Δ: −3,562 calls (−0.48%)

Per Claude, that accounts for:

| Saved calls | Where |
| --- | --- |
| 891 | `WP_Block_Type_Registry::get_registered` (one per `wp_render_custom_css_support_styles` early-return) |
| 891 | `WP_Block_Type_Registry::get_instance` (same) |
| 891 | `block_has_support` (same) |
| ~889 | Remaining: the bulk of these come from the className change (`preg_match` plus its internal helpers skipped on blocks where `$block['attrs']['className']` is empty), and a handful from the trim/empty path being shorter |

The 891 number is the smoking gun — it matches exactly the count of wp_render_custom_css_support_styles calls that now hit the early return before doing the registry/support lookups. The relationship is precise enough that you can verify the optimization worked without measuring time at all.

Just to put the magnitude in perspective: 0.48% fewer function dispatches per request. If average dispatch cost is ~1µs (reasonable order-of-magnitude on warm opcache), that's ~3.5ms of savings — and that's why call counts are a more sensitive instrument than wall time for changes this size. It's also why the wall-time delta is invisible: ~3.5ms of expected savings is right in the same ballpark as the ±2ms run-to-run drift in your 500-iteration medians.
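The call-count bookkeeping can be checked with a few lines of arithmetic. The counts are copied from the SPX figures above; the ~1µs per dispatch is the assumption stated in the comment, not a measurement.

```php
<?php
declare(strict_types=1);

// Total call counts copied from the SPX reports.
$trunk_calls = 744269;
$pr_calls    = 740707;

$saved_calls   = $trunk_calls - $pr_calls;          // 3562
$saved_percent = 100 * $saved_calls / $trunk_calls; // About 0.48

// Three functions each lose one call per early return.
$early_returns   = 891;
$lookup_calls    = 3 * $early_returns;              // 2673
$remaining_calls = $saved_calls - $lookup_calls;    // 889

// Assumed average dispatch cost (order-of-magnitude guess from the comment).
$assumed_us_per_call = 1.0;
$estimated_ms        = $saved_calls * $assumed_us_per_call / 1000;

printf(
	"saved=%d (%.2f%%), lookups=%d, remaining=%d, ~%.2f ms at %.1f µs/call\n",
	$saved_calls,
	$saved_percent,
	$lookup_calls,
	$remaining_calls,
	$estimated_ms,
	$assumed_us_per_call
);
```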

(Weston here) I ran benchmark-server-timing with and without the fix applied over 500 iterations. The homepage is showing 57 posts.

Without the fix applied:

| Metric | Value |
| --- | --- |
| URL | http://localhost:8870/ |
| Success Rate | 99.2% |
| Response Time (median) | 674.55 ms |
| wp-before-template (median) | 41.49 ms |
| wp-template (median) | 188.58 ms |
| wp-filter-render_block_data (median) | 0.01 ms |
| wp-filter-render_block (median) | 0.01 ms |
| wp-filter-the_content (median) | 0.1 ms |
| wp-total (median) | 230.47 ms |

With the fix applied:

| Metric | Value |
| --- | --- |
| URL | http://localhost:8870/ |
| Success Rate | 99.2% |
| Response Time (median) | 704.34 ms |
| wp-before-template (median) | 41.91 ms |
| wp-template (median) | 190.12 ms |
| wp-filter-render_block_data (median) | 0.01 ms |
| wp-filter-render_block (median) | 0.01 ms |
| wp-filter-the_content (median) | 0.11 ms |
| wp-total (median) | 232.1 ms |

Claude's analysis of this data:

Honest reading: the PR's measured wall-time impact is statistically zero at this scale. Both runs land at ~231ms wp-total ± a couple of ms of noise. Specifically:

| Metric | Without | With | Δ |
| --- | --- | --- | --- |
| wp-total | 230.47ms | 232.1ms | +1.63ms |
| wp-template | 188.58ms | 190.12ms | +1.54ms |
| wp-before-template | 41.49ms | 41.91ms | +0.42ms |
| Response Time | 674.55ms | 704.34ms | +29.8ms |

A 1.63ms median delta on a 230ms baseline is +0.7% — well within run-to-run drift between two separate 500-iteration runs. Background CPU, thermal state, MySQL/opcache warmth, and container scheduling all contribute multiple-ms of systematic variance between sessions, so I would not read +1.63ms as "the PR is slower." It's noise. (The +29.8ms Response Time delta confirms this — that's whole-request including network overhead and is even noisier; it's drifting in the same direction as wp-total at ~5×, consistent with environmental drift rather than the PR.)
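The deltas in the table can be recomputed from the two benchmark runs. This sketch uses only the medians reported above; the helper name `relative_delta_percent` is invented for the example.

```php
<?php
declare(strict_types=1);

// Medians in milliseconds from the two runs: [without fix, with fix].
$medians = array(
	'wp-before-template' => array( 41.49, 41.91 ),
	'wp-template'        => array( 188.58, 190.12 ),
	'wp-total'           => array( 230.47, 232.1 ),
	'Response Time'      => array( 674.55, 704.34 ),
);

// Change relative to the baseline run, in percent.
function relative_delta_percent( float $before, float $after ): float {
	return 100 * ( $after - $before ) / $before;
}

foreach ( $medians as $metric => list( $before, $after ) ) {
	printf(
		"%-20s %+7.2f ms (%+.2f%%)\n",
		$metric,
		$after - $before,
		relative_delta_percent( $before, $after )
	);
}
```

In relative terms the Response Time median moves by roughly 4.4% against roughly 0.7% for wp-total, which is the comparison the comment draws on when attributing both to environmental drift.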

Reconciling with the call-count evidence

The SPX run showed the optimization is structurally working (−891 lookup calls per page). My estimate was the savings would be ~100–1000µs. The benchmark confirms that the upper bound of the actual savings is under the noise floor of a 500-iteration median — which fits cleanly with my estimate. If the PR were saving 5ms per request it would show; saving 0.1–0.5ms is invisible at this signal-to-noise ratio.

So the PR is:

✅ Doing what it claims (call counts down by exactly the predicted amount)
✅ Risk-free (no behavior change)
❌ Not measurably improving page time on its own

What this means for the bigger regression hunt

This is the uncomfortable truth about the wp70-vs-wp69 regression: none of the individual fixes from my earlier list will measurably move the needle on their own. The ~25–30ms total regression is genuinely composed of many ~0.1–1ms costs. Killing them one at a time will keep producing benchmark results that look exactly like this one — within noise.

That has implications for the workflow:

Don't try to validate individual micro-optimizations with end-to-end benchmarks. They can't resolve sub-ms changes. Use call counts (rock-solid) and per-function exclusive timings from SPX (less reliable but still indicative) for that level.
Stack several optimizations and benchmark the bundle. If you fix 5–10 such issues at once and re-benchmark, the cumulative ~5–10ms should clear the noise floor.
Reserve benchmark-server-timing for confirming the bundle, not for confirming individual fixes.

The structural wins I called out earlier — conditionally registering the render_block_data filters only when at least one registered block actually opts into customCSS, caching WP_Theme_JSON_Resolver::get_merged_data per origin, static-caching wp_get_global_styles inside the layout filter — are the ones that should individually clear the noise floor, because each saves multiple ms or eliminates an entire filter dispatch chain. This PR (good as it is) is a "death by a thousand cuts" fix and will need to land alongside many siblings to show up in benchmarks.

Recommendation for what to say on the PR

Something like:

Confirmed via SPX call counts that the optimization fires correctly: −891 calls each to WP_Block_Type_Registry::get_registered, get_instance, and block_has_support per homepage render (matching the count of wp_render_custom_css_support_styles invocations that now skip them). Estimated savings ~100–500µs/request — too small to surface above the noise floor in a 500-iteration benchmark-server-timing median (within ±2ms drift between runs), but call-count evidence is unambiguous. Worth landing as part of a broader sweep through the new per-block filter overhead.

The PR is correct and worth merging. Just don't expect any individual benchmark to vindicate it.

Co-authored-by: Weston Ruter <westonruter@gmail.com>

github-actions · 2026-05-06T04:15:34Z

Trac Ticket Missing

This pull request is missing a link to a Trac ticket. For a contribution to be considered, there must be a corresponding ticket in Trac.

To attach a pull request to a Trac ticket, please include the ticket's full URL in your pull request description. More information about contributing to WordPress on GitHub can be found in the Core Handbook.

mukeshpanchal27 added 4 commits April 16, 2026 10:10

Update BASE_TAG default version to 6.8.0

79df64b

Merge branch 'WordPress:trunk' into trunk

57bd3b6

Return early if no custom CSS class is provided

98aa939

Add early return for empty custom CSS class string

Update default BASE_TAG version to 6.7.0

391f5b7

mukeshpanchal27 requested a review from westonruter April 30, 2026 10:38

mukeshpanchal27 marked this pull request as ready for review April 30, 2026 16:22

Merge branch 'trunk' into perf/10777

ac2597e

Merge branch 'trunk' into perf/10777

7e5a7d0

mukeshpanchal27 self-assigned this May 1, 2026

Refactor custom CSS support validation logic

a331593

Add check for empty custom CSS before block support validation.

westonruter reviewed May 5, 2026 (four reviews)

Comment thread src/wp-includes/block-supports/custom-css.php Outdated

mukeshpanchal27 and others added 2 commits May 6, 2026 09:38

Merge branch 'trunk' into perf/10777

68c4f7e

Apply suggestions from code review

439b39f

Co-authored-by: Weston Ruter <westonruter@gmail.com>

mukeshpanchal27 wants to merge 9 commits into WordPress:trunk from mukeshpanchal27:perf/10777
