Skip to content

JS: Refactor CFGExtractor.java#6406

Merged
codeql-ci merged 19 commits intogithub:mainfrom
erik-krogh:cleanCfg
Aug 9, 2021
Merged

JS: Refactor CFGExtractor.java#6406
codeql-ci merged 19 commits intogithub:mainfrom
erik-krogh:cleanCfg

Conversation

@erik-krogh
Copy link
Copy Markdown
Contributor

@erik-krogh erik-krogh commented Aug 3, 2021

I considered taking a look at #6362

But the code in CFGExtractor.java wasn't that readable.
So before I start fixing bugs, I went ahead and refactored CFGExtractor.java.

  • There was a lot of Object types. I changed all of those to more descriptive types (mostly Collection<Node>).
    visitSequence still works with raw types. I don't think we would gain anything by refactoring that method any further.
  • Lots of unclear names: hcaerof, V, seq, succ. I tried to use more descriptive names.

No behavior change.
An evaluation looks good (no increase in extraction time).

This is based on a refactor I tried to do last year but never finished.

@github-actions github-actions Bot added the JS label Aug 3, 2021
@erik-krogh erik-krogh added the no-change-note-required This PR does not need a change note label Aug 3, 2021
@erik-krogh erik-krogh marked this pull request as ready for review August 3, 2021 20:42
@erik-krogh erik-krogh requested a review from a team as a code owner August 3, 2021 20:42
Copy link
Copy Markdown
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!!! 💪 This is seriously excellent work! 👏

The one thing I'm not so sure about is rewriting some hot methods to have shorter but potentially slower implementations. End-to-end benchmarking is dominated by other factors like TRAP writing, so behind the noise we can't really see how the performance of the CFG extractor was affected. Would you mind benchmarking it more closely, or just reverting those particular changes?

@esbena
Copy link
Copy Markdown
Contributor

esbena commented Aug 4, 2021

dca can collect the data for us, but we need to make a custom report generator to display the diff.
If you generate an experiment with --uninterpreted-queries query-suites/javascript/uninterpreted-queries.qls, then the data for https://github.com/github/codeql/blob/1a078c38ad956091f87bf6cbffa15cedc3f380bd/javascript/ql/src/meta/extraction-metrics/PhaseTimings.ql will be available.

@erik-krogh
Copy link
Copy Markdown
Contributor Author

dca can collect the data for us, but we need to make a custom report generator to display the diff.

I did that experiment on default.yml, and did a one-off report generator hack.
The CFG extraction got about 4.5% slower.
But the CFG extration was only about 4% of the total time spent (counting up the timings in PhaseTimings.ql).
So the expected total slowdown is 4% * 4.5% = 0.18%

I'll see if removing some streams and adding some more special cases helps.

private static Collection<Node> nonNullUnion(Collection<Node> xs, Collection<Node> ys) {
List<Node> result = new ArrayList<>(xs);
for (Node y : ys) {
if (!result.contains(y)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is O(n^2).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but n is always small.

I did an experiment trying to extract Microsoft/vscode, and n was at most 3.

My idea was to avoid allocating a HashSet.
Also, it's basically what the implementation did before my refactor.

@erik-krogh
Copy link
Copy Markdown
Contributor Author

I did that experiment on default.yml, and did a one-off report generator hack.
The CFG extraction got about 4.5% slower.

The CFG extraction got slightly faster from the last few commits, but still around 3% slower than before my refactor.

Copy link
Copy Markdown
Contributor

@asgerf asgerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, good enough for me

@codeql-ci codeql-ci merged commit 562ba49 into github:main Aug 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

JS no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants