Skip to content

C++: Work around poor codegen for forex in IR-based range analysis#13035

Merged
rdmarsh2 merged 1 commit intogithub:mainfrom
MathiasVP:ranked-phi-hack-for-bounded-phi
May 4, 2023
Merged

C++: Work around poor codegen for forex in IR-based range analysis#13035
rdmarsh2 merged 1 commit intogithub:mainfrom
MathiasVP:ranked-phi-hack-for-bounded-phi

Conversation

@MathiasVP
Copy link
Copy Markdown
Contributor

Recursion through forall and forex needs special care in QL. During the modulus-analysis phase of the IR-based range analysis such care is already taken, but the same level of care wasn't taken in the main range analysis module. This resulted in horrible antijoins on some projects. Here is one such example:

Tuple counts for antijoin_rhs#3/9@i7#L1#982a1web after 31.3s:
  259071899 ~88%        {11} r1 = JOIN boundedPhiCandValidForEdge#9#fffffffff#prev_delta WITH SemanticSSA#aa9d1d08::SemSsaReadPositionPhiInputEdge::phiInput#2#dispred#fff_102#join_rhs ON FIRST 1 OUTPUT Lhs.0 'arg0', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.3 'arg3', Lhs.4 'arg4', Lhs.5 'arg5', Lhs.6 'arg6', Lhs.7 'arg7', Lhs.8 'arg8', Rhs.1, Rhs.2
  83255000  ~80%        {11} r2 = r1 AND NOT boundedPhiCandValidForEdge#9#fffffffff#prev(Lhs.0 'arg0', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.3 'arg3', Lhs.4 'arg4', Lhs.5 'arg5', Lhs.6 'arg6', Lhs.10, Lhs.9)
  83255000  ~23702%     {9} r3 = SCAN r2 OUTPUT In.0 'arg0', In.1 'arg1', In.2 'arg2', In.3 'arg3', In.4 'arg4', In.5 'arg5', In.6 'arg6', In.7 'arg7', In.8 'arg8'
                        return r3

(I cancelled the evaluation midway)

This PR moves the modulus-analysis tricks to the Utils file, and uses it in the main range analysis module. This results in a much better pipeline:

Pipeline order_500000 for boundedPhi#7#fffffff@76642xk3 was evaluated in 1740 iterations totaling 282ms (delta sizes total: 36475).
        580384  ~11%    {8} r1 = SCAN boundedPhiRankStep#8#ffffffff#prev_delta OUTPUT In.0, In.7, In.1, In.2, In.3, In.4, In.5, In.6
        580384  ~11%    {8} r2 = r1 AND NOT boundedPhi#7#fffffff#prev(Lhs.0, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7)
        36850   ~10%    {7} r3 = JOIN r2 WITH RangeUtils#6da26777::maxPhiInputRank#2#ff ON FIRST 2 OUTPUT Lhs.0, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6, Lhs.7
                        return r3

The "trick" is to replace code such as:

predicate p() { forex(X x | range(x) | recPred(x)) }

with

X getElement(int i) { ... }

predicate pFromIndex(int i) {
  recPred(getElement(i)) and
  if i = 1 then any() else pFromIndex(i - 1)
}

predicate p() { pFromIndex(max(int i | exists(getElement(i))) }

@MathiasVP MathiasVP requested a review from a team as a code owner May 4, 2023 11:54
@github-actions github-actions Bot added the C++ label May 4, 2023
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label May 4, 2023
@aschackmull
Copy link
Copy Markdown
Contributor

It's really sad that we have to use hacks like this. @jbj

rank[r](SemSsaReadPositionPhiInputEdge e |
e.phiInput(phi, _)
|
e order by e.getOrigBlock().getUniqueId()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How unique are these ids? From what I can tell, this call ultimately leads to the database id of block.getFirstInstruction().getAst(). Can it never happen that an AST element produces two different instructions that are both first in a basic block?

Copy link
Copy Markdown
Contributor Author

@MathiasVP MathiasVP May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the result of e.getOrigBlock().getUniqueId() is the database id of e's IRBlock's getFirstInstruction().getAst(). I can't quite think of an AST element with that property, but I'm sure one can construct something like this (maybe something involving the ternary operator?).

Originally, we were using the display index from the basic block, but this was totally infeasible to compute on real projects. If you have any suggestions for an integer representation whose uniqueness is easier to argue I'm all for it 😄.

This scheme has been used for a while now already in the new range-analysis library (in the modulus-phase) without any observable issues, though.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be easy to check with a consistency query whether two IRBlocks ever get the same id. And even if they do, I suppose it only matters here if those IRBlocks go into the same phi node.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

@MathiasVP MathiasVP requested a review from rdmarsh2 May 4, 2023 16:33
@MathiasVP
Copy link
Copy Markdown
Contributor Author

DCA looks fantastic 🤩.

@rdmarsh2
Copy link
Copy Markdown
Contributor

rdmarsh2 commented May 4, 2023

LGTM modulo @jbj's comment about uniqueness. I don't think we're making the situation worse here, but I'd like to see a consistency query for that in a followup PR.

@rdmarsh2 rdmarsh2 merged commit e32e28d into github:main May 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants