Skip to content

C++: add overflow detection to new range analysis#12599

Merged
MathiasVP merged 16 commits intogithub:mainfrom
rdmarsh2:rdmarsh2/range-analysis-overflow
Apr 17, 2023
Merged

C++: add overflow detection to new range analysis#12599
MathiasVP merged 16 commits intogithub:mainfrom
rdmarsh2:rdmarsh2/range-analysis-overflow

Conversation

@rdmarsh2
Copy link
Copy Markdown
Contributor

No description provided.

@rdmarsh2 rdmarsh2 requested a review from a team as a code owner March 20, 2023 18:36
@github-actions github-actions Bot added the C++ label Mar 20, 2023
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Mar 21, 2023
Copy link
Copy Markdown
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done a first pass over the PR, but I think I'd like another round before I claim to fully understand what's going on. In the meantime, we should:

  • Fix the merge conflicts. 700a566 makes me think some of the changes have already been merged into main.
  • Check the performance on this. I suggest you follow the approach I had here: #11928 (comment)

Comment on lines +53 to +70
predicate typeBounds(SemType t, float lb, float ub) {
exists(SemIntegerType integralType, float limit |
integralType = t and limit = 2.pow(8 * integralType.getByteSize())
|
if integralType instanceof SemBooleanType
then lb = 0 and ub = 1
else
if integralType.isSigned()
then (
lb = -(limit / 2) and ub = (limit / 2) - 1
) else (
lb = 0 and ub = limit - 1
)
)
or
// This covers all floating point types. The range is (-Inf, +Inf).
t instanceof SemFloatingPointType and lb = -(1.0 / 0.0) and ub = 1.0 / 0.0
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looked super scary until I realized it was copy/pasted from SimpleRangeAnalysis.

import cpp
import experimental.semmle.code.cpp.semantic.analysis.ModulusAnalysis
import experimental.semmle.code.cpp.semantic.Semantic
//import experimental.semmle.code.cpp.semantic.Semantic
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this commented out? Should we just delete it?

Comment on lines +246 to +248
signature module OverflowSig<DeltaSig D> {
predicate semExprDoesntOverflow(boolean positively, SemExpr expr);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this module signature (and its one predicate) could use some QLDoc.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking: could we also write the predicate as semExprDoesNotOverflow. It took me a while to parse Doesnt 😅

Comment on lines +35 to +41
typeBounds(expr.getSemType(), lb, ub) and
ConstantStage::initialBounded(expr, any(ConstantBounds::SemZeroBound b), delta, positively.booleanNot(), _, _, _)
|
positively = true and delta < ub
or
positively = false and delta > lb
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking my understanding here: We're finding the minimum and maximum values from the the of expr, and then we're comparing those values with the bounds we can obtain from the constant stage.

I'm not sure I understand the logic, though. If we're trying to deduce that expr doesn't overflow positively, we're checking whether delta (which is the lower bound of expr) is less than the maximum value of the type? Shouldn't this be the other way around? Or have I confused myself?

Comment on lines +43 to +51
/*
predicate semExprOverflow(float delta, boolean upper, SemExpr expr) {
exists(float lb, float ub | typeBounds(expr.getSemType(), lb, ub) |
upper = false and delta < lb
or
upper = true and delta > ub
)
}
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be deleted?

* due to a conversion.
*/
predicate exprMightOverflowPositively(Expr expr) { none() }
predicate exprMightOverflowPositively(Expr expr) { upperBound(expr) > exprMaxVal(expr) }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was confused about this for a second: we only generate bounds for expressions that won't overflow (due to the disjunction in the bounded predicate), but here it looks like we're looking for expressions that actually do have an overflowing upper bound. However, this works because upperBound is defined to only use the constant stage of range-analysis which doesn't (and can't) have that disjunction that rules out overflowing expressions. Is this correct?

Could we write this explicitly in a comment here, or add QLDoc to upperBound and lowerBound to make this clear?

int total = 0;
for (i = 0; i < 2; i = i+1) {
range(i); // $ range=<=1 range=>=0
range(i); // $ range=<=1 MISSING: range=>=0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we losing these results because we're now claiming that i+1 on line 41 overflows?

In general: Should we add an InlineExpectationsTest that highlights all the places where we're claiming an overflow might occur. This would give me a lot more confidence in what's actually going on here.

@rdmarsh2 rdmarsh2 force-pushed the rdmarsh2/range-analysis-overflow branch from 3c49432 to 204dbee Compare March 21, 2023 15:02
Comment on lines +973 to +980
positively = false and
(
expr.getOpcode() instanceof Opcode::Negate or
expr.getOpcode() instanceof Opcode::SubOne
)
or
positively = false and
expr.(SemDivExpr).getSemType() instanceof SemFloatingPointType
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
positively = false and
(
expr.getOpcode() instanceof Opcode::Negate or
expr.getOpcode() instanceof Opcode::SubOne
)
or
positively = false and
expr.(SemDivExpr).getSemType() instanceof SemFloatingPointType
positively = false and
(
expr.getOpcode() instanceof Opcode::Negate or
expr.getOpcode() instanceof Opcode::SubOne or
expr.(SemDivExpr).getSemType() instanceof SemFloatingPointType
)

Copy link
Copy Markdown
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing through with this, Robert! I have a few comments, but it's mostly just to help my understanding.

Comment thread cpp/ql/test/library-tests/ir/range-analysis/Overflow.ql Outdated
rdmarsh2 and others added 3 commits April 11, 2023 10:41
Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
@MathiasVP
Copy link
Copy Markdown
Contributor

DCA is showing a really bad join in potentiallyOverflowingExpr (for example, when running on ImageMagick-ImageMagick) 😂.

Before:

```
Tuple counts for RangeAnalysisStage#38d7ce80::RangeStage#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisImpl#edd69a76::ConstantBounds#FloatDelta#0eab55d1::FloatOverflow#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::potentiallyOverflowingExpr#2#ff/2@36ed7auu after 42.1s:
    365        ~0%     {2} r1 = JOIN num#SemanticOpcode#e6f455a5::TNegate#f WITH SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs ON FIRST 1 OUTPUT false, Rhs.1 'expr'

    0          ~0%     {2} r2 = JOIN num#SemanticOpcode#e6f455a5::TSubOne#f WITH SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs ON FIRST 1 OUTPUT false, Rhs.1 'expr'

    365        ~0%     {2} r3 = r1 UNION r2

    0          ~0%     {2} r4 = JOIN num#SemanticOpcode#e6f455a5::TAddOne#f WITH project#SemanticExpr#91573b9a::SemKnownExpr#class#fff_10#join_rhs ON FIRST 1 OUTPUT true, Rhs.1 'expr'

    2          ~0%     {2} r5 = JOIN m#RangeAnalysisStage#38d7ce80::RangeStage#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisImpl#edd69a76::ConstantBounds#FloatDelta#0eab55d1::FloatOverflow#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::boundedPhiInp1#6#ffbfff WITH num#SemanticOpcode#e6f455a5::TMul#f CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 'positively'
    22026      ~0%     {2} r6 = JOIN r5 WITH SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs ON FIRST 1 OUTPUT Lhs.1 'positively', Rhs.1 'expr'

    2          ~0%     {2} r7 = JOIN m#RangeAnalysisStage#38d7ce80::RangeStage#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisImpl#edd69a76::ConstantBounds#FloatDelta#0eab55d1::FloatOverflow#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::boundedPhiInp1#6#ffbfff WITH num#SemanticOpcode#e6f455a5::TShiftLeft#f CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 'positively'
    1978       ~0%     {2} r8 = JOIN r7 WITH SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs ON FIRST 1 OUTPUT Lhs.1 'positively', Rhs.1 'expr'

    24004      ~0%     {2} r9 = r6 UNION r8
    24004      ~0%     {2} r10 = r4 UNION r9
    24369      ~0%     {2} r11 = r3 UNION r10

    2726       ~1%     {2} r12 = JOIN project#SemanticExpr#91573b9a::SemDivExpr#fffff WITH project#SemanticExpr#91573b9a::SemKnownExpr#class#fff#2 ON FIRST 1 OUTPUT Rhs.1, Lhs.0 'expr'
    1900       ~2%     {2} r13 = JOIN r12 WITH SemanticType#3725723c::SemFloatingPointType#ff ON FIRST 1 OUTPUT false, Lhs.1 'expr'

    4500       ~0%     {1} r14 = JOIN SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs WITH num#SemanticOpcode#e6f455a5::TAdd#f ON FIRST 1 OUTPUT Lhs.1 'expr'

    0          ~0%     {1} r15 = JOIN SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs WITH num#SemanticOpcode#e6f455a5::TPointerAdd#f ON FIRST 1 OUTPUT Lhs.1 'expr'

    4500       ~0%     {1} r16 = r14 UNION r15
    4000       ~0%     {2} r17 = JOIN r16 WITH project#SemanticExpr#91573b9a::SemBinaryExpr#fffff#2 ON FIRST 1 OUTPUT Rhs.1, Lhs.0 'expr'
    7000       ~0%     {2} r18 = JOIN r17 WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'expr'

    1264       ~0%     {2} r19 = JOIN r18 WITH num#Sign#2ecc774b::TNeg#f ON FIRST 1 OUTPUT Lhs.0, Lhs.1 'expr'
    188324151  ~0%     {2} r20 = JOIN r19 WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.1 'expr', Rhs.1
    1000       ~0%     {2} r21 = JOIN r20 WITH project#SemanticExpr#91573b9a::SemBinaryExpr#fffff ON FIRST 2 OUTPUT false, Lhs.0 'expr'

    2900       ~0%     {2} r22 = r13 UNION r21

    3259       ~2%     {2} r23 = JOIN r18 WITH num#Sign#2ecc774b::TPos#f ON FIRST 1 OUTPUT Lhs.0, Lhs.1 'expr'
    1521124720 ~0%     {2} r24 = JOIN r23 WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.1 'expr', Rhs.1
    3000       ~2%     {2} r25 = JOIN r24 WITH project#SemanticExpr#91573b9a::SemBinaryExpr#fffff ON FIRST 2 OUTPUT true, Lhs.0 'expr'
```
(I stopped evaluation midway.)

After:

```ql
Evaluated relational algebra for predicate RangeAnalysisStage#38d7ce80::RangeStage#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisImpl#edd69a76::ConstantBounds#FloatDelta#0eab55d1::FloatOverflow#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::potentiallyOverflowingExpr#2#ff@dc3a0712 with tuple counts:
  26269   ~2%    {2} r1 = _SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs_m#RangeAnalysisStage#38d7ce80::R__#shared UNION _SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs_SemanticType#3725723c::SemFloati__#shared

  26269   ~2%    {2} r2 = _num#SemanticOpcode#e6f455a5::TAddOne#f_project#SemanticExpr#91573b9a::SemKnownExpr#class#fff_10#joi__#shared UNION r1

  41333   ~1%    {2} r3 = JOIN _SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs_num#SemanticOpcode#e6f455a5::TAd__#shared WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1

  5806   ~2%    {2} r4 = JOIN r3 WITH num#Sign#2ecc774b::TNeg#f ON FIRST 1 OUTPUT Lhs.1, Lhs.0
  5806   ~1%    {3} r5 = JOIN r4 WITH project#SemanticExpr#91573b9a::SemBinaryExpr#fffff#2 ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0
  3612   ~0%    {2} r6 = JOIN r5 WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff ON FIRST 2 OUTPUT false, Lhs.2

  18476   ~1%    {2} r7 = JOIN r3 WITH num#Sign#2ecc774b::TPos#f ON FIRST 1 OUTPUT Lhs.1, Lhs.0
  18476   ~1%    {3} r8 = JOIN r7 WITH project#SemanticExpr#91573b9a::SemBinaryExpr#fffff#2 ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0
  18444   ~2%    {2} r9 = JOIN r8 WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff ON FIRST 2 OUTPUT true, Lhs.2

  22056   ~0%    {2} r10 = r6 UNION r9

  24137   ~2%    {2} r11 = JOIN _SemanticExpr#91573b9a::SemExpr::getOpcode#0#dispred#fb_10#join_rhs_num#SemanticOpcode#e6f455a5::TPo__#shared WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1

  16966   ~2%    {1} r12 = JOIN r11 WITH num#Sign#2ecc774b::TPos#f ON FIRST 1 OUTPUT Lhs.1
  16966   ~4%    {2} r13 = JOIN r12 WITH project#SemanticExpr#91573b9a::SemBinaryExpr#fffff ON FIRST 1 OUTPUT Rhs.1, Lhs.0
  24917   ~1%    {2} r14 = JOIN r13 WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1
  2781   ~0%    {2} r15 = JOIN r14 WITH num#Sign#2ecc774b::TNeg#f ON FIRST 1 OUTPUT false, Lhs.1

  2817   ~0%    {1} r16 = JOIN r11 WITH num#Sign#2ecc774b::TNeg#f ON FIRST 1 OUTPUT Lhs.1
  2817   ~0%    {2} r17 = JOIN r16 WITH project#SemanticExpr#91573b9a::SemBinaryExpr#fffff ON FIRST 1 OUTPUT Rhs.1, Lhs.0
  6922   ~0%    {2} r18 = JOIN r17 WITH SignAnalysisCommon#4b1623af::SignAnalysis#FloatDelta#0eab55d1::FloatDelta#RangeUtils#6da26777::RangeUtil#FloatDelta#0eab55d1::FloatDelta#RangeAnalysisConstantSpecific#878f81e8::CppLangImplConstant##::semExprSign#1#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1
  2765   ~1%    {2} r19 = JOIN r18 WITH num#Sign#2ecc774b::TPos#f ON FIRST 1 OUTPUT true, Lhs.1

  5546   ~0%    {2} r20 = r15 UNION r19
  27602   ~0%    {2} r21 = r10 UNION r20
  53871   ~0%    {2} r22 = r2 UNION r21
                  return r22
```
@MathiasVP
Copy link
Copy Markdown
Contributor

MathiasVP commented Apr 17, 2023

@rdmarsh2 I've pushed a join-order fix, and I'll restart DCA.

MathiasVP
MathiasVP previously approved these changes Apr 17, 2023
Copy link
Copy Markdown
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

MathiasVP
MathiasVP previously approved these changes Apr 17, 2023
@MathiasVP MathiasVP merged commit bb8c3de into github:main Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants