Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 24 additions & 19 deletions cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/ModelUtil.qll
Original file line number Diff line number Diff line change
Expand Up @@ -32,25 +32,34 @@ DataFlow::Node callInput(CallInstruction call, FunctionInput input) {
}

/**
* Gets the instruction that holds the `output` for `call`.
* Gets the node that represents the output of `call` with kind `output` at
* indirection index `indirectionIndex`.
*/
Node callOutput(CallInstruction call, FunctionOutput output) {
private Node callOutputWithIndirectionIndex(
CallInstruction call, FunctionOutput output, int indirectionIndex
) {
// The return value
simpleOutNode(result, call) and
output.isReturnValue()
output.isReturnValue() and
indirectionIndex = 0
or
// The side effect of a call on the value pointed to by an argument or qualifier
exists(int index, int indirectionIndex |
exists(int index |
result.(IndirectArgumentOutNode).getArgumentIndex() = index and
result.(IndirectArgumentOutNode).getIndirectionIndex() = indirectionIndex and
result.(IndirectArgumentOutNode).getIndirectionIndex() = indirectionIndex - 1 and
result.(IndirectArgumentOutNode).getCallInstruction() = call and
output.isParameterDerefOrQualifierObject(index, indirectionIndex)
output.isParameterDerefOrQualifierObject(index, indirectionIndex - 1)
Comment on lines +49 to +51
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the subtractions needed here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm not quite happy with that. The indirection index of IndirectArgumentOutNode is defined to be [1 .. n] (up to some number of indirections), but in the caller of this predicate (callOutput/3) the not exists(getIndirectReturnOutNode(call, indirectionIndex + d)) predicate expects the indirection index to be defined [0 .. n - 1] (because that's the indirection index range of IndirectReturnOutNode since it can be both a "direct return" with indirection index being 0, or an indirect return with indirection index being >= 1).

So while it's very difficult to reason about this indirection index, it's very easy to see when they're wrong since any other offsets added here breaks some of our tests 😂. I'd like to clean up this so that we don't need these subtractions at some point, but I'd rather not add too much more onto this PR.

)
or
exists(int ind |
result = getIndirectReturnOutNode(call, ind) and
output.isReturnValueDeref(ind)
)
result = getIndirectReturnOutNode(call, indirectionIndex) and
output.isReturnValueDeref(indirectionIndex)
}

/**
* Gets the instruction that holds the `output` for `call`.
*/
Node callOutput(CallInstruction call, FunctionOutput output) {
result = callOutputWithIndirectionIndex(call, output, _)
}

DataFlow::Node callInput(CallInstruction call, FunctionInput input, int d) {
Expand All @@ -76,19 +85,15 @@ private IndirectReturnOutNode getIndirectReturnOutNode(CallInstruction call, int
*/
bindingset[d]
Node callOutput(CallInstruction call, FunctionOutput output, int d) {
exists(DataFlow::Node n | n = callOutput(call, output) and d > 0 |
exists(DataFlow::Node n, int indirectionIndex |
n = callOutputWithIndirectionIndex(call, output, indirectionIndex) and d > 0
|
// The return value
result = getIndirectReturnOutNode(n.asInstruction(), d)
result = callOutputWithIndirectionIndex(call, output, indirectionIndex + d)
or
// If there isn't an indirect out node for the call with indirection `d` then
// we conflate this with the underlying `CallInstruction`.
not exists(getIndirectReturnOutNode(call, d)) and
not exists(getIndirectReturnOutNode(call, indirectionIndex + d)) and
n = result
or
// The side effect of a call on the value pointed to by an argument or qualifier
exists(Operand operand, int indirectionIndex |
Ssa::outNodeHasAddressAndIndex(n, operand, indirectionIndex) and
Ssa::outNodeHasAddressAndIndex(result, operand, indirectionIndex + d)
)
Comment on lines -88 to -92
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this no longer needed?

Copy link
Copy Markdown
Contributor Author

@MathiasVP MathiasVP Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now handled by the result = callOutputWithIndirectionIndex(call, output, indirectionIndex + d) line because callOutputWithIndirectionIndex has a case for IndirectArgumentOutNode. So when we have:

n = callOutputWithIndirectionIndex(call, output, indirectionIndex) and
result = callOutputWithIndirectionIndex(call, output, indirectionIndex + d)

in callOutput, we get an equivalent condition.

Does that make sense?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partially. My main problem is, that it's rather hard for me to map the old code on the new code, because it uses different predicates.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not an easy commit to review as it's not a clear line-for-line mapping 😢. The main point of the commit is to get an indirection index out of callOutput/2 (which has been renamed to callOutputWithIndirectionIndex) so that we can compute indirectionIndex + d on line 92 to "increment" the number of indirections of the resulting node by d.

)
}
Original file line number Diff line number Diff line change
Expand Up @@ -228,7 +228,7 @@ private class PointerWrapperTypeIndirection extends Indirection instanceof Point
override predicate isAdditionalDereference(Instruction deref, Operand address) {
exists(CallInstruction call |
operandForFullyConvertedCall(getAUse(deref), call) and
this = call.getStaticCallTarget().getClassAndName("operator*") and
this = call.getStaticCallTarget().getClassAndName(["operator*", "operator->", "get"]) and
address = call.getThisArgumentOperand()
)
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,12 +20,8 @@ reverseRead
argHasPostUpdate
postWithInFlow
| test.cpp:384:10:384:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:384:10:384:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:391:10:391:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:391:10:391:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:400:10:400:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:400:10:400:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:407:10:407:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| test.cpp:407:10:407:13 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
viableImplInCallContextTooLarge
uniqueParameterNodeAtPosition
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -44,8 +44,6 @@ reverseRead
argHasPostUpdate
postWithInFlow
| realistic.cpp:54:16:54:47 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| realistic.cpp:54:16:54:47 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| realistic.cpp:60:16:60:18 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
| realistic.cpp:60:16:60:18 | memcpy output argument | PostUpdateNode should not be the target of local flow. |
viableImplInCallContextTooLarge
uniqueParameterNodeAtPosition
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2199,6 +2199,19 @@ WARNING: Module TaintTracking has been deprecated and may be removed in future (
| map.cpp:436:55:436:59 | def | map.cpp:436:19:436:60 | call to pair | TAINT |
| map.cpp:436:63:436:67 | first | map.cpp:436:7:436:67 | call to iterator | |
| map.cpp:437:7:437:9 | m35 | map.cpp:437:7:437:9 | call to unordered_map | |
| map.cpp:446:23:446:23 | call to map | map.cpp:448:3:448:3 | m | |
| map.cpp:446:23:446:23 | call to map | map.cpp:449:12:449:12 | m | |
| map.cpp:446:23:446:23 | call to map | map.cpp:451:1:451:1 | m | |
| map.cpp:447:12:447:26 | call to indirect_source | map.cpp:448:10:448:10 | p | |
| map.cpp:448:3:448:3 | m | map.cpp:448:4:448:4 | call to operator[] | TAINT |
| map.cpp:448:3:448:3 | ref arg m | map.cpp:449:12:449:12 | m | |
| map.cpp:448:3:448:3 | ref arg m | map.cpp:451:1:451:1 | m | |
| map.cpp:448:3:448:10 | ... = ... | map.cpp:448:4:448:4 | call to operator[] [post update] | |
| map.cpp:448:4:448:4 | call to operator[] [post update] | map.cpp:448:3:448:3 | ref arg m | TAINT |
| map.cpp:448:10:448:10 | p | map.cpp:448:3:448:10 | ... = ... | |
| map.cpp:449:12:449:12 | m | map.cpp:449:13:449:13 | call to operator[] | TAINT |
| map.cpp:449:12:449:12 | ref arg m | map.cpp:451:1:451:1 | m | |
| map.cpp:449:13:449:13 | call to operator[] | map.cpp:450:8:450:8 | q | |
| movableclass.cpp:8:2:8:15 | this | movableclass.cpp:8:27:8:31 | constructor init of field v [pre-this] | |
| movableclass.cpp:8:21:8:22 | _v | movableclass.cpp:8:29:8:30 | _v | |
| movableclass.cpp:8:29:8:30 | _v | movableclass.cpp:8:27:8:31 | constructor init of field v | TAINT |
Expand Down
13 changes: 13 additions & 0 deletions cpp/ql/test/library-tests/dataflow/taint-tests/map.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -436,3 +436,16 @@ void test_unordered_map()
sink(m35.emplace(std::pair<char *, char *>(source(), "def")).first); // $ MISSING: ast,ir
sink(m35); // $ MISSING: ast,ir
}

namespace {
int* indirect_source();
void indirect_sink(int*);
}

void test_indirect_taint() {
std::map<int, int*> m;
int* p = indirect_source();
m[1] = p;
int* q = m[1];
sink(q); // $ ir MISSING: ast
}
2 changes: 2 additions & 0 deletions cpp/ql/test/library-tests/dataflow/taint-tests/taint.ql
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,8 @@ module IRTest {
or
source.asIndirectExpr().(FunctionCall).getTarget().getName() = "source"
or
source.asIndirectExpr().(FunctionCall).getTarget().getName() = "indirect_source"
or
source.asParameter().getName().matches("source%")
or
exists(FunctionCall fc |
Expand Down