C++: stitch paths and ignore cast arrays in constant off-by-one query#13045
Conversation
MathiasVP
left a comment
There was a problem hiding this comment.
A couple of comments, but otherwise this LGTM once we have a successful DCA run
| FieldAddressToPointerArithmeticFlow::flowPath(fieldSource, sink) and | ||
| isFieldAddressSource(f, fieldSource.getNode()) and | ||
| pai.getLeft() = sink.getNode().(DataFlow::InstructionNode).asInstruction() and | ||
| pai.getElementSize() = f.getUnspecifiedType().(ArrayType).getBaseType().getSize() and |
There was a problem hiding this comment.
I think we should push the f.getUnspecifiedType() instanceof ArrayType conjunct into the FieldAddressToPointerArithmeticConfig::isSource predicate (or even better: into the isFieldAddressSource predicate) to get a smaller isSource predicate for the first dataflow traversal.
Thinking more about this: If we push the restriction saying that f.getUnspecifiedType() must be an ArrayType into the isSource predicate we can use a flow state to remember the size of the array. Does that not mean that we would be able to handle cases like the one you added:
char *charBuf = (char*) arr->buf;
charBuf[MAX_SIZE_BYTES] = 0;then?
MathiasVP
left a comment
There was a problem hiding this comment.
I like this new approach, but I fear there's a potential performance problem with the current code.
MathiasVP
left a comment
There was a problem hiding this comment.
A couple of small remaining things, but otherwise this LGTM!
MathiasVP
left a comment
There was a problem hiding this comment.
LGTM once DCA shows that performance is fine. Although, it looks like the latest DCA run still shows some performance issues?
|
I've pushed two performance-related commits @rdmarsh2 to your branch. Feel free to modify them as you see fit. I'll start a DCA run to check the performance impact of them. Hopefully that's done by the time your day starts tomorrow 🤞. |
This PR does two things - stitch the first and second data flow paths together to make reviewing results easier, and then fix a false positive where casts to a differently-sized type (e.g.
int[]tochar*) would result in false positives.