Swift: add String taint steps#11185
Conversation
| this.getField().getEnclosingDecl().(ExtensionDecl).getExtendedTypeDecl().getFullName() = | ||
| "String" | ||
| } | ||
| } |
There was a problem hiding this comment.
This appears to be a sensible approach, but I will note a few rough edges in the case of String: String.count, String.isEmpty, String.first and String.last. These things can't contain a lot of data (especially String.isEmpty which is a Bool) so it would be very difficult for an attacker to utilize a theoretical taint path through any of them. In CPP in particular we've found taint flow through strlen (equivalent to String.count) usually produces many more false positive results than useful ones.
I'm not sure what the right solution should be. We could try to exclude the ones we don't want from this TaintInheritingContent, or switch to an explicit modelling approach. We could potentially block all taint through 'small' types everywhere, solving the problem for good (but would this limit some uses of taint?). Or we could stick to our principles that these are taint flows and expect queries to sanitize on an ad-hoc basis if they tend to find false positive results?
There was a problem hiding this comment.
On the one hand, taint flow through small types is irrelevant for most "code injection" type queries. On the other hand, there was Heartbleed.
There was a problem hiding this comment.
Good point, there have been plenty of real bugs related to tainted integers (malicious image width/height bugs spring to mind as a second example) - and as you suggest our recent focus on code injection queries may be biasing my viewpoint.
I remain somewhat nervous of taint flow through strlen / .count. In CPP we've seen a lot of code like this:
char *buffer = (char *)malloc(sizeof(char) * strlen(inputString));
which is usually a reasonable thing to do, even if inputString is tainted. There were other similar patterns, and in the end we blocked taint through strlen. This particular example ought to be uncommon in Swift, because you don't need to explicitly allocate like this when you're using a proper String class. I suppose we could take a wait-and-see approach on whether flow through .count is actually a problem (or a positive) on Swift?
There was a problem hiding this comment.
It may also be worth asking other language teams about this. Experience from C/C++ might not apply very well to this judgment call, but experience from Java/C#/JavaScript/Python and related languages might be a better indication of which direction which should go for Swift.
There was a problem hiding this comment.
In Java, we usually (we aren't 100% consistent) don't consider things like length to be propagating taint, because there are few opportunities in a memory-safe language where a tainted integer could be a problem. Although IMHO, I think we should have those models as well, because some queries actually look for tainted integers. Other queries not interested in that (like code injection) can always add a type-based sanitizer that filters out uninteresting types.
In Swift, where memory management can be a thing, I think modeling these kind of methods makes even more sense. Maybe String.count should be a specific sanitizer for overflow-like vulnerabilities instead of not modeling it at all?
There was a problem hiding this comment.
If we opt for leaving .count et al as taint propagating steps, maybe we could have a library predicate like isConversionToSmallType(DataFlow::Node), so that queries that don't want this behavior can sanitize it out without reinventing the wheel.
There was a problem hiding this comment.
I'd say we keep all those fields/members as potential sources of taint until proven otherwise. As folks have mentioned here, even integers can be sources of taint in some cases. I'm wondering though if there's a way to implement @d10c suggestion without the query writer doing too much work here.
geoffw0
left a comment
There was a problem hiding this comment.
Happy with this. We should keep an eye on how flow through String.count works out in practice. And I've started a DCA run to check any impact on performance.
|
The DCA run failed spectacularly. Before we start another, it might be a good idea to merge latest main into this PR so that everything is up-to-date. |
|
Indeed, merging in |
|
Sounds good. Let me do that then, because my won DCA run also failed on all levels! I'll start another DCA run once I merge/rebase with main. |
44ed216 to
fc04766
Compare
fc04766 to
9bca57d
Compare
if a String is tainted, then all its fields (including those declared in extensions) should be tainted as well
9bca57d to
f0d9dab
Compare
geoffw0
left a comment
There was a problem hiding this comment.
My DCA run succeeded and I don't see anything concerning in the results. 🎉
Now that we have support for taint through fields of String, we can now detect certain flows that we previously marked as [NOT DETECTED]. This commit updates the expected output of CWE-079 (and the in-code annotation of the accompanying test case) to reflect that update.
If a
Stringis tainted, then all its fields should be tainted including computed properties such asutf8. This support follows a similar solution to what we do with URLs usingTaintInheritingContent.