Skip to content

Add query for tainted wordexp calls.#10077

Merged
geoffw0 merged 4 commits intogithub:mainfrom
intrigus-lgtm:cpp/wexpand-commmand-injection
Oct 17, 2022
Merged

Add query for tainted wordexp calls.#10077
geoffw0 merged 4 commits intogithub:mainfrom
intrigus-lgtm:cpp/wexpand-commmand-injection

Conversation

@intrigus-lgtm
Copy link
Copy Markdown
Contributor

This is my first query for C/C++ I think, so there is likely some improvement possible.

Comment thread cpp/ql/src/experimental/Security/CWE/CWE-078/WordexpTainted.qhelp Outdated
@intrigus-lgtm
Copy link
Copy Markdown
Contributor Author

intrigus-lgtm commented Aug 16, 2022

This is inspired by https://github.com/syoyo/tinygltf/issues/368.
Sadly, my query can not find that specific instance.
The taint seems to get lost in LoadFromString around lines 5648.
LoadFromString -> ParseBuffer -> LoadExternalFile -> FindFile -> fs->ExpandFilePath (ExpandFilePathFunction) -> ExpandFilePath (line 2611) -> wordexp (line 2640)

(The used database is from https://lgtm.com/projects/g/syoyo/tinygltf/ci/#ql)
EDIT: The used database can be found here instead: https://github.com/intrigus-lgtm/file-dump/blob/master/syoyo_tinygltf_cpp-srcVersion_81f7dbe53a112d05217a79bb1c986f9ada3b6631-dist_codeql-bundle-linux64-20220811-2843018186.zip?raw=true

Copy link
Copy Markdown
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @intrigus-lgtm, this look to be a nice little query, with great qldoc and a clear example.

I've made a couple of comments, I also intend to look into why taint is getting lost in the inspiring instance.

Comment thread cpp/ql/src/experimental/Security/CWE/CWE-078/WordexpTainted.qhelp Outdated
Comment thread cpp/ql/src/experimental/Security/CWE/CWE-078/WordexpTainted.qhelp
@geoffw0
Copy link
Copy Markdown
Contributor

geoffw0 commented Aug 17, 2022

Hi @intrigus-lgtm. syoyo/tinygltf contains a call to wordexp in tiny_gltf.h line 2661. However the LGTM build does not include this line (presumably because of the #if preprocessor condition a few lines up combined with the specific compliation flags / environment that was used in the build) - so the call is not in the database and your query can't be expected to find it as a result on this particular snapshot. I wouldn't worry about it, but if you're really determined you could attempt to build a snaphot of the project with different settings, or write test cases that are closer to what you see there (e.g. exercising taint flow through a std::string).

I'm also suspicious you're right about taint not reaching the function, but I'm not sure where it stops or why (or whether its correct to). Again, the query is not at fault.

@intrigus-lgtm
Copy link
Copy Markdown
Contributor Author

Hi @intrigus-lgtm. syoyo/tinygltf contains a call to wordexp in tiny_gltf.h line 2661. However the LGTM build does not include this line (presumably because of the #if preprocessor condition a few lines up combined with the specific compliation flags / environment that was used in the build) - so the call is not in the database and your query can't be expected to find it as a result on this particular snapshot. I wouldn't worry about it, but if you're really determined you could attempt to build a snaphot of the project with different settings, or write test cases that are closer to what you see there (e.g. exercising taint flow through a std::string).

I'm also suspicious you're right about taint not reaching the function, but I'm not sure where it stops or why (or whether its correct to). Again, the query is not at fault.

I guess the problem is that lgtm.com received the new commits that patched the problem (by commenting out the vulnerable code), so the db does not contain any calls to wordexp anymore.
You can try this slightly older database instead that works for me:
https://github.com/intrigus-lgtm/file-dump/blob/master/syoyo_tinygltf_cpp-srcVersion_81f7dbe53a112d05217a79bb1c986f9ada3b6631-dist_codeql-bundle-linux64-20220811-2843018186.zip?raw=true

I could successfully find the call to wordexp in that database yesterday, but taint is getting lost as described in #10077 (comment)

@geoffw0
Copy link
Copy Markdown
Contributor

geoffw0 commented Aug 17, 2022

Thanks for that update ... I think the taint issue might be that the flow into ParseBuffer is in a lambda expression, its likely such flows are less well tested that other more common situations. For some reason (perhaps to do with capturing the base_dir variable, or uncertainty that the lambda is called???) taint flow isn't working there. I'll file an internal issue so we get this fixed at some point.

Comment thread cpp/ql/src/experimental/Security/CWE/CWE-078/WordexpTainted.qhelp Outdated
@geoffw0
Copy link
Copy Markdown
Contributor

geoffw0 commented Aug 22, 2022

I've started the checks, and run a moderately large LGTM run of the query here: https://lgtm.com/query/1704064894746891004/ (results look good at a glance)

geoffw0
geoffw0 previously approved these changes Aug 23, 2022
Copy link
Copy Markdown
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready to merge. Thank you for your contribution!

@intrigus-lgtm
Copy link
Copy Markdown
Contributor Author

Just to be sure, this does not need a review of the security lab?

@MathiasVP
Copy link
Copy Markdown
Contributor

Just to be sure, this does not need a review of the security lab?

Yes, it will. It will be merged once Security Lab has reviewed it and are happy with it :)

@JarLob
Copy link
Copy Markdown
Contributor

JarLob commented Aug 24, 2022

One FP pattern I have noticed is when the taint from char* is transferred to int through strlen and causes false positives. I think for this query a string to integer sanitizer is need.

I may add more comments in the thread as I review the results.

@geoffw0
Copy link
Copy Markdown
Contributor

geoffw0 commented Aug 24, 2022

I think for this query a string to integer sanitizer is need.

Its fairly common to simply block flow through all integer typed expressions, in order to stop this sort of thing. Something like:

override predicate isSanitizer(DataFlow::Node node) {
  node.asExpr().getUnspecifiedType() instanceof IntegralType
}

@JarLob
Copy link
Copy Markdown
Contributor

JarLob commented Aug 24, 2022

The query doesn't find the vulnerability in tinygltf built from changset 0fa56e239c77cc864dc248842e8887d985cf8e3f. Please fix it.

@JarLob
Copy link
Copy Markdown
Contributor

JarLob commented Aug 24, 2022

I see why it doesn't find. Please, disregard.

@JarLob
Copy link
Copy Markdown
Contributor

JarLob commented Aug 24, 2022

I switched the sink to any() to see all reachable nodes (I think it was Pavel who has shown the technique in one of CodeQL videos) and it looks to me the taint stops at bool fileread = fs.ReadWholeFile(&data, &fileerr, filename, fs.user_data); which is tinygltf specific function. I think CodeQL doesn't understand the taint flows to the first std::vector<unsigned char> *out parameter. I think it is ok to not include tinygltf specific steps in the general query.

@JarLob
Copy link
Copy Markdown
Contributor

JarLob commented Sep 12, 2022

@intrigus-lgtm Could you please add the sanitizer?

@intrigus-lgtm
Copy link
Copy Markdown
Contributor Author

@JarLob done, sorry for the delay.

Copy link
Copy Markdown
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I lost track of this. I think its ready to merge into experimental.

@geoffw0
Copy link
Copy Markdown
Contributor

geoffw0 commented Oct 7, 2022

Oh, and I'll run the checks now...

@intrigus-lgtm
Copy link
Copy Markdown
Contributor Author

@geoffw0 anything else missing?

Copy link
Copy Markdown
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@geoffw0 anything else missing?

I don't think so. Lets merge this.

@geoffw0 geoffw0 merged commit 2b3ab18 into github:main Oct 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants