C++: Model Microsoft's "Active Template Library"#18136
C++: Model Microsoft's "Active Template Library"#18136MathiasVP merged 65 commits intogithub:mainfrom
Conversation
…icitly handle the case where the function (or class) is not a template.
257c657 to
de75e03
Compare
|
I've responded to all the comments now, Geoffrey. While fixing some of them I also spotted a bug in one of the models that I fixed in 279a30c |
geoffw0
left a comment
There was a problem hiding this comment.
Another 8 commits reviewed (to the end of CPathT).
| - ["", "CComSafeArray", True, "CComSafeArray", "(const SAFEARRAY &)", "", "Argument[*0]", "Argument[-1].Field[*m_psa]", "value", "manual"] | ||
| - ["", "CComSafeArray", True, "CComSafeArray", "(const SAFEARRAY *)", "", "Argument[*0]", "Argument[-1].Field[*m_psa]", "value", "manual"] | ||
| - ["", "CComSafeArray", True, "Add", "(const SAFEARRAY *)", "", "Argument[*0]", "Argument[-1].Field[*m_psa]", "value", "manual"] | ||
| - ["", "CComSafeArray<T>", True, "Add", "(const T &,BOOL)", "", "Argument[*@0]", "Argument[-1].Field[*m_psa].Field[*@pvData]", "value", "manual"] |
There was a problem hiding this comment.
What's going on here? I mean, what is .Field[*m_psa].Field[*@pvData] and why do we sometimes reference it instead of just .Field[*m_psa]?
There was a problem hiding this comment.
The m_psa field is a public member of CComSafeArray: https://learn.microsoft.com/en-us/cpp/atl/reference/ccomsafearray-class?view=msvc-170#m_psa. So it's possible to do:
CComSafeArray<int> array;
array.Add(42);
LPSAFEARRAY underlyingArray = array.m_psa;
int* data = (int*)underlyingArray->pvData;
printf("%d", *data); // this will print 42So in order to capture this flow we model the effect of calling Add by making it a write to the m_psa->pvData member.
|
Wonderful 🦅 eye'ing, Geoffrey! I've responded to all of them now. |
geoffw0
left a comment
There was a problem hiding this comment.
Review complete. 😅 🎉
Overall this looks fantastic, I expect it will really improve flow (and sources for that matter) in projects that use this stuff heavily.
| - ["", "CAtlFileMappingBase", True, "GetData", "", "", "Argument[-1]", "ReturnValue[*]", "taint", "manual"] | ||
| - ["", "CAtlFileMappingBase", True, "GetHandle", "", "", "Argument[-1]", "ReturnValue", "taint", "manual"] | ||
| - ["", "CAtlFileMappingBase", True, "MapFile", "", "", "Argument[0]", "Argument[-1]", "taint", "manual"] | ||
| - ["", "CAtlFileMappingBase", True, "MapSharedMem", "", "", "Argument[*1]", "Argument[-1]", "taint", "manual"] |
There was a problem hiding this comment.
I feel like there are potentially some queries we could develop or extend into the area of memory mapped files.
| data: # namespace, type, subtypes, name, signature, ext, input, output, kind, provenance | ||
| - ["", "CRegKey", True, "CRegKey", "(CRegKey&)", "", "Argument[*0]", "Argument[-1]", "value", "manual"] | ||
| - ["", "CRegKey", True, "CRegKey", "(HKEY)", "", "Argument[0]", "Argument[-1]", "taint", "manual"] | ||
| - ["", "CRegKey", True, "Attach", "", "", "Argument[0]", "Argument[-1]", "taint", "manual"] |
There was a problem hiding this comment.
I feel we should also model CRegKey::Create, which I believe is commonly used when adding new data structures in the registry. The lpszKeyName (Argument[*1]) should have taint flow into the qualifier.
(it could also be a query sink, you would not want untrusted data controlling this operation)
There was a problem hiding this comment.
Agreed. I've added this summary model in 674dbce. I think this is one that could also be treated with MapKey flow at some point and be made value-preserving.
There was a problem hiding this comment.
I don't think it will be value preserving as you're still appending the key name to an existing registry path represented by the current state of the key. Unless maybe you consider the key to be a set of path edges, in which case I suppose you're adding one value.
There was a problem hiding this comment.
Unless maybe you consider the key to be a set of path edges, in which case I suppose you're adding one value.
Yeah, that was my thinking. But I guess we can settle on those details later. I agree that there are multiple ways to think of this.
|
I believe I've addressed everything @geoffw0. The only discussion point left is #18136 (comment). |
geoffw0
left a comment
There was a problem hiding this comment.
All of my concerns have been addressed, I think we're ready to merge. ✨
This PR adds MaD models for most of the relevant classes in Active Template Library (ATL).
In addition I've also added ATL flow sources whenever I came across something relevant.
There are still some models that could do with more MaD rows, but they depend on a couple of features that aren't yet available. In particular:
Tis instantiated with. You can see an example of where this is used here (where you'd like taint fromthisto the result of the call). A natural way to write this MaD would be:["", "CAtlFileMapping<T>", True, "operator T *", "", "", "Argument[-1]", "ReturnValue[*]", "taint", "manual"]Container container; POS pos = container.find(myValue); sink(container.lookup(pos));findto the return value, and a taint-step fromlookup's argument to its return value, in order to get this taint. However, I would much rather do the very small addition to dataflow (that Java has) which adds aMapKeycontent to express that "this is a key that points to something tainted". So that's another follow-up from this PR.In addition to adding a bunch of models, this PR also fixes two problems in our interpretation of MaD rows:
isConstructedFromin a few places to obtain the uninstantiated template of classes and functions. The problem with that is thatisConstructedFromdoesn't have a result when the function/class isn't a template instantiation. So I fixed that as part of this work, and you can see the effect of this in the commit that follows the fix.operator intinstead ofoperator MyInt. So without some special handling a MaD row foroperator MyIntwould have to use the nameoperator intwhich is slightly confusing. So I added special handling of conversion operators so that we get the nameoperator MyIntinstead. The effect of this can also be seen in the commit that follows the fix.Commit-by-commit review highly encouraged. There is a lot of code, but most of it is just adding tests and models. Every commit can be read and understood in isolation!