<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<overview>
<p>
The Go package <code>golang.org/x/net/idna</code> implements UTS-46 IDNA
processing. With the <code>Lookup</code> profile, or any profile built with
the <code>MapForLookup</code> option, <code>(*Profile).ToASCII</code> applies
an NFKC-based character map that folds <strong>100 distinct non-ASCII Unicode
digit codepoints</strong> to their ASCII equivalents. They fall into the
following 7 families:
</p>
<ul>
<li>Latin-1 superscripts (U+00B2, U+00B3, U+00B9): 3 codepoints</li>
<li>Mathematical superscripts (U+2070, U+2074..U+2079): 7 codepoints</li>
<li>Mathematical subscripts (U+2080..U+2089): 10 codepoints</li>
<li>Circled digits (U+2460..U+2468, U+24EA): 10 codepoints</li>
<li>Fullwidth digits (U+FF10..U+FF19): 10 codepoints</li>
<li>Mathematical bold, double-struck, sans-serif, sans-serif bold, and
monospace digits (U+1D7CE..U+1D7FF): 50 codepoints</li>
<li>Segmented digits (U+1FBF0..U+1FBF9): 10 codepoints</li>
</ul>
<p>
The library performs no IP-literal detection. A caller that applies UTS-46
mapping to an attacker-controlled host string and passes the result to a
network sink without re-parsing it as an IP address can receive a valid ASCII
IPv4 literal back as the "domain name" output. This bypasses any downstream
allowlist check, SSRF guard, NoProxy match, or TLS-SNI router that does not
re-check the post-IDNA result. The anti-pattern also covers callers that run
<code>net.ParseIP</code> only before IDNA mapping and assume that is
sufficient: the smuggled host contains non-ASCII digits, so the pre-IDNA
check classifies it as a non-IP host name, and the post-IDNA value (now a
numeric literal) reaches the sink unguarded.
</p>
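<p>
The gap between the pre-IDNA and post-IDNA checks can be sketched with the
standard library alone. This is a minimal illustration, not the query's test
case: <code>looksLikeIP</code> is a hypothetical helper, and the mapped value
is hardcoded on the assumption that U+00B9 folds to <code>1</code> as
described above, so that the sketch does not depend on
<code>golang.org/x/net/idna</code>.
</p>

```go
package main

import (
	"fmt"
	"net/netip"
)

// looksLikeIP is a hypothetical helper that reports whether host parses
// as an IP literal.
func looksLikeIP(host string) bool {
	_, err := netip.ParseAddr(host)
	return err == nil
}

func main() {
	// Attacker-controlled host containing U+00B9 SUPERSCRIPT ONE.
	raw := "0.\u00b9.0.0"
	// Assumed output of the UTS-46 mapping step, hardcoded for this sketch.
	mapped := "0.1.0.0"

	// The pre-IDNA check sees a non-ASCII string and does not flag it.
	fmt.Println("pre-IDNA is IP literal: ", looksLikeIP(raw))
	// The same check on the mapped value flags an IPv4 literal.
	fmt.Println("post-IDNA is IP literal:", looksLikeIP(mapped))
}
```

<p>
A guard that runs only on <code>raw</code> therefore never sees the value
that actually reaches the sink.
</p>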
<p>
IPv6 is out of scope: <code>:</code> is a UTS-46 disallowed character;
bare-IPv6 inputs are rejected by IDNA rune-validation before any
digit-fold mapping runs.
</p>
<p>
Sinks where the smuggled literal becomes exploitable include
<code>net.JoinHostPort</code>, <code>net.Dial</code>,
<code>(*http.Request).URL.Host</code>, <code>(*tls.Config).ServerName</code>,
<code>(*http.Cookie).Domain</code>, and any HTTP client request URL
constructed from the mapped value.
</p>
</overview>
<recommendation>
<p>
Apply one of the following:
</p>
<ol>
<li>
Use a strict IDNA profile option that returns an error if the mapped
output parses as an IP literal, if your IDNA library exposes one.
</li>
<li>
Apply the explicit safe pattern: after <code>idna.ToASCII</code>, trim the
trailing dot and call <code>net.ParseIP</code> (or
<code>netip.ParseAddr</code>) on the result, then reject the host if
<code>net.ParseIP</code> returns non-nil (or <code>netip.ParseAddr</code>
returns a nil error). The trailing-dot trim is required because
<code>"0.¹.0.0."</code> maps to <code>"0.1.0.0."</code>, which
<code>net.ParseIP</code> does not parse on its own yet is still an IP
literal for routing purposes.
</li>
</ol>
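<p>
One way to package the second option is a small post-mapping guard. The
sketch below is an assumption-laden illustration rather than the contents of
the query's sample files: <code>checkMappedHost</code> and
<code>errIPLiteral</code> are hypothetical names, the input is assumed to be
the string already returned by the IDNA mapping step, and the sketch trims
every trailing dot rather than just one.
</p>

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
	"strings"
)

// errIPLiteral is a hypothetical sentinel error used only in this sketch.
var errIPLiteral = errors.New("host maps to an IP literal")

// checkMappedHost is a hypothetical guard. ace is assumed to be the string
// already returned by the IDNA mapping step.
func checkMappedHost(ace string) (string, error) {
	// Trim every trailing dot so that "0.1.0.0." is also caught.
	trimmed := strings.TrimRight(ace, ".")
	if _, err := netip.ParseAddr(trimmed); err == nil {
		return "", errIPLiteral
	}
	return ace, nil
}

func main() {
	for _, ace := range []string{"example.com", "0.1.0.0", "0.1.0.0."} {
		_, err := checkMappedHost(ace)
		fmt.Printf("%q rejected=%t\n", ace, err != nil)
	}
}
```

<p>
Returning an error instead of a boolean keeps the caller from using the
mapped value by accident when the guard fires.
</p>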
</recommendation>
<example>
<p>
Vulnerable pattern. <code>net.ParseIP</code> is called only before
<code>idna.ToASCII</code>, so the smuggled literal slips through:
</p>
<sample src="IdnaIpLiteralSmuggleBad.go"/>
<p>
Safe pattern. Post-IDNA trailing-dot trim followed by
<code>net.ParseIP</code> recheck:
</p>
<sample src="IdnaIpLiteralSmuggleGood.go"/>
<p>
The safe pattern accepts three trailing-dot trim forms, which differ only in
how they treat multiple trailing dots:
</p>
<ul>
<li><code>strings.TrimRight(ace, ".")</code>: multi-dot form. Handles
the fullwidth and ideographic dot variants that produce multiple
trailing ASCII dots after UTS-46 mapping.</li>
<li><code>strings.TrimSuffix(ace, ".")</code>: single-dot form.
Sufficient for most inputs but incomplete for the multi-dot
variant.</li>
<li><code>if strings.HasSuffix(ace, ".") { ace = ace[:len(ace)-1] }</code>:
manual slice form. Equivalent to <code>TrimSuffix</code> in
effect.</li>
</ul>
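<p>
The difference between the forms shows up only when more than one dot
trails the host. A minimal sketch, assuming a hypothetical mapped output
that ends in two ASCII dots (<code>trimForms</code> is an illustrative
helper, not part of any library):
</p>

```go
package main

import (
	"fmt"
	"strings"
)

// trimForms is a hypothetical helper that applies the three trim forms to
// the same input so the results can be compared side by side.
func trimForms(ace string) (multi, single, manual string) {
	multi = strings.TrimRight(ace, ".")
	single = strings.TrimSuffix(ace, ".")
	manual = ace
	if strings.HasSuffix(manual, ".") {
		manual = manual[:len(manual)-1]
	}
	return multi, single, manual
}

func main() {
	// Hypothetical mapped output ending in two ASCII dots.
	multi, single, manual := trimForms("0.1.0.0..")
	fmt.Printf("TrimRight:  %q\n", multi)
	fmt.Printf("TrimSuffix: %q\n", single)
	fmt.Printf("manual:     %q\n", manual)
}
```

<p>
With two trailing dots, only the <code>TrimRight</code> result is left in a
form that an IP parser recognizes; the other two still end in a dot.
</p>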
<p>
After trimming, call <code>netip.ParseAddr</code> (preferred) or
<code>net.ParseIP</code> on the result and reject if it parses as an IP literal.
</p>
</example>
<references>
<li>
Unicode Technical Standard #46 (IDNA Compatibility Processing):
<a href="https://www.unicode.org/reports/tr46/">https://www.unicode.org/reports/tr46/</a>
</li>
<li>
<code>golang.org/x/net/idna</code> package documentation:
<a href="https://pkg.go.dev/golang.org/x/net/idna">https://pkg.go.dev/golang.org/x/net/idna</a>
</li>
<li>
WHATWG URL Standard, <code>ends_in_a_number</code> host parser check
(prior art for IP-literal detection in URL parsers):
<a href="https://url.spec.whatwg.org/#ends-in-a-number-checker">https://url.spec.whatwg.org/#ends-in-a-number-checker</a>
</li>
<li>
CWE-918: Server-Side Request Forgery (SSRF):
<a href="https://cwe.mitre.org/data/definitions/918.html">https://cwe.mitre.org/data/definitions/918.html</a>
</li>
<li>
CWE-020: Improper Input Validation:
<a href="https://cwe.mitre.org/data/definitions/20.html">https://cwe.mitre.org/data/definitions/20.html</a>
</li>
</references>
</qhelp>