130 changes: 130 additions & 0 deletions go/ql/src/experimental/CWE-918/IdnaIpLiteralSmuggle.qhelp
@@ -0,0 +1,130 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>

<overview>
<p>
The Go module <code>golang.org/x/net/idna</code> implements UTS-46 IDNA
processing. With the <code>Lookup</code> profile or the
<code>MapForLookup</code> option, <code>(*Profile).ToASCII</code> applies an
NFKC-based character map that folds <strong>100 distinct non-ASCII Unicode
digit codepoints</strong> to their ASCII equivalents. They fall into the
following families:
</p>
<ul>
<li>Latin-1 superscripts (U+00B2, U+00B3, U+00B9): 3 codepoints</li>
<li>Mathematical superscripts (U+2070, U+2074..U+2079): 7 codepoints</li>
<li>Mathematical subscripts (U+2080..U+2089): 10 codepoints</li>
<li>Circled digits (U+2460..U+2468, U+24EA): 10 codepoints</li>
<li>Fullwidth digits (U+FF10..U+FF19): 10 codepoints</li>
<li>Mathematical bold, double-struck, sans-serif, sans-serif bold, and
monospace digits (U+1D7CE..U+1D7FF): 50 codepoints</li>
<li>Segmented digits (U+1FBF0..U+1FBF9): 10 codepoints</li>
</ul>
<p>
The library performs no IP-literal detection. A caller that applies UTS-46
mapping to an attacker-controlled host string can therefore get a valid ASCII
IPv4 literal back as the "domain name" output. If that output reaches a
network sink without being rechecked by an IP-literal parser, any downstream
allowlist check, SSRF guard, NoProxy match, or TLS-SNI router that does not
recheck the post-IDNA result is bypassed. A pre-IDNA
<code>net.ParseIP</code> check is not sufficient on its own: the smuggled
host is not ASCII, so the pre-IDNA check classifies it as a non-IP hostname,
and the post-IDNA value (now a numeric literal) reaches the sink unguarded.
</p>
<p>
IPv6 is out of scope: <code>:</code> is a UTS-46 disallowed character, so
bare IPv6 inputs are rejected by IDNA rune validation before any
digit-fold mapping runs.
</p>
<p>
Sinks where the smuggled literal becomes exploitable include
<code>net.JoinHostPort</code>, <code>net.Dial</code>,
<code>(*http.Request).URL.Host</code>, <code>(*tls.Config).ServerName</code>,
<code>(*http.Cookie).Domain</code>, and any HTTP client request URL
constructed from the mapped value.
</p>
</overview>

<recommendation>
<p>
Either:
</p>
<ol>
<li>
Use a strict IDNA profile option that returns an error if the mapped
output parses as an IP literal, if your IDNA library exposes one.
</li>
<li>
Apply the explicit safe pattern: after <code>idna.ToASCII</code>, trim any
trailing dot and call <code>net.ParseIP</code> (or
<code>netip.ParseAddr</code>) on the result, then reject the host if it
parses (a non-nil <code>net.IP</code>, or a nil error from
<code>netip.ParseAddr</code>). The trailing-dot trim is required because
<code>"0.¹.0.0."</code> maps to <code>"0.1.0.0."</code>, which
<code>net.ParseIP</code> does not parse as written, yet it is still an IP
literal for routing purposes.
</li>
</ol>
</recommendation>

<example>
<p>
Vulnerable pattern. <code>net.ParseIP</code> is called only before
<code>idna.ToASCII</code>, so the smuggled literal slips through:
</p>

<sample src="IdnaIpLiteralSmuggleBad.go"/>

<p>
Safe pattern. Post-IDNA trailing-dot trim followed by
<code>net.ParseIP</code> recheck:
</p>

<sample src="IdnaIpLiteralSmuggleGood.go"/>

<p>
The safe pattern accepts three trailing-dot trim forms. They are not fully
equivalent: only the first strips repeated trailing dots.
</p>
<ul>
<li><code>strings.TrimRight(ace, ".")</code>: multi-dot form. Handles
the fullwidth and ideographic dot variants that produce multiple
trailing ASCII dots after UTS-46 mapping.</li>
<li><code>strings.TrimSuffix(ace, ".")</code>: single-dot form.
Sufficient for most inputs but incomplete for the multi-dot
variant.</li>
<li><code>if strings.HasSuffix(ace, ".") { ace = ace[:len(ace)-1] }</code>:
manual slice form. Equivalent to <code>TrimSuffix</code> in
effect.</li>
</ul>
<p>
After trimming, call <code>netip.ParseAddr</code> (preferred) or
<code>net.ParseIP</code> on the result and reject if it parses as an IP literal.
</p>
</example>

<references>

<li>
Unicode Technical Standard #46 (IDNA Compatibility Processing):
<a href="https://www.unicode.org/reports/tr46/">https://www.unicode.org/reports/tr46/</a>
</li>
<li>
<code>golang.org/x/net/idna</code> package documentation:
<a href="https://pkg.go.dev/golang.org/x/net/idna">https://pkg.go.dev/golang.org/x/net/idna</a>
</li>
<li>
WHATWG URL Standard, <code>ends_in_a_number</code> host parser check
(prior art for IP-literal detection in URL parsers):
<a href="https://url.spec.whatwg.org/#ends-in-a-number-checker">https://url.spec.whatwg.org/#ends-in-a-number-checker</a>
</li>
<li>
CWE-918: Server-Side Request Forgery (SSRF):
<a href="https://cwe.mitre.org/data/definitions/918.html">https://cwe.mitre.org/data/definitions/918.html</a>
</li>
<li>
CWE-20: Improper Input Validation:
<a href="https://cwe.mitre.org/data/definitions/20.html">https://cwe.mitre.org/data/definitions/20.html</a>
</li>

</references>
</qhelp>
34 changes: 34 additions & 0 deletions go/ql/src/experimental/CWE-918/IdnaIpLiteralSmuggle.ql
@@ -0,0 +1,34 @@
/**
* @name IDNA digit-fold IP-literal smuggling via UTS-46 NFKC mapping
* @description An untrusted hostname flows through `golang.org/x/net/idna`
* mapping (which folds 100 non-ASCII Unicode digit codepoints to
* ASCII via UTS-46 NFKC) and reaches a security-relevant
* hostname sink without a post-IDNA IP-literal recheck. A
* caller that calls `net.ParseIP` only BEFORE `idna.ToASCII`
* will accept a smuggled IPv4 literal such as `"0.¹.0.0"`
* (which maps to `"0.1.0.0"`). Scope is IPv4 only because
* IPv6 colons are rejected by IDNA rune-validation before
* UTS-46 mapping runs.
* @id go/idna-ip-literal-smuggle
* @kind path-problem
* @problem.severity warning
* @security-severity 8.1
* @precision high
* @tags security
* experimental
* external/cwe/cwe-918
* external/cwe/cwe-020
* @requires codeql/go-all >= 0.6.0
*/

import go
import IdnaIpLiteralSmuggle
import Flow::PathGraph

from
Flow::PathNode source,
Flow::PathNode sink
where Flow::flowPath(source, sink)
select sink.getNode(), source, sink,
"Untrusted hostname from $@ flows through `idna.ToASCII` (which performs UTS-46 NFKC digit folding) and reaches this hostname sink without a post-IDNA `net.ParseIP` recheck (after a trailing-dot trim).",
source.getNode(), "this user-controlled value"
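The post-IDNA recheck that the alert message asks for might look roughly like the sketch below. `rejectIPLiteral` and `errIPLiteral` are hypothetical names, and the `ace` argument is assumed to be the string already returned by `idna.ToASCII`; the IDNA call itself is left out so the block depends only on the standard library. It is not the content of `IdnaIpLiteralSmuggleGood.go`, which is not shown in this diff.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
	"strings"
)

// errIPLiteral is a hypothetical sentinel error for this sketch.
var errIPLiteral = errors.New("post-IDNA host is an IP literal")

// rejectIPLiteral takes the output of an IDNA mapping step (assumed to be
// done by the caller) and returns an error if, after trailing dots are
// trimmed, it parses as an IP address. On success the host is returned
// unchanged.
func rejectIPLiteral(ace string) (string, error) {
	// TrimRight removes every trailing dot, not just one.
	trimmed := strings.TrimRight(ace, ".")
	if _, err := netip.ParseAddr(trimmed); err == nil {
		return "", errIPLiteral
	}
	return ace, nil
}

func main() {
	for _, h := range []string{"0.1.0.0", "0.1.0.0.", "0.1.0.0..", "example.com."} {
		_, err := rejectIPLiteral(h)
		fmt.Printf("%-12q rejected: %v\n", h, err != nil)
	}
}
```

Note that `netip.ParseAddr` only recognizes full dotted-quad IPv4 and IPv6 text forms. Shorthand such as `127.1` is not caught by this sketch; the WHATWG `ends_in_a_number` check listed in the qhelp references is stricter in that respect.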