Commit 0caeb21

doc: add security warnings for untrusted XSLT stylesheets
Document that XSLT stylesheet input is always treated as trusted, which is counter to Nokogiri's "untrusted by default" policy. Also use 🛡 consistently for security-related callouts in ParseOptions and SAX::Document. [skip ci]
1 parent 6f5d025 commit 0caeb21

4 files changed

Lines changed: 31 additions & 13 deletions

lib/nokogiri/xml/parse_options.rb

Lines changed: 11 additions & 7 deletions
@@ -76,12 +76,12 @@ class ParseOptions
 #
 # ⚠ This option enables entity substitution, contrary to what the name implies.
 #
-# <b>It is UNSAFE to set this option</b> when parsing untrusted documents.
+# 🛡 <b>It is UNSAFE to set this option</b> when parsing untrusted documents.
 NOENT = 1 << 1
 
 # Load external subsets. On by default for XSLT::Stylesheet.
 #
-# <b>It is UNSAFE to set this option</b> when parsing untrusted documents.
+# 🛡 <b>It is UNSAFE to set this option</b> when parsing untrusted documents.
 DTDLOAD = 1 << 2
 
 # Default DTD attributes. On by default for XSLT::Stylesheet.
@@ -111,7 +111,7 @@ class ParseOptions
 # Forbid network access. On by default for XML::Document, XML::DocumentFragment,
 # HTML4::Document, HTML4::DocumentFragment, XSLT::Stylesheet, and XML::Schema.
 #
-# <b>It is UNSAFE to unset this option</b> when parsing untrusted documents.
+# 🛡 <b>It is UNSAFE to unset this option</b> when parsing untrusted documents.
 NONET = 1 << 11
 
 # Do not reuse the context dictionary. Off by default.
@@ -128,8 +128,7 @@ class ParseOptions
 
 # Compact small text nodes. Off by default.
 #
-# ⚠ No modification of the DOM tree is allowed after parsing. libxml2 may crash if you try to
-# modify the tree.
+# ⚠ No modification of the DOM tree is allowed after parsing.
 COMPACT = 1 << 16
 
 # Parse using XML-1.0 before update 5. Off by default
@@ -140,7 +139,7 @@ class ParseOptions
 
 # Relax any hardcoded limit from the parser. Off by default.
 #
-# <b>It is UNSAFE to set this option</b> when parsing untrusted documents.
+# 🛡 <b>It is UNSAFE to set this option</b> when parsing untrusted documents.
 HUGE = 1 << 19
 
 # Support line numbers up to <code>long int</code> (default is a <code>short int</code>). On
@@ -151,7 +150,12 @@ class ParseOptions
 # The options mask used by default for parsing XML::Document and XML::DocumentFragment
 DEFAULT_XML = RECOVER | NONET | BIG_LINES
 
-# The options mask used by default used for parsing XSLT::Stylesheet
+# Shorthand options mask useful for parsing XSLT stylesheets:
+# sets RECOVER, NONET, NOENT, DTDLOAD, DTDATTR, NOCDATA, BIG_LINES.
+#
+# 🛡 This option set includes `NOENT` and `DTDLOAD` which are unsafe for untrusted
+# documents. <b>Do not parse untrusted XSLT stylesheets.</b> See Nokogiri::XSLT for more
+# information.
 DEFAULT_XSLT = RECOVER | NONET | NOENT | DTDLOAD | DTDATTR | NOCDATA | BIG_LINES
 
 # The options mask used by default used for parsing HTML4::Document and HTML4::DocumentFragment
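
To make the DEFAULT_XSLT note concrete, here is a minimal sketch that recomputes the two masks and reports which of the flagged options each one carries. It deliberately does not load Nokogiri. The bit positions for NOENT, DTDLOAD, NONET and HUGE are copied from this diff; those for RECOVER, DTDATTR, NOCDATA and BIG_LINES are assumed to match libxml2's parser option values. The names `OptionSketch` and `unsafe_flags` are hypothetical and exist only for illustration.

```ruby
# Illustrative re-creation of a few parse option bits; this is not Nokogiri's own module.
module OptionSketch
  RECOVER   = 1 << 0   # assumption: libxml2 value, not shown in the diff
  NOENT     = 1 << 1   # from the diff
  DTDLOAD   = 1 << 2   # from the diff
  DTDATTR   = 1 << 3   # assumption: libxml2 value
  NONET     = 1 << 11  # from the diff
  NOCDATA   = 1 << 14  # assumption: libxml2 value
  HUGE      = 1 << 19  # from the diff
  BIG_LINES = 1 << 22  # assumption: libxml2 value

  DEFAULT_XML  = RECOVER | NONET | BIG_LINES
  DEFAULT_XSLT = RECOVER | NONET | NOENT | DTDLOAD | DTDATTR | NOCDATA | BIG_LINES

  # Options whose comments say they are unsafe to set for untrusted documents.
  UNSAFE_WHEN_SET = { NOENT: NOENT, DTDLOAD: DTDLOAD, HUGE: HUGE }.freeze

  # Returns the names of the flagged options present in +mask+, plus
  # :NONET_UNSET when network access has not been forbidden.
  def self.unsafe_flags(mask)
    names = UNSAFE_WHEN_SET.select { |_name, bit| (mask & bit) != 0 }.keys
    names << :NONET_UNSET if (mask & NONET).zero?
    names
  end
end

p OptionSketch.unsafe_flags(OptionSketch::DEFAULT_XML)   # prints []
p OptionSketch.unsafe_flags(OptionSketch::DEFAULT_XSLT)  # prints [:NOENT, :DTDLOAD]
```

With Nokogiri loaded, the same kind of check could presumably be run against `Nokogiri::XML::ParseOptions::DEFAULT_XSLT` directly; the standalone constants above are used only so the sketch has no dependencies.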

lib/nokogiri/xml/sax/document.rb

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ module SAX
 # of ParserContext#replace_entities. (Recall that the default value of
 # ParserContext#replace_entities is `false`.)
 #
-# <b>It is UNSAFE to set ParserContext#replace_entities to `true`</b> when parsing untrusted
+# 🛡 <b>It is UNSAFE to set ParserContext#replace_entities to `true`</b> when parsing untrusted
 # documents.
 #
 # 💡 For more information on entity types, see [Wikipedia's page on

lib/nokogiri/xslt.rb

Lines changed: 11 additions & 2 deletions
@@ -10,8 +10,14 @@ def XSLT(...)
 end
 
 ###
-# See Nokogiri::XSLT::Stylesheet for creating and manipulating
-# Stylesheet object.
+# See Nokogiri::XSLT::Stylesheet for creating and manipulating Stylesheet objects.
+#
+# 🛡 <b>Do not use this module for untrusted stylesheet documents.</b> libxslt does not support
+# safely processing untrusted stylesheets. Untrusted stylesheets may access the file system and
+# network, consume large amounts of CPU, memory, or other system resources, and IO and file
+# access are not restricted. Additionally, the stylesheet is parsed by libxml2 with +NOENT+ and
+# +DTDLOAD+ enabled (see ParseOptions::DEFAULT_XSLT), meaning that <b>external entities will be
+# resolved and external subsets will be loaded</b> during parsing.
 module XSLT
 class << self
 # :call-seq:
@@ -20,6 +26,9 @@ class << self
 #
 # Parse the stylesheet in +xsl+, registering optional +modules+ as custom class handlers.
 #
+# 🛡 <b>Do not pass untrusted stylesheet content to this method.</b> See Nokogiri::XSLT for more
+# information.
+#
 # [Parameters]
 # - +xsl+ (String) XSL content to be parsed into a stylesheet
 # - +modules+ (Hash<String ⇒ Class>) A hash of URI-to-handler relations for linking a
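
Since the new warning says not to pass untrusted stylesheet content to this method, one way a caller might act on it is to accept stylesheets only from a directory the application controls. The sketch below is an assumption about caller-side code, not part of Nokogiri: `trusted_stylesheet?` is a hypothetical helper, and the directory in the usage note is a placeholder. It uses only the Ruby standard library.

```ruby
require "pathname"

# Hypothetical helper: true only if +path+ resolves to a regular file located
# inside +trusted_dir+. Symlinks and ".." segments are resolved before comparing.
def trusted_stylesheet?(path, trusted_dir)
  root = Pathname.new(trusted_dir).realpath
  file = Pathname.new(path).realpath
  file.file? && file.to_s.start_with?(root.to_s + File::SEPARATOR)
rescue SystemCallError
  # Missing files, unreadable directories, and symlink loops count as untrusted.
  false
end

# Usage, assuming Nokogiri is installed and "/srv/app/stylesheets" is a
# directory only the application can write to:
#
#   path = "/srv/app/stylesheets/report.xslt"
#   xslt = Nokogiri::XSLT(File.read(path)) if trusted_stylesheet?(path, "/srv/app/stylesheets")
```

A location check like this only establishes where a stylesheet came from. It does not make the contents safe, so it is useful only when everything in the trusted directory is authored or reviewed by the application's maintainers.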

lib/nokogiri/xslt/stylesheet.rb

Lines changed: 8 additions & 3 deletions
@@ -3,15 +3,20 @@
 module Nokogiri
 module XSLT
 ###
-# A Stylesheet represents an XSLT Stylesheet object. Stylesheet creation
-# is done through Nokogiri.XSLT. Here is an example of transforming
-# an XML::Document with a Stylesheet:
+# A Stylesheet represents an XSLT Stylesheet object. Stylesheet creation is done through
+# Nokogiri::XSLT.parse (or the convenience method Nokogiri.XSLT). Here is an example of
+# transforming an XML::Document with a Stylesheet:
 #
 # doc = Nokogiri::XML(File.read('some_file.xml'))
 # xslt = Nokogiri::XSLT(File.read('some_transformer.xslt'))
 #
 # xslt.transform(doc) # => Nokogiri::XML::Document
 #
+# 🛡 <b>This class does not support execution of untrusted stylesheets.</b> An untrusted
+# stylesheet may consume a large amount of CPU, memory, or other system resources during
+# transformation, and IO and file access are not restricted. See Nokogiri::XSLT for more
+# information about the security implications of untrusted stylesheets.
+#
 # Many XSLT transformations include serialization behavior to emit a non-XML document. For these
 # cases, please take care to invoke the #serialize method on the result of the transformation:
 #
