class Nokogiri::XML::SAX::Parser
This parser is a SAX-style parser that reads its input as it deems necessary. The parser takes a Nokogiri::XML::SAX::Document and an optional encoding; then, given an XML input, it sends messages to the Nokogiri::XML::SAX::Document.
Here is an example of using this parser:
  # Create a subclass of Nokogiri::XML::SAX::Document and implement
  # the events we care about:
  class MyHandler < Nokogiri::XML::SAX::Document
    def start_element name, attrs = []
      puts "starting: #{name}"
    end

    def end_element name
      puts "ending: #{name}"
    end
  end

  parser = Nokogiri::XML::SAX::Parser.new(MyHandler.new)

  # Hand an IO object to the parser, which will read the XML from the IO.
  File.open(path_to_xml) do |f|
    parser.parse(f)
  end
For more information about SAX parsers, see Nokogiri::XML::SAX.
Also see Nokogiri::XML::SAX::Document for the available events.
For HTML documents, use the subclass Nokogiri::HTML4::SAX::Parser.
Attributes
document: The Nokogiri::XML::SAX::Document where events will be sent.
encoding: The encoding being used for this document.
Public Class Methods
Create a new Parser.
- Parameters
  - handler (optional Nokogiri::XML::SAX::Document) The document that will receive events. Will create a new Nokogiri::XML::SAX::Document if not given, which is accessible through the document attribute.
  - encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input. (default nil for auto-detection)
  # File lib/nokogiri/xml/sax/parser.rb, line 95
  def initialize(doc = Nokogiri::XML::SAX::Document.new, encoding = nil)
    @encoding = encoding
    @document = doc
    @warned = false

    initialize_native unless Nokogiri.jruby?
  end
Public Instance Methods
Parse the input, sending events to the SAX::Document at document.
- Parameters
  - input (String, IO) The input to parse.
If input quacks like a readable IO object, this method forwards to Parser.parse_io, otherwise it forwards to Parser.parse_memory.
- Yields
  - If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.
  # File lib/nokogiri/xml/sax/parser.rb, line 119
  def parse(input, &block)
    if input.respond_to?(:read) && input.respond_to?(:close)
      parse_io(input, &block)
    else
      parse_memory(input, &block)
    end
  end
Parse a file.
- Parameters
  - filename (String) The path to the file to be parsed.
  - encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default encoding)
- Yields
  - If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.
  # File lib/nokogiri/xml/sax/parser.rb, line 187
  def parse_file(filename, encoding = @encoding)
    raise ArgumentError, "no filename provided" unless filename
    raise Errno::ENOENT unless File.exist?(filename)
    raise Errno::EISDIR if File.directory?(filename)

    ctx = related_class("ParserContext").file(filename, encoding)
    yield ctx if block_given?
    ctx.parse_with(self)
  end
Parse an input stream.
- Parameters
  - io (IO) The readable IO object from which to read input.
  - encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default encoding)
- Yields
  - If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.
  # File lib/nokogiri/xml/sax/parser.rb, line 143
  def parse_io(io, encoding = @encoding)
    ctx = related_class("ParserContext").io(io, encoding)
    yield ctx if block_given?
    ctx.parse_with(self)
  end
Parse an input string.
- Parameters
  - input (String) The input string to be parsed.
  - encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default encoding)
- Yields
  - If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.
  # File lib/nokogiri/xml/sax/parser.rb, line 165
  def parse_memory(input, encoding = @encoding)
    ctx = related_class("ParserContext").memory(input, encoding)
    yield ctx if block_given?
    ctx.parse_with(self)
  end
Private Instance Methods
static VALUE
noko_xml_sax_parser__initialize_native(VALUE self)
{
  xmlSAXHandlerPtr handler = noko_xml_sax_parser_unwrap(self);

  handler->startDocument = noko_xml_sax_parser_start_document_callback;
  handler->endDocument = noko_xml_sax_parser_end_document_callback;
  handler->startElement = noko_xml_sax_parser_start_element_callback;
  handler->endElement = noko_xml_sax_parser_end_element_callback;
  handler->startElementNs = noko_xml_sax_parser_start_element_ns_callback;
  handler->endElementNs = noko_xml_sax_parser_end_element_ns_callback;
  handler->characters = noko_xml_sax_parser_characters_callback;
  handler->comment = noko_xml_sax_parser_comment_callback;
  handler->warning = noko_xml_sax_parser_warning_callback;
  handler->error = noko_xml_sax_parser_error_callback;
  handler->cdataBlock = noko_xml_sax_parser_cdata_block_callback;
  handler->processingInstruction = noko_xml_sax_parser_processing_instruction_callback;
  handler->reference = noko_xml_sax_parser_reference_callback;

  /* use some of libxml2's default callbacks to manage DTDs and entities */
  handler->getEntity = xmlSAX2GetEntity;
  handler->internalSubset = xmlSAX2InternalSubset;
  handler->externalSubset = xmlSAX2ExternalSubset;
  handler->isStandalone = xmlSAX2IsStandalone;
  handler->hasInternalSubset = xmlSAX2HasInternalSubset;
  handler->hasExternalSubset = xmlSAX2HasExternalSubset;
  handler->resolveEntity = xmlSAX2ResolveEntity;
  handler->getParameterEntity = xmlSAX2GetParameterEntity;
  handler->entityDecl = xmlSAX2EntityDecl;
  handler->unparsedEntityDecl = xmlSAX2UnparsedEntityDecl;

  handler->initialized = XML_SAX2_MAGIC;

  return self;
}