Nokogiri parses and searches XML and HTML very quickly, with correctly implemented CSS3 selector support as well as XPath support.
Parsing a document returns either a Nokogiri::XML::Document or a Nokogiri::HTML::Document, depending on the kind of document you parse.
Here is an example:
    require 'nokogiri'
    require 'open-uri'

    # Get a Nokogiri::HTML::Document for the page we’re interested in...
    # (URI.open is the open-uri entry point on current Rubies; a bare open()
    # no longer accepts URLs.)
    doc = Nokogiri::HTML(URI.open('http://www.google.com/search?q=tenderlove'))

    # Do funky things with it using Nokogiri::XML::Node methods...

    ####
    # Search for nodes by css
    doc.css('h3.r a.l').each do |link|
      puts link.content
    end
See Nokogiri::XML::Node#css for more information about CSS searching. See Nokogiri::XML::Node#xpath for more information about XPath searching.
The version of Nokogiri you are using
More complete version information about libxml
Parse HTML. Convenience method for Nokogiri::HTML::Document.parse
    # File lib/nokogiri/html.rb, line 12
    def HTML thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_HTML, &block
      Nokogiri::HTML::Document.parse(thing, url, encoding, options, &block)
    end
Parse a document and add the Slop decorator. The Slop decorator implements method_missing such that methods may be used instead of CSS or XPath. For example:
    doc = Nokogiri::Slop(<<-eohtml)
      <html>
        <body>
          <p>first</p>
          <p>second</p>
        </body>
      </html>
    eohtml
    assert_equal('second', doc.html.body.p[1].text)
    # File lib/nokogiri.rb, line 114
    def Slop(*args, &block)
      Nokogiri(*args, &block).slop!
    end
Parse XML. Convenience method for Nokogiri::XML::Document.parse
    # File lib/nokogiri/xml.rb, line 32
    def XML thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_XML, &block
      Nokogiri::XML::Document.parse(thing, url, encoding, options, &block)
    end
Create a Nokogiri::XSLT::Stylesheet with stylesheet.
Example:
xslt = Nokogiri::XSLT(File.read(ARGV[0]))
    # File lib/nokogiri/xslt.rb, line 12
    def XSLT stylesheet
      XSLT.parse(stylesheet)
    end
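Create a new node. If input is given, it is parsed as an HTML fragment (a Nokogiri::XML::DocumentFragment) and the first child of that fragment is returned; otherwise the block is handed to Nokogiri().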
    # File lib/nokogiri.rb, line 91
    def make input = nil, opts = {}, &blk
      if input
        Nokogiri::HTML.fragment(input).children.first
      else
        Nokogiri(&blk)
      end
    end
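Parse an HTML or XML document. string contains the document. It is parsed as HTML if it responds to read or appears to begin with an html tag; otherwise it is parsed as XML.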
    # File lib/nokogiri.rb, line 72
    def parse string, url = nil, encoding = nil, options = nil
      doc =
        if string.respond_to?(:read) ||
          string =~ /^\s*<[^Hh>]*html/ # Probably html
          Nokogiri::HTML(
            string,
            url,
            encoding, options || XML::ParseOptions::DEFAULT_HTML
          )
        else
          Nokogiri::XML(string, url, encoding,
            options || XML::ParseOptions::DEFAULT_XML)
        end
      yield doc if block_given?
      doc
    end