7 Responses

  1. Johann, February 12, 2010 at 9:36 pm

    I’ve been using NekoHTML lately. The syntax is a bit different, though, and I haven’t yet benchmarked whether it’s faster or slower than XmlSlurper.

    FYI, the syntax is a little like this in Groovy:


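    import static groovyx.net.http.ContentType.TEXT
    import static groovyx.net.http.Method.GET
    import org.cyberneko.html.parsers.DOMParser
    import org.w3c.dom.Document
    import org.xml.sax.InputSource

    // Fetch the page body as plain text
    String get(def uri) {
        builder.request(uri, GET, TEXT, {}).text
    }

    // Parse the fetched HTML into a W3C DOM document with NekoHTML
    Document document(def uri) {
        DOMParser parser = new DOMParser()
        parser.parse(new InputSource(new StringReader(get(uri))))
        parser.document
    }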

    (Here builder is an HTTPBuilder instance.)

    1. goran, February 16, 2011 at 7:55 pm

      Great :) This helped me right now with some simple custom HTML testing.

  2. Federico, November 21, 2010 at 9:07 am

    Thank you!

    Really useful for some web automation!

  3. Thelma Turton, June 6, 2011 at 12:15 am

    When I first commented, I ticked the "Notify me when new comments are added" checkbox, and now each time a comment is added I get four emails with the identical comment. Is there any way you can remove me from that service? Thanks!

  4. Jeremy, August 22, 2011 at 11:03 pm

    Very interesting article; it makes what I was doing with Java way shorter. But I was wondering how I would select elements by something other than a class: say, all tags after a certain class, or all bold text on the page. What I’m trying to scrape isn’t a class, unfortunately…

  5. Jeremy, August 24, 2011 at 6:42 pm

    I’ve found a partial solution to selecting other elements. You can use "it.name() == 'p'" to match all p tags, or replace 'p' with 'h1' to match all h1 tags. If anyone else knows how to select page elements more specifically, I’d still like to hear about it…
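
    For anyone following along, here is a minimal sketch of that kind of selection. It assumes the page is parsed with XmlSlurper and that the markup is well-formed; real-world HTML usually needs a tolerant parser such as NekoHTML or TagSoup plugged in first. The sample markup and the variable names are made up for illustration.

    ```groovy
    // Groovy 3+; on older versions drop this import
    // (groovy.util.XmlSlurper is available by default there).
    import groovy.xml.XmlSlurper

    // Hypothetical sample page, kept well-formed so XmlSlurper can read it directly
    def html = '''<html><body>
    <h1>Title</h1>
    <p class="intro">First <b>bold</b> paragraph</p>
    <p>Second paragraph</p>
    </body></html>'''

    def page = new XmlSlurper().parseText(html)

    // '**' walks every node depth-first; the closure filters by tag name
    def paragraphs = page.'**'.findAll { it.name() == 'p' }
    def headings   = page.'**'.findAll { it.name() == 'h1' }

    // All bold text on the page: match b tags, then collect their text
    def boldText = page.'**'.findAll { it.name() == 'b' }*.text()

    // A tag-name test can be combined with an attribute test
    def intro = page.'**'.findAll { it.name() == 'p' && it.@class == 'intro' }

    println paragraphs*.text()
    println boldText
    ```

    The same closure style extends to other conditions, for example checking it.text() for a keyword, so selection does not have to depend on a class being present.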
