Java Code Examples for org.jsoup.select.NodeTraversor

The following examples show how to use org.jsoup.select.NodeTraversor. They are extracted from open source projects. Two call styles appear: the instance form, new NodeTraversor(visitor).traverse(node), and the static form, NodeTraversor.traverse(visitor, node). Which one compiles depends on the jsoup version in use.
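Most of the examples hand a visitor to NodeTraversor and rely on it to walk the tree depth-first, calling head() when a node is first reached and tail() once its children are done. The sketch below illustrates only that callback order. It uses hypothetical stand-in types (Node, Visitor, and TraversalSketch are not jsoup classes) so it runs without jsoup on the classpath, and it recurses for brevity, which is an assumption of this sketch rather than a statement about how jsoup implements the walk.

```java
import java.util.ArrayList;
import java.util.List;

public class TraversalSketch {

    /** Hypothetical stand-in for a DOM node: a name plus ordered children. */
    public static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        public Node(String name) {
            this.name = name;
        }

        public Node add(Node child) {
            children.add(child);
            return this; // allow chaining while building a tree
        }
    }

    /** Hypothetical stand-in for a visitor with enter/leave callbacks. */
    public interface Visitor {
        void head(Node node, int depth); // called when a node is first reached
        void tail(Node node, int depth); // called after all of its children were visited
    }

    /** Depth-first walk: head, then children in order, then tail. */
    public static void traverse(Visitor visitor, Node node, int depth) {
        visitor.head(node, depth);
        for (Node child : node.children) {
            traverse(visitor, child, depth + 1);
        }
        visitor.tail(node, depth);
    }

    /** Renders the tree as nested tags; the visitor accumulates the output. */
    public static String render(Node root) {
        StringBuilder out = new StringBuilder();
        traverse(new Visitor() {
            @Override
            public void head(Node node, int depth) {
                out.append('<').append(node.name).append('>');
            }

            @Override
            public void tail(Node node, int depth) {
                out.append("</").append(node.name).append('>');
            }
        }, root, 0);
        return out.toString();
    }

    public static void main(String[] args) {
        Node root = new Node("div")
                .add(new Node("p").add(new Node("b")))
                .add(new Node("span"));
        System.out.println(render(root)); // prints <div><p><b></b></p><span></span></div>
    }
}
```

The visitor keeps its own state (here a StringBuilder) and the caller reads it back after the walk, which is the same shape as the FormattingVisitor examples that call formatter.toString() once traversal finishes.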
Example 1
Project: visual-programming   File: HtmlSerialzer.java
@Override
public _Object deserialize(Reader reader, Config config) {
	StringBuilder sb = new StringBuilder();
	char[] buff = new char[100];
	int len;
	try {
		while ((len = reader.read(buff)) > 0)
			sb.append(buff, 0, len);
	} catch (Exception e) {
		throw new RuntimeException(e);
	}

	Document doc = Jsoup.parse(sb.toString());
	JSoupHtmlNodeVisitor visitor = new JSoupHtmlNodeVisitor();
	NodeTraversor traversor = new NodeTraversor(visitor);
	traversor.traverse(doc);
	return visitor.getObject();
}
 
Example 2
Project: generator-thundr-gae-react   File: HtmlFormattingUtil.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
private static String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor.traverse(formatter, element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 3
Project: Saber-Bot   File: HTMLStripper.java
/**
 * removes HTML tags from a google calendar event's description
 * @param description  an event description possibly containing HTML tags
 * @return an event description free of HTML tags
 */
public static String cleanDescription(String description)
{
    FormattingVisitor formatter = new FormattingVisitor();
    new NodeTraversor(formatter).traverse(Jsoup.parse(description));
    return formatter.toString();
}
 
Example 4
Project: eclipse.jdt.ls   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
	FormattingVisitor formatter = new FormattingVisitor();
	NodeTraversor traversor = new NodeTraversor(formatter);
	traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node

	return formatter.toString();
}
 
Example 5
Project: common   File: W3CDom.java
/**
 * Converts a jsoup document into the provided W3C Document. If required, you can set options on the output document
 * before converting.
 * @param in jsoup doc
 * @param out w3c doc
 * @see org.jsoup.helper.W3CDom#fromJsoup(org.jsoup.nodes.Document)
 */
public void convert(org.jsoup.nodes.Document in, Document out) {
    if (!StringUtil.isBlank(in.location()))
        out.setDocumentURI(in.location());

    org.jsoup.nodes.Element rootEl = in.child(0); // skip the #root node
    NodeTraversor traversor = new NodeTraversor(new W3CBuilder(out));
    traversor.traverse(rootEl);
}
 
Example 6
Project: common   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor traversor = new NodeTraversor(formatter);
    traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 7
Project: common   File: Node.java
/**
 * Perform a depth-first traversal through this node and its descendants.
 * @param nodeVisitor the visitor callbacks to perform on each node
 * @return this node, for chaining
 */
public Node traverse(NodeVisitor nodeVisitor) {
    Validate.notNull(nodeVisitor);
    NodeTraversor traversor = new NodeTraversor(nodeVisitor);
    traversor.traverse(this);
    return this;
}
 
Example 8
Project: cloud-meter   File: JsoupBasedHtmlParser.java
@Override
public Iterator<URL> getEmbeddedResourceURLs(String userAgent, byte[] html, URL baseUrl,
        URLCollection coll, String encoding) throws HTMLParseException {
    try {
        // TODO Handle conditional comments for IE
        String contents = new String(html, encoding);
        Document doc = Jsoup.parse(contents);
        JMeterNodeVisitor nodeVisitor = new JMeterNodeVisitor(new URLPointer(baseUrl), coll);
        new NodeTraversor(nodeVisitor).traverse(doc);
        return coll.iterator();
    } catch (Exception e) {
        throw new HTMLParseException(e);
    }
}
 
Example 9
Project: aMatch   File: QuestionSearch.java
/**
 * Format an Element to plain-text
 *
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor traversor = new NodeTraversor(formatter);
    traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 10
Project: gestock   File: W3CDom.java
/**
 * Converts a jsoup document into the provided W3C Document. If required, you can set options on the output document
 * before converting.
 * @param in jsoup doc
 * @param out w3c doc
 * @see org.jsoup.helper.W3CDom#fromJsoup(org.jsoup.nodes.Document)
 */
public void convert(org.jsoup.nodes.Document in, Document out) {
    if (!StringUtil.isBlank(in.location()))
        out.setDocumentURI(in.location());

    org.jsoup.nodes.Element rootEl = in.child(0); // skip the #root node
    NodeTraversor traversor = new NodeTraversor(new W3CBuilder(out));
    traversor.traverse(rootEl);
}
 
Example 11
Project: gestock   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor traversor = new NodeTraversor(formatter);
    traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 12
Project: gestock   File: Node.java
/**
 * Perform a depth-first traversal through this node and its descendants.
 * @param nodeVisitor the visitor callbacks to perform on each node
 * @return this node, for chaining
 */
public Node traverse(NodeVisitor nodeVisitor) {
    Validate.notNull(nodeVisitor);
    NodeTraversor traversor = new NodeTraversor(nodeVisitor);
    traversor.traverse(this);
    return this;
}
 
Example 13
Project: Genji   File: Html2LaTeX.java
/**
 * Format an Element to LaTeX
 *
 * @param element
 *            the root element to format
 * @return formatted text
 */
public static String getLatexText(Element element) {
	FormattingVisitor formatter = new FormattingVisitor();
	NodeTraversor traversor = new NodeTraversor(formatter);
	// walk the DOM, and call .head() and .tail() for each node
	traversor.traverse(element);

	return formatter.toString();
}
 
Example 14
Project: CN1ML-NetbeansModule   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor traversor = new NodeTraversor(formatter);
    traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 15
Project: CN1ML-NetbeansModule   File: Node.java
/**
 * Perform a depth-first traversal through this node and its descendants.
 * @param nodeVisitor the visitor callbacks to perform on each node
 * @return this node, for chaining
 */
public Node traverse(NodeVisitor nodeVisitor) {
    Validate.notNull(nodeVisitor);
    NodeTraversor traversor = new NodeTraversor(nodeVisitor);
    traversor.traverse(this);
    return this;
}
 
Example 16
Project: astor   File: W3CDom.java
/**
 * Converts a jsoup document into the provided W3C Document. If required, you can set options on the output document
 * before converting.
 * @param in jsoup doc
 * @param out w3c doc
 * @see org.jsoup.helper.W3CDom#fromJsoup(org.jsoup.nodes.Document)
 */
public void convert(org.jsoup.nodes.Document in, Document out) {
    if (!StringUtil.isBlank(in.location()))
        out.setDocumentURI(in.location());

    org.jsoup.nodes.Element rootEl = in.child(0); // skip the #root node
    NodeTraversor traversor = new NodeTraversor(new W3CBuilder(out));
    traversor.traverse(rootEl);
}
 
Example 17
Project: astor   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor traversor = new NodeTraversor(formatter);
    traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 18
Project: astor   File: Node.java
/**
 * Perform a depth-first traversal through this node and its descendants.
 * @param nodeVisitor the visitor callbacks to perform on each node
 * @return this node, for chaining
 */
public Node traverse(NodeVisitor nodeVisitor) {
    Validate.notNull(nodeVisitor);
    NodeTraversor traversor = new NodeTraversor(nodeVisitor);
    traversor.traverse(this);
    return this;
}
 
Example 19
Project: astor   File: W3CDom.java
/**
 * Converts a jsoup document into the provided W3C Document. If required, you can set options on the output document
 * before converting.
 * @param in jsoup doc
 * @param out w3c doc
 * @see org.jsoup.helper.W3CDom#fromJsoup(org.jsoup.nodes.Document)
 */
public void convert(org.jsoup.nodes.Document in, Document out) {
    if (!StringUtil.isBlank(in.location()))
        out.setDocumentURI(in.location());

    org.jsoup.nodes.Element rootEl = in.child(0); // skip the #root node
    NodeTraversor.traverse(new W3CBuilder(out), rootEl);
}
 
Example 20
Project: astor   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor.traverse(formatter, element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 21
Project: astor   File: W3CDom.java
/**
 * Converts a jsoup document into the provided W3C Document. If required, you can set options on the output document
 * before converting.
 * @param in jsoup doc
 * @param out w3c doc
 * @see org.jsoup.helper.W3CDom#fromJsoup(org.jsoup.nodes.Document)
 */
public void convert(org.jsoup.nodes.Document in, Document out) {
    if (!StringUtil.isBlank(in.location()))
        out.setDocumentURI(in.location());

    org.jsoup.nodes.Element rootEl = in.child(0); // skip the #root node
    NodeTraversor.traverse(new W3CBuilder(out), rootEl);
}
 
Example 22
Project: astor   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor.traverse(formatter, element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 23
Project: BoL-API-Parser   File: Node.java
/**
 * Perform a depth-first traversal through this node and its descendants.
 * @param nodeVisitor the visitor callbacks to perform on each node
 * @return this node, for chaining
 */
public Node traverse(NodeVisitor nodeVisitor) {
    Validate.notNull(nodeVisitor);
    NodeTraversor traversor = new NodeTraversor(nodeVisitor);
    traversor.traverse(this);
    return this;
}
 
Example 24
Project: AngelList-Mobile   File: Node.java
/**
 * Perform a depth-first traversal through this node and its descendants.
 * @param nodeVisitor the visitor callbacks to perform on each node
 * @return this node, for chaining
 */
public Node traverse(NodeVisitor nodeVisitor) {
    Validate.notNull(nodeVisitor);
    NodeTraversor traversor = new NodeTraversor(nodeVisitor);
    traversor.traverse(this);
    return this;
}
 
Example 25
Project: jsoup-learning   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor traversor = new NodeTraversor(formatter);
    traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 26
Project: jsoup-learning   File: Node.java
/**
 * Perform a depth-first traversal through this node and its descendants.
 * @param nodeVisitor the visitor callbacks to perform on each node
 * @return this node, for chaining
 */
public Node traverse(NodeVisitor nodeVisitor) {
    Validate.notNull(nodeVisitor);
    NodeTraversor traversor = new NodeTraversor(nodeVisitor);
    traversor.traverse(this);
    return this;
}
 
Example 27
Project: idylfin   File: HtmlToPlainText.java
/**
 * Format an Element to plain-text
 * @param element the root element to format
 * @return formatted text
 */
public String getPlainText(Element element) {
    FormattingVisitor formatter = new FormattingVisitor();
    NodeTraversor traversor = new NodeTraversor(formatter);
    traversor.traverse(element); // walk the DOM, and call .head() and .tail() for each node

    return formatter.toString();
}
 
Example 28
Project: idylfin   File: Node.java
/**
 * Perform a depth-first traversal through this node and its descendants.
 * @param nodeVisitor the visitor callbacks to perform on each node
 * @return this node, for chaining
 */
public Node traverse(NodeVisitor nodeVisitor) {
    Validate.notNull(nodeVisitor);
    NodeTraversor traversor = new NodeTraversor(nodeVisitor);
    traversor.traverse(this);
    return this;
}
 
Example 29
Project: apache-jmeter-2.10   File: JsoupBasedHtmlParser.java
@Override
public Iterator<URL> getEmbeddedResourceURLs(byte[] html, URL baseUrl,
        URLCollection coll, String encoding) throws HTMLParseException {
    try {
        String contents = new String(html, encoding);
        Document doc = Jsoup.parse(contents);
        JMeterNodeVisitor nodeVisitor = new JMeterNodeVisitor(new URLPointer(baseUrl), coll);
        new NodeTraversor(nodeVisitor).traverse(doc);
        return coll.iterator();
    } catch (Exception e) {
        throw new HTMLParseException(e);
    }
}
 
Example 30
Project: elasticsearch-river-remote   File: GetSitemapHtmlClient.java
protected static String convertNodeToText(Node node) {
	if (node == null)
		return "";
	StringBuilder buffer = new StringBuilder();
	new NodeTraversor(new ToTextNodeVisitor(buffer)).traverse(node);
	return buffer.toString().trim();
}
 
Example 31
Project: elasticsearch-river-remote   File: GetSitemapHtmlClient.java
protected static String convertElementsToText(Elements elements) {
	if (elements == null || elements.isEmpty())
		return "";
	StringBuilder buffer = new StringBuilder();
	NodeTraversor nt = new NodeTraversor(new ToTextNodeVisitor(buffer));
	for (Element element : elements) {
		nt.traverse(element);
	}
	return buffer.toString().trim();
}
 
Example 32
Project: karma-exchange   File: HtmlUtil.java
/**
 * Convert an HTML string to a plain-text string.
 *
 * @param htmlStr a string containing HTML markup
 * @return formatted text
 */
public static String toPlainText(String htmlStr) {
  Document doc = Jsoup.parse(htmlStr);
  PlainTextFormattingVisitor formatter = new PlainTextFormattingVisitor();
  NodeTraversor traversor = new NodeTraversor(formatter);
  traversor.traverse(doc); // walk the DOM, and call .head() and .tail() for each node
  return formatter.toString();
}
 
Example 33
Project: q-mail   File: HeadCleaner.java
private void copySafeNodes(Element source, Element destination) {
    CleaningVisitor cleaningVisitor = new CleaningVisitor(source, destination);
    NodeTraversor traversor = new NodeTraversor(cleaningVisitor);
    traversor.traverse(source);
}
 
Example 34
Project: common   File: Cleaner.java
private int copySafeNodes(Element source, Element dest) {
    CleaningVisitor cleaningVisitor = new CleaningVisitor(source, dest);
    NodeTraversor traversor = new NodeTraversor(cleaningVisitor);
    traversor.traverse(source);
    return cleaningVisitor.numDiscarded;
}
 
Example 35
Project: common   File: Node.java
protected void outerHtml(Appendable accum) {
    new NodeTraversor(new OuterHtmlVisitor(accum, getOutputSettings())).traverse(this);
}
 
Example 36
Project: gestock   File: Cleaner.java
private int copySafeNodes(Element source, Element dest) {
    CleaningVisitor cleaningVisitor = new CleaningVisitor(source, dest);
    NodeTraversor traversor = new NodeTraversor(cleaningVisitor);
    traversor.traverse(source);
    return cleaningVisitor.numDiscarded;
}
 
Example 37
Project: gestock   File: Node.java
protected void outerHtml(StringBuilder accum) {
    new NodeTraversor(new OuterHtmlVisitor(accum, getOutputSettings())).traverse(this);
}
 
Example 38
Project: CN1ML-NetbeansModule   File: Cleaner.java
private int copySafeNodes(Element source, Element dest) {
    CleaningVisitor cleaningVisitor = new CleaningVisitor(source, dest);
    NodeTraversor traversor = new NodeTraversor(cleaningVisitor);
    traversor.traverse(source);
    return cleaningVisitor.numDiscarded;
}
 
Example 39
Project: CN1ML-NetbeansModule   File: Node.java
protected void outerHtml(StringBuilder accum) {
    new NodeTraversor(new OuterHtmlVisitor(accum, getOutputSettings())).traverse(this);
}
 
Example 40
Project: astor   File: Cleaner.java
private int copySafeNodes(Element source, Element dest) {
    CleaningVisitor cleaningVisitor = new CleaningVisitor(source, dest);
    NodeTraversor traversor = new NodeTraversor(cleaningVisitor);
    traversor.traverse(source);
    return cleaningVisitor.numDiscarded;
}