org.jsoup.safety.Cleaner Java Examples

The following examples show how to use org.jsoup.safety.Cleaner.
Example #1
Source File: Utilities.java    From inception with Apache License 2.0
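// Reduces a search highlight to plain text plus <mark> elements.
public static String cleanHighlight(String aHighlight) {
    // Allow only <em>; the cleaner strips all other tags and attributes.
    Whitelist wl = new Whitelist();
    wl.addTags("em");
    Document dirty = Jsoup.parseBodyFragment(aHighlight, "");
    Cleaner cleaner = new Cleaner(wl);
    Document clean = cleaner.clean(dirty);
    // Rename the surviving <em> elements to <mark>.
    clean.select("em").tagName("mark");

    return clean.body().html();
}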
 
Example #2
Source File: HerdStringUtils.java    From herd with Apache License 2.0
/**
 * Strips HTML tags from a given input String, allows some tags to be retained via a whitelist
 *
 * @param fragment the specified String
 * @param whitelistTags the specified whitelist tags
 *
 * @return cleaned String with allowed tags
 */
public static String stripHtml(String fragment, String... whitelistTags)
{
    // Unescape HTML.
    String unEscapedFragment = StringEscapeUtils.unescapeHtml4(fragment);

    // Parse out html tags except those from a given list of whitelist tags
    Document dirty = Jsoup.parseBodyFragment(unEscapedFragment);

    Whitelist whitelist = new Whitelist();

    for (String whitelistTag : whitelistTags)
    {
        // Get the actual tag name from the whitelist tag
        // this is vulnerable in general to complex tags but will suffice for our simple needs
        whitelistTag = StringUtils.removePattern(whitelistTag, "[^\\p{IsAlphabetic}]");

        // Add all specified tags to the whitelist while preserving their class attribute
        whitelist.addTags(whitelistTag).addAttributes(whitelistTag, "class");
    }

    Cleaner cleaner = new Cleaner(whitelist);
    Document clean = cleaner.clean(dirty);
    // Set character encoding to UTF-8 and make sure no line-breaks are added
    clean.outputSettings().escapeMode(Entities.EscapeMode.base).charset(StandardCharsets.UTF_8).prettyPrint(false);

    // return 'cleaned' html body
    return clean.body().html();
}
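
The tag-name step in the loop above is meant to reduce an entry such as "<em>" to a bare tag name before it is added to the whitelist. The following is a minimal standard-library sketch of that intent, not the project's code: the class name TagNameSketch and the method toTagName are hypothetical, and String.replaceAll stands in for StringUtils.removePattern on the assumption that the two behave the same for single-line input.

```java
public class TagNameSketch {

    // Hypothetical helper: removes every character that is not alphabetic,
    // so angle brackets, slashes, spaces, and digits are all dropped.
    public static String toTagName(String raw) {
        return raw.replaceAll("[^\\p{IsAlphabetic}]", "");
    }

    public static void main(String[] args) {
        System.out.println(toTagName("<em>"));   // prints "em"
        System.out.println(toTagName("</div>")); // prints "div"
        System.out.println(toTagName("<h1>"));   // prints "h" because the digit is removed too
    }
}
```

The last call illustrates the limitation the source comment warns about: any tag whose name contains a non-alphabetic character, such as h1, is not preserved by this normalization and would need a more careful pattern.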
 
Example #3
Source File: TestController.java    From BlogManagePlatform with Apache License 2.0
@RequestMapping("/escape")
public Result escapeEndPoint(@RequestParam("name") String name) {
	return new Cleaner(Whitelist.basic()).isValidBodyHtml(name) ? Result.success() : Result.fail();
}
 
Example #4
Source File: Jsoup.java    From astor with GNU General Public License v2.0
/**
 Test if the input HTML has only tags and attributes allowed by the Whitelist. Useful for form validation. The input HTML should
 still be run through the cleaner to set up enforced attributes, and to tidy the output.
 @param bodyHtml HTML to test
 @param whitelist whitelist to test against
 @return true if no tags or attributes were removed; false otherwise
 @see #clean(String, org.jsoup.safety.Whitelist) 
 */
public static boolean isValid(String bodyHtml, Whitelist whitelist) {
    Document dirty = parseBodyFragment(bodyHtml, "");
    Cleaner cleaner = new Cleaner(whitelist);
    return cleaner.isValid(dirty);
}
 
Example #5
Source File: Jsoup.java    From astor with GNU General Public License v2.0
/**
 Test if the input body HTML has only tags and attributes allowed by the Whitelist. Useful for form validation.
 <p>The input HTML should still be run through the cleaner to set up enforced attributes, and to tidy the output.
 <p>Assumes the HTML is a body fragment (i.e. will be used in an existing HTML document body.)
 @param bodyHtml HTML to test
 @param whitelist whitelist to test against
 @return true if no tags or attributes were removed; false otherwise
 @see #clean(String, org.jsoup.safety.Whitelist) 
 */
public static boolean isValid(String bodyHtml, Whitelist whitelist) {
    return new Cleaner(whitelist).isValidBodyHtml(bodyHtml);
}
 
Example #6
Source File: Jsoup.java    From jsoup-learning with MIT License
/**
 Test if the input HTML has only tags and attributes allowed by the Whitelist. Useful for form validation. The input HTML should
 still be run through the cleaner to set up enforced attributes, and to tidy the output.
 @param bodyHtml HTML to test
 @param whitelist whitelist to test against
 @return true if no tags or attributes were removed; false otherwise
 @see #clean(String, org.jsoup.safety.Whitelist) 
 */
public static boolean isValid(String bodyHtml, Whitelist whitelist) {
    Document dirty = parseBodyFragment(bodyHtml, "");
    Cleaner cleaner = new Cleaner(whitelist);
    return cleaner.isValid(dirty);
}