org.apache.commons.io.TaggedIOException Java Examples

The following examples show how to use org.apache.commons.io.TaggedIOException.
Example #1
Source File: PrintStreamSpewer.java    From extract with MIT License
@Override
public void write(final TikaDocument tikaDocument) throws IOException {
	if (outputMetadata) {
		writeMetadata(tikaDocument);
	}

	// A PrintStream should never throw an IOException: the exception would always come from the input stream.
	// There's no need to use a TaggedOutputStream or catch IOExceptions.
	copy(tikaDocument.getReader(), stream);

	// Add an extra newline to signify the end of the text.
	stream.println();

	if (stream.checkError()) {
		throw new TaggedIOException(new IOException("Error writing to print stream."), this);
	}

	// Write out child documents, if any.
	for (EmbeddedTikaDocument embed: tikaDocument.getEmbeds()) {
		write(embed);
	}
}
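
The check on stream.checkError() above relies on the fact that java.io.PrintStream swallows IOExceptions and records them in an internal error flag instead. A minimal JDK-only sketch of that behaviour follows; the class PrintStreamErrorDemo and its FailingOutputStream helper are hypothetical names invented for illustration and are not part of the extract project.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

final class PrintStreamErrorDemo {

    /** Hypothetical sink that fails on every write, used only to provoke an error. */
    static final class FailingOutputStream extends OutputStream {
        @Override
        public void write(final int b) throws IOException {
            throw new IOException("simulated write failure");
        }
    }

    /** Prints one line to the target and reports whether the PrintStream recorded an error. */
    static boolean printAndCheck(final OutputStream target) {
        final PrintStream stream = new PrintStream(target, true);
        // println never throws, even when the underlying write fails.
        stream.println("hello");
        // checkError flushes the stream and returns the internal error flag.
        return stream.checkError();
    }
}
```

Because the failure surfaces only as a boolean, the original cause is lost, which is why the example above has to construct a fresh IOException before tagging it.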
 
Example #2
Source File: Extractor.java    From extract with MIT License
/**
 * Convert the given {@link Exception} into an {@link ExtractionStatus} for addition to a report.
 *
 * Logs an appropriate message depending on the exception.
 *
 * @param e the exception to convert and log
 * @param spewer the spewer whose tag identifies exceptions raised while saving output
 * @return the resulting status
 */
private ExtractionStatus status(final Exception e, final Spewer spewer) {
	if (TaggedIOException.isTaggedWith(e, spewer)) {
		return ExtractionStatus.FAILURE_NOT_SAVED;
	}

	if (TaggedIOException.isTaggedWith(e, MetadataTransformer.class)) {
		return ExtractionStatus.FAILURE_NOT_PARSED;
	}

	if (e instanceof FileNotFoundException) {
		return ExtractionStatus.FAILURE_NOT_FOUND;
	}

	if (!(e instanceof IOException)) {
		return ExtractionStatus.FAILURE_UNKNOWN;
	}

	final Throwable cause = e.getCause();

	if (cause instanceof EncryptedDocumentException) {
		return ExtractionStatus.FAILURE_NOT_DECRYPTED;
	}

	// TIKA-198: IOExceptions thrown by parsers will be wrapped in a TikaException.
	// This helps us differentiate input stream exceptions from output stream exceptions.
	// https://issues.apache.org/jira/browse/TIKA-198
	if (cause instanceof TikaException) {
		return ExtractionStatus.FAILURE_NOT_PARSED;
	}

	return ExtractionStatus.FAILURE_UNREADABLE;
}
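
The method above checks the tag before it looks at the exception type, because a tagged exception is itself an IOException and would otherwise fall through to the generic branches. The following JDK-only sketch mirrors that ordering under simplifying assumptions: the Tagged class is a hypothetical stand-in for TaggedIOException, and the Status enum is a reduced, invented version of ExtractionStatus.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Serializable;

final class StatusSketch {

    /** Reduced, hypothetical set of outcomes. */
    enum Status { NOT_SAVED, NOT_FOUND, UNKNOWN, UNREADABLE }

    /** Hypothetical stand-in for a tagged exception: an IOException that carries a tag. */
    static final class Tagged extends IOException {
        private static final long serialVersionUID = 1L;
        private final Serializable tag;

        Tagged(final IOException cause, final Serializable tag) {
            super(cause.getMessage(), cause);
            this.tag = tag;
        }

        /** True only if the throwable is a Tagged instance carrying an equal tag. */
        static boolean isTaggedWith(final Throwable t, final Object tag) {
            return tag != null && t instanceof Tagged && tag.equals(((Tagged) t).tag);
        }
    }

    /** Checks the tag first, then falls back to the exception type. */
    static Status classify(final Exception e, final Serializable writerTag) {
        if (Tagged.isTaggedWith(e, writerTag)) {
            return Status.NOT_SAVED;
        }
        if (e instanceof FileNotFoundException) {
            return Status.NOT_FOUND;
        }
        if (!(e instanceof IOException)) {
            return Status.UNKNOWN;
        }
        return Status.UNREADABLE;
    }
}
```

An exception tagged by some other component is treated like any untagged IOException here, so only failures attributed to the given writer are reported as not saved.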
 
Example #3
Source File: Extractor.java    From extract with MIT License
/**
 * Extract and spew content from a document. This method is the same as {@link #extract(Path, Spewer)} with
 * the exception that the document will be skipped if the reporter returns {@literal true} for a call to
 * {@link Reporter#skip(Path)}.
 *
 * If the document is not skipped, then the result of the extraction is passed to the reporter in a call to
 * {@link Reporter#save(Path, ExtractionStatus, Exception)}.
 *
 * @param path document to extract from
 * @param spewer endpoint to write to
 * @param reporter used to check whether the document should be skipped and save extraction status
 */
public void extract(final Path path, final Spewer spewer, final Reporter reporter) {
	Objects.requireNonNull(reporter);

	if (reporter.skip(path)) {
		logger.info(String.format("File already extracted; skipping: \"%s\".", path));
		return;
	}

	ExtractionStatus status = ExtractionStatus.SUCCESS;
	Exception exception = null;

	try {
		extract(path, spewer);
	} catch (final Exception e) {
		status = status(e, spewer);
		log(e, status, path);
		exception = e;
	}

	// For tagged IO exceptions, discard the tag, which is either unwanted or not serializable.
	if (exception instanceof TaggedIOException) {
		exception = ((TaggedIOException) exception).getCause();
	}

	reporter.save(path, status, exception);
}
 
Example #4
Source File: FileSpewer.java    From extract with MIT License
private void writeMetadata(final TikaDocument tikaDocument) throws IOException {
	final Metadata metadata = tikaDocument.getMetadata();
	Path outputPath = getOutputPath(tikaDocument);
	outputPath = outputPath.getFileSystem().getPath(outputPath.toString() + ".json");

	logger.info(String.format("Outputting metadata to file: \"%s\".", outputPath));

	try (final JsonGenerator jsonGenerator = new JsonFactory().createGenerator(outputPath.toFile(),
			JsonEncoding.UTF8)) {
		jsonGenerator.useDefaultPrettyPrinter();
		jsonGenerator.writeStartObject();

		new MetadataTransformer(metadata, fields).transform(jsonGenerator::writeStringField, (name, values)-> {
			// writeArrayFieldStart writes the field name and opens the array in one call.
			jsonGenerator.writeArrayFieldStart(name);

			for (String value: values) {
				jsonGenerator.writeString(value);
			}

			jsonGenerator.writeEndArray();
		});

		jsonGenerator.writeEndObject();
		jsonGenerator.writeRaw('\n');
	} catch (IOException e) {
		// Keep the original exception as the cause so the underlying failure is not lost.
		throw new TaggedIOException(new IOException("Unable to output JSON.", e), this);
	}
}
 
Example #5
Source File: FileSpewer.java    From extract with MIT License
@Override
public void write(final TikaDocument tikaDocument) throws IOException {
	final Path outputPath = getOutputPath(tikaDocument);

	// Add the output extension.
	Path contentsOutputPath;
	if (null != outputExtension) {
		contentsOutputPath = outputPath.getFileSystem().getPath(outputPath.toString() + "." + outputExtension);
	} else {
		contentsOutputPath = outputPath;
	}

	logger.info(String.format("Outputting to file: \"%s\".", contentsOutputPath));

	// Make the required directories.
	final Path outputParent = contentsOutputPath.getParent();
	if (null != outputParent) {
		final File outputFileParent = outputParent.toFile();
		final boolean madeDirs = outputFileParent.mkdirs();

		// File#mkdirs also returns false if the directory already exists, so only fail if it is still missing.
		if (!madeDirs && !outputFileParent.isDirectory()) {
			throw new TaggedIOException(new IOException(String.format("Unable to make directories for file: \"%s\".",
					contentsOutputPath)), this);
		}
	}

	TaggedOutputStream tagged = null;

	// #copy buffers the input so there's no need to use an output buffer.
	try (final OutputStream output = Files.newOutputStream(contentsOutputPath)) {
		tagged = new TaggedOutputStream(output);
		copy(tikaDocument.getReader(), tagged);
	} catch (IOException e) {
		if (null != tagged && tagged.isCauseOf(e)) {
			throw new TaggedIOException(new IOException(String.format("Error writing output to file: \"%s\".",
					contentsOutputPath), e), this);
		} else {
			throw e;
		}
	}

	if (outputMetadata) {
		writeMetadata(tikaDocument);
	}
}
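
The try/catch above wraps only the output side, so that a failure during the copy can be attributed to either the reader or the writer. A minimal JDK-only sketch of that idea follows. TaggingOutputStream and WriteSideException are hypothetical stand-ins for commons-io's TaggedOutputStream and TaggedIOException, written here only to show the mechanism, and the string results are invented for the demonstration.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.UUID;

final class CopySketch {

    /** Hypothetical marker exception that remembers which wrapper raised it. */
    static final class WriteSideException extends IOException {
        private static final long serialVersionUID = 1L;
        final UUID tag;

        WriteSideException(final IOException cause, final UUID tag) {
            super(cause.getMessage(), cause);
            this.tag = tag;
        }
    }

    /** Wraps an output stream so that IOExceptions thrown while writing carry this wrapper's tag. */
    static final class TaggingOutputStream extends FilterOutputStream {
        private final UUID tag = UUID.randomUUID();

        TaggingOutputStream(final OutputStream out) {
            super(out);
        }

        @Override
        public void write(final int b) throws IOException {
            try {
                out.write(b);
            } catch (final IOException e) {
                throw new WriteSideException(e, tag);
            }
        }

        @Override
        public void write(final byte[] b, final int off, final int len) throws IOException {
            try {
                out.write(b, off, len);
            } catch (final IOException e) {
                throw new WriteSideException(e, tag);
            }
        }

        /** True if the throwable was raised by this particular wrapper. */
        boolean isCauseOf(final Throwable t) {
            return t instanceof WriteSideException && tag.equals(((WriteSideException) t).tag);
        }
    }

    /** Copies in to out and reports which side failed, if any. */
    static String copy(final InputStream in, final OutputStream out) {
        final TaggingOutputStream tagged = new TaggingOutputStream(out);
        final byte[] buffer = new byte[8192];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                tagged.write(buffer, 0, n);
            }
            return "ok";
        } catch (final IOException e) {
            return tagged.isCauseOf(e) ? "write failed" : "read failed";
        }
    }
}
```

Each wrapper gets its own random tag, so an exception raised by a different wrapped stream in the same call chain is not mistaken for a write failure of this one.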
 
Example #6
Source File: TaggedOutputStream.java    From lams with GNU General Public License v2.0
/**
 * Tags any IOExceptions thrown, wrapping and re-throwing.
 *
 * @param e The IOException thrown
 * @throws IOException if an I/O error occurs
 */
@Override
protected void handleIOException(final IOException e) throws IOException {
    throw new TaggedIOException(e, tag);
}
 
Example #7
Source File: TaggedInputStream.java    From lams with GNU General Public License v2.0
/**
 * Tags any IOExceptions thrown, wrapping and re-throwing.
 * 
 * @param e The IOException thrown
 * @throws IOException if an I/O error occurs
 */
@Override
protected void handleIOException(final IOException e) throws IOException {
    throw new TaggedIOException(e, tag);
}
 
Example #8
Source File: TaggedInputStream.java    From lams with GNU General Public License v2.0
/**
 * Re-throws the original exception thrown by this stream. This method
 * first checks whether the given exception is a {@link TaggedIOException}
 * wrapper created by this decorator, and then unwraps and throws the
 * original wrapped exception. Returns normally if the exception was
 * not thrown by this stream.
 *
 * @param throwable an exception
 * @throws IOException original exception, if any, thrown by this stream
 */
public void throwIfCauseOf(final Throwable throwable) throws IOException {
    TaggedIOException.throwCauseIfTaggedWith(throwable, tag);
}
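
The Javadoc above describes a two-step contract: rethrow the original exception if the wrapper carries this stream's tag, otherwise return normally. A minimal JDK-only sketch of that contract follows. UnwrapSketch and its nested Tagged class are hypothetical stand-ins written for illustration and are not the commons-io source.

```java
import java.io.IOException;
import java.io.Serializable;

final class UnwrapSketch {

    /** Hypothetical stand-in for a tagged exception: wraps an IOException together with a tag. */
    static final class Tagged extends IOException {
        private static final long serialVersionUID = 1L;
        private final Serializable tag;

        Tagged(final IOException original, final Serializable tag) {
            super(original.getMessage(), original);
            this.tag = tag;
        }
    }

    /**
     * Rethrows the wrapped exception if the throwable carries the given tag.
     * Returns normally for untagged throwables and for foreign tags.
     */
    static void throwCauseIfTaggedWith(final Throwable t, final Object tag) throws IOException {
        if (tag != null && t instanceof Tagged && tag.equals(((Tagged) t).tag)) {
            // The constructor only accepts an IOException as cause, so this cast is safe.
            throw (IOException) t.getCause();
        }
    }
}
```

A caller would typically invoke this inside a catch block and, if it returns normally, continue handling the exception as one that came from somewhere else.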
 
Example #9
Source File: TaggedInputStream.java    From lams with GNU General Public License v2.0
/**
 * Tests if the given exception was caused by this stream.
 *
 * @param exception an exception
 * @return {@code true} if the exception was thrown by this stream,
 *         {@code false} otherwise
 */
public boolean isCauseOf(final Throwable exception) {
    return TaggedIOException.isTaggedWith(exception, tag);
}
 
Example #10
Source File: TaggedOutputStream.java    From aion-germany with GNU General Public License v3.0
/**
 * Tests if the given exception was caused by this stream.
 *
 * @param exception an exception
 * @return {@code true} if the exception was thrown by this stream,
 *         {@code false} otherwise
 */
public boolean isCauseOf(Exception exception) {
    return TaggedIOException.isTaggedWith(exception, tag);
}
 
Example #11
Source File: TaggedOutputStream.java    From lams with GNU General Public License v2.0
/**
 * Re-throws the original exception thrown by this stream. This method
 * first checks whether the given exception is a {@link TaggedIOException}
 * wrapper created by this decorator, and then unwraps and throws the
 * original wrapped exception. Returns normally if the exception was
 * not thrown by this stream.
 *
 * @param exception an exception
 * @throws IOException original exception, if any, thrown by this stream
 */
public void throwIfCauseOf(final Exception exception) throws IOException {
    TaggedIOException.throwCauseIfTaggedWith(exception, tag);
}
 
Example #12
Source File: TaggedOutputStream.java    From lams with GNU General Public License v2.0
/**
 * Tests if the given exception was caused by this stream.
 *
 * @param exception an exception
 * @return {@code true} if the exception was thrown by this stream,
 *         {@code false} otherwise
 */
public boolean isCauseOf(final Exception exception) {
    return TaggedIOException.isTaggedWith(exception, tag);
}
 
Example #13
Source File: TaggedInputStream.java    From aion-germany with GNU General Public License v3.0
/**
 * Tags any IOExceptions thrown, wrapping and re-throwing.
 * 
 * @param e The IOException thrown
 * @throws IOException if an I/O error occurs
 */
@Override
protected void handleIOException(IOException e) throws IOException {
    throw new TaggedIOException(e, tag);
}
 
Example #14
Source File: TaggedInputStream.java    From aion-germany with GNU General Public License v3.0
/**
 * Re-throws the original exception thrown by this stream. This method
 * first checks whether the given exception is a {@link TaggedIOException}
 * wrapper created by this decorator, and then unwraps and throws the
 * original wrapped exception. Returns normally if the exception was
 * not thrown by this stream.
 *
 * @param throwable an exception
 * @throws IOException original exception, if any, thrown by this stream
 */
public void throwIfCauseOf(Throwable throwable) throws IOException {
    TaggedIOException.throwCauseIfTaggedWith(throwable, tag);
}
 
Example #15
Source File: TaggedInputStream.java    From aion-germany with GNU General Public License v3.0
/**
 * Tests if the given exception was caused by this stream.
 *
 * @param exception an exception
 * @return {@code true} if the exception was thrown by this stream,
 *         {@code false} otherwise
 */
public boolean isCauseOf(Throwable exception) {
    return TaggedIOException.isTaggedWith(exception, tag);
}
 
Example #16
Source File: TaggedOutputStream.java    From aion-germany with GNU General Public License v3.0
/**
 * Tags any IOExceptions thrown, wrapping and re-throwing.
 *
 * @param e The IOException thrown
 * @throws IOException if an I/O error occurs
 */
@Override
protected void handleIOException(IOException e) throws IOException {
    throw new TaggedIOException(e, tag);
}
 
Example #17
Source File: TaggedOutputStream.java    From aion-germany with GNU General Public License v3.0
/**
 * Re-throws the original exception thrown by this stream. This method
 * first checks whether the given exception is a {@link TaggedIOException}
 * wrapper created by this decorator, and then unwraps and throws the
 * original wrapped exception. Returns normally if the exception was
 * not thrown by this stream.
 *
 * @param exception an exception
 * @throws IOException original exception, if any, thrown by this stream
 */
public void throwIfCauseOf(Exception exception) throws IOException {
    TaggedIOException.throwCauseIfTaggedWith(exception, tag);
}