Java Code Examples for org.apache.flink.core.fs.RecoverableWriter#ResumeRecoverable

The following examples show how to use org.apache.flink.core.fs.RecoverableWriter#ResumeRecoverable.
Example 1
Source File: S3RecoverableFsDataOutputStream.java    From Flink-CEPplus with Apache License 2.0
@Override
public RecoverableWriter.ResumeRecoverable persist() throws IOException {
	lock();
	try {
		fileStream.flush();
		openNewPartIfNecessary(userDefinedMinPartSize);

		// We do not stop writing to the current file, we merely limit the upload to the
		// first n bytes of the current file

		return upload.snapshotAndGetRecoverable(fileStream);
	}
	finally {
		unlock();
	}
}
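
The lock/try/finally shape of persist() above can be tried out in isolation. The following is only a sketch under assumptions: it assumes that lock() and unlock() wrap something like a ReentrantLock, and the class LockedPersistSketch, its StringBuilder buffer, and the integer "snapshot" are hypothetical stand-ins rather than anything from the Flink source.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a stream whose persist() is guarded by a lock. */
final class LockedPersistSketch {

	// Assumption: the original lock()/unlock() helpers delegate to a lock like this one.
	private final ReentrantLock lock = new ReentrantLock();
	private final StringBuilder buffer = new StringBuilder();

	void write(String s) {
		lock.lock();
		try {
			buffer.append(s);
		} finally {
			lock.unlock();
		}
	}

	/** Returns how many characters were written at the time of the call; writing may continue afterwards. */
	int persist() {
		lock.lock();
		try {
			return buffer.length();
		} finally {
			// Released even if the body throws, so later writes are never blocked.
			lock.unlock();
		}
	}

	boolean isLocked() {
		return lock.isLocked();
	}
}
```

As in the original method, persisting does not stop further writes; it only records a point up to which the data is considered captured.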
 
Example 2
Source File: HadoopS3RecoverableWriterExceptionITCase.java    From flink with Apache License 2.0
@Test(expected = IOException.class)
public void testResumeWithWrongOffset() throws Exception {
	// This scenario is rather unrealistic, but it forces truncation of the file
	// and then tries to resume from a point whose data is missing.

	final RecoverableWriter writer = getFileSystem().createRecoverableWriter();
	final Path path = new Path(basePathForTest, "part-0");

	final RecoverableFsDataOutputStream stream = writer.open(path);
	stream.write(testData1.getBytes(StandardCharsets.UTF_8));

	final RecoverableWriter.ResumeRecoverable recoverable1 = stream.persist();
	stream.write(testData2.getBytes(StandardCharsets.UTF_8));

	final RecoverableWriter.ResumeRecoverable recoverable2 = stream.persist();
	stream.write(testData3.getBytes(StandardCharsets.UTF_8));

	final RecoverableFsDataOutputStream recoveredStream = writer.recover(recoverable1);
	recoveredStream.closeForCommit().commit();

	// this should throw an exception
	final RecoverableFsDataOutputStream newRecoveredStream = writer.recover(recoverable2);
	newRecoveredStream.closeForCommit().commit();
}
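
To make the failure mode in this test concrete, here is a minimal in-memory sketch that uses no Flink classes. The names ResumeSketch, InMemoryFile, and Resumable are hypothetical, and the model assumes that a resumable is nothing more than a byte offset and that recovery truncates the file to that offset. The real S3 writer is considerably more involved, so treat this as an illustration of the idea only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy model of persist/recover; not Flink's implementation. */
final class ResumeSketch {

	/** Hypothetical stand-in for ResumeRecoverable: just a byte offset. */
	static final class Resumable {
		final int offset;

		Resumable(int offset) {
			this.offset = offset;
		}
	}

	/** Hypothetical in-memory "file" that can be written, persisted, and recovered. */
	static final class InMemoryFile {
		private byte[] data = new byte[0];

		void write(String s) {
			byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
			byte[] grown = Arrays.copyOf(data, data.length + bytes.length);
			System.arraycopy(bytes, 0, grown, data.length, bytes.length);
			data = grown;
		}

		/** Remembers how many bytes exist at this point. */
		Resumable persist() {
			return new Resumable(data.length);
		}

		/** Truncates back to the persisted offset, or fails if those bytes are gone. */
		void recover(Resumable r) throws IOException {
			if (r.offset > data.length) {
				throw new IOException("Missing data: need " + r.offset + " bytes, have " + data.length);
			}
			data = Arrays.copyOf(data, r.offset);
		}

		int length() {
			return data.length;
		}
	}

	public static void main(String[] args) throws IOException {
		InMemoryFile file = new InMemoryFile();
		file.write("A");
		Resumable first = file.persist();
		file.write("BB");
		Resumable second = file.persist();
		file.write("CCC");

		file.recover(first); // truncates to the first persisted offset
		try {
			file.recover(second); // the bytes behind this offset no longer exist
		} catch (IOException e) {
			System.out.println("recover failed: " + e.getMessage());
		}
	}
}
```

Recovering from the earlier resumable discards everything written after it, which is why the later resumable can no longer be honored.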
 
Example 3
Source File: HadoopS3RecoverableWriterExceptionITCase.java    From Flink-CEPplus with Apache License 2.0
@Test(expected = IOException.class)
public void testResumeAfterCommit() throws Exception {
	final RecoverableWriter writer = getFileSystem().createRecoverableWriter();
	final Path path = new Path(basePathForTest, "part-0");

	final RecoverableFsDataOutputStream stream = writer.open(path);
	stream.write(testData1.getBytes(StandardCharsets.UTF_8));

	// take a resume point, then keep writing past it
	final RecoverableWriter.ResumeRecoverable recoverable = stream.persist();
	stream.write(testData2.getBytes(StandardCharsets.UTF_8));

	// commit the file; the resume point taken above is now stale
	stream.closeForCommit().commit();

	// resuming from the stale resume point is expected to fail with an IOException
	final RecoverableFsDataOutputStream recoveredStream = writer.recover(recoverable);
	recoveredStream.closeForCommit().commit();
}
 
Example 5
Source File: RowWiseBucketWriter.java    From flink with Apache License 2.0
@Override
public InProgressFileWriter<IN, BucketID> resumeFrom(
		final BucketID bucketId,
		final RecoverableFsDataOutputStream stream,
		final RecoverableWriter.ResumeRecoverable resumable,
		final long creationTime) {

	Preconditions.checkNotNull(stream);
	Preconditions.checkNotNull(resumable);

	return new RowWisePartWriter<>(bucketId, stream, encoder, creationTime);
}
 
Example 6
Source File: BucketStateSerializer.java    From flink with Apache License 2.0
@VisibleForTesting
BucketState<BucketID> deserializeV1(DataInputView in) throws IOException {
	final BucketID bucketId = SimpleVersionedSerialization.readVersionAndDeSerialize(bucketIdSerializer, in);
	final String bucketPathStr = in.readUTF();
	final long creationTime = in.readLong();

	// then get the current resumable stream
	RecoverableWriter.ResumeRecoverable current = null;
	if (in.readBoolean()) {
		current = SimpleVersionedSerialization.readVersionAndDeSerialize(resumableSerializer, in);
	}

	final int committableVersion = in.readInt();
	final int numCheckpoints = in.readInt();
	final HashMap<Long, List<RecoverableWriter.CommitRecoverable>> resumablesPerCheckpoint = new HashMap<>(numCheckpoints);

	for (int i = 0; i < numCheckpoints; i++) {
		final long checkpointId = in.readLong();
		final int noOfResumables = in.readInt();

		final List<RecoverableWriter.CommitRecoverable> resumables = new ArrayList<>(noOfResumables);
		for (int j = 0; j < noOfResumables; j++) {
			final byte[] bytes = new byte[in.readInt()];
			in.readFully(bytes);
			resumables.add(commitableSerializer.deserialize(committableVersion, bytes));
		}
		resumablesPerCheckpoint.put(checkpointId, resumables);
	}

	return new BucketState<>(
			bucketId,
			new Path(bucketPathStr),
			creationTime,
			current,
			resumablesPerCheckpoint);
}
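
The read side above implies a particular wire layout: a boolean presence flag before the optional resumable, and an int length before each byte array. As a rough sketch of that layout only, the following round trip uses plain java.io streams instead of Flink's DataInputView. LayoutSketch, write, and readPending are hypothetical names, and the version prefix that SimpleVersionedSerialization handles is deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a presence flag plus length-prefixed blobs. */
final class LayoutSketch {

	/** Writes an optional blob (presence flag first) and a list of length-prefixed blobs. */
	static byte[] write(byte[] current, List<byte[]> pending) throws IOException {
		ByteArrayOutputStream buffer = new ByteArrayOutputStream();
		try (DataOutputStream out = new DataOutputStream(buffer)) {
			out.writeBoolean(current != null);
			if (current != null) {
				out.writeInt(current.length);
				out.write(current);
			}
			out.writeInt(pending.size());
			for (byte[] blob : pending) {
				out.writeInt(blob.length);
				out.write(blob);
			}
		}
		return buffer.toByteArray();
	}

	/** Reads the pending blobs back in the same order write() emitted them. */
	static List<byte[]> readPending(byte[] serialized) throws IOException {
		try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(serialized))) {
			if (in.readBoolean()) {
				// The optional blob is present: consume it so the stream stays aligned.
				byte[] current = new byte[in.readInt()];
				in.readFully(current);
			}
			int count = in.readInt();
			List<byte[]> pending = new ArrayList<>(count);
			for (int i = 0; i < count; i++) {
				byte[] blob = new byte[in.readInt()];
				in.readFully(blob);
				pending.add(blob);
			}
			return pending;
		}
	}
}
```

The reader must consume fields in exactly the order the writer produced them; skipping the presence flag or a length prefix would misalign every later read.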
 
Example 7
Source File: BulkPartWriter.java    From flink with Apache License 2.0
@Override
public PartFileWriter<IN, BucketID> resumeFrom(
		final BucketID bucketId,
		final RecoverableFsDataOutputStream stream,
		final RecoverableWriter.ResumeRecoverable resumable,
		final long creationTime) throws IOException {

	Preconditions.checkNotNull(stream);
	Preconditions.checkNotNull(resumable);

	final BulkWriter<IN> writer = writerFactory.create(stream);
	return new BulkPartWriter<>(bucketId, stream, writer, creationTime);
}
 
Example 8
Source File: BucketState.java    From flink with Apache License 2.0
BucketState(
		final BucketID bucketId,
		final Path bucketPath,
		final long inProgressFileCreationTime,
		@Nullable final RecoverableWriter.ResumeRecoverable inProgressResumableFile,
		final Map<Long, List<RecoverableWriter.CommitRecoverable>> pendingCommittablesPerCheckpoint
) {
	this.bucketId = Preconditions.checkNotNull(bucketId);
	this.bucketPath = Preconditions.checkNotNull(bucketPath);
	this.inProgressFileCreationTime = inProgressFileCreationTime;
	this.inProgressResumableFile = inProgressResumableFile;
	this.committableFilesPerCheckpoint = Preconditions.checkNotNull(pendingCommittablesPerCheckpoint);
}
 
Example 9
Source File: BucketStateSerializer.java    From Flink-CEPplus with Apache License 2.0
@VisibleForTesting
BucketState<BucketID> deserializeV1(DataInputView in) throws IOException {
	final BucketID bucketId = SimpleVersionedSerialization.readVersionAndDeSerialize(bucketIdSerializer, in);
	final String bucketPathStr = in.readUTF();
	final long creationTime = in.readLong();

	// then get the current resumable stream
	RecoverableWriter.ResumeRecoverable current = null;
	if (in.readBoolean()) {
		current = SimpleVersionedSerialization.readVersionAndDeSerialize(resumableSerializer, in);
	}

	final int committableVersion = in.readInt();
	final int numCheckpoints = in.readInt();
	final HashMap<Long, List<RecoverableWriter.CommitRecoverable>> resumablesPerCheckpoint = new HashMap<>(numCheckpoints);

	for (int i = 0; i < numCheckpoints; i++) {
		final long checkpointId = in.readLong();
		final int noOfResumables = in.readInt();

		final List<RecoverableWriter.CommitRecoverable> resumables = new ArrayList<>(noOfResumables);
		for (int j = 0; j < noOfResumables; j++) {
			final byte[] bytes = new byte[in.readInt()];
			in.readFully(bytes);
			resumables.add(commitableSerializer.deserialize(committableVersion, bytes));
		}
		resumablesPerCheckpoint.put(checkpointId, resumables);
	}

	return new BucketState<>(
			bucketId,
			new Path(bucketPathStr),
			creationTime,
			current,
			resumablesPerCheckpoint);
}
 
Example 10
Source File: S3RecoverableFsDataOutputStreamTest.java    From flink with Apache License 2.0
@Override
public RecoverableWriter.ResumeRecoverable snapshotAndGetRecoverable(RefCountedFSOutputStream incompletePartFile) throws IOException {
	lastPersistedIndex = uploadedContent.size();

	if (incompletePartFile.getPos() >= 0L) {
		byte[] bytes = readFileContents(incompletePartFile);
		uncompleted = Optional.of(bytes);
	}

	return null;
}
 
Example 11
Source File: BulkPartWriter.java    From Flink-CEPplus with Apache License 2.0
@Override
RecoverableWriter.ResumeRecoverable persist() {
	throw new UnsupportedOperationException("Bulk Part Writers do not support \"pause and resume\" operations.");
}
 
Example 12
Source File: OutputStreamBasedPartFileWriter.java    From flink with Apache License 2.0
SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> getResumeSerializer() {
	return resumeSerializer;
}
 
Example 13
Source File: OutputStreamBasedPartFileWriter.java    From flink with Apache License 2.0
OutputStreamBasedInProgressFileRecoverableSerializer(SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> resumeSerializer) {
	this.resumeSerializer = resumeSerializer;
}
 
Example 14
Source File: BucketStateSerializer.java    From flink with Apache License 2.0
private BucketState<BucketID> deserializeV1(DataInputView in) throws IOException {

	final SimpleVersionedSerializer<RecoverableWriter.CommitRecoverable> commitableSerializer = getCommitableSerializer();
	final SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> resumableSerializer = getResumableSerializer();

	final BucketID bucketId = SimpleVersionedSerialization.readVersionAndDeSerialize(bucketIdSerializer, in);
	final String bucketPathStr = in.readUTF();
	final long creationTime = in.readLong();

	// then get the current resumable stream
	InProgressFileWriter.InProgressFileRecoverable current = null;
	if (in.readBoolean()) {
		current =
			new OutputStreamBasedPartFileWriter.OutputStreamBasedInProgressFileRecoverable(
				SimpleVersionedSerialization.readVersionAndDeSerialize(resumableSerializer, in));
	}

	final int committableVersion = in.readInt();
	final int numCheckpoints = in.readInt();
	final HashMap<Long, List<InProgressFileWriter.PendingFileRecoverable>> pendingFileRecoverablePerCheckpoint = new HashMap<>(numCheckpoints);

	for (int i = 0; i < numCheckpoints; i++) {
		final long checkpointId = in.readLong();
		final int noOfResumables = in.readInt();

		final List<InProgressFileWriter.PendingFileRecoverable> pendingFileRecoverables = new ArrayList<>(noOfResumables);
		for (int j = 0; j < noOfResumables; j++) {
			final byte[] bytes = new byte[in.readInt()];
			in.readFully(bytes);
			pendingFileRecoverables.add(
				new OutputStreamBasedPartFileWriter.OutputStreamBasedPendingFileRecoverable(commitableSerializer.deserialize(committableVersion, bytes)));
		}
		pendingFileRecoverablePerCheckpoint.put(checkpointId, pendingFileRecoverables);
	}

	return new BucketState<>(
		bucketId,
		new Path(bucketPathStr),
		creationTime,
		current,
		pendingFileRecoverablePerCheckpoint);
}
 
Example 15
Source File: BucketState.java    From flink with Apache License 2.0
@Nullable
RecoverableWriter.ResumeRecoverable getInProgressResumableFile() {
	return inProgressResumableFile;
}
 
Example 16
Source File: OutputStreamBasedPartFileWriter.java    From flink with Apache License 2.0
public abstract InProgressFileWriter<IN, BucketID> resumeFrom(
		final BucketID bucketId,
		final RecoverableFsDataOutputStream stream,
		final RecoverableWriter.ResumeRecoverable resumable,
		final long creationTime) throws IOException;
 
Example 17
Source File: NoOpRecoverableFsDataOutputStream.java    From Flink-CEPplus with Apache License 2.0
@Override
public RecoverableWriter.ResumeRecoverable persist() throws IOException {
	return null;
}
 
Example 18
Source File: NoOpRecoverableWriter.java    From flink with Apache License 2.0
@Override
public RecoverableFsDataOutputStream recover(RecoverableWriter.ResumeRecoverable resumable) throws IOException {
	return null;
}
 
Example 19
Source File: PartFileWriter.java    From flink with Apache License 2.0
/**
 * Used upon recovery from a failure to recover a {@link PartFileWriter writer}.
 * @param bucketId the id of the bucket this writer is writing to.
 * @param stream the filesystem-specific output stream to use when writing to the filesystem.
 * @param resumable the state of the stream we are resurrecting.
 * @param creationTime the creation time of the stream.
 * @return the recovered {@link PartFileWriter writer}.
 * @throws IOException
 */
PartFileWriter<IN, BucketID> resumeFrom(
	final BucketID bucketId,
	final RecoverableFsDataOutputStream stream,
	final RecoverableWriter.ResumeRecoverable resumable,
	final long creationTime) throws IOException;
 
Example 20
Source File: RecoverableMultiPartUpload.java    From flink with Apache License 2.0
/**
 * Creates a snapshot of this MultiPartUpload, from which the upload can be resumed.
 *
 * @param incompletePartFile The file containing the in-progress part which has not yet reached the minimum
 *                           part size in order to be uploaded.
 *
 * @return The {@link RecoverableWriter.ResumeRecoverable ResumeRecoverable} which
 * can be used to resume the upload.
 */
RecoverableWriter.ResumeRecoverable snapshotAndGetRecoverable(
		@Nullable final RefCountedFSOutputStream incompletePartFile) throws IOException;