org.apache.hadoop.hdfs.DFSInotifyEventInputStream Java Examples

The following examples show how to use org.apache.hadoop.hdfs.DFSInotifyEventInputStream. Each example is drawn from the project and source file named in its header.
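Before the individual examples, here is a minimal sketch of the usage pattern they all build on: obtain the stream from HdfsAdmin (which generally requires HDFS superuser privileges) and consume event batches in a loop. The NameNode URI and class name are illustrative placeholders, not part of any project below.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.Event;
import org.apache.hadoop.hdfs.inotify.EventBatch;

public class InotifyTailer {
    public static void main(String[] args) throws Exception {
        // Placeholder NameNode URI; point this at your own cluster.
        HdfsAdmin admin = new HdfsAdmin(URI.create("hdfs://namenode:8020"), new Configuration());
        DFSInotifyEventInputStream stream = admin.getInotifyEventStream();
        while (true) {
            EventBatch batch = stream.take(); // blocks until at least one event arrives
            for (Event event : batch.getEvents()) {
                System.out.println(event.getEventType());
            }
        }
    }
}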
Example #1
Source File: GetHDFSEvents.java    From localization_nifi with Apache License 2.0
private EventBatch getEventBatch(DFSInotifyEventInputStream eventStream, long duration, TimeUnit timeUnit, int retries) throws IOException, InterruptedException, MissingEventsException {
    // According to the inotify API we should retry a few times if poll throws an IOException.
    // Please see org.apache.hadoop.hdfs.DFSInotifyEventInputStream#poll for documentation.
    int i = 0;
    while (true) {
        try {
            i += 1;
            return eventStream.poll(duration, timeUnit);
        } catch (IOException e) {
            if (i > retries) {
                getLogger().debug("Failed to poll for event batch. Reached max retry times.", e);
                throw e;
            } else {
                getLogger().debug("Attempt {} failed to poll for event batch. Retrying.", new Object[]{i});
            }
        }
    }
}
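A hypothetical call site for this helper; the one-second poll and three retries are illustrative values, not the processor's actual configuration:

// Illustrative values only.
EventBatch batch = getEventBatch(eventStream, 1, TimeUnit.SECONDS, 3);
if (batch != null) {
    // process batch.getEvents() ...
}

Note that poll returns null when no batch arrives within the timeout, so callers must handle a null result as well as a propagated IOException.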
 
Example #2
Source File: TestGetHDFSEvents.java    From localization_nifi with Apache License 2.0
@Before
public void setup() {
    mockNiFiProperties = mock(NiFiProperties.class);
    when(mockNiFiProperties.getKerberosConfigurationFile()).thenReturn(null);
    kerberosProperties = new KerberosProperties(null);
    inotifyEventInputStream = mock(DFSInotifyEventInputStream.class);
    hdfsAdmin = mock(HdfsAdmin.class);
}
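Building on this setup, a test might stub the mocked stream so the processor under test sees a canned batch. A sketch assuming Mockito's usual static imports (org.mockito.Mockito.*, org.mockito.ArgumentMatchers.*); the batch contents and test name are illustrative:

@Test
public void pollReturnsStubbedBatch() throws Exception {
    EventBatch batch = mock(EventBatch.class);
    when(batch.getEvents()).thenReturn(new Event[]{});
    when(inotifyEventInputStream.poll(anyLong(), any(TimeUnit.class))).thenReturn(batch);
    when(hdfsAdmin.getInotifyEventStream()).thenReturn(inotifyEventInputStream);
    // ... exercise the processor against hdfsAdmin and assert on its output ...
}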
 
Example #3
Source File: HdfsFileWatcherPolicy.java    From kafka-connect-fs with Apache License 2.0
@Override
public void run() {
    while (true) {
        try {
            DFSInotifyEventInputStream eventStream = admin.getInotifyEventStream();
            if (fs.getFileStatus(fs.getWorkingDirectory()) != null &&
                    fs.exists(fs.getWorkingDirectory())) {
                EventBatch batch = eventStream.poll();
                if (batch == null) continue;

                for (Event event : batch.getEvents()) {
                    switch (event.getEventType()) {
                        case CREATE:
                            if (!((Event.CreateEvent) event).getPath().endsWith("._COPYING_")) {
                                enqueue(((Event.CreateEvent) event).getPath());
                            }
                            break;
                        case APPEND:
                            if (!((Event.AppendEvent) event).getPath().endsWith("._COPYING_")) {
                                enqueue(((Event.AppendEvent) event).getPath());
                            }
                            break;
                        case RENAME:
                            if (((Event.RenameEvent) event).getSrcPath().endsWith("._COPYING_")) {
                                enqueue(((Event.RenameEvent) event).getDstPath());
                            }
                            break;
                        case CLOSE:
                            if (!((Event.CloseEvent) event).getPath().endsWith("._COPYING_")) {
                                enqueue(((Event.CloseEvent) event).getPath());
                            }
                            break;
                        default:
                            break;
                    }
                }
            }
        } catch (IOException ioe) {
            if (retrySleepMs > 0) {
                time.sleep(retrySleepMs);
            } else {
                log.warn("Error watching path [{}]. Stopping it...", fs.getWorkingDirectory(), ioe);
                throw new IllegalWorkerStateException(ioe);
            }
        } catch (Exception e) {
            log.warn("Stopping watcher due to an unexpected exception when watching path [{}].",
                    fs.getWorkingDirectory(), e);
            throw new IllegalWorkerStateException(e);
        }
    }
}
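The ._COPYING_ checks above filter out the temporary files that the HDFS shell (hdfs dfs -put / -copyFromLocal) creates while a copy is in flight: data is written to name._COPYING_ and renamed to name once the copy completes. That is also why the RENAME case inverts the test relative to the others, enqueuing the destination path when the source was a temporary file, i.e. when a copy has just finished.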
 
Example #4
Source File: HdfsAdmin.java    From hadoop with Apache License 2.0
/**
 * Exposes a stream of namesystem events. Only events occurring after the
 * stream is created are available.
 * See {@link org.apache.hadoop.hdfs.DFSInotifyEventInputStream}
 * for information on stream usage.
 * See {@link org.apache.hadoop.hdfs.inotify.Event}
 * for information on the available events.
 * <p/>
 * Inotify users may want to tune the following HDFS parameters to
 * ensure that enough extra HDFS edits are saved to support inotify clients
 * that fall behind the current state of the namespace while reading events.
 * The default parameter values should generally be reasonable. If edits are
 * deleted before their corresponding events can be read, clients will see a
 * {@link org.apache.hadoop.hdfs.inotify.MissingEventsException} on
 * {@link org.apache.hadoop.hdfs.DFSInotifyEventInputStream} method calls.
 *
 * It should generally be sufficient to tune these parameters:
 * dfs.namenode.num.extra.edits.retained
 * dfs.namenode.max.extra.edits.segments.retained
 *
 * Parameters that affect the number of created segments and the number of
 * edits that are considered necessary (i.e. that do not count towards the
 * dfs.namenode.num.extra.edits.retained quota):
 * dfs.namenode.checkpoint.period
 * dfs.namenode.checkpoint.txns
 * dfs.namenode.num.checkpoints.retained
 * dfs.ha.log-roll.period
 * <p/>
 * It is recommended that local journaling be configured
 * (dfs.namenode.edits.dir) for inotify (in addition to a shared journal)
 * so that edit transfers from the shared journal can be avoided.
 *
 * @throws IOException If there was an error obtaining the stream.
 */
public DFSInotifyEventInputStream getInotifyEventStream() throws IOException {
  return dfs.getInotifyEventStream();
}
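As the javadoc warns, a client that falls too far behind sees a MissingEventsException. One way a consumer might cope, sketched here under the assumption that re-opening the stream (and accepting the gap in events) is acceptable for the application; running and process are hypothetical:

DFSInotifyEventInputStream stream = admin.getInotifyEventStream();
while (running) {
    try {
        EventBatch batch = stream.poll(1, TimeUnit.SECONDS);
        if (batch != null) {
            process(batch); // hypothetical handler
        }
    } catch (MissingEventsException e) {
        // The edits backing our read position were purged; some events are lost.
        // Re-open from the current namespace state and continue.
        stream = admin.getInotifyEventStream();
    }
}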
 
Example #5
Source File: HdfsAdmin.java    From hadoop with Apache License 2.0
/**
 * A version of {@link HdfsAdmin#getInotifyEventStream()} meant for advanced
 * users who are aware of HDFS edits up to lastReadTxid (e.g. because they
 * have access to an FSImage inclusive of lastReadTxid) and only want to read
 * events after this point.
 */
public DFSInotifyEventInputStream getInotifyEventStream(long lastReadTxid)
    throws IOException {
  return dfs.getInotifyEventStream(lastReadTxid);
}
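A sketch of the resume pattern this overload enables: persist the transaction id of the last batch you processed, then restart reading from that point. The checkpointStore here is a hypothetical persistence mechanism, not part of the HDFS API:

long lastReadTxid = checkpointStore.load();   // hypothetical persistence
DFSInotifyEventInputStream stream = admin.getInotifyEventStream(lastReadTxid);
EventBatch batch = stream.take();
// ... process the batch ...
checkpointStore.save(batch.getTxid());        // resume point for the next run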
 