Java Code Examples for org.apache.hadoop.hdfs.util.Canceler

The following examples show how to use org.apache.hadoop.hdfs.util.Canceler, a small utility that lets one thread request cooperative cancellation of a long-running operation on another, such as an fsimage save, an image upload, or an Ozone container data scan. The examples are extracted from open source projects.
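The class itself is essentially a thread-safe flag that a controlling thread sets and a worker thread polls. The sketch below is inferred from the usage in the examples on this page and is not the authoritative Hadoop source; consult hadoop-hdfs for the real implementation.

public class Canceler {
  // Non-null once cancellation has been requested; volatile so the worker
  // thread sees the update without extra locking.
  private volatile String cancelReason = null;

  /** Request cancellation, recording a human-readable reason. */
  public void cancel(String reason) {
    this.cancelReason = reason;
  }

  /** Polled by long-running operations between units of work. */
  public boolean isCancelled() {
    return cancelReason != null;
  }

  /** The reason passed to cancel(), or null if not cancelled. */
  public String getCancellationReason() {
    return cancelReason;
  }
}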
Example 1
Source Project: hadoop-ozone   Source File: KeyValueContainerCheck.java    License: Apache License 2.0
/**
 * Full checks scan all metadata inside the container, including the KV
 * database. They are intrusive and consume more resources than fast
 * checks, so they should only be run on Closed or Quasi-Closed
 * containers, where concurrency is limited to delete workflows.
 * <p>
 * fullCheck is a superset of fastCheck.
 *
 * @return true if the integrity checks pass, false otherwise.
 */
public boolean fullCheck(DataTransferThrottler throttler, Canceler canceler) {
  boolean valid;

  try {
    valid = fastCheck();
    if (valid) {
      scanData(throttler, canceler);
    }
  } catch (IOException e) {
    handleCorruption(e);
    valid = false;
  }

  return valid;
}
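A minimal usage sketch for the method above (the variable names and the throttle rate are illustrative, not from the Ozone source): the check runs on one thread while another thread can abort it through the shared Canceler.

DataTransferThrottler throttler = new DataTransferThrottler(5 * 1024 * 1024);
Canceler canceler = new Canceler();
boolean healthy = kvContainerCheck.fullCheck(throttler, canceler);
// Elsewhere, e.g. in a shutdown hook:
canceler.cancel("datanode is shutting down");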
 
Example 2
Source Project: hadoop   Source File: TransferFsImage.java    License: Apache License 2.0
/**
 * Requests that the NameNode download an image from this node.  Allows for
 * optional external cancellation.
 *
 * @param fsName the http address for the remote NN
 * @param conf Configuration
 * @param storage the storage directory to transfer the image from
 * @param nnf the NameNodeFile type of the image
 * @param txid the transaction ID of the image to be uploaded
 * @param canceler optional canceler to check for abort of upload
 * @throws IOException if there is an I/O error or cancellation
 */
public static void uploadImageFromStorage(URL fsName, Configuration conf,
    NNStorage storage, NameNodeFile nnf, long txid, Canceler canceler)
    throws IOException {
  URL url = new URL(fsName, ImageServlet.PATH_SPEC);
  long startTime = Time.monotonicNow();
  try {
    uploadImage(url, conf, storage, nnf, txid, canceler);
  } catch (HttpPutFailedException e) {
    if (e.getResponseCode() == HttpServletResponse.SC_CONFLICT) {
      // this is OK - this means that a previous attempt to upload
      // this checkpoint succeeded even though we thought it failed.
      LOG.info("Image upload with txid " + txid + 
          " conflicted with a previous image upload to the " +
          "same NameNode. Continuing...", e);
      return;
    } else {
      throw e;
    }
  }
  double xferSec = Math.max(
      ((float) (Time.monotonicNow() - startTime)) / 1000.0, 0.001);
  LOG.info("Uploaded image with txid " + txid + " to namenode at " + fsName
      + " in " + xferSec + " seconds");
}
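An illustrative call (the URL, storage, and txid variables are placeholders): the checkpointing node passes a Canceler so an upload already on the wire can be abandoned.

Canceler canceler = new Canceler();
TransferFsImage.uploadImageFromStorage(
    activeNnUrl, conf, nnStorage, NameNodeFile.IMAGE, lastCheckpointTxId,
    canceler);
// From another thread, to abort the transfer:
canceler.cancel("checkpointing cancelled by failover");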
 
Example 3
Source Project: hadoop   Source File: TestStandbyCheckpoints.java    License: Apache License 2.0
/**
 * Test for the case when the SBN is configured to checkpoint based
 * on a time period, but no transactions are happening on the
 * active. Thus, it would want to save a second checkpoint at the
 * same txid, which is a no-op. This test makes sure this doesn't
 * cause any problem.
 */
@Test(timeout = 300000)
public void testCheckpointWhenNoNewTransactionsHappened()
    throws Exception {
  // Checkpoint as fast as we can, in a tight loop.
  cluster.getConfiguration(1).setInt(
      DFSConfigKeys.DFS_NAMENODE_CHECKPOINT_PERIOD_KEY, 0);
  cluster.restartNameNode(1);
  nn1 = cluster.getNameNode(1);
 
  FSImage spyImage1 = NameNodeAdapter.spyOnFsImage(nn1);
  
  // We shouldn't save any checkpoints at txid=0
  Thread.sleep(1000);
  Mockito.verify(spyImage1, Mockito.never())
    .saveNamespace((FSNamesystem) Mockito.anyObject());
 
  // Roll the primary and wait for the standby to catch up
  HATestUtil.waitForStandbyToCatchUp(nn0, nn1);
  Thread.sleep(2000);
  
  // We should make exactly one checkpoint at this new txid. 
  Mockito.verify(spyImage1, Mockito.times(1)).saveNamespace(
      (FSNamesystem) Mockito.anyObject(), Mockito.eq(NameNodeFile.IMAGE),
      (Canceler) Mockito.anyObject());
}
 
Example 4
Source Project: hadoop-ozone   Source File: ContainerDataScanner.java    License: Apache License 2.0
public ContainerDataScanner(ContainerScrubberConfiguration conf,
                            ContainerController controller,
                            HddsVolume volume) {
  this.controller = controller;
  this.volume = volume;
  dataScanInterval = conf.getDataScanInterval();
  throttler = new HddsDataTransferThrottler(conf.getBandwidthPerVolume());
  canceler = new Canceler();
  metrics = ContainerDataScrubberMetrics.create(volume.toString());
  setName(String.format(NAME_FORMAT, volume));
  setDaemon(true);
}
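The scanner owns its Canceler, so stopping the thread cleanly is a matter of cancelling before interrupting. A plausible shutdown method is sketched below (illustrative; the real ContainerDataScanner may differ):

public synchronized void shutdown() {
  // Cancel first so an in-flight throttle() or scan loop exits promptly,
  // then interrupt to wake the thread if it is sleeping between scans.
  this.canceler.cancel(String.format("%s is shutting down", getName()));
  this.interrupt();
}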
 
Example 5
Source Project: hadoop-ozone   Source File: KeyValueContainer.java    License: Apache License 2.0
public boolean scanData(DataTransferThrottler throttler, Canceler canceler) {
  if (!shouldScanData()) {
    throw new IllegalStateException("The checksum verification can not be" +
        " done for container in state "
        + containerData.getState());
  }

  long containerId = containerData.getContainerID();
  KeyValueContainerCheck checker =
      new KeyValueContainerCheck(containerData.getMetadataPath(), config,
          containerId);

  return checker.fullCheck(throttler, canceler);
}
 
Example 6
Source Project: hadoop-ozone   Source File: TestContainerScrubberMetrics.java    License: Apache License 2.0
private void setupMockContainer(
    Container<ContainerData> c, boolean shouldScanData,
    boolean scanMetaDataSuccess, boolean scanDataSuccess) {
  ContainerData data = mock(ContainerData.class);
  when(data.getContainerID()).thenReturn(containerIdSeq.getAndIncrement());
  when(c.getContainerData()).thenReturn(data);
  when(c.shouldScanData()).thenReturn(shouldScanData);
  when(c.scanMetaData()).thenReturn(scanMetaDataSuccess);
  when(c.scanData(any(DataTransferThrottler.class), any(Canceler.class)))
      .thenReturn(scanDataSuccess);
}
 
Example 7
Source Project: hadoop   Source File: SaveNamespaceContext.java    License: Apache License 2.0
SaveNamespaceContext(
    FSNamesystem sourceNamesystem,
    long txid,
    Canceler canceller) {
  this.sourceNamesystem = sourceNamesystem;
  this.txid = txid;
  this.canceller = canceller;
}
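The context mainly exists to hand the canceller to the image saver threads. A likely shape for its cancellation check, inferred from the call to ctx.checkCancelled() in the saveFSImageInAllDirs example later on this page:

void checkCancelled() throws SaveNamespaceCancelledException {
  if (canceller.isCancelled()) {
    throw new SaveNamespaceCancelledException(
        canceller.getCancellationReason());
  }
}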
 
Example 8
Source Project: hadoop   Source File: TransferFsImage.java    License: Apache License 2.0
private static void writeFileToPutRequest(Configuration conf,
    HttpURLConnection connection, File imageFile, Canceler canceler)
    throws FileNotFoundException, IOException {
  connection.setRequestProperty(CONTENT_TYPE, "application/octet-stream");
  connection.setRequestProperty(CONTENT_TRANSFER_ENCODING, "binary");
  OutputStream output = connection.getOutputStream();
  FileInputStream input = new FileInputStream(imageFile);
  try {
    copyFileToStream(output, imageFile, input,
        ImageServlet.getThrottler(conf), canceler);
  } finally {
    IOUtils.closeStream(input);
    IOUtils.closeStream(output);
  }
}
 
Example 9
Source Project: hadoop   Source File: FSImage.java    License: Apache License 2.0
/**
 * Save FSimage in the legacy format. This is not for NN consumption,
 * but for tools like OIV.
 */
public void saveLegacyOIVImage(FSNamesystem source, String targetDir,
    Canceler canceler) throws IOException {
  FSImageCompression compression =
      FSImageCompression.createCompression(conf);
  long txid = getLastAppliedOrWrittenTxId();
  SaveNamespaceContext ctx = new SaveNamespaceContext(source, txid,
      canceler);
  FSImageFormat.Saver saver = new FSImageFormat.Saver(ctx);
  String imageFileName = NNStorage.getLegacyOIVImageFileName(txid);
  File imageFile = new File(targetDir, imageFileName);
  saver.save(imageFile, compression);
  archivalManager.purgeOldLegacyOIVImages(targetDir, txid);
}
 
Example 10
Source Project: hadoop   Source File: FSImage.java    License: Apache License 2.0
/**
 * Save the contents of the FS image to a new image file in each of the
 * current storage directories.
 */
public synchronized void saveNamespace(FSNamesystem source, NameNodeFile nnf,
    Canceler canceler) throws IOException {
  assert editLog != null : "editLog must be initialized";
  LOG.info("Save namespace ...");
  storage.attemptRestoreRemovedStorage();

  boolean editLogWasOpen = editLog.isSegmentOpen();
  
  if (editLogWasOpen) {
    editLog.endCurrentLogSegment(true);
  }
  long imageTxId = getLastAppliedOrWrittenTxId();
  if (!addToCheckpointing(imageTxId)) {
    throw new IOException(
        "FS image is being downloaded from another NN at txid " + imageTxId);
  }
  try {
    try {
      saveFSImageInAllDirs(source, nnf, imageTxId, canceler);
      storage.writeAll();
    } finally {
      if (editLogWasOpen) {
        editLog.startLogSegment(imageTxId + 1, true);
        // Take this opportunity to note the current transaction.
        // Even if the namespace save was cancelled, this marker
        // is only used to determine what transaction ID is required
        // for startup. So, it doesn't hurt to update it unnecessarily.
        storage.writeTransactionIdFileToStorage(imageTxId + 1);
      }
    }
  } finally {
    removeFromCheckpointing(imageTxId);
  }
}
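A hedged caller sketch (in HDFS the StandbyCheckpointer fills this role; the exception handling shown is illustrative): holding on to the Canceler lets a failover abort a checkpoint that is already in flight.

Canceler canceler = new Canceler();
try {
  fsImage.saveNamespace(namesystem, NameNodeFile.IMAGE, canceler);
} catch (SaveNamespaceCancelledException e) {
  LOG.info("Checkpoint cancelled: " + e.getMessage());
}
// On failover or shutdown, from another thread:
canceler.cancel("standby is transitioning to active");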
 
Example 11
Source Project: hadoop   Source File: TestFSImageWithSnapshot.java    License: Apache License 2.0
/** Save the fsimage to a temp file */
private File saveFSImageToTempFile() throws IOException {
  SaveNamespaceContext context = new SaveNamespaceContext(fsn, txid,
      new Canceler());
  FSImageFormatProtobuf.Saver saver = new FSImageFormatProtobuf.Saver(context);
  FSImageCompression compression = FSImageCompression.createCompression(conf);
  File imageFile = getImageFile(testDir, txid);
  fsn.readLock();
  try {
    saver.save(imageFile, compression);
  } finally {
    fsn.readUnlock();
  }
  return imageFile;
}
 
Example 12
Source Project: hadoop-ozone   Source File: ContainerDataScanner.java    License: Apache License 2.0
@Override
public synchronized void throttle(long numOfBytes, Canceler c) {
  ContainerDataScanner.this.metrics.incNumBytesScanned(numOfBytes);
  super.throttle(numOfBytes, c);
}
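The override above decorates the inherited throttle so every throttled scan also feeds the per-volume metrics. The cooperative-cancellation pattern the base throttler relies on can be reduced to a self-contained sketch like this (a simplified stand-in, not the Hadoop DataTransferThrottler):

static void throttleSketch(long waitMillis, Canceler canceler)
    throws InterruptedException {
  long deadline = System.currentTimeMillis() + waitMillis;
  while (System.currentTimeMillis() < deadline) {
    if (canceler != null && canceler.isCancelled()) {
      return; // give up the wait; the caller observes cancellation itself
    }
    // Sleep in short slices so a cancel() is noticed quickly.
    Thread.sleep(Math.max(1,
        Math.min(50, deadline - System.currentTimeMillis())));
  }
}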
 
Example 13
Source Project: hadoop-ozone   Source File: KeyValueContainerCheck.java    License: Apache License 2.0
private void scanData(DataTransferThrottler throttler, Canceler canceler)
    throws IOException {
  /*
   * Check the integrity of the DB inside each container.
   * 1. iterate over each key (Block) and locate the chunks for the block
   * 2. garbage detection (TBD): chunks which exist in the filesystem,
   *    but not in the DB. This function will be implemented in HDDS-1202
   * 3. chunk checksum verification.
   */
  Preconditions.checkState(onDiskContainerData != null,
      "invoke loadContainerData prior to calling this function");

  File metaDir = new File(metadataPath);
  File dbFile = KeyValueContainerLocationUtil
      .getContainerDBFile(metaDir, containerID);

  if (!dbFile.exists() || !dbFile.canRead()) {
    String dbFileErrorMsg = "Unable to access DB File [" + dbFile.toString()
        + "] for Container [" + containerID + "] metadata path ["
        + metadataPath + "]";
    throw new IOException(dbFileErrorMsg);
  }

  onDiskContainerData.setDbFile(dbFile);

  ChunkLayOutVersion layout = onDiskContainerData.getLayOutVersion();

  try(ReferenceCountedDB db =
          BlockUtils.getDB(onDiskContainerData, checkConfig);
      KeyValueBlockIterator kvIter = new KeyValueBlockIterator(containerID,
          new File(onDiskContainerData.getContainerPath()))) {

    while(kvIter.hasNext()) {
      BlockData block = kvIter.nextBlock();
      for(ContainerProtos.ChunkInfo chunk : block.getChunks()) {
        File chunkFile = layout.getChunkFile(onDiskContainerData,
            block.getBlockID(), ChunkInfo.getFromProtoBuf(chunk));

        if (!chunkFile.exists()) {
          // concurrent mutation in Block DB? lookup the block again.
          byte[] bdata = db.getStore().get(
              Longs.toByteArray(block.getBlockID().getLocalID()));
          if (bdata != null) {
            throw new IOException("Missing chunk file "
                + chunkFile.getAbsolutePath());
          }
        } else if (chunk.getChecksumData().getType()
            != ContainerProtos.ChecksumType.NONE) {
          verifyChecksum(block, chunk, chunkFile, layout, throttler,
              canceler);
        }
      }
    }
  }
}
 
Example 14
Source Project: hadoop-ozone   Source File: KeyValueContainerCheck.java    License: Apache License 2.0
private static void verifyChecksum(BlockData block,
    ContainerProtos.ChunkInfo chunk, File chunkFile,
    ChunkLayOutVersion layout,
    DataTransferThrottler throttler, Canceler canceler) throws IOException {
  ChecksumData checksumData =
      ChecksumData.getFromProtoBuf(chunk.getChecksumData());
  int checksumCount = checksumData.getChecksums().size();
  int bytesPerChecksum = checksumData.getBytesPerChecksum();
  Checksum cal = new Checksum(checksumData.getChecksumType(),
      bytesPerChecksum);
  ByteBuffer buffer = ByteBuffer.allocate(bytesPerChecksum);
  long bytesRead = 0;
  try (FileChannel channel = FileChannel.open(chunkFile.toPath(),
      ChunkUtils.READ_OPTIONS, ChunkUtils.NO_ATTRIBUTES)) {
    if (layout == ChunkLayOutVersion.FILE_PER_BLOCK) {
      channel.position(chunk.getOffset());
    }
    for (int i = 0; i < checksumCount; i++) {
      // limit last read for FILE_PER_BLOCK, to avoid reading next chunk
      if (layout == ChunkLayOutVersion.FILE_PER_BLOCK &&
          i == checksumCount - 1 &&
          chunk.getLen() % bytesPerChecksum != 0) {
        buffer.limit((int) (chunk.getLen() % bytesPerChecksum));
      }

      int v = channel.read(buffer);
      if (v == -1) {
        break;
      }
      bytesRead += v;
      buffer.flip();

      throttler.throttle(v, canceler);

      ByteString expected = checksumData.getChecksums().get(i);
      ByteString actual = cal.computeChecksum(buffer)
          .getChecksums().get(0);
      if (!expected.equals(actual)) {
        throw new OzoneChecksumException(String
            .format("Inconsistent read for chunk=%s" +
                " checksum item %d" +
                " expected checksum %s" +
                " actual checksum %s" +
                " for block %s",
                ChunkInfo.getFromProtoBuf(chunk),
                i,
                Arrays.toString(expected.toByteArray()),
                Arrays.toString(actual.toByteArray()),
                block.getBlockID()));
      }

    }
    if (bytesRead != chunk.getLen()) {
      throw new OzoneChecksumException(String
          .format("Inconsistent read for chunk=%s expected length=%d"
                  + " actual length=%d for block %s",
              chunk.getChunkName(),
              chunk.getLen(), bytesRead, block.getBlockID()));
    }
  }
}
 
Example 15
Source Project: NNAnalytics   Source File: VersionContext.java    License: Apache License 2.0
@Override // VersionInterface
public void saveLegacyOivImage(String dir) throws IOException {
  namesystem.getFSImage().saveLegacyOIVImage(namesystem, dir, new Canceler());
}
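Note that the call above constructs a fresh Canceler that no other thread holds a reference to, so this particular save is effectively non-cancellable: the argument satisfies the API without wiring up real cancellation.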
 
Example 16
Source Project: hadoop   Source File: TransferFsImage.java    License: Apache License 2.0
private static void copyFileToStream(OutputStream out, File localfile,
    FileInputStream infile, DataTransferThrottler throttler,
    Canceler canceler) throws IOException {
  byte buf[] = new byte[HdfsConstants.IO_FILE_BUFFER_SIZE];
  try {
    CheckpointFaultInjector.getInstance()
        .aboutToSendFile(localfile);

    if (CheckpointFaultInjector.getInstance().
          shouldSendShortFile(localfile)) {
        // Test sending image shorter than localfile
        long len = localfile.length();
        buf = new byte[(int)Math.min(len/2, HdfsConstants.IO_FILE_BUFFER_SIZE)];
        // This will read at most half of the image
        // and the rest of the image will be sent over the wire
        infile.read(buf);
    }
    int num = 1;
    while (num > 0) {
      if (canceler != null && canceler.isCancelled()) {
        throw new SaveNamespaceCancelledException(
          canceler.getCancellationReason());
      }
      num = infile.read(buf);
      if (num <= 0) {
        break;
      }
      if (CheckpointFaultInjector.getInstance()
            .shouldCorruptAByte(localfile)) {
        // Simulate a corrupted byte on the wire
        LOG.warn("SIMULATING A CORRUPT BYTE IN IMAGE TRANSFER!");
        buf[0]++;
      }
      
      out.write(buf, 0, num);
      if (throttler != null) {
        throttler.throttle(num, canceler);
      }
    }
  } catch (EofException e) {
    LOG.info("Connection closed by client");
    out = null; // so we don't close in the finally
  } finally {
    if (out != null) {
      out.close();
    }
  }
}
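This is the polling pattern in its plainest form: the canceler is consulted once per buffer before any bytes are written, and again inside throttler.throttle(), so a cancelled transfer stops within roughly one buffer of I/O and surfaces as a SaveNamespaceCancelledException carrying the cancellation reason.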
 
Example 17
Source Project: hadoop   Source File: FSImage.java    License: Apache License 2.0
private synchronized void saveFSImageInAllDirs(FSNamesystem source,
    NameNodeFile nnf, long txid, Canceler canceler) throws IOException {
  StartupProgress prog = NameNode.getStartupProgress();
  prog.beginPhase(Phase.SAVING_CHECKPOINT);
  if (storage.getNumStorageDirs(NameNodeDirType.IMAGE) == 0) {
    throw new IOException("No image directories available!");
  }
  if (canceler == null) {
    canceler = new Canceler();
  }
  SaveNamespaceContext ctx = new SaveNamespaceContext(
      source, txid, canceler);
  
  try {
    List<Thread> saveThreads = new ArrayList<Thread>();
    // save images into current
    for (Iterator<StorageDirectory> it
           = storage.dirIterator(NameNodeDirType.IMAGE); it.hasNext();) {
      StorageDirectory sd = it.next();
      FSImageSaver saver = new FSImageSaver(ctx, sd, nnf);
      Thread saveThread = new Thread(saver, saver.toString());
      saveThreads.add(saveThread);
      saveThread.start();
    }
    waitForThreads(saveThreads);
    saveThreads.clear();
    storage.reportErrorsOnDirectories(ctx.getErrorSDs());

    if (storage.getNumStorageDirs(NameNodeDirType.IMAGE) == 0) {
      throw new IOException(
        "Failed to save in any storage directories while saving namespace.");
    }
    if (canceler.isCancelled()) {
      deleteCancelledCheckpoint(txid);
      ctx.checkCancelled(); // throws
      assert false : "should have thrown above!";
    }

    renameCheckpoint(txid, NameNodeFile.IMAGE_NEW, nnf, false);

    // Since we now have a new checkpoint, we can clean up some
    // old edit logs and checkpoints.
    purgeOldStorage(nnf);
  } finally {
    // Notify any threads waiting on the checkpoint to be canceled
    // that it is complete.
    ctx.markComplete();
    ctx = null;
  }
  prog.endPhase(Phase.SAVING_CHECKPOINT);
}
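Two details are worth noting here: a null canceler is replaced with a fresh instance so downstream code never needs a null check, and a cancelled save first deletes its partial checkpoint (deleteCancelledCheckpoint) before ctx.checkCancelled() propagates the cancellation as an exception.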
 
Example 18
Source Project: hadoop   Source File: SecondaryNameNode.java    License: Apache License 2.0
/**
 * Create a new checkpoint
 * @return if the image is fetched from primary or not
 */
@VisibleForTesting
@SuppressWarnings("deprecation")
public boolean doCheckpoint() throws IOException {
  checkpointImage.ensureCurrentDirExists();
  NNStorage dstStorage = checkpointImage.getStorage();
  
  // Tell the namenode to start logging transactions in a new edit file
  // Returns a token that would be used to upload the merged image.
  CheckpointSignature sig = namenode.rollEditLog();
  
  boolean loadImage = false;
  boolean isFreshCheckpointer = (checkpointImage.getNamespaceID() == 0);
  boolean isSameCluster =
      (dstStorage.versionSupportsFederation(NameNodeLayoutVersion.FEATURES)
          && sig.isSameCluster(checkpointImage)) ||
      (!dstStorage.versionSupportsFederation(NameNodeLayoutVersion.FEATURES)
          && sig.namespaceIdMatches(checkpointImage));
  if (isFreshCheckpointer ||
      (isSameCluster &&
       !sig.storageVersionMatches(checkpointImage.getStorage()))) {
    // if we're a fresh 2NN, or if we're on the same cluster and our storage
    // needs an upgrade, just take the storage info from the server.
    dstStorage.setStorageInfo(sig);
    dstStorage.setClusterID(sig.getClusterID());
    dstStorage.setBlockPoolID(sig.getBlockpoolID());
    loadImage = true;
  }
  sig.validateStorageInfo(checkpointImage);

  // error simulation code for junit test
  CheckpointFaultInjector.getInstance().afterSecondaryCallsRollEditLog();

  RemoteEditLogManifest manifest =
    namenode.getEditLogManifest(sig.mostRecentCheckpointTxId + 1);

  // Fetch fsimage and edits. Reload the image if previous merge failed.
  loadImage |= downloadCheckpointFiles(
      fsName, checkpointImage, sig, manifest) |
      checkpointImage.hasMergeError();
  try {
    doMerge(sig, manifest, loadImage, checkpointImage, namesystem);
  } catch (IOException ioe) {
    // A merge error occurred. The in-memory file system state may be
    // inconsistent, so the image and edits need to be reloaded.
    checkpointImage.setMergeError();
    throw ioe;
  }
  // Clear any error since merge was successful.
  checkpointImage.clearMergeError();

  
  //
  // Upload the new image to the NameNode, then tell the NameNode to make
  // this newly uploaded image the most current image.
  //
  long txid = checkpointImage.getLastAppliedTxId();
  TransferFsImage.uploadImageFromStorage(fsName, conf, dstStorage,
      NameNodeFile.IMAGE, txid);

  // error simulation code for junit test
  CheckpointFaultInjector.getInstance().afterSecondaryUploadsNewImage();

  LOG.warn("Checkpoint done. New Image Size: " 
           + dstStorage.getFsImageName(txid).length());

  if (legacyOivImageDir != null && !legacyOivImageDir.isEmpty()) {
    try {
      checkpointImage.saveLegacyOIVImage(namesystem, legacyOivImageDir,
          new Canceler());
    } catch (IOException e) {
      LOG.warn("Failed to write legacy OIV image: ", e);
    }
  }
  return loadImage;
}
 
Example 19
Source Project: hadoop   Source File: TestStandbyCheckpoints.java    License: Apache License 2.0
/**
 * Make sure that clients will receive StandbyExceptions even when a
 * checkpoint is in progress on the SBN, and therefore the StandbyCheckpointer
 * thread will have FSNS lock. Regression test for HDFS-4591.
 */
@Test(timeout=300000)
public void testStandbyExceptionThrownDuringCheckpoint() throws Exception {
  
  // Set it up so that we know when the SBN checkpoint starts and ends.
  FSImage spyImage1 = NameNodeAdapter.spyOnFsImage(nn1);
  DelayAnswer answerer = new DelayAnswer(LOG);
  Mockito.doAnswer(answerer).when(spyImage1)
      .saveNamespace(Mockito.any(FSNamesystem.class),
          Mockito.eq(NameNodeFile.IMAGE), Mockito.any(Canceler.class));

  // Perform some edits and wait for a checkpoint to start on the SBN.
  doEdits(0, 1000);
  nn0.getRpcServer().rollEditLog();
  answerer.waitForCall();
  assertTrue("SBN is not performing checkpoint but it should be.",
      answerer.getFireCount() == 1 && answerer.getResultCount() == 0);
  
  // Make sure that the lock has actually been taken by the checkpointing
  // thread.
  ThreadUtil.sleepAtLeastIgnoreInterrupts(1000);
  try {
    // Perform an RPC to the SBN and make sure it throws a StandbyException.
    nn1.getRpcServer().getFileInfo("/");
    fail("Should have thrown StandbyException, but instead succeeded.");
  } catch (StandbyException se) {
    GenericTestUtils.assertExceptionContains("is not supported", se);
  }

  // Make sure new incremental block reports are processed during
  // checkpointing on the SBN.
  assertEquals(0, cluster.getNamesystem(1).getPendingDataNodeMessageCount());
  doCreate();
  Thread.sleep(1000);
  assertTrue(cluster.getNamesystem(1).getPendingDataNodeMessageCount() > 0);
  
  // Make sure that the checkpoint is still going on, implying that the client
  // RPC to the SBN happened during the checkpoint.
  assertTrue("SBN should have still been checkpointing.",
      answerer.getFireCount() == 1 && answerer.getResultCount() == 0);
  answerer.proceed();
  answerer.waitForResult();
  assertTrue("SBN should have finished checkpointing.",
      answerer.getFireCount() == 1 && answerer.getResultCount() == 1);
}