Java Code Examples for org.apache.hadoop.fs.FileSystem#resolvePath()

The following examples show how to use org.apache.hadoop.fs.FileSystem#resolvePath(). Each example notes the project and source file it was taken from.
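As a quick orientation, here is a minimal, self-contained sketch of the basic call pattern (the path and class name below are placeholders, not taken from any of the projects listed): FileSystem#resolvePath() returns the fully qualified path with any symlinks or mount points resolved, and throws FileNotFoundException if the path does not exist.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ResolvePathExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder path; point this at a file or directory that actually exists.
        Path p = new Path("/tmp/example-dir");
        FileSystem fs = p.getFileSystem(conf);
        // resolvePath() fully qualifies the path and follows symlinks/mount points;
        // it throws FileNotFoundException if the path does not exist.
        Path resolved = fs.resolvePath(p);
        System.out.println("Resolved path: " + resolved);
    }
}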
Example 1
Source File: UpdateColumnJob.java    From indexr with Apache License 2.0
@Override
public List<InputSplit> getSplits(final JobContext jobContext) throws IOException, InterruptedException {
    FileSystem fileSystem = FileSystem.get(jobContext.getConfiguration());
    Config config = JsonUtil.fromJson(jobContext.getConfiguration().get(CONFKEY), Config.class);

    List<InputSplit> segmentSplits = new ArrayList<>();
    Path rootPath = new Path(config.tableRoot);
    rootPath = fileSystem.resolvePath(rootPath);
    String dirStr = rootPath.toString() + "/";
    SegmentHelper.literalAllSegments(fileSystem, rootPath, f -> {
        String segmentPath = f.getPath().toString();
        long segmentSize = f.getLen();
        segmentSplits.add(new SegmentSplit(new SegmentFile(
                segmentPath,
                StringUtils.removeStart(segmentPath, dirStr),
                segmentSize)));
    });
    return segmentSplits;
}
 
Example 2
Source File: HRegionFileSystem.java    From hbase with Apache License 2.0
/**
 * Bulk load: Add a specified store file to the specified family.
 * If the source file is on the same file system as the destination store, it is moved
 * from the source location to the destination location; otherwise it is copied over.
 *
 * @param familyName Family that will gain the file
 * @param srcPath {@link Path} to the file to import
 * @param seqNum Bulk Load sequence number
 * @return The destination {@link Path} of the bulk loaded file
 * @throws IOException
 */
Pair<Path, Path> bulkLoadStoreFile(final String familyName, Path srcPath, long seqNum)
    throws IOException {
  // Copy the file if it's on another filesystem
  FileSystem srcFs = srcPath.getFileSystem(conf);
  srcPath = srcFs.resolvePath(srcPath);
  FileSystem realSrcFs = srcPath.getFileSystem(conf);
  FileSystem desFs = fs instanceof HFileSystem ? ((HFileSystem)fs).getBackingFs() : fs;

  // We can't compare FileSystem instances as equals() includes UGI instance
  // as part of the comparison and won't work when doing SecureBulkLoad
  // TODO deal with viewFS
  if (!FSUtils.isSameHdfs(conf, realSrcFs, desFs)) {
    LOG.info("Bulk-load file " + srcPath + " is on different filesystem than " +
        "the destination store. Copying file over to destination filesystem.");
    Path tmpPath = createTempName();
    FileUtil.copy(realSrcFs, srcPath, fs, tmpPath, false, conf);
    LOG.info("Copied " + srcPath + " to temporary path on destination filesystem: " + tmpPath);
    srcPath = tmpPath;
  }

  return new Pair<>(srcPath, preCommitStoreFile(familyName, srcPath, seqNum, true));
}
 
Example 3
Source File: TezCommonUtils.java    From incubator-tez with Apache License 2.0
/**
 * <p>
 * This function returns the staging directory defined in the config with
 * property name <code>TezConfiguration.TEZ_AM_STAGING_DIR</code>. If the
 * property is not defined in the conf, Tez uses the value defined as
 * <code>TezConfiguration.TEZ_AM_STAGING_DIR_DEFAULT</code>. In addition, the
 * function makes sure the staging directory exists; if it does not, it creates
 * the directory with permission <code>TEZ_AM_DIR_PERMISSION</code>.
 * </p>
 * 
 * @param conf
 *          TEZ configuration
 * @return Fully qualified staging directory
 */
public static Path getTezBaseStagingPath(Configuration conf) {
  String stagingDirStr = conf.get(TezConfiguration.TEZ_AM_STAGING_DIR,
      TezConfiguration.TEZ_AM_STAGING_DIR_DEFAULT);
  Path baseStagingDir;
  try {
    Path p = new Path(stagingDirStr);
    FileSystem fs = p.getFileSystem(conf);
    if (!fs.exists(p)) {
      mkDirForAM(fs, p);
      LOG.info("Stage directory " + p + " doesn't exist and is created");
    }
    baseStagingDir = fs.resolvePath(p);
  } catch (IOException e) {
    throw new TezUncheckedException(e);
  }
  return baseStagingDir;
}
 
Example 4
Source File: TezCommonUtils.java    From tez with Apache License 2.0
/**
 * <p>
 * This function returns the staging directory defined in the config with
 * property name <code>TezConfiguration.TEZ_AM_STAGING_DIR</code>. If the
 * property is not defined in the conf, Tez uses the value defined as
 * <code>TezConfiguration.TEZ_AM_STAGING_DIR_DEFAULT</code>. In addition, the
 * function makes sure the staging directory exists; if it does not, it creates
 * the directory with permission <code>TEZ_AM_DIR_PERMISSION</code>.
 * </p>
 * 
 * @param conf
 *          TEZ configuration
 * @return Fully qualified staging directory
 */
public static Path getTezBaseStagingPath(Configuration conf) {
  String stagingDirStr = conf.get(TezConfiguration.TEZ_AM_STAGING_DIR,
      TezConfiguration.TEZ_AM_STAGING_DIR_DEFAULT);
  Path baseStagingDir;
  try {
    Path p = new Path(stagingDirStr);
    FileSystem fs = p.getFileSystem(conf);
    if (!fs.exists(p)) {
      mkDirForAM(fs, p);
      LOG.info("Stage directory " + p + " doesn't exist and is created");
    }
    baseStagingDir = fs.resolvePath(p);
  } catch (IOException e) {
    throw new TezUncheckedException(e);
  }
  return baseStagingDir;
}
 
Example 5
Source File: FileSegmentManager.java    From indexr with Apache License 2.0
public FileSegmentManager(String tableName, FileSystem fileSystem, Path segmentRootPath) throws Exception {
    this.tableName = tableName;
    this.fileSystem = fileSystem;

    this.segmentRootPath = segmentRootPath;
    if (!fileSystem.exists(segmentRootPath)) {
        fileSystem.mkdirs(segmentRootPath);
    }
    this.segmentRootPath = fileSystem.resolvePath(segmentRootPath);
}
 
Example 6
Source File: SegmentHelper.java    From indexr with Apache License 2.0
public static List<String> listSegmentNames(FileSystem fileSystem, Path dir) throws IOException {
    dir = fileSystem.resolvePath(dir);
    String dirStr = dir.toString() + "/";
    List<String> names = new ArrayList<>(2048);
    literalAllSegments(fileSystem, dir, f -> {
        String name = StringUtils.removeStart(f.getPath().toString(), dirStr);
        names.add(name);
    });
    return names;
}
 
Example 7
Source File: TestMRInputHelpers.java    From tez with Apache License 2.0
@Test(timeout = 5000)
public void testInputSplitLocalResourceCreationWithDifferentFS() throws Exception {
  FileSystem localFs = FileSystem.getLocal(conf);
  Path LOCAL_TEST_ROOT_DIR = new Path("target"
      + Path.SEPARATOR + TestMRHelpers.class.getName() + "-localtmpDir");

  try {
    localFs.mkdirs(LOCAL_TEST_ROOT_DIR);

    Path splitsDir = localFs.resolvePath(LOCAL_TEST_ROOT_DIR);

    DataSourceDescriptor dataSource = generateDataSourceDescriptorMapRed(splitsDir);

    Map<String, LocalResource> localResources = dataSource.getAdditionalLocalFiles();

    Assert.assertEquals(2, localResources.size());
    Assert.assertTrue(localResources.containsKey(
        MRInputHelpers.JOB_SPLIT_RESOURCE_NAME));
    Assert.assertTrue(localResources.containsKey(
        MRInputHelpers.JOB_SPLIT_METAINFO_RESOURCE_NAME));

    for (LocalResource lr : localResources.values()) {
      Assert.assertFalse(lr.getResource().getScheme().contains(remoteFs.getScheme()));
    }
  } finally {
    localFs.delete(LOCAL_TEST_ROOT_DIR, true);
  }
}
 
Example 8
Source File: MRApps.java    From hadoop with Apache License 2.0
private static void parseDistributedCacheArtifacts(
    Configuration conf,
    Map<String, LocalResource> localResources,
    LocalResourceType type,
    URI[] uris, long[] timestamps, long[] sizes, boolean visibilities[])
throws IOException {

  if (uris != null) {
    // Sanity check
    if ((uris.length != timestamps.length) || (uris.length != sizes.length) ||
        (uris.length != visibilities.length)) {
      throw new IllegalArgumentException("Invalid specification for " +
          "distributed-cache artifacts of type " + type + " :" +
          " #uris=" + uris.length +
          " #timestamps=" + timestamps.length +
          " #visibilities=" + visibilities.length
          );
    }
    
    for (int i = 0; i < uris.length; ++i) {
      URI u = uris[i];
      Path p = new Path(u);
      FileSystem remoteFS = p.getFileSystem(conf);
      p = remoteFS.resolvePath(p.makeQualified(remoteFS.getUri(),
          remoteFS.getWorkingDirectory()));
      // Add URI fragment or just the filename
      Path name = new Path((null == u.getFragment())
        ? p.getName()
        : u.getFragment());
      if (name.isAbsolute()) {
        throw new IllegalArgumentException("Resource name must be relative");
      }
      String linkName = name.toUri().getPath();
      LocalResource orig = localResources.get(linkName);
      org.apache.hadoop.yarn.api.records.URL url = 
        ConverterUtils.getYarnUrlFromURI(p.toUri());
      if(orig != null && !orig.getResource().equals(url)) {
        LOG.warn(
            getResourceDescription(orig.getType()) + 
            toString(orig.getResource()) + " conflicts with " + 
            getResourceDescription(type) + toString(url) + 
            " This will be an error in Hadoop 2.0");
        continue;
      }
      localResources.put(linkName, LocalResource.newInstance(ConverterUtils
        .getYarnUrlFromURI(p.toUri()), type, visibilities[i]
          ? LocalResourceVisibility.PUBLIC : LocalResourceVisibility.PRIVATE,
        sizes[i], timestamps[i]));
    }
  }
}
 
Example 9
Source File: DFSck.java    From hadoop with Apache License 2.0
private Path getResolvedPath(String dir) throws IOException {
  Configuration conf = getConf();
  Path dirPath = new Path(dir);
  FileSystem fs = dirPath.getFileSystem(conf);
  return fs.resolvePath(dirPath);
}
 
Example 10
Source File: MRApps.java    From big-c with Apache License 2.0
private static void parseDistributedCacheArtifacts(
    Configuration conf,
    Map<String, LocalResource> localResources,
    LocalResourceType type,
    URI[] uris, long[] timestamps, long[] sizes, boolean visibilities[])
throws IOException {

  if (uris != null) {
    // Sanity check
    if ((uris.length != timestamps.length) || (uris.length != sizes.length) ||
        (uris.length != visibilities.length)) {
      throw new IllegalArgumentException("Invalid specification for " +
          "distributed-cache artifacts of type " + type + " :" +
          " #uris=" + uris.length +
          " #timestamps=" + timestamps.length +
          " #visibilities=" + visibilities.length
          );
    }
    
    for (int i = 0; i < uris.length; ++i) {
      URI u = uris[i];
      Path p = new Path(u);
      FileSystem remoteFS = p.getFileSystem(conf);
      p = remoteFS.resolvePath(p.makeQualified(remoteFS.getUri(),
          remoteFS.getWorkingDirectory()));
      // Add URI fragment or just the filename
      Path name = new Path((null == u.getFragment())
        ? p.getName()
        : u.getFragment());
      if (name.isAbsolute()) {
        throw new IllegalArgumentException("Resource name must be relative");
      }
      String linkName = name.toUri().getPath();
      LocalResource orig = localResources.get(linkName);
      org.apache.hadoop.yarn.api.records.URL url = 
        ConverterUtils.getYarnUrlFromURI(p.toUri());
      if(orig != null && !orig.getResource().equals(url)) {
        LOG.warn(
            getResourceDescription(orig.getType()) + 
            toString(orig.getResource()) + " conflicts with " + 
            getResourceDescription(type) + toString(url) + 
            " This will be an error in Hadoop 2.0");
        continue;
      }
      localResources.put(linkName, LocalResource.newInstance(ConverterUtils
        .getYarnUrlFromURI(p.toUri()), type, visibilities[i]
          ? LocalResourceVisibility.PUBLIC : LocalResourceVisibility.PRIVATE,
        sizes[i], timestamps[i]));
    }
  }
}
 
Example 11
Source File: DFSck.java    From big-c with Apache License 2.0
private Path getResolvedPath(String dir) throws IOException {
  Configuration conf = getConf();
  Path dirPath = new Path(dir);
  FileSystem fs = dirPath.getFileSystem(conf);
  return fs.resolvePath(dirPath);
}
 
Example 12
Source File: FileSegmentPool.java    From indexr with Apache License 2.0
public FileSegmentPool(String tableName,
                       FileSystem fileSystem,
                       Path segmentRootPath,
                       java.nio.file.Path localDataRoot,
                       ScheduledExecutorService notifyService) throws Exception {
    super(tableName, fileSystem, segmentRootPath);
    if (!fileSystem.exists(segmentRootPath)) {
        fileSystem.mkdirs(segmentRootPath);
    }
    this.segmentRootPath = fileSystem.resolvePath(segmentRootPath);
    this.segmentRootPathStr = segmentRootPath.toString() + "/";
    this.fileSystem = fileSystem;
    this.localCachePath = IndexRConfig.localCacheSegmentFdPath(localDataRoot, tableName);
    this.updateFilePath = IndexRConfig.segmentUpdateFilePath(segmentRootPath);

    if (!fileSystem.exists(updateFilePath)) {
        fileSystem.create(updateFilePath, true);
    }
    if (!Files.exists(localCachePath.getParent())) {
        Files.createDirectories(localCachePath.getParent());
    }

    // Load segments before doing any query.
    boolean ok = Try.on(this::loadFromLocalCache,
            1, logger,
            String.format("Load %s segmentFds from local cache failed", tableName));
    if (!ok) {
        mustRefresh = true;
    }

    this.refreshSegment = notifyService.scheduleWithFixedDelay(
            () -> this.refresh(false),
            TimeUnit.SECONDS.toMillis(1),
            RefreshSegmentPeriod + random.nextInt(1000),
            TimeUnit.MILLISECONDS);
    this.refreshLocality = notifyService.scheduleWithFixedDelay(
            this::refreshLocalities,
            TimeUnit.SECONDS.toMillis(1) + random.nextInt(5000),
            RefreshLocalityPeriod + random.nextInt(5000),
            TimeUnit.MILLISECONDS);
}