org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream Java Examples
The following examples show how to use
org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream.
Example #1
Source File: GzipPayloadCoding.java From packagedrone with Eclipse Public License 1.0
@Override
public OutputStream createOutputStream ( final OutputStream out, final Optional<String> optionalFlags ) throws IOException
{
    final String flags;
    final int compressionLevel;

    // The first character of a non-empty flag string is the compression level
    if ( optionalFlags.isPresent () && ( flags = optionalFlags.get () ).length () > 0 )
    {
        compressionLevel = Integer.parseInt ( flags.substring ( 0, 1 ) );
    }
    else
    {
        compressionLevel = Deflater.BEST_COMPRESSION;
    }

    final GzipParameters parameters = new GzipParameters ();
    parameters.setCompressionLevel ( compressionLevel );

    return new GzipCompressorOutputStream ( out, parameters );
}
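The level-selection logic in Example #1 can be tried without any extra dependency. The following is a minimal sketch, not the packagedrone implementation: it swaps in the JDK's `java.util.zip.GZIPOutputStream` for `GzipCompressorOutputStream`, and the class name `GzipLevelSketch` and its helper methods are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipLevelSketch {

    /** Hypothetical helper: the first character of a non-empty flag string is the level. */
    public static int parseLevel(Optional<String> optionalFlags) {
        return optionalFlags
                .filter(flags -> !flags.isEmpty())
                .map(flags -> Integer.parseInt(flags.substring(0, 1)))
                .orElse(Deflater.BEST_COMPRESSION);
    }

    /** Gzips the data at the given level using only JDK classes. */
    public static byte[] compress(byte[] data, int level) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // GZIPOutputStream has no level argument; the protected Deflater
        // field 'def' is configured from an anonymous subclass instead.
        try (GZIPOutputStream gz = new GZIPOutputStream(sink) {
            { def.setLevel(level); }
        }) {
            gz.write(data);
        }
        // The stream is closed here, so the gzip trailer has been written.
        return sink.toByteArray();
    }

    /** Reads a gzip byte array back into the original bytes. */
    public static byte[] decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] input = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(input, parseLevel(Optional.of("1")));
        System.out.println(new String(decompress(packed), StandardCharsets.UTF_8));
    }
}
```

As in the original, a flag string that does not start with a digit makes `Integer.parseInt` throw a `NumberFormatException`; a caller that accepts untrusted flags would probably want to validate them first.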
Example #2
Source File: LineWriter.java From wikiforia with GNU General Public License v2.0
/**
 * Open a concurrent gzip compressed line writer (fastest compression)
 * @param target target location
 * @param limit the limit per split
 * @param maxWriters the maximum number of writers
 * @return parallel writer
 */
public static ParallelSplitWriter<String> openFastGzipParallelWriter(File target, final int limit, final int maxWriters) {
    return new ParallelSplitWriter<String>(target, maxWriters) {
        @Override
        protected Writer<String> newWriter(File path) {
            GzipParameters parameters = new GzipParameters();
            parameters.setCompressionLevel(Deflater.BEST_SPEED);
            parameters.setFilename(path.getName());

            try {
                return new LineWriter(new GzipCompressorOutputStream(new FileOutputStream(path.getAbsolutePath() + ".gz"), parameters), limit);
            } catch (IOException e) {
                throw new IOError(e);
            }
        }
    };
}
Example #3
Source File: StreamUtils.java From incubator-gobblin with Apache License 2.0
/**
 * Similar to {@link #tar(FileSystem, Path, Path)} except the source and destination {@link FileSystem} can be different.
 *
 * @see #tar(FileSystem, Path, Path)
 */
public static void tar(FileSystem sourceFs, FileSystem destFs, Path sourcePath, Path destPath) throws IOException {
    try (FSDataOutputStream fsDataOutputStream = destFs.create(destPath);
         TarArchiveOutputStream tarArchiveOutputStream = new TarArchiveOutputStream(
             new GzipCompressorOutputStream(fsDataOutputStream), ConfigurationKeys.DEFAULT_CHARSET_ENCODING.name())) {

        FileStatus fileStatus = sourceFs.getFileStatus(sourcePath);

        if (sourceFs.isDirectory(sourcePath)) {
            dirToTarArchiveOutputStreamRecursive(fileStatus, sourceFs, Optional.<Path> absent(), tarArchiveOutputStream);
        } else {
            try (FSDataInputStream fsDataInputStream = sourceFs.open(sourcePath)) {
                fileToTarArchiveOutputStream(fileStatus, fsDataInputStream, new Path(sourcePath.getName()),
                    tarArchiveOutputStream);
            }
        }
    }
}
Example #4
Source File: DefaultArchiveExtractorTest.java From flow with Apache License 2.0
@Test(expected = ArchiveExtractionException.class)
public void extractTarAsZip_ArchiveExtractionExceptionIsThrown()
        throws IOException, ArchiveExtractionException {
    File archiveFile = new File(baseDir, "archive.zip");
    archiveFile.createNewFile();
    Path tempArchive = archiveFile.toPath();

    // Write a tar.gz payload into a file with a .zip extension
    try (OutputStream fo = Files.newOutputStream(tempArchive);
         OutputStream gzo = new GzipCompressorOutputStream(fo);
         ArchiveOutputStream o = new TarArchiveOutputStream(gzo)) {
        o.putArchiveEntry(o.createArchiveEntry(new File(ROOT_FILE), ROOT_FILE));
        o.closeArchiveEntry();
    }

    new DefaultArchiveExtractor().extract(archiveFile, targetDir);
}
Example #5
Source File: DefaultArchiveExtractorTest.java From flow with Apache License 2.0
@Test
public void extractTarGz_contentsAreExtracted()
        throws IOException, ArchiveExtractionException {
    File archiveFile = new File(baseDir, "archive.tar.gz");
    archiveFile.createNewFile();
    Path tempArchive = archiveFile.toPath();

    try (OutputStream fo = Files.newOutputStream(tempArchive);
         OutputStream gzo = new GzipCompressorOutputStream(fo);
         ArchiveOutputStream o = new TarArchiveOutputStream(gzo)) {
        o.putArchiveEntry(o.createArchiveEntry(new File(ROOT_FILE), ROOT_FILE));
        o.closeArchiveEntry();
        o.putArchiveEntry(o.createArchiveEntry(new File(SUBFOLDER_FILE), SUBFOLDER_FILE));
        o.closeArchiveEntry();
    }

    new DefaultArchiveExtractor().extract(archiveFile, targetDir);

    Assert.assertTrue("Archive root.file was not extracted",
            new File(targetDir, ROOT_FILE).exists());
    Assert.assertTrue("Archive subfolder/folder.file was not extracted",
            new File(targetDir, SUBFOLDER_FILE).exists());
}
Example #6
Source File: ReadMetadata.java From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * Serializes a read-metadata by itself into a file.
 * @param meta the read-metadata to serialize.
 * @param whereTo the name of the file or resource where it will go to.
 * @throws IllegalArgumentException if either {@code meta} or {@code whereTo}
 *         is {@code null}.
 * @throws UserException if there was a problem during serialization.
 */
public static void writeStandalone(final ReadMetadata meta, final String whereTo) {
    try {
        final OutputStream outputStream = BucketUtils.createFile(whereTo);
        // Compress only when the target name carries a block-compressed extension
        final OutputStream actualStream = IOUtil.hasBlockCompressedExtension(whereTo)
                ? new GzipCompressorOutputStream(outputStream) : outputStream;
        final Output output = new Output(actualStream);
        final Kryo kryo = new Kryo();
        final Serializer serializer = new Serializer();
        output.writeString(MAGIC_STRING);
        output.writeString(VERSION_STRING);
        serializer.write(kryo, output, meta);
        output.close();
    } catch (final IOException ex) {
        throw new UserException.CouldNotCreateOutputFile(whereTo, ex);
    }
}
Example #7
Source File: CompressionTools.java From aws-codepipeline-plugin-for-jenkins with Apache License 2.0
public static void compressTarGzFile(
        final File temporaryTarGzFile,
        final Path pathToCompress,
        final BuildListener listener)
        throws IOException {
    try (final TarArchiveOutputStream tarGzArchiveOutputStream =
            new TarArchiveOutputStream(
            new BufferedOutputStream(
            new GzipCompressorOutputStream(
            new FileOutputStream(temporaryTarGzFile))))) {
        compressArchive(
                pathToCompress,
                tarGzArchiveOutputStream,
                new ArchiveEntryFactory(CompressionType.TarGz),
                CompressionType.TarGz,
                listener);
    }
}
Example #8
Source File: FileHelper.java From incubator-heron with Apache License 2.0
public static boolean createTarGz(File archive, File... files) {
    try (
        FileOutputStream fileOutputStream = new FileOutputStream(archive);
        BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream);
        GzipCompressorOutputStream gzipOutputStream =
            new GzipCompressorOutputStream(bufferedOutputStream);
        TarArchiveOutputStream archiveOutputStream = new TarArchiveOutputStream(gzipOutputStream)
    ) {
        for (File file : files) {
            addFileToArchive(archiveOutputStream, file, "");
        }
        archiveOutputStream.finish();
    } catch (IOException ioe) {
        LOG.error("Failed to create archive {} file.", archive, ioe);
        return false;
    }
    return true;
}
Example #9
Source File: TarGzCompressionUtils.java From incubator-pinot with Apache License 2.0
public static String createTarGzOfDirectory(String directoryPath, String tarGzPath, String entryPrefix)
        throws IOException {
    if (!tarGzPath.endsWith(TAR_GZ_FILE_EXTENSION)) {
        tarGzPath = tarGzPath + TAR_GZ_FILE_EXTENSION;
    }
    try (FileOutputStream fOut = new FileOutputStream(new File(tarGzPath));
         BufferedOutputStream bOut = new BufferedOutputStream(fOut);
         GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(bOut);
         TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {
        tOut.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
        addFileToTarGz(tOut, directoryPath, entryPrefix);
    } catch (IOException e) {
        LOGGER.error("Failed to create tar.gz file for {} at path: {}", directoryPath, tarGzPath, e);
        Utils.rethrowException(e);
    }
    return tarGzPath;
}
Example #10
Source File: AthenaAuditWriter.java From emodb with Apache License 2.0
/**
 * This method takes all closed log files and GZIPs and renames them in preparation for transfer.  If the operation
 * fails the original file is unmodified so the next call should attempt to prepare the file again.  This means
 * the same file may be transferred more than once, but this guarantees that so long as the host remains active the
 * file will eventually be transferred.
 */
private void prepareClosedLogFilesForTransfer() {
    for (final File logFile : _stagingDir.listFiles((dir, name) -> name.startsWith(_logFilePrefix) && name.endsWith(CLOSED_FILE_SUFFIX))) {
        boolean moved;
        String fileName = logFile.getName().substring(0, logFile.getName().length() - CLOSED_FILE_SUFFIX.length()) + COMPRESSED_FILE_SUFFIX;
        try (FileInputStream fileIn = new FileInputStream(logFile);
             FileOutputStream fileOut = new FileOutputStream(new File(logFile.getParentFile(), fileName));
             GzipCompressorOutputStream gzipOut = new GzipCompressorOutputStream(fileOut)) {

            ByteStreams.copy(fileIn, gzipOut);
            moved = true;
        } catch (IOException e) {
            _log.warn("Failed to compress audit log file: {}", logFile, e);
            moved = false;
        }

        // Delete the original only after the compressed copy was written successfully
        if (moved) {
            if (!logFile.delete()) {
                _log.warn("Failed to delete audit log file: {}", logFile);
            }
        }
    }
}
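The "compress first, delete the original only on success" pattern described in the Javadoc of Example #10 can be sketched with JDK classes alone. This is an illustrative sketch, not the emodb code: it uses `java.util.zip.GZIPOutputStream` in place of `GzipCompressorOutputStream`, and the class `CompressThenDeleteSketch`, the `.gz` suffix, and the clean-up of a partial target file are assumptions made for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

final class CompressThenDeleteSketch {

    /**
     * Gzips 'source' into a sibling file with a ".gz" suffix.
     * Returns the compressed path on success. On failure returns null
     * and leaves the source file untouched so a later call can retry.
     */
    public static Path compressAndRemove(Path source) {
        Path target = source.resolveSibling(source.getFileName() + ".gz");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            Files.copy(source, out);
        } catch (IOException e) {
            // Assumption for this sketch: drop a partially written target.
            try {
                Files.deleteIfExists(target);
            } catch (IOException ignored) {
                // Nothing more to do; the source is still intact.
            }
            return null;
        }
        // Only reached when the compressed copy was written and closed.
        try {
            Files.delete(source);
        } catch (IOException e) {
            // Both files now exist; the source may be compressed again later.
        }
        return target;
    }
}
```

Because the source is removed only after the try-with-resources block has closed the gzip stream, a crash or I/O error during compression can at worst cause the same file to be processed twice, which matches the trade-off the Javadoc describes.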
Example #11
Source File: GeneratorService.java From vertx-starter with Apache License 2.0
public Buffer onProjectRequested(VertxProject project) throws Exception {
    ArchiveOutputStreamFactory factory;
    ArchiveFormat archiveFormat = project.getArchiveFormat();
    if (archiveFormat == ArchiveFormat.TGZ) {
        factory = baos -> new TarArchiveOutputStream(new GzipCompressorOutputStream(baos));
    } else if (archiveFormat == ArchiveFormat.ZIP) {
        factory = baos -> new ZipArchiveOutputStream(baos);
    } else {
        throw new IllegalArgumentException("Unsupported archive format: " + archiveFormat.getFileExtension());
    }

    try (TempDir tempDir = TempDir.create();
         ByteArrayOutputStream baos = new ByteArrayOutputStream();
         ArchiveOutputStream out = factory.create(baos)) {

        createProject(project, tempDir);
        generateArchive(tempDir, out);

        // Finish and close before reading the bytes so the archive is complete
        out.finish();
        out.close();

        return Buffer.buffer(baos.toByteArray());
    }
}
Example #12
Source File: CompressedXmiWriter.java From argument-reasoning-comprehension-task with Apache License 2.0
@Override
public void initialize(UimaContext aContext)
        throws ResourceInitializationException
{
    super.initialize(aContext);

    // some param check
    if (!outputFile.getName().endsWith(".tar.gz")) {
        throw new ResourceInitializationException(
                new IllegalArgumentException("Output file must have .tar.gz extension"));
    }

    typeSystemWritten = false;

    try {
        outputStream = new TarArchiveOutputStream(new GzipCompressorOutputStream(
                new BufferedOutputStream(new FileOutputStream(outputFile))));
    }
    catch (IOException ex) {
        throw new ResourceInitializationException(ex);
    }
}
Example #13
Source File: TarGzCompressionUtilsTest.java From incubator-pinot with Apache License 2.0
private void createInvalidTarFile(File nonDirFile, File tarGzPath) {
    try (FileOutputStream fOut = new FileOutputStream(new File(tarGzPath.getPath()));
         BufferedOutputStream bOut = new BufferedOutputStream(fOut);
         GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(bOut);
         TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {
        tOut.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

        // Mock the file that doesn't use the correct file name.
        String badEntryName = "../foo/bar";
        TarArchiveEntry tarEntry = new TarArchiveEntry(nonDirFile, badEntryName);
        tOut.putArchiveEntry(tarEntry);
        // Close the input stream once its content has been copied into the entry
        try (FileInputStream in = new FileInputStream(nonDirFile)) {
            IOUtils.copy(in, tOut);
        }
        tOut.closeArchiveEntry();
    } catch (IOException e) {
        Assert.fail("Unexpected Exception!!");
    }
}
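Example #13 builds an archive whose entry name, `../foo/bar`, points outside the extraction directory. The check that an extractor might run against such names can be sketched as follows. This is a hypothetical illustration using only `java.nio.file`, not the validation that incubator-pinot actually performs; the class and method names are invented for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class EntryNameCheckSketch {

    /**
     * Returns true when the entry name, resolved against the target
     * directory and normalized, still lies inside that directory.
     */
    public static boolean staysInside(Path targetDir, String entryName) {
        Path base = targetDir.toAbsolutePath().normalize();
        // resolve() keeps absolute entry names absolute, so those are rejected too.
        Path resolved = base.resolve(entryName).normalize();
        return resolved.startsWith(base);
    }

    public static void main(String[] args) {
        Path target = Paths.get("extract-here");
        System.out.println(staysInside(target, "foo/bar"));     // entry below the target
        System.out.println(staysInside(target, "../foo/bar"));  // the bad name from the test
    }
}
```

`Path.startsWith` compares whole path elements rather than raw strings, so a sibling directory such as `extract-here-2` is not mistaken for a child of `extract-here`.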
Example #14
Source File: TarGzipPacker.java From twister2 with Apache License 2.0
/**
 * create TarGzipPacker object
 */
public static TarGzipPacker createTarGzipPacker(String targetDir, Config config) {
    // this should be received from config
    String archiveFilename = SchedulerContext.jobPackageFileName(config);
    Path archiveFile = Paths.get(targetDir + "/" + archiveFilename);

    try {
        // construct output stream
        OutputStream outStream = Files.newOutputStream(archiveFile);
        GzipCompressorOutputStream gzipOutputStream = new GzipCompressorOutputStream(outStream);
        TarArchiveOutputStream tarOutputStream = new TarArchiveOutputStream(gzipOutputStream);

        return new TarGzipPacker(archiveFile, tarOutputStream);
    } catch (IOException ioe) {
        LOG.log(Level.SEVERE, "Archive file can not be created: " + archiveFile, ioe);
        return null;
    }
}
Example #15
Source File: ArchiveUtils.java From support-diagnostics with Apache License 2.0
public static boolean createTarArchive(String dir, String archiveFileName) {
    try {
        File srcDir = new File(dir);
        String filename = dir + "-" + archiveFileName + ".tar.gz";

        FileOutputStream fout = new FileOutputStream(filename);
        CompressorOutputStream cout = new GzipCompressorOutputStream(fout);
        TarArchiveOutputStream taos = new TarArchiveOutputStream(cout);

        taos.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_STAR);
        taos.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
        archiveResultsTar(archiveFileName, taos, srcDir, "", true);
        taos.close();

        logger.info(Constants.CONSOLE, "Archive: " + filename + " was created");
    } catch (Exception ioe) {
        logger.error("Couldn't create archive.", ioe);
        return false;
    }
    return true;
}
Example #16
Source File: NPMPackageGenerator.java From org.hl7.fhir.core with Apache License 2.0
private void start() throws IOException {
    OutputStream = new ByteArrayOutputStream();
    bufferedOutputStream = new BufferedOutputStream(OutputStream);
    gzipOutputStream = new GzipCompressorOutputStream(bufferedOutputStream);
    tar = new TarArchiveOutputStream(gzipOutputStream);
    indexer = new NpmPackageIndexBuilder();
    indexer.start();
}
Example #17
Source File: PGzipOutputStream.java From jphp with Apache License 2.0
@Signature
public void __construct(Environment env, OutputStream outputStream, @Nullable @Arg(type = ARRAY) Memory parameters) throws IOException {
    if (parameters.isNull()) {
        this.outputStream = new GzipCompressorOutputStream(outputStream);
    } else {
        this.outputStream = new GzipCompressorOutputStream(
                outputStream, parameters.toValue(ArrayMemory.class).toBean(env, GzipParameters.class)
        );
    }
}
Example #18
Source File: IOUtils.java From gatk with BSD 3-Clause "New" or "Revised" License
public static void writeTarGz(String name, File... files) throws IOException {
    try (TarArchiveOutputStream taos = new TarArchiveOutputStream(new GzipCompressorOutputStream(new FileOutputStream(name)))){
        // TAR has an 8 gig file limit by default, this gets around that
        taos.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_STAR);
        // TAR originally didn't support long file names, so enable the support for it
        taos.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
        taos.setAddPaxHeadersForNonAsciiNames(true);
        for (File file : files){
            addToTar(taos, file, ".");
        }
    }
}
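The "8 gig" limit mentioned in the comment of Example #18 follows from the classic tar header, which stores the entry size as an octal number in a 12-byte field. Assuming 11 usable octal digits (one byte is taken by a terminator), the arithmetic can be worked through as below; the class `TarSizeLimitSketch` is a hypothetical name used only for this illustration.

```java
final class TarSizeLimitSketch {

    /** Width of the size field in a classic tar header, in bytes. */
    static final int SIZE_FIELD_BYTES = 12;

    /** Largest size expressible with the octal digits that fit in the field. */
    public static long maxClassicEntrySize() {
        int digits = SIZE_FIELD_BYTES - 1; // one byte reserved for the terminator
        StringBuilder octal = new StringBuilder();
        for (int i = 0; i < digits; i++) {
            octal.append('7');
        }
        return Long.parseLong(octal.toString(), 8);
    }

    /** True when the size needs an extended encoding such as the one BIGNUMBER_STAR enables. */
    public static boolean needsBigNumberMode(long entrySize) {
        return entrySize > maxClassicEntrySize();
    }

    public static void main(String[] args) {
        // 8^11 - 1 = 2^33 - 1, i.e. one byte short of 8 GiB
        System.out.println(maxClassicEntrySize()); // prints 8589934591
    }
}
```

Entries at or below that size fit the plain header, which is why the big-number mode only matters for very large files.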
Example #19
Source File: ProjectGenerationController.java From initializr with Apache License 2.0
private TarArchiveOutputStream createTarArchiveOutputStream(OutputStream output) {
    try {
        return new TarArchiveOutputStream(new GzipCompressorOutputStream(output));
    }
    catch (IOException ex) {
        throw new IllegalStateException(ex);
    }
}
Example #20
Source File: MockStringContentFactory.java From Wikidata-Toolkit with Apache License 2.0
/**
 * Turns a string into a sequence of bytes, possibly compressed. In any
 * case, the character encoding used for converting the string into bytes is
 * UTF-8.
 *
 * @param string
 * @param compressionType
 * @return
 * @throws IOException
 */
public static byte[] getBytesFromString(String string,
        CompressionType compressionType) throws IOException {
    switch (compressionType) {
    case NONE:
        return string.getBytes(StandardCharsets.UTF_8);
    case BZ2:
    case GZIP:
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        OutputStreamWriter ow;
        if (compressionType == CompressionType.GZIP) {
            ow = new OutputStreamWriter(
                    new GzipCompressorOutputStream(out),
                    StandardCharsets.UTF_8);
        } else {
            ow = new OutputStreamWriter(
                    new BZip2CompressorOutputStream(out),
                    StandardCharsets.UTF_8);
        }

        ow.write(string);
        // Closing the writer also finishes the compressed stream
        ow.close();
        return out.toByteArray();
    default:
        throw new RuntimeException("Unknown compression type "
                + compressionType);
    }
}
Example #21
Source File: JsonSerializationProcessor.java From Wikidata-Toolkit with Apache License 2.0
/**
 * Constructor. Initializes various helper objects we use for the JSON
 * serialization, and opens the file that we want to write to.
 *
 * @throws IOException
 *             if there is a problem opening the output file
 */
public JsonSerializationProcessor() throws IOException {
    // Configuration of the filter
    DocumentDataFilter documentDataFilter = new DocumentDataFilter();
    // Only copy English labels, descriptions, and aliases:
    documentDataFilter.setLanguageFilter(Collections.singleton("en"));
    // Only copy statements of some properties:
    Set<PropertyIdValue> propertyFilter = new HashSet<>();
    propertyFilter.add(Datamodel.makeWikidataPropertyIdValue("P18")); // image
    propertyFilter.add(Datamodel.makeWikidataPropertyIdValue("P106")); // occupation
    propertyFilter.add(Datamodel.makeWikidataPropertyIdValue("P569")); // birthdate
    documentDataFilter.setPropertyFilter(propertyFilter);
    // Do not copy any sitelinks:
    documentDataFilter.setSiteLinkFilter(Collections.emptySet());

    // The filter is used to remove some parts from the documents we
    // serialize.
    this.datamodelFilter = new DatamodelFilter(new DataObjectFactoryImpl(), documentDataFilter);

    // The (compressed) file we write to.
    OutputStream outputStream = new GzipCompressorOutputStream(
            new BufferedOutputStream(
                    ExampleHelpers
                            .openExampleFileOuputStream(OUTPUT_FILE_NAME)));
    this.jsonSerializer = new JsonSerializer(outputStream);

    this.jsonSerializer.open();
}
Example #22
Source File: DirectoryManagerTest.java From Wikidata-Toolkit with Apache License 2.0
@Test
public void getCompressionInputStreamGzip() throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    OutputStreamWriter ow = new OutputStreamWriter(
            new GzipCompressorOutputStream(out), StandardCharsets.UTF_8);
    ow.write("Test data");
    ow.close();

    ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
    InputStream cin = dm.getCompressorInputStream(in, CompressionType.GZIP);
    // Decode with the same charset that was used for writing
    assertEquals("Test data",
            new BufferedReader(new InputStreamReader(cin, StandardCharsets.UTF_8)).readLine());
}
Example #23
Source File: GzipCommonsCompressor.java From yosegi with Apache License 2.0
@Override
public OutputStream createOutputStream(
    final OutputStream out ,
    final long decompressSize,
    final CompressResult compressResult ) throws IOException {
  GzipParameters op = new GzipParameters();
  int level = getCompressLevel( compressResult.getCompressionPolicy() );
  int optLevel = compressResult.getCurrentLevel();
  if ( ( level - optLevel ) < 1 ) {
    compressResult.setEnd();
    optLevel = compressResult.getCurrentLevel();
  }
  op.setCompressionLevel( level - optLevel );
  return new GzipCompressorOutputStream( out , op );
}
Example #24
Source File: SourceBlobPackager.java From heroku-maven-plugin with MIT License
public static Path pack(SourceBlobDescriptor sourceBlobDescriptor, OutputAdapter outputAdapter) throws IOException {
    Path tarFilePath = Files.createTempFile("heroku-deploy", "source-blob.tgz");

    TarArchiveOutputStream tarArchiveOutputStream
            = new TarArchiveOutputStream(new GzipCompressorOutputStream(new FileOutputStream(tarFilePath.toFile())));

    tarArchiveOutputStream.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);

    outputAdapter.logInfo("-----> Packaging application...");
    for (Path sourceBlobPath : sourceBlobDescriptor.getContents().keySet()) {
        SourceBlobDescriptor.SourceBlobContent content = sourceBlobDescriptor.getContents().get(sourceBlobPath);

        if (content.isHidden()) {
            outputAdapter.logDebug(" - including: " + sourceBlobPath + " (hidden)");
        } else {
            outputAdapter.logInfo(" - including: " + sourceBlobPath);
        }

        addIncludedPathToArchive(sourceBlobPath.toString(), content, tarArchiveOutputStream);
    }

    tarArchiveOutputStream.close();

    outputAdapter.logInfo("-----> Creating build...");
    outputAdapter.logInfo(" - file: " + tarFilePath);
    outputAdapter.logInfo(String.format(" - size: %dMB", Files.size(tarFilePath) / 1024 / 1024));

    return tarFilePath;
}
Example #25
Source File: CompressedDirectory.java From docker-client with Apache License 2.0
/**
 * This method creates a gzip tarball of the specified directory. File permissions will be
 * retained. The file will be created in a temporary directory using the {@link
 * Files#createTempFile(String, String, java.nio.file.attribute.FileAttribute[])} method. The
 * returned object is auto-closeable, and upon closing it, the archive file will be deleted.
 *
 * @param directory the directory to compress
 * @return a Path object representing the compressed directory
 * @throws IOException if the compressed directory could not be created.
 */
public static CompressedDirectory create(final Path directory) throws IOException {
    final Path file = Files.createTempFile("docker-client-", ".tar.gz");

    final Path dockerIgnorePath = directory.resolve(".dockerignore");
    final ImmutableList<DockerIgnorePathMatcher> ignoreMatchers =
        parseDockerIgnore(dockerIgnorePath);

    try (final OutputStream fileOut = Files.newOutputStream(file);
         final GzipCompressorOutputStream gzipOut = new GzipCompressorOutputStream(fileOut);
         final TarArchiveOutputStream tarOut = new TarArchiveOutputStream(gzipOut)) {
        tarOut.setLongFileMode(LONGFILE_POSIX);
        tarOut.setBigNumberMode(BIGNUMBER_POSIX);
        Files.walkFileTree(directory,
                           EnumSet.of(FileVisitOption.FOLLOW_LINKS),
                           Integer.MAX_VALUE,
                           new Visitor(directory, ignoreMatchers, tarOut));

    } catch (Throwable t) {
        // If an error occurs, delete the temporary file before rethrowing the exception.
        try {
            Files.delete(file);
        } catch (IOException e) {
            // So we don't lose track of the reason the file was deleted... might be important
            t.addSuppressed(e);
        }

        throw t;
    }

    return new CompressedDirectory(file);
}
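The clean-up pattern in Example #25 (delete the temporary file on any failure, attach a failed delete as a suppressed exception, then rethrow) can be isolated in a short JDK-only sketch. It is an illustration rather than the docker-client code: `java.util.zip.GZIPOutputStream` stands in for the Commons Compress streams, and the class `TempFileCleanupSketch`, the `Filler` callback, and the `dir` parameter are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;

final class TempFileCleanupSketch {

    /** Hypothetical callback that writes the payload into the compressed stream. */
    public interface Filler {
        void fill(OutputStream out) throws IOException;
    }

    /** Creates a gzip temp file in 'dir'; removes it again if writing fails. */
    public static Path createOrCleanUp(Path dir, Filler filler) throws IOException {
        Path file = Files.createTempFile(dir, "sketch-", ".gz");
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(file))) {
            filler.fill(out);
        } catch (Throwable t) {
            // The stream is already closed here, so the file can be deleted.
            try {
                Files.delete(file);
            } catch (IOException e) {
                // Keep the original failure primary; record the delete failure with it.
                t.addSuppressed(e);
            }
            throw t; // precise rethrow: only IOException or unchecked throwables
        }
        return file;
    }
}
```

Catching `Throwable` and rethrowing the same variable compiles without widening the `throws` clause because the compiler tracks which exception types the try block can actually raise.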
Example #26
Source File: TarEdgeArchiveBuilder.java From datacollector with Apache License 2.0
@Override
public void finish() throws IOException {
    try (
        TarArchiveOutputStream tarArchiveOutput = new TarArchiveOutputStream(new GzipCompressorOutputStream(outputStream));
        TarArchiveInputStream tarArchiveInput = new TarArchiveInputStream(new GzipCompressorInputStream(new FileInputStream(edgeArchive)))
    ) {
        tarArchiveOutput.setLongFileMode(TarArchiveOutputStream.LONGFILE_POSIX);

        // Copy every entry of the existing archive into the new one
        TarArchiveEntry entry = tarArchiveInput.getNextTarEntry();
        while (entry != null) {
            tarArchiveOutput.putArchiveEntry(entry);
            IOUtils.copy(tarArchiveInput, tarArchiveOutput);
            tarArchiveOutput.closeArchiveEntry();
            entry = tarArchiveInput.getNextTarEntry();
        }

        for (PipelineConfigurationJson pipelineConfiguration : pipelineConfigurationList) {
            addArchiveEntry(tarArchiveOutput,
                pipelineConfiguration,
                pipelineConfiguration.getPipelineId(),
                PIPELINE_JSON_FILE
            );
            addArchiveEntry(tarArchiveOutput,
                pipelineConfiguration.getInfo(),
                pipelineConfiguration.getPipelineId(),
                PIPELINE_INFO_FILE
            );
        }

        tarArchiveOutput.finish();
    }
}
Example #27
Source File: TarGzStripper.java From reproducible-build-maven-plugin with Apache License 2.0
@Override
protected TarArchiveOutputStream createOutputStream(File out) throws FileNotFoundException, IOException
{
    final TarArchiveOutputStream stream = new TarArchiveOutputStream(
            new GzipCompressorOutputStream(new FileOutputStream(out)));
    stream.setLongFileMode(TarArchiveOutputStream.LONGFILE_POSIX);
    return stream;
}
Example #28
Source File: JsonSerializationProcessor.java From Wikidata-Toolkit-Examples with Apache License 2.0
/**
 * Constructor. Initializes various helper objects we use for the JSON
 * serialization, and opens the file that we want to write to.
 *
 * @throws IOException if there is a problem opening the output file
 */
public JsonSerializationProcessor() throws IOException {
    // The filter is used to copy selected parts of the data. We use this
    // to remove some parts from the documents we serialize.
    DocumentDataFilter filter = new DocumentDataFilter();

    // Only copy English labels, descriptions, and aliases:
    filter.setLanguageFilter(Collections.singleton("en"));

    // Only copy statements of some properties:
    Set<PropertyIdValue> propertyFilter = new HashSet<>();
    propertyFilter.add(Datamodel.makeWikidataPropertyIdValue("P18")); // image
    propertyFilter.add(Datamodel.makeWikidataPropertyIdValue("P106")); // occupation
    propertyFilter.add(Datamodel.makeWikidataPropertyIdValue("P569")); // birthdate
    filter.setPropertyFilter(propertyFilter);

    // Do not copy any sitelinks:
    filter.setSiteLinkFilter(Collections.emptySet());

    // Pass the configured filter; a fresh DocumentDataFilter here would
    // discard the language, property, and sitelink settings made above.
    this.datamodelFilter = new DatamodelFilter(new DataObjectFactoryImpl(), filter);

    // The (compressed) file we write to.
    OutputStream outputStream = new GzipCompressorOutputStream(
            new BufferedOutputStream(
                    ExampleHelpers
                            .openExampleFileOuputStream(OUTPUT_FILE_NAME)));
    this.jsonSerializer = new JsonSerializer(outputStream);

    this.jsonSerializer.open();
}
Example #29
Source File: RdfSerializationExample.java From Wikidata-Toolkit-Examples with Apache License 2.0
public static void main(String[] args) throws IOException {

    // Define where log messages go
    ExampleHelpers.configureLogging();

    // Print information about this program
    printDocumentation();

    // Initialize sites; only needed to link to Wikipedia pages in RDF
    DumpProcessingController dumpProcessingController = new DumpProcessingController(
            "wikidatawiki");
    dumpProcessingController.setOfflineMode(ExampleHelpers.OFFLINE_MODE);
    Sites sites = dumpProcessingController.getSitesInformation();

    // Prepare a compressed output stream to write the data to
    // (admittedly, this is slightly over-optimized for an example)
    OutputStream bufferedFileOutputStream = new BufferedOutputStream(
            ExampleHelpers
                    .openExampleFileOuputStream("wikidata-simple-statements.nt.gz"),
            1024 * 1024 * 5);
    GzipParameters gzipParameters = new GzipParameters();
    gzipParameters.setCompressionLevel(7);
    OutputStream compressorOutputStream = new GzipCompressorOutputStream(
            bufferedFileOutputStream, gzipParameters);
    OutputStream exportOutputStream = asynchronousOutputStream(compressorOutputStream);

    // Create a serializer processor
    RdfSerializer serializer = new RdfSerializer(RDFFormat.NTRIPLES,
            exportOutputStream, sites,
            PropertyRegister.getWikidataPropertyRegister());
    // Serialize simple statements (and nothing else) for all items
    serializer.setTasks(RdfSerializer.TASK_ITEMS
            | RdfSerializer.TASK_SIMPLE_STATEMENTS);

    // Run serialization
    serializer.open();
    ExampleHelpers.processEntitiesFromWikidataDump(serializer);
    serializer.close();
}
Example #30
Source File: GzipCborFileWriter.java From ache with Apache License 2.0
private void createNewGzipFileStream(File archive) throws IOException {
    fileOutput = new FileOutputStream(archive);
    bufOutput = new BufferedOutputStream(fileOutput);
    gzipOutput = new GzipCompressorOutputStream(bufOutput);
    tarOutput = new TarArchiveOutputStream(gzipOutput);
    tarOutput.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
}