it.unimi.dsi.io.InputBitStream Java Examples

The following examples show how to use it.unimi.dsi.io.InputBitStream.
Example #1
Source File: AbstractFixedByteArrayBuffer.java    From database with GNU General Public License v2.0
public InputBitStream getInputBitStream() {

//        /*
//         * We have to double-wrap the buffer to ensure that it reads from just
//         * the slice since InputBitStream does not have a constructor which
//         * accepts a slice of the form (byte[], off, len). [It would be nice if
//         * InputBitStream handled the slice natively since it should be faster per
//         * its own javadoc.]
//         * 
//         * Note: The reflection test semantics are not quite what I would want.
//         * If you specify [false] then the code does not even test for the
//         * RepositionableStream interface. Ideally, it would always do that but
//         * skip the reflection on the getChannel() method when it was false.
//         */
//        return new InputBitStream(getDataInput(), 0/* unbuffered */, true/* reflectionTest */);
        
        /*
         * This directly wraps the slice.  This is much faster.
         */
        return new InputBitStream(array(), off, len);

    }
 
Example #2
Source File: CodecTestCase.java    From database with GNU General Public License v2.0
protected void checkPrefixCodec( PrefixCodec codec, Random r ) throws IOException {
	int[] symbol = new int[ 100 ];
	BooleanArrayList bits = new BooleanArrayList();
	for( int i = 0; i < symbol.length; i++ ) symbol[ i ] = r.nextInt( codec.size() ); 
	for( int i = 0; i < symbol.length; i++ ) {
		BitVector word = codec.codeWords()[ symbol[ i ] ];
		for( int j = 0; j < word.size(); j++ ) bits.add( word.get( j ) );
	}

	BooleanIterator booleanIterator = bits.iterator();
	Decoder decoder = codec.decoder();
	for( int i = 0; i < symbol.length; i++ ) {
		assertEquals( decoder.decode( booleanIterator ), symbol[ i ] );
	}
	
	FastByteArrayOutputStream fbaos = new FastByteArrayOutputStream();
	OutputBitStream obs = new OutputBitStream( fbaos, 0 );
	obs.write( bits.iterator() );
	obs.flush();
	InputBitStream ibs = new InputBitStream( fbaos.array );
	
	for( int i = 0; i < symbol.length; i++ ) {
		assertEquals( decoder.decode( ibs ), symbol[ i ] );
	}
}
 
Example #3
Source File: SemiExternalGammaList.java    From database with GNU General Public License v2.0
/** Creates a new semi-external list.
 * 
 * @param longs a bit stream containing &gamma;-encoded longs.
 * @param step the step used to build random-access entry points, or -1 to get {@link #DEFAULT_STEP}.
 * @param numLongs the overall number of offsets (i.e., the number of terms).
 */

public SemiExternalGammaList( final InputBitStream longs, final int step, final int numLongs ) throws IOException {
	this.step = step == -1 ? DEFAULT_STEP : step;
	int slots = ( numLongs + this.step - 1 ) / this.step;
	this.position = new long[ slots ];
	this.numLongs = numLongs;
	this.ibs = longs;
	ibs.position( 0 );
	ibs.readBits( 0 );
	final int lastSlot = position.length - 1;
	for ( int i = 0; i <= lastSlot; i++ ) {
		position[ i ] = ibs.readBits();
		if ( i != lastSlot ) ibs.skipGammas( this.step );
	}
}
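The constructor above records a bit offset every `step` entries so that a lookup only has to skip at most `step - 1` codes. A minimal sketch of that idea in plain Java follows, without dsiutils. The class `GammaEntryPoints` and its methods are hypothetical, and the γ convention used here (a value x ≥ 0 is stored as the code of x + 1: a run of zeros, a one, then the low bits) is an assumption that should be checked against the library documentation.

```java
/** Hypothetical sketch, not the dsiutils API: gamma codes with sampled entry points. */
final class GammaEntryPoints {
    private final byte[] data;
    private long pos; // index of the next bit, counted from the MSB of data[0]

    GammaEntryPoints(byte[] data) { this.data = data; }

    private int readBit() {
        int bit = (data[(int) (pos >>> 3)] >>> (7 - (int) (pos & 7))) & 1;
        pos++;
        return bit;
    }

    /** Assumed convention: x >= 0 is stored as the gamma code of x + 1. */
    private long readGamma() {
        int zeros = 0;
        while (readBit() == 0) zeros++;          // unary part: length of the binary tail
        long low = 0;
        for (int i = 0; i < zeros; i++) low = (low << 1) | readBit();
        return ((1L << zeros) | low) - 1;
    }

    /** Records the bit offset of entries 0, step, 2*step, ... */
    static long[] entryPoints(byte[] data, int step, int numLongs) {
        int slots = (numLongs + step - 1) / step;
        long[] position = new long[slots];
        GammaEntryPoints r = new GammaEntryPoints(data);
        for (int i = 0; i < slots; i++) {
            position[i] = r.pos;
            if (i != slots - 1) for (int k = 0; k < step; k++) r.readGamma();
        }
        return position;
    }

    /** Jumps to the nearest entry point at or before index, then skips forward. */
    static long get(byte[] data, long[] position, int step, int index) {
        GammaEntryPoints r = new GammaEntryPoints(data);
        r.pos = position[index / step];
        for (int k = index % step; k > 0; k--) r.readGamma();
        return r.readGamma();
    }
}
```

Under the assumed convention, the bytes `{0xA6, 0x42, 0x80}` hold the values 0, 1, 2, 3, 4; with `step = 2` the entry points fall at bit offsets 0, 4 and 12.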
 
Example #4
Source File: RocksDao.java    From fasten with Apache License 2.0
private void initKryo() {
    kryo = new Kryo();
    kryo.register(BVGraph.class, new BVGraphSerializer(kryo));
    kryo.register(Boolean.class);
    kryo.register(byte[].class);
    kryo.register(InputBitStream.class);
    kryo.register(NullInputStream.class);
    kryo.register(EliasFanoMonotoneLongBigList.class, new JavaSerializer());
    kryo.register(MutableString.class, new FieldSerializer<>(kryo, MutableString.class));
    kryo.register(Properties.class);
    kryo.register(long[].class);
    kryo.register(Long2IntOpenHashMap.class);
    kryo.register(GOV3LongFunction.class, new JavaSerializer());
}
 
Example #5
Source File: CanonicalHuffmanRabaCoder.java    From database with GNU General Public License v2.0
/**
         * This is an efficient binary search performed without materializing the
         * coded byte[][].
         */
        @Override
        public int search(final byte[] probe) {

            if (probe == null)
                throw new IllegalArgumentException();
            
            if(!isKeys())
                throw new UnsupportedOperationException();
            
            final InputBitStream ibs = data.getInputBitStream();

            try {

                return binarySearch(ibs, probe);
                
            } catch (IOException ex) {

                throw new RuntimeException(ex);

// close not required for IBS backed by byte[] and has high overhead.
//            } finally {
//
//                try {
//                    ibs.close();
//                } catch (IOException ex) {
//                    log.error(ex);
//                }

            }

        }
 
Example #6
Source File: AbstractKeyArrayIndexProcedure.java    From database with GNU General Public License v2.0
@Override
        public void readExternal(final ObjectInput in) throws IOException,
                ClassNotFoundException {

            final byte version = in.readByte();

            switch (version) {
            case VERSION0:
                break;
            default:
                throw new UnsupportedOperationException("Unknown version: "
                        + version);
            }

            @SuppressWarnings("resource")
            final InputBitStream ibs = new InputBitStream((InputStream) in,
                    0/* unbuffered */, false/* reflectionTest */);

            n = ibs.readNibble();

//            a = LongArrayBitVector.getInstance(n);
            a = new boolean[n];

            for (int i = 0; i < n; i++) {

                final boolean bit = ibs.readBit() == 1;
//                a.set(i, bit);

                if (a[i] = bit)
                    onCount++;
                
            }
            
        }
 
Example #7
Source File: TestCanonicalHuffmanRabaCoder.java    From database with GNU General Public License v2.0
public void test_confirm_InputBitStream_compatible() throws IOException {
	
	final byte[] tbuf = new byte[] {
			(byte) 0xAA, 
			(byte) 0xAA, 
			(byte) 0xAA, 
			(byte) 0xAA, 
			(byte) 0xAA, 
			(byte) 0xAA, 
			(byte) 0xAA, 
			(byte) 0xAA
	};
	
    // wrap with InputBitStream.
    final InputBitStream ibs = new InputBitStream(tbuf);
    
    // 1010
    assertTrue(compare(ibs, tbuf, 0, 4) == 0xA);
    // 1010 1010
    assertTrue(compare(ibs, tbuf, 0, 8) == 0xAA);
    // 0101
    assertTrue(compare(ibs, tbuf, 1, 4) == 0x5);
    // 01 0101
    assertTrue(compare(ibs, tbuf, 1, 6) == 0x15);
    // 32-bit reads at bit offsets 0 and 1
    assertTrue(compare(ibs, tbuf, 0, 32) == 0xAAAAAAAA);
    assertTrue(compare(ibs, tbuf, 1, 32) == 0x55555555);
    
    // Now try some 64bit comparisons
    assertTrue(compare64(ibs, tbuf, 0, 48) == 0xAAAAAAAAAAAAL);
    assertTrue(compare64(ibs, tbuf, 1, 48) == 0x555555555555L);

}
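The expected values in this test follow from reading bits most-significant-bit first: the byte 0xAA is 10101010, so four bits from offset 0 give 1010 and four bits from offset 1 give 0101. A minimal sketch of that extraction in plain Java is below. `BitExtract` is a hypothetical helper written for illustration; it is not `BytesUtil` or any dsiutils class.

```java
/** Hypothetical helper: random-access bit extraction, MSB first. */
final class BitExtract {
    private BitExtract() {}

    /** Returns nbits (0..64) starting at bit index bitOffset, counted from the MSB of buf[0]. */
    static long getBits64(byte[] buf, long bitOffset, int nbits) {
        if (nbits < 0 || nbits > 64) throw new IllegalArgumentException("nbits=" + nbits);
        long v = 0;
        for (long p = bitOffset; p < bitOffset + nbits; p++) {
            // Byte index is p / 8; within the byte, bit 0 is the most significant one.
            int bit = (buf[(int) (p >>> 3)] >>> (7 - (int) (p & 7))) & 1;
            v = (v << 1) | bit;
        }
        return v;
    }

    /** Same as getBits64, restricted to at most 32 bits. */
    static int getBits(byte[] buf, long bitOffset, int nbits) {
        if (nbits > 32) throw new IllegalArgumentException("nbits=" + nbits);
        return (int) getBits64(buf, bitOffset, nbits);
    }
}
```

With a buffer filled with 0xAA, `getBits(buf, 1, 6)` yields 0x15 and `getBits64(buf, 1, 48)` yields 0x555555555555L, matching the assertions in the test.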
 
Example #8
Source File: CanonicalFast64CodeWordDecoder.java    From database with GNU General Public License v2.0
public int decode( final InputBitStream ibs ) throws IOException {
    final int[] lengthIncrement = this.lengthIncrement;
    final long[] lastCodeWordPlusOne = this.lastCodeWordPlusOne;
    int curr = 0, l; 
    long x;

    x = ibs.readLong( lengthIncrement[ curr ] );
    
    for(;;) {
        if ( x < lastCodeWordPlusOne[ curr ] ) return symbol[ (int)( howManyUpToBlock[ curr ] - lastCodeWordPlusOne[ curr ] + x ) ];
        l = lengthIncrement[ ++curr ];
        if ( l == 1 ) x = x << 1 | ibs.readBit();
        else x = x << l | ibs.readLong( l );
    }
}
 
Example #9
Source File: ImmutableExternalPrefixMap.java    From database with GNU General Public License v2.0
private void readObject( final ObjectInputStream s ) throws IOException, ClassNotFoundException {
	s.defaultReadObject();
	if ( selfContained ) {
		final File temp = File.createTempFile( this.getClass().getName(), ".dump" );
		temp.deleteOnExit();
		tempDumpStreamFilename = temp.toString();
		// TODO: propose Jakarta CopyUtils extension with length control and refactor.
		FileOutputStream fos = new FileOutputStream( temp );
		final byte[] b = new byte[ 64 * 1024 ];
		int len;
		while( ( len = s.read( b ) ) >= 0 ) fos.write( b, 0, len );
		fos.close();
		dumpStream = new InputBitStream( temp, (int)( blockSize / 8 ) );
	}
}
 
Example #10
Source File: KnowledgeBase.java    From fasten with Apache License 2.0
/** Initializes the kryo instance used for serialization. */
private void initKryo() {
	kryo = new Kryo();
	kryo.register(BVGraph.class, new BVGraphSerializer(kryo));
	kryo.register(byte[].class);
	kryo.register(InputBitStream.class);
	kryo.register(NullInputStream.class);
	kryo.register(EliasFanoMonotoneLongBigList.class, new JavaSerializer());
	kryo.register(MutableString.class, new FieldSerializer<>(kryo, MutableString.class));
	kryo.register(Properties.class);
	kryo.register(long[].class);
	kryo.register(Long2IntOpenHashMap.class);
}
 
Example #11
Source File: BitStreamCODEC.java    From RankSys with Mozilla Public License 2.0
@Override
public int dec(final byte[] in, final int[] out, final int outOffset, final int len) {
    try (InputBitStream ibs = new InputBitStream(in)) {
        read(ibs, out, outOffset, len);
    } catch (IOException ex) {
        LOG.log(Level.SEVERE, null, ex);
    }

    return 0;
}
 
Example #12
Source File: EliasFanoBitStreamCODEC.java    From RankSys with Mozilla Public License 2.0
@Override
protected void read(InputBitStream ibs, int[] out, int offset, int len) throws IOException {
    int d = 0;
    final int l = ibs.readInt(32);
    for (int i = offset; i < offset + len; i++) {
        final int hx = (d += ibs.readUnary());
        out[i] = hx << l | ibs.readInt(l);
    }
}
 
Example #13
Source File: RiceBitStreamCODEC.java    From RankSys with Mozilla Public License 2.0
@Override
protected void read(InputBitStream ibs, int[] out, int offset, int len) throws IOException {
    final int log2b = ibs.readInt(32);
    for (int i = offset; i < offset + len; i++) {
        final int q = ibs.readUnary();
        out[i] = log2b == 0 ? q : (q << log2b) | ibs.readInt(log2b);
    }
}
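The Rice reader above combines a unary quotient with a fixed-width remainder of `log2b` bits. To make the bit layout concrete, here is a small stand-alone sketch in plain Java. `RiceSketch` is hypothetical, it takes `log2b` as a parameter instead of reading it from the stream, and it assumes the unary code of q is q zeros followed by a one (believed to match `readUnary()`, but worth verifying against the dsiutils documentation).

```java
/** Hypothetical sketch of Rice decoding over a byte[], MSB-first bit order. */
final class RiceSketch {
    private final byte[] data;
    private long pos; // index of the next bit to read

    private RiceSketch(byte[] data) { this.data = data; }

    private int readBit() {
        int bit = (data[(int) (pos >>> 3)] >>> (7 - (int) (pos & 7))) & 1;
        pos++;
        return bit;
    }

    /** Assumed unary convention: q zeros terminated by a single one. */
    private int readUnary() {
        int q = 0;
        while (readBit() == 0) q++;
        return q;
    }

    /** Reads n bits as an unsigned value; n == 0 yields 0 without consuming input. */
    private int readInt(int n) {
        int v = 0;
        for (int i = 0; i < n; i++) v = (v << 1) | readBit();
        return v;
    }

    /** Decodes count values, each stored as unary(q) followed by log2b remainder bits. */
    static int[] decode(byte[] data, int log2b, int count) {
        RiceSketch r = new RiceSketch(data);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            int q = r.readUnary();
            out[i] = (q << log2b) | r.readInt(log2b);
        }
        return out;
    }
}
```

For example, with `log2b = 2` the bit string 0101 110 00111 (bytes `{0x5C, 0x70}` after zero padding) decodes to 5, 2, 11 under these assumptions.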
 
Example #14
Source File: TestCanonicalHuffmanRabaCoder.java    From database with GNU General Public License v2.0
/**
 * Verify we can regenerate the {@link Fast64CodeWordCoder} from the code
 * word[]. This is tested by coding and decoding random symbol sequences.
 * For this test we need to reconstruct the {@link Fast64CodeWordCoder}. To
 * do that, we need to use the codeWord[] and create a long[] having the
 * same values as the codeWords, but expressed as 64-bit integers.
 * 
 * @param frequency
 *            The frequency[] should include a reasonable proportion of
 *            symbols with a zero frequency in order to replicate the
 *            expected conditions when coding non-random data such as are
 *            found in the keys of a B+Tree.
 * 
 * @throws IOException
 */
public void doRecoderRoundTripTest(final int frequency[]) throws IOException {
    
    final DecoderInputs decoderInputs = new DecoderInputs();
    
    final HuffmanCodec codec = new HuffmanCodec(frequency, decoderInputs);

    final PrefixCoder expected = codec.coder();
    
    final PrefixCoder actual = new Fast64CodeWordCoder(codec.codeWords());
    
    if (log.isDebugEnabled())
        log.debug(printCodeBook(codec.codeWords()));

    /*
     * First verify that both coders produce the same coded values for a
     * symbol sequence of random length drawn from the full set of symbols
     * of random length [1:nsymbols].
     */
    final int[] value = new int[r.nextInt(frequency.length) + 1];
    for(int i=0; i<value.length; i++) {
        // any of the symbols in [0:nsymbols-1].
        value[i] = r.nextInt(frequency.length);
    }

    /*
     * Now code the symbol sequence using both coders and then compare the
     * coded values. They should be the same.
     */
    final byte[] codedValue;
    {
        final FastByteArrayOutputStream ebaos = new FastByteArrayOutputStream();
        final FastByteArrayOutputStream abaos = new FastByteArrayOutputStream();
        final OutputBitStream eobs = new OutputBitStream(ebaos);
        final OutputBitStream aobs = new OutputBitStream(abaos);
        for (int i = 0; i < value.length; i++) {
            final int symbol = value[i];
            expected.encode(symbol, eobs);
            actual.encode(symbol, aobs);
        }
        eobs.flush();
        aobs.flush();
        assertEquals(0, BytesUtil.compareBytesWithLenAndOffset(0/* aoff */,
                ebaos.length, ebaos.array, 0/* boff */, abaos.length,
                abaos.array));
        codedValue = new byte[abaos.length];
        System.arraycopy(abaos.array/*src*/, 0/*srcPos*/, codedValue/*dest*/, 0/*destPos*/, abaos.length/*len*/);
    }

    /*
     * Now verify that the coded sequence decodes to the original symbol
     * sequence using a Decoder which is reconstructed from the bit length
     * and symbol arrays of the codec.
     */
    final CanonicalFast64CodeWordDecoder actualDecoder = new CanonicalFast64CodeWordDecoder(
            decoderInputs.getLengths(), decoderInputs.getSymbols());

    {

        final InputBitStream ibs = new InputBitStream(codedValue);
        
        for (int i = 0; i < value.length; i++) {

            assertEquals(value[i]/* symbol */, actualDecoder.decode(ibs));

        }
        
    }

}
 
Example #15
Source File: ZetaBitStreamCODEC.java    From RankSys with Mozilla Public License 2.0
@Override
protected void read(InputBitStream ibs, int[] out, int offset, int len) throws IOException {
    ibs.readZetas(k, out, len);
}
 
Example #16
Source File: CanonicalHuffmanRabaCoder.java    From database with GNU General Public License v2.0
/**
 * Reconstruct the {@link DecoderInputs} from the data written by
 * {@link #writeDecoderInputs(BitVector[], OutputBitStream)}.
 * 
 * @param nsymbols
 *            The #of symbols.
 * @param ibs
 *            The input bit stream.
 * @param sb
 *            Debugging information is added to this buffer (optional).
 * 
 * @return The decoded bit lengths and the corresponding symbol indices for
 *         the canonical huffman code.
 * 
 * @throws IOException
 */
static protected DecoderInputs readDecoderInputs(final int nsymbols,
        final InputBitStream ibs, final StringBuilder sb)
        throws IOException {

    final int min = ibs.readNibble();

    final int max = ibs.readNibble();

    if (sb != null)
        sb.append("min=" + min + ", max=" + max+"\n");

    final int[] length = new int[nsymbols];
    final int[] symbol = new int[nsymbols];

    // the current code length
    int codeSize = min;
    int lastSymbol = 0;
    while (codeSize <= max) {
        final int sizeCount = ibs.readNibble();
        if (sb != null)
            sb.append("codeSize="+codeSize+", sizeCount="+sizeCount+", symbols=[");
        for (int i = 0; i < sizeCount; i++, lastSymbol++) {
            final int tmp = ibs.readNibble();
            if (sb != null)
                sb.append(" "+tmp);
            length[lastSymbol] = codeSize;
            symbol[lastSymbol] = tmp;
        }
        if (sb != null)
            sb.append(" ]\n");
        codeSize++;
    }

    final int shortestCodeWordLength = length[0];

    final BitVector shortestCodeWord = LongArrayBitVector.getInstance()
            .length(shortestCodeWordLength);

    for (int i = shortestCodeWordLength-1; i >= 0; i--) {

        shortestCodeWord.set(i, ibs.readBit());
        
    }

    if (sb != null) {
        sb.append("shortestCodeWord=" + shortestCodeWord+"\n");
    }

    return new DecoderInputs(shortestCodeWord, length, symbol);

}
 
Example #17
Source File: GammaBitStreamCODEC.java    From RankSys with Mozilla Public License 2.0
@Override
protected void read(InputBitStream ibs, int[] out, int offset, int len) throws IOException {
    ibs.readGammas(out, len);
}
 
Example #18
Source File: FixedLengthBitStreamCODEC.java    From RankSys with Mozilla Public License 2.0
@Override
protected void read(InputBitStream ibs, int[] out, int offset, int len) throws IOException {
    for (int i = offset; i < offset + len; i++) {
        out[i] = ibs.readInt(b);
    }
}
 
Example #19
Source File: TreeDecoder.java    From database with GNU General Public License v2.0
public int decode( final InputBitStream ibs ) throws IOException {
	Node n = root;
	while( ! ( n instanceof LeafNode ) ) 
		n = ibs.readBit() == 0 ? n.left : n.right;
	return ((LeafNode)n).symbol;
}
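The decoder above walks a binary tree one bit at a time, going left on 0 and right on 1 until it reaches a leaf. A stand-alone sketch of the same walk in plain Java is shown here. The `TreeDecodeSketch`, `Node` and `Leaf` types are hypothetical stand-ins for illustration, and the bits are taken MSB first from a byte array rather than from an `InputBitStream`.

```java
/** Hypothetical sketch of prefix-code decoding by walking a binary tree. */
final class TreeDecodeSketch {
    private TreeDecodeSketch() {}

    static class Node {
        Node left, right;
    }

    static final class Leaf extends Node {
        final int symbol;
        Leaf(int symbol) { this.symbol = symbol; }
    }

    /** Builds an internal node with the given children (0 goes left, 1 goes right). */
    static Node internal(Node left, Node right) {
        Node n = new Node();
        n.left = left;
        n.right = right;
        return n;
    }

    /** Decodes count symbols from data, reading bits MSB first. */
    static int[] decode(Node root, byte[] data, int count) {
        int[] out = new int[count];
        long pos = 0; // index of the next bit to read
        for (int i = 0; i < count; i++) {
            Node n = root;
            while (!(n instanceof Leaf)) {
                int bit = (data[(int) (pos >>> 3)] >>> (7 - (int) (pos & 7))) & 1;
                pos++;
                n = bit == 0 ? n.left : n.right;
            }
            out[i] = ((Leaf) n).symbol;
        }
        return out;
    }
}
```

With the code 0 → symbol 0, 10 → symbol 1, 11 → symbol 2, the byte 0x58 (bits 0 10 11 0, then padding) decodes to the sequence 0, 1, 2, 0.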
 
Example #20
Source File: ImmutableExternalPrefixMap.java    From database with GNU General Public License v2.0
/** Sets the dump stream of this external prefix map to a given filename.
 *
 * <P>This method sets the dump file used by this map, and should be only
 * called after deserialisation, providing exactly the file generated at
 * creation time. Essentially anything can happen if you do not follow the rules.
 *
 * <P>Note that this method will attempt to close the old stream, if present.
 *   
 * @param dumpStreamFilename the name of the dump file.
 * @see #setDumpStream(InputBitStream)
 */

public void setDumpStream( final CharSequence dumpStreamFilename ) throws FileNotFoundException{
	ensureNotSelfContained();
	safelyCloseDumpStream();
	iteratorIsUsable = false;
	final long newLength = new File( dumpStreamFilename.toString() ).length();
	if ( newLength != dumpStreamLength )
		throw new IllegalArgumentException( "The size of the new dump file (" + newLength + ") does not match the original length (" + dumpStreamLength + ")" );
	dumpStream = new InputBitStream( dumpStreamFilename.toString(), (int)( blockSize / 8 ) );
}
 
Example #21
Source File: TestCanonicalHuffmanRabaCoder.java    From database with GNU General Public License v2.0
int compare(InputBitStream ibs, byte[] buf, int offset, int bits) throws IOException {
	ibs.position(offset);
	int v1 = ibs.readInt(bits);
	int v2 = BytesUtil.getBits(buf, offset, bits);
	
	assertTrue(v1 == v2);
	
	return v1;

}
 
Example #22
Source File: TestCanonicalHuffmanRabaCoder.java    From database with GNU General Public License v2.0
long compare64(InputBitStream ibs, byte[] buf, int offset, int bits) throws IOException {
	ibs.position(offset);
	long v1 = ibs.readLong(bits);
	long v2 = BytesUtil.getBits64(buf, offset, bits);
	
	assertTrue(v1 == v2);
	
	return v1;

}
 
Example #23
Source File: ImmutableExternalPrefixMap.java    From database with GNU General Public License v2.0
/** Sets the dump stream of this external prefix map to a given input bit stream.
 *
 * <P>This method sets the dump file used by this map, and should be only
 * called after deserialisation, providing a repositionable stream containing
 * exactly the file generated at
 * creation time. Essentially anything can happen if you do not follow the rules.
 *  
 * <P>Using this method you can load an external prefix map in core memory, enjoying
 * the compactness of the data structure, but getting much more speed. 
 * 
 * <P>Note that this method will attempt to close the old stream, if present.
 *   
 * @param dumpStream a repositionable input bit stream containing exactly the dump stream generated
 * at creation time.
 * @see #setDumpStream(CharSequence)
 */
public void setDumpStream( final InputBitStream dumpStream ) {
	ensureNotSelfContained();
	safelyCloseDumpStream();
	iteratorIsUsable = false;
	this.dumpStream = dumpStream;
}
 
Example #24
Source File: Decoder.java    From database with GNU General Public License v2.0
/** Decodes the next symbol from the given input bit stream.
 * 
 * <P>Note that {@link InputBitStream} implements {@link BooleanIterator}.
 * 
 * @param ibs an input bit stream.
 * @return the next symbol decoded from <code>ibs</code>.
 */
int decode( InputBitStream ibs ) throws IOException;
 
Example #25
Source File: SemiExternalGammaList.java    From database with GNU General Public License v2.0
/** Creates a new semi-external list.
 * 
 * <p>This quick-and-dirty constructor estimates the number of longs by checking
 * for an {@link EOFException}.
 * 
 * @param longs a bit stream containing &gamma;-encoded longs.
 */

public SemiExternalGammaList( final InputBitStream longs ) throws IOException {
	this( longs, DEFAULT_STEP, estimateNumberOfLongs( longs ) );
}
 
Example #26
Source File: BitStreamCODEC.java    From RankSys with Mozilla Public License 2.0
/**
 * Reads a BitStream to an array.
 *
 * @param ibs input bit stream
 * @param out output array
 * @param offset array offset
 * @param len number of integers to read
 * @throws IOException when IO error
 */
protected abstract void read(final InputBitStream ibs, final int[] out, final int offset, final int len) throws IOException;
 
Example #27
Source File: IFixedDataRecord.java    From database with GNU General Public License v2.0
/**
 * Return a bit stream that will read from the slice.
 * <p>
 * Note: You DO NOT need to close this stream since it is backed by a
 * byte[]. In fact, {@link InputBitStream#close()} when backed by a byte[]
 * appears to have relatively high overhead, which is weird.
 */
public InputBitStream getInputBitStream();