Java Code Examples for htsjdk.samtools.util.StringUtil#bytesToString()

The following examples show how to use htsjdk.samtools.util.StringUtil#bytesToString().
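For orientation, here is a minimal stand-in for the two call shapes that appear in the examples: a whole-array conversion and an (offset, length) conversion. It assumes each byte maps to exactly one character, which is adequate for ASCII base and quality strings. The class name BytesToStringSketch is hypothetical, the null handling is an assumption of this sketch, and none of this is the htsjdk source.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class BytesToStringSketch {

    // Converts a whole byte array. Returning null for null input is an assumption of this sketch.
    static String bytesToString(final byte[] data) {
        return data == null ? null : bytesToString(data, 0, data.length);
    }

    // Converts `length` bytes starting at `offset`, one byte per character.
    static String bytesToString(final byte[] buffer, final int offset, final int length) {
        return new String(buffer, offset, length, StandardCharsets.ISO_8859_1);
    }

    public static void main(final String[] args) {
        final byte[] bases = "ACGTN".getBytes(StandardCharsets.US_ASCII);
        System.out.println(bytesToString(bases));       // ACGTN
        System.out.println(bytesToString(bases, 1, 3)); // CGT
    }
}
```

An explicit single-byte charset is used so that the result does not depend on the platform default encoding.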
Example 1
Source File: CollapseTagWithContextTest.java    From Drop-seq with MIT License
private final String alterBaseString(final String baseString, final int numChanges) {
    final byte[] bases = StringUtil.stringToBytes(baseString);
    if (numChanges > baseString.length())
        throw new IllegalArgumentException("Too many changes requested");
    final Set<Integer> mutatedPositions = new HashSet<>();
    int changesSoFar = 0;
    while (changesSoFar < numChanges) {
        int positionToChange = random.nextInt(bases.length);
        while (mutatedPositions.contains(positionToChange))
            positionToChange = random.nextInt(bases.length);
        mutatedPositions.add(positionToChange);
        bases[positionToChange] = alterBase(bases[positionToChange]);
        ++changesSoFar;
    }
    return StringUtil.bytesToString(bases);
}
 
Example 2
Source File: ClusterIntensityFileReader.java    From picard with MIT License
public ClusterIntensityFileHeader(final byte[] headerBytes, final File file) {
    if (headerBytes.length < HEADER_SIZE) {
        throw new PicardException("Bytes passed to header constructor are too short: expected (" + HEADER_SIZE + ") received (" + headerBytes.length + ")");
    }

    ByteBuffer buf = ByteBuffer.allocate(headerBytes.length); //for doing some byte conversions
    buf.order(ByteOrder.LITTLE_ENDIAN);
    buf.put(headerBytes);
    buf.position(0);

    final byte[] identifierBuf = new byte[IDENTIFIER.length];
    buf.get(identifierBuf);
    if (!Arrays.equals(identifierBuf, IDENTIFIER)) {
        throw new PicardException("Cluster intensity file " + file + " contains unexpected header: " +
                StringUtil.bytesToString(identifierBuf));
    }
    final byte fileVersion = buf.get();
    if (fileVersion != FILE_VERSION) {
        throw new PicardException("Cluster intensity file " + file + " contains unexpected version: " + fileVersion);
    }
    elementSize = buf.get();
    if (elementSize < 1 || elementSize > 2) {
        throw new PicardException("Cluster intensity file " + file + " contains unexpected element size: " + elementSize);
    }
    // convert these to unsigned
    firstCycle = UnsignedTypeUtil.uShortToInt(buf.getShort());
    numCycles = UnsignedTypeUtil.uShortToInt(buf.getShort());
    if (numCycles == 0) {
        throw new PicardException("Cluster intensity file " + file + " has zero cycles.");
    }
    numClusters = buf.getInt();
    if (numClusters < 0) {
        // Zero is allowed because a tile may contain no clusters; only a negative count is an error.
        throw new PicardException("Cluster intensity file " + file + " has negative number of clusters: " + numClusters);
    }
}
 
Example 3
Source File: AbstractAlignmentMerger.java    From picard with MIT License
private static void moveClippedBasesToTag(final SAMRecord rec, final int clipFrom) {
    if (rec.getAttribute(HARD_CLIPPED_BASES_TAG) != null || rec.getAttribute(HARD_CLIPPED_BASE_QUALITIES_TAG) != null) {
        throw new PicardException("Record " + rec.getReadName() + " already contains tags for restoring hard-clipped bases.  This operation will permanently erase information if it proceeds.");
    }

    final byte[] bases = rec.getReadBases();
    final byte[] baseQualities = rec.getBaseQualities();
    final int readLength = rec.getReadLength();

    final int clipPositionFrom, clipPositionTo;
    if (rec.getReadNegativeStrandFlag()) {
        clipPositionFrom = 0;
        clipPositionTo = readLength - clipFrom + 1;
    } else {
        clipPositionFrom = clipFrom - 1;
        clipPositionTo = readLength;
    }

    String basesToKeepInTag = StringUtil.bytesToString(Arrays.copyOfRange(bases, clipPositionFrom, clipPositionTo));
    String qualitiesToKeepInTag = SAMUtils.phredToFastq(Arrays.copyOfRange(baseQualities, clipPositionFrom, clipPositionTo));

    if (rec.getReadNegativeStrandFlag()) {
        // Ensures that the qualities and bases in the tags are stored in their original order, as produced by the sequencer
        basesToKeepInTag = SequenceUtil.reverseComplement(basesToKeepInTag);
        qualitiesToKeepInTag =  new StringBuilder(qualitiesToKeepInTag).reverse().toString();
    }
    rec.setAttribute(HARD_CLIPPED_BASES_TAG, basesToKeepInTag);
    rec.setAttribute(HARD_CLIPPED_BASE_QUALITIES_TAG, qualitiesToKeepInTag);
}
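The index arithmetic in Example 3 can be checked on a toy read. The sketch below mirrors the range selection and the reverse-complement step using plain JDK calls; ClipRangeSketch, clippedBases, and the simple complement table are hypothetical names introduced for illustration and are not part of picard or htsjdk. It assumes clipFrom is a 1-based position counted in sequencing order, as the original method appears to do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of the clip-range arithmetic; not picard code.
final class ClipRangeSketch {

    // Returns the bases that would be moved to the tag, in sequencing order.
    static String clippedBases(final byte[] bases, final int clipFrom, final boolean negativeStrand) {
        final int readLength = bases.length;
        // On the negative strand the stored bases are reversed, so the clipped tail sits at the start.
        final int from = negativeStrand ? 0 : clipFrom - 1;
        final int to = negativeStrand ? readLength - clipFrom + 1 : readLength;
        final String clipped = new String(Arrays.copyOfRange(bases, from, to), StandardCharsets.US_ASCII);
        return negativeStrand ? reverseComplement(clipped) : clipped;
    }

    static String reverseComplement(final String s) {
        final StringBuilder sb = new StringBuilder(s.length());
        for (int i = s.length() - 1; i >= 0; i--) {
            sb.append(complement(s.charAt(i)));
        }
        return sb.toString();
    }

    // Minimal complement table; any other character (for example N) is returned unchanged.
    static char complement(final char c) {
        switch (c) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default:  return c;
        }
    }

    public static void main(final String[] args) {
        final byte[] bases = "AACCGGTTAC".getBytes(StandardCharsets.US_ASCII);
        System.out.println(clippedBases(bases, 8, false)); // TAC
        System.out.println(clippedBases(bases, 8, true));  // GTT
    }
}
```

In both cases three bases are selected (positions 8 to 10 of a 10-base read); only the end of the stored array they come from differs.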
 
Example 4
Source File: LiftoverVcf.java    From picard with MIT License
/**
 * Utility function that attempts to add a variant. Checks that the reference allele still matches the reference (which may have changed).
 *
 * @param vc new {@link VariantContext}
 * @param refSeq {@link ReferenceSequence} of new reference
 * @param source the original {@link VariantContext} to use for putting the original location information into vc
 */
private void tryToAddVariant(final VariantContext vc, final ReferenceSequence refSeq, final VariantContext source) {
    if (!refSeq.getName().equals(vc.getContig())) {
        throw new IllegalStateException("The contig of the VariantContext, " + vc.getContig() + ", doesn't match the ReferenceSequence: " + refSeq.getName());
    }

    // Check that the reference allele still agrees with the reference sequence
    final boolean mismatchesReference;
    final Allele allele = vc.getReference();
    final byte[] ref = refSeq.getBases();
    final String refString = StringUtil.bytesToString(ref, vc.getStart() - 1, vc.getEnd() - vc.getStart() + 1);

    if (!refString.equalsIgnoreCase(allele.getBaseString())) {
        // consider that the ref and the alt may have been swapped in a simple biallelic SNP
        if (vc.isBiallelic() && vc.isSNP() && refString.equalsIgnoreCase(vc.getAlternateAllele(0).getBaseString())) {
            totalTrackedAsSwapRefAlt++;
            if (RECOVER_SWAPPED_REF_ALT) {
                addAndTrack(LiftoverUtils.swapRefAlt(vc, TAGS_TO_REVERSE, TAGS_TO_DROP), source);
                return;
            }
        }
        mismatchesReference = true;
    } else {
        mismatchesReference = false;
    }


    if (mismatchesReference) {
        rejectedRecords.add(new VariantContextBuilder(source)
                .filter(FILTER_MISMATCHING_REF_ALLELE)
                .attribute(ATTEMPTED_LOCUS, String.format("%s:%d-%d", vc.getContig(), vc.getStart(), vc.getEnd()))
                .attribute(ATTEMPTED_ALLELES, vc.getReference().toString() + "->" + String.join(",", vc.getAlternateAlleles().stream().map(Allele::toString).collect(Collectors.toList())))
                .make());
        failedAlleleCheck++;
        trackLiftedVariantContig(rejectsByContig, source.getContig());
    } else {
        addAndTrack(vc, source);
    }
}
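The three-argument call in Example 4 converts a 1-based, inclusive interval into a 0-based offset plus a length. A minimal sketch of that arithmetic, using only the JDK, is shown here; RefWindowSketch and window are hypothetical names, and the String constructor stands in for the htsjdk call.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the offset/length arithmetic; not part of picard or htsjdk.
final class RefWindowSketch {

    // start and end are 1-based and inclusive, as in VCF coordinates.
    static String window(final byte[] refBases, final int start, final int end) {
        final int offset = start - 1;       // 1-based start -> 0-based array index
        final int length = end - start + 1; // size of an inclusive interval
        return new String(refBases, offset, length, StandardCharsets.US_ASCII);
    }

    public static void main(final String[] args) {
        final byte[] ref = "ACGTACGT".getBytes(StandardCharsets.US_ASCII);
        System.out.println(window(ref, 3, 5)); // GTA
    }
}
```

A single-base interval (start equal to end) yields a length of 1, which is why the `+ 1` is needed.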
 
Example 5
Source File: BaitDesigner.java    From picard with MIT License
@Override
public String toString() {
    return "Bait{" +
            "name=" + getName() +
            ", bases=" + StringUtil.bytesToString(bases) +
            '}';
}
 
Example 6
Source File: BaitDesigner.java    From picard with MIT License
/** Gets the bait sequence, with primers, as a String, RC'd as appropriate. */
private String getBaitSequence(final Bait bait, final boolean rc) {
    String sequence = (LEFT_PRIMER == null ? "" : LEFT_PRIMER) +
            StringUtil.bytesToString(bait.getBases()) +
            (RIGHT_PRIMER == null ? "" : RIGHT_PRIMER);

    if (rc) sequence = SequenceUtil.reverseComplement(sequence);
    return sequence;
}
 
Example 7
Source File: IlluminaUtil.java    From picard with MIT License
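/**
 * Concatenates all the barcode sequences, separated by the given delimiter.
 * @param barcodes the barcode sequences, one byte array per barcode
 * @param delim the delimiter to place between barcodes
 * @return A single string representation of all the barcodes
 */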
public static String byteArrayToString(final byte[][] barcodes, String delim) {
    final String[] bcs = new String[barcodes.length];
    for (int i = 0; i < barcodes.length; i++) {
        bcs[i] = StringUtil.bytesToString(barcodes[i]);
    }
    return stringSeqsToString(bcs, delim);
}
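The helper stringSeqsToString is not shown in Example 7. Assuming it simply joins its inputs with the delimiter, the overall effect can be sketched with the JDK alone; JoinBarcodesSketch and join are hypothetical names, and the String constructor stands in for StringUtil.bytesToString.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; assumes the unseen helper is a plain delimiter join.
final class JoinBarcodesSketch {

    static String join(final byte[][] barcodes, final String delim) {
        final String[] parts = new String[barcodes.length];
        for (int i = 0; i < barcodes.length; i++) {
            // One byte per character, as for ASCII barcode sequences.
            parts[i] = new String(barcodes[i], StandardCharsets.US_ASCII);
        }
        return String.join(delim, parts);
    }

    public static void main(final String[] args) {
        final byte[][] barcodes = {
            "ACGT".getBytes(StandardCharsets.US_ASCII),
            "TTAA".getBytes(StandardCharsets.US_ASCII)
        };
        System.out.println(join(barcodes, "-")); // ACGT-TTAA
    }
}
```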
 
Example 8
Source File: AssemblyBasedSVDiscoveryTestDataProviderForBreakEndVariants.java    From gatk with BSD 3-Clause "New" or "Revised" License
private static Tuple2<TestDataBreakEndVariants, TestDataBreakEndVariants> forIntraChromosomeRefOrderSwapWithInsertion() {

        String contigName = "forIntraChromosomeRefOrderSwapWithInsertion";
        byte[] contigSequence = "TGTGGTGTGGGTGTGTGCGTGTGTGTGGACTGTGTGGTGTGGGTGTGTGCGTGTGTGTGGACTGTGTGGTGTGGGTGTGTGCGTGTGTGTGTGACCGTGTGGAGTGTGTCTGTGTGCATGTGTGGGCTGTGTGGTGCGTGTGTGCTTATGTTTGGCGTGTGTGTGTGTGTGTGTGTGGACTGTGTGGTCTGTGTGTGTGTGCGTGTGTGTGGACTCTGTGGTGTGTGCATGTGTGCATGTATGTGCGTGTGTGTGACTGTGTGGTGTGTGTCTGTGTGCACGTGTGGGCTTATATTTGGTGTGTGTGCATGTGTGGACCGTGTGGTGTGTGTGTGTGCGTGTGTGGACTGTAGTGTGTGTGCACGTGTGTGTGTGCTTGTGTGTGGAGTGTGTGTGTGTGGACTGTAGTGTGTGTGCGTGACTGTGGTGTGTGTGCATGACTGTGTGGTGTGTGTGTGCATGTGTGTGGATTGTGTGGTGTGTGTGGACTGTGGGTGTGTGGTGCGTGTGTGTGCTTGTGTGTGGTGTGTGTGCGTGTGTGGGGACTGTGTGGTGCGTGTGTGTGCTTGTGTGTGGACTGTGGATTGTGTGGTGTGTGTGTGCGCACGTGTGTGTGCGTGTCTGTGTGGTGTGTGGACTGTGTGGTGTGTGTGGACTGTGGTGTGTGTGTGCGTGACTGTGTGGTGTGTGTGTGCGCGTGTGTGTGACTGCTTGGTGTGTGTGGACTGTGGTGTGTGTGGTGTGTGTGTGCTTGTGTGTGGTGTGTG".getBytes();
        String insSeq = StringUtil.bytesToString(Arrays.copyOfRange(contigSequence, 71, 617));
        AlignmentInterval firstAlignment = new AlignmentInterval(new SimpleInterval("chr20:44405003-44405064"), 10, 71, TextCigarCodec.decode("9H62M696H"), true, 32, 3, 47, ContigAlignmentsModifier.AlnModType.NONE);
        AlignmentInterval secondAlignment = new AlignmentInterval(new SimpleInterval("chr20:44404789-44404936"), 618, 767, TextCigarCodec.decode("617S66M2I82M"), true, 60, 4, 45, ContigAlignmentsModifier.AlnModType.NONE);
        SimpleChimera simpleChimera = new SimpleChimera(contigName, firstAlignment, secondAlignment, StrandSwitch.NO_SWITCH, true, Collections.emptyList(), NO_GOOD_MAPPING_TO_NON_CANONICAL_CHROMOSOME);
        SimpleInterval expectedLeftBreakpoint = new SimpleInterval("chr20:44404789-44404789");
        SimpleInterval expectedRightBreakpoint = new SimpleInterval("chr20:44405064-44405064");
        final BreakpointComplications expectedBreakpointComplications = new BreakpointComplications.IntraChrRefOrderSwapBreakpointComplications("", insSeq);
        NovelAdjacencyAndAltHaplotype expectedNovelAdjacencyAndAltSeq = new NovelAdjacencyAndAltHaplotype(expectedLeftBreakpoint, expectedRightBreakpoint, StrandSwitch.NO_SWITCH, expectedBreakpointComplications, TypeInferredFromSimpleChimera.INTRA_CHR_REF_ORDER_SWAP, EMPTY_BYTE_ARRAY);
        final List<SvType> expectedSVTypes = Arrays.asList(
                makeBNDType("chr20", 44404789, "BND_chr20_44404789_44405064_1", Allele.create("G", true), Allele.create("]chr20:44405064]"+insSeq+"G"), Collections.emptyMap(), true, BreakEndVariantType.SupportedType.INTRA_CHR_REF_ORDER_SWAP),
                makeBNDType("chr20", 44405064, "BND_chr20_44404789_44405064_2", Allele.create("G", true), Allele.create("G"+insSeq+"[chr20:44404789["), Collections.emptyMap(), false, BreakEndVariantType.SupportedType.INTRA_CHR_REF_ORDER_SWAP)
        );
        final List<VariantContext> expectedVariants = Arrays.asList(
                addStandardAttributes(makeBND(expectedLeftBreakpoint, expectedRightBreakpoint, Allele.create("G", true), insSeq, "", true, false, true), contigName, 32, 62, "", insSeq, "BND_chr20_44404789_44405064_2").make(),
                addStandardAttributes(makeBND(expectedLeftBreakpoint, expectedRightBreakpoint, Allele.create("G", true), insSeq, "", false, true, false), contigName, 32, 62, "", insSeq, "BND_chr20_44404789_44405064_1").make()
        );

        final TestDataBreakEndVariants forIntraChromosomeRefOrderSwapWithInsertion_plus =
                new TestDataBreakEndVariants(firstAlignment, secondAlignment, contigName, contigSequence, true, simpleChimera, expectedNovelAdjacencyAndAltSeq, expectedSVTypes, expectedVariants, BreakpointsInference.IntraChrRefOrderSwapBreakpointsInference.class);


        firstAlignment = new AlignmentInterval(new SimpleInterval("chr20:44404789-44404936"), 1, 150, TextCigarCodec.decode("82M2I66M617S"), false, 60, 4, 45, ContigAlignmentsModifier.AlnModType.NONE);
        secondAlignment = new AlignmentInterval(new SimpleInterval("chr20:44405003-44405064"), 697, 758, TextCigarCodec.decode("696H62M9H"), false, 32, 3, 47, ContigAlignmentsModifier.AlnModType.NONE);
        simpleChimera = new SimpleChimera(contigName, firstAlignment, secondAlignment, StrandSwitch.NO_SWITCH, false, Collections.emptyList(), NO_GOOD_MAPPING_TO_NON_CANONICAL_CHROMOSOME);

        final TestDataBreakEndVariants forIntraChromosomeRefOrderSwapWithInsertion_minus =
                new TestDataBreakEndVariants(firstAlignment, secondAlignment, contigName, getReverseComplimentCopy(contigSequence), false, simpleChimera, expectedNovelAdjacencyAndAltSeq, expectedSVTypes, expectedVariants, BreakpointsInference.IntraChrRefOrderSwapBreakpointsInference.class);

        return new Tuple2<>(forIntraChromosomeRefOrderSwapWithInsertion_plus, forIntraChromosomeRefOrderSwapWithInsertion_minus);
    }
 
Example 9
Source File: TrimSequenceTemplate.java    From Drop-seq with MIT License
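/**
 * If a barcode has ignored bases, then expand those bases to A/C/G/T.
 * Otherwise, return the barcode. This is recursive, so multiple ignored
 * bases will be expanded.
 *
 * @param b the barcode template to expand
 * @param ignoredBases the bases to treat as wildcards
 * @return the expanded barcodes, or a collection containing only b if it has no ignored bases
 */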
public static Collection<TrimSequenceTemplate> expandBarcode(
		final TrimSequenceTemplate b, final byte[] ignoredBases) {
	Collection<TrimSequenceTemplate> result = new ArrayList<>();
	result.add(b);
	byte[] bases = StringUtil.stringToBytes(b.getSequence());
	for (int i = 0; i < bases.length; i++) {
		boolean ignoreBaseFound = baseInBaseList(bases[i], ignoredBases);
		if (ignoreBaseFound) {
			result.remove(b);
			bases[i] = A;
			TrimSequenceTemplate newBC = new TrimSequenceTemplate(
					StringUtil.bytesToString(bases),
					StringUtil.bytesToString(ignoredBases));
			Collection<TrimSequenceTemplate> r = expandBarcode(newBC,
					ignoredBases);
			result.addAll(r);

			bases[i] = C;
			newBC = new TrimSequenceTemplate(
					StringUtil.bytesToString(bases),
					StringUtil.bytesToString(ignoredBases));
			r = expandBarcode(newBC, ignoredBases);
			result.addAll(r);

			bases[i] = G;
			newBC = new TrimSequenceTemplate(
					StringUtil.bytesToString(bases),
					StringUtil.bytesToString(ignoredBases));
			r = expandBarcode(newBC, ignoredBases);
			result.addAll(r);

			bases[i] = T;
			newBC = new TrimSequenceTemplate(
					StringUtil.bytesToString(bases),
					StringUtil.bytesToString(ignoredBases));
			r = expandBarcode(newBC, ignoredBases);
			result.addAll(r);
			break; // stop looping

		}

	}
	return (result);
}
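The recursion in Example 9 replaces the first ignored base with each of A, C, G, and T and then recurses on every result. The same idea can be sketched on plain strings; ExpandBarcodeSketch and expand are hypothetical names, and a single ignored character is used instead of a byte array to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical string-based sketch of the recursive expansion; not Drop-seq code.
final class ExpandBarcodeSketch {

    private static final char[] BASES = {'A', 'C', 'G', 'T'};

    static List<String> expand(final String barcode, final char ignored) {
        // Guard: expanding a real base into itself would recurse forever.
        if ("ACGT".indexOf(ignored) >= 0) {
            throw new IllegalArgumentException("ignored base must not be A, C, G or T");
        }
        final List<String> result = new ArrayList<>();
        final int i = barcode.indexOf(ignored);
        if (i < 0) {
            result.add(barcode); // nothing left to expand
            return result;
        }
        for (final char base : BASES) {
            // Substitute the first ignored position, then expand any remaining ones.
            result.addAll(expand(barcode.substring(0, i) + base + barcode.substring(i + 1), ignored));
        }
        return result;
    }

    public static void main(final String[] args) {
        System.out.println(expand("ANT", 'N'));       // [AAT, ACT, AGT, ATT]
        System.out.println(expand("NN", 'N').size()); // 16
    }
}
```

A barcode with k ignored positions expands to 4^k sequences, so this approach is only practical for a small number of wildcards.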
 
Example 10
Source File: CollapseTagWithContextTest.java    From Drop-seq with MIT License
private final String makeRandomBaseString(final int length) {
    final byte[] bases = new byte[length];
    for (int i = 0; i < length; ++i)
        bases[i] = getRandomBase();
    return StringUtil.bytesToString(bases);
}
 
Example 11
Source File: Snp.java    From picard with MIT License
public String getAlleleString() {
    return StringUtil.bytesToString(new byte[] {allele1, StringUtil.toLowerCase(allele2)});
}
 
Example 12
Source File: LiftoverUtils.java    From picard with MIT License
/**
 * Checks whether the reference allele among the provided alleles differs from the reference sequence
 *
 * @param alleles           list of alleles from which to find the reference allele
 * @param referenceSequence the ref sequence
 * @param start             the start position of the actual indel
 * @param end               the end position of the actual indel
 * @return true if the reference allele differs from the reference sequence, false if they match
 */
private static boolean referenceAlleleDiffersFromReferenceForIndel(final List<Allele> alleles,
                                                                   final ReferenceSequence referenceSequence,
                                                                   final int start,
                                                                   final int end) {
    final String refString = StringUtil.bytesToString(referenceSequence.getBases(), start - 1, end - start + 1);
    final Allele refAllele = alleles.stream().filter(Allele::isReference).findAny().orElseThrow(() -> new IllegalStateException("Error: no reference allele was present"));
    return !refString.equalsIgnoreCase(refAllele.getBaseString());
}
 
Example 13
Source File: SAMPileupFeature.java    From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * Returns pile of observed bases over the genomic location.
 *
 * Note: this call costs O(n) and allocates a fresh array each time
 */
public String getBasesString() {
    return StringUtil.bytesToString(getBases());
}
 
Example 14
Source File: GATKRead.java    From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * @return All bases in the read as a single String, or {@link ReadConstants#NULL_SEQUENCE_STRING}
 *         if the read is empty.
 */
default String getBasesString() {
    return isEmpty() ? ReadConstants.NULL_SEQUENCE_STRING : StringUtil.bytesToString(getBases());
}