Java Code Examples for htsjdk.variant.variantcontext.Allele#getBases()

The following examples show how to use htsjdk.variant.variantcontext.Allele#getBases().
Example 1
Source File: ArtificialReadUtils.java    From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * See {@link ArtificialReadUtils#createSplicedInsertionPileupElement}, except that this method returns a
 * pileup element containing base-by-base replacement.  As a result, the length of the read will not change.
 *
 * @param offsetIntoRead See {@link ArtificialReadUtils#createSplicedInsertionPileupElement}
 * @param newAllele The new bases that should be in the read at the specified position.  If this allele causes the
 *                  replacement to extend beyond the end of the read
 *                  (i.e. offsetIntoRead + length(newAllele) is greater than length of read),
 *                  the replacement will be truncated.
 * @param lengthOfRead See {@link ArtificialReadUtils#createSplicedInsertionPileupElement}
 * @return pileupElement with an artificial read containing the new bases specified by the given allele.
 */
public static PileupElement createNonIndelPileupElement(final int offsetIntoRead, final Allele newAllele, final int lengthOfRead) {
    ParamUtils.isPositive(lengthOfRead, "length of read is invalid for creating an artificial read, must be greater than 0.");
    ParamUtils.inRange(offsetIntoRead, 0, lengthOfRead-1, "offset into read is invalid for creating an artificial read, must be 0-" + (lengthOfRead-1) + ".");
    Utils.nonNull(newAllele);

    final String cigarString = lengthOfRead + "M";

    final Cigar cigar = TextCigarCodec.decode(cigarString);
    final GATKRead gatkRead = ArtificialReadUtils.createArtificialRead(cigar);
    final byte[] newBases = gatkRead.getBases();
    final int upperBound = Math.min(offsetIntoRead + newAllele.getBases().length, lengthOfRead);
    for (int i = offsetIntoRead; i < upperBound; i++) {
        newBases[i] = newAllele.getBases()[i - offsetIntoRead];
    }
    gatkRead.setBases(newBases);
    return PileupElement.createPileupForReadAndOffset(gatkRead, offsetIntoRead);
}
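
The truncation rule in createNonIndelPileupElement can be tried in isolation. The following is a minimal sketch, not GATK code: plain byte arrays stand in for GATKRead and Allele, and the class and method names (TruncatedReplacementSketch, replaceBases) are hypothetical. It assumes the offset has already been validated to lie inside the read, as the original does with ParamUtils.inRange.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-alone sketch; byte arrays stand in for GATKRead and Allele. */
final class TruncatedReplacementSketch {

    /**
     * Returns a copy of readBases with alleleBases written over it starting at offset.
     * Allele bases that would run past the end of the read are dropped, so the
     * length of the result always equals the length of the input read.
     * Assumes 0 <= offset < readBases.length.
     */
    static byte[] replaceBases(final byte[] readBases, final int offset, final byte[] alleleBases) {
        final byte[] result = readBases.clone();
        final int upperBound = Math.min(offset + alleleBases.length, readBases.length);
        System.arraycopy(alleleBases, 0, result, offset, upperBound - offset);
        return result;
    }

    public static void main(final String[] args) {
        final byte[] read = "AAAAA".getBytes(StandardCharsets.US_ASCII);
        final byte[] allele = "CGT".getBytes(StandardCharsets.US_ASCII);
        // Allele fits entirely inside the read.
        System.out.println(new String(replaceBases(read, 1, allele), StandardCharsets.US_ASCII));
        // Allele overruns the read end, so only its first two bases are written.
        System.out.println(new String(replaceBases(read, 3, allele), StandardCharsets.US_ASCII));
    }
}
```

Calling getBases() once and reusing the array, as the sketch does with its parameter, avoids repeated accessor calls inside the copy loop.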
 
Example 2
Source File: ReadOrientationFilter.java    From gatk with BSD 3-Clause "New" or "Revised" License
@VisibleForTesting
double artifactProbability(final ReferenceContext referenceContext, final VariantContext vc, final Genotype g) {
    // As of June 2018, genotype is hom ref iff we have the normal sample, but this may change in the future
    // TODO: handle MNVs
    if (g.isHomRef() || (!vc.isSNP() && !vc.isMNP()) ){
        return 0;
    } else if (!artifactPriorCollections.containsKey(g.getSampleName())) {
        return 0;
    }

    final double[] tumorLods = VariantContextGetters.getAttributeAsDoubleArray(vc, GATKVCFConstants.TUMOR_LOG_10_ODDS_KEY, () -> null, -1);
    final int indexOfMaxTumorLod = MathUtils.maxElementIndex(tumorLods);
    final Allele altAllele = vc.getAlternateAllele(indexOfMaxTumorLod);
    final byte[] altBases = altAllele.getBases();

    // for MNVs, treat each base as an independent substitution and take the maximum of all error probabilities
    return IntStream.range(0, altBases.length).mapToDouble(n -> {
        final Nucleotide altBase = Nucleotide.valueOf(new String(new byte[] {altBases[n]}));

        return artifactProbability(referenceContext, vc.getStart() + n, g, indexOfMaxTumorLod, altBase);
    }).max().orElse(0.0);

}
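
The "maximum over per-base probabilities" pattern at the end of artifactProbability can be shown without any GATK types. This is a rough sketch under stated assumptions: MaxPerBaseSketch and maxOverBases are hypothetical names, and the per-base scoring function is supplied by the caller instead of coming from an artifact prior.

```java
import java.util.function.IntToDoubleFunction;
import java.util.stream.IntStream;

/** Hypothetical stand-alone sketch of the per-base maximum used for multi-base alleles. */
final class MaxPerBaseSketch {

    /**
     * Scores every index of altBases independently and returns the largest score.
     * An empty allele yields 0.0, mirroring the orElse(0.0) fallback in the original.
     */
    static double maxOverBases(final byte[] altBases, final IntToDoubleFunction scoreAtIndex) {
        return IntStream.range(0, altBases.length)
                .mapToDouble(scoreAtIndex)
                .max()
                .orElse(0.0);
    }

    public static void main(final String[] args) {
        final byte[] altBases = {'A', 'C', 'G'};
        // Toy scoring function: the score grows with the index.
        System.out.println(maxOverBases(altBases, n -> n * 0.25));
        // No bases at all: the fallback value is returned.
        System.out.println(maxOverBases(new byte[0], n -> 1.0));
    }
}
```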
 
Example 3
Source File: FuncotatorUtils.java    From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * Get the start position of the difference between the reference and alternate alleles in the given {@link VariantContext}.
 * @param variant {@link VariantContext} in which to determine the start of the different bases.
 * @param altAllele The alternate {@link Allele} against which to check the reference allele in {@code variant}.
 * @return The start position of the difference between the reference and alternate alleles in the given {@link VariantContext}.
 */
public static int getIndelAdjustedAlleleChangeStartPosition(final VariantContext variant, final Allele altAllele) {

    // If the variant is an indel, we need to check only the bases that are added/deleted for overlap.
    // The convention for alleles in Funcotator is to preserve a leading base for an indel, so we just need
    // to create a new variant that has its start position shifted by the leading base.
    // NOTE: because there could be degenerate VCF files that have more than one leading base overlapping, we need
    //       to detect how many leading bases there are that overlap, rather than assuming there is only one.
    final int varStart;
    if ( GATKVariantContextUtils.typeOfVariant(variant.getReference(), altAllele).equals(VariantContext.Type.INDEL) &&
         !GATKVariantContextUtils.isComplexIndel(variant.getReference(), altAllele) ) {
        int startOffset = 0;
        while ( (startOffset < variant.getReference().length()) && (startOffset < altAllele.length()) && (variant.getReference().getBases()[ startOffset ] == altAllele.getBases()[ startOffset ]) ) {
            ++startOffset;
        }
        varStart = variant.getStart() + startOffset;
    }
    else {
        // Not an indel?  Then we should have no overlapping bases:
        varStart = variant.getStart();
    }

    return varStart;
}
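
The core of getIndelAdjustedAlleleChangeStartPosition is counting how many leading bases the reference and alternate alleles share. A minimal sketch of that step follows, using raw byte arrays in place of Allele objects; LeadingBaseOverlapSketch, countSharedLeadingBases and adjustedStart are hypothetical names, and the indel/complex-indel type check from the original is deliberately left out.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-alone sketch; byte arrays stand in for the ref and alt Allele bases. */
final class LeadingBaseOverlapSketch {

    /** Counts the bases that ref and alt have in common, starting from index 0. */
    static int countSharedLeadingBases(final byte[] ref, final byte[] alt) {
        final int limit = Math.min(ref.length, alt.length);
        int shared = 0;
        while (shared < limit && ref[shared] == alt[shared]) {
            shared++;
        }
        return shared;
    }

    /** Shifts the variant start past the shared leading bases. */
    static int adjustedStart(final int variantStart, final byte[] ref, final byte[] alt) {
        return variantStart + countSharedLeadingBases(ref, alt);
    }

    public static void main(final String[] args) {
        final byte[] ref = "A".getBytes(StandardCharsets.US_ASCII);
        final byte[] alt = "ATG".getBytes(StandardCharsets.US_ASCII);
        // Insertion with one leading reference base: the change starts one position later.
        System.out.println(adjustedStart(100, ref, alt));
    }
}
```

Counting the overlap instead of assuming a single leading base is what lets the original cope with VCF records that carry more than one shared leading base.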
 
Example 4
Source File: ArtificialReadUtils.java    From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * Create a pileupElement with the given insertion added to the bases.
 *
 * Assumes the insertion is prepended with one "reference" base.
 *
 * @param offsetIntoRead the offset into the read where the insertion Allele should appear.  As a reminder, the
 *                       insertion allele should have a single ref base prepended.  Must be 0 - (lengthOfRead-1)
 * @param insertionAllele the allele as you would see in a VCF for the insertion.  So, it is prepended with a ref base.  Never {@code null}
 * @param lengthOfRead the length of the artificial read.  Does not include any length differences due to the spliced indel.  Must be greater than zero.
 * @return pileupElement with an artificial read containing the insertion.
 */
public static PileupElement createSplicedInsertionPileupElement(int offsetIntoRead, final Allele insertionAllele, final int lengthOfRead) {

    ParamUtils.isPositive(lengthOfRead, "length of read is invalid for creating an artificial read, must be greater than 0.");
    ParamUtils.inRange(offsetIntoRead, 0, lengthOfRead-1, "offset into read is invalid for creating an artificial read, must be 0-" + (lengthOfRead-1) + ".");
    Utils.nonNull(insertionAllele);

    int remainingReadLength = lengthOfRead - ((offsetIntoRead + 1) + (insertionAllele.getBases().length - 1));
    String cigarString = (offsetIntoRead + 1) + "M" + (insertionAllele.getBases().length - 1) + "I";
    if (remainingReadLength > 0) {
        cigarString += (remainingReadLength + "M");
    }

    final Cigar cigar = TextCigarCodec.decode(cigarString);
    final GATKRead gatkRead = ArtificialReadUtils.createArtificialRead(cigar);
    final PileupElement pileupElement = PileupElement.createPileupForReadAndOffset(gatkRead, offsetIntoRead);

    // Splice in that insertion.
    final byte[] bases = gatkRead.getBases();
    final int newReadLength = lengthOfRead + insertionAllele.getBases().length - 1;
    final byte[] destBases = new byte[newReadLength];
    final byte[] basesToInsert = ArrayUtils.subarray(insertionAllele.getBases(), 1, insertionAllele.getBases().length);
    System.arraycopy(bases, 0, destBases, 0, offsetIntoRead);

    // Make sure that the one prepended "reference base" matches the input.
    destBases[offsetIntoRead] = insertionAllele.getBases()[0];

    System.arraycopy(basesToInsert, 0, destBases, offsetIntoRead+1, basesToInsert.length);

    if ((offsetIntoRead + 1) < lengthOfRead) {
        System.arraycopy(bases, offsetIntoRead + 1, destBases, offsetIntoRead + basesToInsert.length + 1, bases.length - 1 - offsetIntoRead);
    }

    gatkRead.setBases(destBases);

    return pileupElement;
}
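
The byte-level splice performed by createSplicedInsertionPileupElement can be sketched on its own. This is an illustration under assumptions, not the GATK implementation: InsertionSpliceSketch and spliceInsertion are hypothetical names, byte arrays replace GATKRead and Allele, CIGAR handling is omitted, and the offset is assumed to be already validated.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-alone sketch of splicing a VCF-style insertion allele into read bases. */
final class InsertionSpliceSketch {

    /**
     * insertionAllele is one leading reference base followed by the inserted bases.
     * The base at offset is overwritten with that leading base, the inserted bases
     * follow it, and the rest of the read is shifted right.
     * Assumes 0 <= offset < readBases.length and insertionAllele.length >= 1.
     */
    static byte[] spliceInsertion(final byte[] readBases, final int offset, final byte[] insertionAllele) {
        final int insertedCount = insertionAllele.length - 1;
        final byte[] dest = new byte[readBases.length + insertedCount];
        // Bases before the insertion site are unchanged.
        System.arraycopy(readBases, 0, dest, 0, offset);
        // Leading reference base plus the inserted bases.
        System.arraycopy(insertionAllele, 0, dest, offset, insertionAllele.length);
        // Remainder of the read, shifted right by insertedCount.
        System.arraycopy(readBases, offset + 1, dest, offset + insertionAllele.length,
                readBases.length - offset - 1);
        return dest;
    }

    public static void main(final String[] args) {
        final byte[] read = "AAAAA".getBytes(StandardCharsets.US_ASCII);
        final byte[] allele = "CGG".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(spliceInsertion(read, 2, allele), StandardCharsets.US_ASCII));
    }
}
```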
 
Example 5
Source File: ArtificialReadUtils.java    From gatk with BSD 3-Clause "New" or "Revised" License
/** See {@link ArtificialReadUtils#createSplicedInsertionPileupElement}, except that this method returns a
 * pileup element containing the specified deletion.
 *
 * @param offsetIntoRead See {@link ArtificialReadUtils#createSplicedInsertionPileupElement}
 * @param referenceAllele  the reference allele as you would see in a VCF for the deletion.
 *                         In other words, it is the deletion prepended with a single ref base.  Never {@code null}
 * @param lengthOfRead See {@link ArtificialReadUtils#createSplicedInsertionPileupElement}
 * @return pileupElement with an artificial read containing the deletion.
 */
public static PileupElement createSplicedDeletionPileupElement(int offsetIntoRead, final Allele referenceAllele, final int lengthOfRead) {
    ParamUtils.isPositive(lengthOfRead, "length of read is invalid for creating an artificial read, must be greater than 0.");
    ParamUtils.inRange(offsetIntoRead, 0, lengthOfRead-1, "offset into read is invalid for creating an artificial read, must be 0-" + (lengthOfRead-1) + ".");
    Utils.nonNull(referenceAllele);

    // Do not include the prepended "ref"
    final int numberOfSpecifiedBasesToDelete = referenceAllele.getBases().length - 1;
    final int numberOfBasesToActuallyDelete = Math.min(numberOfSpecifiedBasesToDelete, lengthOfRead - offsetIntoRead - 1);

    final int newReadLength = lengthOfRead - numberOfBasesToActuallyDelete;

    String cigarString = (offsetIntoRead + 1) + "M";

    if (numberOfBasesToActuallyDelete > 0) {
        cigarString += numberOfBasesToActuallyDelete + "D";
    }
    final int remainingBases = lengthOfRead - (offsetIntoRead + 1) - numberOfBasesToActuallyDelete;
    if (remainingBases > 0) {
        cigarString += remainingBases + "M";
    }

    final Cigar cigar = TextCigarCodec.decode(cigarString);
    final GATKRead gatkRead = ArtificialReadUtils.createArtificialRead(cigar);
    final PileupElement pileupElement = PileupElement.createPileupForReadAndOffset(gatkRead, offsetIntoRead);

    // The Cigar string has basically already told the initial generation of a read to delete bases.
    final byte[] bases = gatkRead.getBases();

    // Make sure that the one prepended "reference base" matches the input.
    bases[offsetIntoRead] = referenceAllele.getBases()[0];

    gatkRead.setBases(bases);

    return pileupElement;
}
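
The CIGAR construction in createSplicedDeletionPileupElement depends only on three integers, so it is easy to check by hand. The sketch below reproduces that arithmetic with hypothetical names (DeletionCigarSketch, deletionCigar) and takes the allele length directly instead of calling getBases().length; it assumes the same input validation as the original.

```java
/** Hypothetical stand-alone sketch of the CIGAR arithmetic for a spliced deletion. */
final class DeletionCigarSketch {

    /**
     * alleleLength counts the leading reference base plus the deleted bases.
     * The deletion is shortened if it would extend past the end of the read.
     * Assumes lengthOfRead > 0 and 0 <= offset < lengthOfRead.
     */
    static String deletionCigar(final int offset, final int alleleLength, final int lengthOfRead) {
        final int specifiedToDelete = alleleLength - 1; // exclude the leading ref base
        final int actuallyDeleted = Math.min(specifiedToDelete, lengthOfRead - offset - 1);

        final StringBuilder cigar = new StringBuilder().append(offset + 1).append('M');
        if (actuallyDeleted > 0) {
            cigar.append(actuallyDeleted).append('D');
        }
        final int remaining = lengthOfRead - (offset + 1) - actuallyDeleted;
        if (remaining > 0) {
            cigar.append(remaining).append('M');
        }
        return cigar.toString();
    }

    public static void main(final String[] args) {
        System.out.println(deletionCigar(2, 3, 10)); // deletion fits inside the read
        System.out.println(deletionCigar(8, 4, 10)); // deletion truncated at the read end
        System.out.println(deletionCigar(9, 4, 10)); // nothing left to delete
    }
}
```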
 
Example 6
Source File: FastaAlternateReferenceMaker.java    From gatk with BSD 3-Clause "New" or "Revised" License
private byte[] handlePosition(SimpleInterval interval, byte base, FeatureContext features) {
    if (deletionBasesRemaining > 0) {
        deletionBasesRemaining--;
        return NO_BASES;
    }

    // If we have a mask at this site, use it
    if ( snpmaskPriority ){
        if (isMasked(features) )
            return N_BYTES;
    }

    // Check to see if we have a called snp
    for ( final VariantContext vc : features.getValues(variants) ) {
        if ( vc.isFiltered() || vc.getStart() != interval.getStart()  )
            continue;

        if ( vc.isSimpleDeletion()) {
            deletionBasesRemaining = vc.getReference().length() - 1;
            // delete the next n bases, not this one
            return baseToByteArray(base);
        } else if ( vc.isSimpleInsertion() || vc.isSNP() ) {
            // Get the first alt allele that is not a spanning deletion. If none present, use the empty allele
            final Optional<Allele> optionalAllele = getFirstConcreteAltAllele(vc.getAlternateAlleles());
            final Allele allele = optionalAllele.orElseGet(() -> Allele.create(EMPTY_BASE, false));
            if ( vc.isSimpleInsertion() ) {
                return allele.getBases();
            } else {
                final String iupacBase = (iupacSample != null) ? getIUPACBase(vc.getGenotype(iupacSample)) : allele.toString();
                return iupacBase.getBytes();
            }
        }
    }

    if ( !snpmaskPriority ){
        if ( isMasked(features)) {
            return N_BYTES;
        }
    }

    // if we got here then we're just ref
    return baseToByteArray(base);
}
 
Example 7
Source File: SubsettedLikelihoodMatrix.java    From gatk with BSD 3-Clause "New" or "Revised" License
public static boolean basesMatch(final Allele a, final Allele b) {
    return a.getBases() == b.getBases() || Arrays.equals(a.getBases(), b.getBases());
}
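
The reason basesMatch falls back to Arrays.equals is that == on Java arrays compares references, not contents. The sketch below illustrates this with plain byte arrays; BasesMatchSketch is a hypothetical name and no htsjdk types are involved.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch contrasting reference and content equality on byte arrays. */
final class BasesMatchSketch {

    /** True if both arrays are the same object or hold the same bytes in the same order. */
    static boolean basesMatch(final byte[] a, final byte[] b) {
        return a == b || Arrays.equals(a, b);
    }

    public static void main(final String[] args) {
        final byte[] first = {'A', 'C'};
        final byte[] second = {'A', 'C'};
        System.out.println(first == second);           // distinct objects, so reference comparison fails
        System.out.println(basesMatch(first, second)); // contents are identical
        System.out.println(basesMatch(first, first));  // same object, the shortcut applies
    }
}
```

The reference check is only a shortcut for the case where both sides hold the same array object; the content comparison is what gives the correct answer for two separately allocated alleles with identical bases.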