Java Code Examples for htsjdk.variant.variantcontext.VariantContext#getEnd()

The following examples show how to use htsjdk.variant.variantcontext.VariantContext#getEnd() . You can vote up the ones you like or vote down the ones you don't like, and go to the original project or source file by following the links above each example. You may check out the related API usage on the sidebar.
Example 1
Source File: SegmentedCpxVariantSimpleVariantExtractor.java    From gatk with BSD 3-Clause "New" or "Revised" License 6 votes vote down vote up
@VisibleForTesting
static boolean deletionConsistencyCheck(final VariantContext simple, final Set<SimpleInterval> missingSegments) {
    if (missingSegments.isEmpty()) return false;

    final SimpleInterval deletedRange = new SimpleInterval(simple.getContig(), simple.getStart() + 1, simple.getEnd());
    // dummy number for chr to be used in constructing SVInterval, since 2 input AI's both map to the same chr by this point
    final int dummyChr = 0;
    final SVInterval intervalOne = new SVInterval(dummyChr, deletedRange.getStart() - 1, deletedRange.getEnd());

    for (final SimpleInterval missing : missingSegments) {
        if ( ! missing.overlaps(deletedRange) )
            return false;
        final SVInterval intervalTwo = new SVInterval(dummyChr, missing.getStart() - 1, missing.getEnd());
        // allow 1-base fuzziness from either end
        if ( Math.abs(missing.size() - deletedRange.size()) > 2 )
            return false;
        if( 2 >= Math.abs( Math.min(missing.size(), deletedRange.size()) - intervalTwo.overlapLen(intervalOne) ) ){
            return true;
        }
    }
    return false;
}
 
Example 2
Source File: GVCFBlockCombiner.java    From gatk with BSD 3-Clause "New" or "Revised" License 6 votes vote down vote up
/**
 * Add hom-ref site from vc to this gVCF hom-ref state tracking, emitting any pending states if appropriate
 *
 * @param vc a non-null VariantContext
 * @param g  a non-null genotype from VariantContext
 * @return a VariantContext to be emitted, or null if non is appropriate
 */
protected VariantContext addHomRefSite(final VariantContext vc, final Genotype g) {

    if (nextAvailableStart != -1) {
        //there's a use case here related to ReblockGVCFs for overlapping deletions on different haplotypes
        if ( vc.getStart() <= nextAvailableStart && vc.getContig().equals(contigOfNextAvailableStart) ) {
            if (vc.getEnd() <= nextAvailableStart) {
                return null;
            }
        }
        // otherwise, reset to non-relevant
        nextAvailableStart = -1;
        contigOfNextAvailableStart = null;
    }

    final VariantContext result;
    if (genotypeCanBeMergedInCurrentBlock(g)) {
        currentBlock.add(vc.getStart(), vc.getAttributeAsInt(VCFConstants.END_KEY, vc.getStart()), g);
        result = null;
    } else {
        result = currentBlock != null ? currentBlock.toVariantContext(sampleName, floorBlocks): null;
        currentBlock = createNewBlock(vc, g);
    }
    return result;
}
 
Example 3
Source File: FuncotatorEngine.java    From gatk with BSD 3-Clause "New" or "Revised" License 6 votes vote down vote up
/**
 * Gets the correct {@link ReferenceContext} for the {@code variant} being processed based on if the B37->HG19 conversion is required.
 * @param variant {@link VariantContext} to check for B37/HG19 compliance.
 * @param referenceContext {@link ReferenceContext} on which the given {@code variant} was originally based before the variant transformation.
 * @return A {@link ReferenceContext} that is guaranteed to match the given {@code variant} for HG19/B37 compliance.
 */
public ReferenceContext getCorrectReferenceContext(final VariantContext variant, final ReferenceContext referenceContext) {

    final ReferenceContext correctReferenceContext;

    // Check to see if we need to revert the ReferenceContext's interval to the original variant interval
    // (This would only happen in the case where we were given b37 variants with hg19 data sources):
    if ( mustConvertInputContigsToHg19 ) {

        // Convert our contig back to B37 here so it matches the variant:
        final SimpleInterval interval = new SimpleInterval(
                FuncotatorUtils.convertHG19ContigToB37Contig(variant.getContig()), variant.getStart(), variant.getEnd()
        );

        correctReferenceContext = new ReferenceContext(referenceContext, interval);
    }
    else {
        correctReferenceContext = referenceContext;
    }

    return correctReferenceContext;
}
 
Example 4
Source File: VariantEval.java    From gatk with BSD 3-Clause "New" or "Revised" License 6 votes vote down vote up
private void addVariant(VariantContext vc, ReadsContext readsContext, ReferenceContext referenceContext, FeatureContext featureContext) {
    if (i == null || !vc.getContig().equals(i.getContig()) || vc.getStart() != i.getStart()) {
        callDoApply();

        i = new SimpleInterval(vc.getContig(), vc.getStart(), vc.getEnd());
        this.readsContext = readsContext;
        this.referenceContext = referenceContext;
        this.featureContext = featureContext;
    }
    else if (vc.getEnd() > i.getEnd()) {
        //expand region
        i = new SimpleInterval(i.getContig(), i.getStart(), vc.getEnd());

        this.readsContext = new ReadsContext(this.readsContext, i);
        this.referenceContext = new ReferenceContext(this.referenceContext, i);
        this.featureContext = new FeatureContext(this.featureContext, i);
    }
}
 
Example 5
Source File: CollectSamErrorMetrics.java    From picard with MIT License 6 votes vote down vote up
/**
 * Compares a VariantContext to a Locus providing information regarding possible overlap, or relative location
 *
 * @param dictionary     The {@link SAMSequenceDictionary} to use for ordering the sequences
 * @param variantContext the {@link VariantContext} to compare
 * @param locus          the {@link Locus} to compare
 * @return negative if variantContext comes before locus (with no overlap)
 * zero if variantContext and locus overlap
 * positive if variantContext comes after locus (with no overlap)
 * <p/>
 * if variantContext and locus are in the same contig the return value will be the number of bases apart they are,
 * otherwise it will be MIN_INT/MAX_INT
 */
public static int CompareVariantContextToLocus(final SAMSequenceDictionary dictionary, final VariantContext variantContext, final Locus locus) {

    final int indexDiff = dictionary.getSequenceIndex(variantContext.getContig()) - locus.getSequenceIndex();
    if (indexDiff != 0) {
        return indexDiff < 0 ? Integer.MIN_VALUE : Integer.MAX_VALUE;
    }

    //same SequenceIndex, can compare by genomic position
    if (locus.getPosition() < variantContext.getStart()) {
        return variantContext.getStart() - locus.getPosition();
    }
    if (locus.getPosition() > variantContext.getEnd()) {
        return variantContext.getEnd() - locus.getPosition();
    }
    return 0;
}
 
Example 6
Source File: VariantIteratorProducer.java    From picard with MIT License 6 votes vote down vote up
@Override
public boolean apply(final VariantContext vc) {
    if (vc.getStart() == vc.getEnd()) {
        // This isn't a multi-allelic segment, so it can only be produced by a single segment.
        return true;
    }
    final Collection<VcfFileSegment> intersectingSegments =
            multiSegmentDetectorPerFile.get(sourceSegment.vcf()).getOverlaps(new Interval(vc.getContig(), vc.getStart(), vc.getEnd()));
    if (intersectingSegments.size() < 2) {
        // There's only one segment that produces this variant
        return true;
    }

    // The convention is: only emit the VC if it is produced from the first segment that can produce it in the list.
    final int sourceSegmentIndex = segments.indexOf(sourceSegment);
    LOG.debug("Found wide variant spanning multiple source segments: ", vc);
    for (final VcfFileSegment intersectingSegment : intersectingSegments) {
        if (segments.indexOf(intersectingSegment) < sourceSegmentIndex) {
            // There is a segment that produces this variant earlier in the segment list, exclude it.
            return false;
        }
    }
    LOG.debug("Emitting wide variant because it belongs to first segment: ", vc);
    return true;
}
 
Example 7
Source File: LiftoverVcfTest.java    From picard with MIT License 5 votes vote down vote up
@Test(dataProvider = "indelFlipData")
public void testFlipIndel(final VariantContext source, final ReferenceSequence reference, final VariantContext result) {

    final LiftOver liftOver = new LiftOver(CHAIN_FILE);
    final Interval originalLocus = new Interval(source.getContig(), source.getStart(), source.getEnd());
    final Interval target = liftOver.liftOver(originalLocus);
    if (target != null && !target.isNegativeStrand()) {
        throw new RuntimeException("not reversed");
    }

    final VariantContext flipped = LiftoverUtils.liftVariant(source, target, reference, false, false);

    VcfTestUtils.assertEquals(flipped, result);
}
 
Example 8
Source File: UpdateVCFSequenceDictionary.java    From gatk with BSD 3-Clause "New" or "Revised" License 5 votes vote down vote up
@Override
public void apply(final VariantContext vc, final ReadsContext readsContext, final ReferenceContext ref, final FeatureContext featureContext) {
    // Validate each variant against the source dictionary manually
    SAMSequenceRecord samSeqRec = sourceDictionary.getSequence(vc.getContig());
    if (samSeqRec == null) {
        throw new CommandLineException.BadArgumentValue(
            String.format(
                "The input variant file contains a variant (ID: \"%s\") with a reference to a sequence (\"%s\") " +
                "that is not present in the provided dictionary",
                vc.getID(),
                vc.getContig()
            )
        );
    } else if (vc.getEnd() > samSeqRec.getSequenceLength()) {
        throw new CommandLineException.BadArgumentValue(
            String.format(
                "The input variant file contains a variant (ID: \"%s\") with a reference to a sequence (\"%s\") " +
                "that ends at a position (%d) that exceeds the length of that sequence (%d) in the provided dictionary",
                vc.getID(),
                vc.getContig(),
                vc.getEnd(),
                samSeqRec.getSequenceLength()
            )
        );
    }
    vcfWriter.add(vc);
}
 
Example 9
Source File: VCFRecordReader.java    From Hadoop-BAM with MIT License 5 votes vote down vote up
private boolean overlaps(VariantContext v) {
	if (intervals == null) {
		return true;
	}
	final Interval interval = new Interval(v.getContig(), v.getStart(), v.getEnd());
	return overlapDetector.overlapsAny(interval);
}
 
Example 10
Source File: GencodeFuncotationFactoryUnitTest.java    From gatk with BSD 3-Clause "New" or "Revised" License 5 votes vote down vote up
@Test
public void testFivePrimeFlankSorting() {
    // This variant context would get a sorting that was different than the ground truth in FuncotatorIntegrationTest#nonTrivialLargeDataValidationTest
    final VariantContext variantContext = new VariantContextBuilder("", "1", 871276, 871276,
            Arrays.asList(Allele.create("G", true), Allele.create("A"))).make();
    final String transcriptGtfFile = GENCODE_HG19_BIG_GTF_FILE;
    final String transcriptFastaFile = GENCODE_HG19_BIG_FASTA_FILE;
    final SimpleInterval variantInterval = new SimpleInterval(variantContext);
    final ReferenceContext referenceContext = new ReferenceContext(ReferenceDataSource.of(Paths.get(b37Reference)), variantInterval);
    final VariantContext vcHg19 = new VariantContextBuilder(variantContext).chr(FuncotatorUtils.convertB37ContigToHg19Contig(variantContext.getContig())).make();
    final SimpleInterval vcHg19Interval = new SimpleInterval(vcHg19.getContig(), vcHg19.getStart(), vcHg19.getEnd());

    final FeatureInput<GencodeGtfFeature> gencodeFeatureInput = new FeatureInput<>(transcriptGtfFile, GencodeFuncotationFactory.DEFAULT_NAME, Collections.emptyMap());
    final Map<FeatureInput<? extends Feature>, Class<? extends Feature>> featureInputMap = new HashMap<>();
    featureInputMap.put(gencodeFeatureInput, GencodeGtfFeature.class);
    final FeatureContext featureContext = FeatureContext.createFeatureContextForTesting(featureInputMap, "dummyName", vcHg19Interval, VariantWalker.DEFAULT_DRIVING_VARIANTS_LOOKAHEAD_BASES, 0, 0, null);

    // Sorts canonically
    try (final GencodeFuncotationFactory funcotationFactory = new GencodeFuncotationFactory(
            IOUtils.getPath(transcriptFastaFile),
            "VERSION",
            GencodeFuncotationFactory.DEFAULT_NAME,
            FuncotatorArgumentDefinitions.TRANSCRIPT_SELECTION_MODE_DEFAULT_VALUE,
            Collections.emptySet(),
            new LinkedHashMap<>(),
            gencodeFeatureInput,
            new FlankSettings(5000, 0),
            "TEST")) {

        final List<Funcotation> gencodeFuncotationList = funcotationFactory.createFuncotations(vcHg19, referenceContext, featureContext);

        System.out.println(gencodeFuncotationList.size());
    }
}
 
Example 11
Source File: SVVCFReader.java    From gatk with BSD 3-Clause "New" or "Revised" License 5 votes vote down vote up
public static SVIntervalTree<String> readBreakpointsFromTruthVCF(final String truthVCF,
                                                                 final SAMSequenceDictionary dictionary,
                                                                 final int padding ) {
    SVIntervalTree<String> breakpoints = new SVIntervalTree<>();
    try ( final FeatureDataSource<VariantContext> dataSource =
                  new FeatureDataSource<>(truthVCF, null, 0, VariantContext.class) ) {
        for ( final VariantContext vc : dataSource ) {
            final StructuralVariantType svType = vc.getStructuralVariantType();
            if ( svType == null ) continue;
            final String eventName = vc.getID();
            final int contigID = dictionary.getSequenceIndex(vc.getContig());
            if ( contigID < 0 ) {
                throw new UserException("VCF contig " + vc.getContig() + " does not appear in dictionary.");
            }
            final int start = vc.getStart();
            switch ( svType ) {
                case DEL:
                case INV:
                case CNV:
                    final int end = vc.getEnd();
                    breakpoints.put(new SVInterval(contigID,start-padding, end+padding), eventName);
                    break;
                case INS:
                case DUP:
                case BND:
                    breakpoints.put(new SVInterval(contigID,start-padding, start+padding), eventName);
                    break;
            }
        }
    }
    return breakpoints;
}
 
Example 12
Source File: ReadPosRankSumTest.java    From gatk with BSD 3-Clause "New" or "Revised" License 5 votes vote down vote up
public static OptionalDouble getReadPosition(final GATKRead read, final VariantContext vc) {
    Utils.nonNull(read);

    // edge case: if a read starts with an insertion then the variant context's end is the reference base BEFORE the read start,
    // which appears like no overlap
    if (read.getStart() == vc.getEnd() + 1 && read.getCigarElements().stream().map(CigarElement::getOperator)
            .filter(o -> !o.isClipping()).findFirst().orElse(null) == CigarOperator.INSERTION) {
        return OptionalDouble.of(0);
    }

    final Pair<Integer, CigarOperator> offset = ReadUtils.getReadIndexForReferenceCoordinate(read, vc.getStart());

    if (offset.getLeft() == ReadUtils.READ_INDEX_NOT_FOUND) {
        return OptionalDouble.empty();
    }

    // hard clips at this point in the code are perfectly good bases that were clipped to make the read fit the assembly region
    final Cigar cigar = read.getCigar();
    final CigarElement firstElement = cigar.getFirstCigarElement();
    final CigarElement lastElement = cigar.getLastCigarElement();
    final int leadingHardClips = firstElement.getOperator() == CigarOperator.HARD_CLIP ? firstElement.getLength() : 0;
    final int trailingHardClips = lastElement.getOperator() == CigarOperator.HARD_CLIP ? lastElement.getLength() : 0;


    final int leftDistance = leadingHardClips + offset.getLeft();
    final int rightDistance = read.getLength() - 1 - offset.getLeft() + trailingHardClips;
    return OptionalDouble.of(Math.min(leftDistance, rightDistance));
}
 
Example 13
Source File: DbSnpBitSetUtil.java    From picard with MIT License 4 votes vote down vote up
/** Private helper method to read through the VCF and create one or more bit sets. */
private static void loadVcf(final File dbSnpFile,
                            final SAMSequenceDictionary sequenceDictionary,
                            final Map<DbSnpBitSetUtil, Set<VariantType>> bitSetsToVariantTypes,
                            final IntervalList intervals,
                            final Optional<Log> log) {

    final Optional<ProgressLogger> progress = log.map(l -> new ProgressLogger(l, (int) 1e5, "Read", "variants"));
    final VCFFileReader variantReader = new VCFFileReader(dbSnpFile, intervals != null);
    final Iterator<VariantContext> variantIterator;
    if (intervals != null) {
        variantIterator = new ByIntervalListVariantContextIterator(variantReader, intervals);
    }
    else {
        variantIterator = variantReader.iterator();
    }

    while (variantIterator.hasNext()) {
        final VariantContext kv = variantIterator.next();

        for (final Map.Entry<DbSnpBitSetUtil, Set<VariantType>> tuple : bitSetsToVariantTypes.entrySet()) {
            final DbSnpBitSetUtil bitset            = tuple.getKey();
            final Set<VariantType> variantsToMatch  = tuple.getValue();

            BitSet bits = bitset.sequenceToBitSet.get(kv.getContig());
            if (bits == null) {
                final int nBits;
                if (sequenceDictionary == null) nBits = kv.getEnd() + 1;
                else nBits = sequenceDictionary.getSequence(kv.getContig()).getSequenceLength() + 1;
                bits = new BitSet(nBits);
                bitset.sequenceToBitSet.put(kv.getContig(), bits);
            }
            if (variantsToMatch.isEmpty() ||
                    (kv.isSNP() && variantsToMatch.contains(VariantType.SNP)) ||
                    (kv.isIndel() && variantsToMatch.contains(VariantType.insertion)) ||
                    (kv.isIndel() && variantsToMatch.contains(VariantType.deletion))) {

                for (int i = kv.getStart(); i <= kv.getEnd(); i++)  bits.set(i, true);
            }
        }
        progress.map(p -> p.record(kv.getContig(), kv.getStart()));
    }

    CloserUtil.close(variantReader);
}
 
Example 14
Source File: GencodeFuncotationFactory.java    From gatk with BSD 3-Clause "New" or "Revised" License 4 votes vote down vote up
private static boolean isInTranscriptLeftFlank(final VariantContext variant, final GencodeGtfTranscriptFeature transcript, final int flankSize) {
    return variant.getContig().equals(transcript.getContig()) &&
            variant.getEnd() < transcript.getStart() &&
            transcript.getStart() - variant.getEnd() <= flankSize;
}
 
Example 15
Source File: FuncotatorEngineUnitTest.java    From gatk with BSD 3-Clause "New" or "Revised" License 4 votes vote down vote up
@Test(dataProvider = "provideGt")
public void testGetFuncotationFactoriesAndCreateFuncotationMapForVariant(final File vcfFile,
                                                                         final List<String> correspondingGeneName,
                                                                         final boolean[] hasClinvarHit) {

    final Pair<VCFHeader, List<VariantContext>> entireVcf = VariantContextTestUtils.readEntireVCFIntoMemory(vcfFile.getAbsolutePath());
    final Map<Path, Properties> configData = DataSourceUtils.getAndValidateDataSourcesFromPaths("hg19", Collections.singletonList(DS_PIK3CA_DIR));

    final Pair<VCFHeader, List<VariantContext>> vcfFileContents = VariantContextTestUtils.readEntireVCFIntoMemory(vcfFile.getAbsolutePath());

    // Set up our arguments:
    final FuncotatorVariantArgumentCollection funcotatorArguments = new FuncotatorVariantArgumentCollection();
    funcotatorArguments.referenceVersion = BaseFuncotatorArgumentCollection.FuncotatorReferenceVersionHg19;
    funcotatorArguments.transcriptSelectionMode = TranscriptSelectionMode.CANONICAL;
    funcotatorArguments.lookaheadFeatureCachingInBp = FuncotatorArgumentDefinitions.LOOKAHEAD_CACHE_IN_BP_DEFAULT_VALUE;

    // Create the metadata directly from the input.
    final FuncotatorEngine funcotatorEngine =
            new FuncotatorEngine(
                    funcotatorArguments,
                    vcfFileContents.getLeft().getSequenceDictionary(),
                    VcfFuncotationMetadata.create(new ArrayList<>(entireVcf.getLeft().getInfoHeaderLines())),
                    DataSourceUtils.createDataSourceFuncotationFactoriesForDataSources(
                            configData,
                            new LinkedHashMap<>(),
                            TranscriptSelectionMode.CANONICAL,
                            new HashSet<>(),
                            new DummyPlaceholderGatkTool(),
                            FuncotatorArgumentDefinitions.LOOKAHEAD_CACHE_IN_BP_DEFAULT_VALUE,
                            new FlankSettings(0, 0),
                            false,
                            FuncotatorUtils.DEFAULT_MIN_NUM_BASES_FOR_VALID_SEGMENT)
            );

    for (int i = 0; i < entireVcf.getRight().size(); i++) {
        final VariantContext vc = entireVcf.getRight().get(i);
        final SimpleInterval variantInterval = new SimpleInterval(vc.getContig(), vc.getStart(), vc.getEnd());
        final ReferenceContext referenceContext = new ReferenceContext(ReferenceDataSource.of(Paths.get(FuncotatorReferenceTestUtils.retrieveHg19Chr3Ref())), variantInterval);
        final FeatureContext featureContext = FuncotatorTestUtils.createFeatureContext(funcotatorEngine.getFuncotationFactories(), "TEST", variantInterval,
                0,0,0, null);
        final FuncotationMap funcotationMap = funcotatorEngine.createFuncotationMapForVariant(vc, referenceContext, featureContext);

        // Check that all of the transcripts at this location have the same gene name as the corresponding gene.
        //  The ground truth selected has the same gene name for all transcripts.
        //  Also, input VCF has no multiallelics.
        for (final String txId : funcotationMap.getTranscriptList()) {
            Assert.assertEquals(funcotationMap.getFieldValue(txId, "Gencode_19_hugoSymbol", vc.getAlternateAllele(0)), correspondingGeneName.get(i));
            Assert.assertTrue((funcotationMap.getFieldValue(txId, "dummy_ClinVar_VCF_ALLELEID", vc.getAlternateAllele(0)).isEmpty()) != hasClinvarHit[i]);
        }
    }
}
 
Example 16
Source File: ValidateVariants.java    From gatk with BSD 3-Clause "New" or "Revised" License 4 votes vote down vote up
@Override
public void apply(final VariantContext vc, final ReadsContext readsContext, final ReferenceContext ref, final FeatureContext featureContext) {
    if (DO_NOT_VALIDATE_FILTERED && vc.isFiltered()) {
        return;
    }
    // get the true reference allele
    final Allele reportedRefAllele = vc.getReference();
    final int refLength = reportedRefAllele.length();

    final Allele observedRefAllele = hasReference() ? Allele.create(Arrays.copyOf(ref.getBases(), refLength)) : null;

    final Set<String> rsIDs = getRSIDs(featureContext);

    if (VALIDATE_GVCF) {
        final SimpleInterval refInterval = ref.getInterval();

        validateVariantsOrder(vc);

        // GenomeLocSortedSet will automatically merge intervals that are overlapping when setting `mergeIfIntervalOverlaps`
        // to true.  In a GVCF most blocks are adjacent to each other so they wouldn't normally get merged.  We check
        // if the current record is adjacent to the previous record and "overlap" them if they are so our set is as
        // small as possible while still containing the same bases.
        final int start = (previousInterval != null && previousInterval.overlapsWithMargin(refInterval, 1)) ?
                previousInterval.getStart() : refInterval.getStart();
        final int end = (previousInterval != null && previousInterval.overlapsWithMargin(refInterval, 1)) ?
                Math.max(previousInterval.getEnd(), vc.getEnd()) : vc.getEnd();
        final GenomeLoc possiblyMergedGenomeLoc = genomeLocSortedSet.getGenomeLocParser().createGenomeLoc(refInterval.getContig(), start, end);
        genomeLocSortedSet.add(possiblyMergedGenomeLoc, true);

        previousInterval = new SimpleInterval(possiblyMergedGenomeLoc);
        previousStart = vc.getStart();
        validateGVCFVariant(vc);
    }

    for (final ValidationType t : validationTypes) {
        try{
            applyValidationType(vc, reportedRefAllele, observedRefAllele, rsIDs, t);
        } catch (TribbleException e) {
            throwOrWarn(new UserException.FailsStrictValidation(drivingVariantFile, t, e.getMessage()));
        }
    }
}
 
Example 17
Source File: VariantContextVariantAdapter.java    From gatk with BSD 3-Clause "New" or "Revised" License 4 votes vote down vote up
public static GATKVariant sparkVariantAdapter(VariantContext vc) {
    return new MinimalVariant(new SimpleInterval(vc.getContig(),vc.getStart(),vc.getEnd()), vc.isSNP(), vc.isIndel());
}
 
Example 18
Source File: AS_ReadPosRankSumTest.java    From gatk with BSD 3-Clause "New" or "Revised" License 4 votes vote down vote up
@Override
public boolean isUsableRead(final GATKRead read, final VariantContext vc) {
    Utils.nonNull(read);
    // we use vc.getEnd() + 1 in case of a leading indel -- if this isn't relevant getReadPosition will return empty
    return super.isUsableRead(read, vc) && read.getSoftStart() <= vc.getEnd() + 1 && read.getSoftEnd() >= vc.getStart();
}
 
Example 19
Source File: CosmicFuncotationFactory.java    From gatk with BSD 3-Clause "New" or "Revised" License 4 votes vote down vote up
@Override
protected List<Funcotation> createFuncotationsOnVariant(final VariantContext variant, final ReferenceContext referenceContext, final List<Feature> featureList, final List<GencodeFuncotation> gencodeFuncotations) {

    final List<Funcotation> outputFuncotations = new ArrayList<>();

    // Keep count of each overlapping mutation here:
    final Map<String, Integer> proteinChangeCounts = new LinkedHashMap<>();

    // If we have gencodeFuncotations we go through them and get the gene name
    // Then query our DB for matches on the gene name.
    // Then grab Genome position / Protein position and see if we overlap.
    // If any do, we create our CosmicFuncotation
    for ( final GencodeFuncotation gencodeFuncotation : gencodeFuncotations ) {
        final String geneName = gencodeFuncotation.getHugoSymbol();

        final SimpleInterval genomePosition = new SimpleInterval(variant.getContig(), variant.getStart(), variant.getEnd());

        final SimpleInterval proteinPosition;
        if ( gencodeFuncotation.getProteinChange() != null ) {
            proteinPosition = parseProteinString(gencodeFuncotation.getProteinChange());
        }
        else {
            proteinPosition = null;
        }

        try {
            try ( final Statement statement = dbConnection.createStatement() ) {
                try ( final ResultSet resultSet = statement.executeQuery(RESULT_QUERY_TEMPLATE + "\"" + geneName + "\";") ) {
                    // iterate through our results:
                    while ( resultSet.next() ) {

                        // Get the genome position:
                        final SimpleInterval cosmicGenomePosition = getGenomePositionFromResults(resultSet);

                        // Try to match on genome position first:
                        if ( cosmicGenomePosition != null ) {
                            if ( genomePosition.overlaps(cosmicGenomePosition) ) {
                                // If we overlap the records, we get the protein change and add it to the map:
                                updateProteinChangeCountMap(proteinChangeCounts, resultSet);
                                continue;
                            }
                        }

                        // Get the protein position:
                        final SimpleInterval cosmicProteinPosition = getProteinPositionFromResults(resultSet);

                        // Now try to match on protein position:
                        if ( proteinPosition != null ) {
                            // If we overlap the records, we update the counter:
                            if ( proteinPosition.overlaps(cosmicProteinPosition) ) {
                                updateProteinChangeCountMap(proteinChangeCounts, resultSet);
                            }
                        }
                        // NOTE: We can't annotate if the protein position is null.
                    }
                }
            }
        }
        catch (final SQLException ex) {
            throw new GATKException("Unable to query the database for geneName: " + geneName, ex);
        }
    }

    // Add our counts to all alternate alleles in this variant:
    for ( final Allele altAllele : variant.getAlternateAlleles() ) {
        outputFuncotations.add(
                TableFuncotation.create(
                        new ArrayList<>(supportedFields),
                        Collections.singletonList(proteinChangeCounts.entrySet().stream()
                                .map(entry -> entry.getKey() + '('+ entry.getValue() + ')')
                                .collect(Collectors.joining("|"))),
                        altAllele,
                        name, null
                )
        );
    }

    return outputFuncotations;
}
 
Example 20
Source File: CountingVariantFilterUnitTest.java    From gatk with BSD 3-Clause "New" or "Revised" License votes vote down vote up
@Override public boolean test(final VariantContext variant){return variant.getEnd() <= 10;}