Java Code Examples for org.apache.commons.math3.util.MathArrays#shuffle()

The following examples show how to use org.apache.commons.math3.util.MathArrays#shuffle(). Each example names its source file and the project it comes from.
Example 1
Source File: RandomDataGenerator.java    From astor with GNU General Public License v2.0
/**
 * {@inheritDoc}
 *
 * This method calls {@link MathArrays#shuffle(int[],RandomGenerator)
 * MathArrays.shuffle} in order to create a random shuffle of the set
 * of natural numbers {@code { 0, 1, ..., n - 1 }}.
 *
 * @throws NumberIsTooLargeException if {@code k > n}.
 * @throws NotStrictlyPositiveException if {@code k <= 0}.
 */
public int[] nextPermutation(int n, int k)
    throws NumberIsTooLargeException, NotStrictlyPositiveException {
    if (k > n) {
        throw new NumberIsTooLargeException(LocalizedFormats.PERMUTATION_EXCEEDS_N,
                                            k, n, true);
    }
    if (k <= 0) {
        throw new NotStrictlyPositiveException(LocalizedFormats.PERMUTATION_SIZE,
                                               k);
    }

    int[] index = MathArrays.natural(n);
    MathArrays.shuffle(index, getRandomGenerator());

    // Return a new array containing the first "k" entries of "index".
    return MathArrays.copyOf(index, k);
}
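
The pattern in Example 1 (build the natural numbers, shuffle them, keep the first k) can be sketched with the standard library alone. This is only an illustration: the class name PermutationSketch, the helper names, the use of java.util.Random and the IllegalArgumentException checks are assumptions made for the sketch, and the Fisher-Yates loop is a stand-in, not a claim about how MathArrays.shuffle is implemented.

```java
import java.util.Arrays;
import java.util.Random;

public class PermutationSketch {

    /** Returns {0, 1, ..., n - 1}; a hypothetical stand-in for MathArrays.natural(n). */
    static int[] natural(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
        }
        return a;
    }

    /** In-place Fisher-Yates shuffle; a hypothetical stand-in for MathArrays.shuffle. */
    static void shuffle(int[] a, Random rng) {
        for (int i = a.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // uniform index in [0, i]
            int tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
    }

    /** Picks k distinct values from {0, ..., n - 1}, in random order. */
    static int[] nextPermutation(int n, int k, Random rng) {
        if (k > n) {
            throw new IllegalArgumentException("k (" + k + ") exceeds n (" + n + ")");
        }
        if (k <= 0) {
            throw new IllegalArgumentException("k must be strictly positive: " + k);
        }
        int[] index = natural(n);
        shuffle(index, rng);
        // Keep only the first k entries of the shuffled index array.
        return Arrays.copyOf(index, k);
    }

    public static void main(String[] args) {
        int[] sample = nextPermutation(10, 3, new Random(42L));
        System.out.println(Arrays.toString(sample));
    }
}
```

Shuffling all n entries costs O(n) even when k is small; that matches the structure of the example above and keeps the sketch easy to compare with it.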
 
Example 2
Source File: RandomDataGenerator.java    From astor with GNU General Public License v2.0
/**
 * {@inheritDoc}
 *
 * This method calls {@link MathArrays#shuffle(int[],RandomGenerator)
 * MathArrays.shuffle} in order to create a random shuffle of the set
 * of natural numbers {@code { 0, 1, ..., n - 1 }}.
 *
 * @throws NumberIsTooLargeException if {@code k > n}.
 * @throws NotStrictlyPositiveException if {@code k <= 0}.
 */
public int[] nextPermutation(int n, int k)
    throws NumberIsTooLargeException, NotStrictlyPositiveException {
    if (k > n) {
        throw new NumberIsTooLargeException(LocalizedFormats.PERMUTATION_EXCEEDS_N,
                                            k, n, true);
    }
    if (k <= 0) {
        throw new NotStrictlyPositiveException(LocalizedFormats.PERMUTATION_SIZE,
                                               k);
    }

    int[] index = getNatural(n);
    MathArrays.shuffle(index, getRandomGenerator());

    // Return a new array containing the first "k" entries of "index".
    return MathArrays.copyOf(index, k);
}
 
Example 3
Source File: RandomDataGenerator.java    From astor with GNU General Public License v2.0
/**
 * {@inheritDoc}
 *
 * <p>
 * Uses a 2-cycle permutation shuffle. The shuffling process is described <a
 * href="http://www.maths.abdn.ac.uk/~igc/tch/mx4002/notes/node83.html">
 * here</a>.
 * </p>
 * @throws NumberIsTooLargeException if {@code k > n}.
 * @throws NotStrictlyPositiveException if {@code k <= 0}.
 */
public int[] nextPermutation(int n, int k)
    throws NumberIsTooLargeException, NotStrictlyPositiveException {
    if (k > n) {
        throw new NumberIsTooLargeException(LocalizedFormats.PERMUTATION_EXCEEDS_N,
                                            k, n, true);
    }
    if (k <= 0) {
        throw new NotStrictlyPositiveException(LocalizedFormats.PERMUTATION_SIZE,
                                               k);
    }

    int[] index = getNatural(n);
    MathArrays.shuffle(index, getRandomGenerator());

    // Return a new array containing the first "k" entries of "index".
    return MathArrays.copyOf(index, k);
}
 
Example 4
Source File: RandomDataGenerator.java    From astor with GNU General Public License v2.0
/**
 * {@inheritDoc}
 *
 * This method calls {@link MathArrays#shuffle(int[],RandomGenerator)
 * MathArrays.shuffle} in order to create a random shuffle of the set
 * of natural numbers {@code { 0, 1, ..., n - 1 }}.
 *
 * @throws NumberIsTooLargeException if {@code k > n}.
 * @throws NotStrictlyPositiveException if {@code k <= 0}.
 */
public int[] nextPermutation(int n, int k)
    throws NumberIsTooLargeException, NotStrictlyPositiveException {
    if (k > n) {
        throw new NumberIsTooLargeException(LocalizedFormats.PERMUTATION_EXCEEDS_N,
                                            k, n, true);
    }
    if (k <= 0) {
        throw new NotStrictlyPositiveException(LocalizedFormats.PERMUTATION_SIZE,
                                               k);
    }

    int[] index = MathArrays.natural(n);
    MathArrays.shuffle(index, getRandomGenerator());

    // Return a new array containing the first "k" entries of "index".
    return MathArrays.copyOf(index, k);
}
 
Example 5
Source File: Dataset.java    From graphdb-benchmarks with Apache License 2.0
public Set<Integer> generateRandomNodes(int numRandomNodes)
{
    Set<String> nodes = new HashSet<String>();
    for (List<String> line : data.subList(4, data.size()))
    {
        for (String nodeId : line)
        {
            nodes.add(nodeId.trim());
        }
    }

    List<String> nodeList = new ArrayList<String>(nodes);
    int[] nodeIndexList = new int[nodeList.size()];
    for (int i = 0; i < nodeList.size(); i++)
    {
        nodeIndexList[i] = i;
    }
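    // Shuffle the indices in place; the first numRandomNodes entries then form a random sample.
    // Note: this assumes numRandomNodes <= nodeList.size() and that every node ID parses as an integer.
    MathArrays.shuffle(nodeIndexList);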

    Set<Integer> generatedNodes = new HashSet<Integer>();
    for (int i = 0; i < numRandomNodes; i++)
    {
        generatedNodes.add(Integer.valueOf(nodeList.get(nodeIndexList[i])));
    }
    return generatedNodes;
}
 
Example 6
Source File: LR.java    From ml-models with Apache License 2.0
@UserFunction(value = "regression.linear.split")
@Description("Randomly selects and returns a 'fraction' of 'data' entries. Ex. if fraction=0.75 will randomly select and " +
        "return a list containing 75% of the entries in 'data'. Use to split data into training/test sets.")
public List<Long> split(@Name("node IDs") List<Long> data, @Name("fraction") double fraction) {
    int n = data.size();
    int k = (int) Math.floor(n*fraction);

    final int[] index = MathArrays.natural(n);
    MathArrays.shuffle(index);

    List<Long> subset = new ArrayList<>(k);
    for (int i = 0; i < k; i++) subset.add(i, data.get(index[i]));

    return subset;
}
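
The same idea extends to returning both halves of the split, which is what a training/test workflow usually needs. The sketch below is an assumption-laden variant, not the ml-models code: the class name SplitSketch, the two-list return shape and the explicit Random argument are all hypothetical, and it swaps in Collections.shuffle on a copy of the data in place of shuffling an index array with MathArrays.shuffle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SplitSketch {

    /**
     * Randomly splits data into two lists: the first holds floor(n * fraction)
     * entries, the second holds the remainder. The input list is not modified.
     */
    static List<List<Long>> split(List<Long> data, double fraction, Random rng) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]: " + fraction);
        }
        int n = data.size();
        int k = (int) Math.floor(n * fraction);

        // Shuffle a copy so the caller's list keeps its order.
        List<Long> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, rng);

        List<List<Long>> parts = new ArrayList<>(2);
        parts.add(new ArrayList<>(shuffled.subList(0, k))); // selected fraction
        parts.add(new ArrayList<>(shuffled.subList(k, n))); // everything else
        return parts;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 8; i++) {
            ids.add(i);
        }
        List<List<Long>> parts = split(ids, 0.75, new Random(123L));
        System.out.println("train=" + parts.get(0) + " test=" + parts.get(1));
    }
}
```

Passing the Random in explicitly makes the split reproducible from a seed, which the no-argument shuffle call in the example above does not offer.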
 
Example 7
Source File: CallGraphGenerator.java    From fasten with Apache License 2.0
/** Generates <code>np</code> call graphs. Each call graph is obtained using {@link #preferentialAttachmentDAG(int, int, IntegerDistribution, RandomGenerator)} with
 *  the specified initial graph size (<code>initialGraphSizeDistribution</code>), graph size (<code>graphSizeDistribution</code>) and outdegree distribution (<code>outdegreeDistribution</code>).
 *  Then a dependency DAG is generated between the call graphs, once more using {@link #preferentialAttachmentDAG(int, int, IntegerDistribution, RandomGenerator)} (this
 *  time the initial graph size is 1, whereas the outdegree distribution is <code>dependencyOutdegreeDistribution</code>).
 *  Then to each node of each call graph a new set of outgoing arcs is generated (their number is chosen using <code>externalOutdegreeDistribution</code>): the target
 *  call graph is generated using the indegree distribution of the dependency DAG; the target node is chosen according to the reverse indegree distribution within the revision call graph.
 *
 * @param np number of revision call graphs to be generated.
 * @param graphSizeDistribution the distribution of the graph sizes (number of functions per call graph).
 * @param initialGraphSizeDistribution the distribution of the initial graph sizes (the initial independent set from which the preferential attachment starts).
 * @param outdegreeDistribution the distribution of internal outdegrees (number of internal calls per function).
 * @param externalOutdegreeDistribution the distribution of external outdegrees (number of external calls per function).
 * @param dependencyOutdegreeDistribution the distribution of outdegrees used to generate the dependency DAG between call graphs.
 * @param random the random object used for the generation.
 */
public void generate(final int np, final IntegerDistribution graphSizeDistribution, final IntegerDistribution initialGraphSizeDistribution,
		final IntegerDistribution outdegreeDistribution, final IntegerDistribution externalOutdegreeDistribution, final IntegerDistribution dependencyOutdegreeDistribution, final RandomGenerator random) {
	rcgs = new ArrayListMutableGraph[np];
	nodePermutation = new int[np][];
	final FenwickTree[] td = new FenwickTree[np];
	deps = new IntOpenHashSet[np];
	source2Targets = new ObjectOpenCustomHashSet[np];

	// Generate rcg of the np revisions, and the corresponding reverse preferential distribution; cumsize[i] is the sum of all nodes in packages <i
	for ( int i = 0; i < np; i++) {
		deps[i] = new IntOpenHashSet();
		final int n = graphSizeDistribution.sample();
		final int n0 = Math.min(initialGraphSizeDistribution.sample(), n);
		rcgs[i] = preferentialAttachmentDAG(n, n0, outdegreeDistribution, random);
		td[i] = getPreferentialDistribution(rcgs[i].immutableView(), true);
		nodePermutation[i] = Util.identity(n);
		Collections.shuffle(IntArrayList.wrap(nodePermutation[i]), new Random(random.nextLong()));
	}

	// Generate the dependency DAG between revisions using preferential attachment starting from 1 node
	final ArrayListMutableGraph depDAG = preferentialAttachmentDAG(np, 1, dependencyOutdegreeDistribution, random);

	// For each source package, generate function calls so to cover all dependencies
	for (int sourcePackage = 0; sourcePackage < np; sourcePackage++) {
		source2Targets[sourcePackage] = new ObjectOpenCustomHashSet<>(IntArrays.HASH_STRATEGY);
		final int outdegree = depDAG.outdegree(sourcePackage);
		if (outdegree == 0) continue; // No calls needed (I'm kinda busy)

		final int numFuncs = rcgs[sourcePackage].numNodes();
		final int[] externalArcs = new int[numFuncs];
		int allExternalArcs = 0;
		// We decide how many calls to dispatch from each function
		for (int sourceNode = 0; sourceNode < numFuncs; sourceNode++) allExternalArcs += (externalArcs[sourceNode] = externalOutdegreeDistribution.sample());
		// We create a global list of external successors by shuffling
		final int[] targetPackage = new int[allExternalArcs];
		final int[] succ = depDAG.successorArray(sourcePackage);
		for(int i = 0; i < outdegree; i++) deps[sourcePackage].add(succ[i]);
		for(int i = 0; i < allExternalArcs; i++) targetPackage[i] = succ[i % outdegree];
		MathArrays.shuffle(targetPackage, random);

		for (int sourceNode = allExternalArcs = 0; sourceNode < numFuncs; sourceNode++) {
			final int externalOutdegree = externalArcs[sourceNode];
			for (int t = 0; t < externalOutdegree; t++) {
				final int targetNode = td[targetPackage[allExternalArcs + t]].sample(random) - 1;
				source2Targets[sourcePackage].add(new int[] { sourceNode, targetPackage[allExternalArcs + t], targetNode });
			}
			allExternalArcs += externalOutdegree;
		}
	}
}
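
The step in Example 7 that fills targetPackage by cycling through the successors and then shuffles it can be isolated as below. This is a sketch under assumptions: the class and method names are hypothetical, java.util.Random replaces the RandomGenerator used above, and the Fisher-Yates loop stands in for MathArrays.shuffle. It shows that every successor receives an almost equal share of the arcs, and that shuffling changes only the order, not the counts.

```java
import java.util.Arrays;
import java.util.Random;

public class BalancedTargetsSketch {

    /**
     * Fills an array of length totalArcs by cycling through successors,
     * then shuffles it in place so the assignment order is random.
     */
    static int[] balancedShuffledTargets(int[] successors, int totalArcs, Random rng) {
        if (successors.length == 0) {
            throw new IllegalArgumentException("at least one successor is required");
        }
        int[] targets = new int[totalArcs];
        for (int i = 0; i < totalArcs; i++) {
            // Round-robin: per-successor counts differ by at most one.
            targets[i] = successors[i % successors.length];
        }
        // Fisher-Yates shuffle (stand-in for MathArrays.shuffle).
        for (int i = targets.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = targets[i];
            targets[i] = targets[j];
            targets[j] = tmp;
        }
        return targets;
    }

    public static void main(String[] args) {
        int[] targets = balancedShuffledTargets(new int[] {3, 5, 9}, 7, new Random(7L));
        System.out.println(Arrays.toString(targets));
    }
}
```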
 
Example 8
Source File: KolmogorovSmirnovTest.java    From astor with GNU General Public License v2.0
/**
 * Uses Monte Carlo simulation to approximate \(P(D_{n,m} > d)\) where \(D_{n,m}\) is the
 * 2-sample Kolmogorov-Smirnov statistic. See
 * {@link #kolmogorovSmirnovStatistic(double[], double[])} for the definition of \(D_{n,m}\).
 * <p>
 * The simulation generates {@code iterations} random partitions of {@code m + n} into an
 * {@code n} set and an {@code m} set, computing \(D_{n,m}\) for each partition and returning
 * the proportion of values that are greater than {@code d}, or greater than or equal to
 * {@code d} if {@code strict} is {@code false}.
 * </p>
 *
 * @param d D-statistic value
 * @param n first sample size
 * @param m second sample size
 * @param strict whether or not the probability to compute is expressed as a strict inequality
 * @param iterations number of random partitions to generate
 * @return proportion of randomly generated m-n partitions of m + n that result in \(D_{n,m}\)
 *         greater than (resp. greater than or equal to) {@code d}
 */
public double monteCarloP(double d, int n, int m, boolean strict, int iterations) {
    final int[] nPlusMSet = MathArrays.natural(m + n);
    final double[] nSet = new double[n];
    final double[] mSet = new double[m];
    int tail = 0;
    for (int i = 0; i < iterations; i++) {
        copyPartition(nSet, mSet, nPlusMSet, n, m);
        final double curD = kolmogorovSmirnovStatistic(nSet, mSet);
        if (curD > d) {
            tail++;
        } else if (curD == d && !strict) {
            tail++;
        }
        MathArrays.shuffle(nPlusMSet, rng);
        Arrays.sort(nPlusMSet, 0, n);
    }
    return (double) tail / iterations;
}
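
The random partition that Example 8 draws on each iteration (shuffle the m + n indices, treat the first n as one set and the rest as the other) can be sketched on its own. The class and method names here are hypothetical, java.util.Random replaces the rng field, and the Fisher-Yates loop is a stand-in for MathArrays.shuffle. The copyPartition helper is not shown in the excerpt, so the sketch does not try to reproduce it; it simply returns the two index sets, each sorted.

```java
import java.util.Arrays;
import java.util.Random;

public class PartitionSketch {

    /**
     * Randomly partitions {0, ..., n + m - 1} into a set of size n and a set
     * of size m. Returns {first, second}, each sorted in ascending order.
     */
    static int[][] randomPartition(int n, int m, Random rng) {
        int[] all = new int[n + m];
        for (int i = 0; i < all.length; i++) {
            all[i] = i;
        }
        // Fisher-Yates shuffle (stand-in for MathArrays.shuffle).
        for (int i = all.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = all[i];
            all[i] = all[j];
            all[j] = tmp;
        }
        int[] first = Arrays.copyOfRange(all, 0, n);
        int[] second = Arrays.copyOfRange(all, n, n + m);
        Arrays.sort(first);
        Arrays.sort(second);
        return new int[][] {first, second};
    }

    public static void main(String[] args) {
        int[][] parts = randomPartition(3, 4, new Random(2024L));
        System.out.println(Arrays.toString(parts[0]) + " | " + Arrays.toString(parts[1]));
    }
}
```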
 