Java Code Examples for burlap.debugtools.RandomFactory#getMapped()

The following examples show how to use burlap.debugtools.RandomFactory#getMapped(). Each example notes the source file and project from which it was taken.
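As these examples illustrate, RandomFactory.getMapped(int) returns a shared java.util.Random instance keyed by an integer id, so every component that requests the same id draws from the same underlying generator. Below is a minimal sketch of using this to make a run reproducible; it assumes the seedMapped(int, long) seeding method available in standard BURLAP releases, and the id 0 simply follows the convention used throughout the examples.

import java.util.Random;

import burlap.debugtools.RandomFactory;

public class RandomFactoryDemo {

	public static void main(String[] args) {

		// Seed the shared generator mapped to id 0 so that every component
		// calling RandomFactory.getMapped(0) becomes reproducible.
		RandomFactory.seedMapped(0, 42L);

		// Repeated calls with the same id return the same Random instance.
		Random r1 = RandomFactory.getMapped(0);
		Random r2 = RandomFactory.getMapped(0);
		System.out.println(r1 == r2);        // true: shared instance
		System.out.println(r1.nextDouble()); // deterministic given the seed
	}
}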
Example 1
Source File: PolicyUtils.java    From burlap with Apache License 2.0
/**
 * This is a helper method for stochastic policies. Rather than having the policy define both the
 * {@link Policy#action(State)} method and the
 * {@link EnumerablePolicy#policyDistribution(State)} method,
 * the object need only define the {@link EnumerablePolicy#policyDistribution(State)} method; the
 * {@link Policy#action(State)} method can then simply
 * return the result of this method to sample an action.
 * @param p the {@link EnumerablePolicy}
 * @param s the input state from which an action should be selected.
 * @return an {@link Action} to take
 */
public static Action sampleFromActionDistribution(EnumerablePolicy p, State s){
	Random rand = RandomFactory.getMapped(0);
	double roll = rand.nextDouble();
	List <ActionProb> probs = p.policyDistribution(s);
	if(probs == null || probs.isEmpty()){
		throw new PolicyUndefinedException();
	}
	double sump = 0.;
	for(ActionProb ap : probs){
		sump += ap.pSelection;
		if(roll < sump){
			return ap.ga;
		}
	}

	throw new RuntimeException("Tried to sample policy action distribution, but it did not sum to 1.");

}
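To make the delegation described in the comment above concrete, here is a hypothetical stochastic policy that defines only policyDistribution(State) and lets action(State) sample through this helper. The class name UniformTwoActionPolicy, the two action names, and the fixed 50/50 distribution are illustrative assumptions, and the import paths follow the BURLAP 3 package layout.

import java.util.Arrays;
import java.util.List;

import burlap.behavior.policy.EnumerablePolicy;
import burlap.behavior.policy.PolicyUtils;
import burlap.behavior.policy.support.ActionProb;
import burlap.mdp.core.action.Action;
import burlap.mdp.core.action.SimpleAction;
import burlap.mdp.core.state.State;

public class UniformTwoActionPolicy implements EnumerablePolicy {

	@Override
	public List<ActionProb> policyDistribution(State s) {
		// A fixed 50/50 distribution over two illustrative actions.
		return Arrays.asList(
				new ActionProb(new SimpleAction("left"), 0.5),
				new ActionProb(new SimpleAction("right"), 0.5));
	}

	@Override
	public Action action(State s) {
		// Delegate sampling to the helper defined above.
		return PolicyUtils.sampleFromActionDistribution(this, s);
	}

	@Override
	public double actionProb(State s, Action a) {
		// Look the action up in the enumerated distribution.
		for(ActionProb ap : this.policyDistribution(s)){
			if(ap.ga.equals(a)){
				return ap.pSelection;
			}
		}
		return 0.;
	}

	@Override
	public boolean definedFor(State s) {
		return true;
	}
}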
 
Example 2
Source File: Option.java    From burlap with Apache License 2.0
public static EnvironmentOptionOutcome control(Option o, Environment env, double discount){
	Random rand = RandomFactory.getMapped(0);
	State initial = env.currentObservation();
	State cur = initial;

	Episode episode = new Episode(cur);
	Episode history = new Episode(cur);
	double roll;
	double pT;
	int nsteps = 0;
	double r = 0.;
	double cd = 1.;
	do{
		Action a = o.policy(cur, history);
		EnvironmentOutcome eo = env.executeAction(a);
		nsteps++;
		r += cd*eo.r;
		cur = eo.op;
		cd *= discount;


		history.transition(a, eo.op, eo.r);

		AnnotatedAction annotatedAction = new AnnotatedAction(a, o.toString() + "(" + nsteps + ")");
		episode.transition(annotatedAction, eo.op, r);


		pT = o.probabilityOfTermination(eo.op, history);
		roll = rand.nextDouble();

	}while(roll > pT && !env.isInTerminalState());

	EnvironmentOptionOutcome eoo = new EnvironmentOptionOutcome(initial, o, cur, r, env.isInTerminalState(), discount, episode);

	return eoo;

}
 
Example 3
Source File: DFS.java    From burlap with Apache License 2.0
/**
 * Initializes DFS with a depth limit, whether to maintain a closed list that affects exploration, and whether paths
 * generated by options should be explored first.
 * @param domain the domain in which to plan
 * @param gc indicates the goal states
 * @param hashingFactory the state hashing factory to use
 * @param maxDepth depth limit of DFS. -1 specifies no limit.
 * @param maintainClosed whether to maintain a closed list or not
 * @param optionsFirst whether to explore paths generated by options first.
 */
protected void DFSInit(SADomain domain, StateConditionTest gc, HashableStateFactory hashingFactory, int maxDepth, boolean maintainClosed, boolean optionsFirst){
	this.deterministicPlannerInit(domain, gc, hashingFactory);
	this.maxDepth = maxDepth;
	this.maintainClosed = maintainClosed;
	if(optionsFirst){
		this.setOptionsFirst();
	}
	
	rand = RandomFactory.getMapped(0);
}
 
Example 4
Source File: ObservationUtilities.java    From burlap with Apache License 2.0
/**
 * A helper method for easily implementing the {@link ObservationFunction#sample(State, Action)} method that
 * samples an observation by first getting all non-zero probability observations, as returned by the {@link DiscreteObservationFunction#probabilities(State, Action)}
 * method, and then sampling from the enumerated distribution. Note that enumerating all observation probabilities may be computationally
 * inefficient; therefore, it may be better to directly implement the {@link ObservationFunction#sample(State, Action)}
 * method with efficient domain specific code.
 * @param of the {@link ObservationFunction} to use.
 * @param state the true MDP state
 * @param action the action that led to the MDP state
 * @return an observation represented with a {@link State}.
 */
public static State sampleByEnumeration(DiscreteObservationFunction of, State state, Action action){
	List<ObservationProbability> obProbs = of.probabilities(state, action);
	Random rand = RandomFactory.getMapped(0);
	double r = rand.nextDouble();
	double sumProb = 0.;
	for(ObservationProbability op : obProbs){
		sumProb += op.p;
		if(r < sumProb){
			return op.observation;
		}
	}

	throw new RuntimeException("Could not sample observaiton because observation probabilities did not sum to 1; they summed to " + sumProb);
}
 
Example 5
Source File: MCRandomStateGenerator.java    From burlap with Apache License 2.0
/**
 * Initializes the generator for the {@link MountainCar} {@link Domain} for which states will be generated. By default, the random x and velocity ranges will be
 * the full range used by the domain.
 * @param params the mountain car physics parameters specifying the boundaries
 */
public MCRandomStateGenerator(MountainCar.MCPhysicsParams params){

	this.xmin = params.xmin;
	this.xmax = params.xmax;
	this.vmin = params.vmin;
	this.vmax = params.vmax;

	
	this.rand = RandomFactory.getMapped(0);
}
 
Example 6
Source File: FrameExperienceMemory.java    From burlap_caffe with Apache License 2.0
public List<FrameExperience> sampleFrameExperiences(int n) {
    List<FrameExperience> samples;

    if(this.size == 0){
        return new ArrayList<>();
    }

    if(this.alwaysIncludeMostRecent){
        n--;
    }

    if(this.size < n){
        samples = new ArrayList<>(this.size);
        for(int i = 0; i < this.size; i++){
            samples.add(this.experiences[i]);
        }
        return samples;
    }
    else{
        samples = new ArrayList<>(Math.max(n, 1));
        Random r = RandomFactory.getMapped(0);
        for(int i = 0; i < n; i++) {
            int sind = r.nextInt(this.size);
            samples.add(this.experiences[sind]);
        }
    }
    if(this.alwaysIncludeMostRecent){
        FrameExperience eo;
        if(next > 0) {
            eo = this.experiences[next - 1];
        }
        else if(size > 0){
            eo = this.experiences[this.experiences.length-1];
        }
        else{
            throw new RuntimeException("FixedSizeMemory getting most recent fails because memory is size 0.");
        }
        samples.add(eo);
    }

    return samples;
}
 
Example 7
Source File: GreedyQPolicy.java    From burlap with Apache License 2.0
public GreedyQPolicy(){
	qplanner = null;
	rand = RandomFactory.getMapped(0);
}
 
Example 8
Source File: GreedyQPolicy.java    From burlap with Apache License 2.0
/**
 * Initializes with a QProvider
 * @param planner the QProvider to use
 */
public GreedyQPolicy(QProvider planner){
	qplanner = planner;
	rand = RandomFactory.getMapped(0);
}
 
Example 9
Source File: EpsilonGreedy.java    From burlap with Apache License 2.0
/**
 * Initializes with the value of epsilon, where epsilon is the probability of taking a random action.
 * @param epsilon the probability of taking a random action.
 */
public EpsilonGreedy(double epsilon) {
	qplanner = null;
	this.epsilon = epsilon;
	rand = RandomFactory.getMapped(0);
}
 
Example 10
Source File: EpsilonGreedy.java    From burlap with Apache License 2.0
/**
 * Initializes with the QProvider to use and the value of epsilon, where epsilon is the probability of taking a random action.
 * @param planner the QProvider to use
 * @param epsilon the probability of taking a random action.
 */
public EpsilonGreedy(QProvider planner, double epsilon) {
	qplanner = planner;
	this.epsilon = epsilon;
	rand = RandomFactory.getMapped(0);
}
 
Example 11
Source File: StochasticTree.java    From burlap with Apache License 2.0
/**
 * Initializes the tree data structures
 */
protected void init(){
	root = null;
	nodeMap = new HashMap<T, StochasticTree<T>.STNode>();
	rand = RandomFactory.getMapped(2347636);
}
 
Example 12
Source File: GridGameStandardMechanics.java    From burlap with Apache License 2.0
/**
 * Initializes the mechanics for the given domain and sets the semi-wall pass through probability to 0.5.
 * @param d the domain object
 */
public GridGameStandardMechanics(Domain d){
	rand = RandomFactory.getMapped(0);
	domain = d;
	pMoveThroughSWall = 0.5;
}
 
Example 13
Source File: GridGameStandardMechanics.java    From burlap with Apache License 2.0
/**
 * Initializes the mechanics for the given domain and sets the semi-wall pass through probability to semiWallPassThroughProb.
 * @param d the domain object
 * @param semiWallPassThroughProb the probability that an agent will pass through a semi-wall.
 */
public GridGameStandardMechanics(Domain d, double semiWallPassThroughProb){
	rand = RandomFactory.getMapped(0);
	domain = d;
	pMoveThroughSWall = semiWallPassThroughProb;
}
 
Example 14
Source File: UCT.java    From burlap with Apache License 2.0
protected void UCTInit(SADomain domain, double gamma, HashableStateFactory hashingFactory, int horizon, int nRollouts, int explorationBias){
	
	this.solverInit(domain, gamma, hashingFactory);
	this.maxHorizon = horizon;
	this.maxRollOutsFromRoot = nRollouts;
	this.explorationBias = explorationBias;
	
	goalCondition = null;
	
	rand = RandomFactory.getMapped(589449);
	
}
 
Example 15
Source File: GraphDefinedDomain.java    From burlap with Apache License 2.0
/**
 * Initializes a graph action object for the given domain and for the action of the given number.
 * The name of this action will be the constant BASEACTIONNAME followed by i, where i is the action number specified.
 * @param aId the action identifier number
 * @param transitionDynamics the underlying transition dynamics that also define the action preconditions
 */
public GraphActionType(int aId, Map<Integer, Map<Integer, Set<NodeTransitionProbability>>> transitionDynamics){
	this.aId = aId;
	rand = RandomFactory.getMapped(0);
	this.transitionDynamics = transitionDynamics;
}