burlap.behavior.singleagent.planning.Planner Java Examples

The following examples show how to use burlap.behavior.singleagent.planning.Planner. Each example lists its source file, the project it comes from, and that project's license.
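A Planner is typically obtained by constructing one of BURLAP's planning algorithms (for instance ValueIteration, as in Examples #15 and #16) and calling planFromState(State), which returns a Policy. Below is a minimal sketch of that pattern, assuming a simple GridWorldDomain setup; the class name PlannerSketch, the grid size, and the goal location are illustrative and not taken from any example on this page.

import burlap.behavior.policy.Policy;
import burlap.behavior.policy.PolicyUtils;
import burlap.behavior.singleagent.planning.Planner;
import burlap.behavior.singleagent.planning.stochastic.valueiteration.ValueIteration;
import burlap.domain.singleagent.gridworld.GridWorldDomain;
import burlap.domain.singleagent.gridworld.GridWorldTerminalFunction;
import burlap.domain.singleagent.gridworld.state.GridAgent;
import burlap.domain.singleagent.gridworld.state.GridWorldState;
import burlap.mdp.core.state.State;
import burlap.mdp.singleagent.SADomain;
import burlap.statehashing.simple.SimpleHashableStateFactory;

public class PlannerSketch {

	public static void main(String[] args) {
		// Build a small grid world whose terminal (goal) cell is the top-right corner.
		GridWorldDomain gwd = new GridWorldDomain(11, 11);
		gwd.setTf(new GridWorldTerminalFunction(10, 10));
		SADomain domain = gwd.generateDomain();
		State initialState = new GridWorldState(new GridAgent(0, 0));

		// Any planning algorithm implementing Planner works here; ValueIteration is used
		// with the same constructor arguments seen in Examples #15 and #16.
		Planner planner = new ValueIteration(domain, 0.99, new SimpleHashableStateFactory(), 0.001, 100);

		// planFromState runs planning from the given state and returns the resulting policy.
		Policy p = planner.planFromState(initialState);

		// Roll the policy out against the domain model, capped at 200 steps.
		PolicyUtils.rollout(p, initialState, domain.getModel(), 200);
	}
}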
Example #1
Source File: QMDP.java    From burlap with Apache License 2.0
/**
 * Calls the {@link burlap.behavior.singleagent.planning.Planner#planFromState(State)} method
 * on all states defined in the POMDP. Calling this method requires that the PODomain provides a {@link burlap.behavior.singleagent.auxiliary.StateEnumerator},
 * otherwise an exception will be thrown.
 */
public void forceMDPPlanningFromAllStates(){

	if(!((PODomain)this.domain).providesStateEnumerator()){
		throw new RuntimeException("QMDP cannot apply method forceMDPPlanningFromAllStates because the domain does not provide a StateEnumerator.");
	}

	Planner planner = (Planner)this.mdpQSource;
	StateEnumerator senum = ((PODomain)this.domain).getStateEnumerator();
	if(senum == null){
		throw new RuntimeException("QMDP cannot plan from all states because the StateEnumerator for the POMDP domain was never specified.");
	}
	for(int i = 0; i < senum.numStatesEnumerated(); i++){
		State s = senum.getStateForEnumerationId(i);
		planner.planFromState(s);
	}
}
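For context, Example #8 below shows that QMDP wraps an underlying fully observable MDP Q-source that must also be a Planner. The following is a minimal sketch of invoking the method above, assuming a PODomain (named pomdpDomain here, illustratively) that provides a StateEnumerator and whose hidden-state MDP can be planned over directly with ValueIteration; the helper method itself is hypothetical and not part of BURLAP.

/**
 * Hypothetical helper: builds a QMDP backed by ValueIteration and plans from every
 * enumerable state of the given POMDP domain.
 */
public static QMDP planQMDPFromAllStates(PODomain pomdpDomain){
	// The MDP Q-source must be both a QProvider and a Planner; ValueIteration is both.
	ValueIteration vi = new ValueIteration(pomdpDomain, 0.99, new SimpleHashableStateFactory(), 0.001, 100);
	QMDP qmdp = new QMDP(pomdpDomain, vi);

	// Throws a RuntimeException if the domain does not provide a StateEnumerator.
	qmdp.forceMDPPlanningFromAllStates();
	return qmdp;
}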
 
Example #2
Source File: Main.java    From cs7641-assignment4 with MIT License
/**
 * Here is where the magic happens. In this method I loop through the specified number
 * of episodes (iterations) and run the specified algorithm. To keep things nice and clean, I use
 * this same method to run all three algorithms; the algorithm-specific details are supplied through the
 * PlannerFactory interface.
 * 
 * This method collects all the information from the algorithm and packs it in an Analysis
 * instance that later gets dumped on the console.
 */
private static void runAlgorithm(Analysis analysis, Problem problem, SADomain domain, HashableStateFactory hashingFactory, State initialState, PlannerFactory plannerFactory, Algorithm algorithm) {
	ConstantStateGenerator constantStateGenerator = new ConstantStateGenerator(initialState);
	SimulatedEnvironment simulatedEnvironment = new SimulatedEnvironment(domain, constantStateGenerator);
	Planner planner = null;
	Policy policy = null;
	for (int episodeIndex = 1; episodeIndex <= problem.getNumberOfIterations(algorithm); episodeIndex++) {
		long startTime = System.nanoTime();
		planner = plannerFactory.createPlanner(episodeIndex, domain, hashingFactory, simulatedEnvironment);
		policy = planner.planFromState(initialState);

		/*
		 * If we haven't converged, following the policy will leave the agent wandering around,
		 * and it might never reach the goal. To avoid this, we need to set the maximum number
		 * of steps to take before terminating the policy rollout. I decided to set this maximum
		 * at the number of grid locations in our map (width * width). This should give the
		 * agent plenty of room to wander around.
		 * 
		 * The smaller this number is, the faster the algorithm will run.
		 */
		int maxNumberOfSteps = problem.getWidth() * problem.getWidth();

		Episode episode = PolicyUtils.rollout(policy, initialState, domain.getModel(), maxNumberOfSteps);
		analysis.add(episodeIndex, episode.rewardSequence, episode.numTimeSteps(), (long) (System.nanoTime() - startTime) / 1000000);
	}

	if (algorithm == Algorithm.QLearning && USE_LEARNING_EXPERIMENTER) {
		learningExperimenter(problem, (LearningAgent) planner, simulatedEnvironment);
	}

	if (SHOW_VISUALIZATION && planner != null && policy != null) {
		visualize(problem, (ValueFunction) planner, policy, initialState, domain, hashingFactory, algorithm.getTitle());
	}
}
 
Example #3
Source File: MultipleIntentionsMLIRL.java    From burlap with Apache License 2.0
/**
 * Initializes cluster data; i.e., it initializes RF parameters, cluster prior parameters (to uniform), and creates {@link burlap.behavior.singleagent.learnfromdemo.mlirl.MLIRLRequest}
 * objects for each cluster.
 * @param k the number of clusters
 * @param plannerFactory the {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.QGradientPlannerFactory} to use to generate a planner for each cluster.
 */
protected void initializeClusters(int k, QGradientPlannerFactory plannerFactory){

	List<DifferentiableRF> rfs = new ArrayList<DifferentiableRF>(k);
	for(int i = 0; i < k; i++){
		rfs.add((DifferentiableRF)this.request.getRf().copy());
	}

	this.initializeClusterRFParameters(rfs);

	this.clusterRequests = new ArrayList<MLIRLRequest>(k);
	this.clusterPriors = new double[k];
	double uni = 1./(double)k;
	for(int i = 0; i < k; i++){
		this.clusterPriors[i] = uni;
		MLIRLRequest nRequest = new MLIRLRequest(this.request.getDomain(),null,
				this.request.getExpertEpisodes(),rfs.get(i));

		nRequest.setGamma(this.request.getGamma());
		nRequest.setBoltzmannBeta(this.request.getBoltzmannBeta());
		nRequest.setPlanner((Planner)plannerFactory.generateDifferentiablePlannerForRequest(nRequest));

		this.clusterRequests.add(nRequest);

	}

}
 
Example #4
Source File: MultipleIntentionsMLIRLRequest.java    From burlap with Apache License 2.0
/**
 * Sets the {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.QGradientPlannerFactory} to use and also
 * sets this request object's planner instance to a planner generated from it, if one has not already been set.
 * Setting a planner instance ensures that the {@link #isValid()} method does not return false.
 * @param plannerFactory the {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.QGradientPlannerFactory} to use
 */
public void setPlannerFactory(QGradientPlannerFactory plannerFactory) {
	this.plannerFactory = plannerFactory;
	if(this.planner == null){
		this.setPlanner((Planner) plannerFactory.generateDifferentiablePlannerForRequest(this));
	}
}
 
Example #5
Source File: MultipleIntentionsMLIRLRequest.java    From burlap with Apache License 2.0
/**
 * Initializes.
 * @param domain the domain of the problem
 * @param plannerFactory A {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.QGradientPlannerFactory} that produces {@link DifferentiableQFunction} objects.
 * @param expertEpisodes the expert trajectories
 * @param rf the {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.DifferentiableRF} model to use.
 * @param k the number of clusters
 */
public MultipleIntentionsMLIRLRequest(SADomain domain, QGradientPlannerFactory plannerFactory, List<Episode> expertEpisodes, DifferentiableRF rf, int k) {
	super(domain, null, expertEpisodes, rf);
	this.plannerFactory = plannerFactory;
	this.k = k;
	if(this.plannerFactory != null) {
		this.setPlanner((Planner) plannerFactory.generateDifferentiablePlannerForRequest(this));
	}
}
 
Example #6
Source File: MLIRLRequest.java    From burlap with Apache License 2.0
@Override
public void setPlanner(Planner p) {
	if(p != null && !(p instanceof DifferentiableQFunction)){
		throw new RuntimeException("Error: MLIRLRequest requires the valueFunction to be an instance of QGradientPlanner");
	}
	this.planner = p;
}
 
Example #7
Source File: QMDP.java    From burlap with Apache License 2.0
@Override
public void resetSolver() {
	((Planner)this.mdpQSource).resetSolver();
}
 
Example #8
Source File: QMDP.java    From burlap with Apache License 2.0
/**
 * Initializes.
 * @param domain the POMDP domain
 * @param mdpQSource the underlying fully observable MDP {@link QProvider} source.
 */
public QMDP(PODomain domain, QProvider mdpQSource){
	this.mdpQSource = mdpQSource;
	Planner planner = (Planner)this.mdpQSource;
	this.solverInit(domain, planner.getGamma(), planner.getHashingFactory());
}
 
Example #9
Source File: MultiStatePrePlanner.java    From burlap with Apache License 2.0
/**
 * Runs a planning algorithm from multiple initial states to ensure that an adequate plan/policy exists for each of the states.
 * @param planner the planner to be used.
 * @param initialStates a collection of states from which to plan.
 */
public static void runPlannerForAllInitStates(Planner planner, Collection <State> initialStates){
	for(State s : initialStates){
		planner.planFromState(s);
	}
}
 
Example #10
Source File: MultiStatePrePlanner.java    From burlap with Apache License 2.0
/**
 * Runs a planning algorithm from multiple initial states to ensure that an adequate plan/policy exists for each of the states.
 * @param planner the planner to be used.
 * @param initialStates a {@link burlap.mdp.auxiliary.stateconditiontest.StateConditionTestIterable} object that will iterate over the initial states from which to plan.
 */
public static void runPlannerForAllInitStates(Planner planner, StateConditionTestIterable initialStates){
	for(State s : initialStates){
		planner.planFromState(s);
	}
}
 
Example #11
Source File: ApprenticeshipLearningRequest.java    From burlap with Apache License 2.0
public ApprenticeshipLearningRequest(SADomain domain, Planner planner, DenseStateFeatures featureGenerator, List<Episode> expertEpisodes, StateGenerator startStateGenerator) {
	super(domain, planner, expertEpisodes);
	this.initDefaults();
	this.setFeatureGenerator(featureGenerator);
	this.setStartStateGenerator(startStateGenerator);
}
 
Example #12
Source File: IRLRequest.java    From burlap with Apache License 2.0
public void setPlanner(Planner p) {
	this.planner = p;
}
 
Example #13
Source File: MLIRLRequest.java    From burlap with Apache License 2.0
/**
 * Initializes the request without any expert trajectory weights (which will be assumed to have a value of 1).
 * If the provided planner is not null and does not implement the {@link DifferentiableQFunction}
 * interface, an exception will be thrown.
 * @param domain the domain in which trajectories are provided.
 * @param planner a planner that implements the {@link DifferentiableQFunction} interface.
 * @param expertEpisodes the expert episodes/trajectories to use for training.
 * @param rf the {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.DifferentiableRF} model to use.
 */
public MLIRLRequest(SADomain domain, Planner planner, List<Episode> expertEpisodes, DifferentiableRF rf){
	super(domain, planner, expertEpisodes);
	if(planner != null && !(planner instanceof DifferentiableQFunction)){
		throw new RuntimeException("Error: MLIRLRequest requires the valueFunction to be an instance of QGradientPlanner");
	}
	this.rf = rf;
}
 
Example #14
Source File: MultipleIntentionsMLIRLRequest.java    From burlap with Apache License 2.0
/**
 * Initializes using a default {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.QGradientPlannerFactory.DifferentiableVIFactory} that
 * is based on the provided {@link burlap.statehashing.HashableStateFactory} object.
 * @param domain the domain of the problem
 * @param expertEpisodes the expert trajectories
 * @param rf the {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.DifferentiableRF} model to use.
 * @param k the number of clusters
 * @param hashableStateFactory the {@link burlap.statehashing.HashableStateFactory} to use for the {@link burlap.behavior.singleagent.learnfromdemo.mlirl.support.QGradientPlannerFactory.DifferentiableVIFactory} that will be created.
 */
public MultipleIntentionsMLIRLRequest(SADomain domain, List<Episode> expertEpisodes, DifferentiableRF rf, int k, HashableStateFactory hashableStateFactory) {
	super(domain, null, expertEpisodes, rf);
	this.plannerFactory = new QGradientPlannerFactory.DifferentiableVIFactory(hashableStateFactory);
	this.k = k;
	this.setPlanner((Planner) plannerFactory.generateDifferentiablePlannerForRequest(this));
}
 
Example #15
Source File: MinecraftSolver.java    From burlapcraft with GNU Lesser General Public License v3.0
public static void stocasticPlan(double gamma){

	MinecraftDomainGenerator simdg = new MinecraftDomainGenerator();

	SADomain domain = simdg.generateDomain();

	State initialState = MinecraftStateGeneratorHelper.getCurrentState(BurlapCraft.currentDungeon);

	Planner planner = new ValueIteration(domain, gamma, new SimpleHashableStateFactory(false), 0.001, 1000);

	Policy p = planner.planFromState(initialState);

	MinecraftEnvironment me = new MinecraftEnvironment();
	PolicyUtils.rollout(p, me);
}
 
Example #16
Source File: BasicBehavior.java    From burlap_examples with MIT License
public void valueIterationExample(String outputPath){

	Planner planner = new ValueIteration(domain, 0.99, hashingFactory, 0.001, 100);
	Policy p = planner.planFromState(initialState);

	PolicyUtils.rollout(p, initialState, domain.getModel()).write(outputPath + "vi");

	simpleValueFunctionVis((ValueFunction)planner, p);
	//manualValueFunctionVis((ValueFunction)planner, p);

}
 
Example #17
Source File: IRLRequest.java    From burlap with Apache License 2.0
/**
 * Initializes. The discount factor defaults to 0.99 and can optionally be changed with a setter.
 * @param domain the domain in which IRL is to be performed
 * @param planner the planning algorithm the IRL algorithm will invoke.
 * @param expertEpisodes the example expert trajectories/episodes.
 */
public IRLRequest(SADomain domain, Planner planner, List<Episode> expertEpisodes){
	this.setDomain(domain);
	this.setPlanner(planner);
	this.setExpertEpisodes(expertEpisodes);
}
 
Example #18
Source File: IRLRequest.java    From burlap with Apache License 2.0
public Planner getPlanner() {return this.planner;} 
Example #19
Source File: PlannerFactory.java    From cs7641-assignment4 with MIT License
Planner createPlanner(int episodeIndex, SADomain domain, HashableStateFactory hashingFactory, SimulatedEnvironment simulatedEnvironment);
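This is the factory interface that Example #2 calls each episode via createPlanner. The following is a minimal sketch of one possible implementation that returns a value-iteration planner; the class name, the discount factor, and the choice to cap iterations at the episode index are illustrative and not taken from the original assignment code.

public class ValueIterationPlannerFactory implements PlannerFactory {

	@Override
	public Planner createPlanner(int episodeIndex, SADomain domain,
			HashableStateFactory hashingFactory, SimulatedEnvironment simulatedEnvironment) {
		// Return a fresh planner each episode, using the episode index as the iteration cap
		// so convergence can be observed across episodes (illustrative choice). The simulated
		// environment is unused here but is available to learning algorithms such as Q-learning.
		return new ValueIteration(domain, 0.99, hashingFactory, 0.001, episodeIndex);
	}
}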