Deep Reinforcement Learning of an Agent in a Modern 3D Video Game

This repository provides all material necessary to reproduce the results of our paper on Deep Reinforcement Learning applied to a modern 3D video game. The game used in our experiments, Delivery Duel, is a 3D local-multiplayer game developed with the Unity game engine. We evaluated the impact of different environment modifications on learning performance by applying two recent deep reinforcement learning algorithms, namely DQN (with various extensions) and A2C.

Data included in this repository:

Delivery Duel

In Delivery Duel the player / agent controls a delivery van. The agent's goal is to deliver pizzas to target locations in an urban environment. The entire scene is rendered from an approximately 75° top-down perspective. Delivery destinations are highlighted by a large circular marker above the corresponding building. The agent completes a delivery by performing a throw action in the correct direction and hitting the destination building. After a delivery, the agent has to return to its base, located in the middle of the city, before a new destination is highlighted. The agent's score increases for delivering items, proportionally to how quickly the item was delivered, and for returning to the base.
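As a rough illustration of this scoring scheme, the sketch below computes the score for a single delivery. The function name and all constants are hypothetical; the actual values are defined inside the game.

```python
# Hypothetical sketch of the scoring scheme described above; the function
# name and all constants are illustrative, not taken from the game code.

def delivery_score(time_taken: float,
                   returned_to_base: bool,
                   max_time: float = 60.0,
                   base_score: float = 100.0,
                   return_bonus: float = 20.0) -> float:
    """Score one delivery: faster deliveries earn more, and returning
    to the base afterwards yields a fixed bonus."""
    # speed_fraction is 1.0 for an instant delivery and 0.0 at max_time.
    speed_fraction = max(0.0, 1.0 - time_taken / max_time)
    score = base_score * speed_fraction
    if returned_to_base:
        score += return_bonus
    return score
```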

While Delivery Duel offers local multiplayer for up to four players, our experiments were conducted solely in single-player mode.

Optional Environment Modifications

There are three optional environment modifications, each of which can be turned on or off, resulting in eight possible configurations:

- Render mode: complex 3D rendering or a simplified 2D view
- Motion control: physics-based or linear movement
- Reward signal: non-continuous or continuous

These three modifications are often abbreviated as 3D / 2D, Phys / Lin and Non-Cont / Cont, respectively; the sketch below enumerates all eight resulting configurations.
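To make the configuration space concrete, the following sketch enumerates all eight combinations as Unity argument strings. The values "complex", "linear" and "cont" appear in the example commands further down; their counterparts ("simple", "physical", "non-cont") are assumptions, not verified flag values.

```python
# Enumerate the eight environment configurations (2 x 2 x 2). The values
# "simple", "physical" and "non-cont" are assumed counterparts and may not
# match the actual flag values accepted by the game.
from itertools import product

render_modes   = ["complex", "simple"]     # 3D / 2D
motion_modes   = ["physical", "linear"]    # Phys / Lin
reward_signals = ["non-cont", "cont"]      # Non-Cont / Cont

for render, motion, reward in product(render_modes, motion_modes, reward_signals):
    print(f"--render-mode {render} --motion-control {motion} --reward-signal {reward}")
```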

When using the run_baselines.py script for running experiments, you can use the optional command-line argument --unity_arguments "<additional arguments>" to pass command-line arguments on to the Unity process. The following Unity arguments are available:

- --render-mode
- --motion-control
- --reward-signal
- --human-players

The first three arguments switch the environment modifications described above on or off. The fourth argument, --human-players, starts an instance of the game with support for 0 to 3 additional human players (controlled using gamepads).
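Internally, forwarding these arguments might look like the minimal sketch below; this is hypothetical, and the actual plumbing lives in run/run_baselines.py.

```python
# Minimal sketch of forwarding a --unity_arguments string to the Unity
# process; hypothetical, the real implementation is in run/run_baselines.py.
import shlex
import subprocess

def launch_unity(executable_path: str, unity_arguments: str = "") -> subprocess.Popen:
    """Start the game executable with additional command-line arguments."""
    # shlex.split turns the quoted argument string into an argument list.
    command = [executable_path] + shlex.split(unity_arguments)
    return subprocess.Popen(command)

# e.g. launch_unity("<path_to_deliveryduel>", "--human-players 1")
```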

For instance, a DQN test run using the complex render mode, linear motion control and a continuous reward signal can be executed by running:

python run_baselines.py <path_to_deliveryduel> --method dqn --unity_arguments "--render-mode complex --motion-control linear --reward-signal cont"

To start an instance that loads an existing model file, so that you can play against it using a gamepad, run:

python run_baselines.py <path_to_deliveryduel> --method dqn --enjoy <save_folder/filename> --unity_arguments "--human-players 1"

(You can specify the folder and filename that models are saved to after training by passing the argument --model-file <folder/filename> to a training run.)

Human Player Controls

When playing with human players enabled (see unity-arguments under 'Optional Environment Modifications'), you can use the following controls:

Delivery Duel Credits

Delivery Duel was developed by Samuel Arzt, Katrin-Anna Zibuschka and Lukas Machegger, who agreed to make the game publicly available for scientific purposes. The icons for delivery items are used under the CC 3.0 license, with attribution to the following authors:

Frameworks

In order to train Deep Reinforcement Learning agents in this novel environment, three different frameworks were combined:

- OpenAI Baselines
- OpenAI Gym
- Unity ML-Agents

Baselines provides open-source Python implementations of popular RL algorithms. Gym defines an open-source Python interface for agent environments. ML-Agents extends the Unity engine with an interface for testing AI algorithms on existing Unity games by offering an API that connects Unity with Python. For license information, please see each framework's license file, included in its respective subdirectory.

Please refer to the individual repositories of the frameworks for an installation guide.

The source code of these frameworks, in the state in which it was used for our paper, is also included in this repository (see frameworks/). The code of Baselines and ML-Agents was slightly modified and extended to fit our needs. In order to train a reinforcement learning agent in a Unity game environment using the algorithms provided by Baselines, the ML-Agents environment was wrapped to function as a Gym environment.
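In condensed form, such a wrapper follows the pattern sketched below. The import path and BrainInfo field names follow an ML-Agents version of that era and may differ from the vendored code; treat this as an outline of the idea, not the actual wrapper code.

```python
# Outline of wrapping a single-agent ML-Agents environment as a Gym
# environment; API details are assumptions based on an older ML-Agents release.
import gym
import numpy as np
from unityagents import UnityEnvironment  # import path varies by version

class UnityGymWrapper(gym.Env):
    def __init__(self, env_path: str):
        self._env = UnityEnvironment(file_name=env_path)
        self._brain_name = self._env.brain_names[0]
        brain = self._env.brains[self._brain_name]
        # Assumed: one discrete action branch and a flat vector observation.
        self.action_space = gym.spaces.Discrete(brain.vector_action_space_size)
        self.observation_space = gym.spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(brain.vector_observation_space_size,), dtype=np.float32)

    def reset(self):
        info = self._env.reset(train_mode=True)[self._brain_name]
        return np.asarray(info.vector_observations[0], dtype=np.float32)

    def step(self, action):
        info = self._env.step(int(action))[self._brain_name]
        obs = np.asarray(info.vector_observations[0], dtype=np.float32)
        return obs, info.rewards[0], info.local_done[0], {}

    def close(self):
        self._env.close()
```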

The wrappers used for training an ML-Agents environment with Baselines can be found in run/baselines_wrapper.py. The code used for executing the test runs of our paper can be found in run/run_baselines.py.
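As a hypothetical usage example, a wrapper like the one sketched above could be handed directly to Baselines. The deepq.learn argument names below follow a recent Baselines release and may not match the modified version vendored in frameworks/.

```python
# Hypothetical: training DQN on the wrapped environment with Baselines.
from baselines import deepq

env = UnityGymWrapper("<path_to_deliveryduel>")  # sketch class from above
act = deepq.learn(env, network="mlp", total_timesteps=100000)
act.save("delivery_duel_dqn.pkl")
env.close()
```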