# Learn to Drive from Pixels Only
**Autonomous Driving with an RC Car using Reinforcement Learning and Evolution Strategies**

Master Thesis - Autonomous RC Car Driving with RL

In this work, an RC car learns to drive autonomously through reinforcement learning (RL) and evolution strategies.
## Getting started
This work is structured as follows:
- *In [yolo](./yolo/)* an object detector is implemented to detect cones.
- *In [rl](./rl/)* the reinforcement learning algorithms are implemented.
- *In [evo](./evo/)* the evolution strategies with a master-worker framework are implemented.
- *In [simulations](./simulations/)* the Unity simulations for training the car are located.
## MA Autonomous Driving with RC Car
## Project Steps
- [ ] Cone detection
- [ ] Cone keypoint regression
- [ ] Basic driving algorithm
- [x] Basic simulator (Unity)
- [ ] Driving with RL in the simulator
- [ ] Driving with the basic algorithm in the simulator
- [ ] Extend the simulator (point cloud)
## Week 4 (15.3 - 23.3)
- Bounding box extraction from Unity
- YOLO implementation
- YOLO modification
- Papers
- Dataset
## Week 5 (23.3 - 29.3)
- YOLO training on the dataset
- Reinforcement learning with the ML-Agents API
- 3D scan
- Point cloud to mesh
## Week 6 (30.3 - 5.4)
- YOLO bounding boxes as feature input to the RL algorithm
- Implement a trainer with the ML-Agents Python API
- Improve the mesh
# A Master-Worker Architecture for Training Models with Evolution Strategies (ES)
A master-worker architecture for parallel, distributed training of neural networks (NN)
with Evolution Strategies (ES). The implemented ES algorithm is <cite>[Covariance Matrix Adaptation Evolution Strategy][1]</cite> (CMA-ES).
The implementation is based on <cite>[[2]]</cite> and <cite>[[3]]</cite>.
In this work, the model is a hard-attention vision model that solves a control task from image input. The models can be found in [solution.py](./algorithm/solution.py) and the task is defined in [task.py](./algorithm/task.py).
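The training follows the standard CMA-ES ask/evaluate/tell loop, with the fitness evaluations distributed to the workers. The following is a minimal sketch of that pattern using the `pycma` package; the `pool` object and its `evaluate` method are hypothetical placeholders for the worker communication and not the actual API of this repository.
```
# Minimal sketch of the master's CMA-ES loop (illustrative only).
# `pool` stands for a hypothetical worker pool that evaluates parameter
# vectors on the remote workers and returns the episode returns.
import cma

def train(initial_params, pool, population_size=64, init_sigma=0.1,
          max_iter=2000, n_repeat=16):
    es = cma.CMAEvolutionStrategy(initial_params, init_sigma,
                                  {"popsize": population_size})
    for iteration in range(max_iter):
        candidates = es.ask()                       # sample a population of parameter vectors
        # each candidate is rolled out n_repeat times; the mean return is its fitness
        returns = pool.evaluate(candidates, n_repeat=n_repeat)
        es.tell(candidates, [-r for r in returns])  # CMA-ES minimizes, so negate the returns
        if iteration % 10 == 0:
            print(f"iteration {iteration}: best return {max(returns):.2f}")
    return es.result.xbest                          # best parameter vector found
```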
## Start a Training
An example configuration can be found in [rc_car_config.yaml](./rc_car_config.yaml).
First, the workers have to be started on all machines:
```
bash start_workers.sh --config rc_car_config.yaml --num-workers 10
```
The number of workers `--num-workers` should be the same for all machines.
The master is then started with:
```
python run_master.py --config rc_car_config.yaml --worker-ip ip_1 ip_2 ... ip_n --num-worker 10 --exp-name 'experiment_name' --track
```
The option `--track` enables [Weights & Biases](https://wandb.ai/) tracking.
For more options, run `python run_master.py --help`.
## Evaluate
To evaluate a trained model, run:
```
python test.py --log-dir ./runs/ppo/exp_name --n-episodes 10
```
With `--env UnityEditor`, the Unity editor is used as the environment.
For more options, run `python test.py --help`.
The default model is `best_model.npz`; with `--model-filename` another model can be chosen.
With the option `--overplot`, the input image with the overlaid attention patches is shown.
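The attention patches shown by `--overplot` come from the hard patch-attention mechanism of the vision model: the input image is cut into overlapping patches, each patch is scored with self-attention, and only the `top_k` patch positions are passed to the controller (see the `patch_size`, `patch_stride` and `top_k` entries in the configuration). The sketch below is a simplified illustration of this idea in the spirit of [3], not the code in [solution.py](./algorithm/solution.py).
```
# Simplified illustration of hard patch attention (not the repository's solution.py).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def extract_patches(img, patch_size=16, patch_stride=8):
    """Cut an (H, W, C) image into overlapping, flattened patches."""
    h, w, _ = img.shape
    patches, centers = [], []
    for y in range(0, h - patch_size + 1, patch_stride):
        for x in range(0, w - patch_size + 1, patch_stride):
            patches.append(img[y:y + patch_size, x:x + patch_size].reshape(-1))
            centers.append((y + patch_size / 2, x + patch_size / 2))
    return np.stack(patches), np.array(centers)

def top_k_patch_positions(img, w_query, w_key, top_k=10):
    """Score the patches with single-head self-attention and keep the top_k positions.
    w_query and w_key are the (evolved) projection matrices."""
    patches, centers = extract_patches(img)
    q, k = patches @ w_query, patches @ w_key
    scores = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=-1)
    importance = scores.sum(axis=0)                # how much attention each patch receives
    idx = importance.argsort()[::-1][:top_k]       # indices of the most attended patches
    return centers[idx] / np.array(img.shape[:2])  # positions normalized to [0, 1]
```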
[1]: https://arxiv.org/abs/1604.00772
[2]: https://github.com/lerrytang/es_on_gke
[3]: https://github.com/google/brain-tokyo-workshop/tree/master/AttentionAgent
# CMA-ES configuration
CMA:
  population_size: 64          # number of candidate solutions per generation
  init_sigma: 0.1              # initial CMA-ES step size
Worker:
  simulation: simulation/rccar_outdoor_track
Master:
  seed: 0
  n_repeat: 16
  max_iter: 2000               # maximum number of training iterations
  eval_every_n_iter: 10        # evaluate the current solution every 10 iterations
  n_eval_roll_outs: 100        # number of roll-outs per evaluation
  timeout: 3600
Solution:
  image_size: 256
  query_dim: 4
  output_dim: 2
  output_activation: "tanh"
  activation: "tanh"
  num_hiddens: [64]
  l2_coefficient: 0
  patch_size: 16               # size of the image patches scored by self-attention
  patch_stride: 8
  top_k: 10                    # number of attended patches passed to the controller
  data_dim: 3
  normalize_positions: True
  use_lstm_controller: False
import sys
sys.path.append(".")
import argparse
import os
import numpy as np
import shutil
import yaml
import utils
from task import CarRacingTask
from solution import VisionTaskSolution
UNITY_PORT = 5012
import numpy as np
import algorithm.utils as utils
import yaml
from algorithm.solution import VisionTaskSolution
from algorithm.task import CarRacingTask
def run_solution(config):
@@ -20,8 +16,8 @@ def run_solution(config):
logger = utils.create_logger(name="test_solution", debug=False)
task = CarRacingTask(logger=logger)
task.init_task(sim_path=settings["Worker"]["simulation"], unity_port=UNITY_PORT)
    env_path = settings["Worker"]["simulation"] if config.env is None else config.env
task.init_task(sim_path=env_path, unity_port=config.port)
task.seed(config.seed)
@@ -100,6 +96,14 @@ if __name__ == "__main__":
parser.add_argument(
"--seed", help="Random seed for evaluation.", type=int, default=1
)
parser.add_argument(
"--port", help="Uniyt communication port", type=int, default=7050
)
parser.add_argument(
"--env",
default=None,
help="Path to Unity simulation. With 'UnityEditor' the editor scene is used",
)
args, _ = parser.parse_known_args()
run_solution(args)
# Publications - Autonomous driving and Reinforcement Learning
## Autonomous Driving
### Perception pipeline with classic path planning
[AMZ Driverless: The Full Autonomous Racing System](https://arxiv.org/pdf/1905.05150.pdf)
Visual cone detection: Mono and stereo camera. LIDAR cone detection.
Cone detection with YOLO and Keypoint regression for 3d pose estimate.
### Reinforcement Learning
[Reinforcement Learning Approach for Formula Student Technion Driverless](https://gip.cs.technion.ac.il/projects/uploads/180_preport_7.pdf)
CNN with a VAE as state encoder. Best model: SAC with a continuous action space. Stacked 4 consecutive frames. Learned a model for steering only.
[Deep Reinforcement Learning for Autonomous Driving](https://arxiv.org/abs/1811.11329)
DDPG algorithm. TORCS simulator. Simple features from the simulator. Reward: R = V\_x \* cos(angle) - V\_y \* sin(angle) - gamma \* |trackPos| - beta \* V\_x \* |trackPos|
[High-speed Autonomous Drifting with Deep Reinforcement Learning](https://arxiv.org/pdf/2001.01377.pdf)
State-Space model. SAC drift controller. CARLA Simulator. Features: state variables and errors between reference and current state.
[End-to-End Race Driving with Deep Reinforcement Learning](https://arxiv.org/pdf/1807.02371.pdf)
Features: raw RGB image. Simulator: World Rally Championship game. A3C algorithm, CNN+LSTM encoder. Reward: R = v \* (cos(alpha) - d), where alpha is the angle to the track axis and d the distance to the track center.
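As a small worked example of this kind of progress-based reward shaping (my own illustration, not code from either paper):
```
import numpy as np

def progress_reward(v, alpha, d):
    """R = v * (cos(alpha) - d): forward speed along the track axis is rewarded,
    misalignment alpha and normalized distance d from the track center are penalized."""
    return v * (np.cos(alpha) - d)

# driving at 5 m/s, 10 degrees off the track axis, 0.2 track half-widths from the center
print(progress_reward(5.0, np.deg2rad(10.0), 0.2))  # ~3.92
```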
### Imitation Learning
[Generative Adversarial Imitation Learning](https://arxiv.org/pdf/1606.03476.pdf)
### Supervised policy learning
## Reinforcement Learning
[Proximal Policy Optimization Algorithms](https://arxiv.org/pdf/1707.06347.pdf)
[Soft Actor-Critic](https://arxiv.org/pdf/1801.01290.pdf)
[Decision Transformer](https://arxiv.org/pdf/2106.01345.pdf)<br />
Reinforcement Learning via Sequence Modeling
## Misc.
[YOLOv3](https://arxiv.org/pdf/1804.02767.pdf)
[YOLO-Z: Improving small object detection in YOLOv5 for autonomous vehicles](https://arxiv.org/pdf/2112.11798.pdf)
[Real-time 3D Pose Estimation with a Monocular Camera Using Deep Learning and Object Priors](https://arxiv.org/pdf/1809.10548.pdf)
[Accurate, Low-Latency Visual Perception for Autonomous Racing: Challenges, Mechanisms, and Practical Solutions](https://arxiv.org/pdf/2007.13971.pdf)
from flask import Flask, render_template, Response, request, json
import numpy as np
import cv2
import sys
import time
import torch
import pathlib
import yaml
import serial
from queue import Queue, Empty
from threading import Thread
import multiprocessing
from multiprocessing import Value, Process, Lock
sys.path.append("..")
from rl.utils import detect_cones
from yolo.model import TinyYoloModified
yolo = TinyYoloModified(2, 416)
if torch.cuda.is_available():
yolo = yolo.to("cuda")
yolo.load_weights("../yolo/checkpoints/TinyYoloModified_yellow.pth")
app = Flask(__name__)
camera = cv2.VideoCapture("/dev/video0") # use 0 for web camera
commands = multiprocessing.Queue()
global speed_cmd
speed_cmd = Value("d", 0)
global steer_cmd
steer_cmd = Value("d", 0)
cmd_lock = Lock()
image_queue = Queue()
global img_counter
img_counter = 0
np.set_printoptions(suppress=True, precision=4)
# real footage from Marshall camera with 220 deg fisheye
LENS_FOV_DEGREE = 220
LENS_IMAGE_CIRCLE_DIAMETER = 5.1  # mm
SENSOR_WIDTH = 5.76  # mm
SENSOR_HEIGHT = SENSOR_WIDTH / 3840 * 2160  # mm, derived from the 3840x2160 pixel sensor
SENSOR_PIXEL_WIDTH = 3840
SENSOR_PIXEL_HEIGHT = 2160
# remote control config
config_file = pathlib.Path(__file__).parent / "config.yml"
with open(config_file, "r") as f:
config = yaml.safe_load(f)
comport = config["serial_port"]
baudrate = config["serial_baudrate"]
ser = serial.Serial(comport, baudrate=baudrate, timeout=0.2)
speed_factor = float(config["speed_factor"])
steer_factor = float(config["steer_factor"])
def map_azimuth_elevation(img, outshape, pix_per_deg):
    # Remap the fisheye frame onto an azimuth/elevation grid: every output pixel
    # corresponds to a fixed viewing angle, which is projected back through the
    # equidistant (f-theta) lens model onto sensor pixel coordinates.
    LENS_MM_PER_DEG = 1.92 / 82.50  # lens projection: image height in mm per degree of field angle
    h, w = img.shape[:2]
pix_x, pix_y = np.meshgrid(
np.arange(0, outshape[1], dtype=np.float32),
np.arange(0, outshape[0], dtype=np.float32),
)
rad_x = (pix_x - outshape[1] // 2) / pix_per_deg * np.pi / 180
rad_y = (pix_y - outshape[0] // 2) / pix_per_deg * np.pi / 180
y = np.sin(rad_y)
x = np.sin(rad_x) * np.cos(y)
z = np.cos(rad_x) * np.cos(y)
phi = np.arctan2(y, x)
theta = np.arccos(z)
dist = LENS_MM_PER_DEG * (180 / np.pi) * theta
pos_x_mm = np.cos(phi) * dist
pos_y_mm = np.sin(phi) * dist
pix_x = pos_x_mm / SENSOR_WIDTH * w + w / 2
pix_y = pos_y_mm / SENSOR_HEIGHT * h + h / 2
out_img = cv2.remap(img, pix_x, pix_y, cv2.INTER_LINEAR)
return out_img
def send_control_cmd(speed, steer, gear):
"""
speed :float -1...1
steer :float -1...1
gear: bool
"""
# print("speed", speed, "steer:", steer)
    # quadratic throttle mapping with a minimum command offset outside the dead band
    if speed > 0.1:
        speed = speed**2 * 0.9 * abs(speed_factor) + 10
    elif speed < -0.1:
        speed = -(speed**2) * 0.9 * abs(speed_factor) - 10
    speed = int(speed if speed_factor >= 0 else -speed)
    steer = int(steer * steer_factor)
    # negative values are sent as unsigned bytes (two's complement in one byte)
    speed = speed + 256 if speed < 0 else speed
    steer = steer + 256 if steer < 0 else steer
    bits = 1 * gear
    # command frame: two hex digits each for speed, steer and the gear flag, newline-terminated
    s = f"{speed:02x}{steer:02x}{bits:02x}\n"
    ser.write(s.encode("utf8"))
def control_loop(spe_cmd, ste_cmd, lock):
gear = False
while True:
start = time.time()
# ser.flush()
with lock:
speed = spe_cmd.value
steer = ste_cmd.value
# send_control_cmd(spe_cmd.value, ste_cmd.value, gear)
# print(speed, steer)
# send_control_cmd(speed, steer, gear)
# print(1./25-(time.time()-start))
time.sleep(max(1.0 / 2 - (time.time() - start), 0))
def capture_frames(): # generate frame by frame from camera
pix_per_deg = 10
outshape = (65 * pix_per_deg, 87 * pix_per_deg)
while True:
# Capture frame-by-frame
success, frame = camera.read() # read the camera frame
if not success:
break
else:
frame = map_azimuth_elevation(frame, outshape, pix_per_deg)
frame1 = cv2.resize(frame, (768, 512), interpolation=cv2.INTER_AREA)
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
detection = detect_cones(
yolo, frame, img_size=416, conf_thres=0.1, nms_thres=0.5
)
for box in detection:
width = int(box[2] * 0.5)
height = int(box[3] * 0.5)
x = int(box[0])
y = int(box[1])
color = (0, 0, 255) if box[4] == 0 else (230, 220, 10)
frame = cv2.rectangle(
frame,
(x, y),
(x + width, y + height),
color,
3,
)
frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
image_queue.put_nowait(frame1)
time.sleep(0.05)
def gen_images():
frame = None
while True:
try:
frame = image_queue.get_nowait()
ret, buffer = cv2.imencode(".jpg", frame)
frame = buffer.tobytes()
yield (
b"--frame\r\n" b"Content-Type: image/jpeg\r\n\r\n" + frame + b"\r\n"
) # concat frame one by one and show result
except Empty:
pass
@app.route("/video_feed")
def video_feed():
# Video streaming route. Put this in the src attribute of an img tag
return Response(gen_images(), mimetype="multipart/x-mixed-replace; boundary=frame")
@app.route("/drive", methods=["GET"])
def drive():
with cmd_lock:
speed_cmd.value = float(request.args.get("speed"))
steer_cmd.value = float(request.args.get("steer"))
# commands.put({"speed": speed, "steer": steer})
return app.response_class(json.dumps(True), content_type="application/json")
@app.route("/capture", methods=["GET"])
def capture():
global img_counter
print("capture")
if not image_queue.empty():
img_counter += 1
frame = image_queue.get_nowait()
cv2.imwrite(f"frames/frame{img_counter}.png", frame)
return app.response_class(json.dumps(True), content_type="application/json")
@app.route("/")
def index():
"""Video streaming home page."""
return render_template("index.html")
if __name__ == "__main__":
# Thread(target=control_loop, daemon=True).start()
# Process(target=capture_frames).start()
# speed_cmd.value = 0.3
# steer_cmd.value = 0.5
    p = multiprocessing.Process(
        target=control_loop, args=(speed_cmd, steer_cmd, cmd_lock)
    )
    p.start()
Thread(target=capture_frames, daemon=True).start()
app.run(host="0.0.0.0", debug=True)
import sys
sys.path.append("../evo2")
sys.path.append("../rl")
import os
import yaml
import cv2
import numpy as np
import base64
from rl.ppo_continous import Agent
from rl.cardriver_env import DummyEnv
from evo2.solution import VisionTaskSolution
import torch
import abc
import time
frame_rate = 20
def load_attention_agent(path, model, overplot=False):
with open(os.path.join(path, "config.yaml"), "r") as f:
settings = yaml.safe_load(f)
solution = VisionTaskSolution(
image_size=settings["Solution"]["image_size"],
query_dim=settings["Solution"]["query_dim"],
output_dim=settings["Solution"]["output_dim"],
output_activation=settings["Solution"]["output_activation"],
num_hiddens=settings["Solution"]["num_hiddens"],
l2_coefficient=settings["Solution"]["l2_coefficient"],
patch_size=settings["Solution"]["patch_size"],
patch_stride=settings["Solution"]["patch_stride"],
top_k=settings["Solution"]["top_k"],
data_dim=settings["Solution"]["data_dim"],
activation=settings["Solution"]["activation"],
normalize_positions=settings["Solution"]["normalize_positions"],
use_lstm_controller=settings["Solution"]["use_lstm_controller"],
use_conv_features=settings["Solution"]["use_conv_features"],
show_overplot=overplot,
)
model_file = os.path.join(path, model + ".npz")
solution.load(model_file)
solution.set_log_dir("runs/exp_outdoor")
return solution
class BasePolicyController:
def __init__(self, cam, speed, steer, stop):
self.speed = speed
self.steer = steer
self.stop = stop
self.agent = None
self.cam = cam
def drive(self):
raise NotImplementedError()
class RLPolicyController(BasePolicyController):
def __init__(self, path, cam, speed, steer, stop, n_stacked_obs=8):
super(RLPolicyController, self).__init__(cam, speed, steer, stop)
dummy_env = DummyEnv()
self.agent = Agent(dummy_env)
self.agent.load_agent(path)
self.state = np.zeros((n_stacked_obs, 6, 3))
def drive(self):
self.stop.value = False
while True:
start = time.time()
# print("Speed:", speed.value, "Steer:", steer.value)
detections = self.cam.get_detections()
            obs = self.get_obs_from_detections(detections)
obs = torch.Tensor(obs)
action = self.agent.get_action(obs)
action = action.numpy()
self.speed.value = np.clip(action[0], -0.2, 1.0)
self.steer.value = np.clip(action[1], -1.0, 1.0)
print("speed action:", action[0])
if self.stop.value:
self.speed.value = 0
break
time.sleep(max(1.0 / frame_rate - (time.time() - start), 0))
    def get_obs_from_detections(self, detection):
        # shift the stacked observations by one slot; slot 0 receives the newest frame
        self.state[1:, :, :] = self.state[:-1, :, :]
        # sort detection by box size
bb_area = np.prod(detection[:, 2:4] - detection[:, :2], axis=1)
idx_sorted = bb_area.argsort()[::-1] # reversed order
detection = detection[idx_sorted]
        # separate blue/yellow cones
blue = detection[detection[:, 5] == 0]
yellow = detection[detection[:, 5] == 1]
# blue cones
if len(blue) > 0:
obs = 3 if len(blue) > 3 else len(blue)
bb_box_size = blue[:, 2:4] - blue[:, :2]
self.state[0, :obs, :2] = (
blue[:obs, :2] + bb_box_size[:obs, :2] * 0.5
) # Position normalized
self.state[0, :obs, 0] /= 768
self.state[0, :obs, 1] /= 512
self.state[0, :obs, 2] = blue[:obs, 4] # object confidence
# yellow cones
if len(yellow) > 0:
obs = 3 if len(yellow) > 3 else len(yellow)
bb_box_size = yellow[:, 2:4] - yellow[:, :2]
self.state[0, 3 : 3 + obs, :2] = (
yellow[:obs, :2] + bb_box_size[:obs, :2] * 0.5
) # Position
self.state[0, 3 : 3 + obs, 0] /= 768
self.state[0, 3 : 3 + obs, 1] /= 512
self.state[0, 3 : 3 + obs, 2] = yellow[:obs, 4] # object confidence
return self.state.reshape(-1)
class AttentionPolicyController(BasePolicyController):
def __init__(self, path, model, cam, speed, steer, stop):
super(AttentionPolicyController, self).__init__(cam, speed, steer, stop)
self.agent = load_attention_agent(path, model, overplot=True)
def drive(self):
self.stop.value = False
alpha = 0.9
while True:
start = time.time()
# print("Speed:", speed.value, "Steer:", steer.value)
obs = self.cam.get_frame()
action = self.agent.get_output(obs)
            # low-pass filter the speed command to smooth acceleration
            self.speed.value = self.speed.value * alpha + (1 - alpha) * np.clip(action[0], -0.6, 0.7)
            self.steer.value = np.clip(action[1], -1.0, 1.0)
print("speed action:", action[0])
if self.stop.value:
self.speed.value = 0
break
time.sleep(max(1.0 / frame_rate - (time.time() - start), 0))
# Reinforcement Learning Library for Training the RC Car
A small library for training the RC car with reinforcement learning algorithms.
The two algorithms Proximal Policy Optimization <cite>[PPO][1]</cite> and Soft Actor-Critic <cite>[SAC][2]</cite> are implemented.
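As a reminder of the core idea behind PPO, the clipped surrogate objective fits in a few lines. The snippet below is a generic sketch of that loss, not the loss implementation of this library.
```
# Generic PPO clipped surrogate loss (illustrative sketch, not this library's code).
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """L = -E[min(r * A, clip(r, 1 - eps, 1 + eps) * A)] with r = pi_new / pi_old."""
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```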
## Train
A new experiment can be started by running `train.py`:
```
python train.py --config train_ppo.yaml --log-dir runs/ppo --exp-name 'experiment_name' --track --port 8001 --cuda
```
The option `--track` enables [Weights & Biases](https://wandb.ai/) tracking and
`--port` defines the communication port for the UnityEnvironment.
For more options, run `python train.py --help`.
Example configurations can be found in [train_ppo.yaml](./train_ppo.yaml) and [train_sac.yaml](./train_sac.yaml).
The environments are saved in [../simulations](../simulations)
## Evaluate
To evaluate a trained model, run:
```
...
```
The default model is `best_model.pth`; with `--model-filename` another model can be chosen.
With `--env UnityEditor`, the Unity editor is used as the environment.
The file [policy_gradient.py](./policy_gradient.py) contains a simple from-scratch implementation of the policy gradient algorithm in NumPy, as an example to understand the theorem and the algorithm.
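For readers who want the idea without opening the file, here is a minimal REINFORCE-style policy gradient sketch in NumPy for a discrete-action environment with the classic Gym-style `reset`/`step` interface. It only illustrates the algorithm and is not the contents of [policy_gradient.py](./policy_gradient.py).
```
# Minimal REINFORCE sketch with a linear softmax policy (illustrative only).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reinforce(env, obs_dim, n_actions, lr=0.01, gamma=0.99, episodes=500):
    theta = np.zeros((obs_dim, n_actions))              # linear policy parameters
    for _ in range(episodes):
        states, actions, rewards = [], [], []
        obs, done = env.reset(), False
        while not done:
            probs = softmax(obs @ theta)                 # pi(a | s)
            action = np.random.choice(n_actions, p=probs)
            next_obs, reward, done, _ = env.step(action)
            states.append(obs); actions.append(action); rewards.append(reward)
            obs = next_obs
        # discounted return G_t for every time step
        returns, running = np.zeros(len(rewards)), 0.0
        for t in reversed(range(len(rewards))):
            running = rewards[t] + gamma * running
            returns[t] = running
        # gradient ascent on E[log pi(a_t | s_t) * G_t]
        for s, a, g in zip(states, actions, returns):
            probs = softmax(s @ theta)
            grad_log_pi = -np.outer(s, probs)            # d log pi / d theta, all actions
            grad_log_pi[:, a] += s                       # extra term for the taken action
            theta += lr * g * grad_log_pi
    return theta
```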
......
numpy
torch
torchvision
tensorboard
tqdm
imgaug
Pillow
import sys
sys.path.append(".")
import argparse
import os
import numpy as np
import shutil
import yaml
import utils
from cardriver_env import CarDriverEnvCont, CarDriverEnvCont2
from algorithm.ppo import PPOAgent, LSTM_PPOAgent
from algorithm.ppo import LSTM_PPOAgent, PPOAgent
from algorithm.sac import SACAgent
from cardriver_env import CarDriverEnvCont
#from ppo_test import Agent
@@ -62,7 +61,7 @@ def run_solution(config):
num_features=settings["env"]["num_features"],
worker_id=config.port,
)
env.unity_env.seed(config.seed)
env.env.seed(config.seed)
rewards = []
......
# A YOLO Implementation for Detecting Cones
A PyTorch YOLO implementation for detecting cones. The detected cones were used as features for the reinforcement learning algorithms.
## Train
A new training can be started with:
```
python train.py -m config/tiny_yolo.cfg -d config/cones.data -e 300 --evaluation_interval 10 --run_id 'experiment_name' --pretrained_weights weights/yolov3-tiny.weights
```
For more options, run `python train.py --help`.
An example configuration can be found in [config/tiny_yolo.cfg](config/tiny_yolo.cfg), and the dataset is defined in [config/cones.data](config/cones.data).
## Evaluate
To evaluate a trained model on a test set, run:
```
python test.py -w checkpoints/saved_model.pth -d ./config/test_s.data
```
For more options, run `python test.py --help`.
numpy
torch
torchvision
tensorboard
tqdm
imgaug
Pillow