Deep Learning for Autonomous Vehicles: Milestone 1

In this post, I provide a kickstarter guide to getting started with the TrajNet++ framework for human trajectory forecasting, which should help you approach Milestone 1.

Updates (22.04.2021)

  1. Added an FAQ section.
  2. Updated Visualization section.

Updates (20.04.2021)

  1. Resources to get started with GitHub added.

Updates (19.04.2021)

  1. Procedure for setting up TrajNet++ on SCITAS added.

Updates (16.04.2021)

  1. Procedure to submit to AICrowd added.

Updates (15.04.2021)

  1. Fork the code on GitHub, and then clone the forked repository. This will help the TAs track your code easily.
  2. The latest push on trajnetplusplusbaselines now works with Python 3.8.
  3. The Next Steps for Milestone 1 have been added at the end.

Working setups (15.04.2021)

  1. MacOS 10.15.7 + Conda (Python3.6 & Python3.8) ✅
  2. Ubuntu 18.04 + Virtualenv (Python3.6 & Python3.8) ✅

Overview

At a high level, TrajNet++ consists of four primary components:

  1. Trajnetplusplustools: This repository provides helper functions for trajectory prediction. For instance: trajectory categorization, evaluation metrics, prediction visualization.

  2. Trajnetplusplusdataset: This repository provides scripts to generate train, val and test dataset splits from raw data as well as simulators.

  3. Trajnetplusplusbaselines: This repository provides baseline models (handcrafted as well as data-driven) for human motion prediction. This repository also provides scripts to extensively evaluate the trained models.

  4. Trajnetplusplusdata: This repository provides the already processed real world data as well as synthetic datasets conforming to human motion.

Milestone 1: Getting Started

I describe how to get started with TrajNet++ with the help of a running example. We will download an already-created synthetic dataset and train an LSTM-based model to perform trajectory forecasting.

Setting Up The Repository

The first step is to set up the Trajnetplusplusbaselines repository for model training. Next, we set up the virtual environment and install the requirements. On local machines, the virtual environment can also be set up using Conda.


## 1. On LOCAL MACHINE
## Make virtual environment using either A. virtualenv or B. conda

## A. Using virtualenv
## Works with Python3.6 and Python3.8
virtualenv -p /usr/bin/python3.6 trajnetv
source trajnetv/bin/activate

## B. Using conda
## Works with Python3.6 and Python3.8
conda create --name trajnetv python=3.8
conda activate trajnetv

## 2. On SCITAS
module load gcc
module load python/3.7.3
virtualenv --system-site-packages venvs/trajnetv
source venvs/trajnetv/bin/activate

Set up TrajNet++ on SCITAS only after verifying the setup on your local machine. On SCITAS, there is no need to fork the repository again; simply clone the already-created fork.

## Create directory to setup Trajnet++
mkdir trajnet++
cd trajnet++

## Clone Repositories
# git clone https://github.com/vita-epfl/trajnetplusplusbaselines.git (Old)
git clone <forked_repository>

## Download Requirements
cd trajnetplusplusbaselines/ 
pip install -e .

## SCITAS-Specific (!)
## If the previous command gives an error: "requires deeptoolsintervals>=0.1.7, requires plotly>=2.0.0", then:
pip install deeptoolsintervals
pip install plotly
pip install -e .

Follow the next steps for SCITAS as well.

## Additional Requirements (ORCA)
wget https://github.com/sybrenstuvel/Python-RVO2/archive/master.zip
unzip master.zip
rm master.zip

## Setting up ORCA (steps provided in the Python-RVO2 repo)
cd Python-RVO2-master/
pip install cmake
pip install cython
python setup.py build
python setup.py install
cd ../
## Additional Requirements (Social Force)
wget https://github.com/svenkreiss/socialforce/archive/refs/heads/main.zip
unzip main.zip
rm main.zip

## Setting up Social Force
cd socialforce-main/
pip install -e .
cd ../

Our repository is now set up!

Preparing the Dataset

Now, we will download and prepare the data for training our models. In this example, we download a synthetic dataset generated using the ORCA policy.

cd DATA_BLOCK/
wget https://github.com/vita-epfl/trajnetplusplusdata/releases/download/v3.1/five_parallel_synth.zip
unzip five_parallel_synth.zip
rm five_parallel_synth.zip

ls five_parallel_synth/

You will notice that the current folder contains train data, test data and test_private data. The test data contains the test examples up to the end of the observation period, while test_private, as the name suggests, contains the ground-truth trajectories used as reference to evaluate the performance of the forecasting model. You will also notice that a validation set is not present. Preparing a validation set is important for hyperparameter tuning. You can use the following helper script to split the current training samples into a training split (80%) and a validation split (20%).

cd ../
python create_validation.py --path five_parallel_synth --val_ratio 0.2

The above command will create a new folder five_parallel_synth_split in the DATA_BLOCK folder.
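You can quickly verify that the split was created (the exact contents may vary slightly):

ls DATA_BLOCK/five_parallel_synth_split/
## You should see the newly created train and val splits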

We can additionally transfer the goal information of the dataset to the goal_files folder.

## Preparing Goals folder (additional attributes)
mkdir -p goal_files/train
mkdir goal_files/val
mkdir goal_files/test_private

## For other datasets, the goal files can be different for corresponding dataset split
cp DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/train/
cp DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/val/
cp DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/test_private/

Now that the dataset is ready, it's time to train the model! :)

Training Models

Training models is much easier than setting up TrajNet++!

On SCITAS, training takes place using bash scripts; please refer to the tutorial to learn more.
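For intuition only, here is a minimal sketch of what such a SCITAS job script might look like. The SBATCH directives and the repository path below are assumptions; use the script provided in the tutorial for your actual runs.

#!/bin/bash
## The SBATCH directives below are assumptions; adjust them to your allocation
#SBATCH --time=04:00:00
#SBATCH --gres=gpu:1

## Same environment setup as earlier in this post
module load gcc
module load python/3.7.3
source venvs/trajnetv/bin/activate

## Hypothetical path to your clone of the repository
cd ~/trajnet++/trajnetplusplusbaselines
python -m trajnetbaselines.lstm.trainer --path five_parallel_synth_split --augment

You would then submit such a script with 'sbatch <script_name>.sh'.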

The training procedure below is only for your local machine. All you have to do is …

python -m trajnetbaselines.lstm.trainer --path five_parallel_synth_split --augment

… and your LSTM model starts training. Your model will be saved in the five_parallel_synth_split folder within OUTPUT_BLOCK. Currently, models are named according to the type of interaction module being used.

In order to train using interaction modules (e.g., directional grid) that utilize additional attributes (goal information), run:

python -m trajnetbaselines.lstm.trainer --path five_parallel_synth_split --type 'directional' --goals --augment
## To know more options about trainer 
python -m trajnetbaselines.lstm.trainer --help
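After training the vanilla model and the goal-conditioned directional model, you should find files named along the following lines in OUTPUT_BLOCK (the exact names depend on the chosen interaction type and flags):

ls OUTPUT_BLOCK/five_parallel_synth_split/
## e.g. lstm_vanilla_None.pkl and lstm_goals_directional_None.pkl, along with the corresponding log files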

Models trained on SCITAS can also be evaluated and visualized on your local machine. To do so, 'scp' the output files and log files from SCITAS to the repository on your local machine. Note: maintain the same file structure.
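For reference, a minimal sketch of such a transfer, assuming hypothetical username, host and paths (adjust them to your own setup), run from inside your local trajnetplusplusbaselines folder:

## Copy the trained models and logs from SCITAS, preserving the OUTPUT_BLOCK structure
scp -r <username>@<scitas_host>:~/trajnet++/trajnetplusplusbaselines/OUTPUT_BLOCK/five_parallel_synth_split OUTPUT_BLOCK/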

Evaluating Models

One strength of TrajNet++ is its extensive evaluation system. You can read more about it in the metrics section here.

To perform an extensive evaluation of your trained models, run the command below. The results are saved in Results.png.

# python -m evaluator.trajnet_evaluator --path <test_dataset> --output <path_to_model_pkl_file>
python -m evaluator.trajnet_evaluator --path five_parallel_synth --output OUTPUT_BLOCK/five_parallel_synth_split/lstm_vanilla_None.pkl OUTPUT_BLOCK/five_parallel_synth_split/lstm_goals_directional_None.pkl

## To know more options about evaluator 
python -m evaluator.trajnet_evaluator --help

To know more about how the evaluation procedure works, please refer to this README.

Visualize Models

Visualize learning curves of two different models

python -m trajnetbaselines.lstm.plot_log OUTPUT_BLOCK/five_parallel_synth_split/lstm_vanilla_None.pkl.log OUTPUT_BLOCK/five_parallel_synth_split/lstm_goals_directional_None.pkl.log

## To view the different log files generated, run the command below:
ls OUTPUT_BLOCK/five_parallel_synth_split/lstm_goals_directional_None.pkl*
## You will notice various log files as well as plots in the form of *.png

Visualize predictions of models

# python -m evaluator.visualize_predictions <ground_truth_file> <prediction_files>
python -m evaluator.visualize_predictions DATA_BLOCK/five_parallel_synth/test_private/orca_five_nontraj_synth.ndjson DATA_BLOCK/five_parallel_synth/test_pred/lstm_vanilla_None_modes1/orca_five_nontraj_synth.ndjson DATA_BLOCK/five_parallel_synth/test_pred/lstm_goals_directional_None_modes1/orca_five_nontraj_synth.ndjson --labels Vanilla D-Grid --n 10 --random -o visualize

## Note the addition of the output argument above. The 10 random predictions are saved in the trajnetplusplusbaselines directory. Run:
ls visualize*.png
## You will see 10 '.png' files with the prefix 'visualize', as that was the provided output name.

Next Steps for Milestone 1

  1. Add visualizations (obtained using the previous command) of 3 test scenes qualitatively comparing the outputs of the vanilla model and the D-Grid model (which uses goal information), as well as the quantitative evaluation (Results.png), in the README file of your forked repository.

  2. Next, train the vanilla model and the D-Grid model on TrajNet++ synthetic data and real data following the same steps as above. Link to Train data and Test data. Note that you will need to make separate folders for real_data and synth_data in the DATA_BLOCK folder. Your TrajNet++ data folders should have a structure similar to the one below:
    DATA_BLOCK
    │
    └───real_data
    │   └── train
    |   └── val (self-generated)
    │   └── test
    │       │   crowds_zara02.ndjson
    │       │   biwi_eth.ndjson
    │       │   crowds_uni_examples.ndjson
    │   
    └───synth_data
    │   └── train
    |   └── val (self-generated)
    │   └── test
    │       │   orca_synth.ndjson
    |       |   collision_test.ndjson
    
  3. For faster training on real data, you can remove the CFF files from the train folder.
    rm <path_to_real_data>/train/cff*
    
  4. You are encouraged to play with other interaction encoders and maybe design your own (see the sketch after this list for a rough idea)! You can validate your designs on the synthetic data before trying them out on real data.

  5. Final Step: Upload the predictions of D-Grid to AICrowd. You will have to create an account on AICrowd as well as accept the terms and conditions for the TrajNet++ challenge.
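For intuition only, here is a small, self-contained PyTorch sketch of the kind of computation an interaction encoder performs: it embeds the relative positions of neighbours and pools them into a fixed-size vector. The class name, interface and dimensions are hypothetical and do not match the exact interface expected by trajnetbaselines, so treat it purely as a conceptual starting point.

import torch
import torch.nn as nn

class MeanPoolInteraction(nn.Module):
    """Hypothetical interaction encoder: embed relative neighbour positions
    and mean-pool them into a single interaction vector."""

    def __init__(self, hidden_dim=32):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.embed = nn.Sequential(
            nn.Linear(2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, primary_xy, neighbour_xy):
        # primary_xy: tensor of shape (2,); neighbour_xy: tensor of shape (num_neighbours, 2)
        if neighbour_xy.shape[0] == 0:
            # No neighbours in the scene: return a zero interaction vector
            return torch.zeros(self.hidden_dim)
        rel = neighbour_xy - primary_xy      # positions relative to the primary pedestrian
        return self.embed(rel).mean(dim=0)   # pool over neighbours

# Toy usage
encoder = MeanPoolInteraction()
interaction = encoder(torch.tensor([0.0, 0.0]),
                      torch.tensor([[1.0, 0.5], [-0.5, 2.0]]))
print(interaction.shape)  # torch.Size([32])

Real interaction encoders typically also use neighbour velocities or hidden states and are applied at every prediction step, but the core idea of pooling neighbour information into a fixed-size vector is the same.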

Submission to AICrowd

Let's assume you have two models: 'synth_model_name' trained on TrajNet++ synthetic data and 'real_model_name' trained on TrajNet++ real data. Also assume you have the data directory structure mentioned above.

Generating Predictions for AICrowd

## Generate for Real data
python -m evaluator.trajnet_evaluator --path real_data --output OUTPUT_BLOCK/real_data/<real_model_name>.pkl --write_only

## Generate for Synthetic data
python -m evaluator.trajnet_evaluator --path synth_data --output OUTPUT_BLOCK/synth_data/<synth_model_name>.pkl --write_only

The above operations will save your model predictions in the test_pred folder within the data directory, as shown below:

DATA_BLOCK
│
└───real_data
│   └── train
|   └── val (self-generated)
│   └── test_pred
|       └── <real_model_name>_modes1
│           | crowds_zara02.ndjson
│           | biwi_eth.ndjson
│           | crowds_uni_examples.ndjson
│   └── test

└───synth_data
│   └── train
|   └── val (self-generated)
│   └── test_pred
|       └── <synth_model_name>_modes1
│           | orca_synth.ndjson
|           | collision_test.ndjson
│   └── test

Uploading Predictions to AICrowd

These test predictions need to be uploaded to AICrowd.

## KEEP THE SAME FOLDER NAMES and STRUCTURE given below !!
mkdir test
mkdir test/real_data
mkdir test/synth_data
cp DATA_BLOCK/real_data/test_pred/<real_model_name>_modes1/* test/real_data
cp DATA_BLOCK/synth_data/test_pred/<synth_model_name>_modes1/* test/synth_data
zip -r <my_model_name>.zip test/

## Upload the <my_model_name>.zip to AICrowd. 

Done Done! :)

Useful Resources

Introduction to Git

To help you get started with Git, here are some useful resources:

Git Handbook (10 min. read): https://guides.github.com/introduction/git-handbook/

Git Cheatsheet: https://training.github.com/downloads/github-git-cheat-sheet/

FAQ

Q1. Important steps when you come back to the code

Please activate your virtual environment!
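As a reminder, the activation commands from the setup above:

## Local machine: choose the one matching your setup
source trajnetv/bin/activate
conda activate trajnetv

## SCITAS
module load gcc
module load python/3.7.3
source venvs/trajnetv/bin/activate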

Q2. Important steps before you close your code after a good day’s progress :)

Do not forget to push your code to GitHub. It saves your progress! :) Refer to the GitHub resources above if you haven't yet.
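A minimal sequence for this (assuming you are pushing to your fork's default branch, here called master):

## From the root of your forked repository
git add -A
git commit -m "Milestone 1 progress"
git push origin master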

Q3. What are the 'goal' files in synthetic data, and why are they absent in real data?

The goal files contain the 'final destination' (goal) of each pedestrian in the ORCA simulator. This does not refer to the location at the end of the prediction period, but to the location at the end of the simulation. We have access to these goals only for synthetic data, so only use the '--goals' flag for synthetic data and not for real data. Remember to move the goal .pkl file to the goal_files folder as shown in the tutorial above.

Q4. How do we open the .png files which are generated when we run the visualization command?

You can transfer the .png files to your Desktop using 'scp' (the boring but simple way). Or you can use a text editor that allows you to open .png files from the terminal; I use Sublime Text.
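For example, a hypothetical transfer from SCITAS followed by opening the images locally (adjust the username, host and paths to your setup):

## Run on your local machine
scp '<username>@<scitas_host>:~/trajnet++/trajnetplusplusbaselines/visualize*.png' .
## Then open them with your image viewer, e.g.
open visualize0.png       ## macOS
xdg-open visualize0.png   ## Linux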