<h1 id="empirical-evidence">Empirical Evidence</h1>
<p>‘Do you think it’s possible to get six packs by just exercising daily and controlling the food intake? I mean without taking extra whey protein and supplements and without going to the gym.’</p>
<p>‘I haven’t seen any empirical evidence.’</p>
<p>‘Let’s verify it then.’</p>
<p><a href="https://docs.google.com/spreadsheets/d/13oA2G1ytE0PrlelrBYbThullfPLUK75wBBxjHNt3qv0/edit?usp=sharing">Google Sheet</a></p>Parth Kothariparthkothari811@gmail.com‘Do you think it’s possible to get six packs by just exercising daily and controlling the food intake? I mean without taking extra whey protein and supplements and without going to the gym.’Deep Learning for Autonomous Vehicles: Milestone 12021-04-14T00:00:00-07:002021-04-14T00:00:00-07:00https://thedebugger811.github.io/posts/2021/04/DLAV<p>In this post, I provide a kickstarter guide to getting started with TrajNet++ framework for human trajectory forecasting, which will prove useful in helping you approach Milestone 1.</p>
<h4 id="updates-22042021">Updates (22.04.2021)</h4>
<ol>
<li>Added an FAQ section.</li>
<li>Updated Visualization section.</li>
</ol>
<h4 id="updates-20042021">Updates (20.04.2021)</h4>
<ol>
<li>Resources to get started with GitHub added.</li>
</ol>
<h4 id="updates-19042021">Updates (19.04.2021)</h4>
<ol>
<li>Procedure to setting up TrajNet++ on SCITAS added.</li>
</ol>
<h4 id="updates-16042021">Updates (16.04.2021)</h4>
<ol>
<li>Procedure to submit to AICrowd added.</li>
</ol>
<h4 id="updates-15042021">Updates (15.04.2021)</h4>
<ol>
<li>Fork the code on GitHub, and then clone the <em>forked repository</em>. This will help the TAs track your code easily.</li>
<li>The latest push on trajnetplusplusbaselines now works with Python 3.8.</li>
<li>The Next Steps for Milestone 1 have been added at the end.</li>
</ol>
<h6 id="working-with-15042021">Working with (15.04.2021)</h6>
<ol>
<li>MacOS 10.15.7 + Conda (Python3.6 & Python3.8) ✅</li>
<li>Ubuntu 18.04 + Virtualenv (Python3.6 & Python3.8) ✅</li>
</ol>
<h1 id="overview">Overview</h1>
<p>At a high level, TrajNet++ comprises four primary components:</p>
<ol>
<li>
<p><a href="https://github.com/vita-epfl/trajnetplusplustools">Trajnetplusplustools</a>: This repository provides helper functions for trajectory prediction. For instance: trajectory categorization, evaluation metrics, prediction visualization.</p>
</li>
<li>
<p><a href="https://github.com/vita-epfl/trajnetplusplusdataset">Trajnetplusplusdataset</a>: This repository provides scripts to generate train, val and test dataset splits from raw data as well as simulators.</p>
</li>
<li>
<p><a href="https://github.com/vita-epfl/trajnetplusplusbaselines">Trajnetplusplusbaselines</a>: This repository provides baseline models (handcrafted as well as data-driven) for human motion prediction. This repository also provides scripts to extensively evaluate the trained models.</p>
</li>
<li>
<p><a href="https://github.com/vita-epfl/trajnetplusplusdata">Trajnetplusplusdata</a>: This repository provides the already processed real world data as well as synthetic datasets conforming to human motion.</p>
</li>
</ol>
<h1 id="milestone-1-getting-started">Milestone 1: Getting Started</h1>
<p>I describe how to get started with TrajNet++ with the help of a running example.
We will download an already-created synthetic dataset and train an LSTM-based model to perform trajectory forecasting.</p>
<h2 id="setting-up-the-repository">Setting Up The Repository</h2>
<p>The first step is to set up the Trajnetplusplusbaselines repository for model training. Next, we set up a virtual environment and install the requirements. The virtual environment can also be set up using <a href="https://docs.conda.io/projects/conda/en/latest/user-guide/cheatsheet.html">Conda</a> on local machines.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c">## 1. On LOCAL MACHINE</span>
<span class="c">## Make virtual environment using either A. virtualenv or B. conda</span>
<span class="c">## A. Using virtualenv</span>
<span class="c">## Works with Python3.6 and Python3.8</span>
virtualenv <span class="nt">-p</span> /usr/bin/python3.6 trajnetv
<span class="nb">source </span>trajnetv/bin/activate
<span class="c">## B. Using conda</span>
<span class="c">## Works with Python3.6 and Python3.8</span>
conda create <span class="nt">--name</span> trajnetv <span class="nv">python</span><span class="o">=</span>3.8
conda activate trajnetv
<span class="c">## 2. On SCITAS</span>
module load gcc
module load python/3.7.3
virtualenv <span class="nt">--system-site-packages</span> venvs/trajnetv
<span class="nb">source </span>venvs/trajnetv/bin/activate
</code></pre></div></div>
<p>Set up TrajNet++ on SCITAS after verifying the setup on your local machine. On SCITAS, there is no need to fork the repository again; simply clone your already-forked repository.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## Create directory to setup Trajnet++</span>
<span class="nb">mkdir </span>trajnet++
<span class="nb">cd </span>trajnet++
<span class="c">## Clone Repositories</span>
<span class="c"># git clone https://github.com/vita-epfl/trajnetplusplusbaselines.git (Old)</span>
git clone <forked_repository>
<span class="c">## Download Requirements</span>
<span class="nb">cd </span>trajnetplusplusbaselines/
pip <span class="nb">install</span> <span class="nt">-e</span> <span class="nb">.</span>
<span class="c">## SCITAS-Specific (!)</span>
<span class="c">## If previous command gives an error: "requires deeptoolsintervals>=0.1.7, requires plotly>=2.0.0", then:</span>
pip <span class="nb">install </span>deeptoolsintervals
pip <span class="nb">install </span>plotly
pip <span class="nb">install</span> <span class="nt">-e</span> <span class="nb">.</span>
</code></pre></div></div>
<p>Follow the next steps for SCITAS as well.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## Additional Requirements (ORCA)</span>
wget https://github.com/sybrenstuvel/Python-RVO2/archive/master.zip
unzip master.zip
<span class="nb">rm </span>master.zip
<span class="c">## Setting up ORCA (steps provided in the Python-RVO2 repo)</span>
<span class="nb">cd </span>Python-RVO2-master/
pip <span class="nb">install </span>cmake
pip <span class="nb">install </span>cython
python setup.py build
python setup.py <span class="nb">install
cd</span> ../
</code></pre></div></div>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## Additional Requirements (Social Force)</span>
wget https://github.com/svenkreiss/socialforce/archive/refs/heads/main.zip
unzip main.zip
<span class="nb">rm </span>main.zip
<span class="c">## Setting up Social Force</span>
<span class="nb">cd </span>socialforce-main/
pip <span class="nb">install</span> <span class="nt">-e</span> <span class="nb">.</span>
<span class="nb">cd</span> ../
</code></pre></div></div>
<p>Our repository is now set up!</p>
<h2 id="preparing-the-dataset">Preparing the Dataset</h2>
<p>Now, we will download and prepare data for training our models. In this example, we download a synthetic dataset generated using the ORCA policy.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd </span>DATA_BLOCK/
wget https://github.com/vita-epfl/trajnetplusplusdata/releases/download/v3.1/five_parallel_synth.zip
unzip five_parallel_synth.zip
<span class="nb">rm </span>five_parallel_synth.zip
<span class="nb">ls </span>five_parallel_synth/
</code></pre></div></div>
<p>You will notice that the current folder contains <em>train</em> data, <em>test</em> data and <em>test_private</em> data. The <em>test</em> data contains the test examples only up to the end of the observation period, while <em>test_private</em>, as the name suggests, contains the ground-truth predictions which will be used as the reference to evaluate the performance of the forecasting model. You will also notice that a <em>validation</em> set is not present. Preparing a validation set is important for hyperparameter tuning. You can use the following helper script to split the current training samples into a training split (80%) and a validation split (20%).</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd</span> ../
python create_validation.py <span class="nt">--path</span> five_parallel_synth <span class="nt">--val_ratio</span> 0.2
</code></pre></div></div>
<p>The above command will create a new folder <em>five_parallel_synth_split</em> in the <em>DATA_BLOCK</em> folder.</p>
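<p>If you want a quick sanity check that the split was created, you can list the folders (a simple check, assuming the command above ran without errors):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## Verify that the split folder now sits next to the original dataset
ls DATA_BLOCK/
ls DATA_BLOCK/five_parallel_synth_split/
</code></pre></div></div>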
<p>We can additionally transfer the goal information of the dataset into the <em>goal_files</em> folder.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## Preparing Goals folder (additional attributes)</span>
<span class="nb">mkdir</span> <span class="nt">-p</span> goal_files/train
<span class="nb">mkdir </span>goal_files/val
<span class="nb">mkdir </span>goal_files/test_private
<span class="c">## For other datasets, the goal files can be different for corresponding dataset split</span>
<span class="nb">cp </span>DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/train/
<span class="nb">cp </span>DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/val/
<span class="nb">cp </span>DATA_BLOCK/five_parallel_synth/orca_five_nontraj_synth.pkl goal_files/test_private/
</code></pre></div></div>
<p>Now that the dataset is ready, it’s time to train the model! :)</p>
<h2 id="training-models">Training Models</h2>
<p>Training models is even easier than setting up TrajNet++!</p>
<p>On SCITAS, training is launched using bash scripts submitted to the job scheduler; please refer to the tutorial for details (a rough sketch is given below). The training procedure that follows the sketch is only for your local machine.</p>
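<p>For reference, here is a minimal sketch of what such a SCITAS job script could look like. This is only a sketch: the SBATCH resource values and the paths are assumptions, so follow the course tutorial for the actual script and submission settings.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
## train_lstm.sh -- hypothetical SLURM script; the resource values below are assumptions
#SBATCH --time=04:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G

## Recreate the environment used during setup (adapt the paths to where you created them)
module load gcc
module load python/3.7.3
source venvs/trajnetv/bin/activate

## Launch training from the repository root (path is an assumption)
cd trajnet++/trajnetplusplusbaselines/
python -m trajnetbaselines.lstm.trainer --path five_parallel_synth_split --augment
</code></pre></div></div>
<p>You would then submit it with ‘sbatch train_lstm.sh’.</p>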
<p>On your local machine, all you have to do is ….</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetbaselines</span><span class="p">.</span><span class="n">lstm</span><span class="p">.</span><span class="n">trainer</span> <span class="o">--</span><span class="n">path</span> <span class="n">five_parallel_synth_split</span> <span class="o">--</span><span class="n">augment</span>
</code></pre></div></div>
<p>…. and your LSTM model starts training. Your model will be saved in the <em>five_parallel_synth_split</em> folder within <em>OUTPUT_BLOCK</em>. Currently, models are saved according to the type of interaction module being used.</p>
<p>In order to train using interaction modules (e.g. directional-grid) that utilize an additional attribute (goal information), run</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetbaselines</span><span class="p">.</span><span class="n">lstm</span><span class="p">.</span><span class="n">trainer</span> <span class="o">--</span><span class="n">path</span> <span class="n">five_parallel_synth_split</span> <span class="o">--</span><span class="nb">type</span> <span class="s">'directional'</span> <span class="o">--</span><span class="n">goals</span> <span class="o">--</span><span class="n">augment</span>
</code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## To know more options about trainer
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetbaselines</span><span class="p">.</span><span class="n">lstm</span><span class="p">.</span><span class="n">trainer</span> <span class="o">--</span><span class="n">help</span>
</code></pre></div></div>
<p>Models trained on SCITAS can be evaluated and visualized on your local machine. To do so, ‘scp’ the output files and log files from SCITAS to the repository on your local machine. Note: maintain the same file structure.</p>
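<p>A minimal sketch of such a transfer is shown below; the username, cluster address and remote path are placeholders that you should replace with your own.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## Run from the root of trajnetplusplusbaselines on your LOCAL machine
## (hypothetical remote path -- keep the same OUTPUT_BLOCK structure locally)
scp -r <username>@<scitas-cluster>:trajnet++/trajnetplusplusbaselines/OUTPUT_BLOCK/five_parallel_synth_split OUTPUT_BLOCK/
</code></pre></div></div>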
<h2 id="evaluating-models">Evaluating Models</h2>
<p>One strength of TrajNet++ is its extensive evaluation system. You can read more about it in the <a href="https://www.aicrowd.com/challenges/trajnet-a-trajectory-forecasting-challenge">metrics section here</a>.</p>
<p>To perform an extensive evaluation of your trained models, run the command below. The results are saved in Results.png.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># python -m evaluator.trajnet_evaluator --path <test_dataset> --output <path_to_model_pkl_file>
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">evaluator</span><span class="p">.</span><span class="n">trajnet_evaluator</span> <span class="o">--</span><span class="n">path</span> <span class="n">five_parallel_synth</span> <span class="o">--</span><span class="n">output</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">five_parallel_synth_split</span><span class="o">/</span><span class="n">lstm_vanilla_None</span><span class="p">.</span><span class="n">pkl</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">five_parallel_synth_split</span><span class="o">/</span><span class="n">lstm_goals_directional_None</span><span class="p">.</span><span class="n">pkl</span>
<span class="c1">## To know more options about evaluator
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">evaluator</span><span class="p">.</span><span class="n">trajnet_evaluator</span> <span class="o">--</span><span class="n">help</span>
</code></pre></div></div>
<p>To know more about how the evaluation procedure works, please refer to this <a href="https://github.com/vita-epfl/trajnetplusplusbaselines/blob/master/evaluator/README.rst">README</a>.</p>
<h2 id="visualize-models">Visualize Models</h2>
<p>Visualize learning curves of two different models</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetbaselines</span><span class="p">.</span><span class="n">lstm</span><span class="p">.</span><span class="n">plot_log</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">five_parallel_synth_split</span><span class="o">/</span><span class="n">lstm_vanilla_None</span><span class="p">.</span><span class="n">pkl</span><span class="p">.</span><span class="n">log</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">five_parallel_synth_split</span><span class="o">/</span><span class="n">lstm_goals_directional_None</span><span class="p">.</span><span class="n">pkl</span><span class="p">.</span><span class="n">log</span>
<span class="c1">## To view the different log files generated, run the command below:
</span><span class="n">ls</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">five_parallel_synth_split</span><span class="o">/</span><span class="n">lstm_goals_directional_None</span><span class="p">.</span><span class="n">pkl</span><span class="o">*</span>
<span class="c1">## You will notice various log files in form of *.png
</span></code></pre></div></div>
<p>Visualize predictions of models</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># python -m evaluator.visualize_predictions <ground_truth_file> <prediction_files>
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">evaluator</span><span class="p">.</span><span class="n">visualize_predictions</span> <span class="n">DATA_BLOCK</span><span class="o">/</span><span class="n">five_parallel_synth</span><span class="o">/</span><span class="n">test_private</span><span class="o">/</span><span class="n">orca_five_nontraj_synth</span><span class="p">.</span><span class="n">ndjson</span> <span class="n">DATA_BLOCK</span><span class="o">/</span><span class="n">five_parallel_synth</span><span class="o">/</span><span class="n">test_pred</span><span class="o">/</span><span class="n">lstm_vanilla_None_modes1</span><span class="o">/</span><span class="n">orca_five_nontraj_synth</span><span class="p">.</span><span class="n">ndjson</span> <span class="n">DATA_BLOCK</span><span class="o">/</span><span class="n">five_parallel_synth</span><span class="o">/</span><span class="n">test_pred</span><span class="o">/</span><span class="n">lstm_goals_directional_None_modes1</span><span class="o">/</span><span class="n">orca_five_nontraj_synth</span><span class="p">.</span><span class="n">ndjson</span> <span class="o">--</span><span class="n">labels</span> <span class="n">Vanilla</span> <span class="n">D</span><span class="o">-</span><span class="n">Grid</span> <span class="o">--</span><span class="n">n</span> <span class="mi">10</span> <span class="o">--</span><span class="n">random</span> <span class="o">-</span><span class="n">o</span> <span class="n">visualize</span>
<span class="c1">## Note the addition of output argument above. The 10 random predictions are saved in the trajnetplusplusbaselines directory. Run:
</span><span class="n">ls</span> <span class="n">visualize</span><span class="o">*</span><span class="p">.</span><span class="n">png</span>
<span class="c1">## You wil see 10 '.png' files with prefix 'visualize' as it was the provided output name.
</span></code></pre></div></div>
<h2 id="next-steps-for-milestone-1">Next Steps for Milestone 1</h2>
<ol>
<li>
<p>Add visualizations (obtained using the previous command) of 3 test scenes, qualitatively comparing the outputs of the vanilla model and the D-Grid model (which uses goal information), as well as the quantitative evaluation (Results.png), in the README file of your forked repository.</p>
</li>
<li>Next, train the vanilla model and the D-Grid model on TrajNet++ synthetic data and real data following the same steps as above. Link to <a href="https://github.com/vita-epfl/trajnetplusplusdata/releases/download/v4.0/train.zip">Train data</a> and <a href="https://github.com/vita-epfl/trajnetplusplusdata/releases/download/v4.0/test.zip">Test data</a>. Note that you will need to make separate folders for <em>real_data</em> and <em>synth_data</em> in the <em>DATA_BLOCK</em> folder (a rough download sketch is given after this list). Your TrajNet++ data folders should have a structure similar to the one below:
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DATA_BLOCK
│
└───real_data
│ └── train
| └── val (self-generated)
│ └── test
│ │ crowds_zara02.ndjson
│ │ biwi_eth.ndjson
│ │ crowds_uni_examples.ndjson
│
└───synth_data
│ └── train
| └── val (self-generated)
│ └── test
│ │ orca_synth.ndjson
| | collision_test.ndjson
</code></pre></div> </div>
</li>
<li>For faster training on real data, you can remove the CFF files from the <em>train</em> folder.
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rm</span> <span class="o"><</span><span class="n">path_to_real_data</span><span class="o">>/</span><span class="n">train</span><span class="o">/</span><span class="n">cff</span><span class="o">*</span>
</code></pre></div> </div>
</li>
<li>
<p>You are encouraged to play with other interaction encoders and maybe, design your own! You can validate your designs on the synthetic data before trying out on real data.</p>
</li>
<li>Final Step: Upload the predictions of D-Grid to <a href="https://www.aicrowd.com/challenges/trajnet-a-trajectory-forecasting-challenge">AICrowd</a>. You will have to create an account on AICrowd as well as accept the terms and conditions for the TrajNet++ challenge.</li>
</ol>
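<p>As referenced in step 2 above, here is a rough sketch for downloading the TrajNet++ data and laying out the folders. It is a sketch only: how the downloaded zips are organised internally is an assumption here, so sort the extracted .ndjson files into the folders according to the tree shown above, and generate the val splits yourself as done earlier.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## Download the released train/test data into DATA_BLOCK
cd DATA_BLOCK/
wget https://github.com/vita-epfl/trajnetplusplusdata/releases/download/v4.0/train.zip
wget https://github.com/vita-epfl/trajnetplusplusdata/releases/download/v4.0/test.zip
unzip train.zip
unzip test.zip
rm train.zip test.zip
## Create the real_data / synth_data layout from the tree above
mkdir -p real_data/train real_data/test synth_data/train synth_data/test
## Move the extracted .ndjson files into the matching folders, e.g. biwi_eth.ndjson and
## crowds_*.ndjson under real_data, orca_synth.ndjson and collision_test.ndjson under
## synth_data. Then generate the self-made val splits as shown earlier, e.g.:
## cd ../ && python create_validation.py --path real_data --val_ratio 0.2
</code></pre></div></div>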
<h2 id="submission-to-aicrowd">Submission to AICrowd</h2>
<p>Let’s assume you have two models named ‘synth_model_name’ trained on TrajNet++ synthetic data and ‘real_model_name’ trained on TrajNet++ real data. Also, by default, you have the data directory structure as mentioned above.</p>
<h3 id="generating-predictions-for-aicrowd">Generating Predictions for AICrowd</h3>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Generate for Real data
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">evaluator</span><span class="p">.</span><span class="n">trajnet_evaluator</span> <span class="o">--</span><span class="n">path</span> <span class="n">real_data</span> <span class="o">--</span><span class="n">output</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">real_data</span><span class="o">/<</span><span class="n">real_model_name</span><span class="o">></span><span class="p">.</span><span class="n">pkl</span> <span class="o">--</span><span class="n">write_only</span>
<span class="c1">## Generate for Real data
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">evaluator</span><span class="p">.</span><span class="n">trajnet_evaluator</span> <span class="o">--</span><span class="n">path</span> <span class="n">synth_data</span> <span class="o">--</span><span class="n">output</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">synth_data</span><span class="o">/<</span><span class="n">synth_model_name</span><span class="o">></span><span class="p">.</span><span class="n">pkl</span> <span class="o">--</span><span class="n">write_only</span>
</code></pre></div></div>
<p>The above operations will save your model predictions in the <em>test_pred</em> folder within data directory as shown below:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DATA_BLOCK
│
└───real_data
│ └── train
| └── val (self-generated)
│ └── test_pred
| └── <real_model_name>_modes1
│ | crowds_zara02.ndjson
│ | biwi_eth.ndjson
│ | crowds_uni_examples.ndjson
│ └── test
└───synth_data
│ └── train
| └── val (self-generated)
│ └── test_pred
| └── <synth_model_name>_modes1
│ | orca_synth.ndjson
| | collision_test.ndjson
│ └── test
</code></pre></div></div>
<h3 id="uploading-predictions-to-aicrowd">Uploading Predictions to AICrowd</h3>
<p>These test predictions need to be uploaded to AICrowd.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## KEEP THE SAME FOLDER NAMES and STRUCTURE given below !!
</span><span class="n">mkdir</span> <span class="n">test</span>
<span class="n">mkdir</span> <span class="n">test</span><span class="o">/</span><span class="n">real_data</span>
<span class="n">mkdir</span> <span class="n">test</span><span class="o">/</span><span class="n">synth_data</span>
<span class="n">cp</span> <span class="n">DATA_BLOCK</span><span class="o">/</span><span class="n">real_data</span><span class="o">/</span><span class="n">test_pred</span><span class="o">/<</span><span class="n">real_model_name</span><span class="o">></span><span class="n">_modes1</span><span class="o">/*</span> <span class="n">test</span><span class="o">/</span><span class="n">real_data</span>
<span class="n">cp</span> <span class="n">DATA_BLOCK</span><span class="o">/</span><span class="n">synth_data</span><span class="o">/</span><span class="n">test_pred</span><span class="o">/<</span><span class="n">synth_model_name</span><span class="o">></span><span class="n">_modes1</span><span class="o">/*</span> <span class="n">test</span><span class="o">/</span><span class="n">synth_data</span>
<span class="nb">zip</span> <span class="o">-</span><span class="n">r</span> <span class="o"><</span><span class="n">my_model_name</span><span class="o">></span><span class="p">.</span><span class="nb">zip</span> <span class="n">test</span><span class="o">/</span>
<span class="c1">## Upload the <my_model_name>.zip to AICrowd.
</span></code></pre></div></div>
<p>Done Done! :)</p>
<h1 id="useful-resources">Useful Resources</h1>
<h3 id="introduction-to-git">Introduction to Git</h3>
<p>To help you get started with Git, here are some useful resources:</p>
<p>Git Handbook (10 min. read): <a href="https://guides.github.com/introduction/git-handbook/">https://guides.github.com/introduction/git-handbook/</a></p>
<p>Git Cheatsheet: <a href="https://training.github.com/downloads/github-git-cheat-sheet/">https://training.github.com/downloads/github-git-cheat-sheet/</a></p>
<h3 id="faq">FAQ</h3>
<h6 id="q1-important-steps-when-you-come-back-to-the-code">Q1. Important steps when you come back to the code</h6>
<p>Please activate your virtual environment!</p>
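<p>For convenience, here are the activation commands from the setup above (use the one matching the environment you created):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## Local machine, virtualenv
source trajnetv/bin/activate
## Local machine, conda
conda activate trajnetv
## SCITAS (load the modules first, as during setup)
module load gcc
module load python/3.7.3
source venvs/trajnetv/bin/activate
</code></pre></div></div>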
<h6 id="q2-important-steps-before-you-close-your-code-after-a-good-days-progress-">Q2. Important steps before you close your code after a good day’s progress :)</h6>
<p>Do not forget to <strong>push</strong> your code on GitHub. It saves your progress! :) Refer to the GitHub resources if you haven’t yet.</p>
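<p>A minimal sketch of the usual sequence (the branch name is an assumption; use whichever branch your fork tracks):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git add -A
git commit -m "Short description of today's progress"
## 'master' is an assumption -- replace it with your fork's branch name
git push origin master
</code></pre></div></div>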
<h6 id="q3-what-are-the-goal-files-in-synthetic-data-why-are-they-absent-in-real-data">Q3. What are the ‘goal’ files in synthetic data, why are they absent in real data?</h6>
<p>The goal files contain the ‘final destination’ (goal) of each pedestrian in the ORCA simulator. It does not refer to the location at the end of the prediction period, but to the location at the end of the simulation. We have access to these goals only for synthetic data, so only use the ‘–goals’ flag for synthetic data and <strong>not</strong> for real data. Remember to move the goal .pkl file to the <em>goal_files</em> folder as shown in the tutorial above.</p>
<h6 id="q4-how-do-we-open-the-png-files-which-are-generated-when-we-run-the-visualization-command">Q4. How do we open the .png files which are generated when we run the visualization command?</h6>
<p>You can transfer the .png files using ‘scp’ to your Desktop (the boring but simple way). Or you can use a text editor that allows you to open .png files from the terminal. I use <a href="https://www.sublimetext.com/">Sublime Text</a>.</p>
<h1 id="trajnetpp-dataset-conversion">TrajNet++ : Dataset Conversion</h1>
<p>In this blog post, I provide a quick tutorial on converting external datasets into the desired <a href="http://ndjson.org/">.ndjson</a> format using the TrajNet++ framework. This post focuses on utilizing the TrajNet++ dataset code for easily converting new datasets.</p>
<!-- Overview
========
On a high-level, Trajnet++ dataset code does three primary functions:
1. [readers.py](https://github.com/vita-epfl/trajnetplusplusdataset/blob/master/trajnetdataset/readers.py): This code converts the raw dataset files into trackrows in .ndjson format
2. [scene.py](https://github.com/vita-epfl/trajnetplusplusdataset/blob/master/trajnetdataset/scene.py): This code constructs the different scenes (i.e. the dataset samples), as per the specification provided, from the trackrows.
3. [get_type.py](https://github.com/vita-epfl/trajnetplusplusdataset/blob/master/trajnetdataset/get_type.py): This code performs trajectory categorization for every scene. Read more about our trajectory categorization [here](https://arxiv.org/pdf/2007.03639v2.pdf). -->
<h1 id="details">Details</h1>
<p>In this tutorial, I will convert the ETH dataset utilized by the <a href="https://github.com/agrimgupta92/sgan">Social GAN</a> paper.</p>
<h2 id="step-1-downloading-data">Step 1: Downloading Data</h2>
<p>Before proceeding, please set up the base repositories. See ‘Setting Up Repositories’ <a href="https://thedebugger811.github.io/posts/2020/03/intro_trajnetpp/">here</a>.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## Checkout 'eth' branch of Trajnetplusplusdataset</span>
git checkout <span class="nt">-b</span> eth origin/eth
<span class="c">## Download external data</span>
sh download_data.sh
<span class="nb">cp</span> <span class="nt">-r</span> datasets/eth/ data/
</code></pre></div></div>
<h2 id="step-2-converting-raw-data">Step 2: Converting Raw Data</h2>
<ol>
<li>In our external dataset, each trajectory point is delimited by ‘\t’</li>
<li>TrackRow takes the arguments ‘frame’, ‘ped_id’, ‘x’, ‘y’ in order.</li>
</ol>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">standard</span><span class="p">(</span><span class="n">line</span><span class="p">):</span>
<span class="n">line</span> <span class="o">=</span> <span class="p">[</span><span class="n">e</span> <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="n">line</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="s">'</span><span class="se">\t</span><span class="s">'</span><span class="p">)</span> <span class="k">if</span> <span class="n">e</span> <span class="o">!=</span> <span class="s">''</span><span class="p">]</span>
<span class="k">return</span> <span class="n">TrackRow</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="nb">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">0</span><span class="p">])),</span>
<span class="nb">int</span><span class="p">(</span><span class="nb">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">1</span><span class="p">])),</span>
<span class="nb">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">2</span><span class="p">]),</span>
<span class="nb">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">3</span><span class="p">]))</span>
</code></pre></div></div>
<p>This code snippet is already provided in <a href="https://github.com/vita-epfl/trajnetplusplusdataset/blob/eth/trajnetdataset/readers.py">readers.py</a>.</p>
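<p>To make the behaviour of this reader concrete, here is a small, self-contained sketch. The TrackRow used below is a stand-in namedtuple purely for illustration (the real class lives in trajnetplusplustools and carries additional optional fields), and the sample line is a hypothetical ETH-style row.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from collections import namedtuple

## Stand-in for the real TrackRow, for illustration only
TrackRow = namedtuple('TrackRow', ['frame', 'pedestrian', 'x', 'y'])

def standard(line):
    ## split on tabs and drop empty tokens, as in readers.py
    line = [e for e in line.split('\t') if e != '']
    return TrackRow(int(float(line[0])), int(float(line[1])),
                    float(line[2]), float(line[3]))

## A hypothetical raw row: frame 840, pedestrian 1, position (8.46, 3.59)
print(standard('840.0\t1.0\t8.46\t3.59\n'))
## TrackRow(frame=840, pedestrian=1, x=8.46, y=3.59)
</code></pre></div></div>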
<h2 id="dataset-generation-and-categorization">Dataset Generation and Categorization</h2>
<p>For dataset conversion, convert.py calls the ‘raw dataset conversion’ reader shown above:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">standard</span><span class="p">(</span><span class="n">sc</span><span class="p">,</span> <span class="n">input_file</span><span class="p">):</span>
<span class="k">print</span><span class="p">(</span><span class="s">'processing '</span> <span class="o">+</span> <span class="n">input_file</span><span class="p">)</span>
<span class="k">return</span> <span class="p">(</span><span class="n">sc</span>
<span class="p">.</span><span class="n">textFile</span><span class="p">(</span><span class="n">input_file</span><span class="p">)</span>
<span class="p">.</span><span class="nb">map</span><span class="p">(</span><span class="n">readers</span><span class="p">.</span><span class="n">standard</span><span class="p">)</span>
<span class="p">.</span><span class="n">cache</span><span class="p">())</span>
</code></pre></div></div>
<p>This code snippet is already provided in convert.py.</p>
<p>Finally, we call the appropriate data files for conversion and categorization (See <a href="https://github.com/vita-epfl/trajnetplusplusdataset/blob/eth/trajnetdataset/convert.py">convert.py</a>).</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>python <span class="nt">-m</span> trajnetdataset.convert <span class="nt">--obs_len</span> 8 <span class="nt">--pred_len</span> 12
</code></pre></div></div>
<p>Now that the dataset is ready (in the <em>output</em> folder), you can train the model! :)</p>
<h1 id="difference-in-generated-data">Difference in generated data</h1>
<ol>
<li>Partial tracks are now included (for correct occupancy maps)</li>
<li>Pedestrians that appear in multiple chunks had the same id before (might be a problem for some input readers)</li>
<li>Explicit index of scenes with annotation of the primary pedestrian</li>
</ol>
<h1 id="summarizing">Summarizing</h1>
<p>So, to convert any external dataset, all you have to do is follow 4 simple steps (a consolidated sketch is given after this list):</p>
<ol>
<li>Download data and place it in the /data folder.</li>
<li>Edit <a href="https://github.com/vita-epfl/trajnetplusplusdataset/blob/master/trajnetdataset/readers.py">readers.py</a> to convert raw format into TrackRows in <a href="http://ndjson.org/">.ndjson</a> format</li>
<li>Call the above snippet in <a href="https://github.com/vita-epfl/trajnetplusplusdataset/blob/master/trajnetdataset/convert.py">convert.py</a></li>
<li>Call the dataset generation code with the appropriate arguments.</li>
</ol>
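<p>Putting these steps together for the ETH example used in this post, a consolidated sketch (run from the trajnetplusplusdataset folder, assuming the repositories are already set up as in the linked post) looks as follows:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## Step 1: download the raw data and place it in data/
git checkout -b eth origin/eth
sh download_data.sh
cp -r datasets/eth/ data/
## Steps 2 and 3: edit readers.py and convert.py only if your raw format differs
## Step 4: run the conversion and categorization
python -m trajnetdataset.convert --obs_len 8 --pred_len 12
## The converted .ndjson files are now in output/
ls output/
</code></pre></div></div>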
<p>We recently released the TrajNet++ Challenge for agent-agent based trajectory forecasting. Details regarding the challenge can be found <a href="https://www.aicrowd.com/challenges/trajnet-a-trajectory-forecasting-challenge">here</a>.</p>
<h1 id="introducing-trajnetpp">Introducing TrajNet++ : A Framework for Human Trajectory Forecasting</h1>
<p>In this blog post, I provide a kickstarter guide to our recently released TrajNet++ framework for human trajectory forecasting. We recently released the TrajNet++ Challenge for agent-agent based trajectory forecasting. Details regarding the challenge can be found <a href="https://www.aicrowd.com/challenges/trajnet-a-trajectory-forecasting-challenge">here</a>. This post will focus on utilizing the TrajNet++ framework for easily creating datasets and learning human motion forecasting models.</p>
<h1 id="overview">Overview</h1>
<p>At a high level, TrajNet++ comprises four primary components:</p>
<ol>
<li>
<p><a href="https://github.com/vita-epfl/trajnetplusplustools">Trajnetplusplustools</a>: This repository provides helper functions for trajectory prediction. For instance: trajectory categorization, evaluation metrics, prediction visualization.</p>
</li>
<li>
<p><a href="https://github.com/vita-epfl/trajnetplusplusdataset">Trajnetplusplusdataset</a>: This repository provides scripts to generate train, val and test dataset splits from raw data as well as simulators.</p>
</li>
<li>
<p><a href="https://github.com/vita-epfl/trajnetplusplusbaselines">Trajnetplusplusbaselines</a>: This repository provides baseline models (handcrafted as well as data-driven) for human motion prediction. This repository also provides scripts to extensively evaluate the trained models.</p>
</li>
<li>
<p><a href="https://github.com/vita-epfl/trajnetplusplusdata">Trajnetplusplusdata</a>: This repository provides the already processed real world data as well as synthetic datasets conforming to human motion.</p>
</li>
</ol>
<h1 id="getting-started">Getting Started</h1>
<p>I describe how to get started using TrajNet++ with the help of a running example.
We will create a synthetic dataset using the ORCA simulator and train an LSTM-based model to perform trajectory prediction.</p>
<h2 id="setting-up-repositories">Setting Up Repositories</h2>
<p>The first step is to set up the repositories, namely Trajnetplusplusdataset for dataset generation and Trajnetplusplusbaselines for model training. Next, we set up the virtual environment and install the requirements.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## Create directory to setup Trajnet++</span>
<span class="nb">mkdir </span>trajnet++
<span class="nb">cd </span>trajnet++
<span class="c">## Clone Repositories</span>
git clone https://github.com/vita-epfl/trajnetplusplusdataset.git
git clone https://github.com/vita-epfl/trajnetplusplusbaselines.git
<span class="c">## Make virtual environment</span>
virtualenv <span class="nt">-p</span> /usr/bin/python3.6 trajnetv
<span class="nb">source </span>trajnetv/bin/activate
<span class="c">## Download Requirements</span>
<span class="nb">cd </span>trajnetplusplusbaselines/
pip <span class="nb">install</span> <span class="nt">-e</span> <span class="nb">.</span>
<span class="nb">cd</span> ../trajnetplusplusdataset/
pip <span class="nb">install</span> <span class="nt">-e</span> <span class="nb">.</span>
pip <span class="nb">install</span> <span class="nt">-e</span> <span class="s1">'.[test, plot]'</span>
</code></pre></div></div>
<p>Alright, our repositories are now set up!</p>
<h2 id="dataset-preparation">Dataset Preparation</h2>
<p><a href="https://github.com/vita-epfl/trajnetplusplusdataset">Trajnetplusplusdataset</a> helps in creating the dataset splits to train and test our prediction models. In this example, we will be using the ORCA simulator for generating our synthetic data.
Therefore, we will set up the simulator with the help of this <a href="https://github.com/sybrenstuvel/Python-RVO2">wonderful repo</a>.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## Download Repository</span>
wget https://github.com/sybrenstuvel/Python-RVO2/archive/master.zip
unzip master.zip
<span class="nb">rm </span>master.zip
<span class="c">## Setting up ORCA (steps provided in the Python-RVO2 repo)</span>
<span class="nb">cd </span>Python-RVO2-master/
pip <span class="nb">install </span>cmake
pip <span class="nb">install </span>cython
python setup.py build
python setup.py <span class="nb">install
cd</span> ../
</code></pre></div></div>
<p>We also download the Social Force simulator available at <a href="https://github.com/svenkreiss/socialforce">this repository</a>.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">## Download Repository</span>
wget https://github.com/svenkreiss/socialforce/archive/refs/heads/main.zip
unzip main.zip
<span class="nb">rm </span>main.zip
<span class="c">## Setting up Social Force</span>
<span class="nb">cd </span>socialforce-main/
pip <span class="nb">install</span> <span class="nt">-e</span> <span class="nb">.</span>
<span class="nb">cd</span> ../
</code></pre></div></div>
<p>Now, we will generate controlled data using the ORCA simulator. We will generate 1000 scenarios of 5 pedestrians moving in an interactive setting.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Destination to store generated trajectories
</span><span class="n">mkdir</span> <span class="o">-</span><span class="n">p</span> <span class="n">data</span><span class="o">/</span><span class="n">raw</span><span class="o">/</span><span class="n">controlled</span>
<span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetdataset</span><span class="p">.</span><span class="n">controlled_data</span> <span class="o">--</span><span class="n">simulator</span> <span class="s">'orca'</span> <span class="o">--</span><span class="n">num_ped</span> <span class="mi">5</span> <span class="o">--</span><span class="n">num_scenes</span> <span class="mi">1000</span>
<span class="c1">## To know more options of generating controlled data
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetdataset</span><span class="p">.</span><span class="n">controlled_data</span> <span class="o">--</span><span class="n">help</span>
</code></pre></div></div>
<p>By default, the generated trajectories will be stored in <em>‘orca_circle_crossing_5ped_1000scenes_.txt’</em>. Procedure for extracting publicly available datasets can be found <a href="https://github.com/vita-epfl/trajnetplusplusdataset/blob/master/README.rst">here</a>. Also, the goals of the generated trajectories are stored in the <em>‘goal_files’</em> folder under the same name as the .txt file.</p>
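<p>A quick way to verify the generation step is to list the files it produced (paths as created above):</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ls data/raw/controlled/
ls goal_files/
</code></pre></div></div>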
<p>We will now convert the generated ‘.txt’ file into the TrajNet++ data structure format. Moreover, we will choose to select only interacting scenes (Type III) from our generated trajectories. More details regarding our data format and trajectory categorization can be found on our <a href="https://www.aicrowd.com/challenges/trajnet-a-trajectory-forecasting-challenge">challenge overview page</a>.</p>
<p>For conversion, open <em>trajnetdataset/convert.py</em>, comment out the real dataset conversion part in main() and uncomment the synthetic dataset conversion part. Then run the conversion as shown below.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Run the conversion
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetdataset</span><span class="p">.</span><span class="n">convert</span> <span class="o">--</span><span class="n">linear_threshold</span> <span class="mf">0.3</span> <span class="o">--</span><span class="n">acceptance</span> <span class="mi">0</span> <span class="mi">0</span> <span class="mf">1.0</span> <span class="mi">0</span> <span class="o">--</span><span class="n">synthetic</span>
<span class="c1">## To know more options of converting data
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetdataset</span><span class="p">.</span><span class="n">convert</span> <span class="o">--</span><span class="n">help</span>
</code></pre></div></div>
<p>Once the conversion process completes, your converted datasets will be available in the <em>output</em> folder. Trajnetplusplustools provides the following utilities to understand your dataset better. To visualize trajectories in the terminal on macOS, I use <a href="https://github.com/daleroberts/itermplot">itermplot</a>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## obtain new dataset statistics
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetplusplustools</span><span class="p">.</span><span class="n">dataset_stats</span> <span class="n">output</span><span class="o">/</span><span class="n">train</span><span class="o">/*</span><span class="p">.</span><span class="n">ndjson</span>
<span class="c1">## visualize sample scenes
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetplusplustools</span><span class="p">.</span><span class="n">trajectories</span> <span class="n">output</span><span class="o">/</span><span class="n">train</span><span class="o">/*</span><span class="p">.</span><span class="n">ndjson</span> <span class="o">--</span><span class="n">random</span>
<span class="c1">## visualize interactions (Default: Collision Avoidance)
</span><span class="n">mkdir</span> <span class="n">interactions</span>
<span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetplusplustools</span><span class="p">.</span><span class="n">visualize_type</span> <span class="n">output</span><span class="o">/</span><span class="n">train</span><span class="o">/*</span><span class="p">.</span><span class="n">ndjson</span>
</code></pre></div></div>
<p>Finally, move the converted data and goal files (if necessary) to the trajnetbaselines folder.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">mv </span>output ../trajnetplusplusbaselines/DATA_BLOCK/synth_data
<span class="nb">mv </span>goal_files/ ../trajnetplusplusbaselines/
<span class="nb">cd</span> ../trajnetplusplusbaselines/
</code></pre></div></div>
<p>Now that the dataset is ready, it’s time to train the model! :)</p>
<h2 id="training-models">Training Models</h2>
<p>Training models is even easier than generating datasets in TrajNet++!
All you have to do is ….</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetbaselines</span><span class="p">.</span><span class="n">lstm</span><span class="p">.</span><span class="n">trainer</span> <span class="o">--</span><span class="n">path</span> <span class="n">synth_data</span>
</code></pre></div></div>
<p>…. and your LSTM model starts training. Your model will be saved in the <em>synth_data</em> folder within <em>OUTPUT_BLOCK</em>. Currently, models are saved according to the type of interaction module being used.</p>
<p>In order to train using interaction modules (e.g. nearest-neighbour encoding) utilizing goal information, run</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetbaselines</span><span class="p">.</span><span class="n">lstm</span><span class="p">.</span><span class="n">trainer</span> <span class="o">--</span><span class="n">path</span> <span class="n">synth_data</span> <span class="o">--</span><span class="nb">type</span> <span class="s">'nn'</span> <span class="o">--</span><span class="n">goals</span>
</code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## To know more options about trainer
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetbaselines</span><span class="p">.</span><span class="n">lstm</span><span class="p">.</span><span class="n">trainer</span> <span class="o">--</span><span class="n">help</span>
</code></pre></div></div>
<h2 id="evaluating-models">Evaluating Models</h2>
<p>One strength of TrajNet++ is its extensive evaluation system. You can read more about it in the <a href="https://www.aicrowd.com/challenges/trajnet-a-trajectory-forecasting-challenge">metrics section here</a>.</p>
<p>To perform an extensive evaluation of your trained model, run the command below. The results are saved in Results.png.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">evaluator</span><span class="p">.</span><span class="n">trajnet_evaluator</span> <span class="o">--</span><span class="n">path</span> <span class="n">synth_data</span> <span class="o">--</span><span class="n">output</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">synth_data</span><span class="o">/</span><span class="n">lstm_vanilla_None</span><span class="p">.</span><span class="n">pkl</span>
<span class="c1">## To know more options about evaluator
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">evaluator</span><span class="p">.</span><span class="n">trajnet_evaluator</span> <span class="o">--</span><span class="n">help</span>
</code></pre></div></div>
<p>To know more about how the evaluation procedure works, please refer to this <a href="https://github.com/vita-epfl/trajnetplusplusbaselines/blob/master/evaluator/README.rst">README</a>.</p>
<h2 id="visualize-models">Visualize Models</h2>
<p>Visualize learning curves of two different models</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">trajnetbaselines</span><span class="p">.</span><span class="n">lstm</span><span class="p">.</span><span class="n">plot_log</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">synth_data</span><span class="o">/</span><span class="n">lstm_vanilla_None</span><span class="p">.</span><span class="n">pkl</span><span class="p">.</span><span class="n">log</span> <span class="n">OUTPUT_BLOCK</span><span class="o">/</span><span class="n">synth_data</span><span class="o">/</span><span class="n">lstm_goals_nn_None</span><span class="p">.</span><span class="n">pkl</span><span class="p">.</span><span class="n">log</span>
</code></pre></div></div>
<p>Visualize predictions of models</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># python -m evaluator.visualize_predictions <ground_truth_file> <prediction_file>
</span><span class="n">python</span> <span class="o">-</span><span class="n">m</span> <span class="n">evaluator</span><span class="p">.</span><span class="n">visualize_predictions</span> <span class="n">DATA_BLOCK</span><span class="o">/</span><span class="n">synth_data</span><span class="o">/</span><span class="n">test_private</span><span class="o">/</span><span class="n">orca_five_synth</span><span class="p">.</span><span class="n">ndjson</span> <span class="n">DATA_BLOCK</span><span class="o">/</span><span class="n">synth_data</span><span class="o">/</span><span class="n">test_pred</span><span class="o">/</span><span class="n">lstm_vanilla_None_modes1</span><span class="o">/</span><span class="n">orca_five_synth</span><span class="p">.</span><span class="n">ndjson</span> <span class="o">--</span><span class="n">n</span> <span class="mi">10</span> <span class="o">--</span><span class="n">random</span>
</code></pre></div></div>
<h1 id="done-done">Done Done</h1>
<p>I hope this blog provides you with the necessary kickstart for using TrajNet++. If you have any questions, feel free to post issues on <a href="https://github.com/vita-epfl/trajnetplusplusbaselines">Github</a>. If you liked using TrajNet++, a token of appreciation to parth.kothari@epfl.ch would really go a long way for me ! :)</p>