TrajNet++ : Dataset Conversion
In this blog post, I provide a quick tutorial on converting external datasets into the required .ndjson format using the TrajNet++ framework. The post focuses on using the TrajNet++ dataset code to convert new datasets easily.
Details
In this tutorial, I will convert the ETH dataset utilized by the Social GAN paper.
Step 1: Downloading Data
Before proceeding, please set up the base repositories. See ‘Setting Up Repositories’ here.
## Checkout 'eth' branch of Trajnetplusplusdataset
git checkout -b eth origin/eth
## Download external data
sh download_data.sh
cp -r datasets/eth/ data/
Step 2: Converting Raw Data
- In our external dataset, each trajectory point is delimited by ‘\t’
- TrackRow takes the arguments ‘frame’, ‘ped_id’, ‘x’, ‘y’ in order.
def standard(line):
    line = [e for e in line.split('\t') if e != '']
    return TrackRow(int(float(line[0])),
                    int(float(line[1])),
                    float(line[2]),
                    float(line[3]))
Code snippet already provided in readers.py
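To try the reader outside the framework, here is a minimal self-contained sketch. The `TrackRow` below is only a stand-in namedtuple that I assume holds the four fields listed above (the class in the TrajNet++ code may carry more), and the sample line is made up for illustration.

```python
from collections import namedtuple

# Stand-in for the framework's TrackRow (an assumption for this sketch):
# only the four fields named in the post, in the same order.
TrackRow = namedtuple('TrackRow', ['frame', 'pedestrian', 'x', 'y'])


def standard(line):
    """Parse one tab-delimited line of the form: frame, ped_id, x, y."""
    # Drop the trailing newline and any empty fields from repeated tabs.
    fields = [e for e in line.strip().split('\t') if e != '']
    return TrackRow(int(float(fields[0])),   # frame, stored as e.g. "780.0"
                    int(float(fields[1])),   # pedestrian id, e.g. "1.0"
                    float(fields[2]),        # x coordinate
                    float(fields[3]))        # y coordinate


# Made-up sample line, not taken from the actual dataset.
row = standard('780.0\t1.0\t8.46\t3.59\n')
print(row)
```

The `int(float(...))` pattern is needed because the frame and pedestrian id are written as floats (such as `780.0`) in the raw file, and `int('780.0')` would raise a `ValueError`.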
Dataset Generation and Categorization
For dataset conversion, convert.py applies the raw-data reader shown above (readers.standard) to every line of the input file.
def standard(sc, input_file):
    print('processing ' + input_file)
    return (sc
            .textFile(input_file)
            .map(readers.standard)
            .cache())
Code snippet already provided in convert.py
Finally, we run conversion and categorization on the appropriate data files (see convert.py).
python -m trajnetdataset.convert --obs_len 8 --pred_len 12
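As a rough illustration of what the two arguments mean: with `--obs_len 8 --pred_len 12`, each sample spans 20 frames, of which the first 8 are observed and the remaining 12 are to be predicted. The sketch below only shows that split on a made-up track; `split_track` is a hypothetical helper, not a function from the TrajNet++ code, and it says nothing about how convert.py chunks the data internally.

```python
OBS_LEN = 8    # frames the model observes
PRED_LEN = 12  # frames the model must predict


def split_track(points, obs_len=OBS_LEN, pred_len=PRED_LEN):
    """Split one window of points into (observed, to_predict)."""
    if len(points) < obs_len + pred_len:
        raise ValueError('need at least obs_len + pred_len points')
    observed = points[:obs_len]
    to_predict = points[obs_len:obs_len + pred_len]
    return observed, to_predict


# Made-up straight-line track with 20 (x, y) points.
points = [(0.1 * t, 0.0) for t in range(20)]
observed, to_predict = split_track(points)
print(len(observed), len(to_predict))
```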
Now that the dataset is ready (in the output folder), you can train the model! :)
Differences in generated data
- Partial tracks are now included (for correct occupancy maps)
- Pedestrians that appear in multiple chunks had the same id before (might be a problem for some input readers)
- Explicit index of scenes with annotation of the primary pedestrian
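To check a generated file, you can read it line by line, since .ndjson stores one JSON object per line. The sketch below counts the record types and collects the primary pedestrian ids. The key names (`scene`, `track`, and `p` for the pedestrian id) are my assumption about the output layout, and the sample lines are made up, so inspect a real output file before relying on them.

```python
import json
from collections import Counter


def summarize(lines):
    """Count record types and collect primary pedestrian ids from ndjson lines."""
    counts = Counter()
    primaries = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for key in record:
            counts[key] += 1
        # Assumed layout: scene records name their primary pedestrian under 'p'.
        if 'scene' in record and 'p' in record['scene']:
            primaries.add(record['scene']['p'])
    return counts, primaries


# Made-up sample lines in the assumed layout (not real converter output).
sample = [
    '{"scene": {"id": 0, "p": 1, "s": 0, "e": 190}}',
    '{"track": {"f": 0, "p": 1, "x": 8.46, "y": 3.59}}',
    '{"track": {"f": 0, "p": 2, "x": 9.57, "y": 3.79}}',
]
counts, primaries = summarize(sample)
print(dict(counts), primaries)
```

For a real file, pass an open file handle instead of the list, for example `summarize(open(path))`, where `path` points to one of the generated .ndjson files.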
Summarizing
So, to convert any external dataset, all you need to do is follow four simple steps:
- Download data and place it in the /data folder.
- Edit readers.py so that each line of the raw format is parsed into a TrackRow.
- Call that reader from convert.py.
- Call the dataset generation code with the appropriate arguments.
We recently released the TrajNet++ Challenge for agent-agent based trajectory forecasting. Details regarding the challenge can be found here.