1801.06397

Yana edited this page Apr 17, 2018 · 1 revision

IJCV 2018

[arxiv 1801.06397] What Makes Good Synthetic Training Data for Learning Disparity and Optical Flow Estimation? [PDF] [notes]

Nikolaus Mayer, Eddy Ilg, Philipp Fischer, Caner Hazirbas, Daniel Cremers, Alexey Dosovitskiy, Thomas Brox

read 17/04/2018

Objective

Evaluate, on RGB synthetic images, which properties of synthetic training data improve the quality of learned disparity and optical flow estimation

Explain why FlyingChairs (2D motion of objects in front of background images), although simpler, works better than FlyingThings3D both on the real dataset KITTI 2015 and on the synthetic dataset Sintel

Synthesis

  • Data generated using Blender

Use 3 datasets

  • FlyingChairs / FlyingThings3D
  • Monkaa (from an animated movie, limited variability)
  • Driving, with basic cubes/cylinders to model buildings plus car models; useful for learning driving priors (e.g. the road is usually a flat surface, ...)

Data augmentation

  • color changes : change brightness, contrast, and colors; add color noise
  • geometric changes : shift, rotation, scaling, applied either to one image or to both

Color augmentation has less impact on performance than geometric augmentation; combining both leads to the best results
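The two augmentation families above can be sketched as follows. This is a hypothetical minimal NumPy sketch, not the paper's actual pipeline; parameter ranges are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def color_augment(img):
    """Random brightness/contrast change plus additive color noise.

    Assumes img is a float array in [0, 1]; ranges below are illustrative.
    """
    brightness = rng.uniform(-0.2, 0.2)                 # additive shift
    contrast = rng.uniform(0.8, 1.2)                    # scale around mid-grey
    out = (img - 0.5) * contrast + 0.5 + brightness
    out = out + rng.normal(0.0, 0.02, size=img.shape)   # per-pixel color noise
    return np.clip(out, 0.0, 1.0)

def geometric_augment(frame1, frame2, flow):
    """Shift both frames by the same random offset (translation-only example).

    Because both frames move together, the ground-truth flow is unchanged;
    shifting only one frame would instead add the offset to the flow.
    """
    dy, dx = rng.integers(-4, 5, size=2)
    shift = lambda a: np.roll(a, (int(dy), int(dx)), axis=(0, 1))
    return shift(frame1), shift(frame2), shift(flow)
```

Applying geometric transforms to only one image of the pair is what makes them change the ground-truth flow, which is why they are listed separately from color changes.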

Evaluate on real datasets

  • Quality of textures

For the initial experiments, the same real-world Flickr photographs are used as both background and object textures; other texture types are then compared: Plasma (clear boundaries between patches of color), Clouds (a composition of color noise at different scales), and the real-image Flickr textures

--> Plasma < Clouds < Flickr textures: the real-image Flickr textures produce the best results on Sintel, so natural images as textures seem to work well
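A "Clouds"-style texture (color noise composed at several spatial scales) can be generated roughly like this. A hypothetical sketch, assuming nearest-neighbour upsampling and power-of-two grids; the paper's exact procedure is not specified here.

```python
import numpy as np

def clouds_texture(size=256, octaves=5, seed=0):
    """Compose color noise at different scales into one texture.

    Coarse noise grids are upsampled to full resolution and accumulated
    with decreasing weight, giving smooth large-scale color variation
    plus fine-grained detail.
    """
    rng = np.random.default_rng(seed)
    tex = np.zeros((size, size, 3))
    for o in range(octaves):
        n = 2 ** (o + 2)                       # noise grid resolution
        noise = rng.random((n, n, 3))
        reps = size // n                       # nearest-neighbour upsample
        up = np.repeat(np.repeat(noise, reps, axis=0), reps, axis=1)
        tex += up / (2 ** o)                   # finer scales weigh less
    return tex / tex.max()                     # normalize to [0, 1]
```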

  • lighting Use FlyingChairs remodeled in 3D, rendered with 3 different levels of lighting realism:
    • shadeless (no lighting effect, uniform flat color or texture)
    • static (uniform lighting in all directions, no self-shadowing or mutual shadowing)
    • dynamic lighting (ray tracing from a lamp at randomized directions, with realistic shadows)

Conclusion --> shadeless < dynamic < static: static works slightly better than dynamic, but this is measured on Sintel, which is itself synthetic rather than real

  • Size of training data Without augmentation, error decreases roughly linearly with the log of the training-set size; with data augmentation (color and geometric), adding more data brings no significant further improvement
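The reported trend amounts to fitting error = a + b·log(size) with b < 0. A small self-contained least-squares sketch (hypothetical helper, not from the paper):

```python
import math

def fit_log_linear(sizes, errors):
    """Least-squares fit of error = a + b * log(size).

    Illustrates the reported scaling: endpoint error falls roughly
    linearly in the log of the training-set size (so b < 0).
    """
    xs = [math.log(s) for s in sizes]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(errors) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, errors)) \
        / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b
```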

  • lens distortion/blur and Bayer artifacts (separate acquisition of R, G and B through a Bayer filter) Modelling them in the synthetic data produces qualitative improvements for lens distortion/blur and quantitative improvements on KITTI for Bayer interpolation artifacts
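Bayer interpolation artifacts can be simulated by sampling one color channel per pixel on an RGGB grid and then crudely re-filling the missing channels. A rough hypothetical sketch; real pipelines use bilinear or edge-aware demosaicing, and the paper's exact simulation is not reproduced here.

```python
import numpy as np

def bayer_artifacts(img):
    """Mosaic an RGB image on an RGGB pattern, then crudely demosaic it.

    Assumes img is float RGB with even height and width. Each output
    channel is filled by replicating that channel's mosaic samples over
    2x2 blocks (nearest-neighbour), which introduces Bayer-like artifacts.
    """
    h, w, _ = img.shape
    mosaic = np.zeros((h, w))
    mosaic[0::2, 0::2] = img[0::2, 0::2, 0]  # R samples
    mosaic[0::2, 1::2] = img[0::2, 1::2, 1]  # G samples (even rows)
    mosaic[1::2, 0::2] = img[1::2, 0::2, 1]  # G samples (odd rows)
    mosaic[1::2, 1::2] = img[1::2, 1::2, 2]  # B samples
    up = lambda a: np.repeat(np.repeat(a, 2, axis=0), 2, axis=1)
    out = np.empty_like(img)
    out[..., 0] = up(mosaic[0::2, 0::2])     # replicate R over 2x2 blocks
    out[..., 1] = up(mosaic[0::2, 1::2])     # use even-row G only (crude)
    out[..., 2] = up(mosaic[1::2, 1::2])     # replicate B over 2x2 blocks
    return out
```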

  • Scheduling Training on a mixture of FlyingChairs and FlyingThings3D performs worse than FlyingChairs alone; however, training on FlyingChairs and then fine-tuning on FlyingThings3D performs better than FlyingChairs alone --> intuition : too-complex data in the early stages is harmful, but helps once simple pixel correspondences have already been learned
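The winning schedule is sequential rather than mixed: exhaust the simple dataset first, then fine-tune on the complex one. A trivial hypothetical driver making that ordering explicit:

```python
def schedule(datasets, steps_per_stage):
    """Yield (dataset_name, step) pairs stage by stage.

    Curriculum sketch: all steps on the first dataset (e.g. FlyingChairs)
    are consumed before any step on the next (e.g. FlyingThings3D),
    unlike a mixture that interleaves them from the start.
    """
    for name, steps in zip(datasets, steps_per_stage):
        for step in range(steps):
            yield name, step
```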
