Build your own image dataset from a search engine

Finding or getting the right dataset is a painful process. Here’s an idea: let’s scrape images from a search engine.

Combining each category with a set of variations produces a series of queries to send to the search engine. Each query returns image URLs, and each image is then downloaded from its URL. There is source code that scrapes close to 1M images overnight (I do not own the code; however, I think it is very useful). This is really helpful for getting training sets for a classifier.
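To make the category-and-variation idea concrete, here is a minimal sketch in Python. It is not the source code mentioned above; the function names (`build_queries`, `filename_for`, `download_image`) and the example categories are made up for illustration. The step that turns a query into a list of image URLs is deliberately left out, because it depends on which search engine or API you use.

```python
import hashlib
import itertools
import urllib.request
from pathlib import Path


def build_queries(categories, variations):
    """Pair every category with every variation to form search queries."""
    return [
        f"{variation} {category}"
        for category, variation in itertools.product(categories, variations)
    ]


def filename_for(url, suffix=".jpg"):
    """Derive a stable file name from the URL so a repeated URL maps to one file.

    Assumption: every image is saved with a .jpg suffix, whatever its real type.
    """
    return hashlib.sha1(url.encode("utf-8")).hexdigest() + suffix


def download_image(url, out_dir, timeout=10):
    """Download a single image URL into out_dir and return the saved path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / filename_for(url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        target.write_bytes(response.read())
    return target


# Hypothetical example: 2 categories x 2 variations gives 4 queries.
queries = build_queries(["cat", "dog"], ["black", "sleeping"])
print(queries)
```

Each query in `queries` would be sent to the search engine, and every URL that comes back would be passed to `download_image`. Naming files by a hash of the URL is one simple way to avoid saving the same URL twice when different queries return it.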

The pipeline for developing the dataset. Original diagram by D Grossman.

