I am thinking about what it would take to train a model that can detect whether an image contains any kind of advertisement.
I know this sounds too broad, not just as a question on CV but as a task for the model itself.
There are numerous problems like:
- The non-standard format of advertisements.
- The fact that ads can contain pictures as well as plain text, and those pictures will themselves show ordinary objects.
- The fact that in most cases ads are part of other objects, for example the front page of a magazine, a TV screen at a given moment, the contents of a billboard, a leaflet on the windshield of a car, etc.
Still, I’d like to make an attempt, so I am wondering what the ideal dataset for training a model on this task would look like.
What I’ve come up with is to use a dataset of company logos and train a model to detect logos in pictures.
Yet this strategy would eventually lead to more problems, such as:
- False positives, because company logos also appear on the products themselves and not only in advertisements for them. This particular problem could perhaps be solved if there were a way to configure the model to mark an object (a logo in this case) only if it occupies more than X% of the picture. For example, a logo on a car is small relative to the car, whereas in a magazine advertisement the logo is much larger in proportion to the car shown.
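To make the X% idea concrete, here is a minimal sketch of it as a post-processing step on the detector's output rather than a setting of the model itself. It assumes the detector returns pixel bounding boxes with a confidence score; the `Detection` class, the function names, and the default thresholds (5% of the image area, score 0.5) are all hypothetical placeholders of mine, not part of any particular library.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One hypothetical detector output; box is (x_min, y_min, x_max, y_max) in pixels."""
    label: str
    score: float
    box: Tuple[float, float, float, float]


def area_fraction(box, image_width, image_height):
    """Return the fraction of the image area covered by the box."""
    x_min, y_min, x_max, y_max = box
    # Clip the box to the image so out-of-bounds coordinates do not inflate the area.
    x_min, x_max = max(0, x_min), min(image_width, x_max)
    y_min, y_max = max(0, y_min), min(image_height, y_max)
    box_area = max(0, x_max - x_min) * max(0, y_max - y_min)
    return box_area / (image_width * image_height)


def keep_large_logos(detections: List[Detection], image_width, image_height,
                     min_fraction=0.05, min_score=0.5) -> List[Detection]:
    """Keep only confident detections that cover at least min_fraction of the image.

    Both thresholds are assumed values and would need tuning on real data.
    """
    return [
        d for d in detections
        if d.score >= min_score
        and area_fraction(d.box, image_width, image_height) >= min_fraction
    ]


if __name__ == "__main__":
    detections = [
        Detection("logo", 0.9, (0, 0, 300, 300)),   # large logo, e.g. in a magazine ad
        Detection("logo", 0.9, (500, 500, 550, 550)),  # small logo, e.g. a badge on a car
    ]
    for d in keep_large_logos(detections, image_width=1000, image_height=1000):
        print(d.label, d.box)
```

A filter like this only compares the logo to the whole picture, so a close-up photo of a product's badge would still pass it; comparing the logo's box to the box of the object it sits on would need a second detector.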
So, any ideas on which criteria I should take into consideration to create a useful dataset for this task are welcome.