A zero-shot object counting app counts objects without being told what to count. Just present an image, and the app automatically identifies visually similar objects in it and counts them.
This is enabled by an architecture consisting of a ViT-based encoder and a convolutional decoder, which extracts features from the image and turns them into a "heat map" where each object is represented by a dot. Conventional methods are then used to count the dots in the "heat map".
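The dot-counting step could look something like the sketch below. The description does not say which conventional method the app uses, so this is only an illustration under assumptions: the heat map is taken to be a 2D grid of values between 0 and 1, the threshold of 0.5 is arbitrary, and the hypothetical `count_dots` function counts 8-connected groups of above-threshold pixels with a flood fill.

```python
from collections import deque


def count_dots(heat_map, threshold=0.5):
    """Count connected blobs of pixels at or above `threshold`.

    `heat_map` is assumed to be a list of equal-length rows of floats.
    Both the function name and the default threshold are illustrative.
    """
    rows = len(heat_map)
    cols = len(heat_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0

    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels already assigned to a blob.
            if heat_map[r][c] < threshold or seen[r][c]:
                continue

            # Found an unvisited bright pixel: this starts a new dot.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])

            # Flood-fill the rest of the blob (8-connectivity).
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and heat_map[ny][nx] >= threshold):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# A toy 4x4 heat map with three separate bright regions.
example = [
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.8, 0.0, 0.7],
    [0.0, 0.0, 0.0, 0.0],
    [0.6, 0.0, 0.0, 0.0],
]
print(count_dots(example))  # prints 3
```

With a scheme like this, two objects whose dots touch would be merged into one blob, and a noisy bright pixel would be counted as an extra object, so the result depends heavily on the chosen threshold.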
Note that this app only provides a rough estimate of the object count. Estimates may be inaccurate or incorrect. Do not rely on the results.