Meta has presented an AI tool that is meant to recognize objects in photos without having been trained on those objects. Until now, such tools only worked for the kinds of objects represented in their training data. Meta also claims that its dataset is much larger than those used for earlier tools.
Meta calls the tool Segment Anything. It has been trained on about eleven million images and a total of about one billion masks on those images. Humans helped train the model by annotating the masks and giving feedback on them. As a result, the model can now pick out objects that it has not seen before.
Meta has also published a paper and a demo online. With the demo, users can upload their own photos and have the system create masks for them. The demo does not put labels on the objects, but it does show exactly where the model draws the boundaries of each object. The model works with ‘prompting’: given a prompt such as a point in the image, Segment Anything estimates how likely it is that the point belongs to a certain object.
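To give a rough idea of what point prompting means, the Python sketch below takes a clicked point and returns a mask for the region around it. It is a toy stand-in, not Meta's model: the function name `point_prompt_mask`, the brightness-based score and the `tolerance` and `threshold` parameters are all assumptions made for illustration. Segment Anything uses a trained neural network to produce its scores, not a simple brightness comparison.

```python
from collections import deque


def point_prompt_mask(image, point, tolerance=30.0, threshold=0.5):
    """Toy point prompt: return a boolean mask for the region around `point`.

    `image` is a list of rows of grayscale values. This is a hypothetical
    illustration and does not reproduce how Segment Anything works inside.
    """
    rows, cols = len(image), len(image[0])
    r0, c0 = point
    seed = image[r0][c0]

    # Score every pixel between 0 and 1: 1.0 for the same brightness as the
    # clicked pixel, dropping linearly to 0.0 at `tolerance` levels away.
    score = [
        [max(0.0, 1.0 - abs(image[r][c] - seed) / tolerance) for c in range(cols)]
        for r in range(rows)
    ]

    # Breadth-first search from the clicked point: keep pixels that score at
    # least `threshold` and are connected to the point (up, down, left, right).
    mask = [[False] * cols for _ in range(rows)]
    mask[r0][c0] = True
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and score[nr][nc] >= threshold:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask
```

Clicking on a bright pixel in a small test image would mark only the connected bright region, while clicking on a dark pixel would mark the dark region instead. As in the demo, the result is a boundary without a label: the mask says where the object is, not what it is.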
In the future, Meta wants to use the technology to let AR glasses recognize objects that the model has not seen before. It should then be possible to perform actions on those objects in software. Segment Anything is by no means the first object recognition algorithm. Such algorithms are already part of the camera software of every modern smartphone, although those mainly work with objects from their training data.