Better Photo Search for macOS with Clarifai

Introduction

I’ve previously written about a method of enabling Spotlight search of Photos.app libraries using tags generated by photoanalysisd that are otherwise only available through Photos.app.

But photoanalysisd is limited to something like 1,000 unique subjects, while some commercial services advertise recognition of 10x that number, and might be expected to produce more relevant tags.

This post considers Clarifai along with photoanalysisd, comparing their performance on the task of classifying pictures from my photo library.

No Useful Benchmarks

Commercial image classification services like Clarifai and Rekognition seem oddly reluctant to publish competitive benchmarks, which might help someone choose between them.

Some vendors tout performance in academic competitions like the ILSVRC, but this isn’t a direct test of their commercial offerings and is therefore only so useful.

Clarifai, however, do make one interesting claim about their general model: that it classifies 11,000 subjects. But is that a lot compared with Google, for instance? We don’t actually know.

(Nor do we know what those 11,000 subjects are and whether Clarifai reliably detects them.)

To get a sense of how Clarifai compares with Photos.app on the task of classifying an individual’s Photo library, I submitted several thousand pictures from mine.

You can use the search field below to get a sense of the breadth and specificity of the tags generated by each service, as well as the frequency with which particular tags occur.

Key: Clarifai / Photos.app

Of course these are all pictures from my library, and therefore biased towards my interests, but the code is available on GitHub.

Based on this experiment, and in view of my expectations, I’m impressed with the performance of Apple’s classifier. As I begin typing a query, Apple often seems to anticipate it with a relevant tag.

But Clarifai adds a great deal of breadth, even if some of the tags don’t correspond with likely search terms (e.g. “no person”).

So, it would be nice if Spotlight tags could be drawn from services like Clarifai and Rekognition as well as Photos.app.

Implementing Multi-Vendor Tagging

The approach I settled on was to maintain an SQLite database, separate from photos.db, that contains both the photoanalysisd tags and tags from third-party services.
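To make that concrete, here is a minimal sketch of the kind of schema such a database might use. The table and column names are hypothetical, chosen for illustration rather than taken from the actual tool.

import sqlite3

# Hypothetical schema for illustration; the real tool's layout may differ.
conn = sqlite3.connect("new.db")
conn.executescript("""
    CREATE TABLE IF NOT EXISTS photos (
        uuid TEXT PRIMARY KEY,     -- identifier carried over from photos.db
        path TEXT NOT NULL         -- path to the master image on disk
    );
    CREATE TABLE IF NOT EXISTS tags (
        photo_uuid TEXT NOT NULL REFERENCES photos(uuid),
        tag        TEXT NOT NULL,  -- e.g. "beach", "no person"
        source     TEXT NOT NULL,  -- "photoanalysisd", "clarifai", ...
        UNIQUE (photo_uuid, tag, source)
    );
""")
conn.commit()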

Find the tools referenced below on GitHub.

Initialize a database (here: new.db), populating it with information about your Photos.app library, and synchronize the two.

[...]$ ./alternative-classifiers.py --init              \
            --db new.db                                 \
            --photos-db ~/Pictures/..../photos.db       \
            --lib ~/Pictures/..../library.photoslibrary \
            --sync

(Please see the note here about stopping photolibraryd to release its lock on photos.db.)

Then add tags from a third-party service like Clarifai.

[...]$ export CLARIFAI_API_KEY="YOUR_API_KEY_HERE"
[...]$ ./alternative-classifiers.py --db new.db --clarifai 
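Under the hood, this step amounts to submitting each image to the predict endpoint of Clarifai's general model and recording the returned concepts. Here is a rough sketch of that call against the v2 REST API; the model ID, request body, and response layout follow Clarifai's public documentation as I understand it and may have changed, and the confidence threshold is an arbitrary choice.

import base64
import json
import os
import urllib.request

API_KEY = os.environ["CLARIFAI_API_KEY"]
GENERAL_MODEL = "aaa03c23b3724a16a56b629203edc62c"  # public "general" model ID

def clarifai_tags(image_path, min_confidence=0.9):
    # Send the image as base64 to the model's /outputs endpoint.
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    body = json.dumps({"inputs": [{"data": {"image": {"base64": b64}}}]}).encode()
    req = urllib.request.Request(
        "https://api.clarifai.com/v2/models/%s/outputs" % GENERAL_MODEL,
        data=body,
        headers={"Authorization": "Key " + API_KEY,
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        outputs = json.load(resp)["outputs"]
    # Keep only concepts the model is reasonably confident about.
    return [c["name"] for c in outputs[0]["data"]["concepts"]
            if c["value"] >= min_confidence]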

And finally, update the relevant xattrs so that Spotlight can index our new tags.

[...]$ ./alternative-classifiers.py --db new.db --xattrs
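For reference, the tags Spotlight indexes live in the com.apple.metadata:_kMDItemUserTags extended attribute, stored as a binary property list containing an array of strings. A minimal sketch of writing that attribute with the third-party xattr package is below; it is illustrative rather than a description of how alternative-classifiers.py does it.

import plistlib
import xattr  # third-party package: pip install xattr

TAG_ATTR = "com.apple.metadata:_kMDItemUserTags"

def write_spotlight_tags(image_path, tags):
    # Finder/Spotlight expect a binary plist holding an array of tag strings.
    payload = plistlib.dumps(list(tags), fmt=plistlib.FMT_BINARY)
    xattr.setxattr(image_path, TAG_ATTR, payload)

write_spotlight_tags("IMG_0083.JPG", ["no person", "outdoors"])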

Now it should be possible to find Clarifai-specific tags like No Person with Spotlight:

[...]$ mdfind 'tag:no person'
.../Masters/2015/04/22/20150422-011045/IMG_0083.JPG

Note: Adding New Tags to Photos.app

My last post on this subject showed how to extract tags from Photos.app, via a TEXT column in photos.db called RKVersion_stringNote.value. With this post, I had hoped it would be possible to do the reverse.
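For context, reading those tags is just a query against that column. A minimal sketch, assuming RKVersion_stringNote is an ordinary table with a value column as described in the previous post:

import sqlite3

conn = sqlite3.connect("photos.db")
# value is the TEXT column where the photoanalysisd-derived tags are stored.
for (value,) in conn.execute("SELECT value FROM RKVersion_stringNote"):
    print(value)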

The obvious thing to try (appending to RKVersion_stringNote.value and updating the plist where the list of categories is kept) doesn’t get the job done.

When the last post was discussed on Hacker News, several commenters pointed out how absurd this all is: it should be trivial to retrieve and update information about your own photographs. I agree with them entirely.

(However, if you have a suggestion of how to solve this particular problem, do let me know!)

Conclusion

Combining Clarifai’s tags with Apple’s has produced a marked improvement in the searchability of my Photo library.

I wonder whether adding tags from additional services, such as Rekognition or Google Cloud Vision, would add as much value, or at what point adding another general model becomes redundant.

Comments and suggestions to patrick.mcmurchie@gmail.com