Photo Metadata and Search on MacOS
This article has been discussed on Hacker News.
Apple’s Photos.app classifies pictures, identifying subjects such as “boat” and “bicycle,” as well as settings like “cafe” and “mountains.” It uses this capability to offer vastly better search than before.[1]
Unfortunately, these improvements are neither visible to Spotlight nor available in the Finder. This post documents a method of reconciling Photos.app metadata with filesystem metadata, so that the tags Photos.app assigns are indexed by Spotlight.
- The Finder, Extended Attributes, and Spotlight
- Photos.app Metadata: A SQLite Adventure
- Reconciling Photos.db With Spotlight Indices
The Finder, Extended Attributes, and Spotlight
Quite a bit of old information about tagging and searching for documents in MacOS remains in circulation. The Finder seems to recognize a few extended attributes for the purpose of associating tags and comments with files.[2]
To keep things simple, I will present only the method that has worked for me.
The com.apple.metadata:_kMDItemUserTags xattr is used to associate a plist of tags with a file. Spotlight reads this xattr, and indexes its contents. To assign the tags “mountain” and “alpine” to a file, for instance, you would create a plist:
<!DOCTYPE plist PUBLIC
"-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
<string>mountain</string>
<string>alpine</string>
</array>
</plist>
To associate this $plist with a $file, invoke the xattr command, as shown below.
[...]$ xattr -w "com.apple.metadata:_kMDItemUserTags" \
"$plist" "$file"
At this point, you should be able to find the file via the 🔎 icon, or the mdfind command, as shown below.
[...]$ mdfind tag:mountain
/path/to/matching/file
/path/to/another/matching/file
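The plist-and-xattr step above can also be scripted. What follows is a minimal sketch, not a definitive implementation: it builds the tag plist with Python's standard plistlib module and shells out to the same xattr command shown earlier. The helper names build_tag_plist and write_tags are my own, and write_tags only makes sense on a Mac.

```python
import plistlib
import subprocess

# The extended attribute discussed above.
TAG_XATTR = "com.apple.metadata:_kMDItemUserTags"


def build_tag_plist(tags):
    """Serialize a list of tag strings as an XML property list (an <array> of <string>s)."""
    return plistlib.dumps(list(tags), fmt=plistlib.FMT_XML).decode("utf-8")


def write_tags(path, tags):
    """Attach tags to `path` by invoking `xattr -w`, as in the shell example (macOS only)."""
    subprocess.run(
        ["xattr", "-w", TAG_XATTR, build_tag_plist(tags), path],
        check=True,
    )


if __name__ == "__main__":
    # Only prints the plist; call write_tags() yourself against a real file.
    print(build_tag_plist(["mountain", "alpine"]))
```

Building the plist with a library rather than by hand sidesteps typos in the markup and takes care of escaping characters such as “&” in tag names.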
Photos.app Metadata: A SQLite Adventure
Photos.app does its accounting with a SQLite database, which we’ll call $photodb, and set thus:
[...]$ cd ~
[...]$ cd Pictures
[...]$ cd Photos\ Library.photoslibrary
[...]$ cd database
[...]$ photodb="$PWD/photos.db"
Unfortunately, $photodb is probably open, and locked.
[...]$ lsof "$photodb"
COMMAND PID USER FD TYPE DEVICE [...snip...]
photolibr 15432 patrick 4u REG 1,1 [...snip...]
The trouble with just killing photolibraryd is that it will re-spawn, repeatedly. Undoubtedly, launchd can be told to disable photolibraryd, but the approved mechanism wasn’t immediately obvious to me.
Stop the photolibraryd service with launchctl:
[...]$ launchctl stop photolibraryd
Initially, I opted for an egregious hack. In order to lock its database, photolibraryd needs to be able to write to it, so I simply removed write permissions.
[...]$ chmod -w "$photodb"
[...]$ lsof "$photodb" |
awk '{ print $2}' |
egrep -v PID |
xargs kill
Afterwards (but not yet), don’t forget to restore write permissions[3] to your photos database! You may want to restart your computer to ensure that photolibraryd continues to work.
Now, it’s possible to open $photodb, and poke around.
[...]$ sqlite3 "$photodb"
SQLite version 3.14.0 2016-07-26 15:17:14
Enter ".help" for usage hints.
sqlite>
Review the schema with the .schema command. Quit by sending Ctrl-d, or with the .quit command. Tags (and a lot else) are kept in the RKVersion_stringNote table.
sqlite> .headers on
sqlite> SELECT * FROM RKVersion_stringNote;
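Since the final procedure below works from a copy of photos.db anyway, the same poking around can be done from Python against a copy, leaving the live database alone. This is a sketch under that assumption, using only the standard library; open_copy_readonly and list_tables are hypothetical helper names, and a copy taken while Photos.app is running may lag behind its most recent changes.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def open_copy_readonly(db_path):
    """Copy the database into a fresh temp directory and open the copy read-only."""
    workdir = Path(tempfile.mkdtemp(prefix="photosdb-"))
    copy = workdir / Path(db_path).name
    shutil.copy2(db_path, copy)
    # mode=ro makes SQLite refuse writes, so the copy cannot be modified by accident.
    return sqlite3.connect(copy.as_uri() + "?mode=ro", uri=True)


def list_tables(conn):
    """Return the names of all tables, roughly what .schema lets you eyeball."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]
```

With a connection in hand, list_tables(conn) should include RKVersion_stringNote if your library matches the layout described here.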
We need to find the appropriate value of keyPath, since this will vary between systems. The following snippet should suffice to find it:
[...]$ KEYPATH=$( sqlite3 "$photodb" .schema |
grep 'RKVersion_stringNote_skIndexUpdateTrigger' |
grep -Eo '[0-9]{1,4}' )
Okay, but did it work?
[...]$ echo "$KEYPATH"
719
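The same lookup can be sketched in Python by reading the trigger's definition out of sqlite_master instead of grepping the .schema output. This assumes, as the shell snippet implicitly does, that the only number appearing in the trigger's SQL is the keyPath value; find_keypath is a hypothetical helper, and it raises an error rather than guessing when that assumption does not hold.

```python
import re
import sqlite3

# Trigger name taken from the shell snippet above.
TRIGGER = "RKVersion_stringNote_skIndexUpdateTrigger"


def find_keypath(conn, trigger=TRIGGER):
    """Return the single integer found in the trigger's SQL, or raise."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'trigger' AND name = ?",
        (trigger,),
    ).fetchone()
    if row is None:
        raise LookupError(f"trigger {trigger!r} not found")
    # Assumption: the keyPath is the only number in the trigger definition.
    numbers = set(re.findall(r"\d+", row[0]))
    if len(numbers) != 1:
        raise ValueError(f"expected exactly one number, found {sorted(numbers)}")
    return int(numbers.pop())
```

Unlike grep -Eo '[0-9]{1,4}', this matches whole digit runs, so a value longer than four digits would not be split into pieces.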
Now we can find the strings containing our tags, and associate them with filesystem paths. Be sure to substitute the value of $KEYPATH determined above for the literal 719 below.
sqlite> SELECT RKMaster.imagePath, RKVersion_stringNote.value
FROM RKVersion_stringNote
INNER JOIN RKMaster
ON RKVersion_stringNote.attachedToId = RKMaster.modelId
WHERE RKVersion_stringNote.keyPath = 719; /* !!! */
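To avoid editing the literal by hand, the query above can be run from Python with the keyPath bound as a parameter. A minimal sketch, assuming the table and column names shown in the query; tagged_paths is a hypothetical helper name.

```python
import sqlite3

# The join from the text, with the keyPath value left as a bound parameter.
QUERY = """
SELECT RKMaster.imagePath, RKVersion_stringNote.value
FROM RKVersion_stringNote
INNER JOIN RKMaster
    ON RKVersion_stringNote.attachedToId = RKMaster.modelId
WHERE RKVersion_stringNote.keyPath = ?
"""


def tagged_paths(conn, keypath):
    """Return (imagePath, value) pairs for the rows carrying the given keyPath."""
    return conn.execute(QUERY, (keypath,)).fetchall()
```

Each returned pair corresponds to one imagePath|value record of the kind the sqlite3 shell prints.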
Records will resemble the following:
2015/04/25/20150425-035012/DSC00435.JPG|DSC00435.JPG \
00435.JPG JPG October 2012 Outdoor Outside Outdoors \
Outsides Land Lands Mountain Mounts Peak Sierra \
Sierras Peaks Mountains Mount
Tags aren’t quoted, but always come last. The trouble is that it’s not obvious which are 1-grams, 2-grams, or n-grams. Various collisions are possible, both among tags and between tags and other substrings.
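To make the ambiguity concrete, here is a small sketch that enumerates every candidate 1-gram and 2-gram in a whitespace-separated string. It is only an illustration of the problem, not the approach photos2spotlight.py takes; candidate_ngrams is a hypothetical helper, and the sample string in the note below is made up from tags that appear in this post.

```python
def candidate_ngrams(text, max_n=2):
    """List every run of 1..max_n adjacent tokens, shortest runs first."""
    tokens = text.split()
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams
```

For the string "Interior Room Indoors" this yields "Interior", "Room", "Indoors", "Interior Room", and "Room Indoors". Nothing in the string itself says that "Interior Room" is a real tag while "Room Indoors" is an accident of adjacency.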
Reconciling Photos.app metadata with filesystem metadata
Okay, let’s put the pieces together. We need four things:
- a copy of photos.db in a write-able location
- the system-specific value of $KEYPATH
- the path to our photos library, $PHOTOLIB
- photos2spotlight.py (listing here)
Start by getting an idea of what your library contains:
[...]$ ./photos2spotlight.py --stats \
--db /path/to/copy/of/photos.db \
--lib "$HOME/Pictures/Photos Library.photoslibrary/" \
--keypath "$KEYPATH"
[...snip...]
16 Duds
16 Clothing
16 Accoutrements
16 Accoutrement
16 Clothings
16 Apparels
18 Insides
18 Interior Rooms
18 Inside
18 Interior Room
18 Indoors
By default, photos2spotlight.py will make a dry run. Use the --write flag to modify filesystem metadata.
[...]$ ./photos2spotlight.py --stats \
--db /path/to/copy/of/photos.db \
--lib "$HOME/Pictures/Photos Library.photoslibrary/" \
--keypath "$KEYPATH" \
--write
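The dry-run-by-default behaviour is a pattern worth copying in any tool that touches filesystem metadata. The sketch below shows one way it could be arranged; it is not the listing of photos2spotlight.py. The flag names mirror the commands above, while build_parser, apply_tags, and the writer callback are hypothetical.

```python
import argparse


def build_parser():
    """Command-line flags mirroring the invocations shown above (a sketch only)."""
    parser = argparse.ArgumentParser(description="dry-run-by-default tagging sketch")
    parser.add_argument("--db", required=True, help="path to a copy of photos.db")
    parser.add_argument("--lib", required=True, help="path to the photos library")
    parser.add_argument("--keypath", type=int, required=True)
    parser.add_argument("--stats", action="store_true", help="print tag counts")
    parser.add_argument("--write", action="store_true", help="actually modify files")
    return parser


def apply_tags(path, tags, write, writer):
    """Report what would happen; call `writer` only when `write` is set."""
    if not write:
        return f"would tag {path}: {', '.join(tags)}"
    writer(path, tags)
    return f"tagged {path}: {', '.join(tags)}"


if __name__ == "__main__":
    # Explicit argument list for demonstration; a real tool would read sys.argv.
    args = build_parser().parse_args(
        ["--db", "photos.db", "--lib", "/path/to/library", "--keypath", "719"]
    )
    print(apply_tags("IMG_0083.JPG", ["Inside"], args.write, writer=lambda p, t: None))
```

Passing the writer in as a callback keeps the decision to touch the filesystem in one place, and makes the dry-run path easy to exercise without a Mac.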
You should now be able to find photos using Spotlight, the Finder, or the related mdfind command. For example:
[...]$ mdfind 'tag:Inside'
.../Masters/2015/04/22/20150422-011045/IMG_0083.JPG
[...snip...]
Conclusion
That’s it. Macs are easy, right?
Continue to Part 2.
1. Google Photos does this too. It’s dramatically less tedious than manual tagging.
2. Both kMDItemFinderComment and kMDItemOMUserTags are commonly suggested as well. As far as I can tell, the former is inappropriate, while the latter is disused, except by legacy applications.
3. [...]$ chmod u+w "$photodb"