Seven Head-Scratching Features from WWDC 2022

Customize Spatial Audio with TrueDepth Camera

This announcement came and went fairly quickly, but it had us scratching our heads immediately. The idea, it seems, is that spatial audio sounds more realistic if it can take into account aspects of the physicality of the listener that affect their perception of space. Apparently, this is a thing—called Head-Related Transfer Functions—and by capturing data using the iPhone’s TrueDepth camera, Apple could personalize the otherwise average HRTF that combines data from thousands of people.
I worked with HRTFs in grad school, trying to implement the filters in the wavelet domain (more here), so this is interesting to me. It looks like we’ll be able to use some combination of camera + lidar to capture the pinnae and derive personal HRTFs from that.

I cannot wait. Guess I’ll need to explore the spatial music and maybe movies now.


Progress in audio localization

So years ago in grad school at UNM, I worked on audio localization. My work was on doing the convolution (applying the filter to the audio) in the wavelet domain, to extend the results from a PhD student in the group. My results were poor; you can read about them on this page.
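
For anyone wondering what "applying the filter to the audio" means in practice, here is a minimal sketch of the plain time-domain version, not the wavelet-domain approach I was attempting. An HRTF has a time-domain counterpart, the head-related impulse response (HRIR), and you convolve the mono signal with one HRIR per ear. The function names and the impulse responses below are made up for illustration; real HRIRs are measured and run to hundreds of samples.

```python
def convolve(signal, kernel):
    """Direct-form (time-domain) convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binauralize(mono, hrir_left, hrir_right):
    """Filter one mono signal with a left/right impulse-response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Made-up impulse responses, NOT measured HRIRs: the right ear gets a
# delayed, quieter copy, roughly as if the source were off to the left.
hrir_left = [1.0, 0.5, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.6, 0.3]

# A single click as the input signal.
click = [1.0, 0.0, 0.0, 0.0]
left, right = binauralize(click, hrir_left, hrir_right)
```

Feeding in a click just reproduces each impulse response in its own channel, which is a handy sanity check. The delay and level difference between the two channels are the cues that place the sound off to one side.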

Then this week I saw this on Gear Patrol:

The reviews are mixed, but the really impressive part is less obvious – you use your phone camera and it uses photogrammetry to derive the geometry of your head and pinnae, and from that it creates a personal HRTF that it loads into the finger-sized gadget:

Damn. Deriving an HRTF from just a couple of pictures? As opposed to anechoic chambers and KEMAR heads?

I’m sorely tempted to buy one just to see how well it works. At $160 it’s a bit much for an impulse buy, though.

It’s really cool to see similar work, with a big chunk of progress plus some really new and clever ideas added. This version is USB, so Android/desktop, but they do have a Bluetooth version in the works that’d work with my phone.