UNIX Magic poster

I went looking today for mounting options for my new! signed! Effin’ Birds poster and my first thought was how much I like the aluminum dibond on my UNIX Magic poster:

then I realized that I had never blogged it! So back in 2021, I read this story via Hacker News about a legendary 1970s conference poster full of inside Unix jokes and references, was instantly smitten and ordered one:

Anyway, the Unix poster is all that and continues to occupy a place of honor, right in the corner of my eye, because frankly my entire career has been Unix in various forms: Linux, NetBSD and OpenBSD, IRIX, SunOS, Solaris, AIX, and others that I can’t recall anymore. I’m counting macOS since yeah, it’s Unix with professional makeup, and so are Android and iOS.

(The time spent on OS/2, BeOS, Windows flavors, DESQview, GeoWorks, DOS… well, ahh well)

Now it’s time to add my signed EB poster – my better half bought this for me at Comic-Con 2025:

but you can’t buy the dibond sans a print, so I’ll probably cheap out and get this locally from Blicks.

So this is a mixed post: first, yeah, you should totally get a Unix poster, and you can DIY the mounting for much less than the fancy print. Second, Effin’ Birds is awesome; I also have his wall calendar and some conference tchotchkes.

Set up a private LLM server

Why do this?

The scenario: you want to play around with the fast-moving large language models, but don’t want every random query / thought / personal health question to be logged or trained on.

Requirements

You need a machine to run the LLMs, and it needs lots of memory. How much depends on the models, of course, but larger models are much more capable and useful: 16GB is the absolute minimum for a 7b-parameter model, and 27b models need 64GB or more. I’m using an M4 Max Mac Studio with 128GB, and that seems plenty.

The software is simple – ollama to manage and serve models, and the llm cli to run on your laptop. That way, the server does the heavy lifting and your laptop doesn’t. They don’t, however, work together out of the box, so I wanted to share some necessary steps.

Server setup with Ollama

On the server:

brew install ollama
brew services start ollama

Now we need some models. With lots of memory, I got the biggest models, but you should read the model list and choose a few that fit your hardware. This step pulls gigabytes, so beware on a metered connection.

ollama pull llama4:latest
ollama pull deepseek-r1:70b
ollama pull gemma3:27b
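
Once the pulls finish, you can confirm what’s installed and how much disk each model takes:

ollama list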

Now we need to tell Ollama to listen on all interfaces – this makes it reachable from other machines, since by default it binds only to localhost. There’s a GitHub issue about this if you want more details. We have to edit the plist:

open /opt/homebrew/opt/ollama/homebrew.mxcl.ollama.plist

In the plist, we have to add the OLLAMA_HOST environment variable, set to 0.0.0.0:11434.
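
The relevant stanza looks roughly like this (the rest of the plist stays as Homebrew wrote it):

<key>EnvironmentVariables</key>
<dict>
    <key>OLLAMA_HOST</key>
    <string>0.0.0.0:11434</string>
</dict>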

After that, save the plist and run

brew services restart ollama

To test the change, open the server’s URL in a browser. (My server’s hostname is axiom, FYI.)

open http://axiom:11434/

and you should see Ollama’s terse status message, “Ollama is running”.

That’s the server sorted – on to the client!

LLM client

The LLM cli by Simon Willison is just neat. It can do all sorts of things for you. We want to do the following:

  1. Install LLM
  2. Add our models using Ollama’s OpenAI compatible REST API to extra-openai-models.yaml
  3. Set the default model
  4. Set the sorta-not-needed API key (it’ll ask for it, give it the key ‘ollama’)

brew install llm
llm keys set ollama
llm keys path

Save that path – mine is

/Users/pfh/Library/Application Support/io.datasette.llm/keys.json

We need to create and populate
/Users/pfh/Library/Application Support/io.datasette.llm/extra-openai-models.yaml

Here’s a ChatGPT-generated shell script to query Ollama over HTTP and populate extra-openai-models.yaml:

#!/bin/bash

pushd /Users/pfh/Library/Application\ Support/io.datasette.llm

# Configuration
HOST="axiom.phfactor.net"
PORT="11434"
OUTPUT="extra-openai-models.yaml"
API_BASE="http://${HOST}:${PORT}/v1"
API_KEY_NAME="ollama"

# Fetch models from Ollama
echo "Fetching models from Ollama at ${HOST}:${PORT}..."
MODEL_LIST=$(curl -s "http://${HOST}:${PORT}/api/tags" | jq -r '.models[].name')

# Begin YAML
echo "# Auto-generated OpenAI-compatible model config for LLM CLI" > "$OUTPUT"
echo "# Source: http://${HOST}:${PORT}/api/tags" >> "$OUTPUT"
echo "" >> "$OUTPUT"

# Write each model as a separate YAML block
while IFS= read -r MODEL; do
  MODEL_ID="${MODEL//:/-}"  # replace colon with dash
  cat <<EOF >> "$OUTPUT"
- model_id: $MODEL_ID
  model_name: $MODEL
  api_base: "$API_BASE"
  api_key_name: $API_KEY_NAME
  can_stream: true
  supports_schema: true
  vision: true

EOF
done <<< "$MODEL_LIST"

echo "Wrote $OUTPUT with $(echo "$MODEL_LIST" | wc -l) models."

Here’s my models file:

# Auto-generated OpenAI-compatible model config for LLM CLI
# Source: http://axiom.phfactor.net:11434/api/tags

- model_id: qwen3-latest
  model_name: qwen3:latest
  api_base: "http://axiom.phfactor.net:11434/v1"
  api_key_name: ollama
  can_stream: true
  supports_schema: true
  vision: true

- model_id: deepseek-r1-70b
  model_name: deepseek-r1:70b
  api_base: "http://axiom.phfactor.net:11434/v1"
  api_key_name: ollama
  can_stream: true
  supports_schema: true
  vision: true

- model_id: llama4-latest
  model_name: llama4:latest
  api_base: "http://axiom.phfactor.net:11434/v1"
  api_key_name: ollama
  can_stream: true
  supports_schema: true
  vision: true

- model_id: gemma3-27b
  model_name: gemma3:27b
  api_base: "http://axiom.phfactor.net:11434/v1"
  api_key_name: ollama
  can_stream: true
  supports_schema: true
  vision: true

Set the default model – I’m still experimenting, but for now it’s

llm models default llama4-latest

Important!

The api_base there is not the one in the docs – I had to use ChatGPT (sauce for the gander) to figure out that you have to add the /v1 suffix or the llm cli will return 404s.
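
A quick way to sanity-check the endpoint (my hostname; adjust for yours) – the OpenAI-compatible routes all live under /v1, so this should return a JSON list of your models:

curl http://axiom.phfactor.net:11434/v1/models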

What do you get?

From my laptop, I can run random queries and it works great – no limits, ask anything (lots of stories about this!) and I feel oddly better about asking the basic, stupid tech questions that I “should” know or remember. For example:

  • What’s the makefile syntax for this?
  • Tell me how to glob the largest non-zero-file of episode.mp3 in a tree of files?
  • What’s the jq invocation to pull this?

Here’s a complete example, query and response:

llm 'nmap syntax to look for open tcp on a single host'
To use `nmap` to scan for open TCP ports on a single host, you can use the following basic syntax:

```bash
nmap -sT <hostname or IP address>
```

Here's a breakdown of the options used:
- `-sT` specifies a TCP connect scan, which is a type of scan that tries to establish a full TCP connection to the target ports. This is one of the most common scan types and works well for finding open TCP ports.

If you want to scan all 65,535 TCP ports (which can take a long time), you can use:

```bash
nmap -sT -p- <hostname or IP address>
```

However, for a more efficient and commonly used approach, scanning the top 1000 most common ports can be sufficient for many cases:

```bash
nmap -sT -p 1-1000 <hostname or IP address>
```

- `-p 1-1000` specifies the range of ports to scan. Adjust this range according to your needs.

### Additional Options

- **OS Detection:** `-O` (capital O, not zero) can help identify the operating system of the target, though this might not always be accurate or useful.
- **Version Detection:** `-sV` can provide information about the versions of services running on open ports.
- **Script Scanning:** `--script` can run NSE (Nmap Scripting Engine) scripts against the target.

### Full Command Example

For a more detailed scan:

```bash
nmap -sT -p 1-1000 -sV -O <hostname or IP address>
```

Make sure to replace `<hostname or IP address>` with the actual hostname or IP address of the target you're scanning.

### Permissions

Keep in mind that running `nmap` might require administrative or root privileges, especially for certain types of scans. If you run into issues, consider using `sudo` (on Unix-like systems) before your `nmap` command.

### Legal and Ethical Considerations

Always ensure you have permission to scan a target. Unauthorized scanning can be considered malicious and may lead to legal consequences.

Postscript

I opened an issue on the LLM cli, so their docs may be updated to document these steps.

Coding with ChatGPT

As I’ve written about before, LLMs are an amazing tool for programming. In addition to IDE plugins / autocomplete integrations (GitHub Copilot, Qodo), I’m finding utility in the pattern of keeping a running conversation in the dedicated ChatGPT app. I can ask anything there: snippets and one-offs and what-ifs. The full o1 model is really good.

Anyway, this is the current state – the side project (open source here) is an iOS app in SwiftUI that automatically captures context using Bluetooth iBeacons, with the goal of generating a log that can become contract-engineering billing records. The idea:

  1. Enter region OFFICE at 8:00AM.
  2. Leave region OFFICE at 6:00PM.
  3. Enter region CAR at 2:15PM at GPS location LAT, LONG.

The GPS idea is a possible way to compute miles driven; it may or may not work out. The regions will be marked using iBeacons: cheap, supported by the operating system, easy.

I don’t really know SwiftUI, so I just asked ChatGPT for code and started hacking. It works and continues to work! LLMs are amazing for learning this way. I can ask for random changes, explanations, error fixes; anything! Here’s today’s version of the app:

Current state showing real data.

Damn.

If you’ve not tried coding with an LLM, go try it. Treat it like a freshly graduated student – you can ask it anything, and it’ll know the most amazing ways to write code and solve problems. It’ll be wrong yet confident sometimes, but as a way of getting unstuck and having fun hacking, I’m having a ball.

Smokeping and Caddy 2: hard but worthwhile

Spent a few semi-pleasant hours today getting this working:

That’s Smokeping, proxied by Caddy 2 on my love-it-to-bits Raspberry Pi 4 web server.

Smokeping lets you track not just the yes-we-are-connected / no-we-are-not, but latency and packet loss and jitter. My network and ISP are generally solid, but this is an easy tool to have around. Once, that is, you have it installed.

I run Debian on my Pi, natch, and the wondrous Caddy to serve files and reverse proxy the various web apps. Not to mention automatic SSL certs from LetsEncrypt and the least verbose configuration possible. Smokeping, alas, uses the now-uncommon CGI interface, so gluing it all together took a while. Let me leave some notes for anyone else in this situation.

Basic install

apt install fcgiwrap
apt install smokeping
service fcgiwrap start
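
On Debian, fcgiwrap listens on a Unix socket; it’s worth a quick check that it exists, since it’s the same path the Caddyfile below points at:

ls -l /var/run/fcgiwrap.socket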

The /etc/smokeping/config.d directory has a bunch of files you’ll need to edit. In the General file:

cgiurl   = https://ping.phfactor.net/smokeping.cgi

Note that Caddy prefers CNAMEd virtual hosts, so I’m using ping.phfactor.net. You’ll need that in your DNS. Here’s the Caddyfile entry:

ping.phfactor.net {
	log {
		output file /var/log/caddy/ping.log
	}
	root * /usr/share/smokeping/www
	encode gzip
	file_server

	@cgi {
		path *.cgi
	}
	reverse_proxy @cgi unix//var/run/fcgiwrap.socket {
		transport fastcgi {
			split .cgi
			env SCRIPT_FILENAME /usr/share/smokeping/smokeping.cgi
		}
	}

	# Ensure CSS and JS files are served correctly
	@static {
		path /css/* /js/* /img/*
	}
	handle @static {
		file_server
	}

	# Try serving static files before falling back to CGI
	try_files {path} /{path}
}
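
Once that’s in place, reload Caddy and poke the CGI endpoint; something like this (assuming the Debian caddy package, and my hostname):

systemctl reload caddy
curl -I https://ping.phfactor.net/smokeping.cgi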

Kinda ugly. Might be some cleanup possible there. I also had to modify the HTML template file /etc/smokeping/basepage.html to remove the /smokeping/ prefix from the CSS and JS URLs:

<link rel="stylesheet" type="text/css" href="/css/smokeping-print.css" media="print">
<link rel="stylesheet" type="text/css" href="/css/smokeping-screen.css" media="screen">

...

<script src="/js/prototype/prototype.js" type="text/javascript"></script>
<script src="/js/scriptaculous/scriptaculous.js?load=builder,effects,dragdrop" type="text/javascript"></script>
<script src="/js/cropper/cropper.js" type="text/javascript"></script>
<script src="/js/smokeping.js" type="text/javascript"></script>

For now, I’m using the basic function of ICMP pings, but Smokeping supports more advanced tests such as SSH login and others.
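
For reference, adding hosts to graph is just an edit to the Targets file in that same config.d directory. A minimal sketch (the host and menu names are examples):

*** Targets ***

probe = FPing

menu = Top
title = Network Latency Grapher

+ Home
menu = Home
title = Home network

++ Router
menu = Router
title = Router
host = 192.168.1.1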

Note that Safari seems a bit confused by Smokeping graphs, and caches old ones longer than it should. Chrome and Firefox do it right. Odd.

The results are pretty cool though.

LLMs can solve hard problems

We’re a couple of years into the LLM era now, and the Gartner hype cycle from last year seems relevant:

Image credit: Gartner – https://www.gartner.com/en/articles/what-s-new-in-artificial-intelligence-from-the-2023-gartner-hype-cycle

The purpose of this post is to share two hard problems (in the CS sense of the term) that a friend and I solved with an LLM.

The problem and impetus

I’m friends with a couple of guys who have a podcast. (I know, right?) It’s been going for seven years or more now, it’s information-packed, and more than once I’ve wanted to search for something previously mentioned. Then, via Simon Willison I think, I learned about the amazing Whisper.cpp project that can run OpenAI’s Whisper speech-to-text model at high speed on desktop hardware. As others have said, “speed is a feature,” and being able to do an hour of audio in a few minutes on a MacBook Pro or Mac mini made the project interesting and feasible.

The project and goals

The overall goal of the project was to generate transcripts of every episode, index them with a local search engine, and serve the results as a static website. Open source code, no plan to monetize or commercialize, purely for the fun of it.

The code uses Python for logic, Makefiles for orchestration, wget for web downloads, mkdocs for website generation, xmltodict for the RSS parsing, Tenacity for LLM retries and rsync to deploy code. Nothing too exciting so far.

This let us generate a decent website. However, it was quickly obvious that Whisper would not suffice, since it doesn’t indicate who is speaking, a feature known as ‘diarization.’ After sharing the work on the TGN Slack, a member offered his employer’s product as a potential improvement: WhisperX, hosted on OctoAI, includes diarization. So instead of this wall of text

you get something more like this (shown processed a bit)

Enter the LLM

So now we have roughly an hour’s worth of audio, as JSON, with labels as to who is speaking. But the labels are not the names of the speakers; ‘SPEAKER_00’ isn’t helpful. We need the names.

We need to somehow process 50-100KB of text, with all of the peculiarities of English, and extract from it the names of the speakers. Normally it’s the same two guys, but sometimes they do call-in type episodes with as many as 20 callers.

This is super hard to do with programming. I tried some crude “look for their usual intros” logic but it only worked maybe half of the time, and I didn’t want to deep-dive into NLP and parsing. At my day job, I was working on LLM-related things so it made sense to try one, but our podcasts were too large for the ChatGPT models available. Then came Claude, with its 200k-token context window, and we could send an entire episode in a single go.

The code simply asks Claude to figure out who’s speaking. Here is the prompt:

The following is a public podcast transcript. Please write a two paragraph synopsis in a <synopsis> tag
and a JSON dictionary mapping speakers to their labels inside an <attribution> tag.
For example, {"SPEAKER_00": "Jason Heaton", "SPEAKER_01": "James"}.
If you can't determine speaker, put "Unknown".
If for any reason an answer risks reproduction of copyrighted material, explain why.
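
For illustration, here’s roughly what that call looks like as a one-off curl against Anthropic’s Messages API (the real pipeline uses the Python SDK plus Tenacity; the model name and file names here are just examples):

# prompt.txt holds the prompt above; transcript.json is the WhisperX output
jq -n --arg prompt "$(cat prompt.txt transcript.json)" \
  '{model: "claude-3-5-sonnet-20240620", max_tokens: 1024,
    messages: [{role: "user", content: $prompt}]}' > request.json

curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d @request.json | jq -r '.content[0].text'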

We get back the JSON dictionary, and the Python code uses that to build correct web pages. That works! Since we were paying maybe a nickel per episode, we also ask for a synopsis, another super hard programming task that LLMs can do easily. The results look like this:

Note the synopsis and the speaker labels.

LLMs are not just hype

It works great! We now have a website, with search index, that is decent looking and a usable reference. There’s more to do and work continues, but I’m super pleased and still impressed at how easy an LLM made two intractable problems. It’s not all hype; there are real, useful things you can do and I encourage you to experiment.

Lastly, please check out the website and enjoy. A complete time capsule of the podcast. I wonder if Archive.org needs a tool like this?

Links

Notes and caveats

  • This uses Claude 3.5 ‘Sonnet’, their mid-tier model. The smaller model didn’t do as well, and the top-tier cost roughly 10x as much for similar results.
  • An hour of podcast audio converted to text is about 45k to 150k tokens. No problem at all for our 200k limit.
  • Anthropic has a billing system that appears designed to defend against stolen credit cards and drive-by fraud. I had to pay a reasonable $40 and wait a week before I could do more than a few requests. For two podcasts, it took about a week to process ~600 episodes. Even at the $40 level, I hit the 2.5M limit in under a hundred episodes.
  • About one episode in ten gets flagged as a copyright violation, which it is not. Super weird. Even more weird is that making the same call again with no delay usually fixes the error. A Tenacity one-liner suffices. As you can see, we tried to solve this in the prompt, but it seems to make no difference.

Whisper and WhisperX for podcast transcription

I just realized that I hadn’t posted this. Several months ago, I read about whisper.cpp and started playing with it. To quote from their docs, whisper.cpp is a

High-performance inference of OpenAI’s Whisper automatic speech recognition (ASR) model:

  • Plain C/C++ implementation without dependencies
  • Apple Silicon first-class citizen – optimized via ARM NEON, Accelerate framework, Metal and Core ML
https://github.com/ggerganov/whisper.cpp

In other words, a fast and free speech transcription app that runs on your laptop. Damn!

In fact, it’s so efficiently written that you can transcribe on your iOS phone. Or browser. Haven’t tried those yet.
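
For flavor, a transcription run looks something like this (model choice and file names are examples; whisper.cpp wants 16 kHz mono WAV input):

ffmpeg -i episode.mp3 -ar 16000 -ac 1 episode.wav
./main -m models/ggml-base.en.bin -f episode.wav -otxt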

Anyway, that gave me an idea: a couple of friends of mine run a podcast called TGN. They’ve been at it for a few years, and have around 250 episodes of an hour each. Could I use whisper.cpp to produce a complete set of episode transcripts? If I did, would that be useful?

(Insert a few months of side project hacking, nights and weekends.)

It works pretty well. For podcasts, however, you end up with a wall of text because Whisper doesn’t do what’s called ‘speaker diarization,’ that is, identifying one voice or another. It’s on their roadmap, though.

I was sharing the progress on the TGN Slack when an employee of the company OctoML DM’d me. They have a WhisperX image that does diarization, and he offered to help me use it for the project.

(More nights and weekends. Me finding bugs for OctoML. Adding a second podcast. Getting help from a couple of friends, including the a-ha mkdocs idea from David.)

Voila! May I present:

The key useful bits include

  • Full text searching
  • Local mirrored copies of the podcast MP3s and raw transcript text, plus (in-progress) attempts to mirror the episode web pages.
  • Using mkdocs to build a static website that looks OK and renders Markdown into HTML.

Lots of features yet to build, but it’s been a really fun side project. Source code is all on GitHub; right now I’m in the prefect branch, trying out a workflow rewrite using Prefect.

Seven Head-Scratching Features from WWDC 2022 – TidBITS

Customize Spatial Audio with TrueDepth Camera

This announcement came and went fairly quickly, but it had us scratching our heads immediately. The idea, it seems, is that spatial audio sounds more realistic if it can take into account aspects of the physicality of the listener that affect their perception of space. Apparently, this is a thing—called Head-Related Transfer Functions—and by capturing data using the iPhone’s TrueDepth camera, Apple could personalize the otherwise average HRTF that combines data from thousands of people.
— Read on tidbits.com/2022/06/13/seven-head-scratching-features-from-wwdc-2022/

I worked with HRTFs in grad school, trying to implement the filters in the wavelet domain (More here), so this is interesting to me. Looks like we’ll be able to use some combination of camera + lidar to capture the pinnae and derive personal HRTFs from that.

I cannot wait. Guess I’ll need to explore the spatial music and maybe movies now.

Controlling the Logitech Litra on macOS

Recently I bought a USB-C-connected diffused light for Zoom calls, the Logitech Litra Glow. The hardware is excellent, but the control software, Logitech G HUB, is horrible: intrusive permissions, auto-start, auto-updater, laggy UI; just a hard no. The light has buttons on the back to control it, but since mine is next to the camera it looks odd when I’m reaching over, and let’s face it, I’m a programmer and love solving small problems with code. Let’s find a way.

Some searching found a Linux app on GitHub, but the USB library it uses required sudo access on macOS, which adds hassle. There’s an issue and discussion on the topic that led me to hidapitester, which doesn’t require root access to run.

Goals

I want menu bar control, simplified – light on and off, one or two brightness levels. Basically make it easy to use as I hop on and off calls. Here’s what I came up with, hopefully it’s helpful for you too.

Installation and configuration

  1. Install hidapitester
  2. Add shell aliases to control the light
  3. Tweak color temp and brightness values to your liking
  4. Add Shortcuts to drive the shell aliases from the GUI

Install hidapitester

Grab the source or binary from the project GitHub page. Copy it into /usr/local/bin and then, due to app signing requirements, you’ll need to approve its use from the Finder.

open /usr/local/bin

Right-click on hidapitester and select Open.

On the warning dialog, select Open.

The binary is now flagged as OK; you only have to do this once.

Add shell aliases

The default shell is now zsh, so let’s use that. I also use and like oh-my-zsh, so my aliases are in /Users/phubbard/.oh-my-zsh/custom/aliases.zsh

Using the magic payloads from the GitHub issue, we define a base function _hid that invokes hidapitester with the full path and device identifiers. We then build on that to add a selection of color temps and brightness levels.

function _hid() {
  /usr/local/bin/hidapitester --vidpid 046D/C900 --open --length 20 --send-output $1
}
# 2/17/22 Litra Glow aliases from https://github.com/kharyam/litra-driver/issues/13
function light() {
  _hid 0x11,0xff,0x04,0x1c,0x01
}
function dark() {
  _hid 0x11,0xff,0x04,0x1c
}
# ~10%
function glow(){
  _hid 0x11,0xff,0x04,0x4c,0x00,20
}
# ~20%
function dim(){
  _hid 0x11,0xff,0x04,0x4c,0x00,50
}
# tweaking by hand - less than 50%
function normal() {
  _hid 0x11,0xff,0x04,0x4c,0x00,70
}
# ~50%
function medium() {
  _hid 0x11,0xff,0x04,0x4c,0x00,100
}
# 90%
function bright(){
  _hid 0x11,0xff,0x04,0x4c,0x00,204
}
# 2700K
function warmest() {
  _hid 0x11,0xff,0x04,0x9c,10,140
}
# 3200K
function warm() {
  _hid 0x11,0xff,0x04,0x9c,12,128
}
# 6500K
function coldest() {
  _hid 0x11,0xff,0x04,0x9c,25,100
}

You should be able to run any of these from zsh:

exec zsh
light
dark

That might be all you need/want. If so, enjoy!

Add Shortcuts

There are a few key things to know to make this work:

  • You can invoke a shell script or alias by using the Terminal app from Shortcuts
  • You don’t get a login shell, so we need to manually load the aliases for this to work.
  • We also need to use full paths, since the non-login shell has a different PATH set.
  • Once the aliases are loaded, we can chain calls using the && operator.
  • Under the Shortcuts setting, there’s a checkbox for ‘Pin in Menu Bar’ that exposes the Shortcut to your GUI.

Looks like this on the menu bar:

I use these three:

source /Users/phubbard/.oh-my-zsh/custom/aliases.zsh && light && warmest && dim
source /Users/phubbard/.oh-my-zsh/custom/aliases.zsh && dark
source /Users/phubbard/.oh-my-zsh/custom/aliases.zsh && light && warmest && normal

I hope that others find this useful. I’m happy with the result as the code runs in a second or so and has yet to fail.

Litra Beam

A reader tells me that this code also works with the newer Litra Beam if you change the USB IDs in the _hid routine to 046D/C901. Thanks, John!
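
In other words, only the base function changes; something like:

function _hid() {
  /usr/local/bin/hidapitester --vidpid 046D/C901 --open --length 20 --send-output $1
}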

May 2025: Use Litra (Rust CLI) instead

There’s now a clean Rust CLI called litra-rs that works better. Use that instead – here’s my 1% Shortcut now:

/opt/homebrew/bin/litra on && /opt/homebrew/bin/litra brightness --percentage 1 && /opt/homebrew/bin/litra temperature --value 2700

It’s Homebrew-installable and has worked great.

Lockdown Apps

Found via Hacker News – a free and open-source tracker and ad blocker that works on-device using VPN hooks. That means it’ll block trackers in apps as well as Safari. I’m trying it now.

Note – they pay their bills with the subscription-billed VPN, but you can decline that and just use the blocker. That’s what I’m trying.

For iOS devices and Mac only.