Smokeping and Caddy 2: hard but worthwhile

Spent a few semi-pleasant hours today getting this working:

That’s Smokeping, proxied by Caddy 2 on my love-it-to-bits Raspberry Pi 4 web server.

Smokeping lets you track not just yes-we-are-connected / no-we-are-not, but also latency, packet loss, and jitter. My network and ISP are generally solid, but this is an easy tool to have around. Once, that is, you have it installed.

I run Debian on my Pi, natch, and the wondrous Caddy to serve files and reverse-proxy the various web apps. Not to mention automatic SSL certs from Let’s Encrypt and the least verbose configuration possible. Smokeping, alas, uses the now-uncommon CGI interface, so gluing it all together took a while. Let me leave some notes for anyone else in this situation.

Basic install

apt install fcgiwrap
apt install smokeping
service fcgiwrap start
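Before touching the Caddyfile, it’s worth confirming that fcgiwrap is actually listening on its Unix socket. A quick Python check, assuming Debian’s default socket path (on some systems it’s /run/fcgiwrap.socket instead):

```python
import os
import socket
import stat

def fcgi_socket_ready(path="/var/run/fcgiwrap.socket"):
    """Return True if path exists, is a Unix socket, and accepts connections."""
    try:
        if not stat.S_ISSOCK(os.stat(path).st_mode):
            return False
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.settimeout(2)
            s.connect(path)
        return True
    except OSError:
        return False

if __name__ == "__main__":
    print("fcgiwrap up:", fcgi_socket_ready())
```

If this prints False, fix fcgiwrap before blaming Caddy.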

The /etc/smokeping/config.d directory has a bunch of edits you’ll need. In General:

cgiurl   = https://ping.phfactor.net/smokeping.cgi

Note that Caddy prefers CNAMEd virtual hosts, so I’m using ping.phfactor.net. You’ll need that in your DNS. Here’s the Caddyfile entry:

ping.phfactor.net {
	log {
		output file /var/log/caddy/ping.log
	}
	root * /usr/share/smokeping/www
	encode gzip
	file_server

	@cgi {
		path *.cgi
	}
	reverse_proxy @cgi unix//var/run/fcgiwrap.socket {
		transport fastcgi {
			split .cgi
			env SCRIPT_FILENAME /usr/share/smokeping/smokeping.cgi
		}
	}

	# Ensure CSS and JS files are served correctly
	@static {
		path /css/* /js/* /img/*
	}
	handle @static {
		file_server
	}

	# Try serving static files before falling back to CGI
	try_files {path} /{path}
}

Kinda ugly. Might be some cleanup possible there. I also had to modify the HTML template file /etc/smokeping/basepage.html to remove the /smokeping/ prefix from the CSS and JS URLs:

<link rel="stylesheet" type="text/css" href="/css/smokeping-print.css" media="print">
<link rel="stylesheet" type="text/css" href="/css/smokeping-screen.css" media="screen">

...

<script src="/js/prototype/prototype.js" type="text/javascript"></script>
<script src="/js/scriptaculous/scriptaculous.js?load=builder,effects,dragdrop" type="text/javascript"></script>
<script src="/js/cropper/cropper.js" type="text/javascript"></script>
<script src="/js/smokeping.js" type="text/javascript"></script>
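If you’d rather script that edit than do it by hand, the prefix strip is small enough; a hypothetical sketch (the function name and the commented-out in-place rewrite are mine):

```python
import re

def strip_prefix(html, prefix="/smokeping"):
    """Drop a URL prefix from href/src attributes,
    e.g. /smokeping/css/x.css -> /css/x.css."""
    return re.sub(r'((?:href|src)=")' + re.escape(prefix) + r'/', r'\1/', html)

# To rewrite the template in place:
# with open("/etc/smokeping/basepage.html") as f:
#     html = f.read()
# with open("/etc/smokeping/basepage.html", "w") as f:
#     f.write(strip_prefix(html))
```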

For now, I’m using basic ICMP pings, but Smokeping supports more advanced probes, such as SSH login checks and others.

Note that Safari seems a bit confused by Smokeping graphs, and caches old ones longer than it should. Chrome and Firefox do it right. Odd.

The results are pretty cool though.

Reddit on iOS minus ads

So a while ago, Reddit enshittified after taking PE money: turned off the APIs, blocked third-party apps, etc. And the official app is a really shitty, ad-laden experience. So. Do you have

  1. A Macintosh
  2. Some code/build experience
  3. An iPhone or iPad
  4. The desire to read Reddit
  5. A $99/year Apple Developer account
  6. Stubbornness?

The details would take ages to type out, thus numbers 2 and 6. Drop a comment if this is useful and I’ll write a followup; right now I’d guess I have maybe two-digit readership.

The source code that you want is called Winston, here on GitHub. Yes, like 1984. Clone it, load it into Xcode, and then modify the two bundle identifiers. I use the net.phfactor prefix since that’s my domain; be creative, but they have to be unique to Apple.

I vaguely remember that you need to create a Reddit developer token, which is also painful but only needs doing once. The results are well worth the hassle. I just pulled main and rebuilt today after my build expired. (The $99 developer device builds are only good for a year. Apple forces everything through their App Store and this is as close as they allow. Yes, it sucks.)

And my local peeps

It’s good to be back.

Beware of censored LLMs

I’m a huge, huge fan of all things Simon Willison, but this latest post prompted me to write. Models trained by Alibaba, ByteDance, and other Chinese companies have to adhere to Chinese censorship, and the companies have found some as-yet-undisclosed way of removing information from them. Qwen, for example, and the new DeepSeek-R1.

Simply ask this:

Tell me about Tiananmen Square. What happened there? Why is it famous? Why is it censored?

If the model is honest, it’ll tell you. If it’s censored, it will refuse or deflect instead.

I haven’t explored the censorship much past that – I’d assume that there are censored topics, altered facts and perhaps added bias. Caveat emptor.

Wildfire intensity scale

Years ago, during the 2016 Fort McMurray wildfire, I read an article (which I should have bookmarked) that discussed wildfire in physics terms: kilowatts per meter of fire front. Above some threshold, you literally and actually cannot douse the flames. Today, with the LA fires raging, I went searching and found this PDF by Joe H. Scott. It seems the standard reference is Byram, G. M. 1959. Combustion of forest fuels. In: Forest fire: Control and use, 2nd edition. New York, NY: McGraw-Hill: chapter 1, 61-89.

Here’s the key bit from the Scott paper:

So a basic wildfire is ticking along at 10 kW per meter, and a rager might be 100 to 150 megawatts per linear meter.

Goddamn. No wonder you can’t extinguish them.

By way of comparison, a gallon of gasoline holds around 33 kilowatt-hours of energy. If I estimate right, each meter of a big fire’s front releases a gallon of gasoline’s worth of energy roughly every second. Not sure that helps my intuition, and I often get stoichiometry wrong anyway.
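That comparison is easy to sanity-check. One kWh is 3.6 MJ, so a gallon is roughly 119 MJ; dividing by a big fire’s per-meter output gives the time for one meter of front to release a gallon’s worth:

```python
GALLON_KWH = 33                  # approximate energy in a gallon of gasoline
GALLON_J = GALLON_KWH * 3.6e6    # 1 kWh = 3.6 MJ, so ~118.8 MJ per gallon

big_fire_w_per_m = 150e6         # 150 MW per linear meter, upper end of the range

seconds_per_gallon = GALLON_J / big_fire_w_per_m
print(f"One meter of fire front releases a gallon of gasoline's "
      f"energy every {seconds_per_gallon:.2f} s")
```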

Music service playlist migration

It will not come as news to anyone streaming music via Spotify, Apple Music, Amazon, Tidal, etc – the playlist is the proprietary bit. The music is identical but your curated playlists are a barrier to moving.

Today, I saw a Spotify playlist in this Cool Tools post:

 I’m in love with this “Halloween” playlist because it isn’t cheesy songs like the Monster Mash and Ghostbusters, instead, it’s an adults’ Halloweenish soundtrack featuring great moody music from bands like M83, the Cure, the National and more. This plays nonstop at my house from Labor Day through the end of October.

Here is the playlist link – it’s “Halloween is a Dead Man’s Party.”

But I don’t use Spotify. Because of the subsidized hardware, we use Amazon to stream to a bunch of Echos connected to speakers.

The solution is a web app called Tune My Music. It’s free, and Amazon lists it as an approved way to import playlists. It can move playlists back and forth between a great number of services. In my case, I just set up a new Spotify account (yay Hide My Email!), granted TMM access to it, then granted playlist access to Amazon Music, and it copied the playlist over. Only one track was missing; good enough.

So maybe a bookmark in case you want to move between services.

Status games

Mother Jones today has an excellent story on Arlie Russell Hochschild’s book “Stolen Pride.” This quote in particular:

“We live in both a material economy and a pride economy, and while we pay close attention to shifts in the material economy, we often neglect or underestimate the importance of the pride economy. Just as the fortunes of Appalachian Kentucky have risen and fallen with the fate of coal, so has its standing in the pride economy…”

(Source: https://www.motherjones.com/politics/2024/09/jd-vance-arlie-russell-hochschild-hillbilly-elegy-stolen-pride-excerpt/)

So close! What she’s describing is status: rank in the community, real or perceived. Will Storr wrote an excellent book about it, “The Status Game,” which I highly recommend. (My local library had it.)

The MJ story is excellent and well worth your time. The Storr book is longer; this review might help you decide if you’d find it worthwhile.

LLMs can solve hard problems

We’re a couple of years into the LLM era now, and the Gartner hype cycle from last year seems relevant:

Image credit: Gartner (https://www.gartner.com/en/articles/what-s-new-in-artificial-intelligence-from-the-2023-gartner-hype-cycle)

The purpose of this post is to share two hard problems (in the CS sense of the term) that I and a friend solved with an LLM.

The problem and impetus

I’m friends with a couple of guys who have a podcast. (I know, right?) It’s been going for seven years or more now, is information-packed, and more than once I’ve wanted to search for something previously mentioned. Then, via Simon Willison I think, I learned about the amazing Whisper.cpp project, which can run OpenAI’s Whisper speech-to-text model at high speed on desktop hardware. As others have said, “speed is a feature,” and being able to do an hour of audio in a few minutes on a MacBook Pro or Mini made the project interesting and feasible.

The project and goals

The overall goal of the project was to generate transcripts of every episode, index them with a local search engine, and serve the results as a static website. Open source code, no plan to monetize or commercialize, purely for the fun of it.

The code uses Python for logic, Makefiles for orchestration, wget for web downloads, mkdocs for website generation, xmltodict for the RSS parsing, Tenacity for LLM retries and rsync to deploy code. Nothing too exciting so far.
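The RSS-fetching step is the simplest part. The real pipeline uses xmltodict; here’s an equivalent stdlib sketch, assuming the usual podcast RSS shape (item/title/enclosure):

```python
import xml.etree.ElementTree as ET

def list_episodes(rss_text):
    """Return (title, mp3_url) pairs from a podcast RSS feed.
    The real pipeline uses xmltodict; this stdlib version does the same job."""
    root = ET.fromstring(rss_text)
    episodes = []
    for item in root.iter("item"):
        title = item.findtext("title", default="untitled")
        enclosure = item.find("enclosure")  # audio lives in the enclosure URL
        url = enclosure.get("url") if enclosure is not None else None
        episodes.append((title, url))
    return episodes
```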

This let us generate a decent website. However, it quickly became obvious that Whisper alone would not suffice, since it doesn’t indicate who is speaking, a feature known as ‘diarization.’ After sharing the work on the TGN Slack, a member offered his employer’s product as a potential improvement: WhisperX, hosted on OctoAI, which includes diarization. So instead of this wall of text

you get something more like this (shown processed a bit)

Enter the LLM

So now we have roughly an hour’s worth of audio, as JSON, with labels for each speaker. But the labels are not the names of the speakers: ‘SPEAKER_00’ isn’t helpful. We need the names.

We need to somehow process 50-100KB of text, with all of the peculiarities of English, and extract from it the names of the speakers. Normally it’s the same two guys, but sometimes they do call-in type episodes with as many as 20 callers.

This is super hard to do with traditional programming. I tried some crude “look for their usual intros” logic, but it only worked maybe half of the time, and I didn’t want to deep-dive into NLP and parsing. At my day job I was working on LLM-related things, so it made sense to try one, but our podcasts were too large for the ChatGPT models available. Then came Claude, with 200k-token context windows, and we could send an entire episode in a single go.

The code simply asks Claude to figure out who’s speaking. Here is the prompt:

The following is a public podcast transcript. Please write a two paragraph synopsis in a <synopsis> tag
and a JSON dictionary mapping speakers to their labels inside an <attribution> tag.
For example, {"SPEAKER_00": "Jason Heaton", "SPEAKER_01": "James"}.
If you can't determine speaker, put "Unknown".
If for any reason an answer risks reproduction of copyrighted material, explain why.

We get back the JSON dictionary, and the Python code uses that to build correct web pages. That works! Since we were paying maybe a nickel per episode, we also ask for a synopsis, another super hard programming task that LLMs can do easily. The results look like this:

Note the synopsis and the speaker labels.
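Pulling the mapping back out of Claude’s reply takes only a few lines. A sketch, with helper names of my own invention (the tag names come from the prompt above):

```python
import json
import re

def parse_attribution(reply):
    """Extract the speaker-label JSON from the <attribution> tag in the reply."""
    m = re.search(r"<attribution>\s*(\{.*?\})\s*</attribution>", reply, re.DOTALL)
    if m is None:
        raise ValueError("no <attribution> tag in reply")
    return json.loads(m.group(1))

def relabel(segments, mapping):
    """Replace SPEAKER_NN labels with real names in diarized transcript segments."""
    return [{**seg, "speaker": mapping.get(seg["speaker"], "Unknown")}
            for seg in segments]
```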

LLMs are not just hype

It works great! We now have a website, with search index, that is decent looking and a usable reference. There’s more to do and work continues, but I’m super pleased and still impressed at how easy an LLM made two intractable problems. It’s not all hype; there are real, useful things you can do and I encourage you to experiment.

Lastly, please check out the website and enjoy. A complete time capsule of the podcast. I wonder if Archive.org needs a tool like this?

Notes and caveats

  • This uses Claude 3.5 ‘Sonnet’, their mid-tier model. The smaller model didn’t do as well, and the top-tier cost roughly 10x as much for similar results.
  • An hour of podcast audio converted to text is about 45k to 150k tokens. No problem at all for our 200k limit.
  • Anthropic has a billing system that appears designed to defend against stolen credit cards and drive-by fraud. I had to pay a reasonable $40 and wait a week before I could do more than a few requests. For two podcasts, it took about a week to process ~600 episodes. Even at the $40 level, I hit the 2.5M limit in under a hundred episodes.
  • About one episode in ten gets flagged as a copyright violation. Which they are not. Super weird. Even more weird is that making the same call again with no delay usually fixes the error. A Tenacity one-liner suffices. As you can see, we tried to solve this in the prompt but it seems to make no difference.
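In the real code, Tenacity’s decorator handles the retry (roughly `@retry(stop=stop_after_attempt(3))`); for the curious, the same logic spelled out in plain Python (function names are illustrative):

```python
import time

def retry(fn, attempts=3, delay=0):
    """Re-run fn until it succeeds or attempts run out, which is what
    Tenacity's @retry(stop=stop_after_attempt(3)) does for us."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(delay)
```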

A Richardson for today

A friend is dealing with a large bureaucracy, and it brought this aphorism to mind. It’s way too damned difficult to get a human on the phone any more, and it’s not just about cost.

214.
Someone’s deceleration to exit, read a sign or
rubberneck starts a little chain of responses
that becomes a five-mile backup. So much of
what turns out to be the huge evil of systems
is the amplification of tiny reluctances to let go
of a habit, to lift a phone, to look up and meet
someone’s eyes.

-James Richardson, ‘Vectors: Aphorisms and Ten-Second Essays.’