
# radieo

A personal music radio: an always-on HTTP audio stream, automatically fed from
several sources and broadcast with [Liquidsoap](https://www.liquidsoap.info/).

The goal is a hassle-free stream that always has something playing, where the
next track is picked automatically. It is meant for personal use (a couple of
simultaneous listeners), not for public broadcasting.

## How it works

radieo is built as two layers, each running in its own Docker container and
sharing a cache volume:

- **`ingest`** (Python) — the brain. It decides what to play next, resolves and
  downloads tracks into a local cache, keeps a pre-filled queue, and exposes the
  next track over HTTP at `GET /next`.
- **`stream`** (Liquidsoap) — deliberately dumb. It pulls the next track from
  the `ingest` daemon, broadcasts the audio over HTTP, and keeps playing from a
  local cache fallback whenever `ingest` has nothing ready.
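
The `GET /next` contract between the two layers can be sketched in a few lines
of Python. This is only an illustration, not the actual `ingest` code: the
`ready` queue and the helper names are hypothetical, and the response format
assumes Liquidsoap's `annotate:` protocol for carrying title/artist metadata.

```python
from collections import deque
from http.server import BaseHTTPRequestHandler

# Hypothetical in-memory queue of (path, title, artist) tuples for tracks
# that are already downloaded into the cache.
ready = deque()


def annotate_uri(path, title, artist):
    """Build a Liquidsoap `annotate:` URI carrying title/artist metadata."""
    def esc(value):
        # Escape backslashes and double quotes inside the quoted values.
        return value.replace("\\", "\\\\").replace('"', '\\"')
    return f'annotate:title="{esc(title)}",artist="{esc(artist)}":{path}'


def next_body(queue):
    """Body for GET /next: one annotated URI, or empty when nothing is ready."""
    if not queue:
        return ""
    path, title, artist = queue.popleft()
    return annotate_uri(path, title, artist)


class NextHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/next":
            self.send_error(404)
            return
        body = next_body(ready).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it, the handler can be served with
`http.server.HTTPServer(("0.0.0.0", 8080), NextHandler).serve_forever()`; the
port is an arbitrary choice for the sketch.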

Playback sources: a [Navidrome](https://www.navidrome.org/) library
via the OpenSubsonic API, arbitrary tracks fetched with
[yt-dlp](https://github.com/yt-dlp/yt-dlp) (Bandcamp, SoundCloud, YouTube…), and
listening suggestions from a ListenBrainz recommendations feed.

## Usage

Requirements: Docker with Compose v2.

```sh
# Drop some .mp3 files into the cache directory
cp /path/to/music/*.mp3 cache/

# Build and start the stream
docker compose up -d

# Listen (VLC, a browser, any audio player)
#   http://localhost:8000/radio.mp3
```

Stop it with `docker compose down`.

The stream is MP3 at 192 kbps. Multiple clients can listen at the same time.
New files dropped into `cache/` are picked up automatically (the playlist is
reloaded when the directory changes).

## Configuration

Copy `.env.example` to `.env` and fill in your Navidrome details:

```sh
cp .env.example .env
# edit .env: RADIEO_NAVIDROME_URL / USER / PASSWORD / PLAYLIST
```

If the Navidrome variables are left empty, the source is simply disabled and
the stream plays whatever is already in `cache/` (the milestone-1/2 behaviour).

For the yt-dlp source, list the URLs to draw from in `config/urls.txt` (copy
`config/urls.txt.example`). Each line is either a direct track URL or a
container URL (playlist, album, label, artist page) from which one track is
picked at random.
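
As a rough illustration of how such a file could be consumed, here is a minimal
Python sketch. It is not the actual provider: skipping blank lines and `#`
comments, choosing the line uniformly at random, and the `expand` callable
(standing in for whatever turns a container URL into track URLs) are all
assumptions.

```python
import random


def load_urls(text):
    """Parse urls.txt content: one URL per line.

    Skipping blank lines and `#` comments is an assumption of this sketch.
    """
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def pick_track(urls, expand, rng=random):
    """Pick one line at random, then one track from whatever it expands to.

    `expand` is a hypothetical callable returning the track URLs behind a URL:
    a single-item list for a direct track, several items for a container.
    """
    if not urls:
        return None
    tracks = expand(rng.choice(urls))
    return rng.choice(tracks) if tracks else None
```

Passing `expand` in as a parameter keeps the random draw testable without any
network access.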

For the ListenBrainz source, set `RADIEO_LISTENBRAINZ_URL` to your
recommendations feed (the Atom syndication URL, e.g.
`https://listenbrainz.org/syndication-feed/user/<you>/recommendations/weekly-exploration`,
or a local file path under `config/` for testing). ListenBrainz only *names*
tracks, so each suggestion is resolved to a real file: Navidrome first
(a `search3` lookup), then yt-dlp (`ytsearch1:`) as a fallback. The
MusicBrainz recording MBID that the feed already carries is used as the
track's canonical identity (no extra lookup needed).
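
The "Navidrome first, then yt-dlp" order amounts to a simple fallback chain. The
Python sketch below shows the idea under stated assumptions: `Suggestion`,
`resolve` and `ytsearch_query` are hypothetical names, the resolvers are plain
callables supplied by the caller, and the exact search string passed to
`ytsearch1:` is a guess.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Suggestion:
    mbid: str    # MusicBrainz recording MBID carried by the feed
    artist: str
    title: str


# A resolver takes a suggestion and returns a local file path, or None on a miss.
Resolver = Callable[[Suggestion], Optional[str]]


def ytsearch_query(suggestion):
    """Build a yt-dlp search query; the "artist title" layout is an assumption."""
    return f"ytsearch1:{suggestion.artist} {suggestion.title}"


def resolve(suggestion, resolvers):
    """Try each resolver in order; return (mbid, path) for the first hit."""
    for find in resolvers:
        path = find(suggestion)
        if path is not None:
            # The feed's MBID is reused as the identity, so no extra lookup.
            return suggestion.mbid, path
    return None
```

A call would then look like `resolve(s, [navidrome_search, ytdlp_search])`,
where both resolver functions are placeholders for the real lookups.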

The relative mix between sources is set by `RADIEO_WEIGHT_NAVIDROME` /
`RADIEO_WEIGHT_YTDLP` / `RADIEO_WEIGHT_LISTENBRAINZ` (a weight of 0 disables a
source); an empty URL / missing file also disables the corresponding source.
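
To make the effect of the weights concrete, here is a minimal Python sketch of
proportional random selection. It assumes the weights have already been read
into a dict of numbers; `pick_source` is a hypothetical name, and the real
scheduler may draw differently.

```python
import random


def pick_source(weights, rng=random):
    """Pick a source name with probability proportional to its weight.

    Sources with a weight of 0 (or less) are treated as disabled.
    """
    enabled = {name: w for name, w in weights.items() if w > 0}
    if not enabled:
        return None
    names = list(enabled)
    return rng.choices(names, weights=[enabled[n] for n in names], k=1)[0]
```

With `{"navidrome": 3, "ytdlp": 0, "listenbrainz": 1}`, about three picks out of
four would come from Navidrome and none from yt-dlp.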

## Current status

**Milestone 6 — ListenBrainz provider: done.**

- Three playback sources feed a weighted scheduler: a Navidrome/OpenSubsonic
  playlist, a hand-maintained list of yt-dlp URLs (`config/urls.txt`), and a
  ListenBrainz recommendations feed. Container URLs (playlist/album/label/artist)
  are expanded and one track is drawn at random.
- ListenBrainz suggestions carry a MusicBrainz recording MBID, a title and an
  artist; each is resolved to a concrete file (Navidrome `search3` first, then
  a yt-dlp `ytsearch1:` fallback) and keyed directly by its MBID — so the same
  song is de-duplicated across all three sources for free.
- Each track is canonicalized to a MusicBrainz recording MBID (no API key
  needed; ~1 req/s, best-effort, results cached in SQLite). This gives a
  source-agnostic identity, so the same song from two sources collapses to one;
  when no confident match is found it falls back to a normalized
  `(artist, title)` key. The scheduler uses this canonical key for anti-repeat,
  with the providers applying a cheap locator filter first.
- Each source has its own fetcher (Subsonic stream / yt-dlp download); files are
  cached ahead of playback (prefetch buffer) and decoded by Liquidsoap.
- Play history and LRU retention are tracked in a SQLite database under
  `state/`: only the N most recently played files are kept on disk
  (`RADIEO_RETENTION_KEEP`, default 20). Orphaned download temp files are swept
  on startup.
- `GET /next` returns the next track as an annotated Liquidsoap URI with real
  title/artist metadata (or an empty body when nothing is ready).
- `stream` (Liquidsoap v2.4.5) pulls via `request.dynamic` and falls back to the
  local `cache/` directory; if both are unavailable, `mksafe` outputs silence
  instead of letting the stream fail.
- HTTP stream served at `http://localhost:8000/radio.mp3` (MP3, 192 kbps),
  multiple simultaneous listeners supported.
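
The canonical key used for anti-repeat can be pictured as follows. This is a
hedged Python sketch, not the project's canonicalizer: the normalization rules
(accent stripping, punctuation removal, case folding) and the `mbid:` / `at:`
key prefixes are assumptions chosen for illustration.

```python
import re
import unicodedata


def normalize(text):
    """Fold case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.casefold())
    return " ".join(text.split())


def canonical_key(artist, title, mbid=None):
    """Prefer the MBID; otherwise fall back to a normalized (artist, title)."""
    if mbid:
        return f"mbid:{mbid}"
    return f"at:{normalize(artist)}|{normalize(title)}"
```

Two spellings of the same track, such as `("Björk", "Jóga")` and
`("  BJÖRK ", "Jóga!")`, map to the same fallback key, which is what lets the
scheduler treat them as one song when no MBID is available.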

Polish comes next: crossfade tuning, robustness, an optional web player and a
config file.

Known cosmetic quirk: at startup the fallback logs a few harmless ffmpeg
"Invalid data" warnings while probing non-audio files such as `.gitkeep`. This
will be quieted in the polish milestone.

## Roadmap

1. ✅ **Broadcasting skeleton** — Liquidsoap serving the cache directory.
2. ✅ **Ingestion daemon** — Python daemon exposing `GET /next`; Liquidsoap
   switches to a `request.dynamic` source with the cache as fallback.
3. ✅ **Navidrome provider** — play from an OpenSubsonic playlist, with caching,
   LRU retention and play history.
4. ✅ **yt-dlp provider** — fetch tracks from a maintained URL/artist list;
   weighted mixing between sources.
5. ✅ **Canonicalizer** — MusicBrainz MBID lookup for source-agnostic
   de-duplication.
6. ✅ **ListenBrainz provider** — parse the recommendations feed and resolve
   each suggestion to Navidrome or yt-dlp.
7. **Polish** — crossfade, robustness, optional web player, config file.