PodRecast is a podcast aggregator for downloading, post-processing, then republishing podcasts.


See Installation.


Please report issues on GitHub.


For me, two reasons:

  • Podcast loudness can vary greatly across feeds. Some can be too loud, some too quiet. And I don’t want to fiddle with the volume for each new podcast.
    Post-processing with utilities such as MP3Gain can be used to make all podcasts sound equally loud, eliminating the need to fiddle with the volume.

  • In the EU, there are rules governing the maximum power output that consumer devices, such as smartphones, can generate through headphones. In a noisy environment (for me, that’s when cycling) the mandated maximum output level can be too low, particularly for spoken podcasts.
    Again, MP3Gain can be used to pre-boost the gain such that podcasts can be listened to comfortably in both quiet and noisy environments.

PodRecast enables such post processing by downloading podcasts to a local system, post processing them, then republishing them.

PodRecast might also be used to change the bit rate, possibly to reduce storage and/or transmission costs.


Key idea:
  • republish podcasts, keeping the rss/xml exactly as it was, except replacing enclosure URLs with URLs referencing local, post-processed files.

By keeping the rss/xml exactly as it originally was, metadata such as the title, the publication date, the GUID and any associated imagery remain embedded in the feed.
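
As a rough illustration of the key idea, here is a minimal Python 3 sketch that rewrites only the enclosure URLs of a feed. It is not PodRecast's own code: PodRecast depends on lxml, whereas this uses the standard library's ElementTree to stay self-contained, and ElementTree re-serialises the document, so it does not preserve the original byte for byte. The `local_url_for` mapping and the `localhost` base URL are hypothetical.

```python
import xml.etree.ElementTree as ET


def rewrite_enclosures(rss_text, local_url_for):
    """Return the feed with each enclosure URL replaced; nothing else is edited."""
    root = ET.fromstring(rss_text)
    for enclosure in root.iter("enclosure"):
        original = enclosure.get("url")
        if original is not None:
            enclosure.set("url", local_url_for(original))
    return ET.tostring(root, encoding="unicode")


def local_url_for(url):
    # Hypothetical mapping: serve every file from a single local base URL.
    return "http://localhost/podcasts/" + url.rsplit("/", 1)[-1]


feed = """<rss version="2.0"><channel><title>Example</title>
<item><title>Episode 1</title><guid>ep-1</guid>
<enclosure url="http://example.com/audio/ep1.mp3" type="audio/mpeg" length="123"/>
</item></channel></rss>"""

print(rewrite_enclosures(feed, local_url_for))
```

The title and GUID pass through unchanged, which is the point: a podcast client sees the same episodes, only fetched from a different place.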

The workflow is as follows:
  • Download feeds and new podcasts to local storage.
    PodRecast does this.

  • Post-process podcasts.
    PodRecast does not do this, although there are a couple of suitable post-processing scripts included in the distribution. See also here.

  • PodRecast, serving now as a CGI script, then publishes exactly the original feed’s rss/XML, but with each enclosure’s URL replaced with a local URL referencing a local, post-processed file.
    PodRecast does this.

All of this can be driven by cron.



  • Python: 2.6 or 2.7; not 3.x.

  • lxml: this is python-lxml on Debian.

  • wget

Getting the Source

One of:
  • download the latest release from PyPI, or

  • download the latest release from here, or

  • clone the git repository:

    • git clone https://github.com/smblott-github/podrecast.git

Install — Executables

sudo python ./setup.py install
# or, to install under your home directory
python ./setup.py install --prefix=$HOME/local

Install — CGI

The installation script tries to install podrecast as a CGI script in one of:


However, the location of the CGI bin directory and its configuration are system dependent, so the installation script may fail to find a suitable CGI bin, or may install the script in the wrong place. Check the last few lines of output from the installation script to see what it did.

If the installation script fails to install the CGI script in the correct location, then it may be possible to hard link the podrecast executable into the CGI bin, or link to it with a symbolic link. You will have to do this manually.

Or, install it directly; something like one of the following…

sudo install -v -m 0555 ./podrecast /YOUR/CGI/BIN
install -v -m 0555 ./podrecast /YOUR/CGI/BIN


PodRecast searches for its configuration file in the following locations (and in this order):


Here is an example configuration file…

root /home/blott/tmp/podrecast

Cycling/Velo-Cast        http://velocastcc.squarespace.com/race-radio/rss.xml
BBC/More-Or-Less         http://downloads.bbc.co.uk/podcasts/radio4/moreorless/rss.xml
BBC/Life-Scientific      http://downloads.bbc.co.uk/podcasts/radio4/tls/rss.xml
Economics/Freakonomics   http://feeds.feedburner.com/freakonomicsradio?format=xml

The root is the directory under which feeds and podcasts will be stored. This must come first.

The subsequent feed lines each consist of exactly two tokens:

  • the feed’s subdirectory, and

  • the feed’s URL

A commented version of this sample is installed by the setup script in:

  • PREFIX/etc/podrecast.conf.sample

Feeds must not share subdirectories.
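
The rules above (root first, two tokens per feed line, no shared subdirectories) can be sketched as a small Python 3 parser. This is only an illustration, not PodRecast's actual parser. It assumes that lines starting with `#` are comments and it handles a single root only; the name `parse_config` is made up for this example.

```python
def parse_config(text):
    """Parse a 'root DIR' line followed by 'SUBDIR URL' feed lines."""
    root = None
    feeds = {}  # subdirectory -> feed URL
    for number, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):  # assumed comment syntax
            continue
        tokens = stripped.split()
        if len(tokens) != 2:
            raise ValueError("line %d: expected exactly two tokens" % number)
        first, second = tokens
        if root is None:
            if first != "root":
                raise ValueError("line %d: the root must come first" % number)
            root = second
        elif first in feeds:
            raise ValueError("line %d: subdirectory %r is already in use"
                             % (number, first))
        else:
            feeds[first] = second
    return root, feeds


sample = """
root /home/blott/tmp/podrecast

BBC/More-Or-Less     http://downloads.bbc.co.uk/podcasts/radio4/moreorless/rss.xml
BBC/Life-Scientific  http://downloads.bbc.co.uk/podcasts/radio4/tls/rss.xml
"""

root, feeds = parse_config(sample)
print(root)
for subdirectory in sorted(feeds):
    print(subdirectory, feeds[subdirectory])
```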

Download Podcasts

One of:
podrecast        # download all available podcasts
# or
podrecast -c     # catch up
# or
podrecast -a     # automatic catch up (only catch up on newly-added feeds)
# or
podrecast -m 2   # download at most 2 podcasts from each feed

Post Processing

When podcasts are downloaded, they are not immediately available for delivery. First, they must be post processed.

Here is an example.
A longer version of this script is installed by the setup script.

# Balance the gain across podcasts, and boost it.


# $gain is the boost to apply; set it before this loop.
for file in `podrecast --downloaded`
do
   nice nice mp3gain -r -c -p -t -m $gain $file \
      && podrecast --ready $file
done

Or, if you don’t want to do any post processing at all, then it can be as simple as:


podrecast --ready `podrecast --downloaded`

This script is also installed by the setup script.


To republish content, podrecast must be installed as a CGI script. This should be done on installation.
See here.

The URLs for republished feeds are of the form:


For example, with the sample configuration file above, they might be:


Feeds List

PodRecast also publishes a simple HTML page listing its feeds' URLs.

This can be useful for testing or for entering the URLs on mobile devices: point your device towards the feeds list and either click or copy the feed URLs you’re interested in.

The URL for the feed list is:

  • http://SERVER/CGI-BIN/podrecast/
    (The trailing / is required.)

For example, in my own case, it’s:


Set up cron

My own crontab reads like:

# download podcasts
05 2,12,17,20,23 * * * chronic podrecast -a

# post processing
55 2,12,17,20,23 * * * chronic podrecast_mp3gain

(chronic is part of the moreutils package. There’s an explanation of the what and why of chronic here.)

If you don’t want to use chronic, then this should do the trick:

05 2,12,17,20,23 * * * podrecast -a      > /dev/null
55 2,12,17,20,23 * * * podrecast_mp3gain > /dev/null


Downloading options:
-m MAX | --max=MAX

The maximum number of podcasts to download, per feed. The default is 0, meaning download all available podcasts.

-c | --catchup

Catch up.
Mark all podcasts as downloaded, but do not actually download them.

-a | --autocatchup

Auto catch up.
Whenever (and only when) PodRecast encounters a newly-configured feed, it marks all content as downloaded, but does not actually download it.
The idea is that newly-configured feeds may contain tens or even hundreds of podcasts, far more than most people actually want to download.
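
To make the interplay of the three downloading options concrete, here is a small Python 3 sketch of the decision for a single feed. It is a guess at the behaviour described above, not PodRecast's code. In particular, it assumes that `--max` takes the first MAX pending items in feed order and leaves the rest unmarked; the function and parameter names are invented for the example.

```python
def plan_downloads(enclosures, already_seen, max_count=0,
                   catchup=False, autocatchup=False, new_feed=False):
    """Return (to_download, to_mark_only) for one feed."""
    pending = [url for url in enclosures if url not in already_seen]
    # Catch up: mark everything as downloaded without fetching it.
    # Auto catch up: the same, but only for a newly-configured feed.
    if catchup or (autocatchup and new_feed):
        return [], pending
    # A MAX of 0 means "no limit" (assumption: the limit applies in feed order).
    if max_count > 0:
        pending = pending[:max_count]
    return pending, []


episodes = ["ep3.mp3", "ep2.mp3", "ep1.mp3"]
print(plan_downloads(episodes, set(), max_count=2))
print(plan_downloads(episodes, set(), autocatchup=True, new_feed=True))
print(plan_downloads(episodes, {"ep1.mp3"}, autocatchup=True, new_feed=False))
```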

Post-processing options:
-d | --downloaded

Output a list of files which have been downloaded and are ready for post processing.

-r | --ready [FILES...]

Register these previously-downloaded files as now post processed, and ready for delivery as podcasts.

Release Notes

Development (not yet in release)

(Nothing, currently)

Additional Information


podrecast uses wget for all downloads. For authenticated feeds, create a suitable .netrc entry.
wget consults .netrc for usernames and passwords.


When downloading podcasts, podrecast locks (with flock) its configuration file to prevent two instances running at the same time (although that probably wouldn’t be too much of a problem, see below…)
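
For the curious, locking a file with flock from Python 3 looks roughly like the sketch below. It is not taken from PodRecast. Whether PodRecast waits for the lock or gives up immediately is not stated here, so the non-blocking `LOCK_NB` flag is an assumption, as is the helper name `try_lock`. The demonstration locks a temporary file rather than a real configuration file.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return a file descriptor holding an exclusive lock, or None if busy."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # LOCK_NB: fail at once instead of waiting (an assumption, see above).
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd  # closing the descriptor releases the lock


with tempfile.NamedTemporaryFile() as config:
    first = try_lock(config.name)
    second = try_lock(config.name)   # a second opener cannot take the lock
    print("first holds lock:", first is not None)
    print("second refused:", second is None)
    os.close(first)
```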

File names

podrecast handles all downloads via temporary files (cf. mkstemp). Therefore, you should not encounter partial downloads with normal-looking file names.
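
The general pattern is to write to a temporary file in the target directory and rename it into place only once the download is complete. Here is a minimal Python 3 sketch of that pattern, under assumptions: the `fetch` callable stands in for the real download (PodRecast uses wget), and the hidden `.part` naming is made up for the example.

```python
import os
import tempfile


def download_atomically(fetch, destination):
    """Write via a hidden temporary file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(destination))
    fd, temporary = tempfile.mkstemp(prefix=".", suffix=".part", dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            fetch(handle)                    # hypothetical: writes the payload
        os.replace(temporary, destination)   # atomic within one file system
    except BaseException:
        os.unlink(temporary)                 # never leave a partial file behind
        raise


with tempfile.TemporaryDirectory() as feed_directory:
    target = os.path.join(feed_directory, "episode.mp3")
    download_atomically(lambda handle: handle.write(b"audio data"), target)
    print(os.listdir(feed_directory))
```

Because the temporary file lives in the same directory as the destination, the final rename never crosses a file system boundary.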

Removing files

podrecast does not delete downloaded files. Just use cron and find for that:

find /MY/ROOT/DIRECTORY -type f -name '[^.]*' -mtime +20 -delete

podrecast stores its state in hidden files in the feed directory itself. Whenever it downloads a feed, it touches all of the state files related to enclosures mentioned in the feed. Therefore, state files with old modification times can safely be deleted. Something like:

# This assumes that podrecast has in fact been run recently.
find /MY/ROOT/DIRECTORY -type f -name '.*.podrecast_done' -mtime +20 -delete

However, there are usually not so many state files, and they take up next to no space; so I just let them be.


podrecast decides that it is running as a CGI script if all of the following four environment variables are set: REQUEST_METHOD, HTTP_HOST, SCRIPT_NAME and PATH_INFO. Otherwise, it considers itself to be running normally.
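
In Python 3 terms, that test amounts to something like the following sketch. It is not PodRecast's own code; the function name `running_as_cgi` is invented, and "set" is taken to mean "present in the environment", even if empty.

```python
import os

CGI_VARIABLES = ("REQUEST_METHOD", "HTTP_HOST", "SCRIPT_NAME", "PATH_INFO")


def running_as_cgi(environ=None):
    """True only if all four CGI variables are present in the environment."""
    environ = os.environ if environ is None else environ
    return all(name in environ for name in CGI_VARIABLES)


request = {"REQUEST_METHOD": "GET", "HTTP_HOST": "localhost",
           "SCRIPT_NAME": "/cgi-bin/podrecast", "PATH_INFO": "/"}
print(running_as_cgi(request))
print(running_as_cgi({"REQUEST_METHOD": "GET"}))
```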

In general, there is no reason why multiple root directories cannot be defined. However, when running as a CGI script, podrecast requires that the configuration file defines exactly one such root.


podrecast is quite chatty on standard output. This can be helpful when run interactively, but not when run (incorrectly) as a cron job.

You can always redirect standard output to /dev/null. But, really, the best approach is to use chronic. That way, when an error occurs, you get to see all of the output.

podrecast disables the wget progress bar when standard output is not attached to a terminal.


podrecast makes the following (moderate) checks on incoming CGI requests:

  • The final path for a directory or file must be inside the root directory.
    This may be an issue if you use symbolic links.

  • The path may not contain the sequence "..".
    (Why on earth should it?)

  • The path may not be more than one thousand characters long.
    (That’s a lot of characters. Anything longer may be someone probing for vulnerabilities.)
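
The three checks above could be written along these lines in Python 3. This is a sketch under assumptions, not PodRecast's implementation: the function name `safe_path` and the order of the checks are invented, and "inside the root directory" is interpreted as "after resolving symbolic links", which is why symbolic links pointing outside the root are rejected.

```python
import os

MAX_PATH_LENGTH = 1000


def safe_path(root, path_info):
    """Return the resolved path for a request, or None if it is rejected."""
    if len(path_info) > MAX_PATH_LENGTH:
        return None
    if ".." in path_info:
        return None
    root = os.path.realpath(root)
    # realpath follows symbolic links, so a link that leaves the root fails below.
    candidate = os.path.realpath(os.path.join(root, path_info.lstrip("/")))
    if candidate != root and not candidate.startswith(root + os.sep):
        return None
    return candidate


print(safe_path("/srv/podrecast", "/BBC/More-Or-Less/feed.xml") is not None)
print(safe_path("/srv/podrecast", "/../etc/passwd"))
print(safe_path("/srv/podrecast", "/" + "a" * 1001))
```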

If you’re running podrecast on a public server, then it may be best to configure your web server to require some form of authentication.