For me, two reasons:
Podcast loudness can vary greatly across feeds. Some can be too loud, some too quiet. And I don’t want to fiddle with the volume for each new podcast.
Post-processing with utilities such as MP3Gain can be used to make all podcasts sound equally loud, eliminating the need to fiddle with the volume.
In the EU, there are rules governing the maximum power output that consumer devices, such as smartphones, can generate through headphones. In a noisy environment (for me, that’s when cycling) the mandated maximum output level can be too low, particularly for spoken podcasts.
Again, MP3Gain can be used to pre-boost the gain such that podcasts can be listened to comfortably in both quiet and noisy environments.
PodRecast enables such post-processing by downloading podcasts to a local system, post-processing them, then republishing them.
PodRecast might also be used to change the bit rate, possibly to reduce storage and/or transmission costs.
Republish podcasts, keeping the rss/xml exactly as it was, except for replacing enclosure URLs with URLs referencing local, post-processed files.
By keeping the rss/xml exactly as it originally was, metadata such as the title, the publication date, the GUID and any associated imagery remain embedded in the feed.
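To make the idea concrete, here is a minimal sketch of such a rewrite. It is not PodRecast's own code: it uses the standard library's ElementTree (with Python 3's `encoding="unicode"`) rather than lxml, and `rewrite_enclosures` and `local_url_for` are hypothetical names. ElementTree also re-serialises the document, so the result is equivalent to the original feed rather than byte-for-byte identical.

```python
import xml.etree.ElementTree as ET


def rewrite_enclosures(rss_text, local_url_for):
    """Return the feed with enclosure URLs replaced and everything else kept.

    local_url_for is a hypothetical callback: it maps an original enclosure
    URL to the URL of the local, post-processed file, or returns None to
    leave that enclosure alone.
    """
    root = ET.fromstring(rss_text)
    for enclosure in root.iter("enclosure"):
        original = enclosure.get("url")
        replacement = local_url_for(original) if original else None
        if replacement:
            # Only the url attribute changes; type, length, the item's
            # title, pubDate and guid are left untouched.
            enclosure.set("url", replacement)
    return ET.tostring(root, encoding="unicode")
```

Because only the `url` attribute is touched, a podcast client still sees the same titles, dates and GUIDs as in the original feed.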
Download feeds and new podcasts to local storage.
PodRecast does this.
PodRecast does not do the post-processing itself, although a couple of suitable post-processing scripts are included in the distribution.
PodRecast, serving now as a CGI script, then publishes exactly the original feed’s rss/xml, but with each enclosure’s URL replaced with a local URL referencing a local, post-processed file.
PodRecast does this.
All of this can be driven by cron.
Python: 2.6 or 2.7; not 3.x.
lxml: this is
Getting the Source
Install — Executables
sudo python ./setup.py install
python ./setup.py install --prefix=$HOME/local
Install — CGI
The installation script tries to install podrecast as a CGI script in one of:
However, the location of the CGI bin directory and its configuration are system dependent, so the installation script may fail to find a suitable CGI bin, or may install the script in the wrong place. Check the last few lines of output from the installation script.
If the installation script fails to install the CGI script in the correct location, then it may be possible to hard link the podrecast executable into the CGI bin, or link to it with a symbolic link. You will have to do this manually.
Or, install it directly; something like one of the following…
sudo install -v -m 0555 ./podrecast /YOUR/CGI/BIN
install -v -m 0555 ./podrecast /YOUR/CGI/BIN
PodRecast searches for its configuration file in the following locations (and in this order):
Here is an example configuration file…
root /home/blott/tmp/podrecast
Cycling/Velo-Cast http://velocastcc.squarespace.com/race-radio/rss.xml
BBC/More-Or-Less http://downloads.bbc.co.uk/podcasts/radio4/moreorless/rss.xml
BBC/Life-Scientific http://downloads.bbc.co.uk/podcasts/radio4/tls/rss.xml
Economics/Freakonomics http://feeds.feedburner.com/freakonomicsradio?format=xml
The root is the directory under which feeds and podcasts will be stored. This must come first.
The subsequent feed lines each consist of exactly two tokens:
the feed’s subdirectory, and
the feed’s URL
A commented version of this sample is installed by the setup script in:
Feeds must not share subdirectories.
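The rules above (root first, two tokens per feed line, no shared subdirectories) can be sketched as a small parser. This is an illustration, not PodRecast's actual parser: `parse_config` is a hypothetical name, treating lines that start with `#` as comments is an assumption, and the sketch only handles the single-root layout shown in the sample.

```python
def parse_config(text):
    """Parse configuration text into (root, [(subdirectory, url), ...])."""
    root = None
    feeds = []
    seen = set()
    for number, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        tokens = stripped.split()
        if len(tokens) != 2:
            raise ValueError("line %d: expected exactly two tokens" % number)
        first, second = tokens
        if first == "root":
            if root is not None:
                raise ValueError("line %d: this sketch allows only one root" % number)
            root = second
        elif root is None:
            raise ValueError("line %d: the root must come first" % number)
        elif first in seen:
            raise ValueError("line %d: subdirectory %r is already in use" % (number, first))
        else:
            seen.add(first)
            feeds.append((first, second))
    return root, feeds
```

With the sample configuration above, this would yield the root `/home/blott/tmp/podrecast` and four `(subdirectory, url)` pairs.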
podrecast        # download all available podcasts
# or
podrecast -c     # catch up
# or
podrecast -a     # automatic catch up (only catch up on newly-added feeds)
# or
podrecast -m 2   # download at most 2 podcasts from each feed
When podcasts are downloaded, they are not immediately available for delivery. First, they must be post processed.
Here is an example.
A longer version of this script is installed by the setup script.
#!/bin/sh

# Balance the gain across podcasts, and boost it.
gain=6

for file in `podrecast --downloaded`
do
   nice nice mp3gain -r -c -p -t -m $gain $file \
      && podrecast --ready $file
done
Or, if you don’t want to do any post processing at all, then it can be as simple as:
#!/bin/sh
podrecast --ready `podrecast --downloaded`
This script is also installed by the setup script.
To republish content, podrecast must be installed as a CGI script. This should be done on installation.
The URLs for republished feeds are of the form:
For example, with the sample configuration file above, they might be:
PodRecast also publishes a simple HTML page listing its feeds' URLs.
This can be useful for testing or for entering the URLs on mobile devices: point your device towards the feeds list and either click or copy the feed URLs you’re interested in.
The URL for the feed list is:
For example, in my own case, it’s:
A crontab might read like:
# download podcasts
05 2,12,17,20,23 * * * chronic podrecast -a
# post processing
55 2,12,17,20,23 * * * chronic podrecast_mp3gain
If you don’t want to use chronic, then this should do the trick:
05 2,12,17,20,23 * * * podrecast -a > /dev/null
55 2,12,17,20,23 * * * podrecast_mp3gain > /dev/null
The maximum number of podcasts to download, per feed. The default is 0, meaning download all available podcasts.
Mark all podcasts as downloaded, but do not actually download them.
Auto catch up.
Whenever (and only when) PodRecast encounters a newly-configured feed, it marks all content as downloaded, but does not actually download it.
The idea is that newly-configured feeds may contain tens or even hundreds of podcasts, far more than most people actually want to download.
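The interplay of the download limit, catch up and auto catch up can be sketched as a small decision function. Everything here is hypothetical (the name `plan_downloads`, its parameters, and the assumption that a limit keeps the first pending entries in feed order); it only illustrates the behaviour described above, not how PodRecast implements it.

```python
def plan_downloads(enclosures, seen, feed_is_new,
                   catch_up=False, auto_catch_up=False, maximum=0):
    """Return (to_download, to_mark_as_downloaded) for one feed.

    enclosures  -- enclosure URLs in feed order
    seen        -- URLs already recorded as downloaded
    feed_is_new -- True when this feed has only just been configured
    """
    pending = [url for url in enclosures if url not in seen]
    if catch_up or (auto_catch_up and feed_is_new):
        # Catching up: record everything as downloaded, fetch nothing.
        return [], pending
    if maximum > 0:
        # A maximum of 0 means "no limit".
        pending = pending[:maximum]
    return pending, []
```

So with auto catch up enabled, a newly-added feed with hundreds of episodes contributes nothing to the first run, and only episodes published afterwards are downloaded.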
Output a list of files which have been downloaded and are ready for post processing.
Register these previously-downloaded files as now post processed, and ready for delivery as podcasts.
podrecast uses wget for all downloads. For authenticated feeds, create a .netrc file holding the usernames and passwords.
When downloading podcasts, podrecast locks (with flock) its configuration file to prevent two instances running at the same time (although that probably wouldn’t be too much of a problem).
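For illustration, locking a file with flock from Python might look like the sketch below. `lock_config` is a hypothetical helper, and whether PodRecast waits for the lock or gives up immediately is not stated here; this sketch assumes it gives up.

```python
import fcntl


def lock_config(path):
    """Try to take an exclusive lock on the configuration file.

    Returns the open file (keep it open for as long as the lock is needed),
    or None if another instance already holds the lock.
    """
    handle = open(path, "r")
    try:
        # LOCK_NB makes flock fail at once instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError):
        handle.close()
        return None
    return handle
```

The lock is released when the file is closed or the process exits, so a crashed run does not leave a stale lock behind.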
podrecast handles all downloads via temporary files (cf. mkstemp). Therefore, you should not encounter partial downloads with normal-looking file names.
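The temporary-file technique can be sketched as follows. This is not PodRecast's code: `download_atomically` and its `fetch` callback are hypothetical, and the hidden `.partial-` prefix is an assumption. The point is that the final name only appears once the download has completed.

```python
import os
import tempfile


def download_atomically(destination, fetch):
    """Download to a temporary file, then rename it into place.

    fetch is a hypothetical callable that writes the download to the path
    it is given (for example, by running wget).
    """
    directory = os.path.dirname(os.path.abspath(destination))
    # Create the temporary file in the same directory, so that the rename
    # below stays on one file system.
    fd, temporary = tempfile.mkstemp(prefix=".partial-", dir=directory)
    os.close(fd)
    try:
        fetch(temporary)
        os.rename(temporary, destination)
    finally:
        # After a successful rename the temporary name no longer exists;
        # after a failure, remove the partial file.
        if os.path.exists(temporary):
            os.remove(temporary)
```

If `fetch` raises, the exception propagates and no file with the destination's name is ever created.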
podrecast does not delete downloaded files. Just use cron and find for that:
find /MY/ROOT/DIRECTORY -type f -name '[^.]*' -mtime +20 -delete
podrecast stores its state in hidden files in the feed directory itself. Whenever it downloads a feed, it touches all of the state files related to enclosures mentioned in the feed. Therefore, state files with old modification times can safely be deleted. Something like:
# This assumes that podrecast has in fact been run recently.
#
find /MY/ROOT/DIRECTORY -type f -name '.*.podrecast_done' -mtime +20 -delete
However, there are usually not so many state files, and they take up next to no space; so I just let them be.
podrecast decides that it is running as a CGI script if all of the following
four environment variables are set:
PATH_INFO. Otherwise, it considers itself to be
In general, there is no reason why multiple root directories cannot be defined. However, when running as a CGI script, podrecast requires that the configuration file defines exactly one such root.
podrecast is quite chatty on standard output. This can be helpful when run interactively, but not when run (incorrectly) as a cron job. You can always redirect standard output to /dev/null. But, really, the best approach is to use chronic. That way, when an error occurs, you get to see all of the output.
podrecast disables the wget progress bar when standard output is not attached to a terminal.
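A sketch of how such a check might look is below. The helper `wget_command` is hypothetical, and using wget's `--no-verbose` option to suppress the progress bar is an assumption; PodRecast may well do it differently.

```python
import sys


def wget_command(url, output_path, stream=None):
    """Build a wget command line, quieter when output is not a terminal."""
    stream = sys.stdout if stream is None else stream
    command = ["wget", "--output-document", output_path]
    if not stream.isatty():
        # Assumption: --no-verbose is the option used to drop the progress bar.
        command.append("--no-verbose")
    command.append(url)
    return command
```

The returned list is suitable for passing to `subprocess.call`.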
podrecast makes the following (moderate) checks on incoming CGI requests:
The final path for a directory or file must be inside the root directory.
This may be an issue if you use symbolic links.
The path may not contain the sequence "
(Why on earth should it?)
The path may not be more than one thousand characters long.
(That’s a lot of characters. Anything longer may be someone probing for vulnerabilities.)
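Two of these checks, the containment check and the length limit, can be sketched like this. `resolve_request_path` is a hypothetical name and this is not PodRecast's own code; the check on the forbidden character sequence is left out of the sketch.

```python
import os

MAX_PATH_LENGTH = 1000


def resolve_request_path(root, path_info):
    """Return the real path for a CGI request, or None to refuse it."""
    if len(path_info) > MAX_PATH_LENGTH:
        return None
    real_root = os.path.realpath(root)
    # realpath resolves symbolic links and ".." components, so the test
    # below is applied to the final path, not to the path as requested.
    candidate = os.path.realpath(
        os.path.join(real_root, path_info.lstrip("/")))
    if candidate != real_root and not candidate.startswith(real_root + os.sep):
        return None
    return candidate
```

Because symbolic links are resolved before the comparison, a link inside the root that points outside it is refused, which is the symbolic-link issue mentioned above.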
If you’re running podrecast on a public server, then it may be best to configure your web server to require some form of authentication.