Creating my own POD >> Read this to me

Working with acloudguru on an AWS certificate, I stumbled across a cool project that I’ve had in mind for some time.

Back then I had this broader idea to…

  1. Take an RSS feed
  2. Convert the reads into audio stories
  3. Smash all the audio stories together into one MP3
  4. Publish the MP3 daily into a podcast
  5. Listen to my fave blogs on the way to work b/c candidly writing is better than interviewing MOST of the time.

I already figured out how to do #4 quite easily in an earlier survey into this idea. My wife wanted this for Mr. Money Mustache blogs.

The harder steps were 1 + 2 + 3. Well, now I’ve got a small tool up @ http://johntrhoads.com/polly/ that will take any* amount of text and convert it into MP3.

Hop on over, drop in some text, and get a really good sounding reading of it w/Amazon Polly. The default voice I’ve set up (Salli as of 7/8/18) is actually REALLY good sounding straight out of the box. A few months ago when I explored doing this more deeply, I was a little disappointed that the voices weren’t good enough. Seems like they are going to be EXCELLENT in just a few months/years.

The whole setup is built on AWS Lambda + S3, so it is effectively free for me to operate. Been loving this incremental step towards something larger for now. Need to figure out how to take an RSS feed and send in the right SSML (Speech Synthesis Markup Language, which controls speech intonation) to make the conversions, but this basically handles most of #2 and #3 above, with the destination file sitting in an S3 bucket; so very ready for #4.
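
For a rough idea of what the SSML step could look like, here is a minimal sketch that wraps plain text in SSML and puts a pause between paragraphs. The helper name `text_to_ssml` and the 700 ms default pause are assumptions made for illustration, not what the tool actually does.

```python
from xml.sax.saxutils import escape


def text_to_ssml(text: str, pause_ms: int = 700) -> str:
    """Wrap plain text in SSML, with a pause between paragraphs.

    Hypothetical helper: the name and the default pause are illustrative.
    """
    # Treat blank lines as paragraph boundaries and drop empty chunks.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Collapse runs of whitespace and escape &, < and > so the text
    # cannot break the surrounding SSML markup.
    body = f'<break time="{pause_ms}ms"/>'.join(
        f"<p>{escape(' '.join(p.split()))}</p>" for p in paragraphs
    )
    return f"<speak>{body}</speak>"
```

The resulting string is the kind of input Polly's `synthesize_speech` call accepts when `TextType` is set to `"ssml"`; the call itself is left out here so the sketch runs without AWS credentials.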

In the meantime, it has been great to convert some digital books/long reads into MP3. Huzzah.

Missing steps:

  • Build RSS feed (feedly is my guess on where to start)
  • Parse RSS feed and get contents (can’t be THAT hard can it?) in Python on AWS Lambda
  • Feed into my existing Lambda chain that takes text > SNS/DB > Polly > MP3 > S3
  • Take S3 and move through autopublishing steps to put into my personal podcast feed.
  • Identify a triggering factor for this daily + a good time for it to run (somewhat dependent on the blogs?). Thinking IFTTT for now, but not sure.
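
The parsing step in the list above can be sketched with nothing but the Python standard library, which keeps a Lambda deployment package small. This is a minimal sketch assuming a plain RSS 2.0 feed; the function name `parse_rss_items` and the choice of fields are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def parse_rss_items(rss_xml: str) -> List[Dict[str, str]]:
    """Return title, link, pubDate and description for each <item>."""
    root = ET.fromstring(rss_xml)
    items = []
    # RSS 2.0 nests entries as <rss><channel><item>; iter() finds them all.
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child tag is missing.
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "pubDate": item.findtext("pubDate", default="").strip(),
            "description": item.findtext("description", default="").strip(),
        })
    return items
```

Many blogs put the full post body in a namespaced `content:encoded` element and only a teaser in `description`, so a real version would likely need to look there too.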

*not quite sure what the limits are of this b/c the Lambda function can only do 5mins. So far… pretty close to any tho.
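
One possible way around the limits is to split long text into smaller pieces on sentence boundaries and convert each piece separately. The sketch below is an assumption about how that could work, not a description of the live tool; `split_text` is a hypothetical helper and the `max_chars` default is a placeholder, so the real per-request limit should be taken from the Polly documentation.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 3000) -> List[str]:
    """Split text into chunks of at most max_chars, breaking after sentences."""
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single oversized sentence is hard-cut into max_chars pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            # Still fits: keep growing the current chunk.
            current = candidate
        else:
            # Would overflow: close the chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then travel as its own message through the chain, which keeps any single Lambda invocation short.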

Update: 9/9/2018 … I have a podcast that works. Woah!?!!

So I wound up using this as a stepping stone as described above. I am using Lambda and have my own podcast updating each day. There are several functions with SNS in between to navigate around the inherent limits of Lambda. I am quite certain that for a big feed, this would break down but so far no issues w/ the few blogs I suck in each day.

Some rundown:

  1. Using an RSS mixer to combine all my blogs into one RSS to make fetching easier. Very welcome and free! Ideally this would be coded up via a front-end but less is more at this stage of the project.
  2. Using Lambda with a last run time to parse blogs and only look for stuff since the last run. This helped avoid duplicating the conversion of posts.
  3. Then I send to Polly using the Speech Synthesis Markup Language (SSML) for pauses and emphasis. I can’t believe how powerful and accessible Polly is. I have no doubt someone will be building powerful things on top of it soon…
  4. On Polly response, take the returned info and format it into XML as follows
    1. Take a header.xml file I manually constructed for my feed. Think of this as the thumbnails and name and etc for the Podcast. Again, I could build a UI for this, but at present no need. Just running super lean at the moment.
    2. Then smash in the current date to update the <lastBuildDate> tag (so Overcast knows to update)
    3. Then drop in the latest MP3 rendering for the day and other relevant info formatted properly as XML. This is a little sparse because I don’t store the blog post info I am parsing at present. It would be a good idea to do that, but again a nice to have all things being equal.
    4. Then parse out the old .xml file my podcast is hosted on for everything after </lastBuildDate>
    5. Update the XML file in place so that
    6. Trigger from a CloudWatch Events cron every day at 6am or smth like that.
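
The feed-splicing steps above can be sketched as plain string handling. This is a minimal sketch under a few assumptions: `header` holds everything in the feed up to (but not including) `<lastBuildDate>`, the old feed contains exactly one `</lastBuildDate>` tag, and the names `build_item` and `update_feed` plus the fields inside `<item>` are illustrative rather than the actual code. Reading and writing the files in S3 is left out.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from xml.sax.saxutils import escape

MARKER = "</lastBuildDate>"


def build_item(title: str, mp3_url: str, size_bytes: int, when: datetime) -> str:
    """Render one episode as an RSS <item> (hypothetical minimal field set)."""
    # Escape quotes as well, since the URL is used inside an attribute.
    url = escape(mp3_url, {'"': "&quot;"})
    return (
        "<item>"
        f"<title>{escape(title)}</title>"
        f'<enclosure url="{url}" length="{size_bytes}" type="audio/mpeg"/>'
        f"<guid>{url}</guid>"
        f"<pubDate>{format_datetime(when)}</pubDate>"
        "</item>"
    )


def update_feed(header: str, old_feed: str, new_item: str, when: datetime) -> str:
    """Splice a new episode into the feed, newest first."""
    # Everything after the old </lastBuildDate> is the existing episodes
    # plus the closing tags, so it can be reused unchanged.
    _, found, tail = old_feed.partition(MARKER)
    if not found:
        raise ValueError("old feed has no </lastBuildDate> tag")
    # format_datetime gives the RFC 2822 style date that RSS expects.
    stamp = f"<lastBuildDate>{format_datetime(when)}{MARKER}"
    return f"{header}{stamp}{new_item}{tail}"
```

For the daily trigger, a CloudWatch Events schedule expression such as `cron(0 6 * * ? *)` fires at 06:00 UTC every day, so the hour would need shifting to match a local 6am.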

Overall, very happy with the first pass results! If you’d like to give a listen, check out this Overcast link: https://overcast.fm/p952398-USmupT. Feeling pretty crafty on this one so far!
