This post is a bit rushed, but if I don’t post it now, I know I won’t have a chance for another few weeks. I leave for the airport to go back to CMU tomorrow. My bags are packed, everything’s all squared away… except I want to write about a little project I’ve been working on this summer. So, without further ado…
If you don’t already know, Reddit is a hub of internet links and associated comment sections. The top links shown are the ones that receive the most upvotes (user-submitted clicks of approval). Reddit is divided into “subreddits”, each with its own community theme. For example, reddit.com/r/dexter is a community where people share links about the TV show Dexter. /r/wallpapers is a subreddit where people post links to their favorite desktop wallpaper images. I like to look at /r/aww when I need some cheering up; people post pictures of adorable kittens, puppies, and all sorts of cute things.
But I’ve been thinking, wouldn’t it be great if, instead of navigating to Reddit to see those pictures, I just had to look at a folder on my computer? Or even just at my desktop wallpaper? And what if there were a different adorable kitten on my desktop every time I looked?
I created an app that grabs the top pictures from any subreddit and saves them to any directory on your computer. Mac OS X has an option to change the desktop wallpaper, at a set interval, to a random image from a given directory. So I configured Reddit Scraper to grab the top 5 pictures from /r/aww every day at 8:00 PM (using cron) and put them in my wallpapers directory.
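For reference, the crontab entry for that daily 8:00 PM run looks something like this (the paths are illustrative; yours will depend on where you put the repo):

```
# Run the scraper every day at 20:00
0 20 * * * /usr/bin/python /path/to/reddit_scrape/scrape.py
```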
The app is very configurable. You can take images from any number of subreddits and save them in whichever directories you choose. If you want to, you can even take non-image files, like HTML files, and save them for offline viewing. See the “How to use it” section below for details on how to start getting daily (or even hourly, or minutely, or weekly…) updated wallpapers or archival folders from /r/wallpapers, /r/earthporn, you name it!
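To give a feel for what “grabbing the top pictures” involves: the app itself uses PRAW, but the same idea can be sketched against Reddit’s public JSON listing endpoint. The function names here are my own and not from the repo:

```python
import json
import urllib.request

# Extensions that count as a directly downloadable image
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")

def is_image_url(url):
    """Return True if the URL points directly at an image file."""
    return url.lower().endswith(IMAGE_EXTENSIONS)

def top_image_urls(subreddit, limit=5):
    """Fetch today's top posts from a subreddit and keep direct image links."""
    api = "https://www.reddit.com/r/%s/top.json?t=day&limit=%d" % (subreddit, limit)
    # Reddit rejects requests without a descriptive User-Agent
    req = urllib.request.Request(api, headers={"User-Agent": "wallpaper-scraper"})
    with urllib.request.urlopen(req) as resp:
        listing = json.load(resp)
    posts = listing["data"]["children"]
    return [p["data"]["url"] for p in posts if is_image_url(p["data"]["url"])]
```

Something like `top_image_urls("aww")` would then hand back up to five image links ready to download.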
How to use it:
1. Fork the source on GitHub (link at the top of this post)
2. Make sure you have Python 2.7 and other dependencies installed (notably PRAW)
3. Run __main__.py and use the GUI to configure your settings (make sure to press Save)
4. Use your favorite process scheduler to run scrape.py at whatever interval you choose
5. Sit back and let reddit_scrape do all the work.
That’s it! If just one person gets use out of this, I’ll be happy. Please comment below with any feedback (suggestions, bugs, etc.).