NaKyle On Tech talkin bout tech

Steam Bot - Shopping better, with Python!

I spend a lot of time on Steam, and almost as much money. I also spend far too much time on Reddit, so to save myself some time and money I built a Reddit bot that will make reports on the best stuff on sale and post them! You can see how I did it after the jump.

Steam is just about the best place to buy games online these days. Not necessarily because it's the cheapest, but for the combination of good prices, excellent community, and ease of use.

An issue I've run into time and again is how difficult it can be to figure out everything that's on sale and how much it's going for. So to fill this gap I started work on what is now /u/steam_bot on Reddit.

Basically it pulls down current store data for every app in Steam and formats it into a nice little report every six hours. Sounds simple enough, right? Well, not exactly.

How'd Ya Do It!?

The Toolbox

  • Python 2.7 - My Python version of choice due to working conditions.
  • Requests - Best HTTP request handling module out there!
  • PRAW - Best Python Reddit API Wrapper!

This is the standard toolset I use when building a new Reddit bot. Keeps it nice, simple, and light!

Building It

Note: For this walkthrough I'm going to take the path of least resistance. There are better ways of doing a lot of these things, and there will be a lot of uncaught exceptions, but I leave finding those as an exercise for the reader.

First we need to gather up info on all the apps in Steam, app ids and names mostly. We can get that from the public API endpoint here:

Here's the beginning of what you should see returned:

    {
        "applist": {
            "apps": [
                {
                    "appid": 5,
                    "name": "Dedicated Server"
                },
                ...

You don't even need an API key, though I always recommend being nice to public APIs and avoiding anything that could be misconstrued as abuse. Steam has pretty specific rules on API usage you can find here.

That endpoint will give you a list of all the app ids and names that Steam currently tracks. This includes servers, movies, DLC, and other stuff we'll have to remove later.

You can easily load up that info and parse it out into a nice list of dicts like this:
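Here's a minimal sketch of that step, assuming the response has the applist/apps shape shown above. The endpoint URL and the helper names `parse_app_list` and `fetch_app_list` are assumptions for illustration, so double check the URL against the current Steam Web API docs before relying on it.

```python
import json

# Assumed endpoint URL; verify it against the Steam Web API documentation.
APP_LIST_URL = "https://api.steampowered.com/ISteamApps/GetAppList/v2/"


def parse_app_list(payload):
    """Turn the decoded JSON response into a plain list of dicts."""
    return [
        {"appid": app["appid"], "name": app["name"]}
        for app in payload["applist"]["apps"]
    ]


def fetch_app_list(url=APP_LIST_URL):
    """Download and parse the app list (network call, no error handling)."""
    import requests  # third-party; imported here so parsing works offline

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return parse_app_list(response.json())


# Offline example using the sample data from above.
sample = json.loads(
    '{"applist": {"apps": [{"appid": 5, "name": "Dedicated Server"}]}}'
)
apps = parse_app_list(sample)
```

Calling `fetch_app_list()` does the same thing against the live endpoint.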

Now that we've got all the app ids and names into a big list we can start doing some fun stuff!

It's nice to add the store url to each of the items which we can do like so:
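Something along these lines would do it. The store URL pattern and the `store_url` key are assumptions on my part; the API itself doesn't hand you a URL.

```python
# Assumed store page URL pattern; not returned by the API itself.
STORE_URL_TEMPLATE = "https://store.steampowered.com/app/{appid}/"


def add_store_urls(apps):
    """Add a 'store_url' key to every app dict, in place."""
    for app in apps:
        # 'app' is the same dict object that lives in the list,
        # so this assignment updates the list's contents directly.
        app["store_url"] = STORE_URL_TEMPLATE.format(appid=app["appid"])
    return apps


apps = [{"appid": 5, "name": "Dedicated Server"}]
add_store_urls(apps)
```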

Python doesn't copy objects unless you explicitly ask it to, so each app we get in the loop is the very same dict that lives in the list. This lets us update all the apps in place while we iterate, without having to worry about indexes and such. Just remember never to add or remove anything from a list while you're iterating over it!

Next we can grab the price and other store page data. Unfortunately there isn't a documented public API endpoint for this. But a little snooping through the JavaScript on the store pages and a few trips to Google turned up an undocumented endpoint!

Note: Undocumented endpoints can change suddenly and should only be used with caution!

This endpoint gives us basically all the worthwhile content from the store page, including price and discount percent.

There are only a few parameters for this endpoint that I'm aware of:

Parameter   Description
appids      Comma-separated list of appids to retrieve info for. Up to 10 seems to work fine.
cc          Country code, used to retrieve another country's prices and localizations.

The response you get from this endpoint looks basically like this:

    {
        "<APPID>": {
            "data": {
                "steam_appid": "<APPID>",
                "price_overview": {
                    "currency": "USD",
                    "initial": 5999,
                    "final": 5999,
                    "discount_percent": 0
                },
                ...

And so on. There are a ton of fields with all kinds of neat data you can explore, but for now we're only concerned with the price overview and a few other bits.

So we can go about loading the information for each app like this:
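A rough sketch of the one-app-at-a-time approach follows. The endpoint URL is an assumption (it's undocumented, so it may well have moved), and `extract_price`, `load_store_data`, and the `price` key are hypothetical names.

```python
# Assumed URL for the undocumented store endpoint; it may change without notice.
APP_DETAILS_URL = "https://store.steampowered.com/api/appdetails"


def extract_price(details, appid):
    """Return the price_overview dict for one app, or None if it has none."""
    entry = details.get(str(appid)) or {}
    data = entry.get("data") or {}
    return data.get("price_overview")


def load_store_data(apps, country="us"):
    """Request store details one app at a time and attach the price info."""
    import requests  # third-party; imported here so extract_price works offline

    for app in apps:
        params = {"appids": app["appid"], "cc": country}
        response = requests.get(APP_DETAILS_URL, params=params, timeout=30)
        app["price"] = extract_price(response.json(), app["appid"])
    return apps


# Offline example with a response shaped like the sample above.
sample = {
    "10": {
        "data": {
            "steam_appid": "10",
            "price_overview": {
                "currency": "USD",
                "initial": 5999,
                "final": 5999,
                "discount_percent": 0,
            },
        }
    }
}
price = extract_price(sample, 10)
```

Apps without a `price_overview` (or without a `data` block at all) simply end up with `None`.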

That's the simplest way to get the data, but I recommend batching around 5 appids into a single request to drastically lower your request count, along with other optimizations such as filtering out non-game items.
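The batching part can be sketched like this. `chunked` and `appids_param` are hypothetical helpers; the second one just builds the comma separated value for the appids parameter.

```python
def chunked(items, size):
    """Yield consecutive slices of 'items' that are at most 'size' long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def appids_param(batch):
    """Build the comma separated 'appids' value for one batch of apps."""
    return ",".join(str(app["appid"]) for app in batch)


apps = [{"appid": n} for n in (5, 7, 8, 10, 20, 30, 40)]
batches = [appids_param(batch) for batch in chunked(apps, 5)]
```

Seven apps turn into two requests instead of seven.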

Now I also wanted to get a bit of information about current players to help decide what the best apps really are. Fortunately Steam has another handy documented endpoint for us:

This endpoint requires one parameter, appid, which is the integer appid we got from the first request. Here's some sample output:

    {
        "response": {
            "result": 1,
            "player_count": 205
        }
    }

We'll use requests again to get the current players for all the apps:
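Roughly like so, assuming the response looks like the sample above. The endpoint URL is my best guess at the documented one, and `parse_player_count`, `load_player_counts`, and the `players` key are hypothetical names.

```python
# Assumed endpoint URL; verify it against the Steam Web API documentation.
PLAYERS_URL = (
    "https://api.steampowered.com/"
    "ISteamUserStats/GetNumberOfCurrentPlayers/v1/"
)


def parse_player_count(payload):
    """Pull player_count out of a decoded response, defaulting to 0."""
    return payload.get("response", {}).get("player_count", 0)


def load_player_counts(apps):
    """Attach the current player count to every app dict, in place."""
    import requests  # third-party; imported here so parsing works offline

    for app in apps:
        params = {"appid": app["appid"]}
        response = requests.get(PLAYERS_URL, params=params, timeout=30)
        app["players"] = parse_player_count(response.json())
    return apps


# Offline examples: one normal response, one without a player count.
count = parse_player_count({"response": {"result": 1, "player_count": 205}})
missing = parse_player_count({"response": {"result": 42}})
```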

So now we have a big list of all the apps on Steam as well as their store page information and how many players are currently online for each. It's all just a mish mash right now so let's start filtering out the cream of the crop!

Python has some really nice, simple dataset manipulation tools out of the box. First let's remove all the apps that aren't on sale using the builtin filter and a quick lambda expression. We can also quickly sort by current players and pull out the top 20 using the sorted function:
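A sketch of both steps, assuming each app dict carries a `price` dict (the price overview) and a `players` count. Those key names and `top_sales` are hypothetical.

```python
def top_sales(apps, limit=20):
    """Keep discounted apps and return the most played ones first."""
    on_sale = filter(
        lambda app: (app.get("price") or {}).get("discount_percent", 0) > 0,
        apps,
    )
    by_players = sorted(
        on_sale, key=lambda app: app.get("players", 0), reverse=True
    )
    return by_players[:limit]


apps = [
    {"name": "A", "players": 10, "price": {"discount_percent": 50}},
    {"name": "B", "players": 900, "price": {"discount_percent": 0}},
    {"name": "C", "players": 300, "price": {"discount_percent": 25}},
    {"name": "D", "players": 5, "price": None},
]
best = top_sales(apps)
names = [app["name"] for app in best]
```

B is popular but not discounted and D has no price at all, so only C and A make the cut.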

Now we can start formatting a Reddit post! Reddit self posts need a title and a string of body text that can be up to 10,000 characters long. We'll start with the title:
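For example, something like this. The title wording and the `build_title` helper are placeholders; the optional `now` argument is only there so the function is easy to test.

```python
import time


def build_title(now=None):
    """Return a post title stamped with the UTC date and time."""
    if now is None:
        now = time.gmtime()  # current time as a UTC struct_time
    return time.strftime("Steam Sale Report - %Y-%m-%d %H:%M UTC", now)


title = build_title()
epoch_title = build_title(time.gmtime(0))
```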

That will give you a nice little post title with the date and UTC time in it.

Next we can build the post body. I basically use a bunch of tables; Reddit uses a nice flavor of Markdown syntax, and you can find a reference here.
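Here's a sketch of a single table, not the exact layout steam_bot uses. It assumes prices come back in cents (so 5999 means 59.99), that everything is in USD, and that each app dict has the hypothetical `store_url`, `players`, and `price` keys.

```python
def format_row(app):
    """Format one app as a row of a Reddit Markdown table."""
    price = app["price"]
    return (
        "[{name}]({url})|${final:.2f}|{discount}%|{players}"
        .format(
            name=app["name"],
            url=app["store_url"],
            final=price["final"] / 100.0,  # assumes prices arrive in cents
            discount=price["discount_percent"],
            players=app["players"],
        )
    )


def build_body(apps):
    """Build the post body: a header, an alignment row, then one row per app."""
    lines = [
        "Name|Price|Discount|Players",
        ":--|--:|--:|--:",
    ]
    lines.extend(format_row(app) for app in apps)
    return "\n".join(lines)


apps = [
    {
        "name": "Example Game",
        "store_url": "https://store.steampowered.com/app/10/",
        "players": 205,
        "price": {"final": 1499, "discount_percent": 75},
    }
]
body = build_body(apps)
```

App names containing a pipe or square brackets would break the table, so a real bot should escape those first.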

You can see how I used Python's implicit line continuation within parentheses to keep the code clean and under the usual 80 character limit, as well as some of the more complex string formatting operations to format the price and discount numbers.

Now all we need to do is initialize a Reddit instance using PRAW and get a subreddit instance to submit to.
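A sketch of that setup, assuming a current PRAW release (4 or later) where the credentials are passed straight to `praw.Reddit`; older releases logged in differently. The environment variable names, the user agent string, and both helper names are made up for the example.

```python
import os


def reddit_settings(env=os.environ):
    """Collect PRAW credentials from (hypothetical) environment variables."""
    return {
        "client_id": env["STEAM_BOT_CLIENT_ID"],
        "client_secret": env["STEAM_BOT_CLIENT_SECRET"],
        "username": env["STEAM_BOT_USERNAME"],
        "password": env["STEAM_BOT_PASSWORD"],
        "user_agent": "steam sale report bot (example)",
    }


def get_subreddit(name, settings):
    """Create the Reddit instance and return the subreddit to post to."""
    import praw  # third-party; imported here so reddit_settings works alone

    reddit = praw.Reddit(**settings)
    return reddit.subreddit(name)


# Offline example with a fake environment.
fake_env = {
    "STEAM_BOT_CLIENT_ID": "id",
    "STEAM_BOT_CLIENT_SECRET": "secret",
    "STEAM_BOT_USERNAME": "bot",
    "STEAM_BOT_PASSWORD": "hunter2",
}
settings = reddit_settings(fake_env)
```

Keeping credentials in the environment rather than in the script means you can share the source without leaking your bot's password.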

Any of those calls has the potential to raise exceptions, but I'll leave catching and handling them to you.

Now that we have our subreddit context to post to and our post title and body to submit, there's only one thing left to do!
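With a current PRAW release that last step is a single `submit` call on the subreddit. In this sketch `submit_report` is a hypothetical wrapper and `FakeSubreddit` is a stand-in so you can try it without actually posting anything.

```python
def submit_report(subreddit, title, body):
    """Submit a self post and return whatever the subreddit hands back."""
    return subreddit.submit(title, selftext=body)


class FakeSubreddit(object):
    """Stand-in used to try the helper without touching Reddit."""

    def submit(self, title, selftext=None):
        return {"title": title, "selftext": selftext}


result = submit_report(FakeSubreddit(), "Example title", "Example body")
```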

result will contain things like your post id, url, etc. If it's unable to post it will raise an exception.

And that's it, you've made a Steam sale report bot, and you have the tools and know-how to start pulling data from any number of other resources to enrich your bot's posts! To keep things simple you can just set up a cron job to run your bot on a schedule, and add logging and reporting functions while you're at it.

You can also take a look at the much more fleshed out source of my own steam_bot over on BitBucket!
