Reddit Technical SEO

4th February 2016 | Blog

I was having a quick browse of /r/bigseo after doing some work on duplicate content for a client and, out of a habit I’d developed that day, stuck a trailing slash on the end of the URL to see what would happen.

It loaded the version with the trailing slash of course (existing alongside the non trailing slash URL), and then I thought – “I bet if I was to look at Reddit and write some stuff on their technical SEO, it’ll be quite interesting to people on Reddit interested in technical SEO and get me a million inbound links to my underpopulated blog so it can finally rank number 1 for ‘SEO implementation headache journal’.” (Just checked – it already does. Better find something else.)

Anyway – “this should be quick” is what I thought.

Let’s dive in.

I’ve crawled all the default subreddits (plus bigseo, webdev and web_design, because they love this sort of shit and I don’t want them feeling left out and bitter that they weren’t appropriately content marketed to).

Nothing too fancy: Screaming Frog and a few standard tests were enough to give me more blog material than I initially bargained for.

Dupe, dupe da dupe, dupe da dupe, dupe da dupe da dupe da dupe.

Going for a slash

As I mentioned above, you can load a version of a page without a trailing slash, and with a trailing slash.

So, both:

https://www.reddit.com/r/bigseo

and

https://www.reddit.com/r/bigseo/

exist.

That’s not the end of the world though, right? Nah – but from Reddit’s perspective, it’s a massive waste of their crawl budget. The crawlers are going to be duplicating effort and spending allocated budget unnecessarily, and when you consider the size of Reddit and the speed at which new content is added, that crawl budget could definitely be spent better. It’s effectively limiting the exposure of the site in search (new posts especially, I expect).

It happens for posts too:

https://www.reddit.com/r/pics/comments/92dd8/test_post_please_ignore

https://www.reddit.com/r/pics/comments/92dd8/test_post_please_ignore/

And it’s fairly clear from the search engine results pages that Google isn’t making any assumptions on a default, as different variations are indexed.

[Image: Reddit SERP result, no trailing slash]
[Image: Reddit SERP result, trailing slash]
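
If you want to spot this in your own crawl data, here’s a rough Python sketch of the check. It assumes you’ve exported the crawled URLs as a plain list (the sample list below reuses the URLs above, and `slash_duplicates` is a name I’ve made up for illustration). It groups URLs that differ only by a trailing slash:

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def slash_key(url):
    """Return the URL with any trailing slash removed from its path."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path.rstrip("/"), parts.query, "")
    )


def slash_duplicates(urls):
    """Group URLs that are identical apart from a trailing slash."""
    groups = defaultdict(set)
    for url in urls:
        groups[slash_key(url)].add(url)
    # Only keep groups where more than one variant was actually crawled.
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


# Hypothetical crawl export, using the example URLs from this post.
crawled = [
    "https://www.reddit.com/r/bigseo",
    "https://www.reddit.com/r/bigseo/",
    "https://www.reddit.com/r/pics/comments/92dd8/test_post_please_ignore",
    "https://www.reddit.com/r/pics/comments/92dd8/test_post_please_ignore/",
    "https://www.reddit.com/r/webdev/",
]

for key, variants in slash_duplicates(crawled).items():
    print(key, "->", variants)
```

It only tells you which pairs exist in the crawl, not which version a search engine has picked, so treat it as a starting point.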

Screaming Frog & The On Page Optimisation Of The Front Page Of The Internet

Okay – time for the crawl data – that requires tables. Everybody loves tables, but I have no truck with HTML tables as they’ve betrayed me too many times in the past. Here’s an embedded fusion table instead:

Subreddit Page titles.

They’re the subreddit title.  Mods can add this, and they do, so there’s massive variety and no real consistency.  Some are descriptive, some are 1 word.  Not much you can do about this really, cat’s out the bag – I’d be tempted to add a separator and a brand after each user generated title along the lines of “Movie News and Discussion | Reddit” – at least that confirms the destination.  Google does that anyway by itself because it’s so disgusted by the page titles, but I’d still want to sort it out on the site itself.
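
As a sketch of that separator-and-brand idea (the function name, the “ | ” separator and the “Reddit” suffix are my assumptions, nothing Reddit actually does), in Python it could be as small as this:

```python
def branded_title(title, brand="Reddit", separator=" | "):
    """Append a brand suffix to a user-generated page title, once."""
    clean = " ".join(title.split())  # collapse stray whitespace
    suffix = separator + brand
    if not clean:
        return brand  # nothing to brand, so fall back to the brand alone
    if clean.lower().endswith(suffix.lower()):
        return clean  # already branded, don't double up
    return clean + suffix


print(branded_title("Movie News and Discussion"))
```

The “already branded” check is there because mods can type anything they like, including the brand name.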

Subreddit Meta Descriptions

“reddit: the front page of the internet”

On every subreddit.

The open graph description pulls the ‘Description’ from the subreddit admin settings, and it would make sense for the meta description to do so too.  It doesn’t though, so all the meta descriptions are the same 38 characters of “reddit: the front page of the internet”.

It actually says in the subreddit description for admins: “Appears in search results and social media links. 500 characters max.” so I wonder if the code is borked and nobody realised.

Anyway – Google sensibly ignores that front page of the internet nonsense and pulls a snippet from the page content, but the defined description is probably the most sensible thing to use in the subreddit meta description because it’s a description of the subreddit.  I guess anything would be better than “reddit: the front page of the internet”.
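
Here’s a minimal sketch of what I mean, in Python. It assumes the subreddit description is available as a plain string at render time, and the 155 character cut-off is my own rough guess at a sensible snippet length rather than any official limit. The function name is made up too.

```python
import html

DEFAULT_DESCRIPTION = "reddit: the front page of the internet"
MAX_LENGTH = 155  # assumption: a rough snippet length, not an official limit


def meta_description_tag(description, max_length=MAX_LENGTH):
    """Build a meta description tag from a subreddit's own description."""
    text = " ".join(description.split())  # collapse newlines and extra spaces
    if not text:
        # No description set by the mods: fall back to the site default.
        text = DEFAULT_DESCRIPTION
    elif len(text) > max_length:
        # Cut at the last full word before the limit and mark the truncation.
        text = text[:max_length].rsplit(" ", 1)[0].rstrip(",.;:") + "..."
    return '<meta name="description" content="{}">'.format(
        html.escape(text, quote=True)
    )


print(meta_description_tag("News & discussion about movies"))
```

The default only kicks in when there’s genuinely nothing better, which is the opposite of what happens now.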

Meta Keywords

reddit, reddit.com, vote, comment, submit

There’s an obscure Peruvian search engine where these help people find a website on which they can vote, comment and submit.

More Subreddit SEO

H1s are alright I guess, not going to worry about anything after that in terms of on-page stuff. There’s no canonical though, which helps explain the variation in trailing slash search results. “Worth having a canonical” I always say. Why wouldn’t you want one? “Go on Go on Go on. Have a canonical.”

[Image: Go On]
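
For the avoidance of doubt, this is the sort of thing I’m on about. It’s a Python sketch under a couple of assumptions of mine: that the trailing slash version is the preferred one, and that query strings and fragments can be dropped from the canonical (which won’t be right for every page type). `canonical_link_tag` is a hypothetical helper name.

```python
import html
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url):
    """Build a rel=canonical tag pointing at the trailing-slash version."""
    parts = urlsplit(url)
    # Assumption: the trailing-slash URL is the preferred version.
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    # Assumption: query string and fragment are not part of the canonical.
    canonical = urlunsplit((parts.scheme, parts.netloc.lower(), path, "", ""))
    return '<link rel="canonical" href="{}">'.format(
        html.escape(canonical, quote=True)
    )


print(canonical_link_tag("https://www.reddit.com/r/bigseo"))
print(canonical_link_tag("https://www.reddit.com/r/bigseo/"))
```

Both calls print the same tag, which is the whole point: whichever variant gets crawled, it points at one preferred URL.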

Posts (top 5 all-time)

Page title is the post title, that makes sense.

Meta description is either the text from the self post, or, if there isn’t any, it’s everybody’s favourite: “reddit: the front page of the internet”.

Same meta keywords for the “vote, comment, submit” crowd.

No canonicals.

That’s it, I’m done with the on-page stuff for now.

Other stuff I feel compelled to add now I’ve started this nonsense.

Microdata.

Looks like this (some of it at least) would be relevant:
https://schema.org/DiscussionForumPosting
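
A rough idea of how that could look for a post, sketched in Python. Note that I’ve swapped in JSON-LD rather than inline microdata because it’s easier to show in one block. The properties are ones schema.org defines for this type, but the author name, date and comment count below are placeholder values I’ve invented, not real data from the post.

```python
import json


def discussion_jsonld(headline, url, author, date_published, comment_count):
    """Render a DiscussionForumPosting as a JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "DiscussionForumPosting",
        "headline": headline,
        "url": url,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
        "commentCount": comment_count,
    }
    # Escape "</" so user text cannot close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n{}\n</script>'.format(payload)


# Placeholder values for illustration only.
print(
    discussion_jsonld(
        headline="test post please ignore",
        url="https://www.reddit.com/r/pics/comments/92dd8/test_post_please_ignore/",
        author="example_user",
        date_published="2016-02-04",
        comment_count=0,
    )
)
```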

Pagespeed.

Google hates the mobile version of Reddit. With its millions of tiny font sizes, no mobile viewport set and touchy things all too close together, it hates it.

But so does everybody else, which is why nobody ever uses Reddit on mobile. You don’t need a tool to tell you how bad it is, although I just used one for that purpose.

Here’s a link to the pagespeed result.
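
The missing viewport is at least easy to check for yourself. Here’s a small Python sketch using only the standard library. It assumes you already have the page’s HTML as a string (the sample markup below is made up, not Reddit’s), and `find_viewport` is a name I’ve invented:

```python
from html.parser import HTMLParser


class ViewportFinder(HTMLParser):
    """Record the content of the first <meta name="viewport"> tag seen."""

    def __init__(self):
        super().__init__()
        self.viewport = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.viewport is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "viewport":
            self.viewport = attributes.get("content") or ""


def find_viewport(page_html):
    """Return the viewport content string, or None if no viewport is set."""
    finder = ViewportFinder()
    finder.feed(page_html)
    return finder.viewport


# Made-up sample markup for illustration.
good = '<head><meta name="viewport" content="width=device-width, initial-scale=1"></head>'
bad = "<head><title>no viewport here</title></head>"

print(find_viewport(good))
print(find_viewport(bad))
```

A page without the tag comes back as `None`, which is the case the mobile test was complaining about.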

Summary

So, quick wins, that’s what we’re after here. Let’s see:

    • Sort out the slash based duplication, improve crawl efficiency, potentially get greater exposure in search.
    • Sort out the meta descriptions, show more relevant content in search results, improve CTR and brand awareness.
    • Add some structured data, increase Google’s certainty over relevance of post as a search result, get greater visibility.
    • Add canonicals. Go on.

erm…. fix the mobile site?

Vincent Van Sloth by Shitty Watercolour
