SEO 101: Ensuring Your Site Is Crawlable

From the archives…

This is part three of my series from BlogHer. You can read the intro, part one on relevance, part two on discoverability, or just read on for tips about crawlability.

You’ve got great content. The search engines know about it. But can they access it? There are lots of reasons why search engines may not be able to access the content on your pages.

Are you blocking the bots?
You can keep search engines from indexing particular sections of your site using a robots.txt file or robots meta tag. There are lots of valid reasons for blocking search engines from particular pages, but sometimes sites block access accidentally and then wonder why pages aren’t being indexed. Sometimes, choosing the wrong option in your blogging software (in WordPress, this is under “Privacy”) will do it.
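
If you want to double-check a robots.txt file before you rely on it, here’s a minimal sketch in Python using the standard library’s urllib.robotparser. The two robots.txt samples, the example.com URLs, and the is_blocked helper are all made up for illustration, so swap in your own file and pages. It only covers robots.txt, not the robots meta tag.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this robots.txt text disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks the whole site (often by accident).
ACCIDENTAL_BLOCK = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that blocks only an admin area.
INTENTIONAL_BLOCK = """\
User-agent: *
Disallow: /wp-admin/
"""

if __name__ == "__main__":
    post = "https://www.example.com/blog/my-post/"
    admin = "https://www.example.com/wp-admin/options.php"
    print(is_blocked(ACCIDENTAL_BLOCK, "Googlebot", post))    # True
    print(is_blocked(INTENTIONAL_BLOCK, "Googlebot", post))   # False
    print(is_blocked(INTENTIONAL_BLOCK, "Googlebot", admin))  # True
```

If the first case looks like your live file, that’s probably why your pages aren’t being indexed.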

Is your server having trouble?
If your server is down or having timeout issues when a search engine comes by, it can’t access the pages of your site. The bots will come back later and try again, but it can be good to know if this is happening, in case there’s a deeper issue you need to look at.

You can see any issues Google had accessing the site in Google Search Console (Crawl > Crawl Errors) or from your site’s server logs.
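
If you’d rather look at your server logs, something like the following rough Python sketch can pull out the requests where a bot got an error. It assumes your server writes the common “combined” access-log format (the default for many Apache and nginx setups); the sample log lines, IP addresses, and the bot_errors helper are invented for illustration. Matching on the user-agent string alone is a simplification, since anyone can claim to be Googlebot.

```python
import re
from collections import Counter

# Assumes the "combined" access-log format:
# ip - - [date] "METHOD path protocol" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def bot_errors(lines, bot="Googlebot"):
    """Count (status, path) pairs where the named bot received a 4xx or 5xx response."""
    errors = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or bot not in match.group("agent"):
            continue
        status = int(match.group("status"))
        if status >= 400:
            errors[(status, match.group("path"))] += 1
    return errors


# Made-up sample lines; in practice you would read them from your access log.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2015:13:55:36 +0000] "GET /blog/ HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/Oct/2015:13:56:01 +0000] "GET /about/ HTTP/1.1" 503 0 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Oct/2015:13:57:12 +0000] "GET /about/ HTTP/1.1" 503 0 '
    '"-" "Mozilla/5.0 (Windows NT 10.0)"',
]

if __name__ == "__main__":
    for (status, path), count in bot_errors(SAMPLE_LOG).most_common():
        print(status, path, count)
```

A handful of errors is normal. The same URL failing over and over is the kind of deeper issue worth a look.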

Is your content in text?
Search engines are text based. Your content needs to be in text in order to get indexed. This means that anything on your site that’s in images, video, JavaScript (and by extension Ajax), or Flash won’t be read. [2015 update: Google does a pretty good job now of crawling JavaScript but it’s still not perfect and Bing still has substantial trouble.] Does this mean you can’t use any of these elements? Of course not! Just use them wisely.

This won’t just help search engines. It’ll help your visitors too. Anyone using a screen reader, an older mobile device, or a slow connection has trouble with these types of elements too. Flash has the additional problem that you can’t get direct links to any of the pages (as I’ve ranted about before), and who wants to get to a site just to see a “loading” graphic for twenty seconds?

So how can you incorporate these elements into your site wisely?

Images

  • Don’t put large blocks of text in images. Keep text in HTML and use images for, well, graphics and pictures.
  • Use alt text to describe images. You do this in the source code. For instance: <img src="https://www.vanessafox.com/imagename.jpg" alt="My Cute Cats" />. Your blogging software may make this even easier for you. In WordPress, the Add Media button includes an Alt Text field. And make sure the alt text really is descriptive. “Logo” may seem descriptive but “Anya’s House of Cheesecake” is a bit more specific.
  • Use descriptive filenames. anyas-cheesecake.jpg is better than logo.jpg.
  • Give each image a caption and ensure the page has lots of text.
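
To spot images that break the alt text advice above, here’s a small Python sketch built on the standard library’s html.parser. The AltTextAuditor class, the audit_alt_text helper, the list of “too generic” alt values, and the sample markup are all assumptions for illustration. It only flags missing or generic alt text; it can’t judge whether the text is actually a good description.

```python
from html.parser import HTMLParser

# Alt values treated as too generic to be useful (an illustrative list, not a standard).
GENERIC_ALT = {"", "logo", "image", "photo", "picture"}


class AltTextAuditor(HTMLParser):
    """Collect (src, alt) for every <img> whose alt text is missing or generic."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        alt = (attributes.get("alt") or "").strip()
        if alt.lower() in GENERIC_ALT:
            self.problems.append((src, alt))


def audit_alt_text(html):
    """Return a list of (src, alt) pairs that need better alt text."""
    auditor = AltTextAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.problems


# Made-up markup: one good image, one generic alt, one missing alt.
SAMPLE_HTML = """
<img src="anyas-cheesecake.jpg" alt="Anya's House of Cheesecake" />
<img src="logo.jpg" alt="Logo">
<img src="cats.jpg">
"""

if __name__ == "__main__":
    for src, alt in audit_alt_text(SAMPLE_HTML):
        print(f"{src}: alt text is {alt!r}")
```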

Videos

  • If possible, post a text transcript of the video.
  • At the very least, post a textual description.

Flash

  • Use Flash sparingly. Put the text in HTML and use Flash for animated objects.
  • Don’t put navigation in Flash. Search engines will never get past your home page.
  • Do you really need a Flash splash page? Most sites don’t. They only slow down visitors trying to access your content. And search engines think your site is about “loading, loading, loading”.
  • Don’t put your product pages in Flash. No one can link to them or bookmark them, and that makes your visitors very sad indeed. It also makes search engines see your site as one big page with one URL.

An easy test
How can you tell what search engines like Bing can see? Simply turn off images, Flash, and JavaScript in your browser. You can also use the Google Search Console Fetch as Googlebot feature to see more specifically what Google can see.
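
For a rough, scriptable version of the same test, the Python sketch below strips a page down to the text a text-only crawler could plausibly read: it drops script and style content and keeps image alt text. It’s an approximation under my own assumptions, not how any particular search engine works, and the TextOnlyView class, the text_only_view helper, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Collect visible text and image alt text, ignoring script and style content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.chunks.append(alt.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def text_only_view(html):
    """Return the list of text fragments left once scripts and styles are removed."""
    view = TextOnlyView()
    view.feed(html)
    view.close()
    return view.chunks


# Made-up page: the text written by JavaScript never shows up in the result.
SAMPLE_PAGE = """
<h1>Anya's House of Cheesecake</h1>
<img src="anyas-cheesecake.jpg" alt="A slice of lemon cheesecake">
<script>document.write("Today's specials");</script>
<p>Order online or visit the shop.</p>
"""

if __name__ == "__main__":
    for chunk in text_only_view(SAMPLE_PAGE):
        print(chunk)
```

If an important piece of your page is missing from that output, it probably lives in a script, an image without alt text, or a Flash movie.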

 

And here ends my three-part series on using search to drive traffic to your blog. For those of you who were at the BlogHer session, I had a great time talking to you! The important thing is to keep writing great stuff!

16 thoughts on “SEO 101: Ensuring Your Site Is Crawlable”

  • Pingback: » the power of search: driving traffic to your blog (a BlogHer recap) | Vanessa Fox. Nude.

  • Pingback: » the power of search: making your blog content discoverable | Vanessa Fox. Nude.

  • aaron

    Ok great info. Buffy, but the title “using search to drive traffic to your blog” leads to another question. 🙂

    I recently changed my static 5-10 page site (that I sell my product from) into a wordpress blog to take advantage of RSS and a few other useful things. I removed “comments” but left RSS, pings and tagging to make it a website that can communicate better with bots and social media.

    Anyhoo, here is a question or two:

    Do search engines like Google treat “content” on blogs any differently than good ol’ static sites? I have noticed the site drop in ranking ever since I changed it into a blog sadly.

    If I remove comments and other social features and blog until the pages are 20 or more (then stop) will search engines like Google realize that I have a good product that needs to continue to rank or will pages become stagnant because “blogs” are treated differently in algorithms. Wheew, out of breath…

    Thank you, much more useful stuff than someone who still works at Google could provide.

  • Vanessa

    I don’t know of any ways that search engines treat sites differently based on the kind of site they are (static vs. blog, etc.). The differences in treatment would have to do with side effects of the site type. For instance, a blog might get crawled more often than some static sites simply because it has new content more often.

    If you’ve noticed a rankings drop, I would think that would most likely be due to the migration of your URLs.

  • aaron

    Yes, the migration of URLs and the removal of a few pages has really screwed things up. I am the founder of my niche/product and now others are outranking me simply by rewriting the information that no longer exists on my site.

    Not even the most “known” loud mouth, SEO pretty boys have covered or even seen this stuff, thanks for the help V!

  • Daniel Dessinger

    Hey, Vanessa,

    Did you happen to meet Penelope Trunk at BlogHer? Or do you two already know each other? I know she spoke in the session on personal branding. Just wondering who knows who of the people I read.

    I’m going to be posting an interview with Penelope next Monday on CultureFeast…

    I’d like to get your thoughts on BlogHer as well. Perhaps you’d consider a brief interview?

  • Vanessa

    I don’t know her, and unfortunately I wasn’t able to make that session. I’ll have to check out your interview. I’d be happy to do a brief interview. Send me a quick email (on my about page). I’ll be on a plane today, so I may not be able to grab it right away.

  • Neuromancer

    interesting you mention Nike

    At work we did an in-house blog site (as a proof of concept) about football (soccer) boots and we totally spank Nike, Puma, Adidas for a lot of terms that really they should have.

    http://www.footy-boots.com/

    if you’re interested

  • Huge Vanessa Fan

    Vanessa,

    Thanks so much for taking the time to spell this stuff out.. you are awesome and i love your writing style.. seriously i want to give you cookies and stuff..

    keep up the great posts

  • Sébastien Billard

    If you really want to see a website as a search engine sees it (or as some disabled users do), you may also want to try a text browser like Lynx directly, which has the advantage of linearizing the content and totally ignoring CSS, Flash, JavaScript, and so on. It is truly an enriching experience.

    A pretty nice Windows version is available here: http://csant.info/lynx.htm

  • Sean

    It’s been a long time since a post, and I am having Vanessa withdrawal.

  • Paul

    Hi Vanessa, just stumbled across your blog via a series of SEO links and wanted to say thanks for the info. Nice style, great to have info ‘from the horse’s mouth’ (as it were), even if just to confirm or deny the various Google myths that abound. And your own Wikipedia entry – how cool is that!? I particularly liked the note “(the title is a joke; the site is work safe)”
    Look forward to reading more.
    Paul
    PS: Shame about the pictures 😉

  • Benj Arriola

    Search engines are text based. Your content needs to be in text in order to get indexed. This means that anything on your site that’s in images, video, JavaScript (and by extension Ajax), or Flash won’t be read. Does this mean you can’t use any of these elements? Of course not! Just use them wisely.

    I was playing around with AJAX making it search engine friendly and ended up with http://www.ajaxoptimize.com

    *Each Ajax view still has its own permalink.
    *Each permalink dynamically loads default content for that page as plain static HTML.
    *JavaScript only comes in when links are clicked.
    *Links are plain static URLs so the site still gets crawled well.

    Now I got to fix the browser forward and back buttons…

  • Get crawled even with AJAx

    You can get crawled even with content such as AJAX and Flash by using a sitemap.

  • Vanessa

    Unfortunately, that’s not the case. The sitemap can help let search engines know about the pages, but it doesn’t help the bots extract content from those pages.

    (FYI – I worked on the Sitemaps project at Google.)

  • rob

    I’m just testing the type of comment system you have here.
