Knockout JS and SEO

Making sure your webapp is searchable

by

Jason Dent

Revinate Inc.

What is Knockout JS?

  • Knockout JS is a lightweight functional reactive framework and templating system that helps with rapid web application development.
  • Similar to AngularJS and EmberJS
  • MVVM - Model View ViewModel
  • Copy/Pasted from knockoutjs.com

Knockout is a JavaScript library that helps you to create rich, responsive display and editor user interfaces with a clean underlying data model.

Inline Markup

Knockout JS uses data-bind


The item is <span data-bind="text: priceRating"></span> today.
It costs <span data-bind="text: price"></span> euros.
<script type="text/javascript">
  var viewModel = {
    price: ko.observable(24.95)
  };

  // Recomputed automatically whenever price changes.
  viewModel.priceRating = ko.computed(function() {
    return this.price() > 50 ? "expensive" : "affordable";
  }, viewModel);

  // Activate the data-bind attributes above.
  ko.applyBindings(viewModel);
</script>

Motivation

  • Goal: Enable Google to search m.interglot.com -- A translation web application aimed at mobile devices.
    • Uses Knockout JS to render JSON from Ajax queries.
    • Has over 1 million pages.
    • PHP backend server.

  • Challenge: To turn content generated by Knockout JS into something a crawler can download.

  • This problem is not limited to Knockout JS
    • Backbone.js, Meteor, Ember.js, Angular JS
    • Any framework without a server-side renderer.

m.interglot.com


What a Crawler Sees


Today

Google's Advice

Ajax Crawling

Overview of Solution

Briefly, the solution works as follows: the crawler finds a pretty AJAX URL (that is, a URL containing a #! hash fragment). It then requests the content for this URL from your server in a slightly modified form. Your web server returns the content in the form of an HTML snapshot, which is then processed by the crawler. The search results will show the original URL.
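
As a rough illustration of the "slightly modified form" mentioned above: under Google's AJAX crawling scheme, the crawler moves everything after the #! into an _escaped_fragment_ query parameter. The helper name toCrawlerUrl is hypothetical, and using encodeURIComponent is a simplification that escapes more characters than the scheme strictly requires.

```javascript
// Sketch: map a "pretty" #! URL to the URL a crawler would request instead.
// toCrawlerUrl is a hypothetical helper name, not part of any library.
function toCrawlerUrl(prettyUrl) {
  var marker = prettyUrl.indexOf('#!');
  if (marker === -1) {
    return prettyUrl; // no hash-bang, nothing to rewrite
  }
  var base = prettyUrl.slice(0, marker);
  var fragment = prettyUrl.slice(marker + 2);
  // Append with ? or & depending on whether a query string already exists.
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  // Simplification: this escapes more than the scheme strictly requires.
  return base + separator + '_escaped_fragment_=' + encodeURIComponent(fragment);
}
```

The server then has to recognise the _escaped_fragment_ parameter and answer with the HTML snapshot for that state.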

Aaaaaaaaaaaah! #! #! #!

Nice Diagram

of the hash-bang (#!) approach

Pushstate

will save us!
Long live pushstate!
huhhh? #!#!#!

Maybe someday, but not just yet.

Did many of the SEO Experts just go Brain Dead?
Sadly, there is a long list of SEO Experts saying pushstate will save the day.  They are confused. :-(
Example: http://www.seomoz.org/blog/create-crawlable-link-friendly-ajax-websites-using-pushstate

Verify!
Fetch as Google


Be sure to use Fetch as Google in Webmaster Tools
to make sure Google is getting what you expect,
or use Chrome's

view-source:http://m.interglot.com/en/nl/verify
If you don't see your dynamically rendered Knockout JS Ajax content in the HTML, then Google will NOT index it.

At least not yet...
So why take the chance?

What to do?
Use Pushstate

  • To create nice urls:
    http://m.interglot.com/en/nl/create
    Webapp URL: http://m.interglot.com
    Dynamic state: /en/nl/create
  • These nice urls can be used by the crawlers to request the right dynamic content from the server.
  • *** Have the server serve an HTML snapshot ***
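
A minimal sketch of how the dynamic part of such a URL could be mapped to application state and back. The path layout (source language, target language, word) is an assumption inferred from the /en/nl/create example, and parseRoute, buildRoute and navigate are hypothetical names.

```javascript
// Sketch: derive app state from a pushstate-style path such as /en/nl/create.
// Assumed layout (not confirmed by the talk): /<from>/<to>/<word>.
function parseRoute(path) {
  var parts = path.split('/').filter(function (p) { return p.length > 0; });
  if (parts.length < 3) {
    return null; // not a dynamic-state URL
  }
  return {
    from: parts[0],
    to: parts[1],
    word: decodeURIComponent(parts.slice(2).join('/'))
  };
}

// Build the path back from state, so client and server agree on one URL.
function buildRoute(state) {
  return '/' + state.from + '/' + state.to + '/' + encodeURIComponent(state.word);
}

// In a browser, update the address bar without reloading the page.
function navigate(state) {
  if (typeof history !== 'undefined' && typeof history.pushState === 'function') {
    history.pushState(state, '', buildRoute(state));
  }
}
```

Because the state lives in the path rather than in a hash fragment, the server receives it on every request and can pick the matching snapshot.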

Making an
HTML Snapshot

https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot

#3. Much of your content is created on the client-side with JavaScript

If a lot of your content is created in JavaScript, you may want to consider using a technology such as a headless browser to create an HTML snapshot. 

Headless
Browsers

  • PhantomJS
    • WebKit
    • most platforms
    • Actively under development
  • Zombie.js
    • Emulated DOM in pure JavaScript (not WebKit)
    • Node.js
    • Actively under development

Headless Diagram


Using a
headless Browser

In combination with pushstate and Knockout JS
<html>
  <head>
  ...
  </head>
  <body>
    <div id="dynamic_content" 
      data-bind="template: 
        {name: 'template_pagecontent', data: $data}">
      ... will be filled in by rendered content from server ...
      ... Replaced by the knockoutjs binding ...  
    </div>
    <script type="text/html" id="template_pagecontent">
      ... your html and knockout content 
      ... that will replace dynamic_content element.
    </script>
  </body>
</html>

Page Speed

Using a headless browser isn't free.  
There is both a time and CPU cost.

Remember, your server gets hit twice.
First on the initial page request and
Second on the headless browser request.

So each page takes roughly twice as long to serve, plus the time the headless browser needs to render it.

Do NOT make your visitors wait.

Cache HTML Snippets

You can use a cache to speed everything up.

This is especially important if you have ads on your site: each user visit causes the ad server to hit the same page immediately afterwards, and a cache makes that second call much faster.


Consider using a soft expiration date: if the item is in the cache but slightly out of date, serve it anyway and launch a background cache refresh.
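
The soft expiration idea can be sketched roughly as follows. The names createSoftCache, startRefresh and softTtlMs are hypothetical, and the sketch is written in JavaScript for illustration only (the site in this talk has a PHP backend).

```javascript
// Sketch of a soft-expiring snapshot cache.
// startRefresh(url) is a hypothetical callback that re-renders the page in the
// background and calls put(url, html) when it is done.
function createSoftCache(softTtlMs, startRefresh, now) {
  now = now || Date.now;
  var entries = {}; // url -> { html, storedAt }
  var pending = {}; // url -> true while a background refresh is running

  return {
    put: function (url, html) {
      entries[url] = { html: html, storedAt: now() };
      delete pending[url];
    },
    get: function (url) {
      var entry = entries[url];
      if (!entry) {
        return null; // miss: the caller has to render and put() the result
      }
      var stale = now() - entry.storedAt > softTtlMs;
      if (stale && !pending[url]) {
        pending[url] = true; // avoid launching duplicate refreshes
        startRefresh(url);
      }
      return entry.html; // served immediately, even when stale
    }
  };
}
```

A production version would also clear the pending flag when a refresh fails, and would add a hard expiration for entries that are too old to serve at all.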


Google Bot


Real Person


Headless Browser


Tips for PhantomJS


  1. Capture and stub out analytics, ad code, Facebook, etc.
    To capture use:
    page.onResourceRequested = function(requestData, networkRequest) {...};
    To point to another location use:
    networkRequest.changeUrl("./blank.js"); 
    networkRequest.changeUrl("./ga.js"); 
  2. Use a local copy of your JS files and common libraries like jQuery.
  3. Set the User-Agent string to a known value so your server knows it is a headless browser request.
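
Put together, the tips above might look roughly like this in a PhantomJS script. STUBS, HEADLESS_AGENT and stubFor are hypothetical names, and the URL patterns, local file paths and User-Agent value are assumptions to adapt to your own site.

```javascript
// Sketch of a PhantomJS snapshot script applying the tips above.
// The patterns and local paths below are assumptions, not a definitive list.
var STUBS = [
  { pattern: /google-analytics\.com\/ga\.js/, local: './ga.js' },
  { pattern: /googletagservices\.com\/tag\/js\/gpt\.js/, local: './gpt.js' },
  { pattern: /connect\.facebook\.net\//, local: './blank.js' }
];
var HEADLESS_AGENT = 'MySiteSnapshotBot/1.0'; // hypothetical known value

// Return the local stub for a third-party URL, or null to leave it alone.
function stubFor(url) {
  for (var i = 0; i < STUBS.length; i++) {
    if (STUBS[i].pattern.test(url)) {
      return STUBS[i].local;
    }
  }
  return null;
}

// Only wire up the page when actually running inside PhantomJS.
if (typeof phantom !== 'undefined') {
  var page = require('webpage').create();
  page.settings.userAgent = HEADLESS_AGENT; // tip 3
  page.onResourceRequested = function (requestData, networkRequest) {
    var local = stubFor(requestData.url);
    if (local) {
      networkRequest.changeUrl(local); // tips 1 and 2
    }
  };
  page.open(require('system').args[1], function (status) {
    console.log(status === 'success' ? page.content : '');
    phantom.exit(status === 'success' ? 0 : 1);
  });
}
```

It would be run with the page URL as the first argument. Note that this sketch reads page.content as soon as the page has loaded; a real script would wait until the Ajax content has been rendered.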

Sample Stubs

Google Analytics:

// ga.js
var _gaq = _gaq || [];

Google Tag Services (DFP)

// gpt.js
window.googletag = window.googletag || {};
window.googletag.cmd = window.googletag.cmd || [];
window.googletag.display = window.googletag.display || function(){}; 





Things to Remember

  • Watch out for third-party scripts
    Don't let the headless browser screw up your stats.
    You need to stub out:
    • Google Analytics
    • Adsense/DFP
    • Facebook and other social plugins.
  • Watch out for code that does document.write
  • For a better end-user experience, have the server include both the rendered HTML and the JSON data for the view model in the document.
    This will reduce the perceived wait time.
  • Don't get stuck in an infinite loop!
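
One way to avoid the infinite loop is to have the server check who is asking before it starts a headless render. The sketch below assumes the headless browser sends a recognisable User-Agent string; HEADLESS_AGENT_MARKER and chooseResponse are hypothetical names, and JavaScript is used only for illustration.

```javascript
// Sketch of the server-side decision for each incoming page request.
var HEADLESS_AGENT_MARKER = 'MySiteSnapshotBot'; // hypothetical known value

function chooseResponse(userAgent, cachedSnapshot) {
  if ((userAgent || '').indexOf(HEADLESS_AGENT_MARKER) !== -1) {
    // The request comes from our own headless browser: serve the bare
    // Knockout page and never start another headless render from here.
    return 'bare-page';
  }
  if (cachedSnapshot) {
    return 'cached-snapshot';
  }
  // Safe to render: the headless browser identifies itself, so its own
  // request takes the first branch instead of recursing.
  return 'render-snapshot';
}
```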

Other Solutions

Now Hiring


We are looking for an architect-level developer for a total revamp of our frontend application using Angular JS and Bootstrap.


http://www.revinate.com/about/jobs/

Jason@revinate.com
