Can You Now Trust Google To Crawl Ajax Sites?


On October 14, Google announced it no longer recommends the Ajax crawling scheme they published in 2009. Columnist Mark Munroe dives into the question of whether this means you can now count on Google to successfully crawl and index an Ajax site.


Web designers and engineers love Ajax for building Single Page Applications (SPAs) with popular frameworks like Angular and React. Pure Ajax implementations can provide a smooth, interactive web application that performs more like a dedicated desktop application.

With a SPA, the HTML content is generally not loaded into the browser on the initial fetch of the web page. Instead, Ajax uses JavaScript to communicate dynamically with the web server, building the HTML in the browser to render the page and interact with the user. (There is a technique called “Server-Side Rendering,” where the JavaScript is actually executed on the server and the page request is returned with the rendered HTML. However, this approach is not yet supported by all SPA frameworks and adds complexity to development.)
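To make that concrete, here is a minimal sketch of my own (not any particular framework’s code) of what pure client-side rendering looks like; the /api/recipes/123 endpoint and the element IDs are hypothetical:

```javascript
// Minimal sketch of client-side (Ajax) rendering; endpoint and element IDs are hypothetical.
document.addEventListener('DOMContentLoaded', function () {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '/api/recipes/123'); // the server returns data (JSON), not HTML
  xhr.onload = function () {
    var recipe = JSON.parse(xhr.responseText);
    // The HTML is assembled in the browser, so the page source a crawler
    // downloads contains none of this content until the script runs.
    document.getElementById('recipe-title').textContent = recipe.title;
    document.getElementById('recipe-body').innerHTML = recipe.bodyHtml;
  };
  xhr.send();
});
```

Unless the crawler actually executes that JavaScript, all it sees is the empty application shell.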

One of the issues with SPA Ajax sites has been SEO. Google has actually been crawling some JavaScript content for a while. In fact, this recent series of tests confirmed Google’s ability to crawl links, metadata and content inserted via JavaScript. However, websites using pure SPA Ajax frameworks have historically experienced challenges with SEO.

Back in 2009, Google came up with a solution to make Ajax crawlable. That method either creates “escaped fragment” URLs (ugly URLs) or, more recently, clean URLs with a meta fragment tag (name=”fragment”, content=”!”) on the page.

The escaped fragment URL or meta fragment tag instructs Google to go out and fetch a pre-rendered version of the page, one that has executed all the JavaScript and contains the full HTML that Google can parse and index. With this method, the server serves the spider a completely different page source (HTML instead of JavaScript).
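As a rough sketch of how a site’s server side might handle those crawler requests (assuming a Node/Express setup and a hypothetical snapshots/ directory of pre-rendered pages; many sites used a pre-rendering service instead):

```javascript
// Sketch of serving pre-rendered HTML snapshots under the 2009 Ajax crawling scheme.
// Assumes Express and a hypothetical snapshots/ directory; not Google's or any site's actual code.
var path = require('path');
var express = require('express');
var app = express();

app.get('*', function (req, res) {
  // A page carrying <meta name="fragment" content="!"> tells Googlebot to
  // re-request it with the _escaped_fragment_ query parameter.
  if (req.query._escaped_fragment_ !== undefined) {
    // Crawler request: serve the pre-rendered snapshot (plain HTML, no JavaScript needed).
    res.sendFile(path.join(__dirname, 'snapshots', req.path, 'index.html'));
  } else {
    // Normal visitors get the JavaScript application shell.
    res.sendFile(path.join(__dirname, 'index.html'));
  }
});

app.listen(3000);
```

For hash-bang URLs, Googlebot folds the fragment into the parameter itself, so example.com/#!recipes/123 is fetched as example.com/?_escaped_fragment_=recipes/123.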

With the word out that Google crawls JavaScript, many sites have decided to let Google crawl their SPA Ajax sites as-is. In general, that has not been very successful. In the past year, I have consulted for a couple of websites with an Ajax Angular implementation. Google had only partial success: about 30 percent of the pages in Google’s cache were fully rendered; the other 70 percent were blank.

A popular food site switched to Angular, believing that Google could crawl it. They lost about 70 percent of their organic traffic and are still recovering from that debacle. Ultimately, both sites went to pre-rendering HTML snapshots, the recommended Ajax crawling solution at the time.

And then, on October 14, Google said this:

We are no longer recommending the AJAX crawling proposal we made back in 2009.

Note that they are still supporting their old proposal. (There have been some articles announcing that they are no longer supporting it, but that is not true — they are simply no longer recommending that approach.)

In deprecating the old recommendation, they seemed to be saying they can now crawl Ajax.

Then, just a week after the announcement, a client with a newly launched site asked me to check it out. This was an Angular site, again an SPA Ajax implementation.

Upon examining Google’s index and cache, we saw pages that were only partially indexed, with much of the content not getting crawled. I reiterated my earlier recommendation of using HTML snapshots or progressive enhancement.

This site was built with Angular, which does not yet support server-side rendering (where, again, the server initially renders the page and serves up the full HTML document), so progressive enhancement would be difficult to support, and HTML snapshots are still the best solution for them.

She replied, “But why? Everything I read tells me Google can crawl Ajax.”

Can they? Let’s take a deeper look at the new recommendation in regard to Ajax.

[Read the full article on Search Engine Land.]

Source – MarketingLand.com