pPosted by a href=”http://moz.com/community/users/4896”Everett/a/pp
strongThis is a href=”http://www.goinflow.com” target=”_blank”Inflow’s/a process for doing content audits./strong
It may not be the “best” way to do them every time, but we’ve kept it fairly agile in how you choose to analyze, interpret and make recommendations on the data. The fundamental parts of the process stay about the same across many types of websites, whatever their business goals: collect all of the content URLs on the site and fetch the data you need about each URL, then analyze the data and provide recommendations for each content URL. In theory it’s simple. In practice, however, it can be a daunting exercise if you don’t have a plan or process in place. By the end of this post we hope you’ll have a good start on both./ph2Table of Contents/h2
div class=”box-content”
ul class=”nav nav-side”
li class=”item”a href=”#h2-1″ class=”item-header”The many purposes of a content audit/a/li
li class=”item”a href=”#h2-2″ class=”item-header”A content audit case study/a/li
li class=”item”a href=”#h2-3″ class=”item-header”50,000-foot overview of the process/a/li
li class=”item”a href=”#h2-4″ class=”item-header”Our documents/a/li
li class=”item”a href=”#h3-1″Content audit scenarios/a/li
li class=”item”a href=”#h3-2″Content audit dashboard spreadsheet/a/li
li class=”item”a href=”#h3-3″Content strategy/a/li
li class=”item”a href=”#h2-5″ class=”item-header”Recommended exports and data sources/a/li
li class=”item”a href=”#h2-6″ class=”item-header”A step-by-step example of our process/a/li
li class=”item”a href=”#h3-4″Step 1: Assess the situation and choose a scenario/a/li
li class=”item”a href=”#h3-5″Step 2: Scan the site/a/li
li class=”item”a href=”#h3-6″Step 3: Import the URLs and start the tool/a/li
li class=”item”a href=”#h3-7″Step 4: Import the tool output into the dashboard/a/li
li class=”item”a href=”#h3-8″Step 5: Import GWT data/a/li
li class=”item”a href=”#h3-9″Step 6: Perform keyword research/a/li
li class=”item”a href=”#h3-10″Step 7: Tying the keyword data together/a/li
li class=”item”a href=”#h3-11″Step 8: Time to analyze and make some decisions!/a/li
li class=”item”a href=”#h3-12″Step 9: Content gap analysis and other value-adds/a/li
li class=”item”a href=”#h3-13″Step 10: Writing up the content audit strategy document/a/li
li class=”item”a href=”#h2-7″ class=”item-header”Resources, links and post-scripts…/a/li
/divh2 id=”h2-1″The many purposes of a content audit/h2p
A content audit can help in a variety of different ways, and the approach can be customized for any given scenario. I’ll write more about potential “scenarios” and how to approach them below. For now, here are some things a content audit can help you accomplish…/pol
liDetermine the most effective way to escape a Panda penalty/li liDetermine which pages need copywriting / editing/li liDetermine which pages need to be updated and made more current, and prioritize them/li liDetermine which pages should be consolidated due to overlapping topics/li liDetermine which pages should be pruned off the site, and what the approach to pruning should be/li liPrioritize content based on a variety of metrics (e.g. visits, conversions, PA, Copyscape risk score…)/li liFind content gap opportunities to drive content ideation and editorial calendars/li liDetermine which pages are ranking for which keywords/li liDetermine which pages “should” be ranking for which keywords/li liFind the strongest pages on a domain and develop a strategy to leverage them/li liUncover content marketing opportunities/li liAudit and create an inventory of content assets when buying/selling a website/li liUnderstand the content assets of a new client (i.e. what you have to work with)/li liAnd many more…/li/olh2 id=”h2-2″A content audit case study/h2p
img style=”float: left; width: 192px; margin: 0px 10px 10px 0px;” alt=”8 Times the Leads” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53c5716ebd2d40.53437931.jpg”/pp
Inflow’s technical SEO specialist
a target=”_blank” href=”http://www.goinflow.com/author/rick/”Rick Ramos/a performed an earlier version of our content audit last year for
a href=”http://designfiles.net/” target=”_blank”Phases Design Studio/a, who graciously permitted us to share their case study. After taking an inventory of all content URLs on the domain, Rick outlined a plan to noindex/follow and remove from their sitemap many of the older blog posts that were no longer relevant and weren’t candidates for a content refresh. The site also had a series of campaign-based landing pages dating back to 2006. These pages typically had a life cycle of a few months, but were never removed from the site or Google’s index. Rick recommended that these pages be 301 redirected to a few evergreen landing pages that would be updated whenever a new campaign was launched, a tactic that works particularly well on seasonal pages for eCommerce sites (e.g. 2014 New Year’s Resolution Deals). Still more pages were candidates to be updated / refreshed, or improved in other ways./ph4
strongThe results/strong/h4p
Shortly after the recommendations were implemented, the client called to ask if we knew why they were suddenly seeing eight times the number of leads they were used to seeing month over month./pp style=”text-align: center;”
img style=”display: block; margin: auto;” alt=”Analytics traffic graph after a content audit” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53c5fcfb5d2060.35990402.jpg”/pp
strongWhy we think it worked/strong/h4p
There are several probable reasons why this approach worked for our client. Here are a few of them…/pol
liThe ratio of useful, relevant, unique content to thin, irrelevant, duplicate content was greatly improved./li liThe PageRank from dozens of expired campaign landing pages was consolidated into relatively few evergreen pages (via 301 redirects and consolidation of internal linking signals)./li liCrawl budget is now being used more efficiently./li/olp
strongThis improved the overall customer experience on the site, as well as organic search rankings for important topic areas that were consolidated./strong/pp
Since then we have refined and improved the process and have been performing these audits on a variety of sites with great success. The process works particularly well for Panda recoveries on large-scale content websites, and for prioritizing which eCommerce product copy needs to be rewritten first./ph2 id=”h2-3″A 50,000-foot overview of our process/h2p
Inflow’s content auditing process changes depending on the client’s goals, needs and budget. Generally speaking, however, here is how we approach it…/pol
listrongGather all available URLs on the site
liUse a href=”http://www.screamingfrog.co.uk/” target=”_blank”Screaming Frog/a (or another crawl tool), CMS Exports, Google Analytics and Webmaster Tools/li /ul/li listrongImport URLs into a tool that gathers KPIs and other data for each URL
liUse a href=”http://urlprofiler.com/” target=”_blank”URL Profiler/a, a custom in-house tool, or other data-gathering resources
liThings to gather: Moz metrics, Google Analytics KPIs, GWT data, Majestic SEO metrics, titles, descriptions, word counts, canonical tags…/li /ul/li /ul/li listrongAnalyze the content
liChoose to keep as-is, improve, remove or consolidate.
liWrite detailed strategies for each./li /ul/li /ul/li listrongPerform keyword research
liOptional: Provide relevancy scores, topic buckets and buying stage/s for each keyword/li liMatch keywords to pages that already rank within a keyword matrix/li liMatch non-ranking keywords to the best page for guiding on-page changes/li /ul/li listrongDo content gap ideation
liUse keywords that did not have an appropriate page match to fill in the Content Gap tab.
liOptional: Incorporate buying cycles into content gap ideation/li /ul/li /ul/li listrongWrite the content strategy
liSummarize the findings and present a strategy for optimizing existing pages and creating new pages to fill gaps, and explain how many pages are being removed, redirected, etc…/li /ul/li/olp
strongEach piece of the process can be customized for the needs of a particular website./strong/pp
For example, when auditing a very large content site with lots of duplicate/thin/overlapping content issues we may skip the entire keyword research and content gap analysis part of the process and focus on pruning the site of these types of pages and improving the rest. Alternatively, a site without much content may need to focus on keyword research and content gaps. Other sites may be looking specifically for content assets that they can improve, repeat in new ways or leverage for newer content. One example of a very specific goal would be to identify interlinking opportunities from strong, older pages to promising, newer pages. For now it is sufficient to know that
strongthe framework can be changed as needed in a way that could dramatically affect where you spend your time in the process, or even which steps you may want to skip altogether./strong/ph2 id=”h2-4″Our documents/h2p
There are several major steps in the content auditing process that require various documents. While I’m not providing links to our internal SOP documentation (mainly because it’s still evolving), I will describe each document and provide screenshots and links to examples / templates so you can have a foundation around which to customize one for your own needs./ph3 id=”h3-1″Content audit scenarios/h3p
We keep a list of recommendations for common scenarios to guide our approach to content audits. While every situation is unique in its own ways, we find this helps us get 90% of the way to the appropriate strategy for each client much faster. I discuss this in more detail later, but if you’d like to take a peek, a href=”http://www.goinflow.com/content-audit-strategies/” target=”_blank”click here/a./ph3 id=”h3-2″Content audit dashboard spreadsheet/h3p
We were originally working within Google Docs, but as we started pulling in data from more sources and performing more VLOOKUPs, the spreadsheet would load so slowly on big sites that it was nearly impossible to complete an audit. For this reason we have recently moved the entire process over to Excel, though
a href=”https://docs.google.com/spreadsheet/ccc?key=0Aupv-89nVqawdGdPTFZmeEUxSnIwdW9UWmFFWlVlVkE&usp=sharing” target=”_blank”this template/a we’re providing is in Google Docs format. Below are some of the tabs you may want in this spreadsheet…/ph4
strongThe “Content Audit” tab/strong/h4p
This tab within the dashboard is where most of the work is done. Other tabs pull data from this one by VLOOKUP. Whether the data is fetched by API and compiled by one tool (e.g. URL Profiler) or exported manually from many tools and compiled manually (by VLOOKUP), the end result should be that you have all of the metrics needed for each URL in one place. From there you can sort by various metrics to discern patterns, spot opportunities and make educated decisions on how to handle each piece of content, and the content strategy of the site as a whole./pp
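For readers who would rather script this step than chain lookups in a spreadsheet, here is a minimal sketch in Python of the same idea: one record per URL, with columns added from each export. It is not part of our tooling. The function name, the column names and the sample rows are all hypothetical.

```python
from collections import defaultdict


def merge_by_url(*sources):
    """Combine rows from several exports into one record per URL.

    Each source is a list of dicts that all carry a "url" key; later
    sources add columns to the record started by earlier ones.
    """
    merged = defaultdict(dict)
    for rows in sources:
        for row in rows:
            url = row["url"].strip()
            merged[url]["url"] = url
            for column, value in row.items():
                if column != "url":
                    merged[url][column] = value
    return list(merged.values())


# Hypothetical sample rows standing in for a crawl export and an analytics export.
crawl_rows = [
    {"url": "/blue-widgets/", "title": "Blue Widgets", "word_count": 420},
    {"url": "/old-campaign/", "title": "Spring Sale", "word_count": 85},
]
analytics_rows = [
    {"url": "/blue-widgets/", "organic_entrances": 310},
]

audit_rows = merge_by_url(crawl_rows, analytics_rows)
# Sort so the pages with the most organic entrances come first;
# URLs missing from the analytics export count as zero.
audit_rows.sort(key=lambda r: r.get("organic_entrances", 0), reverse=True)
```

Once every metric sits on the same record, sorting or filtering by any column is a one-liner, which is the point of compiling everything into a single tab.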
img style=”display: block; margin: auto;” alt=”Content Audit Metrics Screenshot” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53c60573cd3f63.17502062.jpg”/pp
You can customize the process to include whatever metrics you’d like to use. Here are the ones we’ve ended up with after some experimentation, as well as the source of the data:/pul
liAction (internal)
liLeave As-Is/li liImprove/li liConsolidate/li liRemove/li /ul/li liStrategy (internal)
liA more detailed version of “action”. Example: Remove and 301 redirect to /another-page/./li /ul/li liPage Type (internal via URL patterns or CMS export)
liThis is an optional step for certain situations. Example: Article, Product, Category…/li /ul/li liSource (original source of the URL, e.g. Google Analytics, Screaming Frog)/li liCopyscape Risk Score (Copyscape API)/li liTitle Tag (Screaming Frog)/li liTitle Length (Screaming Frog)/li liMeta Description (Screaming Frog)/li liWord Count (Screaming Frog)/li liGA Entrances (Google Analytics API)/li liGA Organic Entrances (Google Analytics API)/li liMoz Links (Moz API)/li liMoz Page Authority (Moz API)/li liMozRank (Moz API)/li liMoz External Equity Links (Moz API)/li liStumbleUpon (Social Count API)/li liFacebook Likes (Social Count API)/li liFacebook Shares (Social Count API)/li liGoogle Plus One (Social Count API)/li liTweets (Social Count API)/li liPinterest (Social Count API)/li/ulp
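The split between a fixed "Action" and a free-form "Strategy" in the list above can be sketched in Python as follows. This is only an illustration, assuming rows are handled in a script rather than a spreadsheet. The class, the function and the sample URLs are hypothetical.

```python
from dataclasses import dataclass

# The four Action values listed above; Strategy stays free-form.
ACTIONS = ("Leave As-Is", "Improve", "Consolidate", "Remove")


@dataclass
class AuditRow:
    url: str
    action: str
    strategy: str = ""

    def __post_init__(self):
        # Reject anything outside the fixed list so sorting and filtering stay reliable.
        if self.action not in ACTIONS:
            raise ValueError(f"Unknown action: {self.action!r}")


def rows_with_action(rows, *actions):
    """Return the rows whose Action is one of the given values."""
    return [row for row in rows if row.action in actions]


# Hypothetical rows for illustration.
rows = [
    AuditRow("/buy-blue-widgets/", "Consolidate",
             "Merge into /buying-blue-widgets/ and 301 redirect"),
    AuditRow("/about/", "Leave As-Is"),
    AuditRow("/2006-spring-sale/", "Remove",
             "Remove and 301 redirect to /another-page/"),
]
flagged = rows_with_action(rows, "Remove", "Consolidate")
```

Keeping the Action vocabulary closed is what makes a filter like `rows_with_action` dependable. A typo such as "Removed" fails loudly instead of silently dropping out of the results.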
img style=”display: block; margin: auto;” alt=”Screenshot of Content Audit Dashboard” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53c603927c4f44.39711151.jpg”/pp
strongOur recommendations typically fall into one of four “Action” categories: “Leave As-Is”, “Remove”, “Improve”, or “Consolidate”./strong Further details (e.g. remove and 404, or remove and 301? If 301, to where?) are provided in a column called “Strategy”. Some URLs (the important ones) will have highly customized strategies, while others may have been bulk processed, meaning thousands could share the same strategy (e.g. rewriting duplicate product description copy). The “Action” column is limited in choices so we can sort the data effectively (e.g. see all pages marked “Remove”), while the “Strategy” column can be more free-form and customized to the URL (e.g. consolidate /buy-blue-widgets/ content into /buying-blue-widgets/ and 301 redirect the former to the latter to avoid duplicating the same topic)./ph4
strongThe “Keyword Research” tab/strong/h4p
This tab includes keywords gathered from a variety of sources, including brainstorming for seed keywords, mining Google Webmaster Tools, PPC campaigns, the AdWords Keyword Planner and several other tools. Search Volume and Ad Competition (not shown in this screenshot) are pulled from Google’s
a href=”http://www.google.com/sktool/”Keyword Planner/a. The average ranking position comes from GWT, as does the top ranking page. The relevancy score is something we typically ask the client to do once we’ve cleaned out most of the obvious junk keywords./pp
img style=”display: block; margin: auto;” alt=”Keyword Research Screenshot” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53c60cffc458b6.21013284.jpg”/ph4
strongThe “Keyword Matrix” tab/strong/h4p
This tab includes URLs for important pages, and those that are ranking for – or are most qualified to rank for – important topics. It essentially matches up keywords with the best possible page to guide our copywriting and on-page optimization efforts./pp style=”text-align: center;”
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53cf0f40de73e4.07674876.jpg”/pp
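As a rough illustration of the matching described above, the sketch below pairs each keyword with its best-ranking page and sets aside keywords that have no ranking page at all. It is written in Python with hypothetical names and sample data, and it is not how our spreadsheet does it. The leftover keywords still need a manual look: some will be matched to an existing page by hand, and only the rest become content gap ideas.

```python
def build_keyword_matrix(keywords, rankings):
    """Pair each keyword with its best-ranking page, if any.

    `rankings` maps a keyword to a list of (url, average_position) tuples.
    Keywords with no ranking page are returned separately for review.
    """
    matrix, unmatched = {}, []
    for keyword in keywords:
        pages = rankings.get(keyword, [])
        if pages:
            # A lower average position means a better ranking.
            best_url, _ = min(pages, key=lambda page: page[1])
            matrix[keyword] = best_url
        else:
            unmatched.append(keyword)
    return matrix, unmatched


# Hypothetical data for illustration.
keywords = ["blue widgets", "widget repair", "buy widgets online"]
rankings = {
    "blue widgets": [("/blue-widgets/", 4.2), ("/blog/widget-colors/", 11.0)],
    "buy widgets online": [("/shop/", 8.5)],
}
matrix, unmatched = build_keyword_matrix(keywords, rankings)
```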
Sometimes the KWM tab plays an important role in the process, like when the site is relatively new or unoptimized. Most of the time it takes a back seat to other tabs in terms of strategic importance./ph4
strongThe “Content Gaps” tab/strong/h4p
This is where we put content ideas for high-volume, highly relevant keywords for which we could not find an appropriate page. Often it involves keywords that represent stages in the buying cycle or awareness ladder that have been overlooked by the company. Sometimes it plays an important role, such as with new and/or small sites. Most of the time this also takes a back seat to more important issues, like pruning./ph4strongThe “Prune” tab/strong/h4p
If a page was marked “Remove” or “Consolidate”, it should be on this tab. Whether it is supposed to be removed and 301 redirected, canonicalized elsewhere, consolidated into another page, allowed to stay up but with a robots “noindex” meta tag, removed and allowed to 404/410… or any number of “strategies” you might come up with, these are the pages that will no longer exist (or no longer be indexed) once your recommendations have been implemented. I find this to be a very useful tab. For example, one could export this tab, send it to a developer (or a company like
a href=”http://wpcurve.com/”WP Curve/a), and have someone get started on most or all of the implementation. Our mantra for low-quality, under-performing content on sites that may have a Panda-related traffic drop is to strongimprove it or remove it/strong./ph4
strong”Imported Data” tabs/strong/h4p
In addition to the tabs above, the spreadsheet has data tabs that house exported data from the various sources, so we can perform VLOOKUPs based on the URL to populate data in other tabs. These data tabs include:/pul
liGWT Top Queries/li liGWT Top Pages/li liCopyScape Scores (typically for up to 1,000 URLs)/li liKeyword Data/li/ulp
The more data that can be compiled by a tool like URL Profiler, the fewer data tabs you’ll need and the faster this entire process will go. Before we built the internal tool to automate parts of the process, we also had tabs for GA data, Moz data, and the initial Screaming Frog export./pp
img style=”float: left; width: 188px; margin: 0px 10px 10px 0px;” alt=”Vlookup Master!” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53d666f941f326.58302588.jpg”/pp
If you don’t know how to do a VLOOKUP, there are plenty of online tutorials for Excel and Google Docs spreadsheets.
a href=”http://www.howtogeek.com/howto/13780/using-vlookup-in-excel/” target=”_blank”Here’s one/a I found useful for Excel. Alternatively, you could import all of the data into the tabs and ask someone more spreadsheet-savvy on your team to do the lookups. Our resident spreadsheet guru is a href=”http://www.goinflow.com/author/caesar/” target=”_blank”Caesar Barba/a, and he has great hair. Below is an example of a simple VLOOKUP used to bring the “Action” over from the Content Audit tab for a URL in the Keyword Matrix tab…/pblockquote
p style=”text-align: center;”
=VLOOKUP(A2,'Content Audit'!A:C,3,FALSE)
/p/blockquoteh3 id=”h3-3″Content Strategy/h3p
The Content Audit Dashboard is just what we need internally: A spreadsheet crammed with data that can be sliced and diced in so many useful ways that we can always go back to it for more insight and ideas. Some clients appreciate it as well, but most are going to find the greater benefit in our final content strategy, which includes a high-level overview of our recommendations from the audit./pp
img alt=”Content Strategy Screenshot from Inflow” style=”width: 507px; display: block; margin: auto;” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53cf29a1e297a0.72544449.jpg”/ph2 id=”h2-5″Recommended exports and data sources/h2p
There are many options for getting the data you need into one place so you can simultaneously see a broad view of the entire content situation, as well as detailed metrics for each URL. For URL gathering we use
strongScreaming Frog/strong and strongGoogle Analytics/strong. For data we use strongGoogle Webmaster Tools/strong (GWT), Google Analytics (GA), strongSocial Count/strong (SC), strongCopyscape/strong (CS), strongMoz/strong, strongCMS exports/strong, and a few other data sources as needed./pp
However we’ve been experimenting with using
strongURL Profiler /stronginstead of our internal tool to pull all of these data-sources together much faster. URL Profiler is a few hundred bucks and is very powerful. It’s also somewhat of a pain to set up the first time, so be prepared for several hours of wrangling down API keys before getting all of the data you need./pp
However you end up pulling it all together, doing it yourself in Excel is always an option for the first few times./ph2 id=”h2-6″A step-by-step example of our process/h2p
Below is the step-by-step process for an “average” client – whatever that means. Let’s say it is
stronga medium-sized eCommerce client with about 800-900 pages indexed by Google/strong, including category, product, blog posts and other pages. They don’t have an existing penalty that we know of, but could certainly be at risk of being affected by Panda due to some thin, overlapping, duplicate, outdated and irrelevant content on the site./ph3 id=”h3-4″Step 1: Assess the situation and choose a scenario/h3p
Every situation is different, but we have found similarities based on two primary factors: the size of the site and its content-based penalty risk. Below is a screenshot from our list of recommended strategies for common content auditing scenarios, which can be found
a href=”http://www.goinflow.com/content-audit-strategies/”here on GoInflow.com/a./pp
a href=”http://www.goinflow.com/content-audit-strategies/”img style=”display: block; margin: auto;” alt=”Inflow Content Audit Scenarios” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ce7633866325.01363804.jpg”/a/pp style=”text-align: center;”
Each of the colored boxes drops down to reveal the strategy for that scenario in more detail.
Hat tip to
a href=”http://www.portent.com/blog/internet-marketing/marketers-stop-honeymoon-oblivion-cycle.htm” target=”_blank”Ian Lurie’s Marketing Stack/a for design inspiration./pp
The site described above would fall into the second box within the purple column (
strongFocus: Content Audit with an eye to Improve and/or Prune, followed by KWM for key pages/strong). Here is the reasoning behind that…/pp
The site is in danger of a penalty (though it does not appear to have one “yet”) so we follow the Panda mantra:
strongImprove it or Remove it. /strongThe size of the site determines which of those two (improve or remove) gets the most attention. Smaller sites need less pruning (scalpel), while larger sites need much more (hatchet). Smaller sites often need some keyword research to determine if they are covering all of the topic areas for various stages in the customer’s buying cycle, while larger sites typically have the opposite problem: too many pages covering overlapping topic areas with low-quality (thin, duplicate, irrelevant, outdated, poorly written, automated…) content. Such a site would not require the keyword research, and would therefore not be getting a keyword matrix or content gap analysis, as the focus would be primarily on pruning the site.br
Our focus in this example will be to audit the content with an eye to improve and/or remove low-performing pages, followed by keyword research and a keyword matrix for the primary pages, including the home page, categories, blog home and key product pages, as well as certain other topical landing pages./pp
As it turns out, this hypothetical website has lots of manufacturer-supplied product descriptions. We’re going to need to prioritize which ones get rewritten first because the client does not have the cash-flow to do them all at once. When budget and time are a concern, we typically shoot for the 80/20 rule: Write great content for the top 20% of pages right away, and do the other 80% over the course of 6-12 months as time/budget permit./pp
Because this site doesn’t have an existing penalty, we will recommend that all pages stay indexed. If they had a penalty already, we would recommend they noindex,follow the bottom 80% of pages, gradually releasing them back into the index as they are rewritten. This may not be the way you choose to handle the same situation, which is fine, but the point is you can easily sort the pages by any number of metrics to determine a relative “priority”. The bigger the site and tighter the budget, the more important it is to prioritize what gets worked on first./pp
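The 80/20 prioritization above is easy to reproduce outside a spreadsheet. Here is a minimal sketch in Python, assuming organic entrances as the priority metric; any other column would work the same way. The function name and sample pages are hypothetical.

```python
def split_by_priority(pages, metric, top_percent=20):
    """Sort pages by a metric (highest first) and split off the top share."""
    ranked = sorted(pages, key=lambda page: page[metric], reverse=True)
    # Integer ceiling division, so a small site still gets at least one page.
    cutoff = (len(ranked) * top_percent + 99) // 100
    return ranked[:cutoff], ranked[cutoff:]


# Hypothetical product pages with organic entrances as the priority metric.
pages = [
    {"url": f"/product-{n}/", "organic_entrances": entrances}
    for n, entrances in enumerate([5, 120, 40, 0, 310, 18, 77, 2, 9, 63], start=1)
]
rewrite_now, rewrite_later = split_by_priority(pages, "organic_entrances")
```

On a penalized site, `rewrite_later` would be the set to noindex,follow until each page has been rewritten.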
strongCauses of Content-Related Penalties/strong/pp
For the purpose of a content audit we are only concerned with content-related penalties (as opposed to links and other off-page issues), which typically fall under three major categories: Quality, Duplication, and Relevancy. These can be further broken down into other issues, which include – but are not limited to:/pul
liTypical low quality content
liPoor grammar, written primarily for search engines (includes keyword stuffing), unhelpful, inaccurate…/li /ul/li liCompletely irrelevant content
liOK in small amounts, but often entire blogs are full of it./li liA typical example would be a “linkbait” piece circa 2010./li /ul/li liThin / Short content
liGlossed over the topic, too few words, all image-based content…/li /ul/li liCurated content with no added value
liComposed almost entirely of bits and pieces of content that exist elsewhere./li /ul/li liMisleading Optimization
liTitles or keywords targeting queries for which content doesn’t answer or deserve to rank/li liGenerally not providing the information the visitor was expecting to find/li /ul/li liDuplicate Content
liInternally duplicated on other pages (e.g. categories, product variants, archives, technical issues…)/li liExternally duplicated (e.g. manufacturer product descriptions, product descriptions duplicated in feeds used for other channels like Amazon, shopping comparison sites and eBay, plagiarized content…)/li /ul/li liStub Pages (e.g. “No content is here yet, but if you sign in and leave some user-generated-content then we’ll have content here for the next guy”. By the way, want our newsletter? Click an AD!)/li liIndexable internal search results/li liToo many indexable blog tag or blog category pages/li liAnd so on…/li/ulp
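A couple of the issues in the list above can be flagged automatically as a first pass. The sketch below, in Python, marks pages with very few words and pages that share a title tag. The 150-word threshold is an assumption chosen for illustration, not a figure from our process. A flag only means "look at this page", never "remove it".

```python
from collections import Counter

THIN_WORD_COUNT = 150  # assumed threshold, not a figure from the article


def flag_content_issues(rows, thin_threshold=THIN_WORD_COUNT):
    """Return {url: [issue, ...]} for thin pages and repeated title tags."""
    title_counts = Counter(row["title"].strip().lower() for row in rows)
    flagged = {}
    for row in rows:
        issues = []
        if row["word_count"] < thin_threshold:
            issues.append("thin")
        if title_counts[row["title"].strip().lower()] > 1:
            issues.append("duplicate title")
        if issues:
            flagged[row["url"]] = issues
    return flagged


# Hypothetical crawl rows for illustration.
rows = [
    {"url": "/widgets/", "title": "Widgets", "word_count": 640},
    {"url": "/widgets/?sort=price", "title": "Widgets", "word_count": 640},
    {"url": "/tag/misc/", "title": "Misc", "word_count": 35},
]
flagged = flag_content_issues(rows)
```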
If you are unsure about the scale of the site’s content problems, feel free to do step 2 before deciding on a scenario…/ph3 id=”h3-5″Step 2: Scan the site/h3p
We use
a href=”http://www.screamingfrog.co.uk/” target=”_blank”Screaming Frog/a for this step, but you can adapt this process to whatever crawler you want. This is how we configure the spider’s “Basic” and “Advanced” tabs…/pp style=”text-align: center;”
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ce7fd401b519.70511833.jpg”/pp
And the “Advanced” tab…/pp style=”text-align: center;”
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ce805f5345b7.30413187.jpg”/pp
Notice that “crawl all subdomains” is checked. This is optional, depending on what you’re auditing. We are respecting “meta robots noindex”, “rel = canonical” and robots.txt. Also notice that we are
strongnot/strong crawling images, CSS, JS, Flash, external links… This type of stuff is what we look at in a Technical SEO Audit, but it would needlessly complicate a “Content” Audit. What we’re looking for here are all of the indexable HTML pages that might lead a visitor to the site from the SERPs, though the crawl may certainly lead to the discovery of technical issues./pp
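If your crawler cannot be configured this way, the same filtering can be applied to its export afterwards. Below is a minimal sketch in Python. The column names (`content_type`, `status_code`, `meta_robots`, `canonical`) are hypothetical and will differ from tool to tool, so treat them as placeholders.

```python
def is_indexable_html(row):
    """True when a crawled URL looks like an indexable HTML page."""
    is_html = row.get("content_type", "").lower().startswith("text/html")
    is_ok = row.get("status_code") == 200
    noindex = "noindex" in row.get("meta_robots", "").lower()
    canonical = row.get("canonical", "")
    # An empty canonical or one that points at the page itself is fine.
    canonical_elsewhere = bool(canonical) and canonical != row["url"]
    return is_html and is_ok and not noindex and not canonical_elsewhere


# Hypothetical crawl rows for illustration.
rows = [
    {"url": "/blue-widgets/", "content_type": "text/html; charset=UTF-8",
     "status_code": 200, "meta_robots": "", "canonical": "/blue-widgets/"},
    {"url": "/images/widget.jpg", "content_type": "image/jpeg",
     "status_code": 200, "meta_robots": "", "canonical": ""},
    {"url": "/search?q=widgets", "content_type": "text/html",
     "status_code": 200, "meta_robots": "noindex, follow", "canonical": ""},
    {"url": "/blue-widgets/?ref=nav", "content_type": "text/html",
     "status_code": 200, "meta_robots": "", "canonical": "/blue-widgets/"},
]
content_urls = [row["url"] for row in rows if is_indexable_html(row)]
```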
strongExport the complete list of URLs/strong and related data from Screaming Frog into a CSV file./ph3 id=”h3-6″Step 3: Import the URLs and start the tool/h3p
We have our own internal “Content Auditing Tool”, which takes URLs and data from Screaming Frog and Google Analytics, de-dupes them, and pulls in data from Google Webmaster Tools, Moz, Social Count and Copyscape for each URL. The tool is a bit buggy at times, however, so I’ve been experimenting with
a href=”http://urlprofiler.com/” target=”_blank”URL Profiler/a, which can essentially accomplish the same goal with fewer steps and less upkeep. We need the “Agency” version, which is about $400 per year, plus tax. That’s not too bad, considering we’d already spent several thousand on our internal tool by the time a href=”https://twitter.com/gareth_brown” target=”_blank”Gareth Brown/a released URL Profiler publicly. :-//pp
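The de-duping step is simple enough to sketch on its own. The Python below merges URL lists from several named sources and remembers which source each URL was first seen in, which is what feeds the "Source" column. It is an illustration under simple assumptions (only the scheme and host are lowercased, fragments are dropped), not our internal tool, and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lowercase the scheme and host, drop the fragment, default the path to '/'."""
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))


def dedupe_urls(sources):
    """Map each unique URL to the first source it appeared in.

    `sources` maps a source name to a list of URLs.
    """
    seen = {}
    for source_name, urls in sources.items():
        for url in urls:
            seen.setdefault(normalize(url), source_name)
    return seen


# Hypothetical URL lists for illustration.
sources = {
    "Screaming Frog": ["http://www.example.com/",
                       "http://www.example.com/blue-widgets/"],
    "Google Analytics": ["http://WWW.example.com/blue-widgets/",
                         "http://www.example.com/old-campaign/"],
}
url_sources = dedupe_urls(sources)
```

URLs that only show up in analytics, like the old campaign page here, are often the orphaned pages a crawl alone would miss.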
Below is a screenshot of what you’ll see after downloading the tool. I’ve highlighted the boxes we currently check, though it depends on the tools/APIs to which you already subscribe and will differ by user. We’ve only just started playing with uClassify for the purpose of semi-automating our topic bucketing of pages, but I don’t have a process to share yet (feel free to comment with advice)…/pp
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ce8595c23042.24842448.jpg”/pp
Right-click on the URL List box and choose “Import From File”, then choose the Screaming Frog export or any other list of URLs. There are also options to import from the clipboard or XML sitemap. Full documentation for URL Profiler
a href=”http://urlprofiler.com/documentation/#/documentation/getting-started/” target=”_blank”can be found here/a. Below are two output screenshots to give you an idea of what you’re going to end up with…/pp style=”text-align: center;”
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ce8a180be8b1.01806648.jpg”/pp style=”text-align: center;”
The output changes depending on which boxes you check and what API access you have./pp style=”text-align: center;”
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ce8a33708573.30062820.jpg”/ph3 id=”h3-7″Step 4: Import the tool output into the dashboard/h3p
As described in the 50,000 foot overview above, we have a spreadsheet template with multiple tabs, one of which is the “Content Audit” tab.
strongThe tool output gets brought into the Content Audit tab of the dashboard./strong Our internal tool automatically adds columns for Action, Strategy, Page Type and Source (of the URL). You can also add these to the tab after importing the URL Profiler output. Page Type and URL Source are optional, but Action and Strategy are key elements of the process./pp style=”text-align: center;”
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ce8df2198446.63487289.jpg”/pp
Our hypothetical client requires a Keyword Matrix. However, if your “scenario” does not involve keyword research (i.e. if it is a big site with content penalty risks) you can skip steps 5-7 and move straight to “Step 8 – Time to Analyze and Make Some Decisions”./ph3 id=”h3-8″Step 5: Import GWT data/h3p
strongMatch existing URLs from the content audit to keywords for which they already rank in Google Webmaster Tools/strong/pp
There may be a way to do this with URL Profiler. If so, I haven’t found it yet. Here is what we do to grab the landing page and associated keyword/query data from Google Webmaster Tools, which we then import into two tabs (GWT Top Queries and GWT Top Pages). These tabs are helpful when filling out the Keyword Matrix because they tell you which pages Google is already associating with each ranking keyword. This step can actually be skipped altogether for huge sites with major content problems because the “Focus” is going to be on pruning the site of low quality content, rather than doing any keyword research or content gap analysis./pp
strongInstructions for Importing Top Pages from GWTbr
liLog into GWT from a Chrome browser/li liGo to Search Traffic > Search Queries/li liSwitch the view to “Top pages” (default is “Top queries”)/li liChange the date range to start as far back as possible (i.e. 3 months)/li liExpand the number of rows shown to the maximum of 500
liThis will put the s=500 parameter in the URL. Change s=500 to s=10000 or however many rows of data are available/li liSee bottom of GWT page (e.g. 1-500 of ####)./li /ul/li liIn the Chrome menu go to View > Developer > JavaScript Console/liliCopy and paste the following script into the console window and press Enter./liliThis action should expand all of the drop-downs to show the keywords under each “page” URL and then open up a dialog window that will ask you to save a CSV file: (a href=”http://www.lunametrics.com/blog/2014/01/23/google-webmaster-tools-data-not-provided/” style=”background-color: initial;”more info here/a and a href=”http://www.lunametrics.com/blog/2014/04/30/gwt-top-pages-export-bookmarklet/” style=”background-color: initial;”here/a)./li liThe script is also available as a JavaScript bookmarklet a target=”_blank” href=”http://www.lunametrics.com/blog/2014/04/30/gwt-top-pages-export-bookmarklet/”on Lunametrics.com/a…/li /ulli
pre(function(){eval(function(p,a,c,k,e,r){e=function(c){return(c<a?'':e(parseInt(c/a)))+((c=c%a)>35?String.fromCharCode(c+29):c.toString(36))};if(!''.replace(/^/,String)){while(c--)r[e(c)]=k[c]||e(c);k=[function(e){return r[e]}];e=function(){return'\\w+'};c=1};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p}('C=M;k=0;v=e.q(\'1g-1a-18 o-y\');z=16(m(){H(v[k]);k++;f(k>=v.c){15(z);A()}},C);m H(a){a.h(\'D\',\'#\');a.h(\'11\',\'\');a.F()}m A(){d=e.10(\'Z\').4[1].4;2=X B();u=B.W.R.Q(d);7=e.q(\'o-G-O\');p(i=0;i<7.c;i++){d=u.J(7[i]);2.K([d,7[i].4[0].4[0].j])}7=e.q(\'o-G-14\');p(i=0;i<7.c;i++){d=u.J(7[i]);2.K([d,7[i].4[0].4[0].j])}2.N(m(a,b){P a[0]-b[0]});p(i=2.c-1;i>0;i--){r=2[i][0]-2[i-1][0];f(r===1){2[i-1][1]=2[i][1];2[i][0]++}}5="S\\T\\U\\V\\n";9=e.q("o-y-Y");6=0;I:p(i=0;i<9.c;i++){f(2[6][0]===i){E=2[6][1];12{6++;f(6>=2.c){13 I}r=2[6][0]-2[6-1][0]}L(r===1);2[6][0]-=(6)}5+=E+"\\t";l=9[i].4[0].4.c;f(l>0)5+=9[i].4[0].4[0].j+"\\t";17 5+=9[i].4[0].w+"\\t";5+=9[i].4[1].4[0].w+"\\t";5+=9[i].4[3].4[0].w+"\\n";5=5.19(/"|\'/g,\'\')}x="1b:j/1c;1d=1e-8,"+1f(5);s=e.1h("a");s.h("D",x);s.h("1i","1j.1k");s.F()}',62,83,'||indices||children|thisCSV|count|pageTds||queries|||length|temp|document|if||setAttribute||text|||function||url|for|getElementsByClassName|test|link||tableEntries|pages|innerHTML|encodedUri|detail|currInterval|downloadReport|Array|timeout1|href|thisPage|click|expand|expandPageListing|buildCSV|indexOf|push|while|25|sort|open|return|call|slice|page|tkeyword|timpressions|tclicks|prototype|new|row|grid|getElementById|target|do|break|closed|clearInterval|setInterval|else|block|replace|inline|data|csv|charset|utf|encodeURI|goog|createElement|download|GWT_data|tsv'.split('|'),0,{}))})();
Ignore any dialog windows that pop up.
You can check “Prevent this page from creating additional dialogs” to disable them.
img src=”http://d2v4zi8pl64nxt.cloudfront.net/content-audit-tutorial/53cefd996c96f6.48416937.png” height=”281px;” width=”502px;”
liImport the resulting download.csv file from GWT into the “GWT Top Pages” tab in the Content Auditing Dashboard./li /ul/olp
strongInstructions for Importing Top Queries from GWT/strongbr
liWithin GWT switch back to Top Queries./li li Adjust the date to go back as far as you can./li li Expand the number of rows shown to the maximum of 500
liThis will put the s=500 parameter in the URL. Change s=500 to s=10000 or however many rows of data are available/li ol
liSee bottom of GWT page (e.g. 1-500 of ####)./li /ol/ol/li li Select “Download this table” as a CSV file/li li Import the resulting TopSearchQueries.csv file from GWT into the “GWT Top Queries” tab in the Content Auditing Dashboard./li /olp
img src=”http://d2v4zi8pl64nxt.cloudfront.net/content-audit-tutorial/53cefd9a46ccf4.74583615.png” height=”421px;” width=”624px;”
/p/li/olh3 id=”h3-9″Step 6: Perform keyword research/h3p
This is another optional step, depending on the focus/objective of the audit. It is also highly customizable to your own keyword research (KWR) process. Use whatever methods you like for gathering the list of keywords (e.g. brainstorming, SEMRush, Google Trends, Uber Suggest, GWT, GA…). Remove all “junk” and irrelevant keywords from the list, then run the rest through a single tool that collects search volume and competition metrics. We use the Google AdWords Keyword Planner, which is outlined below./pol
liGo to a href=”http://www.google.com/sktool/”www.google.com/sktool//a while logged into the Google account associated with AdWords./li liSelect “Get search volume for a list of keywords or group them into ad groups”, paste in your list of keywords and click “Get search volume”.
liNote: At this point you should have already expanded the list as much as you need/want to so you’re just gathering data and organizing them now./li liNote: The copy/paste method is limited to 1,000 keywords. You can get up to 3,000 by uploading your simple .txt file./li /ol/li liGo to the “Keyword Ideas” tab on the next screen and Add All keywords to the plan./li liGo to the “Ad Group Ideas” tab and choose to Add All of the ad groups to the plan./li liDownload the plan, as seen in the screenshot below./li liImport the data into the AdWords Data tab of the Content Auditing Dashboard/li/olp
img src=”http://d2v4zi8pl64nxt.cloudfront.net/content-audit-tutorial/53cefd9b4e8884.45123493.png” height=”253px;” width=”624px;”/pp
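The 1,000 and 3,000 keyword limits mentioned above are easier to live with if the list is cleaned and split before you open the Keyword Planner. Below is a minimal sketch of that prep work, not part of our dashboard. The function names, the file prefix and the choice to lowercase keywords before de-duplicating are all assumptions made for illustration.

```python
def clean_keywords(keywords):
    """Trim, lowercase and de-duplicate keywords, keeping first-seen order.

    Lowercasing is an assumption: it treats "Green Widgets" and
    "green widgets" as the same keyword.
    """
    seen = set()
    cleaned = []
    for kw in keywords:
        norm = " ".join(kw.lower().split())  # collapse stray whitespace
        if norm and norm not in seen:
            seen.add(norm)
            cleaned.append(norm)
    return cleaned


def batch_keywords(keywords, batch_size=3000):
    """Split the list into chunks no larger than batch_size."""
    return [keywords[i:i + batch_size] for i in range(0, len(keywords), batch_size)]


def write_batches(batches, prefix="keywords_batch"):
    """Write each batch to its own .txt file, one keyword per line.

    The file prefix is hypothetical; use whatever naming you like.
    """
    paths = []
    for n, batch in enumerate(batches, start=1):
        path = f"{prefix}_{n}.txt"
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("\n".join(batch) + "\n")
        paths.append(path)
    return paths
```

Pass batch_size=1000 if you plan to copy and paste rather than upload a file./pp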
Use the settings below when downloading the plan:/pp
img src=”http://d2v4zi8pl64nxt.cloudfront.net/content-audit-tutorial/53cefd9c620f44.87092884.png” style=”border: medium none; transform: rotate(0rad);” height=”281px;” width=”554px;”/ph3 id=”h3-10″Step 7: Tying the keyword data together/h3p
Again, you don’t need to do this step if you’re working on a large site and the focus is on pruning out low-quality content. The GWT Queries and KWR steps provide data needed to develop a “Keyword Matrix” (KWM), which isn’t necessary unless part of your focus is on-page optimization and copywriting of key pages. Sometimes you just need to get a client out of a penalty, or remove the danger of one. The KWM comes in handy for the important pages marked as “Improve” within the Content Audit tab, so that the person writing the copy understands which keywords are important for that page. It’s SEO 101 and you can do it any way you like using whatever tools you like./pp
Google AdWords has given you the keyword, search volume and competition. Google Webmaster Tools has given you the ranking page, average position, impressions, clicks and CTR for each keyword. Pull these together into a tab called “Keyword Research” using VLOOKUPs. You should end up with something like this:/pp
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ceb2629f3057.63072307.jpg”/pp
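If you would rather script the lookup than build it with spreadsheet formulas, the same join can be sketched in a few lines. This is only an illustration of the VLOOKUP idea, not how our dashboard works. The column names (keyword, page, avg_position, impressions, clicks, ctr) are hypothetical and would need to match the headers in your own exports.

```python
def build_keyword_research(adwords_rows, gwt_rows):
    """Attach GWT columns to each AdWords row by matching on keyword.

    Both inputs are lists of dicts (e.g. from csv.DictReader). Like a
    VLOOKUP, the first GWT row found for a keyword wins, and keywords
    with no GWT match get empty strings.
    """
    gwt_by_keyword = {}
    for row in gwt_rows:
        key = row["keyword"].strip().lower()
        gwt_by_keyword.setdefault(key, row)  # keep the first match only

    gwt_fields = ["page", "avg_position", "impressions", "clicks", "ctr"]
    merged = []
    for row in adwords_rows:
        match = gwt_by_keyword.get(row["keyword"].strip().lower(), {})
        combined = dict(row)  # copy so the input rows are left untouched
        for field in gwt_fields:
            combined[field] = match.get(field, "")
        merged.append(combined)
    return merged
```

Every AdWords keyword is kept in the output, so keywords with no ranking page still appear with blank GWT columns./pp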
The purpose of these last few steps was to help with the
strongKWM/strong, an example of which is shown below:/pp
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53ceb7fe511605.17614749.jpg”/ph3 id=”h3-11″Step 8: Time to analyze and make some decisions!/h3p
All of the data is right in front of you, and your path has been laid out using the
a href=”http://www.goinflow.com/content-audit-strategies/” target=”_blank”Content Audit Scenarios tool/a. From here on the actual step-by-step process becomes much more open to interpretation and your own experience / intuition. Therefore, do not consider this a linear set of instructions meant to be carried out one after another. You may do some of them and not others. You may do them a little differently. That is all fine as long as you are working toward the goal of determining what to do, if anything, for each piece of content on the website./pul
listrongSort by Copyscape Risk Score/strong
liWhich of these pages should be rewritten?
liRewrite key/important pages, such as categories, home page, top products/li liRewrite pages with good Link and Social metrics/li liRewrite pages with good traffic/li liAfter selecting “Improve” in the Action column, elaborate in the Strategy column:
li”Improve these pages by writing unique, useful content to improve the Copyscape risk score.”/li /ul/li /ul/li liWhich of these pages should be removed / pruned?
liRemove guest posts that were published elsewhere/li liRemove anything the client plagiarized/li liRemove content that isn’t worth rewriting, such as:
liNo external links, no social shares, and very few or no entrances / visits/li /ul/li liAfter selecting “Remove” from the Action column, elaborate in the Strategy column:
li”Prune from site to remove duplicate content. This URL has no links or shares and very little traffic. We recommend allowing the URL to return a 404 or 410 response code. Remove all internal links, including from the sitemap.”/li /ul/li /ul/li liWhich of these pages should be consolidated into others?
liPresumably none, since the content is already externally duplicated./li /ul/li liWhich of these pages should be marked “Leave As-Is”?
liImportant pages which have had their content stolen
liIn the Strategy column provide a link to the CopyScape report and instructions for filing a DMCA / Copyright complaint with Google./li /ul/li /ul/li /ul/li listrongSort by Entrances or Visits (filtering out any that were already finished)/strong
liWhich of these pages should be marked as “Improve”?
liPages with high visits / entrances but low conversion, time-on-site, pageviews per session…/li liKey pages that require improvement determined after a manual review of the page/li /ul/li liWhich of these pages should be marked as “Consolidate”?
liWhen you have overlapping topics that don’t provide much unique value of their own, but could make a great resource when combined.
liMark the page in the set with the best metrics as “Improve” and in the Strategy column outline which pages are going to be consolidated into it. This is the canonical page./li liMark the pages that are to be consolidated into the canonical page as “Consolidate” and provide further instructions in the Strategy column, such as:
liUse portions of this content to round out /canonicalpage/ and then 301 redirect this page to /canonicalpage/. Update all internal links./li /ul/li /ul/li liCampaign-based or seasonal pages that could be consolidated into a single “Evergreen” landing page (e.g. Best Sellers of 2012 and Best Sellers of 2013 -> Best Sellers)./li /ul/li liWhich of these pages should be marked as “Remove”?
liPages with poor link, traffic and social metrics related to low-quality content that isn’t worth updating
liTypically these will be allowed to 404/410./li /ul/li liIrrelevant content
liThe strategy will depend on link equity and traffic as to whether it gets redirected or simply removed./li /ul/li liOut-of-Date content that isn’t worth updating or consolidating
liThe strategy will depend on link equity and traffic as to whether it gets redirected or simply removed./li /ul/li /ul/li liWhich of these pages should be marked as “Leave As-Is”?
liPages with good traffic, conversions, time on site, etc… that also have good content.
liThese may or may not have any decent external links/li /ul/li /ul/li /ul/li/ulp
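Some of the sorting rules above can be roughed out as a first-pass filter before the manual review. The sketch below is only that: the field names and both thresholds are hypothetical, and “Review” is a placeholder we made up for pages that still need a human decision, not one of the dashboard’s Action values.

```python
def suggest_action(page, risk_threshold=50, min_entrances=10):
    """Return a first-pass (action, strategy) suggestion for one page.

    `page` is a dict with hypothetical keys: is_key_page, external_links,
    social_shares, entrances and copyscape_risk. The thresholds are
    placeholders to tune per site.
    """
    has_value = (
        page["is_key_page"]
        or page["external_links"] > 0
        or page["social_shares"] > 0
        or page["entrances"] >= min_entrances
    )
    if page["copyscape_risk"] >= risk_threshold:
        if has_value:
            return ("Improve",
                    "Write unique, useful content to improve the Copyscape risk score.")
        return ("Remove",
                "Prune to remove duplicate content; allow a 404 or 410 "
                "and remove internal links.")
    if not has_value:
        return ("Remove",
                "Poor link, traffic and social metrics; redirect or allow "
                "a 404/410 depending on link equity.")
    # Everything else needs a person to pick Improve, Consolidate or Leave As-Is.
    return ("Review", "Manual review required.")
```

Treat the output as a starting value for the Action and Strategy columns. Stolen content, consolidation candidates and content quality still have to be judged by hand./pp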
strongAnother Way of Thinking About It…/strong/pp style=”text-align: center;”
img style=”width: 392px;” src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53cf1fd75b7f55.58191184.jpg” align=”middle”/pp
For big sites it is best to use a hatchet approach as much as possible, and finish up with a scalpel at the end. Otherwise you’ll spend way too much time on the project, which eats into the ROI./pp
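In practice a hatchet approach usually means giving the same Action and Strategy to every URL that matches a pattern. Here is a minimal sketch of that idea. The rule list, the function name and the decision to treat any query string as a parameter-based URL are assumptions for illustration, and a Technical Audit may call for different handling.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern rules: (path regex, action, strategy).
HATCHET_RULES = [
    (re.compile(r"^/search/"), "Remove", "Apply a noindex meta tag."),
    (re.compile(r"^/blog/tag/"), "Remove", "Apply a noindex meta tag."),
]


def hatchet(url):
    """Return (action, strategy) for URLs matching a bulk rule, else None."""
    parts = urlsplit(url)
    if parts.query:
        # Assumption: any query string marks a parameter-based URL.
        return ("Consolidate",
                "Rel canonical to the base page without the parameter.")
    for pattern, action, strategy in HATCHET_RULES:
        if pattern.search(parts.path):
            return (action, strategy)
    return None  # no bulk rule applies; handle with the scalpel
```

URLs that return None are the ones left over for page-by-page review./pp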
This is not a process that can be documented step-by-step. For the purpose of illustration, however, here are a few different
strongexamples of hatchet approaches/strong and when to consider using them./pul
liParameter-based URLs that shouldn’t be indexed
liDefer to the Technical Audit, if applicable. Otherwise, use your best judgement:
lie.g. /?sort=color, size=small/li ul
liAssuming the Tech Audit didn’t suggest otherwise these pages could all be handled in one fell swoop. Below is an “example” action and an “example” strategy for such a page:/li ul
liAction = Consolidate/li liStrategy = Rel canonical to the base page without the parameter/li /ul/ul/ul/li /ul/li liInternal search results
liDefer to the Technical Audit if applicable. Otherwise, use your best judgement:
lie.g. /search/keyword-phrase//li ul
liAssuming the Tech Audit didn’t suggest otherwise:/li ul
liAction = Remove/li liStrategy = Apply a noindex meta tag. Once they are removed from the index, disallow /search/ in the robots.txt file./li /ul/ul/ul/li /ul/li liBlog tag pages
liDefer to the Technical Audit if applicable. Otherwise…:
lie.g. /blog/tag/green-widgets/, /blog/tag/blue-widgets/ …/li ul
liAssuming the Tech Audit didn’t suggest otherwise:/li ul
liAction = Remove/li liStrategy = Apply a noindex meta tag. Once they are removed from the index, disallow /blog/tag/ in the robots.txt file./li /ul/ul/ul/li /ul/li lieCommerce Product Pages with Manufacturer Descriptions
liIn cases where the “Page Type” is known (i.e. it’s in the URL or was provided in a CMS export) and Risk Score indicates duplication…
lie.g. /product/product-name//li ul
liAssuming the Tech Audit didn’t suggest otherwise:/li ul
liAction = Improve/li liStrategy = Rewrite to improve product description and avoid duplicate content/li /ul/ul/ul/li /ul/li lieCommerce Category Pages with No Static Content
liIn cases where the “Page Type” is known…
lie.g. /category/category-name/ or /category/cat1/cat2//li ul
liAssuming NONE of the category pages have content…/li ul
liAction = Improve/li liStrategy = Write 2-3 sentences of unique, useful content that explains choices, next steps or benefits to the visitor looking to choose a product from the category./li /ul/ul/ul/li /ul/li liOut-of-Date Blog Posts, Articles and Other Landing Pages
liIn cases where the Title tag includes a date or…/li liIn cases where the URL indicates the publishing date….
liAction = Improve/li liStrategy = Update the post to make it more current if applicable. Otherwise, change Action to “Remove” and customize the Strategy based on links and traffic (i.e. 301 or 404)/li /ul/li /ul/li/ulh3 id=”h3-12″Step 9: Content gap analysis and other value-adds/h3p
Although most of these could be included as optional items during the keyword research process, I prefer to save them until last because I never know how much time I’ll have after taking care of more pressing issues.
strongContent gaps/strongbr
If you’ve gone through the trouble of identifying keywords and the pages already ranking for them, it isn’t much of a step further to figure out which keywords could lead to ideas about how to fill content gaps./pp
img src=”http://d1avok0lzls2w.cloudfront.net/uploads/blog/53cecc0a076dc0.61922054.jpg”/pp
At Inflow we like to use the “Awareness Ladder” developed by Ben Hunt, as featured in his book
a href=”http://www.amazon.com/Convert-Designing-Increase-Traffic-Conversion/dp/0470616334″Convert!/a. You can learn more about it a href=”http://www.goinflow.com/keyword-planning-pt-2-awareness-ladder/”here/a./pp
strongContent levels/strongbr
If time permits, or the situation dictates, we may also add a column to the Keyword Matrix or Content Audit which identifies which level of content the page would need to compete in its keyword space. We typically choose from Basic, Standard and Premium. This goes a long way in helping the client allocate copywriting resources to work where they’re needed the most (i.e. best writers do the Premium content)./pp
strongLanding page or keyword topic buckets/strongbr
If time permits, or the situation dictates, we may provide topic bucketing for landing pages and/or keywords. More than once this has resulted in recommendations for adding to or changing existing taxonomy with great results. The most frequent example is in the “How To” or “Resources” space for any given niche./pp
strongKeyword relevancy scores/strongbr
This is a good place to enlist the help of a client, especially in complicated niches with a lot of jargon. Sometimes the client can be working on this while the strategist is doing the content audit./ph3 id=”h3-13″Step 10: Writing up the content audit strategy document/h3p
The Content Strategy, or whatever you decide to call it, should be delivered at the same time as the audit and should summarize its findings, recommendations and next steps. It should start with an Executive Summary and then drill deeper into each section outlined therein./pp
Here is a
strongreal example/strong of an executive summary from one of Inflow’s Content Audit Strategies:/pp
As a result of our comprehensive content audit, we are recommending the following, which will be covered in more detail below:
liRemoval of about 624 pages from Google index by deletion or consolidation:
li203 Pages were marked for Removal with a 404 error (no redirect needed)/li li110 Pages were marked for Removal with a 301 redirect to another page/li li311 Pages were marked for Consolidation of content into other pages
liFollowed by a redirect to the page into which they were consolidated/li /ul/li /ul/li liRewriting or improving of 668 pages
li605 Product Pages are to be rewritten due to use of manufacturer product descriptions (duplicate content), these being prioritized from first to last within the Content Audit./li li63 “Other” pages to be rewritten due to low-quality or duplicate content./li /ul/li liKeeping 26 pages as-is with no rewriting or improvements needed unless the page exists in the Keyword Matrix, in which case it requires on-page optimization best practices be reviewed/applied./li liOn-Page optimization focus for 25 pages with keywords outlined in the Keyword Matrix tab./li /ulp
These changes reflect an immediate need to “improve or remove” content in order to avoid an obvious content-based penalty from Google (e.g. Panda) due to thin, low-quality and duplicate content, especially concerning Representative and Dealers pages with some added risk from Style pages.
strongThe Content Strategy should end with recommended next steps/strong, including action items for the consultant and the client. Here is a real example from one of our documents:/pblockquote
We recommend the following actions in order of their urgency and/or potential ROI for the site:
liRemove or consolidate all pages in the “Prune” tab of the Content Audit Dashboard
liDetailed instructions for each page can be found in the “Strategy” column of the Prune tab/li /ol/li liBegin a copywriting project to improve/rewrite content on Style pages to ensure unique, robust content and proper keyword targeting.
liInflow can provide support for your own copywriters, or we can use our in-house copywriters, depending on budget and other considerations. As part of this process, these items can also be addressed:
liImprove/rewrite all pages in the Keyword Matrix to match assigned keywords.
liInclude on-page optimization (e.g. Title, description, alt attributes, keyword use, etc.)
liSee the “Strategy” column for more complete instructions for each page./li /ol/li /ol/li liImprove/rewrite all remaining pages from the “Content Audit” tab listed as “Improve”./li /ol/li /ol/li/olp
emstrong/strong/em/ph2 id=”h2-7″Resources, links, and post-scripts…/h2p
a href=”https://docs.google.com/spreadsheet/ccc?key=0Aupv-89nVqawdGdPTFZmeEUxSnIwdW9UWmFFWlVlVkEusp=sharing” Example Content Auditing Dashboard/abr
Make a copy of this Google Docs spreadsheet, which is a basic version of how we format ours at Inflow./pp
a href=”http://www.goinflow.com/content-audit-strategies/” Content Audit Strategies for Common Scenarios/abr
This page/tool will help you determine where to start and what to focus on for the majority of situations you’ll encounter while doing comprehensive content audits./pp
a href=”http://www.quicksprout.com/2014/04/24/how-to-conduct-a-content-audit-on-your-site/”How to Conduct a Content Audit on Your Site/a emby Neil Patel of QuickSprout/embr
Oh wait, I can’t send everyone to a page that makes them navigate a gauntlet of pop-ups to see the content, and another one to leave. So never mind…/pp
a href=”https://www.distilled.net/blog/seo/how-to-perform-a-content-audit/” How to Perform a Content Audit/a emby Kristina Kledzik of Distilled /embr
This one focuses mostly on categorizing pages by buying cycle stage./pp
a href=”http://www.goinflow.com/expanding-horizons-ecommerce-content-strategy/”Expanding the Horizons of eCommerce Content Strategy/a emby Dan Kern of Inflow/embr
Dan wrote an epic post recently about content strategies for eCommerce businesses, which includes several good examples of content on different types of pages targeted toward various stages in the buying cycle./pp
a href=”https://www.distilled.net/content-guide/”Distilled’s Epic Content Guide/abr
See the section on Content Inventory and Audit./pp
a href=”http://blog.braintraffic.com/2009/03/the-content-inventory-is-your-friend/”The Content Inventory is Your Friend/aem by Kristina Halvorson on BrainTraffic/embr
Praise for the life-changing powers of a good content audit inventory./pp
a href=”http://www.verticalmeasures.com/content-marketing-2/how-to-perform-a-content-marketing-audit/”How to Perform a Content Marketing Audit/a by emTemple Stark on Vertical Measures/embr
Temple did a good job of spelling out the “how to” in terms of a high-level overview of his process to inventory content, assess its performance and make decisions on what to do next./pp
a href=”http://contentmarketinginstitute.com/2011/01/content-audits/”Why Traditional Content Audits Aren’t Enough/a emby Ahava Leibtag on Content Marketing Institute’s blogbr
/emWhile not a step-by-step “How To” like this post, Ahava’s call for marketing analysts to approach these projects from both a quantitative (content inventory) and a qualitative (content quality audit) perspective resonated with me the first time I read it, and is partly responsible for how I’ve approached the process outlined above./p