
Google Hacks for Dorks and SEO prowlers

Google Hacks… or more aptly Google Dorks, are a handy tool for anyone who enjoys not only SEO but searching in general. The term comes from the hacker/cracker community, and these queries can get you lots of interesting information. They call ‘em dorks because if you’re leaving information open to the search engines that shouldn’t be… then yer a dork!

And I figure if Google is allowed to read certain files and feels like serving related data back to me…then great! There should be no reason for me not to play with them and entertain you, titillate and hopefully even educate. Oh, and yea… we’ll get to some SEO stuff too…but later.

Google Advanced Operators

As any good search geek knows, the advanced operators are a great way to mine for a variety of data. You know the ones, site: link: intitle: and the rest of the family. But let’s look at how they got that name of ‘dorks’ just to get the idea… of some dorky info floating around at Google.

Let’s have some fun with a few, shall we? First let’s go looking for some sensitive data via robots.txt. Now I am not going to show you any dirty laundry, you cheeky monkey, but if one spent enough time (and there are those that do) one would find that webmasters often think sensitive info is made invisible by this little directive, the Disallow, which only asks crawlers to stay away:

"robots.txt" "Disallow:" filetype:txt

or even…

"robots.txt" "Disallow:" "private" filetype:txt

Which can always be fun for an evening’s reading… Obviously you can play with keywords and get inventive. But that’s not really a dork since robots files are publicly available… ok, so let’s move along…
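To see why a Disallow line hides nothing, here is a minimal Python sketch that pulls the disallowed paths out of a robots.txt body. The sample content is made up for illustration, and the parsing is deliberately simple (it ignores which User-agent group a rule belongs to), so treat it as a rough reading aid rather than a full robots.txt parser.

```python
def disallowed_paths(robots_txt):
    """Return the paths listed on Disallow lines of a robots.txt body."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow means "nothing is blocked"
                paths.append(value)
    return paths


# Hypothetical robots.txt content, invented for this example.
sample = """\
User-agent: *
Disallow: /private/
Disallow: /admin/reports/  # internal only
Disallow:
Allow: /public/
"""

print(disallowed_paths(sample))
```

Run against your own site’s robots.txt, the list it prints is exactly what any curious visitor gets to read.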

 

Getting Sensitive

"not for public release"
This is an oldie but a goodie, and one can certainly play with it by looking closer at some .edu, .gov or .mil TLDs. For example;

"not for public release" inurl:edu

Or how about;

"not for distribution" confidential filetype:pdf

This will tighten it up to showing only PDFs, which I find to be ever so much more helpful. And let’s say for fun we’re in the travel sector looking for some good tidbits for link bait or other general business intelligence; we add our KW to the mix;

"not for distribution" confidential, travel, filetype:pdf

You get the idea here…. all I can say is that one can start to apply the concepts behind these hacks to find all kinds of interesting reading material. And if you’re a reporter…well, I am sure your nose is tingling at the moment.
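If you end up trying lots of variations, a few lines of Python can assemble them for you. This is only a sketch: `build_query` and `search_url` are names I made up, and the only outside fact it leans on is that a Google search URL carries the query in its `q` parameter.

```python
from urllib.parse import urlencode


def build_query(phrase, keywords=(), filetype=None, inurl=None):
    """Assemble a search query: a quoted phrase plus optional extras."""
    parts = ['"{}"'.format(phrase)]
    parts.extend(keywords)
    if filetype:
        parts.append("filetype:" + filetype)
    if inurl:
        parts.append("inurl:" + inurl)
    return " ".join(parts)


def search_url(query):
    """Turn a query into a Google search URL (the q parameter holds the query)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query("not for distribution", ["confidential", "travel"], filetype="pdf")
print(query)
print(search_url(query))
```

Swap the phrase, the keyword list or the file type and you have a fresh query to paste into the search box.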

 

Robots.txt (aka business intelligence)

Let’s say we’re working the ‘florida’ market and want to see what other sites in the space are up to. We could use;

"robots.txt" "Disallow:" filetype:txt

…or even better;

(inurl:"robot.txt" | inurl:"robots.txt" ) intext:disallow filetype:txt

and

"robots.txt" "Disallow:" filetype:txt inurl:florida

What any of that information is.. or how it can be used, I leave to your imagination – I’m just sayin’…

 

AWStats (aka Keyword Research)

Sticking with our Florida theme, let’s now go looking for some public stats reports that are ‘florida’ related…
florida intitle:"statistics of" "advanced web statistics"

Maybe we’re only interested in some .edu domains?
florida intitle:"statistics of" "advanced web statistics" inurl:.edu

Or maybe we want to see what keyphrases are being used to find .edu sites; 
keyphrases intitle:"statistics of" "advanced web statistics" inurl:.edu

And of course we can also do the same with Webalizer (or another popular stats program);
intitle:"Usage Statistics for" "Generated by Webalizer" 

and the ‘florida’ niche with these
intitle:"Usage Statistics for" "Generated by Webalizer" inurl:florida

or….

florida intitle:"Usage Statistics for" "Generated by Webalizer"

You could even search images – inurl:/webalizer/ intitle:usage statistics + hosting

You get the idea… play with it to find more goodies. If these dorks want to leave me research data to mine for KWs and so forth…what am I to do? I merely asked Google questions and went for a random walk.

 

And what can you use it for?

I say there is no end to the information both educational and entertaining out there thanks to the dorks and Google. Some of the more interesting uses I have found are;

  • KW research
  • Link Building
  • Content creation
  • Competitive intelligence
  • Nefarious things (for you A types)

And I am being tame with the examples… so one wonders, are we dorks?

During the research and many hours of playing around I found the deeper, darker side, and what I have posted here merely scratches the surface as far as nefarious uses go. It gives one pause, but ranting about Google (and other search engines) enabling this misses the fact that we are the dorks. Through laziness or lack of foresight we often leave things in public, much like leaving an open laptop unattended in the park in summer. Don’t be a dork.

I like to use them to find things like lists of directories and other reports to see what others are up to; directory filetype:xls inurl:SEO OR report filetype:xls inurl:SEO – that time looking for XLS files…

Link Builder’s dream…

Maybe you’re a happy little link builder that is looking for some nice spots to drop your legitimate/spammy links. Let’s try this;

add-links, last-updated 2000  inurl:.edu

Using advanced search operators, as we did with the Yahoo Site Explorer, is another great way to track down opportunities for the fastidious link builder. First off let’s use Yahoo’s ‘linkdomain’ operator;

linkdomain:huomah.com site:.com "SEO Blog" 

  1. linkdomain: – finds pages that link to Huomah.com
  2. site: – restricts the results to ‘.com’ domains
  3. "SEO Blog" – searches for the KWs on the page (or hopefully in the anchor text)

 

That’s the basics to give you the idea… now we’ll step it up some.

We’re looking for target pages that link to the site (my blog again) and contain the target term we’re after. This is by no means foolproof and does require some leg work, but it will make the targeting of relevant themes in your linking somewhat easier.

We can also do the same for .edu or .gov websites, which are perceived to be more valuable because search engines treat them as trusted sources. We’d do it like this;

  1. linkdomain:example.com site:.edu "keyword"
  2. linkdomain:example.com site:.gov "keyword"

…. Play with them… always some goodies to be found. We’re getting warmer….

Now let’s look at another route, which is to look at the linking sites and associated page titles. Considering the theme of the page is important to the value of the link, pages with related keywords in the page title are of interest to us. So for the keyword SEO (researching my blog as a competitor) I could do something like this;

linkdomain:huomah.com -huomah.com intitle:SEO

And when we have multiple terms such as; ‘search engine optimization’ we would use quotations;

linkdomain:huomah.com -huomah.com intitle:"search engine optimization"

Once more, we can also use inurl: which looks for the keyword(s) in the URL of the linking pages; another reasonably strong ranking signal.

linkdomain:huomah.com -huomah.com inurl:"search engine optimization"

I advise playing around to find other angles from which these can be used. This is a great method (on Google, whose link data sucks, lean on allintitle: and allinurl: instead).
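Once you have the pattern down, spitting out the variations for any domain and keyword is easy to script. A rough Python sketch follows; `link_prospect_queries` is a made-up helper name, example.com is a placeholder, and the queries simply mirror the shapes shown above without any promise about how a given engine will treat them.

```python
def link_prospect_queries(domain, keyword, tlds=(".com", ".edu", ".gov")):
    """Build linkdomain: query variations for one domain and keyword."""
    # Quote multi-word keywords so they are matched as a phrase.
    term = '"{}"'.format(keyword) if " " in keyword else keyword
    # Exclude the domain's own pages from the title and URL variations.
    base = "linkdomain:{0} -{0}".format(domain)
    queries = ["linkdomain:{} site:{} {}".format(domain, tld, term) for tld in tlds]
    queries.append("{} intitle:{}".format(base, term))
    queries.append("{} inurl:{}".format(base, term))
    return queries


for q in link_prospect_queries("example.com", "search engine optimization"):
    print(q)
```

Feed it a competitor’s domain and your target term, then work through the list by hand; the leg work still applies.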

Don’t be a Dork

There are as many ways to utilize them as the imagination will allow. Advanced search operators are one of the greatest tools for SEO practitioners and hackers alike. Understanding not only how to use them, but how to protect against them (from a hacker’s viewpoint) is huge. If you want to learn more there is some further reading below;

Google HACKS – more reading

the Google Hacking DataBase – I Hack Stuff
Google Hacking Not Fun For You – WebPro News
Advanced operators reference guide – Google Guide
Advanced search operators – Van SEO Design
The ultimate guide to advanced operators – Hybrid SEM

And even a video for you; Google Hacks 2.0

This entry was posted in Google, Tools/Resources.

21 Responses to Google Hacks for Dorks and SEO prowlers

  1. Ryan Steyn says:

    Dave, you have outdone yourself… I dont think many people know such things exist yet alone can be used by the average joe. Personally I had no clue about anything past the basic dorks (thanks Firefox) – so being able to search on this level is really something else.

    I tell you what though, its going to bring an entire new meaning to going “postal” – why copy your ass on the office printer when you can get images of your boss and secretary making out (via security camera) and printing thousands of copies via shared printers… personally i am more inclined to use this knowledge for some serious competition overrun but i am sure you see where i am going.

    Only problem being, the average has no clue about ip masking or any any other form of nefarious ip hiding tactics… which means any network administrator will pick up the intruder quite quickly – can you say jail bait? I am only referring to the contents/information within the video clip though.

    I can safely say though, this is all getting copied and pasted repeatedly until i know it by heart.

    Again, great post.

  2. Adam says:

    Awesome list… never thought of some of these before.

    Great post, thanks!
    ~A

  3. Dave says:

    Lol @Ryan; I figured you’d get a kick out of them. There are serious implications for security and even a few desktop apps that crackers use for hunting.

    Instead of merely talking about the SEO implications, I figured we’d look at the wider issues surrounding being a Google Dork ;0)

  4. Now I will know how to implement counter-measures. Most inquiries I see like this across the stats are for link hunting, but the tactics you outlined were definitely unique. Very informative, thanks…

  5. Robert says:

    Some great tips there. While I’ve done a little searching for more insecure sensitive info I never thought to use a simple Google check! Ha… it’s always the simple things we forget.

    Thanks! :)

  6. Pingback: Link Building this Week (43.2008) | Wiep.net

  7. Pingback: Vote for this article at blogengage.com

  8. Pingback: Added by a PAL to FAQ PAL

  9. John Dobson says:

    I can see many applications in the private investigation field, particularly in searching a specific person in private files, this is the best article I’ve seen on SR in 4 yrs of membership, great idea David

  10. Dave says:

    Gee thanks for the high praise there John… what the heck do I do for an encore though? Sigh, retirement is nigh

    I will certainly put my thoughts to uses in the find people world though… could be interesting.

  11. Ryan Steyn says:

    I really am glad you took it the step further… i need not say more :)

    For an encore you should consider an investigation into google staff… ye know, a finale kinda deal. Maybe Matte Cutts has a secret identity… like that guy who drops milk by the front door but is really the postman…

    Retirement… na…

  12. Mohan says:

    Very use full niformation.

  13. Bill Platt says:

    I will be back to check this time and again. Nice info. Thanks.

  14. Pingback: Interleado - SEO Software for Professionals » Blog Archive » SEO analysis tips and tools

  15. Pingback: Link Love Time! SEO Forecasting, SEM Tools, Free Ivy League Business Courses And More | SEO ROI Services

  16. Pingback: Research your posts with Buzz Monitoring | Collective Thoughts

  17. Pingback: Link Building Strategies: 69 Solid Tactics For 2009 | Wiep.net

  18. These hacks demonstration the adage that “if you don’t know what you are doing, don’t do it”.

    A while back I had learned about some Google Hacks geared toward software I had installed. I experimented with them to see what was vulnerable and thus learn how to protect my own sites.

    What surprised me the most was that when I would email the site owner their username and password, they often didn’t care.

    I even ran into e-commerce sites where their customer database, including complete credit card information, was completely exposed.

    Everyone who builds a website needs to learn about these things.

    Ryan Steyn: It is not jail time to clicking on a publicly available Google link. (On the other hand, URL manipulation such as “sql injection” is.)

  19. Great share…a lot of people don’t give up the good stuff. Got way distracted looking at nyt’s robots.txt file!!!

  20. Pingback: 21 Link Builders Share Advanced Link Building Queries
