Automate Everything w/ Bash, Linux & Command Line
  1. Blekko Inbound Links Extractor Bookmarklet

    Let’s get right to it. Start using the bookmarklet by dragging the following bookmarklet link to your bookmark toolbar.

    Blekko - Inbound Links

    1. Go to a Blekko Inbound Links SEO report like: ubuntuforums.com/ /domainlinks
    2. Click on the Blekko - Inbound Links bookmarklet from your toolbar.
    3. Watch for a new <textarea></textarea> to be added to the top of the page containing all the link data from the report.

    If you’re pulling data for a popular site like Twitter or Google, be prepared to be patient because there’s a shit ton of scraping going on. Average and/or small sites typically take a second or two. The example provided above takes only 3 or 4 seconds.

    On the topic of links…link to me or share it if you like it!

    The How and Why

    I’m still working to improve my jQuery skills. Recently I needed to automate some tasks relating to scraping data from a site. That particular site was built relying heavily on Javascript and Ajax requests, which meant that I couldn’t use my favorite combination of cURL, grep, cut and Bash to do the scraping. I needed something that would run client-side and execute Javascript.

    After thinking about it for a while, I decided to find out whether I could create a bookmarklet to do the job. I’ve been learning jQuery over the past few months, so I also wanted to make sure I could pair my developing jQuery skills with this bookmarklet project. I found this great post that provided the exact starting point I was looking for.

    After using the framework provided in that post, I was able to complete that project. In fact, the project ended up working so well that I started to wonder about other ways I could use this technique to automate daily tasks.

    Blekko’s SEO Inbound Links Report

    Very frequently, I use Blekko’s SEO tools to research sites. I’m a big fan of all the free information Blekko provides webmasters and search engine marketers. In particular, I use the Blekko Inbound Links report often. This report shows all the incoming links for any site that Blekko has in its index. It’s really easy to use and navigate. To see this report for yourself, you’ll need to create an account with Blekko (FREE).

    After your account is created, you can enter automateeverything.tumblr.com/ /domainlinks to see all the incoming links for this site. My blog is still very young (at the time of writing), so my link profile isn’t very impressive (you can help change that by linking to me!!). Here’s a screenshot of what you get when you run the same report for plus.google.com/ /domainlinks.

    So, here’s the rundown of what you’re seeing:

    • An overall summary of how many incoming links by how many unique domains.
    • A list of domains, sorted by most authority to least (as judged by Blekko’s “Host Rank”).
    • The count of how many links there are from each host/domain.

    While this is good summary information, I also want to know which pages on the domain the links are coming from, what the anchor text is, what page is being linked to and whether it’s a nofollow link or not. To get that extra detail, you have to click on each of the link counts for each linking domain. That will load a detailed list of links and will show all the information I mentioned above.

    Let me say again, I think it’s great that Blekko is providing all this information for free. But…I want more. I want it all in one place. I want it in Excel. I don’t want to try to combine browser addons and/or macros to get all this data. I don’t want to visit each linking domain’s links page individually to collect the to, from and anchor details. Doing that manually would take a ton of time, and that time would be far better spent actually analyzing the link data.

    At the time of writing, Blekko does not provide an export option. That’s why I’m going to do it for everyone!

    The Bookmarklet

    This bookmarklet will automate the entire process. In just a second or so, it will scrape all of the information on the page, request each “links” page, scrape all that and then finally output a nicely formatted report in a text box at the top of the page. From there you’ll be able to just copy and paste into Excel to be used in a pivot table, graph or whatever.

    I love Ubuntu, so I thought I’d do a demo on UbuntuForums.com to show the power of this bookmarklet.

    Like I said, easy and fast! Here are a few sample lines from the output of the UbuntuForums.com report:

    From_Host   Host_Rank   Link_From_URL   Link_From_Anchor    nofollow    Link_To_URL
    www.43things.com    468.7   http://www.43things.com/things/view/2140/use-linux      nf  http://ubuntuforums.com/
    www.linuxjournal.com    1343.3  http://www.linuxjournal.com/content/post-penguicon-unity-unification-story          http://ubuntuforums.com/
    ubuntuforums.org    1167.3  http://ubuntuforums.org/showthread.php?t=520091&page=2351           http://ubuntuforums.com/bump.php?
    ubuntuforums.org    1167.3  http://ubuntuforums.org/showthread.php?t=635117 www.ubuntuforums.com        http://www.ubuntuforums.com/
    bugs.launchpad.net  653.8   https://bugs.launchpad.net/ubuntu/+source/evolution/+bug/349312     nf  http://ubuntuforums.com/
    bugs.launchpad.net  653.8   https://bugs.launchpad.net/ubuntu/+source/fglrx-installer/+bug/545257       nf  http://ubuntuforums.com/
    bugs.launchpad.net  653.8   https://bugs.launchpad.net/ubuntu/+source/fglrx-installer/+bug/545257/comments/6        nf  http://ubuntuforums.com/
    www.geek.com    927.7   http://www.geek.com/articles/geek-pick/microsoft-expertzone-training-teaches-best-buy-employees-about-linux-inferiority-2009097/        nf  http://www.ubuntuforums.com/
    www.ubuntugeek.com  257.5   http://www.ubuntugeek.com/avast-antivirus-for-ubuntu-desktop.html       nf  http://www.ubuntuforums.com/
    www.ubuntugeek.com  257.5   http://www.ubuntugeek.com/avast-antivirus-for-ubuntu-desktop.html/comment-page-2        nf  http://www.ubuntuforums.com/
    www.piratbyran.org  100.6   http://www.piratbyran.org/index.php?view=forum&a=thread&id=37918&fview=34           http://www.ubuntuforums.com/
    www.forevergeek.com 122.7   http://www.forevergeek.com/2005/04/ubuntu_504_hoary_hedgehog_review/    Official Ubuntu forums/suppport site        http://ubuntuforums.com/
    www.forevergeek.com 122.7   http://www.forevergeek.com/2005/04/ubuntu_504_hoary_hedgehog_review/?replytocom=21953   Official Ubuntu forums/suppport site    nf  http://ubuntuforums.com/
    www.forevergeek.com 122.7   http://www.forevergeek.com/2005/04/ubuntu_504_hoary_hedgehog_review/?replytocom=21959   Official Ubuntu forums/suppport site    nf  http://ubuntuforums.com/
    www.forevergeek.com 122.7   http://www.forevergeek.com/2005/04/ubuntu_504_hoary_hedgehog_review/?replytocom=21969   Official Ubuntu forums/suppport site    nf  http://ubuntuforums.com/
    www.forevergeek.com 122.7   http://www.forevergeek.com/2005/04/ubuntu_504_hoary_hedgehog_review/?replytocom=21974   Official Ubuntu forums/suppport site    nf  http://ubuntuforums.com/
    www.forevergeek.com 122.7   http://www.forevergeek.com/2005/04/ubuntu_504_hoary_hedgehog_review/?replytocom=21985   Official Ubuntu forums/suppport site    nf  http://ubuntuforums.com/
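
    If you would rather stay on the command line than open Excel, the same report can be summarized with awk. This is only a minimal sketch: it assumes you pasted the textarea contents into a file, that the columns are tab-separated in the order of the header row above, and that nofollow links carry the value nf in the fifth column, as in the sample. The function name summarize_links and the file name report.tsv are hypothetical.

```shell
#!/bin/bash

# summarize_links: print one line per linking host with its total link
# count and its nofollow link count, busiest hosts first.
# Assumption: input is tab-separated with the columns
# From_Host, Host_Rank, Link_From_URL, Link_From_Anchor, nofollow, Link_To_URL.
summarize_links() {
    awk -F'\t' '
        NR == 1 { next }                  # skip the header row
        {
            total[$1]++                   # count every link from this host
            if ($5 == "nf") nofollow[$1]++
        }
        END {
            for (host in total)
                printf "%s\t%d\t%d\n", host, total[host], nofollow[host] + 0
        }
    ' "$@" | sort -t$'\t' -k2,2nr -k1,1   # most links first, then by host name
}
```

    Run it as summarize_links report.tsv, or pipe the report into it. The output columns are host, total links and nofollow links.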
    

    Notes on Usage

    • The bookmarklet is cross-browser and cross-platform. If you run into any compatibility issues at all, please drop me a comment with the details of your platform and the query in Blekko.
    • This bookmarklet works by calling the actual Javascript file from my public Dropbox folder. You can see the full code here: http://dl.dropbox.com/u/56691816/automateeverything/bookmarklets/blekko-inbound-links.js
    • Since this bookmarklet executes code that’s pulled from an external source, its functionality and feature set are subject to change. Not to freak you out, but if I decided to do nasty things with the Javascript at some point in the future, the only way you’d know is if you inspected the file each time before running it. But isn’t that the case with any software that updates itself?

    I hope you enjoy it. Please feel free to let me know if you have feature requests or bugs.

    Enjoy and happy automation!

     
  2. Blekko Keyword Suggest Scraper

    I love keyword research. That’s a good thing because as a Search Marketing Professional, I do a ton of it. One thing that’s important for keyword research is using a variety of sources to keep your results fresh. Another benefit of having several keyword research sources is having a diverse audience.

    When performing keyword research for Paid Search, estimating potential traffic volume is less important than for search engine optimization. As I’ve posted about several times before, my favorite style of keyword research (when search volume estimates don’t matter) is scraping search/keyword suggestion tools. I’ve been using Blekko for about a year now, both as a search engine and for its search suggest functionality. It’s only fitting that I show a simple command for returning Blekko keyword suggestions using a Bash shell.

    Blekko Keyword Suggestions

    Here’s the minimum command to run to get back formatted keyword suggestions from Blekko.

    curl -s "http://blekko.com/autocomplete?query=keyword+research" | sed 's/.*\[//g;s/\].*//g;s/","/\n/g;s/"//g'
    

    The only portion of the command above that you’ll need to modify is the keyword+research text, which is the seed keyword with each space replaced by a plus sign. Running the command above produces these results.

    keyword research
    keyword research tips
    keyword research tool
    keyword research services
    keyword research seo tool
    keyword research pro
    keyword research software
    keyword research seo
    keyword research 2
    keyword research service
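
    To see what each sed expression in that command contributes, it helps to run the same pipeline against a canned response instead of the live service. The sketch below is an illustration only: the sample payload and its field names are assumptions, not Blekko’s documented format. All the sed expressions rely on is a single bracketed list of double-quoted strings. It also assumes GNU sed, which accepts \n in the replacement text.

```shell
#!/bin/bash

# parse_suggestions: the same sed pipeline as the curl command above,
# reading the raw response from stdin.
#   s/.*\[//    drop everything up to and including the last "["
#   s/\].*//    drop the first "]" and everything after it
#   s/","/\n/   put each suggestion on its own line (GNU sed)
#   s/"//       remove the remaining double quotes
parse_suggestions() {
    sed 's/.*\[//g;s/\].*//g;s/","/\n/g;s/"//g'
}

# Hypothetical payload; the real response may have looked different.
sample='{"query":"keyword research","suggestions":["keyword research","keyword research tips","keyword research tool"]}'

printf '%s\n' "$sample" | parse_suggestions
```

    With that sample, the function prints the three phrases one per line. A suggestion that itself contains a bracket or a quote would confuse this approach, so a real JSON parser is the safer choice for anything beyond quick research.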
    

    Blekko’s search suggest tool is limited to 10 results at a time, which is very common. If you’ve read my other posts on how this is done in Google, Yahoo, Bing, etc. then you’ll know that I commonly query several search suggestion sources with one command to make this research and comparison even simpler. Below is this Blekko code integrated with my suggestion tool, which combines Google, Amazon, Yahoo, Bing and now Blekko.

    #!/bin/bash
    
    q=$(echo "$1" | sed 's/ /%20/g')
    
    clear
    echo -e "\nGetting Suggestions for \"$1\""
    
    echo -e "\nGoogle:"
    curl -s "http://www.google.com/s?sugexp=pfwl&cp=15&q=$q" | sed 's/\[/\n\[/g' | cut -d'"' -f2 | tail -n +4
    
    echo -e "\nAmazon:"
    curl -s "http://t1-completion.amazon.com/search/complete?method=completion&q=$q&search-alias=aps&client=amazon-search-ui&mkt=1&x=updateISSCompletion&sc=1" | sed 's/,\[{".*//g;s/,/\n/g' | cut -d'"' -f2 | grep -v '[][{}]' | tail -n +2
    
    echo -e "\nYahoo:"
    y_1=$(curl -s "http://sugg.us.search.yahoo.net/gossip-us-ura?droprotated=1&output=sd1&command=$q&nresults=10")
    # this way uses our keyword as a suffix rather than prefix. usually there are some duplicates with the first method but those are removed later.
    y_2=$(curl -s "http://sugg.us.search.yahoo.net/gossip-us-ura?droprotated=0&output=sd1&command=$q&nresults=10")
    echo $y_1 $y_2 | sed 's/{/\n{/g' | grep '"k"' | cut -d'"' -f4 | sort -u
    
    echo -e "\nBing:"
    curl -s "http://api.bing.com/qsonhs.aspx?FORM=ASAPIW&mkt=en-US&type=cb&cb=sa_inst.apiCB&q=$q&cp=13&bq=$q" | sed 's/{/\n{/g' | grep '"Txt"' | cut -d'"' -f4
    echo
    
    echo "Blekko:"
    curl -s "http://blekko.com/autocomplete?query=$q" | sed 's/.*\[//g;s/\].*//g;s/","/\n/g;s/"//g'
    echo; echo
    

    If you save the script as sugg.sh, you’ll be able to run it like this.

    ./sugg.sh "keyword tool"
    

    That will produce the following results:

    Getting Suggestions for "keyword tool"
    
    Google:
    keyword tool
    keyword tool free
    keyword toolbox
    keyword tool youtube
    keyword tool estimator
    keyword tool dominator
    keyword tool by google
    keyword tool traffic estimator
    keyword tool seo
    keyword tool reviews
    
    Amazon:
    keyword tool
    
    Yahoo:
    
    Bing:
    keyword tool bing
    keyword tool google
    keyword tool
    keyword tool external
    keyword tools free
    keyword tool dominator
    keyword tool adwords
    keyword tool wordtracker
    
    Blekko:
    keyword tool
    keyword tool api
    keyword tool search
    keyword tool software
    keyword tool download
    keyword tool 2
    keyword tool review
    keyword tool ppc
    keyword tool crack
    keyword tool training
    

    Pretty cool, right? Unfortunately, Yahoo doesn’t seem to have any knowledge of what a “keyword tool” is. I can’t say I’m surprised.
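
    One detail in the script is worth a second look: q only replaces spaces with %20, so a seed keyword containing characters such as & or = would break the query string. The function below is a minimal sketch of a fuller percent-encoder in pure Bash. The name urlencode is hypothetical, and it has only been reasoned through for ASCII input, so treat anything beyond that as untested.

```shell
#!/bin/bash

# urlencode: percent-encode a string for use as a query parameter value.
# Letters, digits and the characters - _ . ~ pass through unchanged;
# every other character becomes %XX (uppercase hex).
urlencode() {
    local LC_ALL=C              # work byte by byte, with ASCII-only ranges
    local s="$1" out="" c i
    for (( i = 0; i < ${#s}; i++ )); do
        c="${s:i:1}"
        case "$c" in
            [a-zA-Z0-9.~_-]) out+="$c" ;;
            *) printf -v c '%%%02X' "'$c"   # "'x" yields the numeric code of x
               out+="$c" ;;
        esac
    done
    printf '%s\n' "$out"
}
```

    To try it in sugg.sh, the assignment would become q=$(urlencode "$1"). For example, urlencode "keyword tool" prints keyword%20tool, the same value the sed version produces.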

    Keyword Suggestion Enumeration

    I mentioned above that Blekko limits the keyword suggestion results to 10 phrases and that it’s very common to do so. I’ve also posted about how it’s trivial to start with a root phrase, like “keyword tools”, and then add each letter of the alphabet on to the end to see what results get returned. Doing that for every letter of the alphabet, removing duplicates and then sorting alphabetically is the next natural step in the evolution of any keyword suggestion based keyword research tool.
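
    The two halves of that idea, building one query per letter and then de-duplicating whatever comes back, can be sketched without touching the network. The function names below are hypothetical and exist only for illustration. The encoding matches the space-to-%20 substitution used in the scripts in this post.

```shell
#!/bin/bash

# enumerate_queries: print the encoded seed phrase followed by each
# letter a..z, one query string per line (26 lines in total).
enumerate_queries() {
    local seed="${1// /%20}"    # replace every space with %20
    local suffix
    for suffix in {a..z}; do
        printf '%s%%20%s\n' "$seed" "$suffix"
    done
}

# dedupe_suggestions: drop blank lines, then sort and remove duplicates.
dedupe_suggestions() {
    sed '/^$/d' | sort -u
}
```

    Running enumerate_queries "keyword tools" starts with keyword%20tools%20a and ends with keyword%20tools%20z. Each of those strings is what gets appended to a suggestion URL, and the combined responses are then piped through dedupe_suggestions.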

    So, here’s the code for my keyword suggestion enumeration tool with the Blekko code added in. This takes a bit longer to run because it’s sending many, many HTTP GET requests and has to wait for the search engine’s responses. This version reaches out to 5 different search suggestion services and makes a total of 6 different queries per letter of the alphabet, which equals 156 requests. Think about how long that would take you to manually place those searches in your browser and then note the results. Waiting 10 or 15 seconds for a result doesn’t seem so bad now!

    Anyway, here is the code.

    #!/bin/bash
    
    q=$(echo "$1" | sed 's/ /%20/g')
    tmp=tmp.txt
    
    echo "" > $tmp
    for suffix in {a..z}
    do
        curl -s "http://www.google.com/s?sugexp=pfwl&cp=15&q=$q%20$suffix" | sed 's/\[/\n\[/g' | cut -d'"' -f2 | tail -n +4 >> $tmp
        curl -s "http://t1-completion.amazon.com/search/complete?method=completion&q=$q%20$suffix&search-alias=aps&client=amazon-search-ui&mkt=1&x=updateISSCompletion&sc=1" | sed 's/,\[{".*//g;s/,/\n/g' | cut -d'"' -f2 | grep -v '[][{}]' | tail -n +3 >> $tmp
        curl -s "http://sugg.us.search.yahoo.net/gossip-us-ura?droprotated=1&output=sd1&command=$q%20$suffix&nresults=10" | sed 's/{/\n{/g' | grep '"k"' | cut -d'"' -f4 >> $tmp
        # this way uses our keyword as a suffix rather than prefix. usually there are some duplicates with the first method but those are removed later.
        curl -s "http://sugg.us.search.yahoo.net/gossip-us-ura?droprotated=0&output=sd1&command=$q%20$suffix&nresults=10" | sed 's/{/\n{/g' | grep '"k"' | cut -d'"' -f4 >> $tmp
        curl -s "http://api.bing.com/qsonhs.aspx?FORM=ASAPIW&mkt=en-US&type=cb&cb=sa_inst.apiCB&q=$q%20$suffix&cp=13&bq=$q" | sed 's/{/\n{/g' | grep '"Txt"' | cut -d'"' -f4 >> $tmp
        curl -s "http://blekko.com/autocomplete?query=$q%20$suffix" | sed 's/.*\[//g;s/\].*//g;s/","/\n/g;s/"//g' >> $tmp
        echo >> $tmp
    done
    
    sed '/^$/d' $tmp | sort | uniq
    

    If you save it as “enum_sugg.sh”, here’s how you would run it.

    ./enum_sugg.sh "keyword tools"
    

    And here’s the result you would get (well, at least this is what I get right now).

    adwords keyword tools external
    best free keyword tools
    best keyword tools 2012
    free keyword tools online
    google adwords keyword tools external
    google keyword tool kit
    google keyword tools google searches
    google keyword tools uk
    google keyword tool youtube
    keyword density tools
    keyword discovery tools
    keyword lookup tool
    keyword niche tools
    keyword optimization tools
    keyword phrase tool
    keyword preview tool
    keyword question tool
    keyword research tools 2012
    keyword research tools free
    keyword research tools review
    keyword search tool yahoo
    keyword selector tool yahoo
    keyword suggestion tool youtube
    keyword tool adsense
    keyword tool analyzer
    keyword tool api
    keyword tool average search volume
    keyword tool bar
    keyword tool beta
    keyword tool bing
    keyword tool book
    keyword tool comma seperated
    keyword tool copy list free
    keyword tool density
    keyword tool discover
    keyword tool dominator
    keyword tool download
    keyword tool estimator
    keyword tool external adwords
    keyword tool external google
    keyword tool finder
    keyword tool find little competion warrior
    keyword tool for app store
    keyword tool for niche ideas
    keyword tool generator
    keyword tool golf instruction
    keyword tool google adwords
    keyword tool google external
    keyword tool google too many searches
    keyword tool in english
    keyword tool in google
    keyword tool in spanish
    keyword tool kit
    keyword tool microsoft word
    keyword tool multilple
    keyword tool multiple
    keyword tool old
    keyword tool old version
    keyword tool overture
    keyword tool plr
    keyword tool previous interface
    keyword tool previous months
    keyword tools accuracy
    keyword tools accurate
    keyword tools adsense
    keyword tools adwords
    keyword tools adwords google
    keyword tools affiliate
    keyword tools alexa
    keyword tools analysis
    keyword tools articles
    keyword tools available
    keyword tools based
    keyword tools beginners
    keyword tools best
    keyword tools beta
    keyword tools bing
    keyword tools blog
    keyword tools by google
    keyword tools by microsoft
    keyword tools campaign
    keyword tools canada
    keyword tools chinese
    keyword tools commercial intent
    keyword tools compared
    keyword tools comparison
    keyword tools competition
    keyword tools competitors
    keyword tools content
    keyword tools cpc
    keyword tools de adwords
    keyword tools dictionary
    keyword tools digitalpoint
    keyword tools directory
    keyword tools disadvantages
    keyword tools download
    keyword tools dreamweaver
    keyword tool search
    keyword tool search by city
    keyword tool search google
    keyword tool search on yahoo google and msn
    keyword tools ebay
    keyword tools english
    keyword tool seo
    keyword tool seo organic
    keyword tools estimator
    keyword tools exact match
    keyword tools excel
    keyword tools external
    keyword tools external google
    keyword tools external suggestion tool
    keyword tools for adsense
    keyword tools for adwords
    keyword tools for analysis
    keyword tools for bing
    keyword tools for ebay
    keyword tools for mac
    keyword tools for seo
    keyword tools for yahoo
    keyword tools free
    keyword tools free in google
    keyword tools from google
    keyword tools generator
    keyword tools google
    keyword tools google adsense
    keyword tools google adwords
    keyword tools google old version
    keyword tools google search global monthly
    keyword tools google uk
    keyword tools google vs wordtracker
    keyword tools guide
    keyword tools help
    keyword tools home
    keyword tools html
    keyword tools html code
    keyword tools hubpages
    keyword tools ideas
    keyword tool simple
    keyword tools in english
    keyword tools in google
    keyword tools internet
    keyword tools in yahoo
    keyword tools ireland
    keyword tools kei
    keyword tools keywordspy
    keyword tools link
    keyword tools link building
    keyword tools linux
    keyword tools list
    keyword tools local searches
    keyword tools location
    keyword tools long tail
    keyword tools mac
    keyword tools mac free
    keyword tools malaysia
    keyword tools market
    keyword tools market samurai
    keyword tools microsoft
    keyword tools misspell
    keyword tools moneyword matrix
    keyword tools msn
    keyword tools music
    keyword tools nichebot
    keyword tools niche finder
    keyword tool software
    keyword tools old interface
    keyword tools omniture
    keyword tools on google
    keyword tools online
    keyword tools online local
    keyword tools on the internet
    keyword tools on yahoo
    keyword tools organic seo
    keyword tools overture
    keyword tools page optimization
    keyword tools paid search
    keyword tools pay per click
    keyword tools pdf
    keyword tools photos
    keyword tools ppc
    keyword tools price
    keyword tools rank
    keyword tools rapid
    keyword tools rating
    keyword tools remove duplicates
    keyword tools research
    keyword tools resume
    keyword tools review
    keyword toolss
    keyword tools samurai
    keyword tools search
    keyword tools search engine optimization
    keyword tools seo
    keyword tools seobook
    keyword tools site
    keyword tools software
    keyword tools south africa
    keyword tools spy
    keyword tools statistics
    keyword tools techniques
    keyword tools that work
    keyword tools tips
    keyword tools to compete
    keyword tools tracker
    keyword tools traffic
    keyword tools traffic estimator
    keyword tools traffic travis
    keyword tools trends
    keyword tools tricks
    keyword tools twitter
    keyword tools uk
    keyword tools url
    keyword tools warez
    keyword tools webmaster
    keyword tools websites
    keyword tools wiki
    keyword tools word
    keyword tools wordpress
    keyword tools wordtracker
    keyword tools wordze
    keyword tools work
    keyword tools worth
    keyword tools yahoo
    keyword tools yahoo panama
    keyword tools youtube
    keyword tool target geographic location
    keyword tool that shortens urls
    keyword tool tips
    keyword tool traffic estimator
    keyword tool uk
    keyword tool volume
    keyword tool word document
    keyword tool yahoo
    keyword tool yuri arcurs
    lsi keyword tools
    paid keyword tools
    seo tools keyword density
    seo tools keyword list generator
    

    I dare you to find a better keyword research tool or method than this. I doubt you will, but if you do, I’d love to know about it and be proven wrong!

    Anyway, I hope you enjoyed the post. Please feel free to “steal” my code, use it, modify it, etc.

    As always, happy automation!