Automate Everything w/ Bash, Linux & Command Line
  1. 16 Lines of Bash = Keyword Research Glory

    I’ve previously talked a bit about why I perform keyword research in the manner I do. I also shared several short Bash scripts that allow you to scrape search suggestion tools provided by Google, Yahoo, Bing and Amazon. It’s now time to show you how to take those tools and combine them into a thoroughly beastly keyword research tool.

    The Starting Point

    Here is the version of the script we ended with in the last post.

    #!/bin/bash
    
    q=$(echo "$1" | sed 's/ /%20/g')
    
    clear
    echo -e "\nGetting Suggestions for \"$1\""
    
    echo -e "\nGoogle:"
    curl -s "http://www.google.com/s?sugexp=pfwl&cp=15&q=$q" | sed 's/\[/\n\[/g' | cut -d'"' -f2 | tail -n +4
    
    echo -e "\nAmazon:"
    curl -s "http://t1-completion.amazon.com/search/complete?method=completion&q=$q&search-alias=aps&client=amazon-search-ui&mkt=1&x=updateISSCompletion&sc=1" | sed 's/,\[{".*//g;s/,/\n/g' | cut -d'"' -f2 | grep -v '\[\|\]\|\{\|\]' | tail -n +2
    
    echo -e "\nYahoo:"
    y_1=$(curl -s "http://sugg.us.search.yahoo.net/gossip-us-ura?droprotated=1&output=sd1&command=$q&nresults=10")
    # this way uses our keyword as a suffix rather than prefix. usually there are some duplicates with the first method but those are removed later.
    y_2=$(curl -s "http://sugg.us.search.yahoo.net/gossip-us-ura?droprotated=0&output=sd1&command=$q&nresults=10")
    echo $y_1 $y_2 | sed 's/{/\n{/g' | grep '"k"' | cut -d'"' -f4 | sort -u
    
    echo -e "\nBing:"
    curl -s "http://api.bing.com/qsonhs.aspx?FORM=ASAPIW&mkt=en-US&type=cb&cb=sa_inst.apiCB&q=$q&cp=13&bq=$q" | sed 's/{/\n{/g' | grep '"Txt"' | cut -d'"' -f4
    echo
    

    If you run the script like this…

    ./script.sh "galaxy nexus"
    

    then you’ll get the following results:

    Getting Suggestions for "galaxy nexus"
    
    Google:
    galaxy nexus
    galaxy nexus release date
    galaxy nexus review
    galaxy nexus verizon
    galaxy nexus dock
    galaxy nexus sprint
    galaxy nexus accessories
    galaxy nexus extended battery
    galaxy nexus at\u0026t
    galaxy nexus specs
    
    Amazon:
    galaxy nexus
    galaxy nexus case
    galaxy nexus dock
    galaxy nexus extended battery
    galaxy nexus battery
    galaxy nexus screen protector
    galaxy nexus accessories
    galaxy nexus unlocked
    galaxy nexus verizon
    galaxy nexus car dock
    
    Yahoo:
    galaxy nexus
    galaxy nexus 4g lte
    galaxy nexus accessories
    galaxy nexus prime
    galaxy nexus release date
    galaxy nexus review
    galaxy nexus sprint
    galaxy nexus update
    galaxy nexus verizon
    galaxy nexus vs iphone 4s
    samsung galaxy nexus
    
    Bing:
    galaxy nexus
    

    This is great, but we are missing out on many keyword opportunities. Each of these sites returns only a limited number of suggestions at a time: each engine chooses its 10 “best” results. We can get a different set of 10 results from each one by appending a space and then one letter to the end of our root keyphrase, as if you were starting to type another word onto the end of your query. You can see what I mean by trying it yourself in Google.

    Since Google will evaluate each keystroke you type and send a GET request back to their servers to return updated suggestions, we can use this to our advantage to greatly expand the list of results returned. Let’s rerun the script above, but this time we’ll append an a to the original keyphrase…

    ./script.sh "galaxy nexus a"
    

    and you’ll get this:

    Getting Suggestions for "galaxy nexus a"
    
    Google:
    galaxy nexus accessories
    galaxy nexus at\u0026t
    galaxy nexus amazon
    galaxy nexus at\u0026t release date
    galaxy nexus apps
    galaxy nexus accessories verizon
    galaxy nexus armband
    galaxy nexus accessories dock
    galaxy nexus adb driver
    galaxy nexus amazonwireless
    
    Amazon:
    galaxy nexus accessories
    galaxy nexus armband
    galaxy nexus att
    galaxy nexus android smartphone
    galaxy nexus anti glare
    galaxy nexus android case
    galaxy nexus androidified
    galaxy nexus assesories
    galaxy nexus android
    galaxy nexus anti glare screen protector
    
    Yahoo:
    galaxy nexus accessories
    galaxy nexus ad mix up
    galaxy nexus amazon
    galaxy nexus and facebook
    galaxy nexus android
    galaxy nexus and verizon
    galaxy nexus apps
    galaxy nexus at
    galaxy nexus at&t
    galaxy nexus availability
    
    Bing:
    

    Bing is the only engine which doesn’t offer additional suggestions (maybe Microsoft doesn’t like the Galaxy Nexus?). Anyway, you get the idea. You could just iterate through the alphabet by rerunning the script for each letter. But, why would we want to do that? We want to automate everything!

    This is the perfect opportunity to use a for loop as provided by Bash. There are many ways to use them, so I recommend running through the examples provided here if you want to get a better understanding of how they work.
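
    As a quick illustration of the loop itself, here is a minimal sketch that only builds the 26 encoded queries and prints them, without contacting any site. The function name build_queries is made up for this example and is not part of the script in this post.

```shell
#!/bin/bash
# build_queries is a hypothetical helper used only for illustration.
# It prints one URL-encoded query per letter: "<phrase>%20a" ... "<phrase>%20z".
build_queries() {
    local q suffix
    q=$(echo "$1" | sed 's/ /%20/g')   # same space-encoding step the script uses
    for suffix in {a..z}; do           # brace expansion yields a, b, c, ... z
        echo "$q%20$suffix"
    done
}

# Show the first three generated queries.
build_queries "galaxy nexus" | head -n 3
```

    Each printed value is what gets substituted for the query parameter in the curl URLs, so the loop body only has to differ by that one trailing letter.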

    Keyword Suggest Script w/ Iteration

    #!/bin/bash
    
    q=$(echo "$1" | sed 's/ /%20/g')
    tmp=tmp.txt
    
    echo "" > $tmp
    for suffix in {a..z}
    do
        curl -s "http://www.google.com/s?sugexp=pfwl&cp=15&q=$q%20$suffix" | sed 's/\[/\n\[/g' | cut -d'"' -f2 | tail -n +4 >> $tmp
        curl -s "http://t1-completion.amazon.com/search/complete?method=completion&q=$q%20$suffix&search-alias=aps&client=amazon-search-ui&mkt=1&x=updateISSCompletion&sc=1" | sed 's/,\[{".*//g;s/,/\n/g' | cut -d'"' -f2 | grep -v '\[\|\]\|\{\|\]' | tail -n +3 >> $tmp
        curl -s "http://sugg.us.search.yahoo.net/gossip-us-ura?droprotated=1&output=sd1&command=$q%20$suffix&nresults=10" | sed 's/{/\n{/g' | grep '"k"' | cut -d'"' -f4 >> $tmp
        curl -s "http://api.bing.com/qsonhs.aspx?FORM=ASAPIW&mkt=en-US&type=cb&cb=sa_inst.apiCB&q=$q%20$suffix&cp=13&bq=$q" | sed 's/{/\n{/g' | grep '"Txt"' | cut -d'"' -f4 >> $tmp
    done
    
    sed '/^$/d' $tmp | sort | uniq
    rm $tmp
    

    This script runs the previous commands once per letter of the alphabet for each site. That means we’ve gone from running the previous script 26 times (104 requests across the four sites) to running this script once. It also does the tedious work of removing duplicate suggestions and sorting the results alphabetically. Using "galaxy nexus" as my root phrase again, I get a total of 551 unique keyword suggestions.
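
    To see what the cleanup step at the end of the script does on its own, here is a small sketch that feeds it a few sample lines instead of real suggestions. The wrapper name dedupe_suggestions is an assumption made for this example; the pipeline inside it is the same one the script uses.

```shell
#!/bin/bash
# dedupe_suggestions is a hypothetical wrapper around the script's final pipeline:
# drop empty lines, sort alphabetically, then collapse adjacent duplicates.
dedupe_suggestions() {
    sed '/^$/d' | sort | uniq
}

# Sample input: one empty line and one repeated suggestion.
printf '%s\n' "" "galaxy nexus dock" "galaxy nexus apps" "galaxy nexus dock" \
    | dedupe_suggestions
```

    The sort has to come before uniq because uniq only removes duplicates that sit on adjacent lines.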

    I urge you to try this yourself, but don’t abuse it or Google will just end up giving you the dreaded CAPTCHA page. Of course, I recommend reviewing the results and filtering the lists down further.
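
    One way to approach both points is sketched below: a pause between letters to spread the requests out, and a small filter for trimming the final list. The helper names for_each_letter and filter_suggestions and the DELAY setting are assumptions made for this example, and I have no specific knowledge of how long a pause is actually needed to avoid the CAPTCHA page.

```shell
#!/bin/bash
# Hypothetical helpers; the names and the default delay are assumptions.

# Seconds to wait between letters. The value 2 is a guess, not a known limit.
DELAY=${DELAY:-2}

# Run the given command once per letter, passing the letter as the last
# argument, and sleep between iterations.
for_each_letter() {
    local suffix
    for suffix in {a..z}; do
        "$@" "$suffix"
        sleep "$DELAY"
    done
}

# Read suggestions on stdin and drop any line matching the given
# extended regular expression, ignoring case.
filter_suggestions() {
    grep -v -i -E "$1"
}
```

    For example, piping the script’s output through filter_suggestions 'review|specs' would remove every suggestion containing either word, and wrapping the curl calls in a function passed to for_each_letter would add the pause to each pass through the alphabet.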

    I bet this method of keyword research will replace your previous method very quickly. If not, prove it by sharing your method in the comments below or by contacting me through Google+.

    Happy automation!

     