Harishvk27’s Weblog

December 4, 2008

Collecting a URL list from the web…

Filed under: random - collection — harishvk27 @ 2:31 pm

I wanted to collect random URLs from the net, and this post describes my current quick-and-dirty way of doing so.

(1) Visited http://www.adddirectoryeasy.com/, saved the page sources to web-sites.txt, and ran the following command:

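(2) grep "a href=" web-sites.txt | grep "target=\"_blink\"" | awk '{ print $2 }' | grep -v src | cut -d'=' -f2 | cut -d'"' -f2 | sort -u > websites.txt

The quotes must be plain ASCII quotes for the shell to accept the command. The pipeline ends in sort -u so that websites.txt holds the de-duplicated URL list; uniq alone only drops adjacent repeats, and piping into wc -l would write just the line count to the file.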
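A slightly more forgiving variant of the same idea, as a rough sketch only: it pulls out each whole anchor tag first, so it does not depend on href being the second whitespace-separated field on the line. The function name extract_urls is made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep do) and that the directory pages really mark outbound links with target="_blink" as in the command above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print the unique href
# values of <a> tags whose target attribute is "_blink".
# Reads the files given as arguments, or stdin when none are given.
extract_urls() {
    cat "$@" |
        # keep only the opening <a ...> tags that carry target="_blink"
        grep -o '<a [^>]*target="_blink"[^>]*>' |
        # isolate the href="..." attribute inside each tag
        grep -o 'href="[^"]*"' |
        # take the text between the double quotes
        cut -d'"' -f2 |
        # sort and drop duplicates
        sort -u
}

# Usage (file names as in the post):
#   extract_urls web-sites.txt > websites.txt
```

This is still pattern matching on HTML, so it will miss links written with single quotes or uppercase attribute names; for anything beyond a one-off scrape a real HTML parser is the safer choice.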
