Project Description:
I need a web scraper written in Ruby with the following features:
*Must be run from the command line
*Must be written in Ruby
*Will work with Ubuntu 12.04 LTS
*Use well supported Gems
*Must be able to specify the website to be scraped
*Must be able to specify the age of pages to be scraped (every page that I want to scrape has a date on it)
*Must scrape all TEXT (title and body of page) and output to JSON
*Must scrape all HTML and output to JSON
*Rate limiting
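To illustrate how the command-line, page age, JSON output, and rate limiting items above could fit together, here is a minimal sketch using only the Ruby standard library. The flag names (--url, --since, --delay), the method and class names, and the JSON keys are all assumptions made for illustration and are not part of the requirements. Fetching and HTML parsing are left out; a well supported gem such as Nokogiri would be one option for that part.

```ruby
require 'optparse'
require 'json'
require 'date'

# Parse hypothetical command-line flags (names are assumptions, not requirements).
def parse_options(argv)
  options = { :delay => 1.0 } # assumed default: one second between requests
  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: scraper.rb --url URL [--since DATE] [--delay SECONDS]'
    opts.on('--url URL', 'Website to be scraped') { |v| options[:url] = v }
    opts.on('--since DATE', 'Skip pages dated before DATE (YYYY-MM-DD)') do |v|
      options[:since] = Date.parse(v)
    end
    opts.on('--delay SECONDS', Float, 'Minimum seconds between requests') do |v|
      options[:delay] = v
    end
  end
  parser.parse(argv)
  options
end

# Simple rate limiter: enforces a minimum interval between calls to #wait.
class RateLimiter
  def initialize(min_interval)
    @min_interval = min_interval
    @last_call = nil
  end

  def wait
    if @last_call
      remaining = @min_interval - (Time.now - @last_call)
      sleep(remaining) if remaining > 0
    end
    @last_call = Time.now
  end
end

# True when the page's date passes the age filter (no filter means keep all pages).
def recent_enough?(page_date, since)
  since.nil? || page_date >= since
end

# Build one output record holding both the text and the raw HTML of a page.
# The key names are illustrative only.
def page_record(url, page_date, title, body_text, html)
  {
    'url'   => url,
    'date'  => page_date.to_s,
    'title' => title,
    'text'  => body_text,
    'html'  => html
  }
end
```

With these pieces, a scraping loop would call wait on the limiter before each request, skip any page for which recent_enough? returns false, and print JSON.generate(page_record(...)) for the pages that remain.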
Skills required:
Ruby, Data Mining, Software Architecture, Web Scraping