Web scraping project (crawling and scraping user forums)
We are looking to build a web crawler/scraper that visits a few specific web forums and pulls data from a handful of fields (for example, a forum member's name, date joined, and so on). The crawler will likely need to be customized for each forum we are targeting (specifically, the user forums of Tableau Software and QlikTech).
Key requirements:
a) Generate a list of all members of the forum, with a master key for each member (likely the URL of that member's page)
b) Pull specific data fields relating to each forum member (e.g. Name, Title, Joined Date)
c) Generate a CSV or XLS file
d) The script(s) must be in a form that we can run each week (and possibly modify), so that we can keep the data up to date.
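
To make steps a) through d) concrete, here is a rough sketch of how the pieces could fit together. It is only an illustration under assumptions, not a specification: it is written in Python for brevity (the same approach works in PHP), the `class="member"` link markup, the function names, and the column names are all hypothetical, and the real selectors will depend on each forum's actual HTML. It collects member page URLs to use as the master key, merges each week's records into the previous data by that key, and writes a CSV.

```python
import csv
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical output columns; "url" is the master key for each member.
FIELDS = ["url", "name", "title", "joined"]


class MemberLinkParser(HTMLParser):
    """Collects links marked <a class="member" href="..."> (assumed markup)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.member_urls = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "a" and "member" in classes and attr.get("href"):
            # Resolve relative links against the listing page URL.
            self.member_urls.append(urljoin(self.base_url, attr["href"]))


def extract_member_urls(html, base_url):
    """Step a): return unique member page URLs, in page order."""
    parser = MemberLinkParser(base_url)
    parser.feed(html)
    return list(dict.fromkeys(parser.member_urls))


def merge_members(existing_rows, new_rows):
    """Step d): update last week's rows with this week's, keyed by URL."""
    merged = {row["url"]: row for row in existing_rows}
    for row in new_rows:
        merged[row["url"]] = row
    return list(merged.values())


def write_csv(rows, out):
    """Step c): write the member rows to an open text file object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

Step b), pulling Name, Title, and Joined Date from each member page, would follow the same parsing pattern but is left out here because it depends entirely on each forum's page layout. Keying on the member URL means a weekly run can overwrite changed records and append new members without creating duplicates.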
The two web forums are described in the attached file, with the URL, the required fields, and an example of the data output for each member.
Desired skills: MySQL administration, HTML, PHP