Hi,
I’m trying to write a scraper to gather data from this site: http://viaziz.com/auctions/closed
I want to build a list of item prices so I can try to predict how much each item normally sells for.
However, I’m running into some issues. Right now I’m using
file_get_html('http://viaziz.com/auctions/closed');
to retrieve the page and then parse each item. It works on my computer, but once I upload it to my remote server, it retrieves google.com instead of the target site.
I’m guessing this is something they put in place. So now I’m trying cURL, as I heard that cURL can imitate a browser, so in theory they wouldn’t be able to distinguish between a script and an actual browser.
This is the function I’m using…
function get_data($url) {
    $ch = curl_init();
    $timeout = 5; // seconds to wait while connecting
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the response as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // Send a browser-like User-Agent header
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Ubuntu/11.04 Chromium/14.0.825.0 Chrome/14.0.825.0 Safari/535.1');
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch); // false on failure
    curl_close($ch);
    return $data;
}
However, on my server it keeps failing with:
HTTP Error 500 (Internal Server Error): An unexpected condition was encountered while the server was attempting to fulfill the request.
That really isn’t much to go on…
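One way to get more to go on might be a variant of get_data() that reports what cURL actually saw rather than only returning the body. The following is a minimal sketch, not a confirmed fix: the name get_data_verbose and the keys of the returned array are hypothetical, and it assumes the 500 comes from my own PHP script (for example because the cURL extension is not loaded on the server) rather than from the target site.

```php
<?php
// Debugging only: show PHP errors instead of a bare HTTP 500.
ini_set('display_errors', '1');
error_reporting(E_ALL);

// Hypothetical helper: same request as get_data(), but returns diagnostics too.
function get_data_verbose($url) {
    // If the cURL extension is missing, curl_init() is undefined and the
    // script dies with a fatal error, which the server reports as a 500.
    if (!function_exists('curl_init')) {
        return array('ok' => false, 'error' => 'cURL extension not loaded',
                     'status' => 0, 'final_url' => '', 'body' => '');
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Follow redirects so the final URL shows where the request ended up.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Ubuntu/11.04 Chromium/14.0.825.0 Chrome/14.0.825.0 Safari/535.1');
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);

    $body = curl_exec($ch); // false on failure

    $result = array(
        'ok'        => $body !== false,
        'error'     => curl_error($ch), // empty string when no error occurred
        'status'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'body'      => $body === false ? '' : $body,
    );
    curl_close($ch);
    return $result;
}
```

Running var_dump(get_data_verbose('http://viaziz.com/auctions/closed')); on the server should then show whether the request failed inside cURL, which HTTP status came back, and whether it was redirected somewhere else.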
So, my question is, what can I use to script something that can pull data from that page?
Thanks!