This question is oriented around Python, but it is really about client authentication in general. My goal is to parse data from a website that requires a user name and password. Having been unsuccessful there, I decided to see if I could parse quantnet (specifically the 9841 stats in finance course forum). I have been equally unsuccessful, so I am wondering if there is something wrong with my method. Any help would be greatly appreciated. My code, at the end of this post, is in Python, but I think it should be fairly universal to anyone with knowledge of the HTTP protocol.
I may have misspelled the lexical-nightmare class names here, but they are right in my script. I am able to open the page, and I get no error code (code = 200, OK). However, when I look at the page I see that I'm logged in as a guest, and I am unable to view anything that requires a login. Further, I can see in the generated page where my request produced an error because I am not logged in.
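One check that might narrow this down, offered as a sketch rather than a diagnosis: as far as I understand it, urllib2's basic-authentication handler only sends the stored credentials after the server answers with a 401 status and a `WWW-Authenticate: Basic` challenge, so a page that comes back as 200 never triggers it. The helper name below is made up for illustration, and the sample headers are invented, not captured from quantnet.

```python
def basic_auth_was_requested(status_code, headers):
    """Return True if a response looks like an HTTP Basic auth challenge.

    status_code: integer HTTP status of the response.
    headers: mapping of header names to header values.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = dict((name.lower(), value) for name, value in headers.items())
    challenge = lowered.get('www-authenticate', '')
    return status_code == 401 and challenge.strip().lower().startswith('basic')


# Invented sample responses, for illustration only.
print(basic_auth_was_requested(401, {'WWW-Authenticate': 'Basic realm="forum"'}))  # True
print(basic_auth_was_requested(200, {'Content-Type': 'text/html'}))                # False
```

If the forum page never produces such a challenge, that would suggest the site expects a login form and cookies rather than HTTP Basic authentication, though that is a guess to be confirmed against the actual response.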
Note: I have also changed the User-Agent header from Python's default to Mozilla, just in case quantnet doesn't like robots roaming around.
Many thanks in advance to anyone who can show me the light.
Code:
import urllib2

my_username = 'blablabla'
my_password = 'ferferfer'

# stats forum
forum = 'http://www.quantnet.com/forum/forumdisplay.php?f=139'

# request object
request = urllib2.Request(forum)

# password manager (realm=None applies the credentials to any realm)
manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
manager.add_password(realm=None, uri='http://www.quantnet.com',
                     user=my_username, passwd=my_password)

# basic-authentication handler
handler = urllib2.HTTPBasicAuthHandler(manager)

# build the opener
opener = urllib2.build_opener(handler)

# open the page
page = opener.open(request).read()
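
The User-Agent change mentioned in the note is not part of the listing above. A minimal sketch of one way to set it, assuming a generic 'Mozilla/5.0' value (the exact string used is not given in the post) and with a fallback import so the same lines also run on Python 3:

```python
try:
    import urllib2                      # Python 2, as in the script above
except ImportError:
    import urllib.request as urllib2    # Python 3 fallback exposing the same Request class

forum = 'http://www.quantnet.com/forum/forumdisplay.php?f=139'

# Assumed browser-like value; any string can be sent as the User-Agent.
request = urllib2.Request(forum)
request.add_header('User-Agent', 'Mozilla/5.0')

# The library stores header names capitalised, hence 'User-agent' in the lookup.
print(request.get_header('User-agent'))
```

Nothing is sent over the network here; the request object would still have to be passed to `opener.open()` as in the script.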