Get the redirected URL from the response object - treq (Twisted)

By default, treq follows page redirects the way a browser does, so no extra work is needed to reach the final page. At times, though, we need to know which URL the response actually came from.

Unlike the python-requests library, the treq/Twisted response object doesn't have a 'url' attribute holding the redirected URL. Fortunately, the response object (in Twisted > 13.1.0) has a 'previousResponse' attribute, which is the redirect response that led to the current one. Its 'Location' header gives us the redirected URL, as below.


#!/usr/bin/env python

__author__ = 'vignesh'

import treq
from twisted.internet import reactor


def get_page(url):
    print("Requesting URL : %s" % url)
    d = treq.get(url=url)

    def callback(result):
        # previousResponse is the redirect response that led to this one,
        # or None if the request was not redirected.
        previous = result.previousResponse
        location = None
        if previous:
            location = previous.headers.getRawHeaders("location")

        # Fall back to the requested URL when there was no redirect.
        response_from_url = location[0] if location else url
        print("Got response from %s" % response_from_url)

    def errback(error):
        print(error)

    d.addCallbacks(callback, errback)
    # Stop the reactor whether the request succeeded or failed.
    d.addBoth(lambda x: reactor.stop())

    reactor.run()

if __name__ == "__main__":
    get_page("http://google.com")
 
 
Output:

Requesting URL : http://google.com
Got response from 'http://www.google.co.in/?gfe_rd=cr&ei=leGSVdrDB6fv8wfvn4G4BQ'
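
A 'Location' header can be relative, and a request may pass through more than one redirect, so it can help to walk the whole 'previousResponse' chain and resolve each hop against the URL before it. Below is a minimal sketch of that idea, not something treq provides in this form. The helper name redirect_chain and the FakeHeaders/FakeResponse classes are hypothetical; the fakes only mimic the 'previousResponse' and 'headers.getRawHeaders()' parts of a real response so the sketch runs without a network. It assumes Python 3 and header values returned as text.

```python
from urllib.parse import urljoin


def redirect_chain(response, requested_url):
    """Return every URL visited, from the requested URL to the final one.

    `response` is any object exposing `previousResponse` and
    `headers.getRawHeaders(name)`, as a Twisted/treq response does.
    """
    # Walk the linked list of earlier responses, then put it oldest-first.
    hops = []
    previous = response.previousResponse
    while previous is not None:
        hops.append(previous)
        previous = previous.previousResponse
    hops.reverse()

    urls = [requested_url]
    for hop in hops:
        location = hop.headers.getRawHeaders("location")
        if not location:
            break
        # Location may be relative; resolve it against the URL that produced it.
        urls.append(urljoin(urls[-1], location[0]))
    return urls


class FakeHeaders:
    """Hypothetical stand-in for the headers object of a real response."""

    def __init__(self, raw):
        self._raw = {name.lower(): values for name, values in raw.items()}

    def getRawHeaders(self, name, default=None):
        return self._raw.get(name.lower(), default)


class FakeResponse:
    """Hypothetical stand-in for a treq/Twisted response."""

    def __init__(self, headers=None, previousResponse=None):
        self.headers = FakeHeaders(headers or {})
        self.previousResponse = previousResponse


# Two redirects: an absolute Location followed by a relative one.
first = FakeResponse({"Location": ["http://www.example.com/"]})
second = FakeResponse({"Location": ["/home?lang=en"]}, previousResponse=first)
final = FakeResponse(previousResponse=second)

chain = redirect_chain(final, "http://example.com")
print(chain)
```

The last element of the returned list is the URL the final response came from. When there was no redirect, the list contains only the requested URL.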

 
 
