Udemy.com is blocking the default User-Agent of opengraph. I'm getting:
urllib2.HTTPError: HTTP Error 403: Unauthorized
How do I set a custom User-Agent for the OpenGraph module?
As a workaround, I have created a custom getter using the requests module:
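import requests
from opengraph import OpenGraph
from urlparse import urljoin, urlparse  # Python 2; on Python 3 use urllib.parse

def custom_get_img_from_link(link):
    """Fetch link with a custom User-Agent and return its og:image URL, or None."""
    # headers = {"User-Agent": get_random_UA()}
    headers = {"User-Agent": "My bot"}
    r = requests.get(link, headers=headers)
    parsed_uri = urlparse(link)
    domain = '{uri.scheme}://{uri.netloc}/'.format(uri=parsed_uri)
    OpenGraph.parser = parser  # `parser` is defined elsewhere (not shown in this post)
    OpenGraph.scrape = True  # workaround for some subtle bug in opengraph
    page = OpenGraph(html=r.content)
    image_url = None  # returned as-is when the page has no valid Open Graph data
    if page.is_valid():
        image_url = page.get('image', None)
        # Resolve relative image paths against the site root.
        if image_url and not image_url.startswith('http'):
            image_url = urljoin(domain, image_url)
    return image_url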
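For anyone who wants to try the same idea without patching the library, here is a minimal sketch that uses only the Python 3 standard library. It is not the opengraph module's API: the names `OGImageParser`, `build_request`, `og_image_from_html` and `fetch_og_image` are hypothetical helpers made up for illustration, and the sketch assumes the page exposes its image through a `<meta property="og:image">` tag.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class OGImageParser(HTMLParser):
    """Hypothetical helper: keeps the content of the first og:image meta tag."""

    def __init__(self):
        super().__init__()
        self.image = None

    def handle_starttag(self, tag, attrs):
        # Only look at <meta> tags, and stop after the first match.
        if tag != "meta" or self.image is not None:
            return
        attrs = dict(attrs)
        if attrs.get("property") == "og:image":
            self.image = attrs.get("content")


def build_request(url, user_agent="My bot"):
    """Build a request carrying a custom User-Agent; nothing is sent here."""
    return Request(url, headers={"User-Agent": user_agent})


def og_image_from_html(html, base_url):
    """Return the absolute og:image URL found in html, or None if absent."""
    parser = OGImageParser()
    parser.feed(html)
    if not parser.image:
        return None
    # urljoin leaves absolute URLs untouched and resolves relative ones.
    return urljoin(base_url, parser.image)


def fetch_og_image(url, user_agent="My bot"):
    """Fetch url with the custom User-Agent and return its og:image URL."""
    with urlopen(build_request(url, user_agent)) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return og_image_from_html(html, url)
```

Calling `fetch_og_image(link)` would play the role of the getter above. Whether a given site accepts a particular User-Agent string is up to that site, so the 403 may still come back with some values.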