0 votes
If I do any more HTML parsing, I will probably look into one of the suggested libraries. For now, I have solved it like this:
    # Locate the first <img ...> tag in the post body.
    startImgPos = post.find('<img')
    if startImgPos > -1:
        endImgPos = post.find('>', startImgPos)
        imageTag = post[startImgPos:endImgPos]
        # Pull out the value of the src="..." attribute, if there is one.
        startSrcPos = imageTag.find('src="')
        if startSrcPos > -1:
            startSrcPos += len('src="')
            endSrcPos = imageTag.find('"', startSrcPos)
            linkTag = imageTag[startSrcPos:endSrcPos]
            r['linktag'] = linkTag
I'll improve this later, but for now it does the trick. Feel free to suggest ideas or improvements to the code above.
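If pulling in a third-party library is not an option, the same lookup can be sketched with the standard library's html.parser instead of manual string slicing. This is only a rough sketch assuming Python 3; the names FirstImageSrc and first_image_src are made up for illustration and are not part of the code above.

```python
from html.parser import HTMLParser


class FirstImageSrc(HTMLParser):
    """Remember the first src attribute found on an <img> tag."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        # Self-closing tags such as <img ... /> also end up here by default.
        if tag == "img" and self.src is None:
            self.src = dict(attrs).get("src")


def first_image_src(html):
    """Return the first <img> src in html, or None if there is none."""
    parser = FirstImageSrc()
    parser.feed(html)
    return parser.src
```

With something like this, the assignment would become r['linktag'] = first_image_src(post), and it would also cope with single-quoted attributes and attributes in any order.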
Author: Daniel Ryan
9 votes
You can use BeautifulSoup to do this:
http://www.crummy.com/software/BeautifulSoup/
It's an XML/HTML parser: you pass in the raw HTML, and then you can search it for particular tags, attributes, and so on.
Something like this should work:
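    from bs4 import BeautifulSoup
    tree = BeautifulSoup(raw_html, 'html.parser')
    # find() returns the first matching tag (or None), so no [0] is needed
    img_link = tree.find('img')['src']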
Author: Timmy O'Mahony
3 votes
This is exactly what I was looking for. The code I actually ended up using is:
tree = BeautifulSoup(raw_html)
img_link = tree.find_all('img')[0].get('src')
Works great! Thanks, timmy-omahony.
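One thing to watch for: find_all('img')[0] raises an IndexError when the HTML contains no image at all. A slightly more defensive variant might look like the sketch below. It assumes the bs4 package (BeautifulSoup 4) is installed, and the helper name first_img_src is made up for illustration.

```python
from bs4 import BeautifulSoup


def first_img_src(raw_html):
    """Return the src of the first <img> in raw_html, or None."""
    tree = BeautifulSoup(raw_html, "html.parser")
    img = tree.find("img")  # None when the markup has no <img>
    # .get() returns None instead of raising if the tag has no src
    return img.get("src") if img is not None else None
```

Returning None for "no image" leaves it to the caller to decide whether to skip the post or fall back to a placeholder.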
Author: toledano
Source:stackexchange.com