2
There are posts in the returned json with media=null
so post['data']['media']
will not have oembed
field (and hence, url
field):
{
"kind" : "t3",
"data" : {
"downs" : 24050,
"link_flair_text" : null,
"media" : null,
"url" : "http://youtu.be/aNJgX3qH148?t=4m20s",
"link_flair_css_class" : null,
"id" : "rymif",
"edited" : false,
"num_reports" : null,
"created_utc" : 1333847562,
"banned_by" : null,
"name" : "t3_rymif",
"subreddit" : "videos",
"title" : "An awesome young man",
"author_flair_text" : null,
"is_self" : false,
"author" : "Lostinfrustration",
"media_embed" : {},
"permalink" : "/r/videos/comments/rymif/an_awesome_young_man/",
"author_flair_css_class" : null,
"selftext" : "",
"domain" : "youtu.be",
"num_comments" : 2260,
"likes" : null,
"clicked" : false,
"thumbnail" : "http://a.thumbs.redditmedia.com/xUDtCtRFDRAP5gQr.jpg",
"saved" : false,
"ups" : 32312,
"subreddit_id" : "t5_2qh1e",
"approved_by" : null,
"score" : 8262,
"selftext_html" : null,
"created" : 1333847562,
"hidden" : false,
"over_18" : false
}
},
It also seems to be that your exception message doesn’t really fit: there are many kinds of exceptions that can be thrown when urlopen
blows up, such as IOError
. It does not check on whether the returned format is valid JSON as your error message imply.
Now, to mitigate the problem, you need to check if "oembed" in post['data']['media']
, and only if it does can you call post['data']['media']['oembed']['url']
, notice that I am making the assumption that all oembed
blob has url
(mainly because you need an URL to embed a media on reddit).
**UPDATE:
Namely, something like this should fix your problem:
for post in reddit_posts:
if isinstance(post['data']['media'], dict) \
and "oembed" in post['data']['media'] \
and isinstance(post['data']['media']['oembed'], dict) \
and 'url' in post['data']['media']['oembed']:
print post["data"]["media"]["oembed"]["url"]
reddit_feed.append(post["data"]["media"]["oembed"]["url"])
print reddit_feed
The reason you have that error is because for some post
, post["data"]["media"]
is None
and so you are basically calling None["oembed"]
here. And hence the error: 'NoneType' object is not subscriptable
. I’ve also realized that post['data']['media']['oembed']
may not be a dict and hence you will also need to verify if it is a dict and if url
is in it.
Update 2:
It looks like data
won’t exist sometimes either, so the fix:
import json
import urllib
try:
f = urllib.urlopen("http://www.reddit.com/r/videos/top/.json")
except Exception:
print("ERROR: malformed JSON response from reddit.com")
reddit_posts = json.loads(f.read().decode("utf-8"))
if isinstance(reddit_posts, dict) and "data" in reddit_posts \
and isinstance(reddit_posts['data'], dict) \
and 'children' in reddit_posts['data']:
reddit_posts = reddit_posts["data"]["children"]
reddit_feed = []
for post in reddit_posts:
if isinstance(post['data']['media'], dict) \
and "oembed" in post['data']['media'] \
and isinstance(post['data']['media']['oembed'], dict) \
and 'url' in post['data']['media']['oembed']:
print post["data"]["media"]["oembed"]["url"]
reddit_feed.append(post["data"]["media"]["oembed"]["url"])
print reddit_feed