3👍
Over at Fashiolista we've open-sourced our approach to building feed systems.
https://github.com/tschellenbach/Feedly
It’s currently the largest open source library aimed at solving this problem. I think it also addresses your concern about development time vs premature optimization. 🙂
To start out I would use Redis as the data store. Later, when your site gets larger, it often makes sense to move to Cassandra.
The same team which built Feedly also offers a hosted API, which handles the complexity for you. Have a look at getstream.io. At the moment we have client APIs for Python, Ruby, Node and PHP. In addition, since it's based on a heavily optimized Cassandra setup, we can price it far below what a self-hosted solution based on Redis would cost you.
Also have a look at this High Scalability post, where we explain some of the design decisions involved:
http://highscalability.com/blog/2013/10/28/design-decisions-for-scaling-your-high-traffic-feeds.html
This tutorial will help you set up a system like Pinterest’s feed using Redis. It’s quite easy to get started with.
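To give a rough idea of what such a feed does, here is a minimal sketch of the fan-out-on-write pattern. This is not Feedly's API: the `InMemoryFeeds` class, its method names and the `max_length` cap are all made up for illustration, and a plain Python list stands in for what would be one Redis sorted set per user (scored by timestamp, roughly ZADD to write, ZREVRANGE to read, ZREMRANGEBYRANK to trim).

```python
import bisect
from collections import defaultdict


class InMemoryFeeds:
    """Hypothetical stand-in for per-user Redis sorted sets (score = timestamp)."""

    def __init__(self, max_length=100):
        self.max_length = max_length        # assumed cap on stored feed items
        self.followers = defaultdict(set)   # target_id -> ids of users following it
        self.feeds = defaultdict(list)      # user_id -> [(timestamp, activity_id)], oldest first

    def follow(self, follower_id, target_id):
        self.followers[target_id].add(follower_id)

    def add_activity(self, actor_id, activity_id, timestamp):
        # Fan-out on write: copy the activity into every follower's feed.
        for user_id in self.followers[actor_id]:
            feed = self.feeds[user_id]
            bisect.insort(feed, (timestamp, activity_id))
            # Trim the oldest entries so each feed stays bounded.
            excess = len(feed) - self.max_length
            if excess > 0:
                del feed[:excess]

    def get_feed(self, user_id, limit=10):
        # Reading is cheap: the feed is already materialized, newest first.
        newest_first = self.feeds[user_id][::-1]
        return [activity_id for _, activity_id in newest_first[:limit]]


feeds = InMemoryFeeds(max_length=3)
feeds.follow("alice", "bob")
feeds.follow("carol", "bob")
for ts, activity_id in [(1, "a1"), (3, "a3"), (2, "a2"), (4, "a4")]:
    feeds.add_activity("bob", activity_id, ts)

print(feeds.get_feed("alice"))
```

The trade-off is that writes cost one insert per follower while reads stay a single range lookup, which is why trimming each feed to a fixed length matters once users have many followers.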
To learn more about feed design I highly recommend reading some of the articles which we based Feedly on:
- Yahoo Research Paper
- Twitter 2013 Redis based, with fallback
- Cassandra at Instagram
- Etsy feed scaling
- Facebook history
- Django project, with good naming conventions. (But database only)
- http://activitystrea.ms/specs/atom/1.0/ (actor, verb, object, target)
- Quora post on best practices
- Quora scaling a social network feed
- Redis Ruby example
- FriendFeed approach
- Thoonk setup
- Twitter’s Approach
2👍
Unless I have a verifiable performance issue, I personally dislike premature optimization, as it has often become an endless spiral into insanity for me. You might find this to be the case here as well.
1👍
Premature optimization is the root of all evil.
But if I were going to optimize this, I might generate another stream in which the timestamps for the actions are set by the action_object timestamp… 🙂