1đź‘Ť
I think you’re starting to look at the tradeoffs in the right way. The answer to your question has to do with which kind of action in your application will be hurt worse by a performance penalty.
In other words, a) are people visiting pages more often, or b) is someone inspecting page counts more often? Almost certainly, the answer is “a”. In which case it makes more sense to optimize for storing visits efficiently vs. querying the visit count efficiently.
True, if you use a table with a “page” column that stores “views” as an integer, performance may be degraded due to transaction handling. Multiple inbound requests for the same page will all need access to that row in order to increment the count. See detailed info on that in the Django documentation here.
On the other hand, creating a “visit” record which by design would end up with a lot of rows could also work. You are correct, the count could become slow, but there are ways to handle that. You could cache it, but caches work optimally for storing data that is expensive to produce but changes infrequently. This case does not meet that criterion, since a page view count is likely to change extremely often.
It also depends how accurate your count needs to be. Assuming you are using PostgreSQL, there are some useful techniques for getting approximate counts that are more performant.
Also see this info on COUNT()
performance in PostgreSQL. Obtaining an accurate row count of an entire big table is almost always slow, but due to indexing it may not be too bad if you are using a “page” key on your visits table and counting with a WHERE
condition.
Ultimately the only way to figure it out is to try it, see where the painful performance penalties (alliteration…) really are and fix them. There are always inherent tradeoffs.