6👍
A simpler solution, inspired by @3066d0’s answer:
renderers.py
from rest_framework_csv.renderers import CSVStreamingRenderer

class ReportsRenderer(CSVStreamingRenderer):
    header = [ ... ]
    labels = { ... }

views.py
from django.core.paginator import Paginator
from django.http import StreamingHttpResponse
from rest_framework.mixins import ListModelMixin
from rest_framework.viewsets import GenericViewSet

class ReportCSVViewset(ListModelMixin, GenericViewSet):
    queryset = Report.objects.select_related('stuff')
    serializer_class = ReportCSVSerializer
    renderer_classes = [ReportsRenderer]
    PAGE_SIZE = 1000

    def list(self, request, *args, **kwargs):
        queryset = self.filter_queryset(self.get_queryset())
        response = StreamingHttpResponse(
            request.accepted_renderer.render(self._stream_serialized_data(queryset)),
            status=200,
            content_type="text/csv",
        )
        response["Content-Disposition"] = 'attachment; filename="reports.csv"'
        return response

    def _stream_serialized_data(self, queryset):
        # Serialize one page at a time so the whole queryset is never held in memory.
        serializer = self.get_serializer_class()
        paginator = Paginator(queryset, self.PAGE_SIZE)
        for page in paginator.page_range:
            yield from serializer(paginator.page(page).object_list, many=True).data
The point is that you need to pass a generator that yields serialized data as the data argument to the renderer; the CSVStreamingRenderer then does its thing and streams the response itself. I prefer this approach because you do not need to override the code of a third-party library.
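To see the idea in isolation, here is a minimal sketch that uses only the standard library. It is not the renderer's actual implementation: `iter_pages`, `serialized_rows` and `stream_csv` are hypothetical stand-ins for the Paginator, the serializer and the streaming renderer, and `Echo` is the pseudo-buffer pattern from the Django docs' CSV streaming example.

```python
import csv


class Echo:
    """Pseudo-buffer: write() returns the value instead of storing it."""

    def write(self, value):
        return value


def iter_pages(items, page_size):
    """Yield successive slices of `items` (stand-in for Paginator pages)."""
    for start in range(0, len(items), page_size):
        yield items[start:start + page_size]


def serialized_rows(items, page_size):
    """Yield one dict per item, one page at a time (stand-in for a serializer)."""
    for page in iter_pages(items, page_size):
        yield from ({"id": item, "square": item * item} for item in page)


def stream_csv(rows, header):
    """Yield CSV lines one by one without accumulating them."""
    writer = csv.writer(Echo())
    yield writer.writerow(header)
    for row in rows:
        yield writer.writerow([row[name] for name in header])


# Each line is produced only when the consumer asks for it.
lines = list(stream_csv(serialized_rows([1, 2, 3], page_size=2), ["id", "square"]))
```

In the view above, the generator returned by `_stream_serialized_data` plays the role of `serialized_rows`, and StreamingHttpResponse is the consumer that pulls lines one at a time.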
2👍
Django’s StreamingHttpResponse can be much slower than a traditional HttpResponse for small responses.
Don’t use it if you don’t need to; the Django docs actually recommend that “StreamingHttpResponse should only be used in situations where it is absolutely required that the whole content isn’t iterated before transferring the data to the client.”
For your problem, you may also find it useful to set the chunk_size, switch to FileResponse, or go back to a normal Response (if using the REST framework) or HttpResponse.
Edit 1: About setting the chunk size:
With Django’s File API you can read a file in chunks (File.chunks()), so the whole file is never loaded into memory at once.
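As a rough illustration of chunked reading, here is a standard-library sketch. `read_in_chunks` is a hypothetical helper, not part of Django, and its default chunk size is an arbitrary choice; Django’s File.chunks() follows the same idea for its own file objects.

```python
import io
from functools import partial


def read_in_chunks(fileobj, chunk_size=64 * 1024):
    """Yield successive chunks of at most `chunk_size` bytes from a binary file."""
    # iter(callable, sentinel) keeps calling read() until it returns b"".
    yield from iter(partial(fileobj.read, chunk_size), b"")


# Small in-memory file used purely for demonstration.
chunks = list(read_in_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4))
```

Only one chunk is held in memory at a time, which is what keeps large downloads cheap.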
I hope you find this useful.
0👍
So I ended up with a solution I was happy with, using the Paginator class with the queryset. First, I wrote a renderer that subclasses CSVStreamingRenderer, then used it in my CSVViewSet’s renderer_classes.
renderers.py
import csv
import itertools

from django.core.paginator import Paginator
from rest_framework_csv.renderers import CSVStreamingRenderer


class Echo:
    """
    Pseudo-buffer whose write() returns the value instead of storing it
    (the pattern used in the Django docs for streaming CSV)
    """
    def write(self, value):
        return value


# *****************************************************************************
# BatchedCSVRenderer
# *****************************************************************************
class BatchedCSVRenderer(CSVStreamingRenderer):
    """
    a CSV renderer that works with large querysets returning a generator
    function. Used with a streaming HTTP response, it provides response bytes
    instead of the client waiting for a long period of time
    """
    def render(self, data, renderer_context=None, *args, **kwargs):
        # render() itself is not a generator, so data without a queryset
        # really is returned unchanged
        if 'queryset' not in data:
            return data
        return self._render_batches(data, renderer_context or {})

    def _render_batches(self, data, renderer_context):
        csv_buffer = Echo()
        csv_writer = csv.writer(csv_buffer)

        queryset = data['queryset']
        serializer = data['serializer']
        paginator = Paginator(queryset, 50)

        # rendering the header or label field was taken from the tablize
        # method in django rest framework csv
        header = renderer_context.get('header', self.header)
        labels = renderer_context.get('labels', self.labels)

        if labels:
            yield csv_writer.writerow([labels.get(x, x) for x in header])
        else:
            yield csv_writer.writerow(header)

        for page in paginator.page_range:
            serialized = serializer(
                paginator.page(page).object_list, many=True
            ).data

            # we use the tablize function on the parent class to get a
            # generator that we can use to yield a row
            table = self.tablize(
                serialized,
                header=header,
                labels=labels,
            )

            # we want to remove the header from the tablized data so we use
            # islice to take from 1 to the end of generator
            for row in itertools.islice(table, 1, None):
                yield csv_writer.writerow(row)


# *****************************************************************************
# ReportsRenderer
# *****************************************************************************
class ReportsRenderer(BatchedCSVRenderer):
    """
    A renderer for returning CSV data for reports
    """
    header = [ ... ]
    labels = { ... }
views.py
from django.http import StreamingHttpResponse
from rest_framework import mixins, viewsets

# project-local imports; the module paths are assumed
from . import serializers
from .models import Report
from .renderers import ReportsRenderer


# *****************************************************************************
# CSVViewSet
# *****************************************************************************
class CSVViewSet(
    mixins.ListModelMixin,
    viewsets.GenericViewSet,
):
    def list(self, request, *args, **kwargs):
        queryset = self.get_queryset()
        # hand the queryset and serializer class to the renderer, which
        # paginates and serializes lazily while the response streams
        return StreamingHttpResponse(
            request.accepted_renderer.render({
                'queryset': queryset,
                'serializer': self.get_serializer_class(),
            })
        )


# *****************************************************************************
# ReportCSVViewset
# *****************************************************************************
class ReportCSVViewset(CSVViewSet):
    """
    Viewset for report CSV output
    """
    renderer_classes = [ReportsRenderer]
    serializer_class = serializers.ReportCSVSerializer

    def get_queryset(self):
        queryset = Report.objects.filter(...)
        return queryset
This might seem like a lot for a streaming response, but we used the BatchedCSVRenderer and CSVViewSet in a bunch of other places. If you’re running your server behind nginx, it might also be useful to adjust its settings (proxy buffering, for example) to allow streaming responses.
Hopefully this helps anyone with the same goal. Let me know if there’s any other information I can provide.
0👍
You need to provide the CSV headers (via the header parameter) when rendering the data:
renderer.render(data, renderer_context={'header': ['header1', 'header2', 'header3']})
If you don’t specify the header parameter, djangorestframework-csv will try to “guess” the CSV headers by itself. To do that, it has to load all of your data into memory, which causes the delay you are experiencing.
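The effect can be shown with a simplified model in plain Python. This is not the library’s code: `counting_source`, `rows_with_guessed_header` and `rows_with_known_header` are hypothetical helpers that only show why guessing the header forces the whole data set to be consumed before the first line can be sent.

```python
def counting_source(n, consumed):
    """Yield n rows, recording in `consumed` each row that gets pulled."""
    for i in range(n):
        consumed.append(i)
        yield {"header1": i, "header2": i * 2}


def rows_with_guessed_header(rows):
    """Derive the header from the keys of every row, then yield header and rows."""
    rows = list(rows)  # everything must be materialised before the first line
    header = sorted({key for row in rows for key in row})
    yield header
    for row in rows:
        yield [row.get(key) for key in header]


def rows_with_known_header(rows, header):
    """Yield the given header immediately, then rows as they arrive."""
    yield header
    for row in rows:
        yield [row.get(key) for key in header]


guessed_consumed, known_consumed = [], []
guessed = rows_with_guessed_header(counting_source(5, guessed_consumed))
known = rows_with_known_header(counting_source(5, known_consumed), ["header1", "header2"])

first_guessed = next(guessed)  # pulls all 5 rows before returning
first_known = next(known)      # pulls nothing from the source yet
```

After the two `next()` calls, `guessed_consumed` holds every row while `known_consumed` is still empty, which mirrors the delay before the first byte of the response.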