1👍
You can create a custom upload handler that streams the file directly to S3. That way you shouldn’t run into connection timeouts.
https://docs.djangoproject.com/en/1.10/ref/files/uploads/#writing-custom-upload-handlers
I did some tests and it works perfectly in my case.
You have to initiate a multipart upload (with boto, for example) and send the parts progressively.
Don’t forget to validate the part size: 5 MB is the minimum part size when the file has more than one part (an S3 limitation).
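For reference, the bare boto 2 multipart flow looks roughly like this; the bucket name, key, and data variables are placeholders:

import boto
from StringIO import StringIO

first_part = b'x' * 5242880      # any non-final part must be >= 5 MB
last_part = b'tail of the file'  # the final part may be smaller

conn = boto.connect_s3()
bucket = conn.get_bucket('my-bucket')                   # placeholder bucket
mp = bucket.initiate_multipart_upload('path/to/key')    # placeholder key
mp.upload_part_from_file(StringIO(first_part), part_num=1)
mp.upload_part_from_file(StringIO(last_part), part_num=2)
mp.complete_upload()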
I think this is the best alternative to django-queued-storage if you really want to upload directly to S3 and avoid connection timeouts.
You’ll probably also need to create your own FileField so the file is handled correctly and not sent to S3 a second time.
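Purely as an illustration of that idea, here is a hypothetical sketch of such a field; it assumes the handler shown below, which tags the returned file with original_filename and already stores it under its final S3 key:

from django.db import models

class S3DirectFileField(models.FileField):
    """Hypothetical field: skip re-uploading a file the upload handler
    has already streamed to S3."""

    def pre_save(self, model_instance, add):
        file = getattr(model_instance, self.attname)
        # The handler returns a file opened from default_storage, so its
        # name is already the final S3 key; mark it committed instead of
        # letting FileField.pre_save() upload it a second time.
        if file and not file._committed and hasattr(file.file, 'original_filename'):
            file.name = file.file.name
            file._committed = True
        return super(S3DirectFileField, self).pre_save(model_instance, add)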
The following example is with S3BotoStorage.
import uuid
from StringIO import StringIO  # Python 2 / boto 2; use io.BytesIO on Python 3

from django.core.files.storage import default_storage
from django.core.files.uploadhandler import FileUploadHandler
from storages.utils import setting  # settings helper shipped with django-storages

# S3 rejects any part, other than the last one, that is smaller than 5 MB.
S3_MINIMUM_PART_SIZE = 5242880


class S3FileUploadHandler(FileUploadHandler):
    chunk_size = setting('S3_FILE_UPLOAD_HANDLER_BUFFER_SIZE', S3_MINIMUM_PART_SIZE)

    def __init__(self, request=None):
        super(S3FileUploadHandler, self).__init__(request)
        self.file = None
        self.part_num = 1
        self.last_chunk = None
        self.multipart_upload = None

    def new_file(self, field_name, file_name, content_type, content_length, charset=None, content_type_extra=None):
        super(S3FileUploadHandler, self).new_file(field_name, file_name, content_type, content_length, charset, content_type_extra)
        self.original_filename = file_name
        # Prefix the key with a UUID so concurrent uploads of the same file name don't collide.
        self.file_name = "{}_{}".format(uuid.uuid4(), file_name)
        default_storage.bucket.new_key(self.file_name)
        self.multipart_upload = default_storage.bucket.initiate_multipart_upload(self.file_name)

    def receive_data_chunk(self, raw_data, start):
        # Hold one chunk back so a short final chunk can be merged into the
        # previous part, keeping every non-final part >= 5 MB.
        if self.last_chunk:
            file_part = self.last_chunk
            if len(raw_data) < S3_MINIMUM_PART_SIZE:
                file_part += raw_data
                self.last_chunk = None
            else:
                self.last_chunk = raw_data
            self.upload_part(part=file_part)
        else:
            self.last_chunk = raw_data
        # Returning None (implicitly) keeps the chunk from reaching other handlers.

    def upload_part(self, part):
        self.multipart_upload.upload_part_from_file(
            fp=StringIO(part),
            part_num=self.part_num,
            size=len(part)
        )
        self.part_num += 1

    def file_complete(self, file_size):
        if self.last_chunk:
            self.upload_part(part=self.last_chunk)
        self.multipart_upload.complete_upload()
        self.file = default_storage.open(self.file_name)
        self.file.original_filename = self.original_filename
        return self.file
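To actually use the handler you still have to register it, either globally or per view; the module path below is a placeholder. Per Django’s docs, request.upload_handlers must be replaced before anything reads request.POST, which is why the per-view variant wraps the real view with csrf_exempt/csrf_protect:

# settings.py: replace the default handlers globally (path is a placeholder)
FILE_UPLOAD_HANDLERS = ['myapp.upload.S3FileUploadHandler']

# Or, per view:
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt, csrf_protect

@csrf_exempt
def upload_view(request):
    # Swap the handlers before anything reads request.POST or request.FILES.
    request.upload_handlers = [S3FileUploadHandler(request)]
    return _upload_view(request)

@csrf_protect
def _upload_view(request):
    # The actual form handling goes here.
    return HttpResponse('ok')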
3👍
I have faced the same issue and fixed it by using django-queued-storage on top of django-storages. When a file is received, django-queued-storage creates a Celery task to upload it to the remote storage such as S3, and in the meantime, if anyone accesses the file before it is available on S3, it is served from the local file system. This way you don’t have to wait for the file to be uploaded to S3 before sending a response back to the client.
Since your application is behind a load balancer, you might want to use a shared file system such as Amazon EFS for the local copies in the above approach.
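If it helps, the wiring from django-queued-storage’s README looks roughly like this; the model and field names are placeholders, and the remote backend path is django-storages’ boto backend:

from django.db import models
from queued_storage.backends import QueuedStorage

# Saves locally first; a Celery task then pushes the file to S3.
queued_s3storage = QueuedStorage(
    'django.core.files.storage.FileSystemStorage',
    'storages.backends.s3boto.S3BotoStorage')

class MyModel(models.Model):
    attachment = models.FileField(storage=queued_s3storage)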
1👍
You can try to skip uploading the file to your server and upload it to S3 directly, then only get back a URL for your application.
There is an app for that: django-s3direct. You can give it a try.
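From its README, the wiring is roughly as follows; the destination name and key prefix are placeholders, and the exact S3DIRECT_DESTINATIONS format varies between versions, so check the README for your release:

# settings.py: add 's3direct' to INSTALLED_APPS and define a destination.
# (AWS credential/bucket settings omitted; names below are placeholders.)
S3DIRECT_DESTINATIONS = {
    'primary': {'key': 'uploads/'},
}

# models.py
from django.db import models
from s3direct.fields import S3DirectField

class Upload(models.Model):
    attachment = S3DirectField(dest='primary')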