17 votes
This works for me. It copies all content from a local directory to a specific bucket-name/full-path (recursively) in Google Cloud Storage:
import glob
import os
from google.cloud import storage

def upload_local_directory_to_gcs(local_path, bucket, gcs_path):
    assert os.path.isdir(local_path)
    for local_file in glob.glob(local_path + '/**'):
        if not os.path.isfile(local_file):
            upload_local_directory_to_gcs(local_file, bucket, gcs_path + "/" + os.path.basename(local_file))
        else:
            remote_path = os.path.join(gcs_path, local_file[1 + len(local_path):])
            blob = bucket.blob(remote_path)
            blob.upload_from_filename(local_file)
upload_local_directory_to_gcs(local_path, bucket, BUCKET_FOLDER_DIR)
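The call above assumes that bucket is an already-constructed bucket object and that BUCKET_FOLDER_DIR holds the destination prefix. A minimal setup sketch; the bucket name, prefix and local path below are placeholders, not values from the answer:

from google.cloud import storage

client = storage.Client()                 # uses application default credentials
bucket = client.bucket("my-bucket")       # placeholder bucket name
BUCKET_FOLDER_DIR = "path/in/bucket"      # placeholder destination prefix
local_path = "./local_dir"                # placeholder source directory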
10 votes
A version without a recursive function, and it works with "top level files" (unlike the top answer):
import glob
import os
from google.cloud import storage

GCS_CLIENT = storage.Client()

def upload_from_directory(directory_path: str, dest_bucket_name: str, dest_blob_name: str):
    rel_paths = glob.glob(directory_path + '/**', recursive=True)
    bucket = GCS_CLIENT.get_bucket(dest_bucket_name)
    for local_file in rel_paths:
        remote_path = f'{dest_blob_name}/{"/".join(local_file.split(os.sep)[1:])}'
        if os.path.isfile(local_file):
            blob = bucket.blob(remote_path)
            blob.upload_from_filename(local_file)
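A quick usage sketch (the bucket name and paths are placeholders):

upload_from_directory('local_dir', 'my-bucket', 'dest/prefix')
# local_dir/sub/file.txt ends up at gs://my-bucket/dest/prefix/sub/file.txt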
4 votes
A folder is a cataloging structure containing references to files and directories. The library will not accept a folder as an argument.
As far as I understand, your use case is to upload to GCS while preserving a local folder structure. To accomplish that you can use the os Python module and write a recursive function (e.g. process_folder) that takes a path as an argument. The function can use this logic (a sketch follows after the path notes below):
- Use the os.listdir() method to get a list of objects within the source path (it returns both files and folders).
- Iterate over the list from step 1 to separate files from folders via the os.path.isdir() method.
- Iterate over files and upload them with an adjusted path (e.g. path + '/' + file_name).
- Iterate over folders, making a recursive call (e.g. process_folder(path + folder_name)).
It'll be necessary to work with two paths:
- Real system path (e.g. "/Users/User/…/upload_folder/folder_name") used with the os module.
- Virtual path for GCS file uploads (e.g. 'upload' + '/' + folder_name + '/' + file_name).
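A minimal sketch of that logic, assuming the google-cloud-storage client library; the bucket name, the process_folder name and the example paths are placeholders rather than values from the original answer:

import os
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('my-bucket')   # placeholder bucket name

def process_folder(local_path, gcs_prefix):
    # Step 1: list both files and folders in the current directory
    for entry in os.listdir(local_path):
        full_path = os.path.join(local_path, entry)   # real system path, used with the os module
        remote_path = gcs_prefix + '/' + entry        # virtual path, used as the GCS object name
        if os.path.isdir(full_path):
            # Steps 2 and 4: recurse into sub-folders
            process_folder(full_path, remote_path)
        else:
            # Step 3: upload files under the adjusted path
            bucket.blob(remote_path).upload_from_filename(full_path)

process_folder('/path/to/upload_folder', 'upload')    # placeholder source path and prefix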
Don't forget to implement exponential backoff, referenced at [1], to deal with 500 errors; a generic retry sketch is shown after the references. You can use a Drive SDK example at [2] as a reference.
[1] https://developers.google.com/storage/docs/json_api/v1/how-tos/upload#exp-backoff
[2] https://developers.google.com/drive/web/handle-errors
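A generic sketch of that retry pattern; the helper name upload_with_backoff is illustrative, and a production version should catch only transient server-side errors rather than every exception:

import random
import time

def upload_with_backoff(blob, local_file, max_retries=5):
    # Retry with exponentially growing waits plus jitter, as recommended for 500-level errors.
    for attempt in range(max_retries):
        try:
            blob.upload_from_filename(local_file)
            return
        except Exception:        # in real code, narrow this to transient server errors
            if attempt == max_retries - 1:
                raise
            time.sleep((2 ** attempt) + random.random())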
1 vote
I assume that filename = "D:\foldername" alone is not enough information about the source code. I'm also not sure this is even possible; via the web interface you can only upload files or create folders into which you then upload the files.
You could save the folder's name, then create it (I've never used Google App Engine, but I guess that should be possible) and then upload the contents to the new folder.
0 votes
Refer to:
https://hackersandslackers.com/manage-files-in-google-cloud-storage-with-python/
from os import listdir
from os.path import isfile, join
...

def upload_files(bucketName):
    """Upload files to GCP bucket."""
    files = [f for f in listdir(localFolder) if isfile(join(localFolder, f))]
    for file in files:
        localFile = localFolder + file
        blob = bucket.blob(bucketFolder + file)
        blob.upload_from_filename(localFile)
    return f'Uploaded {files} to "{bucketName}" bucket.'
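The elided part of that snippet defines a few module-level names. A hedged reconstruction, assuming the same setup as the linked post; all values below are placeholders:

from google.cloud import storage

bucketName = 'my-bucket'            # placeholder bucket name
bucketFolder = 'dest/prefix/'       # destination prefix; note the trailing slash
localFolder = '/path/to/local/'     # source directory; note the trailing slash

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucketName)

The trailing slashes matter because the function builds both local and remote paths with plain string concatenation.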
0 votes
This solution also works on Windows systems. Simply provide the folder name to upload and the destination bucket name. Additionally, it handles any level of subdirectories within the folder.
import os
from google.cloud import storage

storage_client = storage.Client()

def upload_files(bucketName, folderName):
    """Upload files to GCP bucket."""
    bucket = storage_client.get_bucket(bucketName)
    for path, subdirs, files in os.walk(folderName):
        for name in files:
            path_local = os.path.join(path, name)
            blob_path = path_local.replace('\\', '/')
            blob = bucket.blob(blob_path)
            blob.upload_from_filename(path_local)
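A usage sketch (the bucket and folder names are placeholders); note that the blob path keeps the folder name itself as the top-level prefix in the bucket:

upload_files('my-bucket', 'data_folder')
# data_folder/sub/file.txt is uploaded as gs://my-bucket/data_folder/sub/file.txt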
0 votes
I just came across the gcsfs library, which also provides a more convenient interface. You can copy an entire directory into a GCS location like this:
import gcsfs

def upload_to_gcs(src_dir: str, gcs_dst: str):
    fs = gcsfs.GCSFileSystem()
    fs.put(src_dir, gcs_dst, recursive=True)
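A usage sketch (names are placeholders); the destination is given as bucket-name/prefix:

upload_to_gcs('./local_dir', 'my-bucket/some/prefix')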
0 votes
Here is my recursive implementation (it uploads via the Google Drive API rather than Cloud Storage). We need to create a file named gdrive_utils.py containing the following:
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.http import MediaFileUpload, MediaIoBaseDownload
import pickle
import glob
import os

# The following scopes are required for access to Google Drive.
# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly',
          'https://www.googleapis.com/auth/drive.metadata',
          'https://www.googleapis.com/auth/drive',
          'https://www.googleapis.com/auth/drive.file',
          'https://www.googleapis.com/auth/drive.appdata']

def get_gdrive_service():
    """
    Tries to authenticate using a token. If the token is expired or not present, creates one.
    :return: Returns an authenticated service object
    :rtype: object
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'keys/client-secret.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    # return the Google Drive API service
    return build('drive', 'v3', credentials=creds)

def createRemoteFolder(drive_service, folderName, parent_id):
    # Create a folder on Drive; returns the newly created folder's ID
    body = {
        'name': folderName,
        'mimeType': "application/vnd.google-apps.folder",
        'parents': [parent_id]
    }
    root_folder = drive_service.files().create(body=body, supportsAllDrives=True, fields='id').execute()
    return root_folder['id']

def upload_file(drive_service, file_location, parent_id):
    # Upload a single file to Drive; returns the newly created file's ID
    body = {
        'name': os.path.split(file_location)[1],
        'parents': [parent_id]
    }
    media = MediaFileUpload(file_location, resumable=True)
    file_details = drive_service.files().create(body=body,
                                                media_body=media,
                                                supportsAllDrives=True,
                                                fields='id').execute()
    return file_details['id']

def upload_file_recursively(g_drive_service, root, folder_id):
    files_list = glob.glob(root)
    if files_list:
        for file_contents in files_list:
            if os.path.isdir(file_contents):
                # create a new folder and recurse into it
                new_folder_id = createRemoteFolder(g_drive_service, os.path.split(file_contents)[1],
                                                   folder_id)
                upload_file_recursively(g_drive_service, os.path.join(file_contents, '*'), new_folder_id)
            else:
                # upload to the given folder id
                upload_file(g_drive_service, file_contents, folder_id)
After that, use the following:
import os
from gdrive_utils import createRemoteFolder, upload_file_recursively, get_gdrive_service
g_drive_service = get_gdrive_service()
FOLDER_ID_FOR_UPLOAD = "<replace with folder id where you want upload>"
main_folder_id = createRemoteFolder(g_drive_service, '<name_of_main_folder>', FOLDER_ID_FOR_UPLOAD)
And finally, use this:
upload_file_recursively(g_drive_service, os.path.join("<your_path_>", '*'), main_folder_id)
0 votes
Another option is to use gsutil, the command-line tool for interacting with Google Cloud:
gsutil cp -r ./my/local/directory gs://my_gcp_bucket/foo/bar
The -r flag tells gsutil to copy recursively. See the gsutil documentation for details.
Invoking gsutil from Python can be done like this:
import subprocess

subprocess.check_call(['gsutil', 'cp', '-r', './my/local/directory', 'gs://my_gcp_bucket/foo/bar'])