1👍
anonymous is right that dump files are the best way to move data from one database to another.
If you don't have access to the source database to create a dump file, it gets harder; a Python-level way to make it work is to use bulk_create in batches.
For example:
inserts = []
batch_size = 10000
for entry in entries:  # or your datasource
    # transform raw data into an unsaved Django model instance
    inserts.append(EntryObject(attribute='attributes...'))
    if len(inserts) >= batch_size:
        EntryObject.objects.bulk_create(inserts)  # insert one full batch
        inserts = []                              # reset the batch
if inserts:
    EntryObject.objects.bulk_create(inserts)      # insert the final partial batch
Then again, it depends on your datasource. Also, if the import needs to be triggered from a Django view, you might want to look into running it as an asynchronous task so the request doesn't block, as sketched below.
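A minimal sketch of that idea, assuming Celery is already set up for the project; the task, the start_import view, and the fetch_entries() generator are illustrative names, not part of the original code:
from celery import shared_task
from django.http import JsonResponse

@shared_task
def import_entries():
    # same batched bulk_create pattern as above, run outside the request cycle
    inserts = []
    for entry in fetch_entries():  # hypothetical generator over your datasource
        inserts.append(EntryObject(attribute=entry))
        if len(inserts) >= 10000:
            EntryObject.objects.bulk_create(inserts)
            inserts = []
    if inserts:
        EntryObject.objects.bulk_create(inserts)

def start_import(request):
    import_entries.delay()  # queue the import; the view returns immediately
    return JsonResponse({"status": "import queued"})
The view only queues the work, so the long-running import happens in a worker process instead of tying up the request.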
0👍
That is because your system would have run out of memory (RAM). The ideal way is to create an SQL dump file (assuming you are getting the 20 billion records from another database) and then load that file into the new database; all the data in the dump file will be loaded into PostgreSQL.
To create the dump file, do:
$ pg_dump -U {user-name} {source_db} -f {dumpfilename.sql}
To restore from the dump file, do:
$ psql -U {user-name} -d {destination_db} -f {dumpfilename.sql}
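If both databases are reachable from the same machine, you can also avoid writing a huge intermediate file by piping the dump straight into psql (placeholders as in the commands above):
$ pg_dump -U {user-name} {source_db} | psql -U {user-name} -d {destination_db}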
See How To Backup and Restore PostgreSQL Database Using pg_dump and psql for detailed information.