[Django]-Insert pandas data frame into Postgres

9👍

If the dataframe's column names match the column names in the database table, you can insert the dataframe directly into the table with DataFrame.to_sql(), using SQLAlchemy for the connection:

from myapp.models import Bob
from sqlalchemy import create_engine
from django.conf import settings

# Build a SQLAlchemy connection URL from Django's database settings
db_connection_url = "postgresql://{}:{}@{}:{}/{}".format(
    settings.DATABASES['default']['USER'],
    settings.DATABASES['default']['PASSWORD'],
    settings.DATABASES['default']['HOST'],
    settings.DATABASES['default']['PORT'],
    settings.DATABASES['default']['NAME'],
)

engine = create_engine(db_connection_url)

# Append the dataframe rows to the model's table in chunks
df.to_sql(Bob._meta.db_table, engine, if_exists='append', index=False, chunksize=10000)

A missing column will be left empty (or the database will set a default value if one is defined at the database level, not the Django level). Alternatively, you can add the missing column to the dataframe with the required value before inserting.
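For example, a minimal sketch of that second option, assuming a hypothetical column named extra_field that the table expects but the dataframe lacks:

# 'extra_field' is a placeholder name for a column the table has but df lacks
df['extra_field'] = 0  # fill every row with the required value
df.to_sql(Bob._meta.db_table, engine, if_exists='append', index=False, chunksize=10000)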

👤Oleg Russkin

3👍

Just do an explicit INSERT …

If your table has columns in the order A, B, C, D, E

but your pandas dataframe has them in the order D, C, B, A (note: no column E),

just generate an SQL INSERT like this (again, no column E):

   INSERT INTO <TABLE> (D, C, B, A) VALUES (row_iterator.D, row_iterator.C, ...)
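A minimal runnable sketch of that loop, assuming psycopg2 as the driver (the answer does not name one), a table called bob, and a dataframe df with columns A–D; the connection string is a placeholder:

import psycopg2

# placeholder connection string -- substitute your own credentials
conn = psycopg2.connect("dbname=mydb user=myuser password=mypassword host=localhost")

# column E is omitted; the table's default value (see below) will apply
sql = "INSERT INTO bob (D, C, B, A) VALUES (%s, %s, %s, %s)"

with conn, conn.cursor() as cur:
    for row in df.itertuples(index=False):
        cur.execute(sql, (row.D, row.C, row.B, row.A))
conn.close()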

For column E, the best and simplest solution is to have a default value in the DB definition…

i.e.

CREATE TABLE Bob (
    A int NOT NULL,
    B int NOT NULL,
    C int NOT NULL,
    D int NOT NULL,
    E int DEFAULT 42
);

Hope that helps

👤Tim Seed
