[Django]-How to query for distinct groups of entities which are joined via a self-referential many-to-many table?

0๐Ÿ‘

โœ…

If you are looking for connected subgraphs, here is an approach.

You can characterize a connected subgraph by the minimum id of any node in it.

Start with a table to store the subgraph ids, something like:

create table subgraphids (
     personid int,
     subgraphid int
);

Initialize it:

insert into subgraphids(personid, subgraphid)
    select personid, min(subgraphid)
    from (select from_person_id as personid,
                 least(from_person_id, to_person_id) as subgraphid
          from cliques
          union all
          select to_person_id, least(from_person_id, to_person_id)
          from cliques
         ) t
    group by personid;

Now you have tentative subgraphids. To update them, you use a sort-of-similar query:

update subgraphid
    set subgraphid = (select min(s.subgraphid)
                      from cliques c join
                           subgraphid s
                           on c.from_person_id = s.personid or
                              c.to_person_id = s.personid
                      where subgraphid.personid = clique.from_person_id or
                            subgraphid.personid = click.to_person_id
                     );

Repeat this until no rows are updated. You can check for that condition explicitly:

select count(*)
from subgraphid
where subgraphid > (select min(s.subgraphid)
                    from cliques c join
                         subgraphid s
                         on c.from_person_id = s.personid or
                            c.to_person_id = s.personid
                    where subgraphid.personid = clique.from_person_id or
                          subgraphid.personid = click.to_person_id
                   );

This will find connected subgraphs in your original graph. The iteration needs to take place outside of SQL, in the calling code.

๐Ÿ‘คGordon Linoff

Leave a comment