avoid/reduce DISTINCT calls #101

@jerch

In bulk_updater we apply DISTINCT to every queryset that is about to be updated. This is needed to avoid updating the same records over and over when they are reached through 1:n relations.

Issues with this:

  • DISTINCT is applied unconditionally, although it is only needed for back relations
  • DISTINCT is a known performance smell in queries
  • DISTINCT does not work with already sliced/limited querysets on MySQL ("better updatedata command" #99 already applies a pk extraction hack for that)
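
To illustrate the first point, here is a minimal sketch using plain `sqlite3` as a stand-in for the ORM. The `parent`/`child` tables and the `flag` column are hypothetical and not taken from the library. It shows that a join across a back relation (1:n) repeats the parent row once per matching child, while a join across a forward relation does not produce duplicates:

```python
import sqlite3

# Hypothetical schema: one parent has many children (1:n).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES parent(id),
        flag INTEGER
    );
    INSERT INTO parent VALUES (1, 'a'), (2, 'b');
    INSERT INTO child VALUES (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 2, 1);
""")

# Back relation: select parents by their children.
# Parent 1 has three matching children, so it shows up three times.
back_plain = [r[0] for r in con.execute(
    "SELECT p.id FROM parent p JOIN child c ON c.parent_id = p.id "
    "WHERE c.flag = 1 ORDER BY p.id")]

# Same query with DISTINCT: every parent shows up once.
back_distinct = [r[0] for r in con.execute(
    "SELECT DISTINCT p.id FROM parent p JOIN child c ON c.parent_id = p.id "
    "WHERE c.flag = 1 ORDER BY p.id")]

# Forward relation: select children by their parent.
# Each child joins at most one parent, so no duplicates without DISTINCT.
forward_plain = [r[0] for r in con.execute(
    "SELECT c.id FROM child c JOIN parent p ON p.id = c.parent_id "
    "WHERE p.name = 'a' ORDER BY c.id")]

print(back_plain)     # [1, 1, 1, 2]
print(back_distinct)  # [1, 2]
print(forward_plain)  # [1, 2, 3]
```

Only the back relation query needs deduplication here, which is the case the first bullet refers to.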

Ideas for better handling:

  • Applying the DISTINCT reduction only to back relations requires changes to the graph output and to _querysets_for_update, with relation type backtracking. m2m-through handling is affected by this as well.
  • Depending on the cardinality/selectivity of a back relation set, an explicit pk extraction in Python might perform better than query trickery. The problem: we don't know the selectivity upfront, so we have to query and evaluate the pk set, which is costly.
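
The pk extraction idea from the second point could look roughly like the sketch below. It again uses plain `sqlite3` with a hypothetical `parent`/`child` schema; the helpers `unique_pks` and `fetch_parents_by_pks` are made up for illustration and are not part of bulk_updater. In Django terms this would roughly correspond to evaluating `values_list('pk', flat=True)` and filtering with `pk__in`.

```python
import sqlite3

def unique_pks(con, sql, params=()):
    """Run a query whose first column is a pk and deduplicate in Python.

    dict.fromkeys keeps the first-seen order while dropping repeats.
    Note that the full (possibly duplicated) pk list is transferred first.
    """
    return list(dict.fromkeys(row[0] for row in con.execute(sql, params)))

def fetch_parents_by_pks(con, pks, chunk_size=500):
    """Fetch parent rows for the given pks, chunked to limit IN-list size."""
    rows = []
    for start in range(0, len(pks), chunk_size):
        chunk = pks[start:start + chunk_size]
        marks = ",".join("?" * len(chunk))
        rows.extend(con.execute(
            f"SELECT id, name FROM parent WHERE id IN ({marks}) ORDER BY id",
            chunk))
    return rows

# Hypothetical schema: one parent has many children (1:n).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, flag INTEGER);
    INSERT INTO parent VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO child VALUES (1, 1, 1), (2, 1, 1), (3, 2, 1), (4, 3, 0);
""")

# Step 1: back relation join without DISTINCT, deduplicated in Python.
pks = unique_pks(
    con,
    "SELECT p.id FROM parent p JOIN child c ON c.parent_id = p.id "
    "WHERE c.flag = ? ORDER BY p.id",
    (1,))

# Step 2: a plain pk lookup, no join and no DISTINCT needed.
parents = fetch_parents_by_pks(con, pks)

print(pks)      # [1, 2]
print(parents)  # [(1, 'a'), (2, 'b')]
```

The trade-off named above is visible in step 1: the pk set has to be evaluated and transferred before anything else can happen, so this only pays off when the back relation set is small or highly selective.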

Labels: enhancement (New feature or request)