Django Tips for Performance – right way to use annotation with filter.

In Django on January 10th, 2017

Whenever using annotation and filter together, use filter first then annotation for better performance.

 

class Item(models.Model):
    name = models.CharField(max_length=200)
    price = models.IntegerField(Item)
    description = models.TextField(blank=True)

class Sales(models.Model):
    item = models.ForeignKey(Item)
    qty = models.IntegerField(Item)  
    price = models.IntegerField(Item)  # sold price
    sale_dt = models.DateTimeField(auto_now_add=True)

Query #1 (Good)

Item.objects.filter(
    sales__sale_dt__gte=last24hours
).annotate(
    qty=Count('sale__qty')
)

Query #2 (Bad)

Item.objects..annotate(
    qty=Count('sale__qty')
).filter(
    sales__sale_dt__gte=last24hours
)

 

Here’s why, let’s see Raw SQL they produce.

Query #1 will product sql like:

SELECT "item"."id", "item"."name", "item"."price", "item"."description", COUNT("sales"."item_qty") AS "qty" 
    FROM
        "item" 
    INNER JOIN
        "sales" ON ( "item"."id" = "sales"."item_id" ) 
    WHERE 
        "sales"."sale_dt" >= '2017-01-09 05:22:14.449276+00:00'
    GROUP BY 
        "item"."id", "item"."name", "item"."price", "item"."description"
    ORDER BY 
        "qty"

Query #2 will product sql like:

SELECT "item"."id", "item"."name", "item"."price", "item"."description", COUNT("sales"."item_qty") AS "qty" 
    FROM
        "item"
    LEFT OUTER JOIN 
        "sale" ON ( "main_pair"."id" = "sale"."pair_id" ) 
    INNER JOIN
        "sales" T4 ON ( "item"."id" = T4."item_id" ) 
    WHERE 
        "sales"."sale_dt" >= '2017-01-09 05:22:14.449276+00:00'
    GROUP BY 
        "item"."id", "item"."name", "item"."price", "item"."description"
    ORDER BY 
        "qty"

Query two has additional join which means it require more resources and time.

Hence always use annotation after filter, otherwise there will be performance issue.