Skip to content Skip to sidebar Skip to footer

Optimizing Mysql Query (likes/dislikes)

My website contains pieces of content on which users can vote (like/dislike similar to reddit upvotes). When selecting an individual piece of content, I run the following subquerie

Solution 1:

In terms of performance, those correlated subqueries can eat your lunch. And devour your lunchbox too, for large sets, because of the way MySQL processes them. Each of those subqueries gets executed for every row returned in the outer query. And that can get very expensive for large sets.

An alternative approach is to use an inline view to materialize the likes and dislikes for all content, and then do a join operation to that.

But, this approach can be expensive too, particularly when you are only needing the vote "counts" for just a few content rows, out of a bazillion rows. Often, there is a predicate from the outer query that can also be incorporated into the inline view, to limit the number of rows that need to be examined and returned.

We want to use an OUTER join to that inline view, so it returns a result equivalent to your query; returning a row from content when there are no matching rows in the vote table.

SELECT [... BUNCH OF FIELDS ...]
     , COALESCE(v.likes,0) AS likes
     , COALESCE(v.dislikes,0) AS dislikes
     , COALESCE(v.myvote,'.Constants::NO_VOTE.') AS myvote
  FROM content c
  LEFTJOIN ( SELECT vt.cId
              , SUM(vt.vote ='.Constants::LIKE.') AS likes
              , SUM(vt.vote ='.Constants::DISLIKE.') AS dislikes
              , MAX(IF(vt.userId ='.USER_ID.',vt.vote,NULL)) AS myvote
           FROM votes vt
          GROUPBY vt.cId
       ) v
    ON v.cId = c.contentId

       [... OTHER STUFF ... ]

Note that the inline view query (aliased as v) is going to look at EVERY single row from the votes table. If you only need a subset, then consider adding an appropriate predicate (either in a WHERE clause or as a JOIN to another table). There's no indication from the [... OTHER STUFF ...] in your query whether it's returning just a few rows from content or if you are needing all of the rows because you are ordering by likes, etc.

For a small number of rows selected from the content table, using the correlated subqueries (like in your query) can actually be faster than materializing a huge inline view and performing a join operation against it.

Oh... and for both queries, it goes without saying that an appropriate index on the votes table with a leading column of cId will benefit performance. For the inline view, you don't want the overhead of MySQL having to perform a filesort operation on all of those rows to do the GROUP BY. And for the correlated subqueries, you want them to use a index range scan, not a full scan.

Solution 2:

Your problem is simple your current sub queries are running for every single row returned.

You need to join to that data instead. You will need to change the code I've added to give you the correct counts but this should point you in the right direction.

SELECT
    BLAH
    Likes,
    Dislikes
FROM CONTENT as C
INNER JOIN (
    SELECT 
        cID,
        COUNT(votes) as Likes, --you will need to alter this 
        COUNT(votes) as Dislikes --to count your up and downvotes
    FROM Votes
    GROUPBY cID
    ) AS V
    ON  V.cID = C.ContentID

Post a Comment for "Optimizing Mysql Query (likes/dislikes)"