Magento Multiple Group-by’s on a collection

This might never be useful for anyone but me, but this is an obscure little function I needed to write recently.

If you need to return a collection that groups by different columns depending on condition, this function will help you out.
Please excuse me if this is a little hard to follow. I’ll do my best to explain the requirement. Here’s a simplified example:

Think of a table that records access to a particular page called “page_views”. We’ll call each record a “view”.
Each view has these fields:

    page_id
    customer_id
    ip_address
    datetime

Say we want to get all unique views from this table for the page with page_id “10”. What constitutes a “unique” view?

Well, let’s say that we group the table by customer_id, and there you have it. But wait, what about if the customer isn’t logged in, and the user_id field is blank? Grouping by customer_id will lump all of those ‘Guest’ views together. So let’s group by ip_address instead. But wait, then we come across multiple views from the same customer_id because the customer has viewed the same content from different connections.

So we want to group by customer_id if the customer_id is set, and ip_address if the customer is browsing as a guest. using “GROUP BY customer_id, ip_address” Doesn’t work either, as it groups by the combination of the two.
What we need to use is a UNION between two selects. This is straightforward if you’re writing direct SQL:

    SELECT t.* FROM page_views WHERE page_id = '10' AND customer_id IS NOT NULL GROUP BY customer_id
    UNION SELECT t.* FROM page_views WHERE page_id = '10' AND customer_id IS NULL GROUP BY ip_address;

How would we do this with Magento?

Magento implements the Zend_Db_Select class to handle the SELECT statements for its models and collections. This allows us to do some pretty cool things with collections. To access the Zend_Db_Select class instance for a collection, you just have to call $collection->getSelect().

Assuming we’re working with a module called “pages”, and our view model is called “page_view”, the following function from within the collection model will generate the above select statement.

<?php
public function getUniques()
    {        
        $select_a = Mage::getModel('pages/page_view')->getCollection()
            ->getSelect()
            ->where("customer_id IS NOT NULL")
            ->group(array("customer_id"));
        $select_b = Mage::getModel('pages/page_view')->getCollection()
            ->getSelect()
            ->where("customer_id IS NULL")
            ->group(array("ip_address"));
        
        $this->getSelect()->reset()->union(array(
            $select_a,
            $select_b
        ));
        
        return $this;
    }
?>

Then you just call $collection->getUniques() to modify your collection.

What’s happening here? If it’s not immediately obvious, what we’re doing here:

$this->getSelect()->reset()->union(array(
         $select_a,
         $select_b
     ));
?>

is making sure we’re modifying the Zend_Db_Select instance of the collection in question. That way, the defined collection returns data based on the modifications we’ve made. Calling “->reset()” does exactly that. It basically nullifies the select statement so that we can start from scratch. Try the same thing without the reset command and call $collection->getSelect()->__toString() on the returned data. It won’t look pretty.

We then use Zend_Db_Select’s “union” function, which takes an array of Zend_Db_Select instances as a parameter.

As usual when working with the select class of a collection, don’t go saving the models returned. This is only recommended for data output, or for indexing in a separate flat table.

Happy Magento-ing!

Tagged , ,

One thought on “Magento Multiple Group-by’s on a collection

Leave a comment