Group by multiple columns failed

Posts   
1  /  2
 
    
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39625
Joined: 17-Aug-2003
# Posted on: 01-Apr-2008 19:31:00   

savanna wrote:

Otis, you are so smart! You solved all my problems now simple_smile

Smart? Nah, I have hit that wall already a couple of times plus all the credit for this solution should go to ggpnet! sunglasses

So it works now? Cool! simple_smile Keep me posted if you run into more issues. simple_smile

The last couple of days I managed to bring down memory consumption of entities down by 10-20% and during transactions with 90% (so no more explosions of memory usage during saves), so this will be in the next build simple_smile

Frans Bouma | Lead developer LLBLGen Pro
rossmcw
User
Posts: 28
Joined: 27-Mar-2008
# Posted on: 03-Apr-2008 23:01:30   

I too have run into an issue with group by aggregates. There still seems to be an issue with aliases used in aggregates.

Using Mar-28 build, from Northwind where 'db' is LinqMetaData:

from c in db.Customers
join o in db.Orders on c.CustomerID equals o.CustomerID into og
from o in og.DefaultIfEmpty()
group o by new
{
    c.CustomerID,
    c.CompanyName
} into g
select new
{
    Customer = g.Key,
    LastOrderDate = g.Max(d => d.OrderDate),
    TotalOrders = g.Count()
}

which generates in SQL:

SELECT  [LPA_L1].[CustomerID], 
        [LPA_L1].[CompanyName], 
        [LPA_L1].[LPAV_] AS [LastOrderDate], 
        [LPA_L1].[LPAV_1] AS [TotalOrders] 
FROM 
(
    SELECT  [LPA_L3].[CustomerID], 
            [LPA_L3].[CompanyName], 
            MAX([LPLA_6].[OrderDate]) AS [LPAV_], 
            COUNT(*) AS [LPAV_1] 
    FROM ([Northwind].[dbo].[Customers] [LPA_L3]  
    LEFT JOIN [Northwind].[dbo].[Orders] [LPA_L4]  
        ON  [LPA_L3].[CustomerID] = [LPA_L4].[CustomerID]) 
    GROUP BY [LPA_L3].[CustomerID], [LPA_L3].[CompanyName]
) LPA_L1

which dies with a 'The multi-part identifier "LPLA_6.OrderDate" could not be bound.' error as [LPLA_6] does not exist.

The above linq query works fine in Linq2SQL.

Any help would be greatly appreciated.

Thanks Ross

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39625
Joined: 17-Aug-2003
# Posted on: 04-Apr-2008 09:23:05   

Hmm, will check it out!

(edit) reproduced. Is likely caused by the alias rewriting required for DefaultIsEmpty, which is a very nasty method to implement as a groupjoin target (og) has to be pulled from another part of the tree with aliases which are actually hidden for it (out of scope as it's in another part of the tree).

Frans Bouma | Lead developer LLBLGen Pro
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39625
Joined: 17-Aug-2003
# Posted on: 07-Apr-2008 11:05:49   

Linq is one heck (pun intended) of a mess internally sometimes. A 'join' is always the 'right' side. Your Max() has as source the groupby. Say the groupby has alias LPLA_7. The aggregate is placed inside the groupby query (otherwise it doesn't work, a difference between linq and SQL) so it has to refer to the SOURCE of the groupby expression. But that's a join. Which side to refer to? Well... that's always the right side. It has been a while since I worked on joins and looking up the design docs where I had noted some remarks about this indeed showed this. disappointed . Adding that 'special case check' to the preprocessor of the tree makes the query generate the proper SQL. (It dies inside the projector where a NULL date can't be handled, working on that).

It's this kind of special case programming which makes writing a linq provider... 'a challenge'. Ironically, the linq to sql code is packed with this kind of special case tests as well. If you use reflector on some methods you'll see it has some very awkward if expressions which are very long (and test some obscure combinations of boolean expressions which apparently have a relation but that's unclear). Oh well.... simple_smile

The NULL date should be handled in the projector (2 customers in northwind don't have orders, so the aggregate returns NULL for last order date. However the projector gets confused in this particular case. Will check that out as well. simple_smile

What's especially crap about Linq's design (if I may say so) is that it tries to work with sequences instead of sets, but that's only because linq to objects works that way (as objects in memory have an order, sets don't). So instead of designing Linq as a language working with sets, which COULD work with sequences (which have an order), which would have made things much easier for everyone, they instead design it around sequences and the providers which have to work with sets (supertype of sequence wink ) have to jump through hoops to make the functionality work. Which of course doesn't work out that well...

(edit)Ok fixed that. simple_smile

Btw, your query has a bug too: if the customer has no orders, TotalOrders is 1. This is because you do a left join so Customer left join Orders always has a row: the customer. This is the result in linq to sql and our code.

The issue with the alias and the null issue are fixed in the next build.

Frans Bouma | Lead developer LLBLGen Pro
rossmcw
User
Posts: 28
Joined: 27-Mar-2008
# Posted on: 07-Apr-2008 22:46:41   

Thanks Frans,

I will check out the new build and let you know how I get on.

rossmcw
User
Posts: 28
Joined: 27-Mar-2008
# Posted on: 11-Apr-2008 07:35:30   

Btw, your query has a bug too: if the customer has no orders, TotalOrders is 1. This is because you do a left join so Customer left join Orders always has a row: the customer. This is the result in linq to sql and our code.

I have been looking at this and was wondering how I should go about doing it properly, any thoughts? It seems so simple, and I am feeling quite stupid for not knowing. simple_smile

I have tried a couple of things.

                   from c in db.Customers
                    join o in db.Orders on c.CustomerID equals o.CustomerID into og
                    from o in og.DefaultIfEmpty()
                    group o by new
                    {
                        c.CustomerID,
                        c.CompanyName
                    } into g
                    select new
                    {
                        Customer = g.Key,
                        LastOrderDate = g.Max(d => d.OrderDate),
                        TotalOrders = g.Count(d => d.OrderID != null)
                    }

but this doesnt return the customers with 0 orders. I also tried:

                    from c in db.Customers
                    join o in db.Orders on c.CustomerID equals o.CustomerID into og
                    from o in og.DefaultIfEmpty()
                    group o by new
                    {
                        c.CustomerID,
                        c.CompanyName
                    } into g
                    select new
                    {
                        Customer = g.Key,
                        LastOrderDate = g.Max(d => d.OrderDate),
                        TotalOrders = g.Where(d => d.OrderID != null).Count()
                    }

but this blew up with 'Unable to cast object of type 'System.Linq.Expressions.ParameterExpression' to type 'SD.LLBLGen.Pro26.LinqSupportClasses.ExpressionClasses.SetExpression'.'

Any pointers greatly appreciated simple_smile

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39625
Joined: 17-Aug-2003
# Posted on: 11-Apr-2008 14:38:56   

You want the customers and their total # of orders? There's a unittest in the pack which does this actually wink


[Test]
public void GetAllCustomerIDsWithTheirTotalNumberOfOrders()
{
    using(DataAccessAdapter adapter = new DataAccessAdapter())
    {
        LinqMetaData metaData = new LinqMetaData(adapter);
        var q = from c in metaData.Customer
                select new { CustomerID = c.CustomerId, NumberOfOrders = c.Orders.Count() };

        int counter = 0;
        foreach(var v in q)
        {
            counter++;
            Console.WriteLine("CustomerID: {0}. NumberOfOrders: {1}", v.CustomerID, v.NumberOfOrders);
        }
        Assert.AreEqual(91, counter);
    }
}

i.o.w.: you use a scalar query for the count. There's no other way, as you otherwise always get the row with the customer data when no orders are present.

(edit), the second query you posted, which crashed, actually does reveal the right info as well, in linq to sql it does. To support that it's a bit difficult, as it needs more special case code: the groupby query has to be cloned completely, the clone query furthermore has to geta correlation filter with the groupby fields in the projection of the original query. THAT clone query is then the source of the aggregate. As MS has build a big cloner visitor (internal of course) which clones a complete tree, and I haven't, it's a bit difficult to build this in, also because it's a scenario which has workarounds (using scalar queries without grouping). So I'll leave it for now (Will file the issue of the second query as a bug but won't fix it for now).

Frans Bouma | Lead developer LLBLGen Pro
1  /  2