What is the textbook way of handling complex queries (like the one in this post)?

Posts   
 
    
morphman
User
Posts: 28
Joined: 27-Oct-2011
# Posted on: 29-Jul-2015 16:18:51   

We are running .NET 4.5.1 with LLBLGen Pro Product Version 4.2.14.0919 with SQL Server 2012 Enterprise Edition.

While I'm going to cite a specific example (which we've already worked around using Prefetching), the purpose of my question is to better understand what is the textbook way in which we should handle complex queries using LLBLGen Pro (and/or the database directly).

Below is a fairly fancy SQL where we are pulling the contents of a shopping cart and we also need to pull the price in USD of each item from some tables that store generic item "attribute" information. Attached is a database diagram covering the pertinent tables as well.

SELECT BasketItem.[ID] AS BasketItemId
      ,[BasketItem].[EndItemID]
      ,[EndItem].CatalogNumber
      ,[Quantity]
      ,EndItemAttributeUniversalValue.AttributeValue AS PriceInUSD
  FROM [BasketItem]
  INNER JOIN EndItem ON EndItem.ID = BasketItem.EndItemID
  LEFT JOIN EndItemToEndItemAttribute ON EndItemToEndItemAttribute.EndItemId = EndItem.ID 
  AND EndItemToEndItemAttribute.EndItemAttributeId 
= (SELECT ID FROM EndItemAttribute WHERE EndItemAttribute.UniqueName = 'Price_USD')
  LEFT JOIN EndItemAttributeUniversalValue ON EndItemAttributeUniversalValue.EndItemToEndItemAttributeID = EndItemToEndItemAttribute.ID
  WHERE BasketItemStatus = 'InCart'

We believe the LINQ equivalent of this query is something like the below:

var checkoutItemResults = basketEntities.Select(x => new CheckoutItemResult
                                                     {
                                                         CatalogNumber = x.EndItem.CatalogNumber,
                                                         EndItemId = x.EndItemID,
                                                         Quantity = x.Quantity,
                                                         BasketItemId = x.ID,
                                                         FormattedPrice = x.EndItem.EndItemToEndItemAttributes
                                                                                  .Where(y => y.EndItemAttribute.UniqueName == "Price_USD")
                                                                                    .SelectMany(q => q.EndItemAttributeUniversalValues)
                                                                                    .Select(f => f.AttributeValue)
                                                                                    .FirstOrDefault()
                                                     }).ToList();

We have written a number of unit tests in which we manually build an object graph of these 5 tables/entities (BasketItem, EndItem, EndItemToEndItemAttribute, EndItemAttribute, and EndItemAttributeUniversalValue), and the LINQ query returns the exact results we want. However, when we run this through LLBLGen Pro it appears to return the same PriceInUSD for every item, indicating that it isn't pulling the appropriate EndItemAttributeUniversalValue.AttributeValue.

Interestingly, if we prefetch the entities and .ToList() them so we have the in-memory object graph, we can then run the above query against that object graph and it pulls the correct results. I think this indicates that LLBLGen Pro might not be generating the right queries to handle this scenario.

This works:


//first prefetch all of this stuff using LLBLGen Pro. I think this is way overkill for the 5 fields we need to pull.
var basketEntities = entityQueryable.Where(basketItemEntity => basketItemEntity.BasketItemStatus == filter.BasketItemStatus.ToString() &&
                                                                                basketItemEntity.UserID == singleSignOnId
                                                                                && basketItemEntity.BasketID == null)
            .WithPathAny(x => x.Prefetch<EndItemEntity>(bi => bi.EndItem)
                .SubPath(ei => ei.Prefetch<EndItemToEndItemAttributeEntity>(eia => eia.EndItemToEndItemAttributes)
                    .SubPath(a => a
                        .Prefetch<EndItemAttributeEntity>(ae => ae.EndItemAttribute)
                        .Prefetch<EndItemAttributeUniversalValueEntity>(ae => ae.EndItemAttributeUniversalValues)
                    ))).ToList();

//now run this LINQ query against the entity graph that was returned by LLBLGen Pro and it gets the results we want.
var checkoutItemResults = basketEntities.Select(x => new CheckoutItemResult
                                                     {
                                                         CatalogNumber = x.EndItem.CatalogNumber,
                                                         EndItemId = x.EndItemID,
                                                         Quantity = x.Quantity,
                                                         BasketItemId = x.ID,
                                                         FormattedPrice = x.EndItem.EndItemToEndItemAttributes
                                                                                  .Where(y => y.EndItemAttribute.UniqueName == "Price_USD")
                                                                                    .SelectMany(q => q.EndItemAttributeUniversalValues)
                                                                                    .Select(f => f.AttributeValue)
                                                                                    .FirstOrDefault()
                                                     }).ToList();

So the questions are: 1) Why doesn't this specific example work?

2) When we have to do more complicated queries like this, what is the general best practice? I realize this is subjective but I'm asking your opinion simple_smile

The way I see it we can: A) do a stored procedure and expose that through LLBLGen since we are really comfortable with the SQL, and less comfortable with how to do it through LLBLGen Pro and/or LINQ B) Go with the Prefetching approach but add more intelligent filters on each path. The downside here is that we are populating a really big object graph when we really only new a few fields. Maybe worrying about this is premature optimization... but we have built massive pages in the past using this approach that have severely slowed down due to the CPU hit from populating thousands of objects when we really only needed 20 or so fields. C) Figure out if there is actually something wrong with our LINQ queries and not LLBLGen Pro. D) Some great option we haven't thought of.

Thanks so much for your help! We really love the product and have been happily using it for over 5 years now.

Attachments
Filename File size Added on Approval
database_diagram.jpg 85,593 29-Jul-2015 16:19.01 Approved
Walaa avatar
Walaa
Support Team
Posts: 14993
Joined: 21-Aug-2005
# Posted on: 29-Jul-2015 22:08:43   

The linq query contains a SelectMany(), and after that a FirstOrDefault(). This means the row picked at the end is unclear, as there's no order by.

It's different then the prefetchPath routine.

One reason the in-memory linq query is working on the in-memory graph could be because the DB returns rows in a different order than the in-memory query (as the in-memory query is deterministic, the DB query's order is undefined, as is the case with all SELECT statements without order by).

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 30-Jul-2015 10:12:03   

To answer your more generic questions:

morphman wrote:

We are running .NET 4.5.1 with LLBLGen Pro Product Version 4.2.14.0919 with SQL Server 2012 Enterprise Edition.

While I'm going to cite a specific example (which we've already worked around using Prefetching), the purpose of my question is to better understand what is the textbook way in which we should handle complex queries using LLBLGen Pro (and/or the database directly).

Below is a fairly fancy SQL where we are pulling the contents of a shopping cart and we also need to pull the price in USD of each item from some tables that store generic item "attribute" information. Attached is a database diagram covering the pertinent tables as well.

SELECT BasketItem.[ID] AS BasketItemId
      ,[BasketItem].[EndItemID]
      ,[EndItem].CatalogNumber
      ,[Quantity]
      ,EndItemAttributeUniversalValue.AttributeValue AS PriceInUSD
  FROM [BasketItem]
  INNER JOIN EndItem ON EndItem.ID = BasketItem.EndItemID
  LEFT JOIN EndItemToEndItemAttribute ON EndItemToEndItemAttribute.EndItemId = EndItem.ID 
  AND EndItemToEndItemAttribute.EndItemAttributeId 
= (SELECT ID FROM EndItemAttribute WHERE EndItemAttribute.UniqueName = 'Price_USD')
  LEFT JOIN EndItemAttributeUniversalValue ON EndItemAttributeUniversalValue.EndItemToEndItemAttributeID = EndItemToEndItemAttribute.ID
  WHERE BasketItemStatus = 'InCart'

We believe the LINQ equivalent of this query is something like the below:

var checkoutItemResults = basketEntities.Select(x => new CheckoutItemResult
                                                     {
                                                         CatalogNumber = x.EndItem.CatalogNumber,
                                                         EndItemId = x.EndItemID,
                                                         Quantity = x.Quantity,
                                                         BasketItemId = x.ID,
                                                         FormattedPrice = x.EndItem.EndItemToEndItemAttributes
                                                                                  .Where(y => y.EndItemAttribute.UniqueName == "Price_USD")
                                                                                    .SelectMany(q => q.EndItemAttributeUniversalValues)
                                                                                    .Select(f => f.AttributeValue)
                                                                                    .FirstOrDefault()
                                                     }).ToList();

Btw, these queries are not equivalent at all. The sql query you specify uses a a simple projection over a joined set, the linq query uses a nested query in the projection, meaning it will in theory execute at every row, and it's not a set joined to the main set. If you want an equivalent, you should join the Formatted Price fetch with the main query and then simply project the value into the projection.

So the questions are: 2) When we have to do more complicated queries like this, what is the general best practice? I realize this is subjective but I'm asking your opinion simple_smile

In general, do: create set, filter set, project. If you stuff a lot of the logic in the projection you'll effectively creating a slower query, as the projection logic has to be run every row. See it as adding a scalar query to a projection which means the scalar query will run with every row. If you create a set with the scalars and then join that, and then project the scalar value in the joined set into the projection, you'll have a faster query.

Linq is a bit limited in this though, as it is awkward for doing joins over predicates other than a==b.

Also, FirstOrDefault on a set in a projection doesn't mean much without order by, as what Walaa said, the ordering of the set is undefined, so whatever is picked is random (in the order in which the DB returns the rows, SELECT's ordering is by definition undefined).

If joins are awkward in linq, try queryspec (our other high level query system). Your sql query in queryspec:


var qf = new QueryFactory();

var q = qf.BasketItem
            .From(QueryTarget
                    .InnerJoin(qf.EndItem).On(EndItemFields.ID==BasketItemFields.EndItemID)
                    .LeftJoin(qf.EndItemToEndItemAttribute)
                        .On((EndItemToEndItemAttributeFields.EndItemId==EndItemFields.ID)
                            .And(EndItemToEndItemAttributeFields.EndItemAttributeId
                                    .In(qf.Create()
                                        .Select(EndItemAttributeFields.ID)
                                        .Where(EndItemAttributeFields.UniqueName=="Price_USD"))))
                    .LeftJoin(qf.EndItemAttributeUniversalValue)
                        .On(EndItemAttributeUniversalValueFields.EndItemToEndItemAttributeID==EndItemToEndItemAttributeFields.ID))
            .Where(BasketItemFields.BasketItemStatus=="InCart")
            .Select(()=>new CheckoutItemResult() 
                {
                    CatalogNumber = EndItemFields.CatalogNumber.ToValue<int>(),
                    EndItemId = BasketItemFields.EndItemId.ToValue<int>(),
                    Quantity = BasketItemFields.Quantity.ToValue<int>(),
                    BasketItemId = BasketItemFields.Id.ToValue<int>(),
                    FormattedPrice = EndItemAttributeUniversalValueFields.AttributeValue.ToValue<string>()
                });

(I guessed the types and some names wink )

As you can see it's very close to the sql query. You can almost translate it 1:1 over to queryspec if you want to.

The prefetch path route -> in-memory projection isn't really an option, although it works: it's a hidden bottleneck that will pop up when the graph is big enough sometime in the future and you don't want that.

Frans Bouma | Lead developer LLBLGen Pro
morphman
User
Posts: 28
Joined: 27-Oct-2011
# Posted on: 30-Jul-2015 16:07:02   

Thanks for the thoughtful answers. Will take some time to digest this and try some things back. I think this is mostly a matter of our not understanding the best way to write this in LINQ.

I should mention that there should be exactly 0 or 1 results returned and not a set since the EndItemAttribute.UniqueName is a unique constraint on the table. I only mention this because it may explain why I don't think the order by would really matter in this case.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 30-Jul-2015 17:03:41   

The .SelectMany(q => q.EndItemAttributeUniversalValues) pulls in additional rows, hence my idea it's not a single value wink

Frans Bouma | Lead developer LLBLGen Pro