Data compare odditiy

Posts   
 
    
mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 24-Mar-2013 20:52:56   

Consider this code


                    List<PostAndTagsModel> all;
                    var query = from p in ctx.Post
                                where !p.IsDeleted
                                select p;
                    query = from p in query
                            orderby p.Created descending
                            select p;
                    var pageQuery = query.Skip(page * pageSize).Take(pageSize);
                    var withTags = from p in pageQuery
                                   select new PostAndTagsModel
                                   {
                                       Post = p,
                                       Tags = (from pt in p.PostTags
                                               from tv in ctx.TagsView
                                               where tv.Id == pt.TagViewId && tv.IsActive && (tv.StartDate == null || tv.StartDate < p.Created) && (tv.EndDate == null || tv.EndDate > p.Created)
                                               select tv).ToList()
                                   };
                    all = withTags.ToList();


Generates the nested SQL below. Check out the date compare operands. It is comparing with = instead using < and >.


SELECT DISTINCT [LPA_L3].[Id]       AS [F10_0],
                [LPA_L3].[TableId]   AS [F10_1],
                [LPA_L3].[ Code]      AS [F10_2],
                [LPA_L3].[Name]   AS [F10_3],
                [LPA_L3].[Type]   AS [F10_4],
                [LPA_L3].[StartDate] AS [F10_5],
                [LPA_L3].[EndDate]   AS [F10_6],
                [LPA_L3].[IsActive]  AS [F10_7],
                [LPA_L3].[Color]     AS [F10_8]
FROM   (( [Kdng].[dbo].[Posts] [LPA_L1]
          INNER JOIN [Kdng].[dbo].[PostTags] [LPA_L2]
              ON [LPA_L1].[Id] = [LPA_L2].[PostId])
        CROSS JOIN [Kdng].[dbo].[TagsView] [LPA_L3] )
WHERE  (((((((([LPA_L3].[Id] = [LPA_L2].[TagViewId])
              AND ([LPA_L3].[IsActive] = 1 /* @p1 */))
             AND (([LPA_L3].[StartDate] IS NULL)))
            AND (([LPA_L3].[EndDate] IS NULL))))
      AND (([LPA_L3].[StartDate] = '20130323' /* @p2 */
            AND [LPA_L3].[EndDate] = '20130323' /* @p3 */)
            OR ([LPA_L3].[StartDate] = '20130317' /* @p4 */
                AND [LPA_L3].[EndDate] = '20130317' /* @p5 */)
            OR ([LPA_L3].[StartDate] = '20130307' /* @p6 */
                AND [LPA_L3].[EndDate] = '20130307' /* @p7 */)
            OR ([LPA_L3].[StartDate] = '20130307' /* @p8 */
                AND [LPA_L3].[EndDate] = '20130307' /* @p9 */)))))


Post -> PostTags is a 1-many. PostTags has TagsViewId column that matches TagsView.Id. TagsView.Id is not unique (but its rand StartDate-EndDate is).

v3.5 Final - December 11th, 2012

daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 25-Mar-2013 07:29:21   

Hi Miha,

  • Would you mind to update to the latest v3.5 release, regenerate code and try again?
  • Is this the only executed query? I wonder why it's creating parameters while for what I can see it should be expression between fields.
  • Does this happen if you execute only the most inner query?:
var q = (from pt in p.PostTags
             from tv in ctx.TagsView
             where tv.Id == pt.TagViewId && tv.IsActive 
                && (tv.StartDate == null || tv.StartDate < p.Created) 
                && (tv.EndDate == null || tv.EndDate > p.Created)
             select tv).ToList();
David Elizondo | LLBLGen Support Team
mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 25-Mar-2013 08:34:16   

Hi daelmo,

Yes, it should be an expression.

  1. updated to the latest (januar 17th)
  2. there are two queries - one top query and this, problematic, nested
  3. added this:

 DateTime d = new DateTime(2011, 1, 2);
                    var q2 = (from pt in ctx.PostTag
                              where pt.Id == 29
                              from tv in ctx.TagsView
                              where tv.IsActive && (tv.StartDate == null || tv.StartDate < d) && (tv.EndDate == null || tv.EndDate > d)
                              select tv).ToList();

and it (correctly) yields


SELECT DISTINCT [LPA_L2].[Id]       AS [F10_0],
                [LPA_L2].[TableId]   AS [F10_1],
                [LPA_L2].[ Code]      AS [F10_2],
                [LPA_L2].[Name]   AS [F10_3],
                [LPA_L2].[Type]   AS [F10_4],
                [LPA_L2].[StartDate] AS [F10_5],
                [LPA_L2].[EndDate]   AS [F10_6],
                [LPA_L2].[IsActive]  AS [F10_7],
                [LPA_L2].[Color]     AS [F10_8]
FROM   ( [Kdng].[dbo].[PostTags] [LPA_L1]
         CROSS JOIN [Kdng].[dbo].[TagsView] [LPA_L2] )
WHERE  (((((([LPA_L1].[Id] = @p1))
       AND ((([LPA_L2].[IsActive] = @p2)
             AND (([LPA_L2].[StartDate] IS NULL)
                   OR ([LPA_L2].[StartDate] < @p3)))
            AND (([LPA_L2].[EndDate] IS NULL)
                  OR ([LPA_L2].[EndDate] > @p4))))))) 

I am puzzled. Perhaps I have a stupid mistake somewhere. Huh. I can send you the datalayer if you want to take a look at it.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 25-Mar-2013 10:58:10   

Could you simplify the query in your startpost so it just has the most simplest outer query and the most simplest inner query which makes it fail? My guess is it might be something to do with the correlation filter between outer and inner query which makes this problematic, because the element in the where clause is from the outer query. To formulate the correlation filter (to merge inner elements with outer element) it has to find a predicate which ties them together. '<' won't cut it, but I'm not sure whether it is a bug (it changes the operator) or that it simply ignores something and adds its own...

Frans Bouma | Lead developer LLBLGen Pro
mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 25-Mar-2013 14:12:36   

                    var x = from p in ctx.Post
                            select new PostAndTagsModel
                            {
                                Post = p,
                                Tags = (from pt in p.PostTags
                                        from tv in ctx.TagsView
                                        where tv.Id == pt.TagViewId && tv.IsActive && (tv.StartDate == null || tv.StartDate < p.Created) && (tv.EndDate == null || tv.EndDate > p.Created)
                                        select tv).ToList()
                            };
                    var r = x.ToList();

This code works as long as I don't use paging - either at the start or at the end, doesn't matter.

However, this code (simplified where statement)


                    var x = from p in ctx.Post
                                   select new PostAndTagsModel
                                   {
                                       Post = p,
                                       Tags = (from pt in p.PostTags
                                               from tv in ctx.TagsView
                                               where tv.Id == pt.TagViewId && tv.StartDate < p.Created
                                               select tv).ToList()
                                   };
                    var r = x.ToList();

is again yielding wrong SQL even without pagging:


SELECT DISTINCT [LPA_L3].[Id]       AS [F10_0],
                [LPA_L3].[TableId]   AS [F10_1],
                [LPA_L3].[ Code]      AS [F10_2],
                [LPA_L3].[Name]   AS [F10_3],
                [LPA_L3].[Type]   AS [F10_4],
                [LPA_L3].[StartDate] AS [F10_5],
                [LPA_L3].[EndDate]   AS [F10_6],
                [LPA_L3].[IsActive]  AS [F10_7],
                [LPA_L3].[Color]     AS [F10_8]
FROM   (( [Kdng].[dbo].[Posts] [LPA_L1]
          INNER JOIN [Kdng].[dbo].[PostTags] [LPA_L2]
              ON [LPA_L1].[Id] = [LPA_L2].[PostId])
        CROSS JOIN [Kdng].[dbo].[TagsView] [LPA_L3] )
WHERE  (((((([LPA_L3].[Id] = [LPA_L2].[TagViewId])))
      AND ([LPA_L3].[StartDate] IN ('20121026' /* @p1 */, '20121026' /* @p2 */, '20121026' /* @p3 */, '20121105' /* @p4 */,
                                    '20121108' /* @p5 */, '20121114' /* @p6 */, '20121116' /* @p7 */, '20121128' /* @p8 */,
                                    '20121129' /* @p9 */, '20121204' /* @p10 */, '20121204' /* @p11 */, '20121209' /* @p12 */,
                                    '20121211' /* @p13 */, '20121220' /* @p14 */, '20130116' /* @p15 */, '20130207' /* @p16 */,
                                    '20130212' /* @p17 */, '20130217' /* @p18 */, '20130221' /* @p19 */, '20130223' /* @p20 */,
                                    '20130305' /* @p21 */, '20130305' /* @p22 */, '20130307' /* @p23 */, '20130307' /* @p24 */,
                                    '20130317' /* @p25 */, '20130321' /* @p26 */, '20130323' /* @p27 */)))))


Huh?

mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 25-Mar-2013 14:22:03   

Actually it never creates proper SQL but for different reasons. Here is what I though it was a proper SQL because I looked just at operands.

[code] SELECT DISTINCT [LPA_L3].[Id] AS [F10_0], [LPA_L3].[TableId] AS [F10_1], [LPA_L3].

      AS [F10_2],
                [LPA_L3].[Name]   AS [F10_3],
                [LPA_L3].[Type]   AS [F10_4],
                [LPA_L3].[StartDate] AS [F10_5],
                [LPA_L3].[EndDate]   AS [F10_6],
                [LPA_L3].[IsActive]  AS [F10_7],
                [LPA_L3].[Color]     AS [F10_8]
FROM   (( [Kdng].[dbo].[Posts] [LPA_L1]
          INNER JOIN [Kdng].[dbo].[PostTags] [LPA_L2]
              ON [LPA_L1].[Id] = [LPA_L2].[PostId])
        CROSS JOIN [Kdng].[dbo].[TagsView] [LPA_L3] )
WHERE  (((((((([LPA_L3].[Id] = [LPA_L2].[TagViewId])
              AND ([LPA_L3].[IsActive] = @p1))
             AND (([LPA_L3].[StartDate] IS NULL)))
            AND (([LPA_L3].[EndDate] IS NULL))))
      AND (EXISTS
           (SELECT [LPA_L1].[Created]
            FROM   [Kdng].[dbo].[Posts] [LPA_L1]
            WHERE  ([LPA_L3].[StartDate] < [LPA_L1].[Created]
                AND [LPA_L3].[EndDate] > [LPA_L1].[Created])))))) 

However if you look at entire where statement you'll notice that it is wrong - i.e. it uses AND exclusively which rules out any row (i.e. StartDate can't be NULL and < Created at the same time)

Source:


var x = from p in ctx.Post
                            select new PostAndTagsModel
                            {
                                Post = p,
                                Tags = (from pt in p.PostTags
                                        from tv in ctx.TagsView
                                        where tv.Id == pt.TagViewId && tv.IsActive && (tv.StartDate == null || tv.StartDate < p.Created) && (tv.EndDate == null || tv.EndDate > p.Created)
                                        select tv).ToList()
                            };
                    var r = x.ToList();

Walaa avatar
Walaa
Support Team
Posts: 14993
Joined: 21-Aug-2005
# Posted on: 25-Mar-2013 20:17:44   

I'm a little bit confused of which codeproduces which query. Could you please attach a simple repro solution?

mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 25-Mar-2013 20:39:52   

Updated the post (with Source code). Is it clear now?

daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 26-Mar-2013 07:53:59   

This is the closest I can do to reproduce it, using AdventureWorks. First something like your latest query, where the operators are ok, but apparently the IS-NULL filter and the EXISTS are linked by an "AND".

RTL ORMSupportClasses: 3.5.13.108 LinqSupportClasses: 3.5.12.1211

Repreduce code

[TestMethod]
public void WeirdCorrelationInnerQuery()
{
    var adapter = new DataAccessAdapter();
    var ctx = new LinqMetaData(adapter);

    var x = from e in ctx.Employee
            select new
            {
                Emp = e,
                Vendors = (from po in e.PurchaseOrders
                            join v in ctx.Vendor on po.VendorId equals v.VendorId
                            where (v.ModifiedDate == null || v.ModifiedDate <= e.HireDate)
                            select v)
            };

    var results = x.ToList();
}

Generated SQL

-- 1st query...

SELECT [LPA_L1].[BirthDate]     AS [F1_0],
       [LPA_L1].[ContactID]     AS [F1_1],
       ...
       [LPA_L1].[VacationHours] AS [F1_15],
       [LPA_L2].[SalesPersonID] AS [F4_16],
       ...
       [LPA_L2].[TerritoryID]     AS [F4_22],
       1                           AS [LPFA_5],
       [LPA_L1].[HireDate]
FROM   ( [AdventureWorks].[HumanResources].[Employee] [LPA_L1]
         LEFT JOIN [AdventureWorks].[Sales].[SalesPerson] [LPA_L2]
             ON [LPA_L1].[EmployeeID] = [LPA_L2].[SalesPersonID]) 

-- 2nd query... 

SELECT DISTINCT [LPA_L3].[AccountNumber],
                [LPA_L3].[ActiveFlag],
                [LPA_L3].[CreditRating],
                [LPA_L3].[ModifiedDate],
                [LPA_L3].[Name],
                [LPA_L3].[PreferredVendorStatus],
                [LPA_L3].[PurchasingWebServiceURL] AS [PurchasingWebServiceUrl],
                [LPA_L3].[VendorID]             AS [VendorId]
FROM   (( [AdventureWorks].[HumanResources].[Employee] [LPA_L1]
          INNER JOIN [AdventureWorks].[Purchasing].[PurchaseOrderHeader] [LPA_L2]
              ON [LPA_L1].[EmployeeID] = [LPA_L2].[EmployeeID])
        INNER JOIN [AdventureWorks].[Purchasing].[Vendor] [LPA_L3]
            ON [LPA_L2].[VendorID] = [LPA_L3].[VendorID])
WHERE  (((((([LPA_L3].[ModifiedDate] IS NULL)))
      AND (EXISTS
           (SELECT [LPA_L1].[HireDate]
            FROM   [AdventureWorks].[HumanResources].[Employee] [LPA_L1]
            WHERE  ([LPA_L3].[ModifiedDate] <= [LPA_L1].[HireDate])))))) 

We will look into this...

David Elizondo | LLBLGen Support Team
mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 26-Mar-2013 08:40:44   

yep, that's a good start. As I said previously, I can provide you my datalayer as well. I use the same assembly versions as you.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 26-Mar-2013 09:58:26   

In David's repro, there's this nested query: Vendors = (from po in e.PurchaseOrders join v in ctx.Vendor on po.VendorId equals v.VendorId where (v.ModifiedDate == null || v.ModifiedDate <= e.HireDate) select v)

po is related to the outer element (through e.PurchaseOrders). However the nested query results in instances of Vendor, which is joined in.

Now consider the two sets: the outer query's elements and the inner query's elements. How do they relate? The code tries to find a correlation between the two. The correlation however isn't found between the two sets, as the correlation is defined by a 3rd element, not in memory, which defines the correlation, namely purchase orders.

The code however doesn't interpret the nested query what it does, it tries to find a correlation predicate, if there's no direct correlation (e.g. e.PurchaseOrders is a direct correlation so that would have been used as correlation predicate to tie the two sets together). It tries to find an existing predicate in the set and it sees one, namely v.ModifiedDate <= e.HireDate. It uses that one as the connection between the two sets. However it always has to AND this predicate, because it has to be true for EVERY row in the nested query. The predicate in the nested query however, has an OR. This means that if v.ModifiedDate <= e.HireDate would NOT be true, namely in the case of v.ModifiedDate==null, the nested row would, in theory! (as the can't interpret the nested query, it simply checks for predicates), not be mergable with any outer row, as there's no connection with any outer row.

This is why the correlation predicate found is and-ed, and not or-ed.

I'm afraid your nested query isn't really going to work, as there's no direct correlation defined between inner set and outer set.

If my explanation isn't clear enough, please let me know and I'll make an example simple_smile

Btw, in this case, please use prefetch paths, as with prefetch paths, you won't have this problem.

Frans Bouma | Lead developer LLBLGen Pro
mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 26-Mar-2013 10:22:48   

Your reply is full of technical words simple_smile . If I understand you correctly the problem arises from the fact that there is no correlation between e and vendors known to LLBLGenPro, correct? So it won work even if there were a correlation between PurchaseOrder and Vendor, like: Vendors = (from po in e.PurchaseOrders from v in po.Vendors where (v.ModifiedDate == null || v.ModifiedDate <= e.HireDate) select v)

Would it be at least possible to throw or warn on such case instead? Executing a wrong query is rather dangerous.

mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 26-Mar-2013 10:27:14   

And yes, I solved it with prefetch.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 26-Mar-2013 10:40:07   

mihies wrote:

Your reply is full of technical words simple_smile . If I understand you correctly the problem arises from the fact that there is no correlation between e and vendors known to LLBLGenPro, correct? So it won work even if there were a correlation between PurchaseOrder and Vendor, like: Vendors = (from po in e.PurchaseOrders from v in po.Vendors where (v.ModifiedDate == null || v.ModifiedDate <= e.HireDate) select v)

Yes, you have to look at the two sets which have to be merged: set of e's and a set of v's. The runtime now has to find a way to group sets of v's to be merged with their parent e instances. To do that, it has to find a correlation predicate. E.g. in the case of Customer and a nested set of Orders, the o.CustomerId==c.CustomerId predicate, ties nested set to outer set. With your nested set, the correlation is defined in another entity, so that's not going to work as you can't decide to which e a given v belongs to by simply looking at the v instance and the set of e's and using a predicate.

Would it be at least possible to throw or warn on such case instead? Executing a wrong query is rather dangerous.

It does that, but in this particular odd case it thinks it did find a predicate, namely v.ModifiedDate <= e.HireDate, but this turned out to be not the case. I'm not sure how to weed this out though, I'll have a look.

(edit) I see the filter for predicates not usable allows predicates which are field op field predicates, where one of the fields is in the nested set and one is in the outer set and it doesn't check the operator. So it allows your predicate as it's simply a field op field predicate, but the operator isn't usable in all cases, because the <= operator also succeeds when the left field is smaller than the right field, and this isn't usable for correlation either.

As this is a breaking change, I'll make the change in the beta of v4.

Frans Bouma | Lead developer LLBLGen Pro
mihies avatar
mihies
User
Posts: 800
Joined: 29-Jan-2006
# Posted on: 26-Mar-2013 12:35:31   

Great. I don't really care but I'd say implement it in 3.5 even though you consider it as a breaking change. The reasoning is that it doesn't work properly in this particular case and devs might not even be aware of the problem. So the breaking change would actually help them. Personally I'd prefer an exception in this case even on my existing code.

But then again, it doesn't really matter to me.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 26-Mar-2013 16:33:50   

People are pretty anal with these kind of things: they don't want to see crashes in places which worked before after applying a patch.

I don't think the situation is very common though.

Frans Bouma | Lead developer LLBLGen Pro