Curious result differences between equivalent queries

Posts   
 
    
MisterBK
User
Posts: 11
Joined: 18-Oct-2013
# Posted on: 13-Aug-2014 11:52:19   

Hi Frans,

I'm surprised by the following situation :


var good =
    (from a in tableA
     join b in tableB on a.Id equals b.Id
     join c in tableC on a.Id equals c.Id
     where c.Id == localId  
    
    where c.Something == null
    select a)
    .Count();

which gives me result x back, all is fine.

however :


var wrong = tableA
.Join(tableB, a => a.Id, b => b.Id, (a, b) => new { a, b })
.Join(tableC, x => x.a.Id, c => c.Id, (x, c) => new { x.a, x.b, c })
.Where(x => x.c.Id == localId)

.Where(x => x.c.Something == null)
.Select(x => x.a)
.Count();

gives me result y with x > y, which I do not understand.

Some more experimentation led me to the following observations :


var goodAlternative1 = tableA
.Join(tableB, a => a.Id, b => b.Id, (a, b) => new { a, b })
.Join(tableC.Where(x => x.Something == null), x => x.a.Id, c => c.Id, (x, c) => new { x.a, x.b, c })
.Where(x => x.c.Id == localId)

.Select(x => x.a)
.Count();


returns the expected x result (by moving the predicate filter before the join)

and, even more astonishing (to me anyway :-) :


var goodAlternative2 = (tableA
.Join(tableB, a => a.Id, b => b.Id, (a, b) => new { a, b }))
.Join(tableC, x => x.a.Id, c => c.Id, (x, c) => new { x.a, x.b, c })
.Where(x => x.c.Id == localId)

.ToList() // make it local
.Where(x => x.c.Something == null)
.Select(x => x.a)
.Count();

also returns the expected result x!

Especially the latter observation makes me believe I walked into a bug, but cannot rule out common lack of understanding, both which I hope you know how to solve :-)

Anyway, I'm using the llblgen 4.1 64 bit version you sent me at the end of our previous thread discussion : "Linq to llblgen Expression tree for Contains limitation" of 10-Jun-2014

Perhaps you are able from this contrived example what the problem might be. If not, I will send you the generated queries, and expression trees (apparently have to read some documentation first how to enable this, make that a link, hint, hint :-)

Hope to hear from you soon, cheers, Bart

MisterBK
User
Posts: 11
Joined: 18-Oct-2013
# Posted on: 13-Aug-2014 16:31:35   

Small update : Just installed llblgen pro 4.2 (it is wednesday afternoon afterall), rebuild everything, but behaviour remains exactly as reported earlier...

MisterBK
User
Posts: 11
Joined: 18-Oct-2013
# Posted on: 13-Aug-2014 17:43:19   

Ok, after adding <add name="LinqExpressionHandler" value="3" />

I get the following (somewhat cleaned up) expression trees :


value(DataSource`1[TableAEntity])
.Join(value(DataSource`1[TableBEntity]), a => a.Id, b => b.Id, (a, b) => new <>f__AnonymousType29`2(a = a, b = b))
.Join(value(DataSource`1[TableCEntity]), <>h__TransparentIdentifier18 => <>h__TransparentIdentifier18.a.Id, c => c.Id, (<>h__TransparentIdentifier18, c) => new <>f__AnonymousType2a`2(<>h__TransparentIdentifier18 = <>h__TransparentIdentifier18, c = c))
.Where(<>h__TransparentIdentifier19 => (<>h__TransparentIdentifier19.c.Id == value(+<>c__DisplayClass2f).localId))
.Where(<>h__TransparentIdentifier19 => (<>h__TransparentIdentifier19.c.Something == null))
.Select(<>h__TransparentIdentifier19 => <>h__TransparentIdentifier19.<>h__TransparentIdentifier18.a)
.Count()


value(DataSource`1[TableAEntity])
.Join(value(DataSource`1[TableBEntity]), a => a.Id, b => b.Id, (a, b) => new <>f__AnonymousType29`2(a = a, b = b))
.Join(value(DataSource`1[TableCEntity]), x => x.a.Id, c => c.Id, (x, c) => new <>f__AnonymousType2b`3(a = x.a, b = x.b, c = c))
.Where(x => (x.c.Id == value(+<>c__DisplayClass2f).localId))
.Where(x => (x.c.Something == null))
.Select(x => x.a)
.Count()

value(DataSource`1[TableAEntity])
.Join(value(DataSource`1[TableBEntity]), a => a.Id, b => b.Id, (a, b) => new <>f__AnonymousType29`2(a = a, b = b))
.Join(value(DataSource`1[TableCEntity]).Where(x => (x.Something == null)), x => x.a.Id, c => c.Id, (x, c) => new <>f__AnonymousType2b`3(a = x.a, b = x.b, c = c))
.Where(x => (x.c.Id == value(+<>c__DisplayClass2f).localId))
.Select(x => x.a)
.Count()

value(DataSource`1[TableAEntity])
.Join(value(DataSource`1[TableBEntity]), a => a.Id, b => b.Id, (a, b) => new <>f__AnonymousType29`2(a = a, b = b))
.Join(value(DataSource`1[TableCEntity]), x => x.a.Id, c => c.Id, (x, c) => new <>f__AnonymousType2b`3(a = x.a, b = x.b, c = c))
.Where(x => (x.c.Id == value(+<>c__DisplayClass2f).localId))


It shows that the query syntax generates a different tree then the fluent syntax, but only after the first join...

success! :-)

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 14-Aug-2014 09:57:06   

If you look at the SQL queries, they're also different, as your 'wrong' query has derived tables with projections. So: different queries simple_smile

Frans Bouma | Lead developer LLBLGen Pro
MisterBK
User
Posts: 11
Joined: 18-Oct-2013
# Posted on: 18-Aug-2014 17:39:26   

Unfortunately I do not understand your brief reply about derived tables with projections, but unencumbered by knowledge in general, I observe the following thing :

The expression tree shows me that the sql syntax query of the original query is transformed into :


var fluent1 = tableA
.Join(tableB, a => a.Id, b => b.Id, (a, b) => new { a, b })
.Join(tableC, x => x.a.Id, c => c.Id, (x, c) => new { x, c })
.Where(x => x.c.Id == localId)
.Where(x => x.c.Something == null)
.Select(x => x.x.a)
.Count();

while my by hand transformation from sql to fluent syntaxt lead me to the original :


var fluent2 = tableA
.Join(tableB, a => a.Id, b => b.Id, (a, b) => new { a, b })
.Join(tableC, x => x.a.Id, c => c.Id, (x, c) => new { x.a, x.b, c })
.Where(x => x.c.Id == localId)
.Where(x => x.c.Something == null)
.Select(x => x.a)
.Count();

which I see, semantically as equivalent, (in linq to objects surely) so why do they translate into different sql and therefor different results ? (if not a bug, which I am still convinced it is :-)

If I am wrong, please explain in more detail what I am missing. And if I am not, hahaha, well, a fix would be most welcome ... :-P

Cheers, Bart

Walaa avatar
Walaa
Support Team
Posts: 14993
Joined: 21-Aug-2005
# Posted on: 18-Aug-2014 20:08:32   

Both projections are different, yours (x.a, x.b, c) flattens out "x" in the original (x, c) projection.

It seems logical to me that the resultSet would turn different.

Did you check the resulted SQL?

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 19-Aug-2014 10:09:08   

Walaa is right, the queries are different, and without the produced SQL I can't say anything about why they are different other than what I think happens, looking at the queries in Linq form

MisterBK wrote:

Unfortunately I do not understand your brief reply about derived tables with projections, but unencumbered by knowledge in general, I observe the following thing :

The expression tree shows me that the sql syntax query of the original query is transformed into :


var fluent1 = tableA
.Join(tableB, a => a.Id, b => b.Id, (a, b) => new { a, b })
.Join(tableC, x => x.a.Id, c => c.Id, (x, c) => new { x, c })
.Where(x => x.c.Id == localId)
.Where(x => x.c.Something == null)
.Select(x => x.x.a)
.Count();

while my by hand transformation from sql to fluent syntaxt lead me to the original :


var fluent2 = tableA
.Join(tableB, a => a.Id, b => b.Id, (a, b) => new { a, b })
.Join(tableC, x => x.a.Id, c => c.Id, (x, c) => new { x.a, x.b, c })
.Where(x => x.c.Id == localId)
.Where(x => x.c.Something == null)
.Select(x => x.a)
.Count();

which I see, semantically as equivalent, (in linq to objects surely) so why do they translate into different sql and therefor different results ? (if not a bug, which I am still convinced it is :-)

Your original has everything in 1 scope, there's likely 1 select in the SQL query. The other linq queries however have projections along the way, resulting in multiple select statements in the SQL (select ... from (select ... ) AS A JOIN ... ) This can lead to different results because the set you're joining with is different and thus the end result is different.

Linq to objects isn't the same as linq to <database> as linq isn't 1:1 mappable to SQL. Linq to objects executes everything sequential as they're sequences. Linq to <DB> converts everything to be in 1 statement and is a set operation, and in general it leads to the same results, but it can differ, e.g. on NULL handling. (NULL=NULL in the DB is different from null==null in C#). When multiple scopes are in use, as in your second linq query, interpretation has to take place and a transformation to SQL has to take place. This in general leads to proper SQL, but in edge cases it can be the translation can't be made. For one thing, because SQL doesn't have a projection where you store a full row in a variable.

WIthout the SQL queries (and real code) it's guess work but I'm sure it's due to the fact the linq queries aren't alike.

Frans Bouma | Lead developer LLBLGen Pro
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39861
Joined: 17-Aug-2003
# Posted on: 19-Aug-2014 10:43:54   

Let's try to reproduce it.

[Test]
public void DifferenceInLinqQuery()
{
    using(var adapter = new DataAccessAdapter())
    {
        var metaData = new LinqMetaData(adapter);
        var v1 = (from a in metaData.Region
                    join b in metaData.Employee on a.RegionId equals b.RegionId
                    join c in metaData.Territory on a.RegionId equals c.RegionId
                    where c.TerritoryId == "10019"
                    where c.TerritoryDescription == "New York"
                    select a)
                .Count();

        Console.WriteLine(v1);


        var v2 = metaData.Region
                    .Join(metaData.Employee, a => a.RegionId, b => b.RegionId, (a, b) => new { a, b })
                    .Join(metaData.Territory, x => x.a.RegionId, c => c.RegionId, (x, c) => new { x.a, x.b, c })
                    .Where(x => x.c.TerritoryId == "10019")
                    .Where(x => x.c.TerritoryDescription == "New York")
                    .Select(x => x.a)
                    .Count();

        Console.WriteLine(v2);
    }
}

Both result in '1'.

SQL of v1:

SELECT TOP 1 COUNT(*) AS [LPAV_]
FROM   (SELECT [LPA_L2].[RegionDescription],
               [LPA_L2].[RegionID] AS [RegionId]
        FROM   (( [Northwind].[dbo].[Region] [LPA_L2]
                  INNER JOIN [Northwind].[dbo].[Employees] [LPA_L3]
                      ON [LPA_L2].[RegionID] = [LPA_L3].[RegionID])
                INNER JOIN [Northwind].[dbo].[Territories] [LPA_L4]
                    ON [LPA_L2].[RegionID] = [LPA_L4].[RegionID])
        WHERE  (((([LPA_L4].[TerritoryID] = '10019' /* @p1 */))
             AND ([LPA_L4].[TerritoryDescription] = 'New York' /* @p2 */)))) [LPA_L1]

SQL of v2:


SELECT TOP 1 COUNT(*) AS [LPAV_]
FROM   (SELECT [LPA_L2].[RegionDescription],
               [LPA_L2].[RegionId]
        FROM   (SELECT [LPA_L3].[RegionDescription],
                       [LPA_L3].[RegionID]  AS [RegionId],
                       [LPA_L4].[Address],
                       [LPA_L4].[BirthDate],
                       [LPA_L4].[City],
                       [LPA_L4].[Country],
                       [LPA_L4].[EmployeeID]  AS [EmployeeId],
                       [LPA_L4].[Extension],
                       [LPA_L4].[FirstName],
                       [LPA_L4].[HireDate],
                       [LPA_L4].[LastName],
                       [LPA_L4].[Notes],
                       [LPA_L4].[HomePhone]   AS [Phone],
                       [LPA_L4].[Photo],
                       [LPA_L4].[PhotoPath],
                       [LPA_L4].[PostalCode],
                       [LPA_L4].[Region],
                       [LPA_L4].[RegionID]  AS [RegionId0],
                       [LPA_L4].[ReportsTo],
                       [LPA_L4].[Title],
                       [LPA_L4].[TitleOfCourtesy],
                       [LPA_L5].[RegionID]  AS [RegionId1],
                       [LPA_L5].[TerritoryDescription],
                       [LPA_L5].[TerritoryID] AS [TerritoryId],
                       [LPA_L5].[test]      AS [Test]
                FROM   (( [Northwind].[dbo].[Region] [LPA_L3]
                          INNER JOIN [Northwind].[dbo].[Employees] [LPA_L4]
                              ON [LPA_L3].[RegionID] = [LPA_L4].[RegionID])
                        INNER JOIN [Northwind].[dbo].[Territories] [LPA_L5]
                            ON [LPA_L3].[RegionID] = [LPA_L5].[RegionID])) [LPA_L2]
        WHERE  (((([LPA_L2].[TerritoryId] = '10019' /* @p1 */))
             AND ([LPA_L2].[TerritoryDescription] = 'New York' /* @p2 */)))) [LPA_L1]

See the extra select in the second query? That could lead to duplicates, which is likely the case in your query. As you do a simple row count, duplicates lead to different results.

If you add .Distinct() before the .Count() it should give the proper results.

Frans Bouma | Lead developer LLBLGen Pro