Bad translation (2x) of LINQ query [Firebird]

Posts   
 
    
Posts: 95
Joined: 31-Mar-2012
# Posted on: 27-Apr-2012 15:01:47   

Hi *,

using LLBLGen Pro v.3.5.12.330 from 20.4.2012 20:20.

I have a LINQ query:

CAs.Where(ca => ca.Address.Email == email).Select(ca => new Result()
{
    Address = ca.Address,
    CustomerID = ca.CustomerID,
}

CAs is a collection of Customer2Address entites (properties: ID, Customer, Address, [some not important]) (links to Customer and Address entities), simply speaking M:N with some additional properties.

This is translated into:

SELECT @p2 AS "LPFA_2", "LPA_L2"."ID_Customer" AS "CustomerID", "LPA_L2"."ID_Address" AS "AdressID" FROM ( "ADDRESSES" "LPA_L1"  INNER JOIN "CUSTOMERS_2_ADDRESSES" "LPA_L2"  ON  "LPA_L1"."ID_ADDRESS"="LPA_L2"."ID_ADDRESS") WHERE ( ( ( ( ( "LPA_L1"."EMAIL" = @p3)))))
Name:@p2    Type:Integer    Used Value:1
Name:@p3    Type:VarChar    Used Value:foo@bar.com

Two problems here:

  • The query is not valid Firebird query. You can't have '@p2 AS "LPFA_2"'. If it needs to be there, it has to be i.e. 'cast(@p2 as int) AS "LPFA_2"'.
  • Even if the query would run, it's wrong, because I asked for whole Address entity, and the query isn't returning these columns.
daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 27-Apr-2012 21:30:39   

I wonder why @p2 is there. Please post the complete relevant code. CAs should be an IQueryable, right? Please post how CAs is built.

David Elizondo | LLBLGen Support Team
Posts: 95
Joined: 31-Mar-2012
# Posted on: 27-Apr-2012 21:47:01   

I'm using GetQueryableForEntity with Customer2Address as argument.

daelmo avatar
daelmo
Support Team
Posts: 8245
Joined: 28-Nov-2005
# Posted on: 27-Apr-2012 22:16:14   

Still, please post the complete code, I don't see your metadata object.

David Elizondo | LLBLGen Support Team
Posts: 95
Joined: 31-Mar-2012
# Posted on: 28-Apr-2012 06:58:28   

Complete test case:

var adapter = new DataAccessAdapter(connectionString);
var linq = new LinqMetaData(adapter);
var cas = (SD.LLBLGen.Pro.LinqSupportClasses.DataSource2<Customer2AddressEntity>)linq.GetQueryableForEntity((int)Enum.Parse(typeof(EntityType), typeof(Customer2AddressEntity).Name));
cas.Where(ca => ca.Address.Email == "foo@bar.com").Select(ca => new
{
    Address = ca.Address,
    CustomerID = ca.CustomerID,
}).ToArray();
Otis avatar
Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 29-Apr-2012 19:04:36   

cincura.net wrote:

Hi *,

using LLBLGen Pro v.3.5.12.330 from 20.4.2012 20:20.

I have a LINQ query:

CAs.Where(ca => ca.Address.Email == email).Select(ca => new Result()
{
    Address = ca.Address,
    CustomerID = ca.CustomerID,
}

CAs is a collection of Customer2Address entites (properties: ID, Customer, Address, [some not important]) (links to Customer and Address entities), simply speaking M:N with some additional properties.

This is translated into:

SELECT @p2 AS "LPFA_2", "LPA_L2"."ID_Customer" AS "CustomerID", "LPA_L2"."ID_Address" AS "AdressID" FROM ( "ADDRESSES" "LPA_L1"  INNER JOIN "CUSTOMERS_2_ADDRESSES" "LPA_L2"  ON  "LPA_L1"."ID_ADDRESS"="LPA_L2"."ID_ADDRESS") WHERE ( ( ( ( ( "LPA_L1"."EMAIL" = @p3)))))
Name:@p2    Type:Integer    Used Value:1
Name:@p3    Type:VarChar    Used Value:foo@bar.com

Two problems here:

  • The query is not valid Firebird query. You can't have '@p2 AS "LPFA_2"'. If it needs to be there, it has to be i.e. 'cast(@p2 as int) AS "LPFA_2"'.
  • Even if the query would run, it's wrong, because I asked for whole Address entity, and the query isn't returning these columns.

Could you please explain to me why I have to cast a parameter of type int to int? No database requires that, I'm surprised firebird needs that. Fixing this will be difficult, and I really wonder why as the query to me looks like a correct query.

The reason the '1' is there is because it's a nested query. The address entities are fetched in a second query. The '1' is a place holder constant and is replaced with the address entity in the object array with values when the customer entity is materialized by the projection.

So there's just 1 problem, not two simple_smile Now if there's anything that could make Firebird simply accept that parameter it would be lovely, because adding the cast is a bit of a problem at that stage. The cast is IMHO completely redundant as the parameter is typed so casting is not doing anything: it's already typed int....

Frans Bouma | Lead developer LLBLGen Pro
Posts: 95
Joined: 31-Mar-2012
# Posted on: 30-Apr-2012 08:38:08   

It's because in Firebird query and parameters are decoupled. Hence when the query is prepared, the engine has no idea what the type of this parameter/column will be and how to do the allocations. If it's just a placeholder, why not to put it there directly (if you can't have cast there)?

And we have another problem (so back again to two). Why second query? This can be done with one (simple) query (hence one roundtrip, less latency).

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 30-Apr-2012 10:07:06   

cincura.net wrote:

It's because in Firebird query and parameters are decoupled. Hence when the query is prepared, the engine has no idea what the type of this parameter/column will be and how to do the allocations.

I'm sorry, but I don't buy that. The parameter is declared, it's typed. You're saying that the database doesn't know the type of the parameter defined right there in the query? What if I select a value into an output parameter in the projection? I don't have to cast it in any shape or form. But in this case I have? What's the difference with the constant in the parameter vs. the value put into the parameter?

And what about the value in the parameter? It's an int. Why do I have to cast the parameter, with the value which is an int, to int?

If it's just a placeholder, why not to put it there directly (if you can't have cast there)?

Because our engine doesn't insert constants in SQL, ever.

The main problem is that the linq engine sees the nested query and has to provide a value at that position. The projection engine then gets a lambda which will project the value at the slot of the constant into the property of Customer. The nested query is executed before that projection action and will place the object fetched in that slot, so the projection of Customer will work as if it didn't have a nested query.

To accomplish the constant (which is just there to make sure a slot is reserved in the array of values for the nested query values), it simply uses a constant in a function call (so it inserts "{0}" as function call and the value (in this case 1) as parameter). The problem is that the linq provider has no idea which database it is used on, nor does it have to, as the constant is added to the query always as a parameter. This works even on databases which are in general seen as crap, like MS Access and MySQL.

So for Firebird, the linq engine has to make sure that there's a cast. But it doesn't know when to do that.

And we have another problem (so back again to two). Why second query? This can be done with one (simple) query (hence one roundtrip, less latency).

No this isn't a problem. See it as this: EF always uses the join, not 1 query per node. (our system works similarly as prefetch paths). This crumbles down very quickly as joins make the end result very big in most cases (due to duplication of data) and very wide (due to all the joined columns).

We fetch each nested query as a separate query and merge internally. So if you fetch nested queries using 2 or more paths or as a 1:n, it's already much more efficient (we also use efficient IN clauses for filtering) than joins.

We'll have to look into how to add the constant to the query so it also works on Firebird, however I have to say, this is very silly. We could do it with a mandatory cast, so it's always casted to int, but that would create redundant statements in the SQL query for all other databases. disappointed

Frans Bouma | Lead developer LLBLGen Pro
Posts: 95
Joined: 31-Mar-2012
# Posted on: 30-Apr-2012 10:22:41   

Otis wrote:

I'm sorry, but I don't buy that. The parameter is declared, it's typed. You're saying that the database doesn't know the type of the parameter defined right there in the query?

Yes. Because the parameter descriptions are sent after the query is sent and processed.

Otis wrote:

What if I select a value into an output parameter in the projection? I don't have to cast it in any shape or form.

You mean like "select @p = 1 from ..."? You can't do that in Firebird. Parameters in *queryp are strictly input-only. Though you can do it in PSQL (and in some cases like in RETURNING statement), but then the parameter is bound to metadata of the query or definition itself (in PSQL "declare parameter ...").

Otis wrote:

And what about the value in the parameter? It's an int. Why do I have to cast the parameter, with the value which is an int, to int?

Because server doesn't know the value while query is processed and data structures are allocated. With the explicit cast you'll tell the type. OTOH if you'll wrap the query i.e. in EXECUTE BLOCK and declare the parameters in advance in header, Firebird will be OK with that.

Otis wrote:

Because our engine doesn't insert constants in SQL, ever.

Good. I was just trying to offer some simple way.

Otis wrote:

No this isn't a problem. See it as this: EF always uses the join, not 1 query per node. (our system works similarly as prefetch paths). This crumbles down very quickly as joins make the end result very big in most cases (due to duplication of data) and very wide (due to all the joined columns).

We fetch each nested query as a separate query and merge internally. So if you fetch nested queries using 2 or more paths or as a 1:n, it's already much more efficient (we also use efficient IN clauses for filtering) than joins.

Yes, if you're dumb and you you're dot-jumping through a lot of objects, you're effectively fetching whole database. Anyway, it's OK for me, the application will run on server, where the latency is low, but for desktop etc., I have to remember the behavior and keep eye on it.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 30-Apr-2012 10:30:53   

cincura.net wrote:

Otis wrote:

I'm sorry, but I don't buy that. The parameter is declared, it's typed. You're saying that the database doesn't know the type of the parameter defined right there in the query?

Yes. Because the parameter descriptions are sent after the query is sent and processed.

Otis wrote:

What if I select a value into an output parameter in the projection? I don't have to cast it in any shape or form.

You mean like "select @p = 1 from ..."? You can't do that in Firebird. Parameters in *queryp are strictly input-only. Though you can do it in PSQL (and in some cases like in RETURNING statement), but then the parameter is bound to metadata of the query or definition itself (in PSQL "declare parameter ...").

Oh, indeed, my bad (mondaymorning!), Firebird has only input parameters in general...

Otis wrote:

And what about the value in the parameter? It's an int. Why do I have to cast the parameter, with the value which is an int, to int?

Because server doesn't know the value while query is processed and data structures are allocated. With the explicit cast you'll tell the type. OTOH if you'll wrap the query i.e. in EXECUTE BLOCK and declare the parameters in advance in header, Firebird will be OK with that.

I think that's a bit overkill for this. We'll see what we can do. If nothing works we can go for the explicit cast to int instead of the normal constant. It's a bit more SQL for other databases though...

Otis wrote:

Because our engine doesn't insert constants in SQL, ever.

Good. I was just trying to offer some simple way.

I know. simple_smile In this particular case we could go for this though, as the constant is known, it's always numeric and it's generated by the engine. We'll do some tests.

Otis wrote:

No this isn't a problem. See it as this: EF always uses the join, not 1 query per node. (our system works similarly as prefetch paths). This crumbles down very quickly as joins make the end result very big in most cases (due to duplication of data) and very wide (due to all the joined columns). We fetch each nested query as a separate query and merge internally. So if you fetch nested queries using 2 or more paths or as a 1:n, it's already much more efficient (we also use efficient IN clauses for filtering) than joins.

Yes, if you're dumb and you you're dot-jumping through a lot of objects, you're effectively fetching whole database. Anyway, it's OK for me, the application will run on server, where the latency is low, but for desktop etc., I have to remember the behavior and keep eye on it.

Not necessarily slower. We do all our testing on a database in the network and it's in general more efficient to do it this way, also over a network. The only situation where things could be equal with a join is with just 1 nested query related over a m:1 relationship. Though the projection engine gets more complex because of that and as this is generic code (as in: it works with all kinds of queries), it's not possible to decide 'oh for this query do it this way, for that query do it that way), so we opted for the way which is more efficient (also over the network) in most cases.

I doubt the query will be much slower over the network though, as it's done this way: - first the parent is fetched, i.e customer - the customer rows are used for producing the filter for the nested query, i.e. address - only the addresses which match with the customers fetched are read - merge takes place using hashing, same as inside DB

so the amount of data passed over the network is equal or smaller (in case of 1:n, m:n) to the amount of data passed over the network when using a join. The calling code is doing a bit more because it has to merge the resultsets, but that's done very efficiently.

As your nested query is a m:n relationship, your join resultset will have a lot of duplicate rows for the parent. This could increase the amount of data returned to the client a lot. So it's not said that a join is more efficient. simple_smile

Frans Bouma | Lead developer LLBLGen Pro
Posts: 95
Joined: 31-Mar-2012
# Posted on: 30-Apr-2012 10:33:32   

We'll see what we can do. If nothing works we can go for the explicit cast to int instead of the normal constant. It's a bit more SQL for other databases though...

Good. Let me know if you need something.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 30-Apr-2012 10:36:44   

As the insertion of the cast is useless for other databases and the constant is known and generated by the framework, we'll try that in this particular case: we'll insert the constant directly.

The fix will be available tomorrow.

Frans Bouma | Lead developer LLBLGen Pro
Posts: 95
Joined: 31-Mar-2012
# Posted on: 30-Apr-2012 10:37:54   

Otis wrote:

The fix will be available tomorrow.

Great! Good work.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39908
Joined: 17-Aug-2003
# Posted on: 01-May-2012 11:52:33   

Fixed in attached build.

2 dlls are there, as queryspec has the same trick and also needed to be fixed. Not sure if this covers every aspect of using parameters as constants in the projection but it does fix your nested query problem. simple_smile

Attachments
Filename File size Added on Approval
RTLibFix_05012012.zip 144,216 01-May-2012 11:52.41 Approved
Frans Bouma | Lead developer LLBLGen Pro
Posts: 95
Joined: 31-Mar-2012
# Posted on: 01-May-2012 14:36:58   

Thanks. I can confirm the query is now correct.