Paged collection fetch with prefetch path - path query retrieves all records

Posts   
 
    
Robert.W
User
Posts: 79
Joined: 19-Jun-2007
# Posted on: 25-Oct-2007 18:09:18   

Hi,

I have the following relation: Invoice - Customer (m:1) in the database. I want to fetch a page of invoices with corresponding customer data. When I run the following code the trace shows that the query responsible for getting the customer data gets all 1000000 records instead of just 100 needed for fetched invoices:

adapter.FetchEntityCollection( invoices, null, 0, null, BuildCustomerPrefetchPath(), 1 /* first page /, 100 / 100 records per page */);

    private static IPrefetchPath2 BuildCustomerPrefetchPath()
    {
        IPrefetchPath2 prefetchPath = new PrefetchPath2((int)EntityType.InvoiceEntity);
        prefetchPath.Add(InvoiceEntity.PrefetchPathCustomer);
        return prefetchPath;
    }

Setting the ParameterisedPrefetchPathThreshold to 101 solves the issue but, unless I'm missing something obvious, the default behavior looks like a bug.

I'm using the 2.5 Final version of LLBLGen with adapter template.

Best regards, Robert Wilczynski.

Walaa avatar
Walaa
Support Team
Posts: 14946
Joined: 21-Aug-2005
# Posted on: 26-Oct-2007 09:47:02   

It's not a bug, and it's mentioned in the docs:

Prefetch Paths and Paging Starting with version 1.0.2005.1, LLBLGen Pro supports paging functionality in combination of Prefetch Paths. Due to the complex nature of the prefetch path queries, especially with per-node filters, sort expressions etc., paging wasn't possible, though due to the DataAccessAdapter.ParameterisedPrefetchPathThreshold threshold setting (See earlier in this section: Optimizing Prefetch Paths), paging can be made possible. If you want to utilize paging in combination of prefetch paths, be sure to set DataAccessAdapter.ParameterisedPrefetchPathThreshold to a value larger than the page size you want to use. You can use paging in combination of prefetch path with a page size larger than DataAccessAdapter.ParameterisedPrefetchPathThreshold but it will be less efficient.

To use paging in combination of prefetch paths, use one of the overloads you'd normally use for fetching data using a prefetch path, which accept a page size and page number as well.

Robert.W
User
Posts: 79
Joined: 19-Jun-2007
# Posted on: 26-Oct-2007 10:21:23   

Walaa,

Thanks for your answer. I did read this part of documentation before posting at least a couple of times.

You can use paging in combination of prefetch path with a page size larger than DataAccessAdapter.ParameterisedPrefetchPathThreshold but it will be less efficient.

"less efficient" is somehow misleading. It's not less efficient but it brings the whole system down to it's knees with 1000000 records being retrieved simple_smile - that's why I wanted to make sure I'm not missing anything. Too bad that plain old joins are out of reach of LLBLGen for now.

Robert.

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39590
Joined: 17-Aug-2003
# Posted on: 26-Oct-2007 10:47:48   

Robert.W wrote:

Walaa,

Thanks for your answer. I did read this part of documentation before posting at least a couple of times.

You can use paging in combination of prefetch path with a page size larger than DataAccessAdapter.ParameterisedPrefetchPathThreshold but it will be less efficient.

"less efficient" is somehow misleading. It's not less efficient but it brings the whole system down to it's knees with 1000000 records being retrieved simple_smile - that's why I wanted to make sure I'm not missing anything. Too bad that plain old joins are out of reach of LLBLGen for now.

the efficiency indeed depends on the # of rows fetched. The thing is that if the threshold is smaller than the page size, the prefetch path query should use a subquery instead of an IN query with values. Take this example: you're fetching page 2, with size 100, of all customers from germany with their orders. Threshold is set to 50. For paging, the pager has to create a special paging query, e.g. with a CTE, Temptable etc. If you wouldn't be using paging, the query for the orders would look like: ('*' used for simplicity here) select * from orders where customerid in (select customerid from customers where country = @country)

However, as you're using paging, only the orders of the customers on the page fetched should be read. With a threshold > pagesize, the query is filtering on the client, which orders to take. This can take a while or be fast (depending on the # of orders), as there's no other way: if your page has the size of 10000, you can't use an IN query anyway, a query can't have that many parameters.

So the key is to page with pages within the range of the threshold, so the query can use an in query instead: select * from orders where customerid in (@cid1, @cid2, ... , @cid_pagesize_)

as the paging query used to fetch the customers can't be used as a subquery, that's the main problem.

A different solution would have been (but this isn't implemented currently) to create a temptable for the customerid's, store them there, and filter against that one. In this particular case a temptable would have been better: select * from orders where customerid in (select customerid from #customerids)

But alas, that's not available now, so you've to work with the threshold to get good results with paging + prefetch paths. So if you're doing paging without prefetch paths: no worries. If you're doing paging WITH prefetch paths in 1 query, be sure the page size is in range of the threshold (if you're using compound PK's, be sure to use the formula (#fields*pagesize)<threshold.

Frans Bouma | Lead developer LLBLGen Pro
Robert.W
User
Posts: 79
Joined: 19-Jun-2007
# Posted on: 29-Oct-2007 20:48:26   

Frans,

Thanks for an in depth explanation. I doubt whether my pages will ever reach more than 500 items so I will just stick with the current approach.

Indeed it seems temporary (or variable) tables would be enough for more complex paging needs - especially if both the main query and prefetch path queries were handled with a single DB roundtrip which should be possible with SQL Server (but judging by my modest experience with your product, LLBLGen somehow refuses to take advantage of that feature) - hopefully this will get implemented in one of the cycles.

Robert.