Implementing Paging

smcculloch
User
Posts: 7
Joined: 01-Oct-2003
# Posted on: 01-Oct-2003 09:00:40   

Is a stored proc the only way to implement custom paging (for an ASP.NET app)?

Otis
LLBLGen Pro Team
Posts: 39588
Joined: 17-Aug-2003
# Posted on: 01-Oct-2003 10:15:19   

Paging is one of those things that can be done in a variety of ways, and all of them suck. It comes down to:
- either read everything into memory at once and do the paging yourself,
- or read the data from the database one page at a time.

With a lot of data, option 2 is better, but with a small amount of data, option 1 is better. This forum reads all messages of a thread at once into a datatable, using a stored procedure which queries a view (it was built with LLBLGen 1.x; today I would just create a typed view). The BL routine that does this then looks at the start-message index of the current page and copies the rows for that page from the datatable with all the messages of the thread into a new datatable, using the start-message index and the number of messages per page.
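A minimal sketch of that copy step, assuming the full thread is already in a DataTable (the names MessagePager, pageNumber, and pageSize are illustrative):

using System.Data;

public class MessagePager
{
    // Copy one page of rows from the full result set into a new DataTable.
    // pageNumber is zero-based; pageSize is the number of messages per page.
    public static DataTable GetPage(DataTable allMessages, int pageNumber, int pageSize)
    {
        DataTable page = allMessages.Clone(); // same schema, no rows
        int start = pageNumber * pageSize;
        for (int i = start; i < start + pageSize && i < allMessages.Rows.Count; i++)
        {
            page.ImportRow(allMessages.Rows[i]);
        }
        return page;
    }
}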

This is of course not that efficient for threads with thousands of messages and a lot of visitors.

You can also do this with the entity objects stored in an entity collection, using CopyTo:

CustomerCollection customers = new CustomerCollection();
customers.GetMulti(null);
customers.Items.CopyTo( // fill in the parameters );

Copy the entities (which are messages in your case) that are on the current page into an array and bind that array to the repeater control on your messages.aspx page.
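Filled in, that could look like the sketch below. It assumes Items exposes the ArrayList-style CopyTo(index, array, arrayIndex, count) overload, and MessageCollection/MessageEntity stand in for your generated classes:

// A sketch: copy the entities of one page into an array for databinding.
// pageNumber is zero-based; MessageCollection/MessageEntity are illustrative
// names for the classes generated for your message table.
MessageCollection messages = new MessageCollection();
messages.GetMulti(null); // read all messages at once

int pageSize = 25;
int pageNumber = 2;
int remaining = messages.Count - (pageNumber * pageSize);
int count = Math.Max(0, Math.Min(pageSize, remaining));

MessageEntity[] page = new MessageEntity[count];
messages.Items.CopyTo(pageNumber * pageSize, page, 0, count);

// Bind 'page' to the Repeater on messages.aspx:
// messagesRepeater.DataSource = page;
// messagesRepeater.DataBind();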

Another way of doing this is to have an index column in the row (or prepended to it), produced by a stored procedure (e.g. using a temptable or another trick). When you then read all these rows into a datatable (using a view or a call to the stored procedure), you can create a DataView object with a filter on the index column. This is practically the same as what I mentioned above, but it uses other objects and is perhaps slower.
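For example (a sketch; it assumes the rows have already been read into a DataTable called table, including the generated Index column):

// Filter an already-filled DataTable down to one page via a DataView.
// 'table' holds all rows plus the stored-procedure-generated Index column.
DataView pageView = new DataView(table);
pageView.RowFilter = "[Index] > 20 AND [Index] <= 30"; // rows 21..30: page 3 at 10 rows/page
// Bind pageView to your grid or repeater as usual.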

You can also re-query the messages for each page, which is not that efficient for threads with a small number of messages but can be more efficient for threads with a lot of messages. As a last resort, you can create a 'helper table' which contains index and message information per thread, and which thus lets you select a set of rows per page quickly. This requires some extra database logic (mostly triggers) to keep that table up to date.

Frans Bouma | Lead developer LLBLGen Pro
Duncan
User
Posts: 9
Joined: 17-Sep-2003
# Posted on: 01-Oct-2003 11:30:33   

Hi,

The solution I use for paging, which seems to work quite well for reasonably sized result sets, is to use a subquery. It has the advantage of not requiring a temp table or reading all data at once. The general idea:

SELECT TOP 10 ProductID, Name
FROM Product
WHERE CategoryID = 3
  AND ProductID NOT IN
      (SELECT TOP 20 ProductID
       FROM Product
       WHERE CategoryID = 3
       ORDER BY Name)
ORDER BY Name

The outer TOP defines the page size; the inner TOP is the page size multiplied by the zero-based page number (here 10 * 2, so the first 20 rows are skipped).

For this to work, the WHERE clauses of the inner and outer queries must be identical (apart from the NOT IN predicate itself), and the ORDER BY clauses of the inner and outer queries must be identical as well.
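In C# the page math can be wired up by building the statement, roughly like this (a sketch; PagedQueryBuilder is an illustrative helper, and pageSize/pageNumber must be validated integers, never raw user input):

using System;

public class PagedQueryBuilder
{
    // Build the paged query for a zero-based page number (a sketch).
    // pageSize and pageNumber must be validated integers; never paste
    // unchecked user input into SQL.
    public static string BuildPagedQuery(int pageSize, int pageNumber)
    {
        int rowsToSkip = pageSize * pageNumber;
        return String.Format(
            "SELECT TOP {0} ProductID, Name FROM Product " +
            "WHERE CategoryID = 3 AND ProductID NOT IN " +
            "(SELECT TOP {1} ProductID FROM Product " +
            " WHERE CategoryID = 3 ORDER BY Name) " +
            "ORDER BY Name",
            pageSize, rowsToSkip);
    }
}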

Duncan

Posts: 15
Joined: 01-Oct-2003
# Posted on: 01-Oct-2003 11:54:09   

Duncan wrote:

Hi,

The solution I use for paging, which seems to work quite well for reasonably sized result sets, is to use a subquery. It has the advantage of not requiring a temp table or reading all data at once. The general idea:

SELECT TOP 10 ProductID, Name
FROM Product
WHERE CategoryID = 3
  AND ProductID NOT IN
      (SELECT TOP 20 ProductID
       FROM Product
       WHERE CategoryID = 3
       ORDER BY Name)
ORDER BY Name

The outer TOP defines the page size; the inner TOP is the page size multiplied by the zero-based page number.

For this to work, the WHERE clauses of the inner and outer queries must be identical (apart from the NOT IN predicate itself), and the ORDER BY clauses of the inner and outer queries must be identical as well.

Duncan

Can you construct that as a predicate expression though?

I think I'll use the customers.Items.CopyTo( // fill in the parameters ); method. Thanks for all the help once again!

Otis
LLBLGen Pro Team
Posts: 39588
Joined: 17-Aug-2003
# Posted on: 01-Oct-2003 12:25:44   

Duncan wrote:

Hi, The solution I use for paging, which seems to work quite well for reasonably sized result sets, is to use a subquery. It has the advantage of not requiring a temp table or reading all data at once. The general idea:

SELECT TOP 10 ProductID, Name FROM Product
WHERE CategoryID = 3 AND ProductID NOT IN 
       (SELECT TOP 20 ProductID FROM Product 
            WHERE CategoryID = 3   
            ORDER BY Name)
ORDER BY Name

The outer TOP defines the page size; the inner TOP is the page size multiplied by the zero-based page number. For this to work, the WHERE clauses of the inner and outer queries must be identical (apart from the NOT IN predicate itself), and the ORDER BY clauses of the inner and outer queries must be identical as well.

Neat! However, this requires SQL construction for the inner TOP clause, and I think it will not be that efficient, due to the double full table scan for the TOP clause in combination with the ORDER BY. Still, it's cleverly found; I didn't think of this.

With SqlServer 2000 you can also use a user-defined function which returns a table variable holding the rows, indexed by a new identity column (which is also the only key). That way you don't rely on tempdb but create the table completely in memory, the same way the subquery does. That solution is also generic: you can pass a parameter that filters on the identity column in the table variable returned from the user-defined function, saying where the current page starts (plus a parameter for how many rows, of course).

Scott: predicates don't use subqueries, so you can't construct this query with predicates.
You can, however, do this. The example below targets Northwind. Use Query Analyzer to quickly get the column definitions (right-click the table and select 'Script object to new window as Create'). It adds a new column, Index, as an identity, and doesn't define a key, because we'll select all rows anyway.


CREATE FUNCTION fnCustomersIndexer ()
RETURNS @indexedCustomers TABLE ( 
    [Index] [int] IDENTITY (1, 1) NOT NULL,
    [CustomerID] [nchar] (5) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [CompanyName] [nvarchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
    [ContactName] [nvarchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [ContactTitle] [nvarchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [Address] [nvarchar] (60) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [City] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [Region] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [PostalCode] [nvarchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [Country] [nvarchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [Phone] [nvarchar] (24) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    [Fax] [nvarchar] (24) COLLATE SQL_Latin1_General_CP1_CI_AS NULL)
AS
BEGIN
    -- insert all customers from table into this table variable
    INSERT @indexedCustomers 
    SELECT * FROM Customers
    -- done
    RETURN
END

This function creates an in-memory table object with a new column, 'Index'. You can now filter on this index, using:

SELECT * FROM fnCustomersIndexer()
WHERE [Index] > 10 AND [Index] < 20

Or you can create a view:


CREATE VIEW vwTest
AS
SELECT * FROM fnCustomersIndexer()

which you can then use with predicate filters!

Frans Bouma | Lead developer LLBLGen Pro
Duncan
User
Posts: 9
Joined: 17-Sep-2003
# Posted on: 01-Oct-2003 13:05:25   

Hi,

I usually use a function to return the table, and because of SQL Server's caching it works out fairly efficiently. I've used it on tables of about 10,000 rows and it performs pretty well. For really large tables it's obviously not ideal, but with proper keys and indexes on the search columns it's probably better than loading 10,000 rows into memory or creating a temp table.

The problem with a lot of the paging I've seen that uses primary keys is that it has hidden dependencies: selecting between key x and key y relies on a fixed sort order, and on no rows having been deleted in that range.

Duncan

Otis
LLBLGen Pro Team
Posts: 39588
Joined: 17-Aug-2003
# Posted on: 01-Oct-2003 13:20:59   

Duncan wrote:

I usually use a function to return the table, and because of SQL Server's caching it works out fairly efficiently. I've used it on tables of about 10,000 rows and it performs pretty well. For really large tables it's obviously not ideal, but with proper keys and indexes on the search columns it's probably better than loading 10,000 rows into memory or creating a temp table.

Oh definitely. The execution plan of your query is not bad; it uses index scans, so it is not that expensive. TOP is generally a little expensive, but what isn't?

The problem with a lot of the paging I've seen that uses primary keys is that it has hidden dependencies: selecting between key x and key y relies on a fixed sort order, and on no rows having been deleted in that range.

True. With clustered indexes you can be sure the keys are sorted, but indeed, relying on such under-the-hood details can be tricky. I'll see what I can do to add paging functionality to the library.

Frans Bouma | Lead developer LLBLGen Pro
bonder
User
Posts: 36
Joined: 23-Sep-2003
# Posted on: 03-Oct-2003 04:45:42   

Otis wrote:

Or you can create a view:


CREATE VIEW vwTest
AS
SELECT * FROM fnCustomersIndexer()

which you can then use with predicate filters!

So do we create a TypedView around this view?

--Bruce

Otis
LLBLGen Pro Team
Posts: 39588
Joined: 17-Aug-2003
# Posted on: 03-Oct-2003 09:17:52   

bonder wrote:

Otis wrote:

Or you can create a view:


CREATE VIEW vwTest
AS
SELECT * FROM fnCustomersIndexer()

which you can then use with predicate filters!

So do we create a TypedView around this view?

Yes. That will create a class which you can use with predicates to filter the rows, and thus implement paging right in the database by filtering on the Index column.
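Something along these lines (a rough sketch only: the exact Fill overload and the generated field/class names depend on your LLBLGen Pro version and project, so treat every name here as illustrative and check your generated code):

// A sketch: fill the typed view with one page of rows by filtering
// on the Index column. VwTestTypedView and VwTestFields are illustrative
// names for the generated typed view class and its field objects.
VwTestTypedView page = new VwTestTypedView();
IPredicateExpression filter = new PredicateExpression();
filter.Add(new FieldCompareValuePredicate(
    VwTestFields.Index, ComparisonOperator.GreaterThan, 20));
filter.AddWithAnd(new FieldCompareValuePredicate(
    VwTestFields.Index, ComparisonOperator.LessEqual, 30));
page.Fill(0, null, true, filter); // rows 21..30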

Frans Bouma | Lead developer LLBLGen Pro
bonder
User
Posts: 36
Joined: 23-Sep-2003
# Posted on: 12-Oct-2003 04:23:06   

How about this instead of a view and sproc:

 CREATE PROCEDURE [dbo].[spGetCustomersWithPaging]
    @recordsByPage int = 30,
    @currentPage int = 1
AS
    BEGIN
        SET NOCOUNT ON

        DECLARE @temp TABLE (
            [Index] [int] IDENTITY (1, 1) NOT NULL,
            [CustomerID] char (6) COLLATE SQL_Latin1_General_CP1_CI_AS
        )

        INSERT INTO @temp SELECT CustomerID FROM Customer

        SELECT * FROM [Customer]
            WHERE
                CustomerID IN (
                    SELECT CustomerID
                    FROM @temp
                    WHERE
                        [INDEX] > (@currentPage-1)*@recordsByPage
                        AND [INDEX] <= @currentPage*@recordsByPage
                )
    END

GO

It should take up less memory, and it also results in fewer database objects to maintain.
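Calling that proc from ADO.NET might look like this (a sketch; the connection string is a placeholder for your own):

using System.Data;
using System.Data.SqlClient;

// A sketch: fetch page 3 (30 rows per page) via the paging proc.
SqlConnection connection = new SqlConnection("..."); // your connection string here
SqlCommand command = new SqlCommand("spGetCustomersWithPaging", connection);
command.CommandType = CommandType.StoredProcedure;
command.Parameters.Add("@recordsByPage", SqlDbType.Int).Value = 30;
command.Parameters.Add("@currentPage", SqlDbType.Int).Value = 3;

DataTable customers = new DataTable();
SqlDataAdapter adapter = new SqlDataAdapter(command);
adapter.Fill(customers); // the adapter opens and closes the connection itself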

--Bruce

smcculloch
User
Posts: 7
Joined: 01-Oct-2003
# Posted on: 17-Oct-2003 04:26:08   

bonder wrote:

How about this instead of a view and sproc:

 CREATE PROCEDURE [dbo].[spGetCustomersWithPaging]
    @recordsByPage int = 30,
    @currentPage int = 1
AS
    BEGIN
        SET NOCOUNT ON

        DECLARE @temp TABLE (
            [Index] [int] IDENTITY (1, 1) NOT NULL,
            [CustomerID] char (6) COLLATE SQL_Latin1_General_CP1_CI_AS
        )

        INSERT INTO @temp SELECT CustomerID FROM Customer

        SELECT * FROM [Customer]
            WHERE
                CustomerID IN (
                    SELECT CustomerID
                    FROM @temp
                    WHERE
                        [INDEX] > (@currentPage-1)*@recordsByPage
                        AND [INDEX] <= @currentPage*@recordsByPage
                )
    END

GO

It should take up less memory, and it also results in fewer database objects to maintain.

--Bruce

Thanks guys, much appreciated!

swallace
User
Posts: 648
Joined: 18-Aug-2003
# Posted on: 17-Oct-2003 15:31:06   

Me too, thanks much for that excellent SP!

swallace
User
Posts: 648
Joined: 18-Aug-2003
# Posted on: 17-Oct-2003 16:16:26   

One change: be sure to SET NOCOUNT OFF again before the last END.

Otis
LLBLGen Pro Team
Posts: 39588
Joined: 17-Aug-2003
# Posted on: 17-Oct-2003 16:35:31   

swallace wrote:

One change: be sure to SET NOCOUNT OFF again before the last END.

The NOCOUNT setting only lives until the end of the connection (afaik), so most likely the next query will not have NOCOUNT set to ON. However, it is indeed something to take into account (and don't do what I did and define NOCOUNT ON in the connection properties in SqlServer).

Frans Bouma | Lead developer LLBLGen Pro
Otis
LLBLGen Pro Team
Posts: 39588
Joined: 17-Aug-2003
# Posted on: 23-Oct-2003 21:07:22   

Just an FYI about table variables: I just ran into this blurb in the SqlServer Magazine newsletter. I quote the whole Q and A:

  • TIP: TABLE VARIABLES VS. TEMPORARY TABLES (contributed by Brian Moran, savvy@sqlmag.com)

Q. In general, is there a performance difference between using table variables and using temporary tables in SQL Server 2000?

A. There can be a big performance difference between using table variables and temporary tables. In most cases, temporary tables are faster than table variables. I took the following tip from the private SQL Server MVP newsgroup and received permission from Microsoft to share it with you. One MVP noticed that although queries using table variables didn't generate parallel query plans on a large SMP box, similar queries using temporary tables (local or global) and running under the same circumstances did generate parallel plans.

A senior member of the SQL Server development team told me that table variables use internal metadata in a way that prevents the engine from using a table variable within a parallel query. He also said that SQL Server maintains statistics for queries that use temporary tables but not for queries that use table variables. Without statistics, SQL Server might choose a poor processing plan for a query that contains a table variable. The development team member added that you should limit your use of SQL Server 2000 table variables to reasonably small queries and data sets and use temporary tables for larger data sets.

This advice about table variables contradicts some past Microsoft information, which said that table variables were faster because they're created in memory. However, table variables can incur disk I/O in ways similar to temporary tables.

In deciding whether you should use table variables or temporary tables, benchmark both in your environment. I suspect that temporary tables will provide better performance on larger result sets in most cases. For more information about table variables, see the Microsoft article "INF: Frequently Asked Questions--SQL Server 2000--Table Variables" at http://support.microsoft.com/default.aspx?scid=kb;en-us;305977 .

Probably not that important, but I saw several solutions in this thread: one with a table variable (the neatest one) and one with temp tables.

Frans Bouma | Lead developer LLBLGen Pro