Performance under load

Jackk100
User
Posts: 48
Joined: 11-Jan-2005
# Posted on: 27-Jan-2005 04:52:44   

Hi,

I recently took the plunge... after 14 days of using LLBLGen daily we decided to purchase it for use in an enterprise-grade web app we're building. I'm looking for some additional "comfort" regarding performance under load. Since there's a lot of generated code, I'm wondering if there's anyone out there with advice/stories about using LLBLGen in a medium/heavy load application (adapter model).

Granted, LLBLGen'd code is only part of the architectural picture, but it's the part that I'm wondering about. Success stories? Tips? Potential angst warnings?

Thx, Jack

ps. Frans, FWIW, I'm really impressed with what you've built. Am happy to be taking advantage of all your hard work. Thx.

Otis
LLBLGen Pro Team
Posts: 39617
Joined: 17-Aug-2003
# Posted on: 27-Jan-2005 09:35:13   

Jackk100 wrote:

Hi,

I recently took the plunge... after 14 days of using LLBLGen daily we decided to purchase it for use in an enterprise-grade web app we're building. I'm looking for some additional "comfort" regarding performance under load. Since there's a lot of generated code, I'm wondering if there's anyone out there with advice/stories about using LLBLGen in a medium/heavy load application (adapter model).

Granted, LLBLGen'd code is only part of the architectural picture, but it's the part that I'm wondering about. Success stories? Tips? Potential angst warnings?

Code is shared among all instantiated objects in .NET, data is not. So even if your generated code is large, it won't eat up more memory than the size of the assembly.

The queries executed against the database are tweakable from the code you write. For example, if you have to retrieve entities of type A using a filter on entities of type B, you can do that in two ways: with joins or with a subquery. Joins are faster with small(er) tables; the subquery is faster with large tables. You can specify the join-based filter by using the relation objects (Entity.Relations.RelatedEntityUsing...) and the subquery-based filter with a FieldCompareSetPredicate.

Filters spanning 3 or more entities in particular can be slow with joins on tables with a lot of data.
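
To make that concrete, here is a rough sketch of what the two styles can look like in adapter code, using hypothetical Customer/Order entities. The relation and field names are placeholders for whatever your generated code contains, and constructor overloads differ a bit between runtime library versions, so treat it as a sketch rather than exact code:

// 1) Join style: filter Customer entities on a field of the related Order entity.
IRelationPredicateBucket joinFilter = new RelationPredicateBucket();
joinFilter.Relations.Add(CustomerEntity.Relations.OrderEntityUsingCustomerId);   // produces the JOIN
joinFilter.PredicateExpression.Add(new FieldCompareValuePredicate(
    OrderFields.ShippingDate, null, ComparisonOperator.Equal, new DateTime(2005, 1, 27)));

// 2) Subquery style: CustomerId IN (SELECT CustomerId FROM Order WHERE ShippingDate = @p)
IRelationPredicateBucket subqueryFilter = new RelationPredicateBucket();
subqueryFilter.PredicateExpression.Add(new FieldCompareSetPredicate(
    CustomerFields.CustomerId, null,     // field to test
    OrderFields.CustomerId, null,        // field selected in the subquery
    SetOperator.In,
    new FieldCompareValuePredicate(OrderFields.ShippingDate, null, ComparisonOperator.Equal, new DateTime(2005, 1, 27))));

EntityCollection customers = new EntityCollection(new CustomerEntityFactory());
using (DataAccessAdapter adapter = new DataAccessAdapter())
{
    adapter.FetchEntityCollection(customers, subqueryFilter);   // or joinFilter
}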

As your application is a web application, you have to read all the data for a page up front, so a prefetch path is often the best way to do this. Collect the data to save and persist it as one graph to the database; don't read data during a transaction.
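
A minimal sketch of such an up-front read with a prefetch path in the adapter model (again with made-up Customer/Order names; the exact classes depend on your generated code and runtime version):

// Fetch customers plus each customer's orders in 2 queries, before rendering the page.
IPrefetchPath2 path = new PrefetchPath2((int)EntityType.CustomerEntity);
path.Add(CustomerEntity.PrefetchPathOrders);

EntityCollection customers = new EntityCollection(new CustomerEntityFactory());
using (DataAccessAdapter adapter = new DataAccessAdapter())
{
    // null filter bucket = fetch all customers; pass a RelationPredicateBucket to filter.
    adapter.FetchEntityCollection(customers, null, path);
}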

A web application often benefits greatly from caching. Don't use the viewstate for storing object graphs: it's slow (.NET isn't that fast at deserializing an object graph from the viewstate) and it bloats the pages a lot. Use the session for that. I'm not sure what the app does, but if it's a CMS-like application where the user is mostly passive, i.e. a lot of pages are the same for a lot of visitors, use the slashdot style of site caching: time how long it takes to render the complete site. Say this is 2 seconds. Set a timer so you re-render the complete site into the cache every 4 seconds. So every 4 seconds the site is refreshed, and each visitor gets the same pages for those 4 seconds. With this you've decoupled the number of visitors from the logic building up the pages. You now only have to make sure that the web server can handle the load it gets, as the web server feeds pages to the visitors from the cache.
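
On the viewstate vs. session point above, a tiny illustration (the key name and collection are made up):

// Don't: push the whole fetched graph into the page's viewstate.
// ViewState["CurrentCustomers"] = customers;   // bloats the page, slow to deserialize

// Do: keep the graph server-side between requests.
Session["CurrentCustomers"] = customers;

// ...and on a later request / postback:
EntityCollection customers = (EntityCollection)Session["CurrentCustomers"];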

This can greatly boost site performance under load. And always write your pages as batch-processing blocks: 1) get data, 2) use the data to render the page, 3) done; and on postback: 1) check the input, 2) collect the data to send down to the BL tier, 3) call the BL tier and wait for feedback, 4) process the feedback for the user, and 5) render the page with that feedback. This way you avoid chatty apps that call back and forth between tiers a lot, which can cause hidden bottlenecks, for example when a transaction is in progress and some reads are done that stall the process due to the locks held by the transaction (SQL Server).
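
A skeleton of that batch-processing shape in a code-behind; the private helpers are stand-ins for your own BL-tier calls and rendering code, so the names and types are illustrative only:

using System;
using System.Web.UI;

public class OrdersPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        if (!IsPostBack)
        {
            object data = GetPageData();       // 1) get all data for the page in one call
            RenderPage(data);                  // 2) use the data to render the page
        }                                      // 3) done: no further calls to the data layer
    }

    protected void Save_Click(object sender, EventArgs e)
    {
        if (!ValidateInput()) return;          // 1) check input, show errors, bail out
        object changes = CollectChanges();     // 2) collect everything to send down to the BL tier
        object result = SaveChanges(changes);  // 3) one call to the BL tier, wait for feedback
        RenderFeedback(result);                // 4) process the feedback  5) render the page with it
    }

    private object GetPageData() { return null; }               // e.g. an adapter fetch with a prefetch path
    private void RenderPage(object data) { }                     // databind the controls
    private bool ValidateInput() { return true; }
    private object CollectChanges() { return null; }
    private object SaveChanges(object changes) { return null; }  // the single BL-tier call
    private void RenderFeedback(object result) { }
}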

Also check this thread: http://www.llblgen.com/tinyforum/Messages.aspx?ThreadID=2219 about a discussion on prefetch paths.

ps. Frans, FWIW, I'm really impressed with what you've built. Am happy to be taking advantage of all your hard work. Thx.

:) Thanks Jack! :)

Frans Bouma | Lead developer LLBLGen Pro
Jackk100
User
Posts: 48
Joined: 11-Jan-2005
# Posted on: 04-Feb-2005 03:59:13   

The queries executed against the database are tweakable from the code you write. For example, if you have to retrieve entities of type A using a filter on entities of type B, you can do that in two ways: with joins or with a subquery. Joins are faster with small(er) tables; the subquery is faster with large tables.

I think the subquery thing is what I'm wondering most about. In the past, I've created Stored Procs to get related data, then done some manual cobbling together (in datasets or looping datareaders or something), or just done a JOIN and got more data than I needed. When LLBLGen uses subqueries to populate collection entities/properties in a "parent entity", it basically runs the same query n times, depending on how deep the nesting is (if I understand it correctly). For example, if I have 3 related tables:

Survey
- SurveyPage
- - SurveyItems

and I fetch data from all three by using prefetch paths under the Survey entity, the Survey data is SELECTed using something like "where SurveyID = 1", then the SurveyPage data is SELECTed using the same query as a subquery, and the SurveyItem data is SELECTed using the first query and the 2nd query as subqueries... essentially running the first query 3 times.

I've watched the queries in SQL Profiler and I see that LLBLGen will not try to SELECT from a child table (I'm using adapter) if nothing is returned from its parent's SELECT, which is cool, but I am wondering about the impact of all those "extra" (redundant) queries that will occur when I use prefetch paths to get child data.

Filters spanning 3 or more entities in particular can be slow with joins on tables with a lot of data.

That I'm pretty familiar with, as well as SQL Execution analysis, perf tuning, etc.... but this will be the first time I'm architecting something that will do the subqueries described above. I realize that I can index-tune, and that the queries should execute very quickly since they "just ran" a split second ago, but..... since I haven't load-tested and can't really until this project is farther along, I'm fishing for words from the trenches. Any takers?

As your application is a web application, you have to read all the data for a page up front, so a prefetch path is often the best way to do this. Collect the data to save and persist it as one graph to the database; don't read data during a transaction.

Good, am doing that.

A web application often benefits greatly from caching. Don't use the viewstate for storing object graphs: it's slow (.NET isn't that fast at deserializing an object graph from the viewstate) and it bloats the pages a lot. Use the session for that.

ditto on that. :-)

I'm not sure what the app does, but if it's a CMS-like application where the user is mostly passive, i.e. a lot of pages are the same for a lot of visitors, use the slashdot style of site caching: time how long it takes to render the complete site. Say this is 2 seconds. Set a timer so you re-render the complete site into the cache every 4 seconds. So every 4 seconds the site is refreshed, and each visitor gets the same pages for those 4 seconds.

This project is both: some sections are form-heavy with constantly changing data, other sections are content-heavy with rarely changing data. We're defining our caching rules as we go along, but I'm not sure what you mean by the "slashdot style" and using timers to re-render the complete site to the cache. Can you explain or give me a pointer?

This can greatly boost site performance under load. And always write your pages as batch-processing blocks: 1) get data, 2) use the data to render the page, 3) done; and on postback: 1) check the input, 2) collect the data to send down to the BL tier, 3) call the BL tier and wait for feedback, 4) process the feedback for the user, and 5) render the page with that feedback. This way you avoid chatty apps that call back and forth between tiers a lot, which can cause hidden bottlenecks, for example when a transaction is in progress and some reads are done that stall the process due to the locks held by the transaction (SQL Server).

Ditto on that, too :-) (my 2 cents for anyone else doing this stuff.... I confirm! <g>)

  • Jack
Otis
LLBLGen Pro Team
Posts: 39617
Joined: 17-Aug-2003
# Posted on: 04-Feb-2005 10:34:47   

Jackk100 wrote:

The queries executed against the database are tweakable from the code you write. For example, if you have to retrieve entities of type A using a filter on entities of type B, you can do that in two ways: with joins or with a subquery. Joins are faster with small(er) tables; the subquery is faster with large tables.

I think the subquery thing is what I'm wondering most about. In the past, I've created Stored Procs to get related data, then done some manual cobbling together (in datasets or looping datareaders or something), or just done a JOIN and got more data than I needed. When LLBLGen uses subqueries to populate collection entities/properties in a "parent entity", it basically runs the same query n times, depending on how deep the nesting is (if I understand it correctly). For example, if I have 3 related tables:

Survey
- SurveyPage
- - SurveyItems

and I fetch data from all three by using prefetch paths under the Survey entity, the Survey data is SELECTed using something like "where SurveyID = 1", then the SurveyPage data is SELECTed using the same query as a subquery, and the SurveyItem data is SELECTed using the first query and the 2nd query as subqueries... essentially running the first query 3 times.

I've watched the queries in SQL Profiler and I see that LLBLGen will not try to SELECT from a child table (I'm using adapter) if nothing is returned from its parent's SELECT, which is cool, but I am wondering about the impact of all those "extra" (redundant) queries that will occur when I use prefetch paths to get child data.

Let's do some calculations :) Let's say we have 3 surveys, each survey has 4 pages, and each page has 10 items. Now, if we want to do this the old-fashioned way, we need 1 + 3 + 3*4 = 16 queries. Each resultset is perhaps small, but the roundtrip to the server is rather expensive.

For a prefetch path we need just 3 queries, which is a real time saver in the roundtrip area: the roundtrips are no longer a large part of the time spent executing the query.

Now, the data can be fetched in 3 ways:

1) Join all 3 tables together into one big inner-join resultset:

    SELECT * FROM Survey INNER JOIN SurveyPage ON .. INNER JOIN SurveyItem ON ...

This is 1 query. It will result in a lot of redundant data (each survey's data is mentioned 40 times in the resultset), which will cause a lot of problems on the client.

2) Use 3 queries:

    SELECT * FROM Survey
    SELECT DISTINCT SurveyPage.* FROM Survey INNER JOIN SurveyPage ...
    SELECT DISTINCT SurveyItem.* FROM Survey INNER JOIN SurveyPage ON .. INNER JOIN SurveyItem ON ...

This can't always be done, as DISTINCT can't always be applied to a select list, for example when it contains BLOB/TEXT etc. fields. In that case a large block of redundant data is returned to the object fetcher, which is able to filter out the duplicates, but of course not as fast as SQL Server would.

3) Use 3 queries:

    SELECT * FROM Survey WHERE Foo=1
    SELECT * FROM SurveyPage WHERE SurveyID IN
        (SELECT SurveyID FROM Survey WHERE Foo=1)
    SELECT * FROM SurveyItem WHERE PageID IN
        (SELECT PageID FROM SurveyPage WHERE SurveyID IN
            (SELECT SurveyID FROM Survey WHERE Foo=1))

This doesn't give duplicates in the resultset, and if you inspect the execution plans of the subqueries and the joins, they follow the same paths (in most cases).

Overall this gives the best results performance-wise. The IN query is sometimes slow(er) when there is just 1 item in the IN clause; I can't optimize that away though. What's best for performance is to add an index to each FK field in a table, so if an FK contains 2 fields, add an index per field. The subqueries will then result in index scans, which are very fast.

In your example, the subquery always holds a PK field, so the index scan is always used, but if you read the tree from the other direction, the subquery contains the FK field, which benefits greatly from an index.

Pk -> Fk trees (as in your example) give the best results with subqueries, as the children always have an equal or greater number of rows than the parent. In Fk -> Pk trees (order item -> order -> customer) the child always has the same number of rows as the parent, so it could be that a join is slightly faster in SOME cases, but it would be hard to determine that up front.

The children are also not selected using the same query as the parent; they're selected using a filter which reuses the parent set's filter. The difference is that just 1 column (a PK or FK) is selected in the subquery, not an entire row.

I'm not sure what you mean by redundant queries. Could you elaborate on that a bit?

Filters spanning 3 or more entities in particular can be slow with joins on tables with a lot of data.

That I'm pretty familiar with, as well as SQL Execution analysis, perf tuning, etc.... but this will be the first time I'm architecting something that will do the subqueries described above. I realize that I can index-tune, and that the queries should execute very quickly since they "just ran" a split second ago, but..... since I haven't load-tested and can't really until this project is farther along, I'm fishing for words from the trenches. Any takers?

All in all, joins and subqueries don't differ much in performance, but subqueries give you more flexibility to tune for performance yourself by tweaking the filter and setting indexes. For example, if I want all employees who have entered orders, and I have 5 employees and 10,000 orders, a query like

    SELECT Employee.* FROM Employee INNER JOIN Order ON ...

will result in a lot of redundant data. I could apply DISTINCT, if Employee didn't contain an image column :) . The subquery version

    SELECT * FROM Employee WHERE EmployeeID IN (SELECT EmployeeID FROM Order)

is much faster, as I don't have to plow through a lot of redundant rows.
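
In code, that last query is roughly what a FieldCompareSetPredicate without an inner filter produces; a sketch, assuming field classes generated for Employee/Order (names and overloads may differ per version):

// "All employees who have entered orders": EmployeeID IN (SELECT EmployeeID FROM Order)
IRelationPredicateBucket filter = new RelationPredicateBucket();
filter.PredicateExpression.Add(new FieldCompareSetPredicate(
    EmployeeFields.EmployeeId, null,
    OrderFields.EmployeeId, null,
    SetOperator.In,
    null));   // no filter on the inner SELECT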

The only situation in which I found subqueries to suffer is where the IN clause resulted in just 1 item. That was not that fast on SQL Server, perhaps an optimization quirk in the SQL Server engine, I don't know. There is a significant difference between

    SELECT * FROM Order WHERE EmployeeID IN (2)

and

    SELECT * FROM Order WHERE EmployeeID = 2

Which is odd.

I'm not sure what the app does, but if it's a CMS-like application where the user is mostly passive, i.e. a lot of pages are the same for a lot of visitors, use the slashdot style of site caching: time how long it takes to render the complete site. Say this is 2 seconds. Set a timer so you re-render the complete site into the cache every 4 seconds. So every 4 seconds the site is refreshed, and each visitor gets the same pages for those 4 seconds.

This project is both: some sections are form-heavy with constantly changing data, other sections are content-heavy with rarely changing data. We're defining our caching rules as we go along, but I'm not sure what you mean by the "slashdot style" and using timers to re-render the complete site to the cache. Can you explain or give me a pointer?

I don't want to point to the slash code (used by slashdot.org), as it is Perl and I find that horrible to read, but it simply works as follows:
- To avoid slowdowns, you work solely with static pages. On Slashdot, everything is static, even the comments (that's why it takes a few seconds for your comment to show up).
- Each visitor requesting a page gets the latest rendered version from the page cache. This can be a dynamic page with cached elements, or a static page rendered dynamically at set intervals.

To make sure the site looks dynamic, you re-render the site every X seconds. Say you decide a site feels dynamic if it is refreshed every 10 seconds. Then you have to render each page every 10 seconds and it looks dynamic. If that takes your hardware 5 seconds, you have 5 seconds of spare time in which the system does nothing. Basically your website is built from 2 parts: 1) a page render engine and 2) a page server engine. Part 1 just renders pages and dumps them into the cache; it does this every X seconds. Part 2 handles requests from users, reads content from the cache and serves that to the visitor.

If 1) can't keep up, 2) serves slightly older content. The fun part is: 2) doesn't depend on 1)'s speed for serving content, as there is always content to serve. The visitor doesn't notice this most of the time.

So if 1) is able to render the complete site in 5 seconds on your current hardware, you can set it to re-render everything every 10 seconds. 2) will then just serve new pages every 10 seconds, and 1) can always keep up, whatever requests are arriving at 2).

If 1) is under stress, for example because 1) and 2) are on the same machine and 2)'s load affects 1), you can tweak 1)'s re-render timer to, say, 20 seconds. The site will still be up and content is still served; there is just a slight delay that not a lot of people will notice. You can also play tricks with this. For example on a forum, you cache all the messages on a page, and a new message you add directly to the cache as well. A visitor will see the cached messages (no one will notice if another user hit enter at that moment and changed a word in his message; web data is stale data anyway) plus his own message, and it looks really dynamic.
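
A bare-bones sketch of that two-part split, assuming pages can be rendered to HTML strings; all class and method names here are made up:

using System;
using System.Collections.Generic;
using System.Threading;

// Part 1 renders pages into the cache on a fixed interval;
// part 2 only ever reads from the cache, so it never waits on the renderer.
public class StaticPageCache
{
    private readonly Dictionary<string, string> _pages = new Dictionary<string, string>();
    private readonly object _lock = new object();
    private Timer _renderTimer;   // keep a reference so the timer isn't collected

    // Part 1: re-render everything every 'interval', independent of visitor load.
    public void StartRendering(IEnumerable<string> pageKeys, TimeSpan interval, Func<string, string> renderPage)
    {
        _renderTimer = new Timer(delegate
        {
            foreach (string key in pageKeys)
            {
                string html = renderPage(key);        // expensive: hits the database, builds the HTML
                lock (_lock) { _pages[key] = html; }  // swap in the fresh version
            }
        }, null, TimeSpan.Zero, interval);
    }

    // Part 2: called per request; always serves whatever is currently cached.
    public string Serve(string pageKey)
    {
        lock (_lock)
        {
            string html;
            return _pages.TryGetValue(pageKey, out html) ? html : "(not rendered yet)";
        }
    }
}

Whether part 1 runs in-process on a timer like this or as a separate process is a deployment detail; the point is that serving a request never waits on rendering.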

This is how large websites handle a lot of requests: serve static content which is refreshed every X seconds.

Frans Bouma | Lead developer LLBLGen Pro