What say you : about GUIDs

Posts   
 
    
LLBLGen
User
Posts: 43
Joined: 10-Apr-2006
# Posted on: 31-May-2006 21:34:32   

I am writting a good size software (used by 20 to 300+ users) (Fran i have to buy you a beer someday when i visit Netherlands, i cant appriceate enough of your work)

My Question is, we have used GUID as PK preety much throughout the tables because it offers few really good beneit to us, One example is that we want all the drop downs in the software be based on User, So Admin can decide which row to show/not show for each user from each drop down. Some user may see Book Category of Science, Geography, Music, Adults.... while other may only see Science, Geography, Music.

For this we have a UserChoice Table that has UserGUID and ForeignKeyGUID. Insead of creating hundred table, i can now just use ForeignKeyGUID (In this case BookCategoryGUID and UserGUID) . but this table will be HUGE as we may have 100+ dropdowns.....

So some of my friends tell me that since i am using GUID and not int ("GUIDs are slower than int") as PK all over and since i will now have this Huge table UserChoice...its going to have a huge impact on performance.....

But performance is the key for any software to be successful. So is Using GUID a bad choice? Please advise.....

I am using Adapter/Winforms Thanks

Walaa avatar
Walaa
Support Team
Posts: 14952
Joined: 21-Aug-2005
# Posted on: 01-Jun-2006 07:20:52   

Well let's first talk about the database design. How do you relate the UserChoice table to other tables? Is there a direct relation (FK-PK)? I guess not. That's one of the problems.

I'd go for one Choice table per each table that user can choose from, that's better for maintaining relations, for performance sake and I guess you can use GUIDs without any inmpact on the performace.

I suppose that this will address all your concerns.

P.S. I like you name, it sounds familier simple_smile

LLBLGen
User
Posts: 43
Joined: 10-Apr-2006
# Posted on: 01-Jun-2006 08:53:21   

"Is there a direct relation (FK-PK)? I guess not. That's one of the problems."

Yes UserChoice table has FK UserGUID which is the User Table PK and ForeignKeyGUID is FK for any other table in this case its the BookCategoryGUID for BookCategory Table..

so i am not sure why that is a problem...

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39620
Joined: 17-Aug-2003
# Posted on: 01-Jun-2006 09:38:09   

It's a problem because you can't use a fixed FK. THe FK is semantical: based on the value of the field, you look at another table, am I correct?

(and about that beer: deal simple_smile )

Frans Bouma | Lead developer LLBLGen Pro
LLBLGen
User
Posts: 43
Joined: 10-Apr-2006
# Posted on: 01-Jun-2006 15:35:56   

hmm...so are you saying using ForeignKeyGUID as FK to link multiple table is bad frowning ? if so can you may be elaborate this a little?....does that mean i have to create multiple table to get this feature working...meaning have UserBookCategoryAllowed with UserGUID and BookCategoryGUID......and so on... cry

mkamoski avatar
mkamoski
User
Posts: 116
Joined: 06-Dec-2005
# Posted on: 01-Jun-2006 15:47:45   

LLBLGen wrote:

...But performance is the key for any software to be successful. So is Using GUID a bad choice? Please advise...

FWIW, here is my 2-cents-worth.

I am sorry; but, I have never personally tested the difference between lookup times on a PK-INT versus a PK-GUID. Not formally. Shame on me.

However, I have used PK-GUID and it works great. Small-to-medium size schema, highly-normalized, small-to-medium number of rows, SQL Server 2000. I loved it. So, regarding performance, I did not see a problem in my projects. There are, however, many significant, clear, benefits to using PK-GUID.

Some upsides are...

...PK-GUIDs can be generated in the middle-tier easily, which supports "real" in-memory object creation... which makes building in-memory hierarchies easier... which makes saving in-memory hierarchies easier (probably faster?)... and so on...

...replication, and cross-database data sharing in general, is virutally automatically supported...

...offline applications can create "real" data and sync later without collision...

...the PK is not human-readable and that's good IMHO because no one should be looking at them, mucking with them, creating them in an ad-hoc manner such as in Query Analyzer, and so on... they are a pure abstraction and generally should do their job and stay out of sight and out of mind...

...database design is MUCH easier and the overall model is much more flexible given that an existing database, with data in it, can be refactored completely and seriously without having to worry about how to move and relate data for the new version... this is huge benefit, especially if you are developing RAD/ XP/ Agile style, with lots of versions and lots of courageous changes and lots of refactoring... (that is to say, the way development should be done, BTW)....

...and I think that there a few other upsides...

Some downsides are...

...Oracle does not support/ auto-generate/ use PK-GUID easily, at least not in Version 9, and at least not as far as I could find... ...(BUT, I am sure there is a workaround for this AND this is more of an Oracle-problem than a PK-GUID problem AND one can hope that Oracle moved out of the dark ages (such as all capital column names???) and now supports PK-GUID in Version 10 but I am afraid to check)...

...the PK is not human-readable, and that is bad (according to some) because making data-insert scripts, seeding code tables for initialization, making test data insert scripts, and the like can be tricky... ...(BUT, as I note above, this is actually a good thing and not a bad thing)...

...indexing is slower... ...(BUT, I have heard varying opinions on this, so I really am not so sure about it and have not seen any noticable slowdown AND if your querying takes that long, I say, you might want to revisit the underlying design AND whatever happened to the "no more than N joins per query" rule???... and so on)....

...the datatype itself takes up more room than an int... ...(BUT I stopped caring much about that, for most practical applications, the day people started saying things like "we have to use nvarchar instead of varchar everywhere", I mean sheesh, we just doubled the space requirements so don't cry to me if we use a PK-GUID)...

...and I think that there a few other (alledged) downsides...

The main obstacle that I have found to using PK-GUID is DBAs and developer's who don't like the idea because it is "non standard". Coincidentally, this is the same "pro SPs" crowd who think someone is crazy if they say "let's eliminate 90% of our SPs and, along with them, 90% of our database headaches". Oh well. Its going to take some time for some people to come around.

For me, if I had my druthers, I would use PK-GUID virtually everywhere and virtually always.

IMHO.

(Caveat-- As always, I recognize the fact that I may be wrong. As such, I reserve the right to change my mind. But, I will probably have to "prove it with code" to be convinced.)

HTH.

Thank you.

--Mark Kamoski

Walaa avatar
Walaa
Support Team
Posts: 14952
Joined: 21-Aug-2005
# Posted on: 01-Jun-2006 15:51:19   

The UserChoice table can have a FK pointing to the User table, but how can it have a FK pointing to multiple tables?

LLBLGen
User
Posts: 43
Joined: 10-Apr-2006
# Posted on: 01-Jun-2006 17:21:05   

Mark, Thank you ...for your reply

Walaa, sorry if i wasn't clear...i'll try this way...

so the UserChoice table has UerGUID that is FK pointing to User Table

and I am using ForeignKeyGUID which is FK to all the other table...

because GUIDs are unique,

BookCategory's BookCategoryGUID (PK) is different from OfficeLocation table's OfficeLocationGUID (PK)

so in my drop down for BookCategory i can use something like

select * from BookCategory b, UserChoice u where b.BookCategoryGUID =u.ForeignKeyGUID and u.UerGUID =@UserGUID

RelationPredicateBucket bucket = new RelationPredicateBucket(); bucket.Relations.Add(BookCategory.Relations.UserChoiceUsingForeignKeyGuid); bucket.PredicateExpression.Add(PredicateFactory.CompareValue(UserChoiceFieldIndex.UserGuid, ComparisonOperator.Equal, GetUserGuid()));

and for my office location dropdown i can use

select * from OfficeLocation o, UserChoice u where o.OfficeLocationGUID =u.ForeignKeyGUID and u.UerGUID

RelationPredicateBucket bucket = new RelationPredicateBucket(); bucket.Relations.Add(OfficeLocation .Relations.UserChoiceUsingForeignKeyGuid); bucket.PredicateExpression.Add(PredicateFactory.CompareValue(UserChoiceFieldIndex.UserGuid, ComparisonOperator.Equal, GetUserGuid()));

=@UserGUID is the user logged in....

pilotboba
User
Posts: 434
Joined: 05-Aug-2005
# Posted on: 01-Jun-2006 18:30:08   

LLBLGen wrote:

hmm...so are you saying using ForeignKeyGUID as FK to link multiple table is bad frowning ? if so can you may be elaborate this a little?....does that mean i have to create multiple table to get this feature working...meaning have UserBookCategoryAllowed with UserGUID and BookCategoryGUID......and so on... cry

I don't think there is anything wrong with your design. Having a userChoice table for each other table would be alot of tables for no real reason. Performance wise it doesn't matter if there is a hard FK relation or not.

The only thing I would say is that you have to make sure you have some way to keep the user choice table in sync with all the other tables. IE, when you delete a record in a table that joins to userchoicetable do you remove all the joined records. Since there isn't really a hard FK there you can't use a db cascade delete.

Of course, for entities, you can create an entity UserCoice for each table it is joined to and define a relationship in LLBLGen Pro... this shoul make queries easier.

Now, bottom line on performance...

INT's are a small bit faster than Guids because they are smaller.

Do NOT create a clustered index on a PK that uses a Guid (SQL Server 2005 has a new function to mitigate this issue.)

Also, you could use INT's with the same design to do the same exact thing. You would have to bench mark each schema to see which was faster, and it the difference was worth loosing GUID's to you.

In my opinion the only reasons to use a Guid as a key is when it indicates a key value for items that are used in more than one database. This includes having smart clients that can add records off line... or course, you can still use INT keys here, it is just a bit more work.

BOb

BOb

pilotboba
User
Posts: 434
Joined: 05-Aug-2005
# Posted on: 01-Jun-2006 18:32:59   

Walaa wrote:

The UserChoice table can have a FK pointing to the User table, but how can it have a FK pointing to multiple tables?

Easily in the "logical" data model. Not possible in the physical data model, you have to use business rules (middle tier or triggers) to maintain the relationships.

BOb

Walaa avatar
Walaa
Support Team
Posts: 14952
Joined: 21-Aug-2005
# Posted on: 02-Jun-2006 06:51:47   

pilotboba says:

I don't think there is anything wrong with your design. Having a userChoice table for each other table would be alot of tables for no real reason. Performance wise it doesn't matter if there is a hard FK relation or not.

Easily in the "logical" data model. Not possible in the physical data model, you have to use business rules (middle tier or triggers) to maintain the relationships.

I agree, There is nothing as wrong in a design. I would have put them all in one table if the relation was the other way around. (if all other tables refer to this table)

I favour the physical relations maintained in the database rather than logically maintained through code.

One big benefit of this: is that it gaurantees that no value would exist without a corresponding one in the PK-side table. Which can exist due to unhandled delete operations or wrongly inserted values.

You may to a certain degree gaurantee that those things won't happen if you are the only developer that is using and will ever use the database.

But my general rule is: "don't put for tomorrow what you can do today" so if we are speaking about rules and constrains the previous line can be twisted to be "don't put for an upper layer what you can do in a lower one"

pilotboba
User
Posts: 434
Joined: 05-Aug-2005
# Posted on: 02-Jun-2006 19:16:19   

Walaa wrote:

I favour the physical relations maintained in the database rather than logically maintained through code.

One big benefit of this: is that it gaurantees that no value would exist without a corresponding one in the PK-side table. Which can exist due to unhandled delete operations or wrongly inserted values.

As I said, you can do this in a trigger. Still in the db layer.

mkamoski avatar
mkamoski
User
Posts: 116
Joined: 06-Dec-2005
# Posted on: 02-Jun-2006 20:49:37   

Otis--

If you have a moment, please do share a sentence or two concerning what I say above, about the use of GUID as a PK.

I would love to hear your viewpoint.

Thank you.

--Mark Kamoski

Otis avatar
Otis
LLBLGen Pro Team
Posts: 39620
Joined: 17-Aug-2003
# Posted on: 03-Jun-2006 10:55:30   

If you have an application which has to sync between multiple db's, and both dbs are getting updates through a client, a GUID is a perfect choice. If you have a single app targeting the db, guid's aren't that great.

Guid's tend to be slower for inserts, due to indexing problems. This can be circumvented in some situations but nothing beats a sequence. In SqlServer 2005, you have the concept of sequential GUIDs, using the default newsequentialid(). This gives sequential GUIDs when you insert a new row, so it doesn't have the index problems and you still have the greatness of uniqueness across systems. V2 of llblgen pro supports newsequentialid() simple_smile

People often choose guids while they don't have to. And sometimes people choose sequences while guids would have been better simple_smile . For example in 2000 I wrote a big CMS. It stores meta-data and site data in one db. Though you can't avoid having a relation between the data and the meta-data so the viewer logic knows what kind of type the data is etc. If you have a db internally on which you develop a site and then sync it with an online box, where you then change the data of the site (and the layout perhaps) it gets messy as you will get out-of-sync identity fields. This was a problem I ran into as I picked sequences over guids. So I had to write a piece of logic which 'records' changes to meta-data on the dev box, and then 'plays back' these actions on the live server. Something I could have avoided with guids wink

Frans Bouma | Lead developer LLBLGen Pro