HnD | Composite foreign keys in entity hierarchy confuse designer, Page 1

Composite foreign keys in entity hierarchy confuse designer

Posts

HamishM
User

Posts: 3
Joined: 13-Feb-2013

# Posted on: 13-Feb-2013 17:44:59

Version: LLBLGen Pro v3.5 (January 17th, 2013)

It seems that the LLBLGen designer doesn't handle composite foreign keys properly if the key fields on the FK side are mapped to different entities in a "target per entity hierarchy".

For example, imagine you have an table A with fields Key1 and Key2 which are a foreign key to table C, and this is mapped in LLBLGen as an: * Entity A has Key1 mapped * Sub-type entity B has Key2 mapped. * Foreign key target mapped as entity C. * There is a relation between entity B and entity C corresponding to the foreign key.

In this scenario, the designer gets confused in several ways: * when you perform a "refresh relational model data", it outputs a DDL script which drops the foreign key and recreates another foreign key only containing "Key2" (Key1 missing). * if you perform a "refresh relational model data" with RelationshipsFollowDBForeignKeyConstraints set to true, then the catalog refresher maps the Key2 field onto the base entity A (as well as leaving it mapped on the subtype entity B) and creates a relationship between entity A and entity C. * no relation is created when you map Key2 into entity B. * the llblgenproj file contains bogus foreign keys with GUIDs in their names which change every time you refresh from the database, which complicates version control of the file.

I will attach a sample LLBLGen project file along with a database creation script (for SQLServer 2008 ), which demonstrates the problem.

The generated code in unaffected by these problems, so as we use "database first" development, we can workaround the problems as long as we never run the generated DDL scripts, and turn off RelationshipsFollowDBForeignKeyConstraints.

This composite foreign key scenario might seem a little strange, but we have encountered a scenario where it is quite natural. It is a natural consequence of consolidating data from multiple databases into a single database when the source databases use IDENTITY for primary key ID allocation. In this case, we want to add a "DatabaseId" column to every table in the consolidated database schema and add it to all our primary keys (in order to ensure that primary keys from different source databases do not clash). As a consequence, the "DatabaseId" column has to be included in every foreign key and unique constraint. Naturally, the DatabaseId column gets mapped in the root of entity hierarchies, while the other fields in foreign keys do not. This also leads to the situation where multiple composite foreign key constraints on various sub-types in an entity hierarchy all share the "DatabaseId" as one of their foreign key fields. Of course, there are other ways to solve the problem (e.g. using GUIDs), but it still seems like a reasonable usage of composite foreign keys.

many thanks, Hamish

Attachments
Filename	File size	Added on	Approval
CreateSqlAndProjectFile.zip	2,936	13-Feb-2013 17:45.23	Approved

daelmo
Support Team

Posts: 8245
Joined: 28-Nov-2005

# Posted on: 14-Feb-2013 07:00:10

Hi Hamish,

For your hierarchy setup is most suitable the TargetPerEntity inheritance. See Entity Inheritance for more info.

Otherwise you should move Key2 up to Entity B, then you can create the relation either in Entity B or SubEntityB.

David Elizondo | LLBLGen Support Team

HamishM
User

Posts: 3
Joined: 13-Feb-2013

# Posted on: 14-Feb-2013 11:00:03

Thanks for your reply.

Moving to TargetPerEntity would have a big impact and is unlikely to be an option (in common with many people using database-first, LLBLGen is not the only consumer of the database). However, I guess it would work.

Moving Key2 to Entity B and adding the relation to SubEntityB helps, but LLBLGen still keeps trying to drop the foreign key constraint, and I've now got a load of fields mapped at the wrong level in my hierarchy. And I still can't turn on RelationshipsFollowDBForeignKeyConstraints or it drops my relation, so I don't think I've gained much.

Given these two choices, I would rather stick with my existing work-around of ignoring the DDL output by the designer and manually dealing with impact of the ever-changing bogus foreign keys in the llblgenproj file. Better to have pain using the designer than to compromise my schema or my generated code.

I think that there is clearly a designer bug here (I create a relation with 2 fields, and LLBLGen creates a foreign key with only 1 field - that can't be right). The entire LLBLGen tool chain copes with my relation scenario except for the entity model<->database schema synchronisation algorithm in the designer. So I was rather hoping that you might tell me if you agree that this is a bug and tell me if/when it might get fixed.

Otis
LLBLGen Pro Team

Posts: 39894
Joined: 17-Aug-2003

# Posted on: 14-Feb-2013 16:31:14

(I'm using the names from the example project, not the ones in your start post, they're not the same)

The situation appears strange to me. The thing is: if I persist a B instance, I don't have a Key2 value, so the value for Key1 in B has no meaning. The value for Key2 is NULL, so the FK constraint is ignored as one of the fields is NULL.

FK fields appear in the FK side entity B as a copy of the PK fields from the entity B has a relationship with (A). Your setup doesn't do that: it has the PK field copies scattered across two types. I.o.w.: the FK fields make no sense. They're fields, but they're not FK fields: Key1, defined in B (the supertype), is not an FK field in B, but is in Bsubtype. That can't be the case, it's either an FK field everywhere, or it's never.

The designer gets confused, likely because there's no code which anticipates on this scenario. In Target Per Entity, it's not possible to define this, and you're the first who brings this forward in a TPEH scenario As it doesn't make sense to have the fields as FK fields, we didn't anticipate anyone would do this.

IMHO the only way to fix this is to move Key1 to BSubtype or Key2 to B or to declare the relationship 'model only', so it doesn't have a backing FK constraint.

The runtime can deal with this, however you will have a problem there too: B.Key1 is marked as an 'FK field', while it's not. Well, it is, but not in B. This alone makes it not something you should do.

What I don't understand either is why you merged DB's together this way, as it keeps data separated but it's in the same table, why not use schemas within a catalog? Much easier and you have the same thing: data is separated. Because how I understand it, your result tables have all the data but are actually multiple schemas merged into one without having the data reference each other (due to the DatabaseId).

Frans Bouma | Lead developer LLBLGen Pro

HamishM
User

Posts: 3
Joined: 13-Feb-2013

# Posted on: 14-Feb-2013 18:31:01

It's incorrect to say that if Key2 is NULL then Key1 must have no meaning. Most databases are quite explicit about the behaviour of composite foreign keys: they use the MATCH SIMPLE rule which means that the foreign key is only enforced if all the key columns are not NULL. This means that Key1 can be part of other foreign keys and can have meaning in its own right, regardless of the value of Key2.

This may seem like an abuse at first sight, and I'm perfectly willing to believe that I'm your only customer who wants to do this, but I won't accept that it's non-sensical. If I have a database table which looks like this:

CREATE TABLE Example ( Id int NOT NULL PRIMARY KEY, Key1 int NULL FOREIGN KEY REFERENCES Table1 (Id), Key2 int NULL FOREIGN KEY REFERENCES Table2 (Id), Key3 int NULL FOREIGN KEY REFERENCES Table3 (Id) )

then when I combine data from multiple databases into a single table, I have to transform this table into this:

CREATE TABLE ExampleAggregated ( DatabaseId int NOT NULL, Id int NOT NULL, Key1 int NULL, Key2 int NULL, Key3 int NULL,

PRIMARY KEY (DatabaseId, Id), FOREIGN KEY (DatabaseId) REFERENCES Database (Id), FOREIGN KEY (DatabaseId, Key1) REFERENCES Table1 (DatabaseId, Id), FOREIGN KEY (DatabaseId, Key2) REFERENCES Table2 (DatabaseId, Id), FOREIGN KEY (DatabaseId, Key3) REFERENCES Table3 (DatabaseId, Id) )

Hopefully you can see that DatabaseId is meaningful in its own right and that it makes sense for Key1, Key2 and Key3 to be NULL even though DatabaseId is NOT NULL. I'm sure you can also see that when you map this to an entity hierarchy, it can make sense for DatabaseId to be mapped in the root of the entity hierarchy, with Key1 mapped in a sub-type.

I'm sure I could dream up other sensible scenarios: If you have an ID which is only unique within some scope (defined by another field), this kind of composite key seems a legitimate way to deal with the problem.

And, yes, this does all work fine in SQL server, the LLBLGen runtime and it mostly works in the designer too, aside from the model to database synchronization.

The purpose of collecting data and combining into one database like this is to be able to write a single SQL query which queries data from all the databases in one go, I don't think that using multiple schemas would be a way to achieve this?

Otis
LLBLGen Pro Team

Posts: 39894
Joined: 17-Aug-2003

# Posted on: 15-Feb-2013 11:20:37

(Hamish, please don't see my texts as lecturing, if you do, I apologize)

HamishM wrote:

It's incorrect to say that if Key2 is NULL then Key1 must have no meaning. Most databases are quite explicit about the behaviour of composite foreign keys: they use the MATCH SIMPLE rule which means that the foreign key is only enforced if all the key columns are not NULL. This means that Key1 can be part of other foreign keys and can have meaning in its own right, regardless of the value of Key2.

I didn't want to imply it didn't work, it does, it's however illogical: If you start creating the tables X and Y and Y has to have a relationship with X, you place the PK fields of X in Y as FK fields. That's how PK fields are propagated into FK sides of the relationship: they're not real attributes of the relation (table/entity), but 'inherited' from the PK side. In NIAM for example you'll never define an FK field.

In your model, one of the FK fields has a different role than just being an FK field to mark the relationship with a given PK side. This happens, not a lot, but it does occur. In the v3.0 beta we first had FK fields which could only be copies of PK fields, as the theory prescribes. During testing we ran into situations where people wanted to have an FK field which is besides an FK field also a real attribute (field) of the entity. We had to refactor the code to make this work, hence your model does work.

Your situation would work (Key1 being a real attribute of B) if there wasn't an inheritance hierarchy in place, but there is. The difference between the situation without inheritance and the situation where there's inheritance is that with inheritance, the field is BOTH an FK field (in the subtype) and it's NOT (in the supertype). It's the same field, it can't be two fields and two states, it's one field, with one state: it's either an FK field or it's not. That's the problem I don't have a solution for, because I can't say "this field is both an FK field and it's not", like a 3 state bool This is the reason I think it goes wrong when validating & updating the relational model data: it first validates the supertype, sees Key1, it's an FK field but the other field in the relationship Key1 is in isn't there -> marks FK constraint as invalid and that it has to be deleted. This is because one can work model first, the designer doesn't know/care. Then it validates the subtype, sees the relationship but no valid FK, so it creates one.

To overcome this, one could use a model-only relationship. This is a relationship without an backing FK. Well, there might be a backing FK, but it's ignored. Did you try the model-only variant of the relationship? The designer will then not create an FK or consider an FK to be present for the relationship. Making a relationship model-only can be done by 'editing' the relationship.

This may seem like an abuse at first sight, and I'm perfectly willing to believe that I'm your only customer who wants to do this, but I won't accept that it's non-sensical. If I have a database table which looks like this:

CREATE TABLE Example ( Id int NOT NULL PRIMARY KEY, Key1 int NULL FOREIGN KEY REFERENCES Table1 (Id), Key2 int NULL FOREIGN KEY REFERENCES Table2 (Id), Key3 int NULL FOREIGN KEY REFERENCES Table3 (Id) )

then when I combine data from multiple databases into a single table, I have to transform this table into this:

CREATE TABLE ExampleAggregated ( DatabaseId int NOT NULL, Id int NOT NULL, Key1 int NULL, Key2 int NULL, Key3 int NULL,

PRIMARY KEY (DatabaseId, Id), FOREIGN KEY (DatabaseId) REFERENCES Database (Id), FOREIGN KEY (DatabaseId, Key1) REFERENCES Table1 (DatabaseId, Id), FOREIGN KEY (DatabaseId, Key2) REFERENCES Table2 (DatabaseId, Id), FOREIGN KEY (DatabaseId, Key3) REFERENCES Table3 (DatabaseId, Id) )

Hopefully you can see that DatabaseId is meaningful in its own right and that it makes sense for Key1, Key2 and Key3 to be NULL even though DatabaseId is NOT NULL. I'm sure you can also see that when you map this to an entity hierarchy, it can make sense for DatabaseId to be mapped in the root of the entity hierarchy, with Key1 mapped in a sub-type.

It makes the rows unique again, true. I wonder why you didn't opt for a conversion though, to migrate keys from one DB to the result DB, as this setup will never allow you do get rid of this awkward key usage: if you look at the tables without knowing the history behind it, it makes no sense: why is there 'id' if it's not unique? A few years down the line, people will forget the history behind the table layout and will start wonder why this is done this way.

(I wondered about the migration step because your new model isn't usable with existing code without upgrading that as well (they need the DB id in every query), so why not migrate to a new model without the DB id and with a proper unique id?)

I'm sure I could dream up other sensible scenarios: If you have an ID which is only unique within some scope (defined by another field), this kind of composite key seems a legitimate way to deal with the problem.

In theory, that doesn't make sense, as an entity definition has a set of attributes which uniquely identify an instance (row). If 'ID' is only unique within a range of values, it would mean the values dictate how the model looks like, which is odd, considering the model is defined without the data.

In practice however, one might need to patch up a model to make it work, e.g. after a merger and there's little time or no way to migrate the data to new IDs etc., then it might be doable to add a field to 'make it unique', but that only makes sense if you know the history of the model and why it's not done properly.

And, yes, this does all work fine in SQL server, the LLBLGen runtime and it mostly works in the designer too, aside from the model to database synchronization.

The purpose of collecting data and combining into one database like this is to be able to write a single SQL query which queries data from all the databases in one go, I don't think that using multiple schemas would be a way to achieve this?

No, you're right about that, unless you can union several queries over multiple schemas, but that too is awkward.

I think the best way, (but I'm not part of your project so I don't know your time table, project restrictions etc. so please do realize that I know what I'm saying might not work for your situation ) is to migrate the data to a 1 key model without the DatabaseId. In the long run this is absolutely better, because you get rid of the databaseid setup which has no meaning in the model other than to make 'id' unique again but that is only the case because several datasets were merged, something which is essential knowledge to understand why it makes sense.

Many years ago I was part of a project which ended up in a database similar to yours, it wasn't success because with all code we wrote we had to drag the history with us.

But as I said above, it might very well be not an option to fully migrate the data in your case (e.g.: some code might need to work on data only from DB x)

Frans Bouma | Lead developer LLBLGen Pro