I have managed to make it work in more occasions. The one thing I can't fix (as I don't know a solution to this) is this kind of stuff:
this works now:
var q = from customer in metaData.Customer
from ca in customer.CustomerAddressCollection
from soh in customer.SalesOrderHeaderCollection.DefaultIfEmpty()
where ca.AddressId != null
where soh.SalesOrderId == null
select customer;
this doesn't
var q = from customer in metaData.Customer
from ca in customer.CustomerAddressCollection
where ca.AddressId != null
from soh in customer.SalesOrderHeaderCollection.DefaultIfEmpty()
where soh.SalesOrderId == null
select customer;
The thing is that the 'where' is misplaced: the branch doesn't have a projection, but still ... it does, as Microsoft's linq designers smoked too much weed at a given day and designed it like when the last expression in a chain already has a projection, the projection of a select is crammed into that one (so you don't know if the 'where' is a subquery together with the from above it, or if it's not.). As the subquery can't be created because the projection is useless, the stuff has to be added to the main join... small problem: the relations in the defaultifempty point to the subquery as a set (so they have the set alias as the alias to refer to, not the real aliases inside them, as tables inside a subquery are unknown to tables outside a subquery...
As this is really an edge case and the where can be moved down (similar to order by etc.) to make it work, I'll spend time on the other issues now.
Spot on, I think you can assume for all of the examples I've given everywhere that all I want to do is generate ON clauses(or equivalent) without having to specify the join explicitly - as I've said before I figure I shouldn't have to since it's already defined in the designer.
Agreed, though you can use Contains and Any/All in combination with Where as well, to avoid these messy mixed from clauses altogether, and build upon the relations defined in the designer.
(edit) the sql error I get with your original query:
query = from soh in query
from customerAddress in soh.Customer.CustomerAddress
where countries.Contains(customerAddress.Address.StateProvince.CountryRegion.Name)
select soh;
is also very tricky. The problem is that the part customerAddress.Address.StateProvince.CountryRegion.Name is already present in the query in the outer scope. (in the where clause and therefore in the main query). This means that re-refering to the same part shouldn't lead to the same joins being added again. How to tell them apart is not really doable. (at least I wouldn't know how)
I can't fix this too. Luckily there exist more efficient workarounds for this edge case.