Jul 24, 2007

Relational vs. Object Modeling

This is an excerpt from Expert C# Business Objects.

Before going any further, let’s make sure we’re in agreement that object models aren’t the same as relational models. Relational models are primarily concerned with the efficient storage of data, so that replication is minimized. Relational modeling is governed by the rules of normalization, and almost all databases are designed to meet at least the third normal form. In this form, it’s quite likely that the data for any given business concept or entity is split between multiple tables in the database in order to avoid any duplication of data.

Object models, on the other hand, are primarily concerned with modeling behavior, not data. It’s not the data that defines the object, but the role the object plays within your business domain. Every object should have one clear responsibility and a limited number of behaviors focused on fulfilling that responsibility.

For instance, a Customer object may be responsible for adding and editing customer data. A CustomerInfo object in the same application may be responsible for providing read-only access to customer data. Both objects will use the same data from the same database and table, but they provide different behaviors.
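As a rough illustration of two objects sharing one table while playing different roles, here is a minimal sketch. The `CustomerTable` dictionary is a hypothetical in-memory stand-in for the database table, and the member names (`Save`, `Get`) are assumptions made for this example, not names taken from the book's framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a single customer table.
public static class CustomerTable
{
    public static readonly Dictionary<int, string> Names =
        new Dictionary<int, string>();
}

// Responsibility: adding and editing customer data.
public class Customer
{
    public int Id { get; private set; }
    public string Name { get; set; }

    public Customer(int id, string name)
    {
        Id = id;
        Name = name;
    }

    // Writes the current state back to the shared table.
    public void Save()
    {
        CustomerTable.Names[Id] = Name;
    }
}

// Responsibility: read-only access to the same customer data.
public class CustomerInfo
{
    public int Id { get; private set; }
    public string Name { get; private set; }

    private CustomerInfo(int id, string name)
    {
        Id = id;
        Name = name;
    }

    // Reads from the same table that Customer writes to.
    public static CustomerInfo Get(int id)
    {
        return new CustomerInfo(id, CustomerTable.Names[id]);
    }
}
```

Both classes touch the same row, but only `Customer` exposes a way to change it; `CustomerInfo` has no public setters and no save behavior at all.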

Similarly, an Invoice object may be responsible for adding and editing invoice data. But invoices include some customer data. A naïve solution is to have the Invoice object make use of the aforementioned Customer object, but that’s not a good answer. That Customer object should only be used in the case where the application is adding or editing customer data, which isn’t occurring while working with invoices. Instead, the Invoice object should directly interact with the customer data it needs to do its job.

Through these two examples, it should be clear that sometimes multiple objects will use the same relational data. In other cases, a single object will use relational data from different data entities. In the end, the same customer data is being used by three different objects. The point, though, is that each one of these objects has a clearly defined responsibility that defines the object’s behavior. Data is merely a resource the object needs to implement that behavior.

Behavioral Object-Oriented Design

It is a common trap to think that data in objects needs to be normalized like it is in a database. A better way to think about objects is to say that behavior should be normalized. The goal of object-oriented design is to avoid replication of behavior, not data.

At this point, most people are struggling. Most developers have spent years programming their brains to think relationally, and this view of object-oriented design flies directly in the face of that conditioning. Yet the key to the successful application of object-oriented design is to divorce object thinking from relational or data thinking.

Perhaps the most common objection at this point is this: if two objects (say, Customer and Invoice) both use the same data (say, the customer’s name), how do you make sure that consistent business rules are applied to that data? And this is a good question.

The answer is that the behavior must be normalized. Business rules are merely a form of behavior. The business rule specifying that the customer name value is required, for instance, is just a behavior associated with that particular value. Earlier in the chapter, I discussed the idea that a validation rule can be reduced to a method defined by a delegate. A delegate is just an object that points to a method, so it is quite possible to view the delegate itself as the rule. Following this train of thought, every rule then becomes an object.

Behavioral object-oriented design relies heavily on the concept of collaboration. Collaboration is the idea that an object should collaborate with other objects to do its work. If an object starts to become complex, you can break the problem into smaller, more digestible parts by moving some of the sub-behaviors into other objects that collaborate with the original object to accomplish the overall goal.

In the case of a required customer name value, there’s a Rule object that defines that behavior. Both the Customer and Invoice objects can collaborate with that Rule object to ensure that the rule is consistently applied. As you can see in Figure 2-6, the actual rule is only implemented once, but is used as appropriate—effectively normalizing that behavior.
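The collaboration described above might look something like the following minimal sketch. It assumes a deliberately simplified delegate signature (`RuleHandler` taking only the value) and a hypothetical `CommonRules` class; the framework's actual rule types are not shown in this excerpt and may differ.

```csharp
using System;

// A rule reduced to a method signature: inspect a value, report validity.
// The signature is an assumption made for this sketch.
public delegate bool RuleHandler(string value);

// Hypothetical home for shared rules; each rule is implemented exactly once.
public static class CommonRules
{
    public static bool StringRequired(string value)
    {
        return !string.IsNullOrEmpty(value);
    }
}

public class Customer
{
    // The delegate instance is the rule this object collaborates with.
    private static readonly RuleHandler NameRule = CommonRules.StringRequired;

    public string Name { get; set; }

    public bool IsValid
    {
        get { return NameRule(Name); }
    }
}

public class Invoice
{
    // Same rule, applied to the invoice's own copy of the customer name.
    private static readonly RuleHandler CustomerNameRule = CommonRules.StringRequired;

    public string CustomerName { get; set; }

    public bool IsValid
    {
        get { return CustomerNameRule(CustomerName); }
    }
}
```

Neither class knows how the "required" check is implemented. If the rule changes, it changes in one place and both objects pick it up.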

It could be argued that the CustomerName concept should become an object of its own, and that this object would implement the behaviors common to the field. While this sounds good in an idealistic sense, it has serious performance and complexity drawbacks when implemented on development platforms such as .NET. Creating a custom object for every field in your application can rapidly become overwhelming, and such an approach makes the use of technologies like data binding very complex. My approach of normalizing the rules themselves provides a workable compromise, providing a high level of code reuse while still offering good performance and allowing the application to take advantage of all the features of the .NET platform.

Object-Relational Mapping

If object models aren’t the same as relational models (or some other data models that we might be using), some mechanism is needed by which data can be translated from the Data Storage and Management layer up into the object-oriented Business Logic layer.

This is a well-known issue within the object-oriented community. It is commonly referred to as the impedance mismatch problem, and one of the best discussions of it can be found in David Taylor’s book, Object-Oriented Technology: A Manager's Guide (Addison-Wesley, 1991).

Several object-relational mapping (ORM) products exist for the .NET platform from various vendors. In truth, however, most ORM tools have difficulty working against object models defined using behavioral object-oriented design. Unfortunately, most of the ORM tools tend to create "superpowered" DataSet equivalents, rather than true behavioral business objects. In other words, they create a data-centric representation of the business data and wrap it with business logic.

The differences between such a data-centric object model and what I am proposing in this book are subtle but important. Behavioral object modeling creates objects that are focused on the object’s behavior, not on the data it contains. The fact that objects contain data is merely a side effect of implementing behavior; the data is not the identity of the object. Most ORM tools, by contrast, create objects based around the data, with the behavior being a side effect of the data in the object.

Beyond the philosophical differences, the wide variety of mappings that you might need, and the potential for business logic driving variations in the mapping from object to object, make it virtually impossible to create a generic ORM product that can meet everyone’s needs.

Consider the Customer object example discussed earlier. While the customer data may come from one database, it is totally realistic to consider that some data may come from SQL Server while other data comes through screen-scraping a mainframe screen. It’s also quite possible that the business logic will dictate that some of the data is updated in some cases, but not in others. Issues like these are virtually impossible to solve in a generic sense, and so solutions almost always revolve around custom code. The most a typical ORM tool can do is provide support for simple cases, in which objects are updated to and from standard, supported, relational data stores. At most, they’ll provide hooks by which you can customize their behavior.

Rather than trying to build a generic ORM product as part of this book, I’ll aim for a much more attainable goal. The framework in this book will define a standard set of four methods for creating, retrieving, updating, and deleting objects. Business developers will implement these four methods to work with the underlying data management tier by using ADO.NET, the XML support in .NET, Web Services, or any other technology required to accomplish the task. In fact, if you have an ORM (or some other generic data access) product, you’ll often be able to invoke that tool from these four methods just as easily as using ADO.NET directly.

The approach taken in this book and the associated framework is very conducive to code generation. Many people use code generators to automate the process of building common data access logic for their objects—thus achieving high levels of productivity while retaining the ability to create a behavioral object-oriented model.

The point is that the framework will simplify object persistence to the point at which all developers need to do is implement these four methods in order to retrieve or update data. This places no restrictions on the object’s ability to work with data, and provides a standardized persistence and mapping mechanism for all objects.
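To make the four-method idea concrete, here is a minimal sketch under stated assumptions. The interface name `IPersistable`, the method names, and the `FakeCustomerStore` dictionary are all hypothetical and chosen for illustration; the excerpt does not name the framework's actual methods, and a real implementation would call ADO.NET or another data access technology where this sketch uses an in-memory dictionary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract for the four persistence operations.
public interface IPersistable
{
    void Create();        // initialize a new object with default values
    void Fetch(int id);   // load existing data into the object
    void Update();        // write the object's data to the store
    void Delete(int id);  // remove the object's data from the store
}

// Hypothetical in-memory stand-in for the data management tier.
public static class FakeCustomerStore
{
    public static readonly Dictionary<int, string> Rows =
        new Dictionary<int, string>();
}

public class Customer : IPersistable
{
    public int Id { get; set; }
    public string Name { get; set; }

    public void Create()
    {
        Id = 0;
        Name = string.Empty;
    }

    public void Fetch(int id)
    {
        // Real code might run a query here, or call an ORM tool.
        Id = id;
        Name = FakeCustomerStore.Rows[id];
    }

    public void Update()
    {
        FakeCustomerStore.Rows[Id] = Name;
    }

    public void Delete(int id)
    {
        FakeCustomerStore.Rows.Remove(id);
    }
}
```

Because the mapping lives inside the four methods, each object is free to pull its data from wherever its behavior requires, and the calling code does not need to know which source that is.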