When thinking about data hierarchies, that is, information systems whose elements are related to one another hierarchically, a programmer will often think of expressing that relationship as an inheritance tree.
Python supports multiple inheritance, and this is often used for "mixins": one can add all of the attributes and methods of a superclass to a derived class by simple inclusion. So, within limits, it may be possible to express the relationship "A is a kind of B, and is also a kind of C" through multiple inheritance.
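As a minimal sketch of the mixin idea (the class names Illustrated and IllustratedBook are made up for illustration and do not appear elsewhere in this text):

```python
class Publication:
    """Base type: something with a title."""

    def __init__(self, title):
        self.title = title


class Illustrated:
    """Hypothetical mixin: adds illustration handling to any class."""

    def add_illustration(self, caption):
        # Create the list lazily so the mixin needs no __init__ of its own.
        self.__dict__.setdefault("illustrations", []).append(caption)


class IllustratedBook(Publication, Illustrated):
    """A kind of Publication that is also Illustrated."""


book = IllustratedBook("Field Guide")
book.add_illustration("Plate 1")
print(isinstance(book, Publication), isinstance(book, Illustrated))
print(book.illustrations)
```

The derived class gains the mixin's method simply by listing it as a base class.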
The ZODB naturally stores information hierarchically, but that hierarchy is a parent->child containment relationship rather than an inheritance structure. The type of a parent in a ZODB structure does not dictate the type of its children.
Let's consider the following hierarchy:
In summary, a publication has a title, date and one or more authors. An article is a publication with a few additional attributes. A book is also a kind of publication having a table of contents and ISBN. And so forth.
Normally, a relational database might map the above types into a set of tables. One might then write some Python classes to map onto the structure, for example:
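A rough sketch of such classes, assuming only the attributes named in this text (the "additional attributes" of an article are not specified, so Article is left empty here):

```python
import datetime


class Publication:
    def __init__(self, title, date, authors):
        self.title = title
        self.date = date
        self.authors = list(authors)  # one or more authors


class Article(Publication):
    """A publication with a few additional attributes (not spelled out here)."""


class Book(Publication):
    def __init__(self, title, date, authors, toc, isbn):
        super().__init__(title, date, authors)
        self.toc = list(toc)  # table of contents
        self.isbn = isbn


class Novel(Book):
    def __init__(self, title, date, authors, toc, isbn, foreword):
        super().__init__(title, date, authors, toc, isbn)
        self.foreword = foreword


# Sample values are placeholders, not real bibliographic data.
novel = Novel("Example Novel", datetime.date(2000, 1, 1), ["A. Writer"],
              ["Chapter 1"], "0-000-00000-0", "A short foreword.")
print(sorted(vars(novel)))
```

Each subclass only declares what it adds; everything else arrives through the inheritance chain.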
Later, the programmer might map some views onto the classes (in other words, the data model) to produce forms for data entry, or to produce reports. Some web frameworks use class inheritance as a mechanism to define a full set of attributes in order to provide the fields for an automatically generated entry form. For example, due to inheritance, we know that a Novel object has a title, a list of authors, a publication date, a table of contents, an ISBN and a foreword section.
To update the database with a Novel object, the Publication table would first be updated, then the Book table, then the Novel table, using data entered in the form. Pretty good so far. Using an object-relational mapper (for example the Django ORM, Storm or SQLAlchemy), the task of creating or updating records in the database can be greatly simplified. Django takes the approach of directly defining class attributes as schema fields, so that a field can simultaneously describe a form user interface as well as a database table schema field.
Things would break down rather badly if a database schema were to diverge, and no longer match the class hierarchy. So, in the case where a class hierarchy is used to map a database schema, the two would always have to be in synchrony.
Zope & Grok do things rather differently.
A Zope Schema is defined as a set of schema fields, collected into a Zope Interface. For example, one might define:
In summary, a Zope schema defines the fields contained in an entity such as a Novel, and inheritance may be used to help define that schema interface.
However, just defining a schema does not in any way describe or constrain the structure of your data model. We could for example declare a Novel as a single discrete Python class, such as:
If we were to do this, the plain old data class Novel is now associated with the INovel interface, and also any parent interfaces:
Since various HTML form elements are already associated with Zope schema fields via form libraries such as zope.formlib or z3c.form, it is possible to automatically generate entry forms, add forms or display forms just from the Zope schema. This is similar in some ways to the approach taken by other web frameworks, except that the zope.interface.Interface is completely abstracted from the data storage format.
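The separation between a schema and the class that stores the data can be illustrated without any Zope packages. The following is a plain-Python analogy only: it does not use or imitate the zope.interface or zope.schema APIs, and the `fields` dictionaries, widget names and helper functions are assumptions made for this sketch.

```python
# Plain-Python stand-ins for schema interfaces; NOT the zope.schema API.
class IPublication:
    fields = {"title": "text", "date": "date", "authors": "list"}


class IBook(IPublication):
    fields = {"toc": "list", "isbn": "text"}


class INovel(IBook):
    fields = {"foreword": "textarea"}


def schema_fields(schema):
    """Collect field declarations from a schema and all of its parents."""
    collected = {}
    # Walk base classes first so parent fields precede specialised ones.
    for klass in reversed(schema.__mro__):
        collected.update(vars(klass).get("fields", {}))
    return collected


def render_form(schema):
    """Return one input element per schema field."""
    return ['<input name="%s" data-widget="%s">' % item
            for item in schema_fields(schema).items()]


class Novel:
    """Flat data class: no base classes of its own, described by INovel."""
    schema = INovel


print(list(schema_fields(Novel.schema)))
print("\n".join(render_form(Novel.schema)))
```

The form is derived entirely from the schema description; how or where a Novel instance is stored plays no part in it.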
To allow instances of Novel to persist in a ZODB database, we can derive the class from grok.Model, which in turn derives from persistent.Persistent, the base class required for objects stored in ZODB.
Recalling that in our schema everything (other than IAuthor) derives from an IPublication, we could feasibly store all data for our publications in a single ZODB folder. The different Publication specialisations are similar to files of different types in the same file system folder. How these publications are treated could simply be a function of the type.
For example, imagine an application with the following simple ZODB structure:
Where app, authors and publications are all containers, this would provide automatic URLs for http://[server]/app, http://[server]/app/authors and http://[server]/app/publications. In code, this might look something like this:
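The shape of such a structure can be sketched without Grok itself. Here Container is a hypothetical dict-based stand-in for a persistent container, and traverse is a made-up helper that mimics how a URL path is resolved one segment at a time; neither is part of Grok or ZODB.

```python
class Container(dict):
    """Hypothetical stand-in for a ZODB/Grok container: a named mapping."""


class App(Container):
    def __init__(self):
        super().__init__()
        # Child containers are created together with the application.
        self["authors"] = Container()
        self["publications"] = Container()


def traverse(root, path):
    """Resolve a URL path such as '/app/publications' segment by segment."""
    node = root
    for segment in path.strip("/").split("/"):
        node = node[segment]  # raises KeyError if no such child exists
    return node


root = Container(app=App())
publications = traverse(root, "/app/publications")
print(publications is root["app"]["publications"])
```

In this sketch both child containers live inside the application object, so each is reachable by a path that starts at the application.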
Looking at the container App()['publications'], there is no implied restriction as to what might be stored in the container. The trick, though, is handling (creating, listing, updating and viewing) data items intelligently. One might imagine an application which must be able to list all publications regardless of type, but also create new books. When creating a Novel, there should be entry fields specific to novels that are not present for other book types. There are various approaches to accomplishing this, and no approach would be "wrong".
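To make the single-container case concrete, here is a small sketch in which a plain dict stands in for the publications container. The form_fields tuples and the item keys are assumptions for illustration; they only show that listing can ignore type while entry fields depend on it.

```python
class Publication:
    form_fields = ("title", "date", "authors")


class Article(Publication):
    pass


class Book(Publication):
    form_fields = Publication.form_fields + ("toc", "isbn")


class Novel(Book):
    form_fields = Book.form_fields + ("foreword",)


# A plain dict stands in for the publications container.
publications = {"n1": Novel(), "a1": Article(), "b1": Book()}

# List everything, regardless of type.
all_names = sorted(publications)

# List only books; a Novel counts because it is a kind of Book.
book_names = sorted(name for name, item in publications.items()
                    if isinstance(item, Book))

print(all_names, book_names)
print(Novel.form_fields)
```

The container itself stays generic; the type of each stored item decides which fields an entry form would offer.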
One approach is to add sub-containers for Book, Article and Pamphlet to the Publications container. One might then add Biography, Novel and Reference sub-containers to Book, and so on. This has many advantages:
On the other hand this has some disadvantages too, when compared to a single Publications folder which contains publications of various types:
A rule of thumb is to think of ZODB as a data storage system for objects. While containers generally store items of related classes, it is not required that they do so. However, structuring one's storage in a logical way aids classification and later searches to find content. ZODB can store both structured and unstructured content.
Compare a persistent object storage container, which may easily store items of varying types, to a relational database table, which always stores the same set of columns (although generic views can be approximated with unused columns holding null data). Persistent objects may have logic (i.e. methods) as well as data attributes; relational databases may have stored functions, but that is not the same thing. Modelling data relationally does have some nice advantages when it comes to foreign key constraints, triggers, queries, joins and numerous other things for which there are no direct equivalents in object databases. For example, ensuring uniqueness in an object database might involve an <on_insert> event handler which checks against an index for prior existence; that extra work is unnecessary in a relational database, where a unique constraint does the job.
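The uniqueness check described above might be sketched as follows. UniqueContainer and its on_insert method are hypothetical names for this illustration, not an event API of ZODB or Grok, and a Python set stands in for the index.

```python
class UniqueContainer:
    """Hypothetical container that rejects duplicate ISBNs on insert."""

    def __init__(self):
        self._items = {}
        self._isbn_index = set()  # index consulted before every insert

    def on_insert(self, item):
        # The check a relational UNIQUE constraint would perform for us.
        if item["isbn"] in self._isbn_index:
            raise ValueError("duplicate ISBN: %s" % item["isbn"])

    def add(self, key, item):
        self.on_insert(item)
        self._items[key] = item
        self._isbn_index.add(item["isbn"])

    def __len__(self):
        return len(self._items)


shelf = UniqueContainer()
shelf.add("b1", {"title": "First", "isbn": "0-000-00000-0"})
try:
    shelf.add("b2", {"title": "Copy", "isbn": "0-000-00000-0"})
except ValueError as exc:
    print(exc)
```

The index has to be maintained by the application on every insert, which is exactly the extra work the paragraph refers to.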
Typically, one will find relational databases do not easily map to object oriented designs.
Other than when working with object-relational databases such as PostgreSQL, a translation of our schema to a set of related tables does not provide us with a mechanism for inheritance. Rather, the fact that a Publication has a list of Authors is described through a foreign key relationship between the two tables, and Book and Publication are related in the same way.
Rather than using inheritance to describe the fact that a Book is a Publication with an ISBN and table of contents, or that a Novel is a Book with a foreword, it is perhaps a better approach, at least from the point of view of relational databases, to describe a Novel as having a Book, and a Book as having a Publication. This changes our schema slightly:
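Expressed as plain Python data classes, the "has a" version of the model might look like the sketch below. The attribute names publication and book are assumptions chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Publication:
    title: str
    date: str
    authors: List[str] = field(default_factory=list)


@dataclass
class Book:
    publication: Publication  # a Book *has a* Publication
    isbn: str
    toc: List[str] = field(default_factory=list)


@dataclass
class Novel:
    book: Book  # a Novel *has a* Book
    foreword: str


# Sample values are placeholders, not real bibliographic data.
novel = Novel(
    Book(Publication("Example Novel", "2000-01-01", ["A. Writer"]),
         isbn="0-000-00000-0"),
    foreword="A short foreword.",
)
print(novel.book.publication.title)
```

Nothing here inherits from anything else; the title of a novel is reached by following references, just as it would be reached through foreign keys in a database.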
If our data were stored in an RDBMS such as PostgreSQL, MySQL or even SQLite, one could define database tables as follows:
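One possible set of tables, sketched with SQLite through Python's standard sqlite3 module. The table and column names are assumptions, as is the use of a link table for the publication-to-author relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE author (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE publication (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    date  TEXT
);
-- Link table: a publication has one or more authors.
CREATE TABLE publication_author (
    publication_id INTEGER NOT NULL REFERENCES publication(id),
    author_id      INTEGER NOT NULL REFERENCES author(id),
    PRIMARY KEY (publication_id, author_id)
);
CREATE TABLE book (
    id             INTEGER PRIMARY KEY,
    publication_id INTEGER NOT NULL REFERENCES publication(id),
    isbn           TEXT UNIQUE,
    toc            TEXT
);
CREATE TABLE novel (
    id       INTEGER PRIMARY KEY,
    book_id  INTEGER NOT NULL REFERENCES book(id),
    foreword TEXT
);
""")

# Storing one novel touches three tables, parent rows first.
with conn:
    pub_id = conn.execute(
        "INSERT INTO publication (title, date) VALUES (?, ?)",
        ("Example Novel", "2000-01-01")).lastrowid
    book_id = conn.execute(
        "INSERT INTO book (publication_id, isbn) VALUES (?, ?)",
        (pub_id, "0-000-00000-0")).lastrowid
    conn.execute("INSERT INTO novel (book_id, foreword) VALUES (?, ?)",
                 (book_id, "A short foreword."))

# Reassemble the novel by joining back up the chain of foreign keys.
row = conn.execute("""
    SELECT p.title, b.isbn, n.foreword
    FROM novel n
    JOIN book b ON n.book_id = b.id
    JOIN publication p ON b.publication_id = p.id
""").fetchone()
print(row)
```

The novel, book and publication rows are tied together purely by foreign keys, which mirrors the "has a" relationships described above.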
Now assuming the schema were implemented in the database, the attributes would be automatically read in through reflection, and relations would be defined as described in the database.
So the Publication class would have fields defined for attributes title, author, and date, and also because Book refers to Publication specifying a 'book' back reference, there will also be a Publication.book attribute containing a list of books. These attributes map directly to the database, so updating an attribute will result in the appropriate SQL instructions to accomplish the change.
Using object-relational mappings might seem like a step backward when compared to a true object database such as ZODB, but there are many advantages, not least the ability to access your database from code written in languages other than Python. It is always possible to extend your mapped classes with your own methods and even non-persistent attributes, and so one gains much flexibility and portability by using an ORM such as SQLAlchemy.