Response to question 3

 

A central goal of IDM research should be to enable the creation of conceptual data models that directly reflect the application domain being modeled.  Thus, a conceptual data model for molecular biology should directly embody concepts like ‘gene’, ‘protein’, and ‘chemical reaction’.  Likewise, a conceptual model for a molecular biology laboratory process should directly embody concepts like ‘sequence the DNA of a gene’, ‘measure the abundance of a protein’, and ‘determine which pairs of proteins can react’.
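
To make "directly embody" concrete, here is a minimal sketch in Python of what such a model might look like.  Everything in it is an assumption made for illustration: the class names, the fields, the `can_react` helper, and the placeholder entities (`gene_a`, `protein_a`, and so on) are hypothetical and not drawn from any real system or dataset.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Gene:
    name: str
    dna_sequence: str = ""  # would be filled in by a 'sequence the DNA of a gene' step


@dataclass(frozen=True)
class Protein:
    name: str
    encoded_by: Gene


@dataclass(frozen=True)
class ChemicalReaction:
    reactants: Tuple[Protein, ...]
    products: Tuple[Protein, ...]


def can_react(a, b, reactions):
    """True if some known reaction lists both proteins among its reactants."""
    return any(a in r.reactants and b in r.reactants for r in reactions)


# Placeholder entities, invented purely for illustration.
gene_a, gene_b, gene_c = Gene("geneA"), Gene("geneB"), Gene("geneC")
protein_a = Protein("proteinA", encoded_by=gene_a)
protein_b = Protein("proteinB", encoded_by=gene_b)
protein_c = Protein("proteinC", encoded_by=gene_c)
reactions = [ChemicalReaction(reactants=(protein_a, protein_b), products=(protein_c,))]
```

The point of the sketch is only that the vocabulary of the code is the vocabulary of the domain: a biologist reading it sees genes, proteins, and reactions rather than tables and foreign keys.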

 

If people like me could build conceptual models at this level of abstraction, we could turn them over to people with much more domain expertise (e.g., real biologists) to create databases for specific purposes.  Thus, I could give my conceptual model of ‘genes’, ‘proteins’, and ‘chemical reactions’ to a molecular biologist who studies cancer, and s/he could use it to construct a database about the molecular basis of cancer.

 

Since it’s infeasible to implement each such concept from scratch, we need layers of software underneath that provide generic mechanisms for implementing a broad range of concepts.  We also need query language processors that can take queries expressed in terms of these concepts and execute them against the underlying engine.  This is a rich area, I think, for research on database systems and theory.
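
The layering described here can be sketched roughly as follows, in Python.  This is an illustration under stated assumptions, not a proposed design: the generic layer is assumed to be a simple store of (subject, relation, object) facts, and the names `GenericStore`, `MolecularBiologyModel`, the relation labels, and the placeholder entities are all hypothetical.

```python
class GenericStore:
    """Generic layer: stores (subject, relation, object) facts, knows no biology."""

    def __init__(self):
        self._facts = set()

    def add(self, subject, relation, obj):
        self._facts.add((subject, relation, obj))

    def match(self, subject=None, relation=None, obj=None):
        """Return all facts matching the pattern; None acts as a wildcard."""
        return [
            f for f in self._facts
            if (subject is None or f[0] == subject)
            and (relation is None or f[1] == relation)
            and (obj is None or f[2] == obj)
        ]


class MolecularBiologyModel:
    """Concept layer: speaks in domain terms, translates to the generic store."""

    def __init__(self, store):
        self._store = store

    def add_gene(self, gene):
        self._store.add(gene, "is_a", "gene")

    def add_protein(self, protein, encoded_by):
        self._store.add(protein, "is_a", "protein")
        self._store.add(encoded_by, "encodes", protein)

    def add_reaction(self, reaction, reactants):
        self._store.add(reaction, "is_a", "chemical reaction")
        for protein in reactants:
            self._store.add(protein, "participates_in", reaction)

    def proteins_encoded_by(self, gene):
        facts = self._store.match(subject=gene, relation="encodes")
        return sorted(obj for _, _, obj in facts)

    def reaction_partners(self, protein):
        """Concept-level query, executed as two pattern matches on the store."""
        partners = set()
        for _, _, reaction in self._store.match(subject=protein, relation="participates_in"):
            for other, _, _ in self._store.match(relation="participates_in", obj=reaction):
                if other != protein:
                    partners.add(other)
        return sorted(partners)


# Placeholder data, invented purely for illustration.
model = MolecularBiologyModel(GenericStore())
model.add_gene("geneA")
model.add_gene("geneB")
model.add_protein("proteinA", encoded_by="geneA")
model.add_protein("proteinB", encoded_by="geneB")
model.add_reaction("reaction1", reactants=["proteinA", "proteinB"])
partners = model.reaction_partners("proteinA")
```

A user of `MolecularBiologyModel` asks "which proteins can react with this one?" and never sees the underlying facts; the same `GenericStore` could sit beneath a quite different concept layer.  A real query processor would of course have to do this translation for arbitrary queries rather than for a few hand-written methods, which is where the research questions lie.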