[InterMine Dev] seeking advice, best practices

Joel Richardson jer at informatics.jax.org
Tue Sep 13 14:55:57 BST 2011


Hi all,

The more I dig into InterMine, the more possibilities I see
but also the more questions I have. I'd really appreciate
any help/insight/opinions as to the best ways to deal with
the following issues.

Core model + bio extensions. Is there flexibility here? Can any
of this be changed without breaking things? Which parts?
Reasons:
   - a fair amount of the model is irrelevant to our data and so
   will remain unpopulated in the mine. I know we can just ignore these
   parts (and that's fine for now), but it seems a bit awkward, e.g.,
   to have "dead" classes available in the query builder.
   - some aspects conflict with the data we have. For example,
   neither Alleles nor Proteins are subclasses of BioEntity, and so
   cannot have OntologyAnnotations. Again, I know we can define
   our own subclasses of BioEntity (MGIAllele, MGIProtein, or whatever),
   but that seems messy.
A larger question is whether/how the different mines (at least, the
InterMOD ones) coordinate their model extensions. I'm assuming everyone
pretty much extends the core model for their own purposes, and it's
a great strength of InterMine that this is possible. But it also
raises issues for interoperability as the mines' models diverge.

Source control/versioning. I'm wondering how people are approaching
version control of their mines' components (config files, source
code, etc.) as distinct from InterMine itself?

Loading lots of data from a relational db. Most of our data
will come out of MGI. There's lots of it and lots of different types.
Should this be one big load or lots of little ones? Should the
loads connect to the db directly, or should the db get dumped
in Items XML format and we load that? If loading Items XML, is it
better to load one big file, or a directory of smaller ones?

Many thanks in advance,
Joel


-- 

===============================================================
Joel Richardson, Ph.D.
Sr. Research Scientist
Mouse Genome Informatics
The Jackson Laboratory   Phone: (207) 288-6435
600 Main Street          Fax:   (207) 288-6132
Bar Harbor, Maine 04609  URL:   www.informatics.jax.org
===============================================================


