[InterMine Dev] Java libraries used

Alex Kalderimis alex at intermine.org
Fri Nov 16 17:21:40 GMT 2012


Dear JD,

The interaction between the InterMine user-land code and the database is 
through an ORM called the "ObjectStore", a piece of software written 
for InterMine and maintained as part of the project (it resides in the 
org.intermine.objectstore package). It relies on ANTLR for various 
code-generation and SQL-parsing tasks, and it depends on the PostgreSQL 
adaptors. We also make use of various standard org.apache.commons 
libraries.

The model is described in XML, and then built. The build stage involves 
two parallel tasks:
  * Create tables and indices within the database to hold the data.
  * Generate Java classes that reflect those tables. This involves 
    literally writing out .java files, which are then compiled in a 
    separate build stage. This is why you must first build your db 
    before running any integrate code.
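To make the class-generation step concrete, here is a minimal sketch of 
the kind of POJO bean the build writes out for a Gene class in the model 
XML. The field name here is just an illustration, and real generated 
sources also carry InterMine bookkeeping (an internal id, interfaces 
from org.intermine.model) that is omitted:

```java
// Hypothetical sketch of a generated bean; real generated .java files
// include extra InterMine machinery beyond this plain-bean shape.
public class Gene {
    private String symbol;

    // Standard bean accessors, mirroring the "symbol" attribute
    // declared for Gene in the model XML.
    public String getSymbol() {
        return symbol;
    }

    public void setSymbol(String symbol) {
        this.symbol = symbol;
    }
}
```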

Once built, the object-store can be used like any other Java code, and 
of course it is, since the classes created by the object-store are POJO 
beans. If you want to make and store a new gene, it goes along the 
lines of:

    Gene gene = new Gene();
    gene.setSymbol("eve");
    osw.store(gene);

You can find tonnes of examples of this in the pre-written InterMine 
data loaders.
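As a self-contained illustration of that create-set-store pattern (the 
InMemoryWriter below is a stand-in for InterMine's real ObjectStoreWriter, 
and the cut-down Gene bean mimics a generated class; both are hypothetical 
simplifications):

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch of the create-set-store pattern. InMemoryWriter
// stands in for InterMine's ObjectStoreWriter (which is backed by the
// production PostgreSQL database); Gene mimics a generated model bean.
public class StorePatternSketch {

    static class Gene {
        private String symbol;
        public String getSymbol() { return symbol; }
        public void setSymbol(String symbol) { this.symbol = symbol; }
    }

    static class InMemoryWriter {
        final List<Object> stored = new ArrayList<>();
        void store(Object o) { stored.add(o); }
    }

    public static void main(String[] args) {
        InMemoryWriter osw = new InMemoryWriter();
        Gene gene = new Gene();
        gene.setSymbol("eve");
        osw.store(gene); // same shape as osw.store(gene) against a real writer
        System.out.println(osw.stored.size()); // prints 1
    }
}
```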

tl;dr: InterMine objects are just normal Java objects, but their source 
files are written by the custom InterMine ORM software.

All best,

Alex

On Fri 16 Nov 2012 16:49:54 GMT, JD Wong wrote:
> Thanks Alex!  Correct me if I'm wrong, but it is my understanding that
> InterMine dynamically generates Java objects based on the model.  Do
> you mind going into detail on how that is done ( and which libraries
> are used ) ?
>
> Thanks,
> -JD
>
>
> On Tue, Nov 13, 2012 at 5:47 AM, Alex Kalderimis
> <alex at intermine.org> wrote:
>
>     The standard procedure for writing data-loaders in Java is to
>     write code that reads the flat files and loads the data they
>     contain directly into the items-database. So in beautiful ascii-art:
>
>        +------------------------+
>        | integrate calls        |
>        |  * gff3-loader      <==|=== [INPUT-1.gff3]
>        |  * go-annotation    <==|=== [INPUT-2.obo]
>        |  * {YOUR-LOADER}    <==|=== {YOUR FLAT FILES}
>        +------------------------+
>              |
>              ˇ
>        +------------------------+
>        |  items-db              |
>        |  (staging db)          |
>        +------------------------+
>              |
>              ˇ
>        +------------------------+
>        |  production db         |
>        +------------------------+
>
>     Whereas the workflows you will have been familiar with involved a
>     pre-loading step of generating items-xml, which is simply a format
>     for which we already have a generic loader:
>
>        +------------------------+
>        | Translate your files   |
>        |  read input         <==|=== {YOUR FLAT FILES}
>        |  produce items-xml   ==|======================+
>        +------------------------+                      "
>                                                        "
>                                                        "
>        +------------------------+                      "
>        | integrate calls        |                      "
>        |  * gff3-loader      <==|=== [INPUT-1.gff3]    "
>        |  * go-annotation    <==|=== [INPUT-2.obo]     "
>        |  * load-items       <==|=== items.xml <=======+
>        +------------------------+
>              |
>              ˇ
>        AS BEFORE...
>
>     As you can see, the two workflows are equivalent, but:
>       * Generating your own items-xml is inefficient as it
>         involves reading the same data twice in two different
>         processes.
>       * Any logic that generates items-xml can be directly translated
>         to statements that load that data - you might as well skip
>         the intermediary step.
>
>     I believe Julie sent you a link to information on how to write a
>     loader for a custom data-source; if you have any difficulties
>     getting started, please let us know and we will try and help out.
>
>     Alex
>
>
>     On 12/11/12 19:20, JD Wong wrote:
>     > Definitely,
>     > I'm writing a file converter which will translate flat files
>     into Items
>     > xml.  It will process input files based on a schema provided, and
>     > validate each one to the model before converting it into items XML.
>     >  Basically the InterMine model in a nutshell, thoughts?
>     >
>     > -JD
>     >
>     >
>     > On Mon, Nov 12, 2012 at 1:49 PM, Alex Kalderimis
>     > <alex at intermine.org> wrote:
>     >
>     >     Of course. Could you explain a bit what you are trying to do?
>     >
>     >     JD Wong <jdmswong at gmail.com> wrote:
>     >
>     >         Hi Alex,
>     >
>     >         I'm looking to generate custom-defined java objects like
>     >         InterMine does.  Do you mind sharing with me the
>     libraries used?
>     >
>     >         Thanks!
>     >         -JD
>     >
>     >
>     >     --
>     >     Sent from my Android phone with K-9 Mail. Please excuse my
>     brevity.
>     >
>     >
>
>


