[InterMine Dev] Java libraries used

Alex Kalderimis alex at intermine.org
Tue Nov 13 10:47:58 GMT 2012


The standard procedure for writing data-loaders in Java is to write code
that reads the flat files and loads the data they contain directly into
the items-database. So in beautiful ascii-art:

   +------------------------+
   | integrate calls        |
   |  * gff3-loader      <==|=== [INPUT-1.gff3]
   |  * go-annotation    <==|=== [INPUT-2.obo]
   |  * {YOUR-LOADER}    <==|=== {YOUR FLAT FILES}
   +------------------------+
         |
         ˇ
   +------------------------+
   |  items-db              |
   |  (staging db)          |
   +------------------------+
         |
         ˇ
   +------------------------+
   |  production db         |
   +------------------------+

The workflows you will have been familiar with, by contrast, involve a
pre-loading step that generates items-xml, which is simply a format for
which we already have a generic loader:

   +------------------------+
   | Translate your files   |
   |  read input         <==|=== {YOUR FLAT FILES}
   |  produce items-xml   ==|======================+
   +------------------------+                      "
                                                   "
                                                   "
   +------------------------+                      "
   | integrate calls        |                      "
   |  * gff3-loader      <==|=== [INPUT-1.gff3]    "
   |  * go-annotation    <==|=== [INPUT-2.obo]     "
   |  * load-items       <==|=== items.xml <=======+
   +------------------------+
         |
         ˇ
   AS BEFORE...

As you can see, the two workflows are equivalent, but:
  * Generating your own items-xml is inefficient as it
    involves reading the same data twice in two different
    processes.
  * Any logic that generates items-xml translates directly into
    statements that load the same data - you might as well skip
    the intermediate step.
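
To make the single-pass idea concrete, here is a minimal sketch in plain
Java. It is not InterMine code: ItemSink, ListSink and DirectLoaderSketch
are hypothetical names invented for illustration, and the two-column
tab-separated input (identifier, symbol) is an assumed file layout. In a
real loader the sink would be whatever writes items to the items-db.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are InterMine classes.
final class DirectLoaderSketch {

    /** Stand-in for whatever stores items in the staging database. */
    interface ItemSink {
        void store(String className, Map<String, String> attributes);
    }

    /** In-memory sink, so the sketch can be run without a database. */
    static final class ListSink implements ItemSink {
        final List<String> stored = new ArrayList<>();

        @Override
        public void store(String className, Map<String, String> attributes) {
            stored.add(className + attributes);
        }
    }

    /**
     * Reads tab-separated (identifier, symbol) rows and stores one "Gene"
     * item per row as it goes, with no intermediate XML file.
     * Blank lines and lines starting with '#' are skipped.
     * Returns the number of items stored.
     */
    static int load(Reader input, ItemSink sink) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty() || line.startsWith("#")) {
                    continue;
                }
                String[] cols = line.split("\t", -1);
                if (cols.length < 2) {
                    throw new IllegalArgumentException("Expected 2 columns: " + line);
                }
                // Attribute names are illustrative; use those of your model.
                Map<String, String> attrs = new LinkedHashMap<>();
                attrs.put("primaryIdentifier", cols[0]);
                attrs.put("symbol", cols[1]);
                sink.store("Gene", attrs);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

Calling DirectLoaderSketch.load(reader, sink) reads the input once and
hands each row to the sink as soon as it is parsed. Any validation you
planned to do before writing items XML would sit at the same point, just
before the call to store.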

I believe Julie sent you a link to information on how to write a loader
for a custom data-source; if you have any difficulties getting started,
please let us know and we will try and help out.

Alex


On 12/11/12 19:20, JD Wong wrote:
> Definitely,
> I'm writing a file converter which will translate flat files into Items
> xml.  It will process input files based on a schema provided, and
> validate each one to the model before converting it into items XML.
>  Basically the InterMine model in a nutshell, thoughts?
> 
> -JD
> 
> 
> On Mon, Nov 12, 2012 at 1:49 PM, Alex Kalderimis <alex at intermine.org
> <mailto:alex at intermine.org>> wrote:
> 
>     Of course. Could you explain a bit what you are trying to do?
> 
>     JD Wong <jdmswong at gmail.com <mailto:jdmswong at gmail.com>> wrote:
> 
>         Hi Alex,
> 
>         I'm looking to generate custom-defined java objects like
>         InterMine does.  Do you mind sharing with me the libraries used?
> 
>         Thanks!
>         -JD
> 
> 
>     -- 
>     Sent from my Android phone with K-9 Mail. Please excuse my brevity.
> 
> 


