
RE: [Orekit Users] design of the DataProviders



Hi Luc,

Thank you very much for this detailed answer. I am glad to see that we have identified similar potential improvements.
I am looking forward to discussing with you in person at the Orekit Day.

In the meantime, I will try to make some time to gather my thoughts and post on the developers list. I am afraid this may take a few days.


Kind regards
Yannick




-----Original Message-----
From: orekit-users-request@orekit.org [mailto:orekit-users-request@orekit.org] On Behalf Of MAISONOBE Luc
Sent: Monday, November 13, 2017 12:17 PM
To: orekit-users@orekit.org
Subject: Re: [Orekit Users] design of the DataProviders


"JEANDROZ, Yannick [FR]" <yannick.jeandroz@airbus.com> wrote:

> Hello,

Hi Yannick,

>
> I think I have encountered a use case that, to my knowledge, Orekit cannot
> handle. This has led me to investigate the inner workings of DataProviders. I
> believe that the design, while very robust for "one-shot" simulations, is not
> well suited for applications where the model data are a bit "dynamic" during
> the execution. Could someone please confirm what I think I understood?
>
>
>
> My starting point is that I need to execute Orekit code in a
> multi-threaded context, with thread-specific data providers. My actual
> use case is about solar activity, but for the sake of simplicity I
> will provide examples based on the 'tai-utc.dat' file.
>
> Basically, I have 2 different versions of the data file "a.dat" and
> "b.dat", and I need to run simultaneously thread A using a.dat,
> and thread B using b.dat. Since DataProvidersManager is a
> singleton, I have not found a way to do that.

You are right, DataProvidersManager is a global singleton. We could
probably easily make its internal fields ThreadLocal, but this would
probably not work as expected, see below the reasoning about the
interaction between caches and threads.

>
>
>
>
> I understand that my need is very specific. But in the process of simplifying
> my test case as much as possible, I have found another strange behaviour of
> DataProvidersManager, even in a mono-thread application. I find this more
> problematic. Since model data are usually cached by the factory classes (for
> instance TimeScalesFactory), it seems virtually impossible to change the
> dataproviders during execution (even in a mono-thread context).
>
> I have an example to illustrate this. I have built two data sets:
> - dataset 1 uses a "correct" tai-utc.dat
> - dataset 2 uses a modified tai-utc.dat, where I have added a 0.5s shift in
>   TAI-UTC values
>
> Now I perform a simple computation on each of them with the following method:
>
>     private void displayUTC(String datapath) throws OrekitException {
>         DataProvidersManager.getInstance().addProvider(new DirectoryCrawler(new File(datapath)));
>         TimeScale utc = TimeScalesFactory.getUTC();
>         AbsoluteDate date = new AbsoluteDate("1999-08-22T00:00:00", utc);
>         System.out.println(date.durationFrom(AbsoluteDate.GALILEO_EPOCH));
>         System.out.println(DataProvidersManager.getInstance().getLoadedDataNames());
>     }
>
> My test code looks like this. Each method must be run in a separate
> process, to make sure nothing is kept in memory between executions.
>
> RUN1 :
>     public void test1() throws OrekitException {
>         System.out.println("Test 1");
>         displayUTC("C:\\dataset1");
>     }
> OUTPUT :
>                 Test 1
>                 0.0
>                 [C:\dataset1\tai-utc.dat]
>
> This is the expected output.
>
>
> RUN2 :
>     public void test2() throws OrekitException {
>         System.out.println("Test 2");
>         displayUTC("C:\\dataset2");
>     }
> OUTPUT :
>                 Test 2
>                 0.5
>                 [C:\dataset2\tai-utc.dat]
>
> This is the expected output. Notice the 0.5s shift that I have
> introduced in the data file.
>
> RUN3 :
>     public void testBoth() throws OrekitException {
>         System.out.println("Test both");
>         displayUTC("C:\\dataset1");
>         DataProvidersManager.getInstance().clearProviders();
>         displayUTC("C:\\dataset2");
>     }
> OUTPUT :
>                 Test both
>                 0.0
>                 [C:\dataset1\tai-utc.dat]
>                 0.0
>                 [C:\dataset1\tai-utc.dat]
>
> As you can see, the results for the second test change when it is executed
> right after the first, in the same process. Dataset 1 is used twice, despite
> clearing the DataProviders and reloading dataset 2. I believe this is because
> the data is cached in the TimeScalesFactory. I think it would make more sense
> to cache the data in the DataProviders (or maybe DataLoaders) instead of the
> factories.

You are right about the cause: the factory caches data. In fact, when we run
the unit tests for Orekit, we do this kind of stuff thousands of times, so
we had to circumvent our own caches. We have set up a clearFactories method
in the class org.orekit.Utils just for this purpose. Beware, this is only
in the test part of the sources and is *not intended* to be put in the
library part. It is an ugly hack only suitable for tests. For information,
we do this using introspection, accessing private fields and
modifying them (we also reset the singletons this way).
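
To illustrate the kind of test-only reset described above, here is a minimal
sketch. CachingFactory and clearStaticField are hypothetical names invented
for this example; this is not the code of org.orekit.Utils, only the general
introspection technique applied to a stand-in factory.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a factory that lazily caches loaded data
// in a private static field (not an actual Orekit class).
final class CachingFactory {
    private static String cached = null;

    private CachingFactory() { }

    // Returns the cached value, loading it from 'source' only on the first call.
    static String get(String source) {
        if (cached == null) {
            cached = "data from " + source;
        }
        return cached;
    }
}

final class TestCacheReset {
    private TestCacheReset() { }

    // Test-only helper: sets a private static reference field to null by reflection.
    static void clearStaticField(Class<?> owner, String fieldName) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(null, null); // first null: static field, no instance needed
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot reset " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(CachingFactory.get("dataset1")); // data from dataset1
        System.out.println(CachingFactory.get("dataset2")); // data from dataset1 (cached)
        clearStaticField(CachingFactory.class, "cached");
        System.out.println(CachingFactory.get("dataset2")); // data from dataset2
    }
}
```

The second call shows the same symptom as RUN3: the new source is ignored
until the private cache is cleared.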

I am not sure about the solution. The caches were designed to be used by
several threads, and it took quite some time to achieve this (look at
the GenericTimeStampedCache class for an idea of what it looks like). This
design addressed the following use case: a server application runs
somewhere and answers requests (typically web services) coming from
the network. The application uses a pool of threads to handle requests.
The important part is here: as the pool of threads can be reused, even
if remote clients A and B are always using the same data set (for example
computing next week maneuvers and therefore using data around next
week for client A and post processing last month data for client B),
requests from client A and client B will not always be served by the same
thread on the server. The threads are picked up from the pool, serve
one request and are returned to the pool. The next incoming request from the
same client may pick up a different thread. So ThreadLocal does *not* work
in this context. This would be even worse if threads were created and
destroyed continuously to handle only one request: each new request
would use a newly created thread. This is why we currently have caches
that have the following properties:
  - they are thread safe
  - they can handle data from different time ranges
  - thread association with time range can change for each request
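
The pool behaviour described above can be reproduced with a small sketch.
DATA_SET and secondRequestView are hypothetical names; the latches are there
only to keep the first worker busy, so that the second request is certain to
land on another thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadLocalPoolDemo {
    // Hypothetical per-thread "data set" selection.
    private static final ThreadLocal<String> DATA_SET = new ThreadLocal<>();

    private ThreadLocalPoolDemo() { }

    // Returns what the second request of client A finds in the ThreadLocal.
    static String secondRequestView() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch firstConfigured = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // First request of client A: selects its data set, then stays busy.
            Callable<Void> firstRequest = () -> {
                DATA_SET.set("a.dat");
                firstConfigured.countDown();
                release.await();
                return null;
            };
            // Second request of client A: only reads the per-thread selection.
            Callable<String> secondRequest = DATA_SET::get;

            Future<Void> first = pool.submit(firstRequest);
            firstConfigured.await();
            // The first worker is blocked, so this runs on the other worker.
            String seen = pool.submit(secondRequest).get();
            release.countDown();
            first.get();
            return seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // The selection made during the first request is not visible here.
        System.out.println(secondRequestView()); // null
    }
}
```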

Obviously, the use case we addressed is different from the one you need.
In our case, one Orekit process (possibly using different threads)
corresponds to one data set (i.e. a set of data loaders and the caches
containing the data loaded from them).

In your case each of your threads has a dedicated predefined meaning and
should use its own data set.

I don't know yet how to manage this. Do your different threads really
need to be threads within the same process? Could you simply use different
processes, possibly exchanging data with inter-process communication
if needed (or not exchanging data at all if they don't need to)?

>
>
>
> Finally, but this is a very very minor nitpick: the data loading mechanism is
> based on data file names. This is a bit confusing when working with a
> non-file-based storage, typically a database of some sort. Asking for data by
> "type" (solar activity, earth orientation parameters...) would seem more
> intuitive to me.

I fully agree. We had a SOCIS intern in 2014 for this. The work done
is available here:  
<https://www.orekit.org/forge/projects/socis-2014-database/>
It has never been integrated as it needs more work.

The "filename" could be considered simply as a key or table name. I don't know
if this would be a large API change or simply a documentation and parameter
name change.
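
As a rough sketch of the "name as key" idea, assuming nothing about the
actual Orekit API: KeyedDataStore below is a hypothetical in-memory store
in which the string that would normally be a file name is only a lookup key,
selected by a regular expression.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical store: keys play the role of "file names", but nothing
// here touches the file system (it could be backed by database tables).
final class KeyedDataStore {
    private final Map<String, String> entries = new LinkedHashMap<>();

    // Registers some content under a key.
    void put(String key, String content) {
        entries.put(key, content);
    }

    // Returns the keys fully matching the regular expression, in insertion order.
    List<String> keysMatching(String regex) {
        Pattern supported = Pattern.compile(regex);
        List<String> keys = new ArrayList<>();
        for (String key : entries.keySet()) {
            if (supported.matcher(key).matches()) {
                keys.add(key);
            }
        }
        return keys;
    }

    // Returns the content stored under a key, or null if the key is unknown.
    String content(String key) {
        return entries.get(key);
    }
}
```

With such a store, a key like "tai-utc.dat" and a key like "solar_activity"
are handled the same way, which suggests the change could be mostly a matter
of naming and documentation.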

>
> After re-reading this email, I feel like I am bashing the data loading
> mechanism. Please do not interpret my feedback this way: I have used Orekit
> for several years now, and this is the first time I feel like I have hit a
> hard limitation. This is a testament to the overall design of the library.

Thanks for the kind words, it is appreciated.

>
>
> I have started thinking about possible refactorings of the model data
> management. I have a somewhat similar behaviour somewhere else in my software,
> and I have used a dependency inversion based on Java services to solve it. So
> far, it seems to work quite well (but my software is not that big yet, so it
> might be a bit early to tell). Maybe something like this could be implemented
> for Orekit data management? I would gladly share a very basic draft of my
> ideas if it can be of any help.

Sure! We can speak about this on the developers list (and also during
the Orekit day at the end of the month!).

best regards,
Luc

>
>
>
> Thank you for your time.
>
> Yannick Jeandroz
>
> Yannick Jeandroz
> TESOA2 - Flight Dynamics
> T    +33 (0)5 62 19 51 71
> E    yannick.jeandroz@airbus.com
>
> www.airbusdefenceandspace.com<http://www.airbusdefenceandspace.com/>
>
>
>
>
> ***************************************************************
> Ce courriel (incluant ses eventuelles pieces jointes) peut contenir  
> des informations confidentielles et/ou protegees ou dont la  
> diffusion est restreinte. Si vous avez recu ce courriel par erreur,  
> vous ne devez ni le copier, ni l'utiliser, ni en divulguer le  
> contenu a quiconque. Merci d'en avertir immediatement l'expediteur  
> et d'effacer ce courriel de votre systeme. Airbus Defence and Space  
> et les sociétés Airbus Group declinent toute responsabilite en cas  
> de corruption par virus, d'alteration ou de falsification de ce  
> courriel lors de sa transmission par voie electronique.
> This email (including any attachments) may contain confidential  
> and/or privileged information or information otherwise protected  
> from disclosure. If you are not the intended recipient, please  
> notify the sender immediately, do not copy this message or any  
> attachments and do not use it for any purpose or disclose its  
> content to any person, but delete this message and any attachments  
> from your system. Airbus Defence and Space and Airbus Group  
> companies disclaim any and all liability if this email transmission  
> was virus corrupted, altered or falsified.
> ---------------------------------------------------------------------
> Airbus Defence and Space SAS (393 341 516 RCS Toulouse) - Capital:  
> 29.821.072 EUR - Siege social: 31 rue des Cosmonautes, ZI du Palays,  
> 31402 Toulouse cedex 4, France






