This is a static archive of the previous Open Grid Forum GridForge content management system saved from host forge.ogf.org file /sf/discussion/do/listPosts/projects.ggf-editor/discussion.rec_usage_record_format_recs.nordugrid_comments_on_the_ggf_ur at Thu, 03 Nov 2022 23:16:16 GMT SourceForge : Post

Forum Topic - NorduGrid comments on the GGF UR draft: (2 Items)
NorduGrid comments on the GGF UR draft
NorduGrid comments on the GGF Usage Record draft document
(see [1] for contacts).


Background
==========

The NorduGrid Collaboration [1] develops & maintains the Advanced
Resource Connector (ARC) Grid middleware. The middleware is being used
in production in numerous Grids. The ARC middleware is equipped with a
logging service: the "logger" collects and stores job records for
logging and accounting purposes.  In the logger service each job is
represented by a record, the so-called "nordugrid usage record". The
NorduGrid usage record schema was designed to be used with the ARC
middleware for storing usage & accounting information of jobs that are
run on sites that have installed the ARC middleware.  The NorduGrid UR
is currently being revised with the coming new release of the ARC
middleware. Please consult [2] for more information about the ARC
logger.


General observations
====================

- The document represents a long-awaited attempt at standardization of
resource usage records. We look forward to implementing the format that
will be agreed upon by the GGF participants. Still, much work has to be
done in order to deliver an unambiguous specification that suits
everybody and accommodates the most important aspects.

- The document does not give a clear definition of the "usage record",
which leads to rather confusing statements such as "Record identity
uniquely defines a record in the usage record" (page 6, definition of
RecordIdentity).

Furthermore, when talking about the logging & accounting information
associated with grid jobs it is crucial to give some kind of definition
of a grid job. Many of the UR properties described in the document are
ambiguous because it is not clear what constitutes a grid job.  An
example is StartTime. Is the brokering or stage-in phase of a grid job
taken into account?


- The document states that "its main purpose is to outline the basic
building blocks of the accounting record"

NorduGrid agrees with this scope: we consider the UR a logging and
accounting record of a grid job rather than simply a record of
resource consumption. For us the UR should contain information not
just about resource consumption but also about job identity,
ownership, status, etc. Maybe it would make sense to rename the UR to
"logging record" or "accounting record"?


- Many of the attributes are inadequately defined or their meaning is
deliberately left open. This partly defeats the purpose of the schema:
it was designed specifically as a format for exchanging usage data
over grids, but if the attributes are interpreted differently at
different sites/grids, comparing exchanged data becomes complicated.
For example, the definitions of the Charge, Status, StartTime and
EndTime base properties state that "The meaning of this charge will be
site dependent", "the semantic meaning of the status is site
dependent" or "the value of this property may depend on the queuing
system", etc.


- The GGF UR proposal is still too 'site' or batch system specific; in
many places the document assumes data exchange among computing centers
(sites) and not among Grids. It seems that the grid layer is not really
taken into account: in many places the UR resembles a data exchange
format between batch systems (or sites) and not Grids. For example,
the Current Practices Survey (Appendix A) covers only supercomputing
centers and not a single Grid.  The UR document should make it clear
it is a Grid UR proposal, enabling grid job logging & accounting data
exchange between different Grids.


- The document states: "The document does not attempt to dictate the
format in which the accounting records are stored at a local site,
instead it meant to be a common exchange format"

NorduGrid fully supports this approach (we assume "Grid" everywhere we
read "site"); we consider the UR an exchange format between
Grids. This implies that the GGF UR should be powerful enough to
"accommodate" other Grid/local Usage Records. Later in these comments,
a mapping of the...
Re: NorduGrid comments on the GGF UR draft
{context elided...}
 
> General observations
> ====================
> 
> - The document represents a long-awaited attempt at standardization of
> resource usage records. We look forward to implementing the format that
> will be agreed upon by the GGF participants. Still, much work has to be
> done in order to deliver an unambiguous specification that suits
> everybody and accommodates the most important aspects.

Thank you for your input to the document and the process.
 
> - The document does not give a clear definition of the "usage record",
> which leads to rather confusing statements such as "Record identity
> uniquely defines a record in the usage record" (page 6, definition of
> RecordIdentity).

Additional context has been added at the beginning of the document.

> Furthermore, when talking about the logging & accounting information
> associated with grid jobs it is crucial to give some kind of definition
> of a grid job. Many of the UR properties described in the document are
> ambiguous because it is not clear what constitutes a grid job.  An
> example is StartTime. Is the brokering or stage-in phase of a grid job
> taken into account?

The document describes an atomic record of resource consumption. As such, discussions of "grid jobs", "aggregation", etc.
are outside the scope of this document. They may be addressed in V2 if the community feels this is a critical need.

> - The document states that "its main purpose is to outline the basic
> building blocks of the accounting record"
> 
> NorduGrid agrees with this scope: we consider the UR a logging and
> accounting record of a grid job rather than simply a record of
> resource consumption. For us the UR should contain information not
> just about resource consumption but also about job identity,
> ownership, status, etc. Maybe it would make sense to rename the UR to
> "logging record" or "accounting record"?

It is difficult to reach consensus on definitions of "job", "log", and "accounting". For this reason, this recommendation
attempts to identify the smallest unit of resource consumption. Any expansion of this definition is out of
scope at this level.

> - Many of the attributes are inadequately defined or their meaning is
> deliberately left open. This partly defeats the purpose of the schema:
> it was designed specifically as a format for exchanging usage data
> over grids, but if the attributes are interpreted differently at
> different sites/grids, comparing exchanged data becomes complicated.
> For example, the definitions of the Charge, Status, StartTime and
> EndTime base properties state that "The meaning of this charge will be
> site dependent", "the semantic meaning of the status is site
> dependent" or "the value of this property may depend on the queuing
> system", etc.

The format is meant to facilitate information exchange within a single grid instantiation, which will include 
heterogeneous resources. The first step in a unified solution is to unify the atomic data. That is what this 
recommendation addresses. Since each grid instantiation will be driven by the policies and practices agreed upon by the 
partners in that grid, the recommendation is left sufficiently open to allow common meanings for that instantiation to 
be agreed upon.

Meta-grids, grid interoperability, and grids-of-grids are beyond the scope of this version of the recommendation.

> - The GGF UR proposal is still too 'site' or batch system specific; in
> many places the document assumes data exchange among computing centers
> (sites) and not among Grids. It seems that the grid layer is not really
> taken into account: in many places the UR resembles a data exchange
> format between batch systems (or sites) and not Grids. For example,
> the Current Practices Survey (Appendix A) covers only supercomputing
> centers...