01/24/2006 11:51 AM
post4806
LCG Aggregate accounting (via David Kent)
This short document outlines our requirements and experiences using the GGF
aggregate usage record schema for accounting usage on the LCG Grid Computing
project.
Who We are, and What We Do
In the LHC Computing Grid (LCG), we are concerned with obtaining a global view
of job usage for each Virtual Organisation, each participating country and each
computing site providing resources to the project. The LCG project receives
usage data from computing sites participating in different grid projects (EGEE,
OSG, SweGrid), each using its own internal accounting sensors to collect
accounting data, which must be consolidated at a suitable level.
Within each grid, the total number of individual job records is quite high: in
the period 2004-2005, about 70% of EGEE sites (approx. 120 compute sites)
generated approximately 6 million job records.
LCG Aggregate Accounting
LCG Accounting reports are provided through a graphical interface. Queries must
be sufficiently fast: it is a requirement to minimise database traversal time
and avoid repetitive summations of the data. The questions we are asking are
quite high level: we are concerned with collections of jobs rather than
individual jobs.
• How much CPU time did a particular VO consume in each quarter of 2005?
• If the ATLAS VO consumed 10,000 CPU hours in Jan 2005, who are the users
  that submitted the work?
• ... and which computing facilities (Sites) provided the resources to do this
  work?
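
As a rough illustration of why aggregate records suit these questions, the sketch below collapses per-job records into one summary row per (VO, site, user, year, month) and then answers the three questions from the summary alone. The record layout, the function names, the user names and the site "SITE-B" are assumptions made for illustration; they are not taken from the LCG accounting database.

```python
from collections import defaultdict
from datetime import date

# Hypothetical per-job records: (vo, site, user, end_date, cpu_seconds).
jobs = [
    ("atlas", "RAL-LCG2", "user_a", date(2005, 1, 14), 7200),
    ("atlas", "RAL-LCG2", "user_b", date(2005, 1, 20), 3600),
    ("atlas", "SITE-B", "user_a", date(2005, 5, 2), 1800),
    ("dteam", "RAL-LCG2", "user_c", date(2005, 12, 9), 5400),
]


def summarise(job_records):
    """Collapse job records into one row per (vo, site, user, year, month)."""
    summary = defaultdict(lambda: {"njobs": 0, "cpu_seconds": 0})
    for vo, site, user, end, cpu_seconds in job_records:
        row = summary[(vo, site, user, end.year, end.month)]
        row["njobs"] += 1
        row["cpu_seconds"] += cpu_seconds
    return dict(summary)


def cpu_hours_per_quarter(summary, vo, year):
    """Total CPU hours consumed by one VO in each quarter of a year."""
    totals = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
    for (row_vo, _site, _user, row_year, month), row in summary.items():
        if row_vo == vo and row_year == year:
            quarter = (month - 1) // 3 + 1
            totals[quarter] += row["cpu_seconds"] / 3600
    return totals


def breakdown(summary, vo, year, month, by):
    """CPU hours for one VO and month, split by 'user' or by 'site'."""
    index = {"site": 1, "user": 2}[by]  # position of the field in the key
    result = defaultdict(float)
    for key, row in summary.items():
        if key[0] == vo and key[3] == year and key[4] == month:
            result[key[index]] += row["cpu_seconds"] / 3600
    return dict(result)


summary = summarise(jobs)
print(cpu_hours_per_quarter(summary, "atlas", 2005))
print(breakdown(summary, "atlas", 2005, 1, by="user"))
print(breakdown(summary, "atlas", 2005, 1, by="site"))
```

The summation over individual jobs happens once, in summarise; each later query touches only the much smaller set of summary rows.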
We have been developing a graphical reporting interface based on aggregate
accounting here:-
http://goc.grid-support.ac.uk/gridsite/accounting/tree/treeview.php
Requirements for LCG Aggregate Accounting
Provide a high level view of usage across the grid at the VO and User level
1) Total Usage consumed by each VO in the LCG project
2) Share of Total Usage per VO for each Grid Project (EGEE, OSG, SweGrid)
3) 1) and 2) above per Grid User
4) Provide fast reporting to clients by minimising the database traversal time
   and avoiding repetitive summations of the data.
   a. Avoid converting time units: aggregated records describing the total
      usage over an extended interval of time (e.g. month or quarter) are
      better described in units of Hours, or K.Hours, than in seconds.
   b. Normalise CPU data from different computing sites to a reference value
      to allow usage comparisons between sites.
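
Requirements 4a and 4b can be sketched as follows. The linear scaling by a per-site benchmark rating, the reference value of 1000 and the function names are assumptions made for illustration; the text above does not specify how the normalisation is carried out.

```python
# Assumed reference benchmark rating, in the same (arbitrary) units as the
# per-site rating. The value 1000 is illustrative only.
REFERENCE_POWER = 1000.0


def normalised_cpu_hours(cpu_seconds, site_power, reference_power=REFERENCE_POWER):
    """Convert raw CPU seconds at one site into reference-equivalent hours.

    Assumes usage scales linearly with the site's benchmark rating.
    """
    raw_hours = cpu_seconds / 3600.0
    return raw_hours * (site_power / reference_power)


def to_kilo_hours(hours):
    """Express a large aggregate in K.Hours (thousands of hours)."""
    return hours / 1000.0


# Two hours of CPU on a machine rated 1.5x the reference counts as three
# reference hours.
print(normalised_cpu_hours(7200, site_power=1500.0))
print(to_kilo_hours(10000.0))
```

Storing the aggregate already in normalised hours means the reporting layer does not have to convert units or rescale on every query.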
Comments regarding draft March 2005
General Comments:
1) Section 3: We did not find a suitable quantity to describe information in an
   aggregated usage record such as "Number of Jobs".
2) Section 3.14: We find MachineName unsuitable to describe the site on which
   the job ran. We believe that "ExecutingSite" or "siteName" is more
   appropriate, especially in a Grid computing environment where resources are
   distributed.
3) Section 3.18: ProjectName fits well in Grid projects like LHC and EGEE, but
   not in terms of virtual organisations, which form a natural grouping. We
   recommend an additional field called "VirtualOrganisation" or "VO".
4) Section 10.8: CpuDuration type xsd:duration is not consistent with type
   xsd:positiveInteger listed in Appendix B.
5) Section 13: Please provide an additional example for an aggregate record
   (see below).
6) Appendix D: GlobalUsername described in section 3.6 does not appear in
   Appendix D.
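
The type mismatch in comment 4 matters in practice because a consumer has to know which representation to parse. The sketch below converts a restricted subset of xsd:duration (days, hours, minutes and seconds only) into a number of seconds; years and months are rejected because they have no fixed length, and negative durations are not handled. The function name and the choice of subset are assumptions for illustration and are not taken from the usage record draft.

```python
import re

# Matches durations such as "PT2H30M" or "P1DT1S" (no years or months).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def duration_to_seconds(text):
    """Convert a day/time-only xsd:duration string to seconds (float)."""
    match = _DURATION.match(text)
    # Reject non-matching input and degenerate forms such as "P", "PT", "P1DT".
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError("unsupported or malformed duration: %r" % text)
    parts = {name: float(value) if value else 0.0
             for name, value in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])


print(duration_to_seconds("PT2H30M"))
print(duration_to_seconds("P1DT1S"))
```

If the schema instead declared an integer type, the same field would carry a plain number such as 9000 and no parsing step would be needed, which is why the two sections of the draft need to agree.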
Recommendation for additional example: Section 13.3
For an aggregate record describing the total work done at the "RAL-LCG2"
computing site on the "EGEE" project for the user "Dave Kant" in the "dteam" VO
in December 2005:-
• Can the editor confirm that the usage record would look as follows?
• Can such an example be provided in the document?
<?xml version="1.0" encoding="UTF-8"?>
<JobUsageRecord xmlns="http://www.gridforum.org/2003/ur-wg"
xmlns:urwg="http://www.gridforum.org/2003/ur-wg"...