01/10/2006 6:20 AM
post4812
LCG experiences and comments: GGF-UR schema
We have prepared a Word document at the following URL:
http://goc.grid-support.ac.uk/gridsite/accounting/GGF-Comments.doc
In the LHC Computing Grid (LCG), we need a global view of job usage for each Virtual Organisation (VO), each
participating country, and each computing site providing resources to the project. The LCG project receives usage data
from computing sites participating in different grid projects (EGEE, OSG, SweGrid). Each project uses its own internal
accounting sensors to collect accounting data, which must then be consolidated at a suitable level.
Within each grid, the total number of individual job records is quite high: in the period 2004-2005, about 70% of EGEE
sites (approx. 120 compute sites) generated approximately 6 million job records.
LCG accounting reports are provided through a graphical interface. Queries must be sufficiently fast, so it is a
requirement to minimise database traversal time and avoid repeated summation of the data. The questions we ask are
quite high level and concern collections of jobs rather than individual jobs:
[*] Total Usage consumed by each VO in the LCG project
[*] Share of Total Usage per VO for each Grid Project (EGEE, OSG, SweGrid)
[*] How much CPU time did a particular VO consume in each quarter of 2005?
[*] If the ATLAS VO consumed 10,000 CPU hours in Jan 2005, who are the users that submitted the work?
[*] Which computing facilities (Sites) provided the resources to do this work?
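The motivation for aggregate records can be sketched in a few lines of Python. This is not how the LCG accounting database is implemented; the record fields (vo, site, end, cpu_seconds), the sample values and both function names are assumptions made for illustration. The sketch rolls individual job records up into one summary row per VO, site and month, then answers a per-quarter question from those rows alone, without revisiting the individual jobs.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical per-job records; field names and values are illustrative only.
jobs = [
    {"vo": "atlas", "site": "RAL-LCG2", "end": "2005-01-14T10:00:00", "cpu_seconds": 7200},
    {"vo": "atlas", "site": "RAL-LCG2", "end": "2005-01-20T08:30:00", "cpu_seconds": 3600},
    {"vo": "dteam", "site": "RAL-LCG2", "end": "2005-02-02T12:00:00", "cpu_seconds": 1800},
]


def summarise(records):
    """Roll individual job records up to one row per (VO, site, year, month)."""
    summary = defaultdict(lambda: {"njobs": 0, "cpu_seconds": 0})
    for rec in records:
        end = datetime.fromisoformat(rec["end"])
        key = (rec["vo"], rec["site"], end.year, end.month)
        summary[key]["njobs"] += 1
        summary[key]["cpu_seconds"] += rec["cpu_seconds"]
    return dict(summary)


def cpu_hours_per_quarter(summary, vo, year):
    """Answer a per-quarter question using only the summary rows."""
    quarters = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
    for (row_vo, _site, row_year, month), row in summary.items():
        if row_vo == vo and row_year == year:
            quarters[(month - 1) // 3 + 1] += row["cpu_seconds"] / 3600
    return quarters


summary = summarise(jobs)
print(cpu_hours_per_quarter(summary, "atlas", 2005))
```

Each summary row carries a job count alongside the summed quantities, which is the kind of information an aggregate usage record would need to express.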
We have been developing a graphical reporting interface based on aggregate accounting, available here:
http://goc.grid-support.ac.uk/gridsite/accounting/tree/treeview.php
Comments regarding draft March 2005
=========================
General Comments:
1) Section 3: We did not find a suitable element for describing aggregate information in a usage record, such as
"Number of Jobs".
2) Section 3.14: We find MachineName unsuitable to describe the site on which the job ran. We believe that
"ExecutingSite" or "siteName" is more appropriate, especially in a Grid computing environment where resources are
distributed.
3) Section 3.18: ProjectName fits Grid projects like LHC and EGEE well, but it does not capture virtual organisations,
which form a natural grouping. We recommend an additional field called "VirtualOrganisation" or "VO".
4) Section 10.8: The CpuDuration type xsd:duration is not consistent with the type xsd:positiveInteger listed in Appendix B.
5) Section 13: Please provide an additional example for an aggregate record (see below)
6) Appendix D: GlobalUsername described in section 3.6 does not appear in Appendix D
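To make the type mismatch in comment 4 concrete, here is a minimal Python sketch showing the same quantity in both forms: as an xsd:duration string and as an integer number of seconds. The function name duration_to_seconds is hypothetical, and the parser is deliberately restricted to whole days, hours, minutes and seconds. It rejects years, months, negative values and fractional seconds, all of which full xsd:duration permits.

```python
import re

# Restricted xsd:duration pattern: PnDTnHnMnS with whole numbers only.
# Years and months are excluded because their length in seconds is ambiguous.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_seconds(text):
    """Convert a restricted xsd:duration string to an integer number of seconds."""
    match = _DURATION.match(text)
    # "P" alone, or a trailing "T" with no time fields, is not a valid duration.
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(value) if value else 0
             for name, value in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])


print(duration_to_seconds("PT3405H"))  # 3405 hours expressed in seconds
```

A consumer that expects an integer cannot read "PT3405H" without a conversion step like this one, so the specification needs to state a single type.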
Recommendation for additional example: Section 13.3
====================================
For an aggregate record describing the total work done at the "RAL-LCG2" computing site on the "EGEE" project for the
user "Dave Kant" in the "dteam" VO in December 2005:
[*] Can the editor confirm that the usage record would look as follows?
[*] Can such an example be provided in the document?
<?xml version="1.0" encoding="UTF-8"?>
<JobUsageRecord xmlns="http://www.gridforum.org/2003/ur-wg"
xmlns:urwg="http://www.gridforum.org/2003/ur-wg"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.gridforum.org/2003/ur-wg file:/Users/bekah/Documents/GGF/URWG/urwgf-schema.09.xsd">
<RecordIdentity urwg:recordID="" urwg:createTime="2005-12-01T02:15:45Z" />
<aggregate>
<GlobalUsername>/C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=davekant</GlobalUsername>
<VirtualOrganisation>dteam</VirtualOrganisation>
<ProjectName>EGEE</ProjectName>
<ExecutingSite>RAL-LCG2</ExecutingSite>
<Charge urwg:description="SpecInt2K">800</Charge>
<NJobs>17423</NJobs>
<WallDuration>PT3405H</WallDuration>
<CpuDuration>PT3353H</CpuDuration>
<StartTime>2005-12-01T02:15:45Z</StartTime>
...
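A record of this shape could also be generated programmatically, which guarantees matching open and close tags. The sketch below uses Python's standard xml.etree.ElementTree and is only an illustration. The helper names are hypothetical, the aggregate, VirtualOrganisation, ExecutingSite and NJobs elements are the ones proposed in this post rather than part of the draft schema, and nesting the fields inside the aggregate element is an assumption. Only some of the fields shown in the example are included.

```python
import xml.etree.ElementTree as ET

UR_NS = "http://www.gridforum.org/2003/ur-wg"
ET.register_namespace("urwg", UR_NS)


def q(name):
    """Return a namespace-qualified name in ElementTree's {uri}local form."""
    return f"{{{UR_NS}}}{name}"


def build_aggregate_record(fields, create_time):
    """Build a usage record whose fields sit inside a proposed <aggregate> element."""
    root = ET.Element(q("JobUsageRecord"))
    ET.SubElement(root, q("RecordIdentity"),
                  {q("recordID"): "", q("createTime"): create_time})
    aggregate = ET.SubElement(root, q("aggregate"))  # proposed element, not in the draft
    for name, value in fields:
        ET.SubElement(aggregate, q(name)).text = str(value)
    return root


record = build_aggregate_record(
    [
        ("VirtualOrganisation", "dteam"),
        ("ProjectName", "EGEE"),
        ("ExecutingSite", "RAL-LCG2"),
        ("NJobs", 17423),
        ("WallDuration", "PT3405H"),
    ],
    create_time="2005-12-01T02:15:45Z",
)
print(ET.tostring(record, encoding="unicode"))
```

The serialised output uses the urwg prefix on every element instead of a default namespace. The two forms are equivalent to a namespace-aware parser.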