SourceForge : View Wiki Page: InitialSketchOnEntities

wiki1726: InitialSketchOnEntities

Initial Sketch on Entities Involved in GLUE (OBSOLETE)

The draft available in the Documents area (https://forge.gridforum.org/sf/docman/do/listDocuments/projects.glue-wg/docman.root.drafts) obsoletes this Wiki page.

Authors:

Sergio Andreozzi (INFN)
Balazs Kònya (Lund University)

Introduction

In this document, we present a conceptual information model described in natural language and enriched with a graphical representation based on UML class diagrams. It proposes an initial collection of entities that are involved in a Grid environment and that are meaningful to model in GLUE. The current draft tries to accommodate the concepts of the GLUE Schema 1.3 and the NorduGrid schema (other schemas will be considered in the next phase). The main focus is the Computing Element, whose design also considers the related use cases. Since this is a conceptual model, it is meant to be implementation-independent. Once a commonly agreed version is available, we will define mappings to concrete data models (relational, LDAP, XML Schema and RDF). Semantically, a concrete data model should represent the same concepts and relationships as the conceptual information model; nevertheless, it may contain simplifications specific to the target data model in order to improve query performance or other aspects.

Main Entities

Site: administrative domain grouping resources and services managed by the same set of persons
- NorduGrid does not have a site concept; there are administrative attributes in the resources;
- fine to add Site concept and move attributes there
- properties: URI (ex-UniqueID), Name, Description, Contact (Email, Web page, ...), Location (Latitude, Longitude, ...?), Sponsor, OtherInfo

Service:
- what is the difference between service and resource?
- service and endpoint: in our modeling, does a service have only one endpoint, or can it have multiple endpoints?
  - if many endpoints, then we need per-endpoint information
  - in Web Services world, a service instance has a single endpoint;
  - this needs to be discussed
- the service should have extension hooks such as (key, value) pairs or a tag set
- for some services, a service-specific schema will be needed
  - the list of service types requiring a service-specific schema should be identified from the use cases
- properties: URI (ex-UniqueID), Name, Type, Version, Endpoint, Status, StatusInfo, WSDL (?), Semantics, StartTime, Owner (?), AccessControlBaseRule (AuthorizationInfo), ServiceData=set of (Key,Value), Authentication (nordugrid-cluster-issuerca-hash, nordugrid-cluster-trustedca, nordugrid-cluster-issuerca)
- tentative types:
  - Computing: specialization of the service with additional information
  - Storage Management
  - File Transfer
  - ...

Resource:
- tentative types:
  - Computing: specialization of the resource with additional information
  - Storage (+ files)
  - Network
  - Data
  - Instrument
  - ...

Element
- grouping concept for services and the related resources; it is introduced in order to capture the relationship among services managing or abstracting the access to resources
- a service can exist without a resource; e.g., a GridFTP server
- the same resource can be shared across multiple services
  - e.g., SRMv1 and SRMv2.2 services on top of the same resource
  - e.g., CREAM and GRAM interface to the same batch system
- can we have multiple resources served by the same service instance?
- what is our definition of a resource?
- is policy information part of services or of resources?
- investigate other definitions like WSRF definitions
- tentative types:
  - Computing (see below for more details)
  - Storage
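
The relationships listed under Element can be sketched as plain data classes. This is only an illustration of the grouping idea, not a proposed schema: the class layout, the single-endpoint choice (still an open question above) and all URIs and host names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    uri: str
    type: str  # e.g. "Computing", "Storage"


@dataclass
class Service:
    uri: str
    name: str
    type: str
    endpoint: str  # one endpoint per service is assumed here


@dataclass
class Element:
    """Groups services with the resources they manage or abstract."""
    uri: str
    services: List[Service] = field(default_factory=list)
    resources: List[Resource] = field(default_factory=list)


# The same resource shared across services: CREAM and GRAM
# interfaces on top of one batch system (hypothetical identifiers).
batch = Resource("urn:example:resource:batch1", "Computing")
cream = Service("urn:example:service:cream", "cream", "Computing",
                "https://ce.example.org:8443/cream")
gram = Service("urn:example:service:gram", "gram", "Computing",
               "https://ce.example.org:2119/gram")
ce = Element("urn:example:element:ce1",
             services=[cream, gram], resources=[batch])

# A service without a resource, e.g. a GridFTP server.
ftp = Element(
    "urn:example:element:ftp1",
    services=[Service("urn:example:service:gridftp", "gridftp",
                      "File Transfer", "gsiftp://ftp.example.org:2811")],
)
```

Keeping the resource list on the Element, rather than on each Service, is one way to express that several services front the same resource without duplicating it.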

Grid Job: the JSDL is a job request document; the job information is a snapshot of the information related to a job as soon as it is created in a Grid (e.g., when it is submitted to the Broker)
- strongly needed by NorduGrid
- need to model different types of jobs (e.g., single processor job, collection of jobs, workflows); a job type can be useful
- information about jobs comes from different services (e.g., broker, accounting system, computing element)
- some of the job attributes should possibly map to the OGF Usage Record specification
- the state model is important

Virtual Organization
- from Travica: "a new organizational form which manifests itself as a temporary or permanent collection of geographically dispersed individuals, groups or organizational units, either belonging or not belonging to the same organization, or entire organizations that depend on electronic links in order to complete the production process"
- from NorduGrid
  - ARC VO is defined as a structured group of individuals with “externally” provisioned resources which as a whole is a subject of VO-based resource allocation, accounting, authorization and resource discovery. Resources utilized by a VO are expected to be provisioned via SLA’s.

Authorization ???
- do we need an entity for expressing authorization policies to be associated to services and/or resources?

Conceptual Model of the Computing Element

Computing Element: grouping concept for computing services and the related computing resource; a Computing Element is managed by a single Local Resource Management System (LRMS), which can be a batch system or another type of system. The OS can be the simplest case of an LRMS. The Computing Element may contain aggregated status information.
- Properties: URI, Name, LRMS (Type, Version, Note) (under the hypothesis that a CE maps to a set of resources managed by the same LRMS), nordugrid-cluster-support?, nordugrid-cluster-localse (similar to Glue.CESEBind.SEUniqueID), TotalJobs, RunningJobs, WaitingJobs, nordugrid-cluster-prelrmsqueued, nordugrid-cluster-owner, OtherInfo (e.g., to advertise the URL of a web page giving information about this element)

Computing Resource
- grouping concept for a set of different types of execution environments; used to hold aggregated information
- Properties: URI/LocalID (?), Total, Used, Physical/LogicalCPUs, Shared directories (e.g., TmpDir, ScratchDir, DataDir), Homogeneity, NetworkInfo (type of internal network available among the execution environments), Aggregated/Global Benchmark Info (?), CPUDistribution (number of boxes:number of CPUs, it can be repeated, e.g.: 1:16 3:2), nordugrid-cluster-sessiondir-free, nordugrid-cluster-sessiondir-total, nordugrid-cluster-cache-free, nordugrid-cluster-cache-total
  - Total and Used refer to the number of execution environments
  - What to do with logical vs. physical CPUs? (e.g., logical comes from hyperthreading)
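
As an illustration of the CPUDistribution notation, the sketch below parses a value such as "1:16 3:2" into (boxes, CPUs per box) pairs and derives a CPU count. It assumes that the pairs are separated by whitespace and that the second number is per box; the function names are hypothetical.

```python
from typing import List, Tuple


def parse_cpu_distribution(value: str) -> List[Tuple[int, int]]:
    """Parse 'boxes:cpus' pairs, e.g. '1:16 3:2'."""
    pairs = []
    for token in value.split():
        boxes, cpus = token.split(":")
        pairs.append((int(boxes), int(cpus)))
    return pairs


def total_cpus(value: str) -> int:
    """Sum boxes * CPUs-per-box over all pairs."""
    return sum(boxes * cpus for boxes, cpus in parse_cpu_distribution(value))


# One 16-CPU box plus three 2-CPU boxes.
distribution = parse_cpu_distribution("1:16 3:2")
cpu_count = total_cpus("1:16 3:2")
```

Whether the resulting count refers to physical or logical CPUs is exactly the open question noted above.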

Execution Environment: a description of hardware and software characteristics that defines the environment available to and requestable by a Grid job when submitted to a Computing Service
- Properties: URI/LocalID (?), Type (e.g., virtualnode, realnode, smp, thread, multicore), LogicalUnitNumber (?), Total, Used, Unavailable, Physical/LogicalCPUs, CPU (Vendor, Model, Version, ClockSpeed, InstructionSet, OtherDescription), Memory (RAMSize, VirtualSize), OS (Name, Release, Version), Node benchmark (e.g., SPECInt2000, SPECfp2000), NetworkConnectivity (Inbound, Outbound), PlatformType (e.g., IA64)
- the software part is described by the Application Environment entity (besides the OS)
- a computing resource is a collection of execution environments
- the execution environments can be of different types
  - a typical implementation of an execution environment is a computing node
  - a virtual machine image that can be requested by a job also represents a possible execution environment
    - different virtual machine images/execution environments can coexist on the same node
- issues:
  - how to deal with software packages that will be available only to certain VOs?
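
The statement that a computing resource is a collection of execution environments of possibly different types can be sketched as follows. The aggregation rule (summing Total and Used over the environments) is an assumption made for illustration, and all identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExecutionEnvironment:
    local_id: str
    type: str          # e.g. "realnode", "virtualnode"
    total: int         # number of instances of this environment
    used: int
    unavailable: int = 0


@dataclass
class ComputingResource:
    environments: List[ExecutionEnvironment]

    @property
    def total(self) -> int:
        # Assumed aggregation: sum over all execution environments.
        return sum(e.total for e in self.environments)

    @property
    def used(self) -> int:
        return sum(e.used for e in self.environments)


resource = ComputingResource([
    ExecutionEnvironment("ee-node", "realnode", total=10, used=4),
    ExecutionEnvironment("ee-vm", "virtualnode", total=6, used=6,
                         unavailable=1),
])
```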

Application Environment
- description of the application software environment available within one or more execution environments
- it can be in relationship with an execution environment entity and/or with the computing resource entity
- it should also be used for application environments described in terms of a simple tag (like the attribute RunTimeEnvironment in GLUE 1.3)
- properties: ID (URI, local, global, relative?), Catalogue/NameServer, Name, Version, Status (e.g.: tested, dynamic, installable), Lifetime, Bootstrapping (InstalledRoot, EnvironmentSetup, ModuleName), SoftwareData: (key,value)
- what if they are available only to a certain VO or set of VOs?

Computing Service: a specialization of Service with the addition of a relationship to a computing element and a relationship to the shares
- properties: Implementation (Name, Version), staging capabilities?
- issues:
  - does it apply to all Service entities?

Share: a utilization target defined by a set of policies, by status information and by an association to resources
- a typical implementation of a share is a batch queue with the associated policies and status information
- the same share can be implemented using different batch system configurations or strategies
- in complex batch systems, it is possible to define different sets of policies for the same batch queue; this implies one share for each set of policies
- a share can be implemented by virtual machine management systems (to be extended)
- the model supports heterogeneity by being able to represent different execution environments associated with the same share
- properties: LocalID, Name, Max(WallTime, CPUTime, TotalJobs?, RunningJobs, WaitingJobs, nordugrid-cluster-prelrmsqueued, Memory, DiskSpace, SlotsPerJob, StageInStreams, StageOutStreams), Min (CPUTime, WallTime), Default (WallTime, CPUTime), preemption, priority?, authorization?, state (runningJobs, waitingJobs, totalJobs?, estimatedResponseTime, WorstResponseTime, freeJobSlots, Status?), meta-information (e.g., human-readable description, scheduling policy, comment), directory? (Data?, Application?), DefaultSE?, nordugrid-cluster-sessiondir-lifetime
- issues:
  - authorization vs. share
  - does this fully cover the concept of VOView of GLUE 1.3?
  - do we need priorities in a share? what is the use case for having them?
  - how to reflect policies on heterogeneous execution environments?
    - e.g., if a share is associated with both AMD-based and Intel-based execution environments, do we want to be able to represent policies per type of execution environment? is the trivial solution of splitting the share into two shares enough?
  - how to reflect authorization policies in the share? they should support not only VOs but also privilege attributes such as those defined by VOMS
  - local vs. Grid job state information in shares (e.g., in the NorduGrid schema: nordugrid-queue-gridrunning, nordugrid-queue-gridqueued, nordugrid-queue-localqueued, nordugrid-queue-running)
  - is a status such as production/draining/queuing/closed defined per share or per service?
  - consider that waiting jobs can be either jobs waiting in the batch queue or jobs waiting in the front-end Grid layer (e.g., during file staging); decide whether the waitingJobs attribute refers to both or whether we should add one more category (e.g., nordugrid-queue-prelrmsqueued)
    - this consideration also applies to MaxWaiting
  - in GLUE we have assignedJobSlots and maxRunningJobs; we believe that modeling only the max running is enough
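
Two of the points above lend themselves to a small sketch: one share per policy set on the same batch queue, and the split between jobs waiting in the batch queue and jobs waiting in the front-end Grid layer. The field names, the use of seconds for wall time, and the choice to report waiting jobs as the sum of both categories are assumptions for illustration; the issues listed above remain open.

```python
from dataclasses import dataclass


@dataclass
class Share:
    local_id: str
    queue: str                      # batch queue implementing the share
    max_wall_time: int              # seconds (assumed unit)
    max_running_jobs: int
    running_jobs: int = 0
    lrms_waiting_jobs: int = 0      # waiting in the batch queue
    pre_lrms_waiting_jobs: int = 0  # waiting in the Grid layer, e.g. staging

    @property
    def waiting_jobs(self) -> int:
        # One possible answer to the open issue: count both categories.
        return self.lrms_waiting_jobs + self.pre_lrms_waiting_jobs


# Two policy sets defined on the same queue give two shares.
shares = [
    Share("long_vo_a", queue="long", max_wall_time=72 * 3600,
          max_running_jobs=100, running_jobs=40,
          lrms_waiting_jobs=4, pre_lrms_waiting_jobs=2),
    Share("long_vo_b", queue="long", max_wall_time=48 * 3600,
          max_running_jobs=50),
]
```

Publishing the two waiting counters separately keeps the information needed by consumers that care only about the batch queue, while the derived total serves those that do not.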

Job: the JSDL is a job request document; the job information is a snapshot of the information related to a job as soon as it is created in a Grid (e.g., when it is submitted to the Broker)
- at the moment, this entity refers to only a single job
- some of the job attributes should possibly map to the OGF Usage Record specification
- the state model is important
- properties: ID (globalID, localID, globalOwner, localOwner, nordugrid-job-jobname), Requests/Environment (nordugrid-job-reqwalltime, nordugrid-job-reqcputime, nordugrid-job-runtimeenvironment, nordugrid-job-cpucount, nordugrid-job-stdout, nordugrid-job-stderr, nordugrid-job-stdin), Status (nordugrid-job-comment, nordugrid-job-status, nordugrid-job-queuerank, nordugrid-job-rerunable, nordugrid-job-exitcode, nordugrid-job-errors, nordugrid-job-executionnodes, nordugrid-job-execcluster, nordugrid-job-execqueue, nordugrid-job-usedwalltime, nordugrid-job-usedmem, nordugrid-job-usedcputime, nordugrid-job-completiontime, nordugrid-job-submissionui, nordugrid-job-submissiontime, nordugrid-job-clientsoftware, nordugrid-job-gmlog, nordugrid-job-sessiondirerasetime, nordugrid-job-proxyexpirationtime)
- issues:
  - it would be useful to have timestamp-like information in the form of (tag, timestamp) pairs, to be used, for instance, for status change info
  - we should sync these attributes with JSDL attributes and Usage Record attributes
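
The (tag, timestamp) idea for status changes could look like the sketch below. The state names and the job identifier are hypothetical, since the state model itself is still to be defined.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class Job:
    global_id: str
    # Ordered list of (tag, timestamp) pairs, one per status change.
    history: List[Tuple[str, datetime]] = field(default_factory=list)

    def record(self, tag: str, when: Optional[datetime] = None) -> None:
        """Append a status change; defaults to the current UTC time."""
        self.history.append((tag, when or datetime.now(timezone.utc)))

    @property
    def status(self) -> Optional[str]:
        """Most recent tag, or None if nothing was recorded."""
        return self.history[-1][0] if self.history else None


job = Job("urn:example:job:42")
job.record("SUBMITTED", datetime(2007, 5, 1, 10, 0, tzinfo=timezone.utc))
job.record("RUNNING", datetime(2007, 5, 1, 10, 7, tzinfo=timezone.utc))
queued_for = job.history[1][1] - job.history[0][1]
```

Keeping the full list rather than only the latest status makes it possible to derive durations between states, which is the kind of data a usage record would need.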

Questions and Open issues

static vs. dynamic attribute values
- first, we should define what is static and dynamic
  - a possible improvement is to distinguish between configuration and monitoring
- should static/configuration-related attributes be separated from dynamic attributes?
- from time to time, people ask for that
  - pro: easier implementation
  - con: UML/Schema less readable and more complex
- possible approach:
  - at the UML class diagram level, static vs. dynamic is not described by separation into different classes, but with annotations/tagged values
  - at the concrete data model level (relational, XMLSchema), static vs. dynamic can be considered in order to simplify the implementation
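
The annotation-based approach can be mimicked in code by tagging attributes instead of splitting classes. In this sketch, Python dataclass field metadata plays the role of UML tagged values; the "volatility" key and the helper names are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import List


def static(**kwargs):
    """Field tagged as static (configuration-related)."""
    return field(metadata={"volatility": "static"}, **kwargs)


def dynamic(**kwargs):
    """Field tagged as dynamic (monitoring-related)."""
    return field(metadata={"volatility": "dynamic"}, **kwargs)


@dataclass
class ComputingElement:
    uri: str = static()
    name: str = static()
    running_jobs: int = dynamic(default=0)
    waiting_jobs: int = dynamic(default=0)


def attributes_tagged(cls, volatility: str) -> List[str]:
    """List the attribute names carrying the given tag."""
    return [f.name for f in fields(cls)
            if f.metadata.get("volatility") == volatility]


static_attrs = attributes_tagged(ComputingElement, "static")
dynamic_attrs = attributes_tagged(ComputingElement, "dynamic")
```

A mapping to a concrete data model could use such tags to decide, for example, which attributes go into a separately refreshed table, while the conceptual class stays in one piece.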

do we need a common authorization entity?
- this kind of information is needed in several classes, and we should have a common way to define it and to relate it to the various classes
- how would this relate to the share?

how to deal with extensibility?
- in GLUE 1.3: tag-like approach: capability[*], (key,value)[*]
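
A minimal sketch of the tag-like approach, assuming that capability[*] is an unordered set of tags and that (key,value)[*] allows a key to repeat. The class name, tags and keys are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Extensible:
    capabilities: Set[str] = field(default_factory=set)               # capability[*]
    extensions: List[Tuple[str, str]] = field(default_factory=list)   # (key,value)[*]

    def values_for(self, key: str) -> List[str]:
        """All values published under a key, in insertion order."""
        return [v for k, v in self.extensions if k == key]


entity = Extensible(
    capabilities={"example-capability"},
    extensions=[("contact", "admin@example.org"),
                ("contact", "support@example.org"),
                ("note", "hypothetical entry")],
)
contacts = entity.values_for("contact")
```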

In which entities, and in what way, should we display owner information?
- for contact, support and ownership information, is it enough to have it at the site level, or should we also have it at the "element" level?

when reviewing the directory information in Computing Resource, consider the standardization activity on Grid job environment

how to deal with interactive jobs? where do we publish information about the service providing interaction with jobs? (a different service specialization together with the computing service?)

Conceptual Model of the Storage Element

See the attachment (SE.ppt)

Attachments:

SE.ppt [InitialSketchOnEntities/SE.ppt]

GLUE2.zargo [InitialSketchOnEntities/GLUE2.zargo]

ComputingElement.png [InitialSketchOnEntities/ComputingElement.png]

Core.png [InitialSketchOnEntities/Core.png]
