This is a static archive of the previous Open Grid Forum GridForge content management system saved from host forge.ogf.org file /sf/wiki/do/viewPage/projects.gin/wiki/WorkerNodeEnvironment at Fri, 04 Nov 2022 20:12:38 GMT SourceForge : View Wiki Page: WorkerNodeEnvironment

wiki1786: WorkerNodeEnvironment
The Execution Environment

Introduction

This document presents a recommendation for a middleware-independent environment that a grid job can find on every execution host. The need for such an environment has been demonstrated during various interoperation activities, including the recent work within GIN-CG. The recommendation takes inspiration from a similar recommendation from EGEE and has gathered input from other projects including OSG, TeraGrid, NorduGrid and DEISA.

This document covers three main areas: environment variables, bootstrapping environments and publishing application environments. It may be extended to include application software deployment.

The Environment Variables

The following list gives the name and semantics of each environment variable. The semantics should agree with JSDL and the Glue Schema where applicable.
GRID_WN_ID
An identifier for the execution host within a cluster.
GRID_CLUSTER_ID
A unique identifier for the cluster. This should be in agreement with what is published into the information system.
GRID_SITE_ID
A unique identifier for the site. This should be in agreement with what is published into the information system.
GRID_NAME
The name of the grid infrastructure in which the site is participating.
GRID_LOCAL_JOB_ID
Job identifier within the local batch system.
GRID_GLOBAL_JOB_ID
Job identifier at the grid level.
GRID_USER_ID
Job submitter identifier.
GRID_VO_ID
A unique identifier for the VO to which the job submitter belongs.
GRID_APPLICATION_ROOT
The path on the execution host where the application software can be found.
HOME
Home directory on the execution host.
TMPDIR
Directory in which to create temporary files.
GRID_CLUSTER_HOME
Home directory which is shared across the cluster.
GRID_GLOBAL_HOME
Home directory which is shared across the grid infrastructure.
GRID_CLUSTER_SCRATCH
Scratch directory which is shared across the cluster.
GRID_GLOBAL_SCRATCH
Scratch directory which is shared across the grid infrastructure.
GRID_HOME_TYPE
Type of file system used for the home directories. An enumerated list including PFS, GPFS, PVFS, etc.
GRID_SCRATCH_TYPE
Type of file system used for the scratch directories. An enumerated list including PFS, GPFS, PVFS, etc.
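
As an illustration of how a job might consume these variables, the following is a minimal Python sketch that reports which of the names listed above are set on the execution host. The function name read_grid_environment and the idea of returning the missing names are assumptions made for this example; the recommendation itself does not say what a job should do when a variable is absent.

```python
import os

# Variable names copied from the list above.
GRID_VARIABLES = [
    "GRID_WN_ID", "GRID_CLUSTER_ID", "GRID_SITE_ID", "GRID_NAME",
    "GRID_LOCAL_JOB_ID", "GRID_GLOBAL_JOB_ID", "GRID_USER_ID", "GRID_VO_ID",
    "GRID_APPLICATION_ROOT", "HOME", "TMPDIR",
    "GRID_CLUSTER_HOME", "GRID_GLOBAL_HOME",
    "GRID_CLUSTER_SCRATCH", "GRID_GLOBAL_SCRATCH",
    "GRID_HOME_TYPE", "GRID_SCRATCH_TYPE",
]


def read_grid_environment(environ=None):
    """Return (found, missing) for the variables listed above.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to exercise outside a real execution host.
    This helper is hypothetical and not part of the recommendation.
    """
    env = os.environ if environ is None else environ
    found = {name: env[name] for name in GRID_VARIABLES if name in env}
    missing = [name for name in GRID_VARIABLES if name not in env]
    return found, missing


if __name__ == "__main__":
    found, missing = read_grid_environment()
    for name, value in found.items():
        print(f"{name}={value}")
    for name in missing:
        print(f"{name} is not set")
```

Treating an absent variable as "not set" rather than as an error is a design choice of this sketch only.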

Bootstrapping Environments

Bootstrapping scripts should be used to set up application environments. The location of the bootstrapping scripts will depend on the VO and the application. VO names should be globally unique. All the application software for a specific VO should be located under the directory ${GRID_APPLICATION_ROOT}/${GRID_VO_ID}. It is up to the VO to manage the space beneath this directory. For common applications and middleware clients which are pre-installed across an infrastructure (e.g. Globus, gLite), the same mechanism can be used; however, the application name is used instead of ${GRID_VO_ID}, for example ${GRID_APPLICATION_ROOT}/globus.
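
The directory convention described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name application_directory is hypothetical, the example root /opt/grid/apps and VO name examplevo are made up, and the layout beneath the returned directory (including the name of any bootstrapping script) is left to the VO and is not defined here.

```python
from pathlib import PurePosixPath


def application_directory(environ, application=None):
    """Build ${GRID_APPLICATION_ROOT}/<name> from an environment mapping.

    <name> is the application name when one is given (for software
    pre-installed across an infrastructure, such as globus), otherwise
    the value of GRID_VO_ID. Hypothetical helper for illustration only.
    """
    root = environ.get("GRID_APPLICATION_ROOT")
    if not root:
        raise KeyError("GRID_APPLICATION_ROOT is not set")
    name = application if application is not None else environ.get("GRID_VO_ID")
    if not name:
        raise KeyError("GRID_VO_ID is not set and no application name was given")
    return PurePosixPath(root) / name


if __name__ == "__main__":
    # Made-up values, used only to show the resulting paths.
    env = {"GRID_APPLICATION_ROOT": "/opt/grid/apps", "GRID_VO_ID": "examplevo"}
    print(application_directory(env))            # VO software area
    print(application_directory(env, "globus"))  # common middleware client
```

Raising an error when a variable is missing is a choice made for this sketch; the recommendation does not specify the behaviour in that case.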

Publishing Application Environments

The application environments available should be published in accordance with the Glue Schema. The mechanism used to publish them depends on the method used to deploy the application software.

Application Software Deployment

This topic still needs to be discussed.

Feedback

Aleksandr

Many statements in the draft are clearly not applicable to some execution environments, and some of them are not even grid-related. Hence it would be nice to have tags such as MUST, SHALL, etc. assigned to every item, together with the action to be taken when an item is absent. The purpose or usage of each defined item would also be nice to have; that would help in understanding what each item is expected to represent.

Here are some items whose meaning is not clear to me:

GRID_SITE_ID - what is a "site"?
GRID_GLOBAL_JOB_ID - at the grid level a job may have multiple IDs. Should it be the one known to the initial client, or the last one known during the job's submission to the batch system? Or a list of them?
GRID_VO_ID - there may be jobs (and users) unrelated to any VO. Should there be some generic VO name? Or some ad-hoc name? Or a VO per user?
GRID_APPLICATION_ROOT - what is "application software"?
HOME - what if jobs are not assigned any home-like environment?
GRID_CLUSTER_HOME - what if there is no "home" shared among the cluster's nodes? The same goes for GRID_GLOBAL_HOME, GRID_CLUSTER_SCRATCH and GRID_GLOBAL_SCRATCH. Since those locations are shared, their access protocol is most probably not POSIX. What is the format of the associated value? A URL?
GRID_HOME_TYPE - why enumerated? What about exotic file systems?

Why does the location of the bootstrapping scripts include the VO name at all? What about software shared among VOs, or used by non-VO users? Is any namespace planned to distinguish between different software packages with the same name?

A.K.

 



Version Version Comment Created By
Version 10 Laurence Field - 06/27/2007
Version 9 Laurence Field - 06/25/2007
Version 8 Laurence Field - 06/25/2007
Version 7 Laurence Field - 06/25/2007
Version 6 Laurence Field - 06/25/2007
Version 5 Laurence Field - 06/25/2007
Version 4 Laurence Field - 06/25/2007
Version 3 Laurence Field - 06/25/2007
Version 2 Laurence Field - 06/25/2007
Version 1 Laurence Field - 06/06/2007


