wiki1786: WorkerNodeEnvironment

The Execution Environment

Introduction

This document presents a recommendation for a middleware-independent environment that a grid job can find on every execution host. The requirement for this has been demonstrated during various interoperation activities, including the recent work within GIN-CG. This recommendation takes inspiration from a similar recommendation from EGEE and has gathered input from other projects including OSG, TeraGrid, NorduGrid and DEISA.

This document covers three main areas: environment variables, bootstrapping environments and publishing application environments. It may be extended to include application software deployment.

The Environment Variables

The following is a list of the names and semantics of the environment variables. The semantics should agree with JSDL and the Glue Schema where applicable.

GRID_WN_ID: An identifier for the execution host within a cluster.

GRID_CLUSTER_ID: A unique identifier for the cluster. This should be in agreement with what is published into the information system.

GRID_SITE_ID: A unique identifier for the site. This should be in agreement with what is published into the information system.

GRID_NAME: The name of the grid infrastructure in which the site is participating.

GRID_LOCAL_JOB_ID: Job identifier within the local batch system.

GRID_GLOBAL_JOB_ID: Job identifier at the grid level.

GRID_USER_ID: Job submitter identifier.

GRID_VO_ID: A unique identifier for the VO to which the job submitter belongs.

GRID_APPLICATION_ROOT: The path on the execution host where the application software can be found.

HOME: Home directory on the execution host.

TMPDIR: Directory in which to create temporary files.

GRID_CLUSTER_HOME: Home directory which is shared across the cluster.

GRID_GLOBAL_HOME: Home directory which is shared across the grid infrastructure.

GRID_CLUSTER_SCRATCH: Scratch directory which is shared across the cluster.

GRID_GLOBAL_SCRATCH: Scratch directory which is shared across the grid infrastructure.

GRID_HOME_TYPE: Type of file system used for the home directories. An enumerated list including PFS, GPFS, PVFS, etc.

GRID_SCRATCH_TYPE: Type of file system used for the scratch directories. An enumerated list including PFS, GPFS, PVFS, etc.
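As an illustration of how a job might consume the variables listed above, the following Python sketch reads them from the environment and reports which ones are unset. The names `GRID_VARIABLES` and `read_grid_environment` are hypothetical and exist only for this example. The draft does not state which variables are mandatory, so the sketch treats none of them as required.

```python
import os

# Variable names as listed in this document; the list name itself is
# a hypothetical helper for this sketch.
GRID_VARIABLES = [
    "GRID_WN_ID", "GRID_CLUSTER_ID", "GRID_SITE_ID", "GRID_NAME",
    "GRID_LOCAL_JOB_ID", "GRID_GLOBAL_JOB_ID", "GRID_USER_ID", "GRID_VO_ID",
    "GRID_APPLICATION_ROOT", "HOME", "TMPDIR",
    "GRID_CLUSTER_HOME", "GRID_GLOBAL_HOME",
    "GRID_CLUSTER_SCRATCH", "GRID_GLOBAL_SCRATCH",
    "GRID_HOME_TYPE", "GRID_SCRATCH_TYPE",
]


def read_grid_environment(environ=None):
    """Return (found, missing) for the variables listed in this document.

    found maps each variable that is set to its value; missing lists the
    names that are not set. Nothing is treated as mandatory here.
    """
    env = os.environ if environ is None else environ
    found = {name: env[name] for name in GRID_VARIABLES if name in env}
    missing = [name for name in GRID_VARIABLES if name not in env]
    return found, missing


if __name__ == "__main__":
    found, missing = read_grid_environment()
    for name, value in found.items():
        print(f"{name}={value}")
    for name in missing:
        print(f"{name} is not set")
```

Passing a plain dictionary instead of relying on `os.environ` makes it possible to try the function outside a real execution host.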

Bootstrapping Environments

Bootstrapping scripts should be used to bootstrap application environments. The location of the bootstrapping scripts will depend on the VO and the application. VO names should be globally unique. All the application software for a specific VO should be located under the directory ${GRID_APPLICATION_ROOT}/${GRID_VO_ID}. It is up to the VO to manage the space beneath this directory. For common applications and middleware clients which are pre-installed across an infrastructure, e.g. Globus, gLite, etc., the same mechanism can be used; however, the application name is used instead of ${GRID_VO_ID}, for example ${GRID_APPLICATION_ROOT}/globus.
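The directory layout described above can be sketched in Python as follows. The function name `application_dir` and the example values (`/opt/apps`, `atlas`) are hypothetical. The document does not specify the names of the bootstrapping scripts inside the directory, so the sketch resolves only the directory itself.

```python
from pathlib import PurePosixPath


def application_dir(environ, app_name=None):
    """Resolve the software directory for a VO or for a common application.

    With no app_name, returns ${GRID_APPLICATION_ROOT}/${GRID_VO_ID}.
    With an app_name (e.g. "globus"), the application name replaces
    the VO identifier, as described for pre-installed software.
    """
    root = PurePosixPath(environ["GRID_APPLICATION_ROOT"])
    leaf = app_name if app_name is not None else environ["GRID_VO_ID"]
    return root / leaf


if __name__ == "__main__":
    # Example values are assumptions made for illustration only.
    env = {"GRID_APPLICATION_ROOT": "/opt/apps", "GRID_VO_ID": "atlas"}
    print(application_dir(env))            # /opt/apps/atlas
    print(application_dir(env, "globus"))  # /opt/apps/globus
```

A `KeyError` is raised if a needed variable is not set, which leaves the decision about how to handle an incomplete environment to the caller.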

Publishing Application Environments

The application environments available should be published in accordance with the Glue Schema. The mechanism used to publish them depends on the method used to deploy the application software.

Application Software Deployment

This topic still needs to be discussed.

Feedback

Aleksandr

Many statements in the draft are clearly not applicable to some execution environments, and some of them are not even grid-related. Hence it would be nice to have tags such as MUST, SHALL, etc. assigned to every item, along with the action to be taken when an item is absent. The purpose and usage of each defined item would also be nice to have; that would help in understanding what each item is expected to represent.

Here are some items whose meaning is not clear to me:

GRID_SITE_ID - what is a "site"?

GRID_GLOBAL_JOB_ID - at the grid level a job may have multiple IDs. Should it be the one known to the initial client, or the last one known during the job's submission to the batch system? Or a list of them?

GRID_VO_ID - there may be jobs (and users) unrelated to any VO. Should there be some generic VO name? Or some ad-hoc name? Or a VO per user?

GRID_APPLICATION_ROOT - what is "application software"?

HOME - what if jobs are not assigned any home-like environment?

GRID_CLUSTER_HOME - what if there is no "home" shared among the cluster's nodes? The same goes for GRID_GLOBAL_HOME, GRID_CLUSTER_SCRATCH and GRID_GLOBAL_SCRATCH. Since those locations are shared, their access protocol is most probably not POSIX. What is the format of the associated value, a URL?

GRID_HOME_TYPE - why enumerated? What about exotic file systems?

Why does the location of the bootstrapping scripts include the VO name at all? What about software shared among VOs, or used by non-VO users? Is there any namespace planned to distinguish between different software packages with the same name?

A.K.

Version	Version Comment	Created By
Version 10		Laurence Field - 06/27/2007
Version 9		Laurence Field - 06/25/2007
Version 8		Laurence Field - 06/25/2007
Version 7		Laurence Field - 06/25/2007
Version 6		Laurence Field - 06/25/2007
Version 5		Laurence Field - 06/25/2007
Version 4		Laurence Field - 06/25/2007
Version 3		Laurence Field - 06/25/2007
Version 2		Laurence Field - 06/25/2007
Version 1		Laurence Field - 06/06/2007