Description: |
Peter V. posted this to the mailing list. I put it here to track it for post-V1 activity. See the mailing list thread with
subject:
[ogsi-wg] Iterator portType
for the discussion.
================
Hi folks,
In the DAIS WG on Thursday (at GGF, 6/26), there was discussion of the
properties of something they've called a Dataset. It occurred to me
that these properties are a specialization of what is often called an
iterator.
It seems to me there is value in defining a core Iterator portType
(aka interface). For one thing, it allows for Unix-style pipes in the
Grid context. Also, this portType would allow for generic procedures
to be created such as a service ("map") that takes a unary function
and an iterator and returns an iterator yielding the results of the
function applied to each element yielded by the input iterator. Such
a portType maps to various programming language constructs such as
Java's Iterator interface.
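A minimal sketch of the "map" idea, using Python's built-in iterator protocol as a local stand-in for the remote portType. The name `map_service` and the sample data are assumptions made for illustration; nothing here is part of the proposal itself.

```python
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_service(fn: Callable[[T], U], source: Iterator[T]) -> Iterator[U]:
    """Hypothetical 'map': yield fn(x) for each x yielded by source."""
    for item in source:
        # Elements are pulled one at a time, so the stage behaves like
        # one segment of a Unix-style pipe rather than a batch job.
        yield fn(item)


# Two chained stages, loosely analogous to `producer | double | increment`.
doubled = map_service(lambda x: x * 2, iter([1, 2, 3]))
pipeline = map_service(lambda x: x + 1, doubled)
result = list(pipeline)
```

Because each stage only asks its input for the next element, stages can be composed without any of them knowing how the upstream data is produced.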
There is more on a possible iterator interface below, but first:
  - is it useful to standardize such a concept and, if so,
  - what is the appropriate forum for doing so and
  - what's the mechanism?
I suspect it is OGSI that should take on defining such core things,
although maybe OGSA should officially delegate the task to OGSI.
So, here's a starting point for a definition. The Iterator portType
would extend the GridService portType and have an operation
next() -> xsd:anyType
taking no parameters and yielding some chunk of XML, or returning
an EndOfData fault (to be specified). There should also be an
operation
hasNext() -> xsd:boolean
that indicates whether there are more elements to be generated. The
Iterator portType would also define a single service data element
(SDE)
yieldType: xsd:QName
which would give the type of the elements produced by next().
The Iterator portType might also have a reset() operation with the
provision that not all services implement it. Perhaps there'd be
an SDE announcing this functionality. Perhaps there'd also be a
non-blocking version of next().
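The operations described above could be sketched locally as follows. This is only an in-memory illustration in Python, not a portType definition: the class name `ListIterator`, the `resettable` flag (standing in for the SDE that might announce reset()), and the use of an exception for the EndOfData fault are all assumptions.

```python
from typing import Any, List


class EndOfData(Exception):
    """Stand-in for the EndOfData fault, which the proposal leaves unspecified."""


class ListIterator:
    """Hypothetical in-memory iterator mirroring the proposed operations."""

    def __init__(self, items: List[Any], yield_type: str, resettable: bool = True):
        self._items = list(items)
        self._pos = 0
        # Plain attributes stand in for service data elements (SDEs).
        self.yield_type = yield_type    # a QName, rendered here as a string
        self.resettable = resettable    # assumed SDE announcing reset()

    def has_next(self) -> bool:
        """Report whether another element can be produced."""
        return self._pos < len(self._items)

    def next(self) -> Any:
        """Return the next element, or raise EndOfData when exhausted."""
        if not self.has_next():
            raise EndOfData("no more elements")
        item = self._items[self._pos]
        self._pos += 1
        return item

    def reset(self) -> None:
        """Rewind to the first element, if this service supports it."""
        if not self.resettable:
            raise NotImplementedError("reset() is not supported by this service")
        self._pos = 0


it = ListIterator(["<a/>", "<b/>"], yield_type="xsd:anyType")
seen = []
while it.has_next():
    seen.append(it.next())
```

A client that cannot rely on hasNext() could instead call next() until the fault arrives; the sketch supports both styles.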
At the level of Iterator, the semantics would be mostly unspecified,
allowing this interface to be used in many situations. Subinterfaces
could add constraints and/or add further operations. Operations
returning Iterators might also specify further properties. Note that
this interface could be used to access both precomputed (synchronous)
data and compute-on-demand data.
The DAIS-WG could define (say) a RowSetIterator portType that extends
(inherits from) this simple Iterator. It would specify that the
output of next() is a "row" (specified as an XML type) and provide an
SDE for determining the types of columns. It might also have SDEs to
allow introspection of various DB-related properties. Note that the
lifetime of such a grid service could be considered as close(),
allowing for both client initiated close and for close when data has
been consumed.
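As a rough illustration of such an extension, the Python sketch below models a row iterator whose lifetime doubles as close(). The class layout, the `column_types` attribute (standing in for the column-type SDE), the `destroy()` method name, and the sample columns are assumptions, not anything DAIS has defined.

```python
from typing import Dict, List, Tuple


class EndOfData(Exception):
    """Stand-in for the EndOfData fault."""


class RowSetIterator:
    """Hypothetical row iterator; names and type strings are illustrative."""

    def __init__(self, columns: List[Tuple[str, str]], rows: List[tuple]):
        # Stand-in for an SDE describing the column types.
        self.column_types: Dict[str, str] = dict(columns)
        self._names = [name for name, _ in columns]
        self._rows = list(rows)
        self._pos = 0
        self.destroyed = False  # models the end of the service's lifetime

    def has_next(self) -> bool:
        return not self.destroyed and self._pos < len(self._rows)

    def next(self) -> Dict[str, object]:
        if not self.has_next():
            raise EndOfData("row set exhausted or destroyed")
        row = dict(zip(self._names, self._rows[self._pos]))
        self._pos += 1
        if self._pos == len(self._rows):
            self.destroy()  # close once the data has been consumed
        return row

    def destroy(self) -> None:
        """Client-initiated close: ending the lifetime releases the rows."""
        self.destroyed = True
        self._rows = []


columns = [("id", "xsd:int"), ("name", "xsd:string")]
rows = RowSetIterator(columns, [(1, "ada"), (2, "grace")])
first = rows.next()
rows.destroy()  # the client closes early, before the second row is read
```

Both close paths end in the same state, so a client never needs a separate close() operation.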
Since the Iterator returns any XML, one thing it can return is
Locators, allowing iterating over arbitrary collections of Grid
Services. Perhaps there would be portTypes extending the service
group portTypes that can iterate over service collections and
portTypes that populate a collection from an iterator.
I am aware that there are many unspecified semantic issues. For instance,
what's the relationship of an iterator and its source when the source
is modified before the iterator completes? This core definition of
Iterator would deliberately *not* specify these sorts of issues,
allowing the iterator to be used in as many cases as possible. These
issues would be pushed to sub-portTypes and to the operations that
yield iterators.
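To make the source-modification question concrete, here is a small Python sketch of two policies a sub-portType might choose between. The function names `snapshot_iterator` and `live_iterator` are hypothetical, and the behaviour shown is that of Python lists, used only as an analogy.

```python
from typing import Iterator, List


def snapshot_iterator(source: List[str]) -> Iterator[str]:
    """Copy the source when the iterator is created; later changes are invisible."""
    return iter(list(source))


def live_iterator(source: List[str]) -> Iterator[str]:
    """Read from the source itself; appended elements become visible."""
    return iter(source)


source = ["a", "b"]
snap = snapshot_iterator(source)
live = live_iterator(source)

# The source is modified before either iterator has completed.
source.append("c")

snapshot_seen = list(snap)
live_seen = list(live)
```

Both behaviours satisfy the core next()/hasNext() contract, which is why the choice can be left to the operation or sub-portType that hands out the iterator.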
So, is this useful and, if so, what should be done about it?
Pete . |