SourceForge : artf5886: XPath compliance for Query operations

Project: RUS-WG Trackers > Doc Change Request > View Artifact

Artifact artf5886 : XPath compliance for Query operations

Tracker:	Doc Change Request
Title:	XPath compliance for Query operations
Description:	Slide 6: Should RUS support full XPath or should it restrict XPath? Full XPath compliance would be good because clients would not mysteriously get back failed requests without knowing why. Full XPath compliance also means that you have to be able to return parts of UsageRecords if the client selected only a part of a record. This is currently not allowed by the specification because it requires you to return whole URs only. XPath queries can be quite slow to execute and destroy any performance. This was the reason why we have actually removed all XPath queries from our RUS and only allow pre-canned queries via separate operations (extractByUser etc.). For performance I would recommend putting the extractBy... methods back into the specification. Also, XPath queries are hard to map to SQL queries, and RDBs are needed for performance. How is this handled by other people? The RUSQueryTooComplexFault (or equivalent) would give a server the chance to respond reasonably when it deems the query too complex or too time-consuming to execute. Suggestion 2 (restrict XPath) would be covered by Suggestion 1 (support full XPath, allow RUSQueryTooComplexFault) since a server could restrict the allowed XPath queries by returning the fault. If partial URs may be returned, how can a client validate the result against the UR schema? This problem should also have been tackled by the DAIS-WG since they also have to return query results. Check how they addressed that problem.
Submitted By:	Gilbert Netzer
Submitted On:	05/28/2007 4:16 AM EDT
Last Modified:	05/28/2007 5:00 AM EDT


Status
Group: *
Status:*	Open
Category: *
Customer: *
Priority: *	3
Assigned To: *	None
Reported in Release: *
Fixed in Release: *
Estimated Hours: *	0
Actual Hours: *	0

Comments

Gilbert Netzer: 05/28/2007 5:00 AM EDT

Comment:

Comments from EMail from Xiaoyou Chen, 05/18/2007 05:43 PM (further down in the EMail)

We had a long discussion about RUS extraction operations in terms of complex XPath and "BIG" returns. The solution proposed at OGF20 is to have a 
RUSQueryTooComplexFault, which seems important for fully XPath-compatible extraction input. However, does this really fit into the clarification 
within the specification?
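
A minimal sketch of how a server might apply the RUSQueryTooComplexFault idea mentioned above. The cost measure (counting path steps and predicates), the limit of 8, and the function names are assumptions made for illustration; none of them comes from the RUS specification. The element names in the usage example are likewise only illustrative.

```python
class RUSQueryTooComplexFault(Exception):
    """Hypothetical Python stand-in for the proposed fault."""


def estimate_complexity(xpath: str) -> int:
    """Crude, assumed cost measure: slashes plus twice the predicates."""
    steps = xpath.count("/")
    predicates = xpath.count("[")
    return steps + 2 * predicates


def check_query(xpath: str, limit: int = 8) -> str:
    """Return the query unchanged, or raise the fault if it looks too costly."""
    cost = estimate_complexity(xpath)
    if cost > limit:
        raise RUSQueryTooComplexFault(
            f"estimated cost {cost} exceeds limit {limit}"
        )
    return xpath
```

With this measure, `check_query("/urf:UsageRecord")` passes, while a query with several nested predicates and `//` steps would be rejected before execution. A real server could equally well base the decision on elapsed time or on current load.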

Action:

Update

Gilbert Netzer: 05/28/2007 4:30 AM EDT

Comment:

Comments from EMail from Xiaoyou Chen, 05/18/2007 05:43 PM

Given agreement on a fully XPath-compliant RUS::Extraction, I will give a brief review of the query-specific service interface definitions:
2.1 RUS::extractUsageRecords
      This operation allows querying usage records according to an XPath search term.
      Input: search criteria expression as XPath;
      Output: A list of returns;
                    OperationResult;
      Faults: RUSUserNotAuthorisedFault; RUSInvalidInputFault; RUSInternalFault;
      Issues: This operation allows flexible queries that return partial usage information, e.g. a user queries only the charge information of his or 
her jobs. How can the returned usage records then be validated?
      Potential Solutions:
                       Option 1: do not validate the returns at all; the output is simply the list of returns;
                       Option 2: restrict the output to OGF-UR records with urf:RecordIdentity as a mandatory element. However, the 
specification does not clarify how to put urf:RecordIdentity into the returns and leaves this to implementations. For example, an implementation of 
RUS::extractUsageRecords might check the XPath input and extend it so that urf:RecordIdentity is returned implicitly. In this sense, the output 
of this operation should be a list of usage records.
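
One way to picture Option 2 is a server that pairs every partial match with the identity of the record it came from. The sketch below uses Python's standard xml.etree.ElementTree; the namespace URI "urn:example:urf", the sample document, the recordId attribute layout and the function name extract_partial are placeholders assumed for illustration, not taken from the UR or RUS specifications.

```python
import xml.etree.ElementTree as ET

URF = "urn:example:urf"  # placeholder namespace, not the real UR namespace
NS = {"urf": URF}

DOC = """
<urf:UsageRecords xmlns:urf="urn:example:urf">
  <urf:UsageRecord>
    <urf:RecordIdentity urf:recordId="rec-1"/>
    <urf:Charge>12.5</urf:Charge>
  </urf:UsageRecord>
  <urf:UsageRecord>
    <urf:RecordIdentity urf:recordId="rec-2"/>
    <urf:Charge>3.0</urf:Charge>
  </urf:UsageRecord>
</urf:UsageRecords>
"""


def extract_partial(root, relative_path):
    """Return (recordId, matched elements) for each record with a match."""
    results = []
    for record in root.findall("urf:UsageRecord", NS):
        matches = record.findall(relative_path, NS)
        identity = record.find("urf:RecordIdentity", NS)
        if not matches or identity is None:
            continue
        # The record identity is added implicitly, as Option 2 suggests.
        record_id = identity.get(f"{{{URF}}}recordId")
        results.append((record_id, matches))
    return results


root = ET.fromstring(DOC)
pairs = [(rid, [m.text for m in ms])
         for rid, ms in extract_partial(root, "urf:Charge")]
```

Here the client asked only for charges but can still tell which record each charge belongs to. The returned fragments are still not complete usage records, so validation against the UR schema remains an open question.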
     
2.2   RUS::extractRecordIds
       This operation helps clients to obtain record identities according to an XPath search term.
       Input: search criteria expression as XPath;
       Output: A list of record identities (string);
                      OperationResult;
       Faults: RUSUserUnauthorisedFault; RUSInvalidInputFault; RUSInternalFault;
 
2.3   RUS::extractSpecUsageRecords (new Proposal)
        This operation allows queries for usage records using recordId as the search key;
        Input: a list of recordIds as strings;
        Output: A list of usage records;
        Faults: RUSUserUnauthorisedFault; RUSRecordNotFoundFault; RUSInternalFault;
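
The two-step flow implied by 2.2 and 2.3 (first obtain record identities, then fetch whole records by identity) could look roughly like the sketch below. Everything here is a stand-in: the in-memory STORE, the Python function names, the use of a callable in place of the XPath search term, and the choice to fail the whole request when any id is unknown are assumptions, not part of the proposal.

```python
class RUSRecordNotFoundFault(Exception):
    """Hypothetical Python stand-in for the fault listed in 2.3."""


# Assumed in-memory store: recordId -> simplified usage record.
STORE = {
    "rec-1": {"user": "alice", "charge": 12.5},
    "rec-2": {"user": "bob", "charge": 3.0},
    "rec-3": {"user": "alice", "charge": 7.0},
}


def extract_record_ids(matches):
    """2.2 stand-in: `matches` is a callable replacing the XPath term."""
    return [rid for rid, rec in STORE.items() if matches(rec)]


def extract_spec_usage_records(record_ids):
    """2.3 stand-in: return whole records for the given ids, or raise."""
    missing = [rid for rid in record_ids if rid not in STORE]
    if missing:
        raise RUSRecordNotFoundFault(f"unknown record ids: {missing}")
    return [STORE[rid] for rid in record_ids]


ids = extract_record_ids(lambda rec: rec["user"] == "alice")
records = extract_spec_usage_records(ids)
```

Because the second call always returns whole records, its output could be validated against the UR schema, which sidesteps the partial-return issue raised under 2.1.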

Action:

Update

Gilbert Netzer: 05/28/2007 4:25 AM EDT

Action:

Update
Description changed from

Slide 6: Should RUS support full XPath or should it restrict XPath?

Full XPath compliance would be good because clients would not mysteriously get
back failed requests without knowing why. 

Full XPath compliance also means that you have to be able to return parts of
UsageRecords if the client selected only a part of a record. This is currently
not allowed by the specification because it requires you to return whole URs
only. 

XPath queries can be quite slow to execute and destroy any performance. This 
was the reason why we have actually removed all XPath queries from our RUS and
only allow pre-canned queries via separate operations (extractByUser etc.).
For performance I would recommend to put the extractBy... methods back into
the specification.
Also XPath queries are hard to map to SQL queries and RDBs are needed for
performance. How is this handled by other people?

The RUSQueryTooComplexFault (or equivalent) would give a server the chance to
respond reasonably when it deems the query as being too complex or taking too
much time to execute.

Suggestion 2 (restrict XPath) would be covered by Suggestion 1 (support full
XPath, allow RUSQueryTooComplexFault) since a server could restrict the
allowed XPath queries by returning the fault.

A possible solution to long queries could be to allow for a reply of style "I
am too busy, come back later". This could also be used selectively to defer
complex queries until low system load allows their handling.
That could be hard to implement in case of many concurrent requests, and it
would not solve the problem of many complex queries.

One solution to the problem could be to require proper authorization to be
allowed to execute complex queries. (e.g. the authorization decision also
is based on the query).

One solution would also be to have a method to return a catalog of allowed
queries that the server is willing to process. A variant of this would be
to have an operation to check if a server is willing to execute a query.
This functionality could already be provided by the query operation, because 
it will tell you if the server rejected the query and return the result 
otherwise.
An idea in conjunction with this would be to have a minimum catalog of simple
queries that an instance has to execute to give clients a well-known fall-back
mechanism.

In case of allowing partial URs to be returned, how can a client validate the
result against the UR schema?
This problem should also have been tackled by the DAIS-WG since they also have
to return query results. Check how they addressed that problem.

to

Slide 6: Should RUS support full XPath or should it restrict XPath?

Full XPath compliance would be good because clients would not mysteriously get
back failed requests without knowing why. 

Full XPath compliance also means that you have to be able to return parts of
UsageRecords if the client selected only a part of a record. This is currently
not allowed by the specification because it requires you to return whole URs
only. 

XPath queries can be quite slow to execute and destroy any performance. This 
was the reason why we have actually removed all XPath queries from our RUS and
only allow pre-canned queries via separate operations (extractByUser etc.).
For performance I would recommend to put the extractBy... methods back into
the specification.
Also XPath queries are hard to map to SQL queries and RDBs are needed for
performance. How is this handled by other people?

The RUSQueryTooComplexFault (or equivalent) would give a server the chance to
respond reasonably when it deems the query as being too complex or taking too
much time to execute.

Suggestion 2 (restrict XPath) would be covered by Suggestion 1 (support full
XPath, allow RUSQueryTooComplexFault) since a server could restrict the
allowed XPath queries by returning the fault.

In case of allowing partial URs to be returned, how can a client validate the
result against the UR schema?
This problem should also have been tackled by the DAIS-WG since they also have
to return query results. Check how they addressed that problem.

Gilbert Netzer: 05/28/2007 4:23 AM EDT

Comment:

Comment from EMail by Rosario Piro, 05/15/2007 08:14 PM

Full XPath support vs. XPath limitations:
- I wouldn't say that XPath queries destroy the performance, although they can of course slow it down; that depends on the complexity and also on the 
underlying database. Even with a relational DB schema the execution of a query can be awfully slow if an XPath has to be translated into an SQL 
statement that causes many tables to be joined. Also, I think the most problematic part of the query is the selection of the requested records 
among the maybe millions of records in the database (be it relational or XML), and less the question of whether, after these records have been found, 
they should be returned completely or just in pieces (maybe returning complete records is even worse for performance, for example if a 
relational DB is used, which means the server will need to reassemble the complete UR documents before returning them ...?). But although we 
should keep performance issues in mind (nonetheless we can't foresee many problems without having gained a lot of experience with this), we should 
not focus too much on that. The question is rather whether we want the RUS interface to be completely XPath-compliant or to keep the restrictions 
that are currently in place. I think full compliance is the better choice, above all since we're talking about standards :o)
- If XPath allows retrieving only pieces/parts of URs: why should a client need to validate what it got back against the UR spec? If I'm 
interested only in a list of job IDs, then I will check only whether what I got back is a list of job IDs, not whether that list can be validated 
against the UR. And if I want to do that validation, this is perfectly fine, since I can always ask for entire records that I can then validate 
against the spec. It is up to the client to check what it gets back (knowing exactly what it wanted back: entire URs for validation, just a list of 
job IDs, or whatever ...)
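
To illustrate the join cost mentioned in the first point, here is a rough sketch of a pre-canned query in the style of extractByUser running against a relational store, using Python's standard sqlite3 module. The two-table schema, the column names and the sample rows are assumptions chosen for the example; a real RUS schema would have many more tables, which is where the joins become expensive.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database with an assumed two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE usage_record (record_id TEXT PRIMARY KEY, charge REAL);
        CREATE TABLE user_identity (record_id TEXT, local_user_id TEXT);
        INSERT INTO usage_record VALUES ('rec-1', 12.5), ('rec-2', 3.0);
        INSERT INTO user_identity VALUES ('rec-1', 'alice'), ('rec-2', 'bob');
    """)
    return conn


def extract_by_user(conn, local_user_id):
    """Pre-canned query: one fixed, parameterised SQL statement."""
    cursor = conn.execute(
        "SELECT r.record_id, r.charge FROM usage_record r "
        "JOIN user_identity u ON u.record_id = r.record_id "
        "WHERE u.local_user_id = ? ORDER BY r.record_id",
        (local_user_id,),
    )
    return cursor.fetchall()


conn = make_db()
rows = extract_by_user(conn, "alice")
```

A fixed query like this needs one known join. Translating an arbitrary XPath expression would require generating such joins on the fly for every element the expression touches, and returning complete records would mean reassembling each UR document from all of its tables.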

Action:

Update

Gilbert Netzer: 05/28/2007 4:16 AM EDT
	Action:	Create
