Implementation of XQuery Use Case

1 Use Case "XMP": Experiences and Exemplars

This use case contains several example queries that illustrate requirements gathered from the database and document communities.

1.1 Data

bib.dtd, bib.xml
books.dtd, books.xml
review.dtd, review.xml
prices.dtd, prices.xml

1.2 Queries

1.2.1 Q1

List books published by Addison-Wesley after 1991, including their year and title.

Solution in XQuery:

<bib>
{
for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
where $b/publisher = "Addison-Wesley" and $b/@year > 1991
return
<book year="{ $b/@year }">
{ $b/title }
</book>
}
</bib>

Result expected
Implementation in TOM

NOTE: No problem; easy to implement.

%match ....  <bib><book></book></bib>  ->

1.2.2 Q2

Create a flat list of all the title-author pairs, with each pair enclosed in a "result" element.

Solution in XQuery:

<results>
{
for $b in doc("http://bstore1.example.com/bib.xml")/bib/book,
$t in $b/title,
$a in $b/author
return
<result>
{ $t }
{ $a }
</result>
}
</results>
Result expected

Implementation in TOM

NOTE: No problem; easy to implement.

1.2.3 Q3

For each book in the bibliography, list the title and authors, grouped inside a "result" element.

Solution in XQuery:

<results>
{
for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
return
<result>
{ $b/title }
{ $b/author }
</result>
}
</results>
Result expected

Implementation in TOM

NOTE: No problem; easy to implement.

1.2.4 Q4

For each author in the bibliography, list the author's name and the titles of all books by that author, grouped inside a "result" element.

Solution in XQuery:

<results>
{
let $a := doc("http://bstore1.example.com/bib/bib.xml")//author
for $last in distinct-values($a/last),
$first in distinct-values($a[last=$last]/first)
order by $last, $first
return
<result>
<author>
<last>{ $last }</last>
<first>{ $first }</first>
</author>
{
for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
where some $ba in $b/author
satisfies ($ba/last = $last and $ba/first=$first)
return $b/title
}
</result>
}
</results>
Result expected
Implementation in TOM

NOTE: Difficult and time-consuming. A first loop collects the list of distinct authors; a second loop must then find all books written by each author.

I use a Vector to store the list of authors.
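
The two-loop approach described in the note can be sketched as follows. This is an illustrative Python sketch, not TOM code; the sample data and the name titles_by_author are invented for the example, and a plain list stands in for the Vector.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this sketch (not the real bib.xml).
BIB = """
<bib>
  <book><title>Book A</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Book B</title>
    <author><last>Stevens</last><first>W.</first></author>
    <author><last>Abiteboul</last><first>Serge</first></author></book>
</bib>
"""

def titles_by_author(bib):
    # Loop 1: collect the distinct (last, first) pairs, sorted like "order by".
    authors = sorted({(a.findtext("last"), a.findtext("first"))
                      for a in bib.iter("author")})
    # Loop 2: for each author, scan every book for a matching author child.
    result = []
    for last, first in authors:
        titles = [b.findtext("title") for b in bib.findall("book")
                  if any(a.findtext("last") == last
                         and a.findtext("first") == first
                         for a in b.findall("author"))]
        result.append(((last, first), titles))
    return result

grouped = titles_by_author(ET.fromstring(BIB))
```

The second loop rescans all books for every author, which is probably where the cost mentioned in the note comes from.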

1.2.5 Q5

For each book found at both bstore1.example.com and bstore2.example.com, list the title of the book and its price from each source.

Solution in XQuery:

<books-with-prices>
{
for $b in doc("http://bstore1.example.com/bib.xml")//book,
$a in doc("http://bstore2.example.com/reviews.xml")//entry
where $b/title = $a/title
return
<book-with-prices>
{ $b/title }
<price-bstore2>{ $a/price/text() }</price-bstore2>
<price-bstore1>{ $b/price/text() }</price-bstore1>
</book-with-prices>
}
</books-with-prices>
Result expected
Implementation in TOM

NOTE: NO problem

1.2.6 Q6

For each book that has at least one author, list the title and first two authors, and an empty "et-al" element if the book has additional authors.

Solution in XQuery:

<bib>
{
for $b in doc("http://bstore1.example.com/bib.xml")//book
where count($b/author) > 0
return
<book>
{ $b/title }
{
for $a in $b/author[position()<=2]
return $a
}
{
if (count($b/author) > 2)
then <et-al/>
else ()
}
</book>
}
</bib>
Result expected
Implementation in TOM

NOTE: No problem. Only a counter is needed to count the authors of each book.
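
The counter idea from the note might look like this. It is a Python sketch under assumptions, not the TOM implementation; the sample book and the function name first_two_authors are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample book, invented for this sketch.
BOOK = ET.fromstring(
    "<book><title>T</title>"
    "<author>A1</author><author>A2</author><author>A3</author></book>"
)

def first_two_authors(book):
    """Return a new <book> with the title, at most two authors, and <et-al/>."""
    out = ET.Element("book")
    out.extend(book.findall("title"))
    count = 0                        # the counter mentioned in the note
    for author in book.findall("author"):
        count += 1
        if count <= 2:
            out.append(author)       # keep only the first two authors
    if count == 0:
        return None                  # books without authors are skipped
    if count > 2:
        ET.SubElement(out, "et-al")  # marks that more authors exist
    return out

summary = first_two_authors(BOOK)
```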

1.2.7 Q7

List the titles and years of all books published by Addison-Wesley after 1991, in alphabetic order.

Solution in XQuery:

<bib>
{
for $b in doc("http://bstore1.example.com/bib.xml")//book
where $b/publisher = "Addison-Wesley" and $b/@year > 1991
order by $b/title
return
<book>
{ $b/@year }
{ $b/title }
</book>
}
</bib>
Result expected
Implementation in TOM

NOTE: EASY

1.2.8 Q8

Find books in which the name of some element ends with the string "or" and the same element contains the string "Suciu" somewhere in its content. For each such book, return the title and the qualifying element.

Solution in XQuery:

for $b in doc("http://bstore1.example.com/bib.xml")//book
let $e := $b/*[contains(string(.), "Suciu")
and ends-with(local-name(.), "or")]
where exists($e)
return
<book>
{ $b/title }
{ $e }
</book>

In the above solution, string(), local-name() and ends-with() are functions defined in the Functions and Operators document.

Result expected
Implementation in TOM (no implementation)

NOTE: Very difficult. I think this is not available in TOM: it would need a way to find a node whose name ends with "or".
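
To make the requirement concrete, here is a small Python sketch of the test the query performs on each child of a book: the element name is treated as a string and checked for the suffix "or", and the text content is checked for "Suciu". It is not TOM code; the sample data and the name qualifying_children are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample book, invented for this sketch.
BOOK = ET.fromstring(
    "<book><title>Data on the Web</title>"
    "<author><last>Suciu</last><first>Dan</first></author>"
    "<editor><last>Gerbarg</last></editor>"
    "<publisher>Some Publisher</publisher></book>"
)

def qualifying_children(book):
    # Keep children whose name ends with "or" and whose text contains "Suciu".
    return [child for child in book
            if child.tag.endswith("or")
            and "Suciu" in "".join(child.itertext())]

matches = qualifying_children(BOOK)
```

The sketch suggests that what is needed is access to the node name as an ordinary string value rather than as a pattern constant.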

1.2.9 Q9

In the document "books.xml", find all section or chapter titles that contain the word "XML", regardless of the level of nesting.

Solution in XQuery:

<results>
{
for $t in doc("books.xml")//(chapter | section)/title
where contains($t/text(), "XML")
return $t
}
</results>
Result expected
Implementation in TOM

NOTE: Look at the "//" in the first line. It is a very powerful XQuery operator (and something TOM lacks): it selects all descendant nodes of a node, at any depth.
In TOM, we must use a recursive function.

The XQuery notation "(chapter | section)" can be handled with two conditional branches of %match.

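The recursive replacement for "//" mentioned in the note can be sketched as below. This is Python rather than TOM, the sample data is invented, and titles_containing is a hypothetical name. The test on the tag plays the role of the two %match branches for chapter and section.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this sketch (not the real books.xml).
BOOKS = ET.fromstring(
    "<chapter><title>Data Model</title>"
    "<section><title>XML Syntax</title>"
    "<section><title>Basic XML</title></section></section>"
    "<section><title>Types</title></section></chapter>"
)

def titles_containing(node, word, found=None):
    """Visit node and all its descendants, collecting matching titles."""
    if found is None:
        found = []
    # Two cases, as with two %match branches: chapter or section.
    if node.tag in ("chapter", "section"):
        for title in node.findall("title"):
            if word in (title.text or ""):
                found.append(title.text)
    # Recurse into every child to emulate the "//" operator.
    for child in node:
        titles_containing(child, word, found)
    return found

hits = titles_containing(BOOKS, "XML")
```
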
1.2.10 Q10

In the document "prices.xml", find the minimum price for each book, in the form of a "minprice" element with the book title as its title attribute.

Solution in XQuery:

<results>
{
let $doc := doc("prices.xml")
for $t in distinct-values($doc//book/title)
let $p := $doc//book[title = $t]/price
return
<minprice title="{ $t }">
<price>{ min($p) }</price>
</minprice>
}
</results>
Result expected

Implementation in TOM

NOTE: NO PROBLEM

1.2.11 Q11

For each book with an author, return the book with its title and authors. For each book with an editor, return a reference with the book title and the editor's affiliation.

Solution in XQuery:

<bib>
{
for $b in doc("http://bstore1.example.com/bib.xml")//book[author]
return
<book>
{ $b/title }
{ $b/author }
</book>
}
{
for $b in doc("http://bstore1.example.com/bib.xml")//book[editor]
return
<reference>
{ $b/title }
{$b/editor/affiliation}
</reference>
}
</bib>
Result expected

Implementation in TOM

NOTE: NO PROBLEM

1.2.12 Q12

Find pairs of books that have different titles but the same set of authors (possibly in a different order).

Solution in XQuery:

<bib>
{
for $book1 in doc("http://bstore1.example.com/bib.xml")//book,
$book2 in doc("http://bstore1.example.com/bib.xml")//book
let $aut1 := for $a in $book1/author
order by $a/last, $a/first
return $a
let $aut2 := for $a in $book2/author
order by $a/last, $a/first
return $a
where $book1 << $book2
and not($book1/title = $book2/title)
and deep-equal($aut1, $aut2)
return
<book-pair>
{ $book1/title }
{ $book2/title }
</book-pair>
}
</bib> 
The above solution uses a function, deep-equal(), which compares sequences. Two sequences are equal if all items in corresponding positions in the two sequences are equal - if the sequences are node sequences, the values of the nodes are used for comparison.

Result expected
Implementation in TOM (no implementation): possible, but long.

NOTE: Very difficult. I don't know how to compare two TNode values (deep-equal in XQuery).
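
One possible shape for such a comparison is a recursive structural check: same name, same attributes, same text, and pairwise equal children. The Python sketch below is a simplification of deep-equal, not a full implementation: it strips surrounding whitespace and ignores text that follows child elements. The names deep_equal and same_authors and the sample data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def deep_equal(a, b):
    """Simplified structural equality of two elements."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):             # different number of children
        return False
    return all(deep_equal(x, y) for x, y in zip(a, b))

def same_authors(book1, book2):
    # Sort both author lists by (last, first), then compare position by position.
    key = lambda a: (a.findtext("last", ""), a.findtext("first", ""))
    aut1 = sorted(book1.findall("author"), key=key)
    aut2 = sorted(book2.findall("author"), key=key)
    return len(aut1) == len(aut2) and all(
        deep_equal(x, y) for x, y in zip(aut1, aut2))

# Hypothetical sample data, invented for this sketch.
A = "<author><last>A</last><first>x</first></author>"
B = "<author><last>B</last><first>y</first></author>"
BOOK1 = ET.fromstring("<book><title>One</title>" + A + B + "</book>")
BOOK2 = ET.fromstring("<book><title>Two</title>" + B + A + "</book>")
BOOK3 = ET.fromstring("<book><title>Three</title>" + A + "</book>")
```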




1.2 Use Case "TREE": Queries that preserve hierarchy

Some XML document-types have a very flexible structure in which text is mixed with elements and many elements are optional. These document-types show a wide variation in structure from one document to another. In documents of these types, the ways in which elements are ordered and nested are usually quite important.

1.2.1 Description

An XML query language should have the ability to extract elements from documents while preserving their original hierarchy. This Use Case illustrates this requirement by means of a flexible document type named Book.

1.2.2 Document Type Definition (DTD)

This use case is based on an input document named "book.xml". The DTD for this schema is found in a file called "book.dtd":

<!ELEMENT book (title, author+, section+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT section (title, (p | figure | section)* )>
<!ATTLIST section
id ID #IMPLIED
difficulty CDATA #IMPLIED>
<!ELEMENT p (#PCDATA)>
<!ELEMENT figure (title, image)>
<!ATTLIST figure
width CDATA #REQUIRED
height CDATA #REQUIRED >
<!ELEMENT image EMPTY>
<!ATTLIST image
source CDATA #REQUIRED >

1.2.3 Sample Data

The queries in this use case are based on the following sample data.

<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
<title>Data on the Web</title>
<author>Serge Abiteboul</author>
<author>Peter Buneman</author>
<author>Dan Suciu</author>
<section id="intro" difficulty="easy" >
<title>Introduction</title>
<p>Text ... </p>
<section>
<title>Audience</title>
<p>Text ... </p>
</section>
<section>
<title>Web Data and the Two Cultures</title>
<p>Text ... </p>
<figure height="400" width="400">
<title>Traditional client/server architecture</title>
<image source="csarch.gif"/>
</figure>
<p>Text ... </p>
</section>
</section>
<section id="syntax" difficulty="medium" >
<title>A Syntax For Data</title>
<p>Text ... </p>
<figure height="200" width="500">
<title>Graph representations of structures</title>
<image source="graphs.gif"/>
</figure>
<p>Text ... </p>
<section>
<title>Base Types</title>
<p>Text ... </p>
</section>
<section>
<title>Representing Relational Databases</title>
<p>Text ... </p>
<figure height="250" width="400">
<title>Examples of Relations</title>
<image source="relations.gif"/>
</figure>
</section>
<section>
<title>Representing Object Databases</title>
<p>Text ... </p>
</section>
</section>
</book>

1.2.4 Queries and Results

1.2.4.1 Q1

Prepare a (nested) table of contents for Book1, listing all the sections and their titles. Preserve the original attributes of each <section> element, if any.

Solution in XQuery:

declare function local:toc($book-or-section as element()) as element()*
{
for $section in $book-or-section/section
return
<section>
{ $section/@* , $section/title , local:toc($section) }
</section>
};

<toc>
{
for $s in doc("book.xml")/book return local:toc($s)
}
</toc>

Expected Result:

NOTE: How can we represent all the attributes of an element in TOM? For an element with a fixed attribute list it is easy, but when the attribute list is not fixed it is difficult.
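
For comparison, the sketch below shows what the query needs: the attributes are copied as an open-ended name/value collection instead of being listed one by one. It is written in Python, not TOM; the function name toc mirrors local:toc, and the sample is a trimmed excerpt of the book.xml data shown above.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the sample book.xml data.
BOOK = ET.fromstring(
    '<book><title>Data on the Web</title>'
    '<section id="intro" difficulty="easy"><title>Introduction</title>'
    '<section><title>Audience</title></section></section></book>'
)

def toc(node):
    """Return nested <section> entries for the sections directly under node."""
    entries = []
    for section in node.findall("section"):
        # Copy every attribute, whatever its name; none, one, or many.
        entry = ET.Element("section", dict(section.attrib))
        title = section.find("title")
        if title is not None:
            ET.SubElement(entry, "title").text = title.text
        entry.extend(toc(section))   # recurse to preserve the hierarchy
        entries.append(entry)
    return entries

contents = toc(BOOK)
```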

Other question: how can we escape from a %match clause without using the return keyword?

1.2.4.2 Q2

Prepare a (flat) figure list for Book1, listing all the figures and their titles. Preserve the original attributes of each <figure> element, if any.

Solution in XQuery:

<figlist>
{
for $f in doc("book.xml")//figure
return
<figure>
{ $f/@* }
{ $f/title }
</figure>
}
</figlist>

Expected Result:

NOTE: As in Q1, how can we represent all the attributes of an element in TOM? For an element with a fixed attribute list it is easy, but when the attribute list is not fixed it is difficult.

1.2.4.3 Q3

How many sections are in Book1, and how many figures?

Solution in XQuery:

<section_count>{ count(doc("book.xml")//section) }</section_count>, 
<figure_count>{ count(doc("book.xml")//figure) }</figure_count>

Expected Result:

1.2.4.4 Q4

How many top-level sections are in Book1?

Solution in XQuery:

<top_section_count>
{
count(doc("book.xml")/book/section)
}
</top_section_count>

Expected Result:

NOTE: NO PROBLEM

1.2.4.5 Q5

Make a flat list of the section elements in Book1. In place of its original attributes, each section element should have two attributes, containing the title of the section and the number of figures immediately contained in the section.

Solution in XQuery:

<section_list>
{
for $s in doc("book.xml")//section
let $f := $s/figure
return
<section title="{ $s/title/text() }" figcount="{ count($f) }"/>
}
</section_list>

Expected Result:

1.2.4.6 Q6

Make a nested list of the section elements in Book1, preserving their original attributes and hierarchy. Inside each section element, include the title of the section and an element that includes the number of figures immediately contained in the section.

Solution in XQuery:

declare function local:section-summary($book-or-section as element())
as element()*
{
for $section in $book-or-section
return
<section>
{ $section/@* }
{ $section/title }
<figcount>
{ count($section/figure) }
</figcount>
{ local:section-summary($section/section) }
</section>
};

<toc>
{
for $s in doc("book.xml")/book/section
return local:section-summary($s)
}
</toc>
Editorial note  
This solution was provided by Michael Wenger, a student at the University of Würzburg.

Expected Result:

NOTE: As in Q1, how can we represent all the attributes of an element in TOM? For an element with a fixed attribute list it is easy, but when the attribute list is not fixed it is difficult.










1.3 Use Case "SEQ" - Queries based on Sequence

This use case illustrates queries based on the sequence in which elements appear in a document.

1.3.1 Description

Although sequence is not significant in most traditional database systems or object systems, it can be quite significant in structured documents. This use case presents a series of queries based on a medical report.

1.3.2 Document Type Definition (DTD)

This use case is based on a medical report using the HL7 Patient Record Architecture. We simplify the DTD in this example, using only what is needed to understand the queries.

<!DOCTYPE report [
<!ELEMENT report (section*)>
<!ELEMENT section (section.title, section.content)>
<!ELEMENT section.title (#PCDATA )>
<!ELEMENT section.content (#PCDATA | anesthesia | prep
| incision | action | observation )*>
<!ELEMENT anesthesia (#PCDATA)>
<!ELEMENT prep (#PCDATA | action)*> <!-- corrected: mixed content must be written (#PCDATA | action)*, without the outer parentheses -->
<!ELEMENT incision (#PCDATA | geography | instrument)*> <!-- corrected in the same way -->
<!ELEMENT action (#PCDATA | instrument)*> <!-- corrected in the same way -->
<!ELEMENT observation (#PCDATA)>
<!ELEMENT geography (#PCDATA)>
<!ELEMENT instrument (#PCDATA)>
]>

1.3.3 Sample Data

The queries in this use case are based on the following sample data.

<report>
<section>
<section.title>Procedure</section.title>
<section.content>
The patient was taken to the operating room where she was placed
in supine position and
<anesthesia>induced under general anesthesia.</anesthesia>
<prep>
<action>A Foley catheter was placed to decompress the bladder</action>
and the abdomen was then prepped and draped in sterile fashion.
</prep>
<incision>
A curvilinear incision was made
<geography>in the midline immediately infraumbilical</geography>
and the subcutaneous tissue was divided
<instrument>using electrocautery.</instrument>
</incision>
The fascia was identified and
<action>#2 0 Maxon stay sutures were placed on each side of the midline.
</action>
<incision>
The fascia was divided using
<instrument>electrocautery</instrument>
and the peritoneum was entered.
</incision>
<observation>The small bowel was identified.</observation>
and
<action>
the
<instrument>Hasson trocar</instrument>
was placed under direct visualization.
</action>
<action>
The
<instrument>trocar</instrument>
was secured to the fascia using the stay sutures.
</action>
</section.content>
</section>
</report>

1.3.4 Queries and Results

1.3.4.1 Q1

In the Procedure section of Report1, what Instruments were used in the second Incision?

Solution in XQuery:

for $s in doc("report1.xml")//section[section.title = "Procedure"]
return ($s//incision)[2]/instrument

Expected Result:

NOTE: XQuery uses the notation ($s//incision)[2]/instrument to select the instrument nodes inside the second incision node, in document order. Positional selection of this kind must be implemented manually in TOM.
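
A manual version of the positional step could work as follows: collect the incisions in document order, pick one by index, then read its instrument children. This is a Python sketch, not TOM; the function name instruments_of_nth_incision is hypothetical and the data is a trimmed excerpt of the sample report above.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the sample report.
SECTION = ET.fromstring(
    "<section><section.title>Procedure</section.title><section.content>"
    "<incision>A curvilinear incision was made"
    "<instrument>using electrocautery.</instrument></incision>"
    "<incision>The fascia was divided using"
    "<instrument>electrocautery</instrument>"
    "and the peritoneum was entered.</incision>"
    "<action>the<instrument>Hasson trocar</instrument></action>"
    "</section.content></section>"
)

def instruments_of_nth_incision(section, n):
    # iter() walks the subtree in document order, like $s//incision.
    incisions = list(section.iter("incision"))
    if len(incisions) < n:
        return []                    # there is no n-th incision
    # XQuery positions start at 1, Python indexes at 0.
    return [i.text for i in incisions[n - 1].findall("instrument")]

second = instruments_of_nth_incision(SECTION, 2)
```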


1.3.4.2 Q2

In the Procedure section of Report1, what are the first two Instruments to be used?

Solution in XQuery:

for $s in doc("report1.xml")//section[section.title = "Procedure"]
return ($s//instrument)[position()<=2]

Expected Result:

NOTE: Open question: how can TOM be prevented from searching for further matches once some condition has been satisfied?
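
The behaviour being asked for is an early exit: stop the traversal as soon as enough matches have been produced. The Python sketch below shows that behaviour with a lazy iterator cut off after a fixed number of items. It does not answer the question for TOM itself; first_instruments is a hypothetical name and the data is a trimmed excerpt of the sample report above.

```python
import xml.etree.ElementTree as ET
from itertools import islice

# Trimmed excerpt of the sample report.
SECTION = ET.fromstring(
    "<section><section.title>Procedure</section.title><section.content>"
    "<incision>A curvilinear incision was made"
    "<instrument>using electrocautery.</instrument></incision>"
    "<incision>The fascia was divided using"
    "<instrument>electrocautery</instrument></incision>"
    "<action>the<instrument>Hasson trocar</instrument></action>"
    "</section.content></section>"
)

def first_instruments(section, limit):
    # iter() is lazy; islice stops asking for nodes after `limit` matches,
    # so the rest of the document is never visited.
    return [i.text for i in islice(section.iter("instrument"), limit)]

first_two = first_instruments(SECTION, 2)
```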


1.3.4.3 Q3

In Report1, what Instruments were used in the first two Actions after the second Incision?

Solution in XQuery:

let $i2 := (doc("report1.xml")//incision)[2]
for $a in (doc("report1.xml")//action)[. >> $i2][position()<=2]
return $a//instrument

Expected Result:

1.3.4.4 Q4

In Report1, find "Procedure" sections where no Anesthesia element occurs before the first Incision.

Solution in XQuery:

for $p in doc("report1.xml")//section[section.title = "Procedure"]
where not(some $a in $p//anesthesia satisfies
$a << ($p//incision)[1] )
return $p

Expected Result:

(No sections satisfy Q4, thankfully.)

1.3.4.5 Q5

In Report1, what happened between the first Incision and the second Incision?

Solution in XQuery:

declare function local:precedes($a as node(), $b as node()) as xs:boolean 
{
$a << $b
and
empty($a//node() intersect $b)
};


declare function local:follows($a as node(), $b as node()) as xs:boolean
{
$a >> $b
and
empty($b//node() intersect $a)
};

<critical_sequence>
{
let $proc := doc("report1.xml")//section[section.title="Procedure"][1]
for $n in $proc//node()
where local:follows($n, ($proc//incision)[1])
and local:precedes($n, ($proc//incision)[2])
return $n
}
</critical_sequence>

Here is another solution that is perhaps more efficient and less readable:

<critical_sequence>
{
let $proc := doc("report1.xml")//section[section.title="Procedure"][1],
$i1 := ($proc//incision)[1],
$i2 := ($proc//incision)[2]
for $n in $proc//node() except $i1//node()
where $n >> $i1 and $n << $i2
return $n
}
</critical_sequence>

Expected Result:

In the above output, the contents of the critical sequence element include a text node, an action element, and the text node containing the content of the action element. But the serialization we are using already shows all descendants of a given node. If $c is bound to a sequence of nodes, the following expression eliminates members of the sequence that are descendants of another node already found in the sequence:
$c except $c//node()

In the following solution, the between() function takes a sequence of nodes, a starting node, and an ending node, and returns the nodes between them:

declare function local:between($seq as node()*, $start as node(), $end as node())
as item()*
{
let $nodes :=
for $n in $seq except $start//node()
where $n >> $start and $n << $end
return $n
return $nodes except $nodes//node()
};

<critical_sequence>
{
let $proc := doc("report1.xml")//section[section.title="Procedure"][1],
$first := ($proc//incision)[1],
$second:= ($proc//incision)[2]
return local:between($proc//node(), $first, $second)
}
</critical_sequence>


Here is the output from the above query:





1.4 Use Case "R" - Access to Relational Data

One important use of an XML query language will be to access data stored in relational databases. This use case describes one possible way in which this access might be accomplished.

1.4.1 Description

A relational database system might present a view in which each table (relation) takes the form of an XML document. One way to represent a database table as an XML document is to allow the document element to represent the table itself, and each row (tuple) inside the table to be represented by a nested element. Inside the tuple-elements, each column is in turn represented by a nested element. Columns that allow null values are represented by optional elements, and a missing element denotes a null value.

As an example, consider a relational database used by an online auction. The auction maintains a USERS table containing information on registered users, each identified by a unique userid, who can either offer items for sale or bid on items. An ITEMS table lists items currently or recently for sale, with the userid of the user who offered each item. A BIDS table contains all bids on record, keyed by the userid of the bidder and the item number of the item to which the bid applies.

The three tables used by the online auction are below, with their column-names indicated in parentheses.

USERS ( USERID, NAME, RATING )

ITEMS ( ITEMNO, DESCRIPTION, OFFERED_BY, START_DATE, END_DATE, RESERVE_PRICE )

BIDS ( USERID, ITEMNO, BID, BID_DATE )

1.4.2 Document Type Definition (DTD)

This use case is based on three separate input documents named users.xml, items.xml, and bids.xml. Each of the documents represents one of the tables in the relational database described above, using the following DTDs:

<!DOCTYPE users [
<!ELEMENT users (user_tuple*)>
<!ELEMENT user_tuple (userid, name, rating?)>
<!ELEMENT userid (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT rating (#PCDATA)>
]>

<!DOCTYPE items [
<!ELEMENT items (item_tuple*)>
<!ELEMENT item_tuple (itemno, description, offered_by,
start_date?, end_date?, reserve_price? )>
<!ELEMENT itemno (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT offered_by (#PCDATA)>
<!ELEMENT start_date (#PCDATA)>
<!ELEMENT end_date (#PCDATA)>
<!ELEMENT reserve_price (#PCDATA)>
]>

<!DOCTYPE bids [
<!ELEMENT bids (bid_tuple*)>
<!ELEMENT bid_tuple (userid, itemno, bid, bid_date)>
<!ELEMENT userid (#PCDATA)>
<!ELEMENT itemno (#PCDATA)>
<!ELEMENT bid (#PCDATA)>
<!ELEMENT bid_date (#PCDATA)>
]>

1.4.3 Sample Data

Here is an abbreviated set of data showing the XML format of the instances:

<items>
<item_tuple>
<itemno>1001</itemno>
<description>Red Bicycle</description>
<offered_by>U01</offered_by>
<start_date>1999-01-05</start_date>
<end_date>1999-01-20</end_date>
<reserve_price>40</reserve_price>
</item_tuple>
<!-- !!! Snip !!! -->

<users>
<user_tuple>
<userid>U01</userid>
<name>Tom Jones</name>
<rating>B</rating>
</user_tuple>
<!-- !!! Snip !!! -->

<bids>
<bid_tuple>
<userid>U02</userid>
<itemno>1001</itemno>
<bid>35</bid>
<bid_date>1999-01-07</bid_date>
</bid_tuple>
<bid_tuple>
<!-- !!! Snip !!! -->

The entire data set is represented by the following table:

USERS
USERID  NAME            RATING
U01     Tom Jones       B
U02     Mary Doe        A
U03     Dee Linquent    D
U04     Roger Smith     C
U05     Jack Sprat      B
U06     Rip Van Winkle  B

ITEMS
ITEMNO  DESCRIPTION     OFFERED_BY  START_DATE  END_DATE    RESERVE_PRICE
1001    Red Bicycle     U01         1999-01-05  1999-01-20  40
1002    Motorcycle      U02         1999-02-11  1999-03-15  500
1003    Old Bicycle     U02         1999-01-10  1999-02-20  25
1004    Tricycle        U01         1999-02-25  1999-03-08  15
1005    Tennis Racket   U03         1999-03-19  1999-04-30  20
1006    Helicopter      U03         1999-05-05  1999-05-25  50000
1007    Racing Bicycle  U04         1999-01-20  1999-02-20  200
1008    Broken Bicycle  U01         1999-02-05  1999-03-06  25

BIDS
USERID  ITEMNO  BID   BID_DATE
U02     1001    35    1999-01-07
U04     1001    40    1999-01-08
U02     1001    45    1999-01-11
U04     1001    50    1999-01-13
U02     1001    55    1999-01-15
U01     1002    400   1999-02-14
U02     1002    600   1999-02-16
U03     1002    800   1999-02-17
U04     1002    1000  1999-02-25
U02     1002    1200  1999-03-02
U04     1003    15    1999-01-22
U05     1003    20    1999-02-03
U01     1004    40    1999-03-05
U03     1007    175   1999-01-25
U05     1007    200   1999-02-08
U04     1007    225   1999-02-12

1.4.4 Queries and Results

1.4.4.1 Q1

List the item number and description of all bicycles that currently have an auction in progress, ordered by item number.

Solution in XQuery:

<result>
{
for $i in doc("items.xml")//item_tuple
where $i/start_date <= current-date()
and $i/end_date >= current-date()
and contains($i/description, "Bicycle")
order by $i/itemno
return
<item_tuple>
{ $i/itemno }
{ $i/description }
</item_tuple>
}
</result>

Note:

This solution assumes that the current date is 1999-01-31.

Expected Result:

NOTE: Date parsing problem. A %match like the following cannot be used:

%match (String a)  {
    (year, 'anystring' , month, 'anystring', day)  -> {
          ////  ..... 
    }
}


where anystring would stand for a separator such as ' ' (space), '-' or '/'. I think that in most cases '-' and '/' are used as separators when writing dates and times.
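
One way around the problem is to split the date outside the pattern matcher, for example with a regular expression that accepts any of the separators listed above. The sketch below is in Python and is only an illustration: parse_date and in_progress are hypothetical names, the pattern assumes year-month-day order, and no calendar validation is done.

```python
import re

# Year, month, day separated by a space, "/" or "-" (assumed input format).
DATE = re.compile(r"(\d{4})[ /-](\d{1,2})[ /-](\d{1,2})")

def parse_date(text):
    """Return (year, month, day) as integers, or None if the text does not match."""
    m = DATE.fullmatch(text.strip())
    if m is None:
        return None
    year, month, day = (int(g) for g in m.groups())
    return (year, month, day)

def in_progress(start, end, today):
    # Tuples compare field by field, so this is a chronological comparison.
    return parse_date(start) <= parse_date(today) <= parse_date(end)
```

With the assumed current date of 1999-01-31, in_progress("1999-01-10", "1999-02-20", "1999-01-31") holds for item 1003 (Old Bicycle), while item 1001 (Red Bicycle) has already ended.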