<oscars>
element.
It's currently 3.68.
I'll be updating the version number as I fix mistakes and typos. Check back from time to time if you want to stay current.
|
Return everything in the database. This particular query uses a call to XQuery's doc() function. As shown above, the Mark Logic Content Server also lets us use /oscars and even // to root a query. I use all three syntactic forms somewhat freely.
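As a rough sketch, the three ways of rooting a query look like this. The document URI "oscars.xml" is an assumption (the file name is mentioned elsewhere on this page, but the URI the server actually uses may differ), and I'm assuming person elements sit somewhere beneath oscars. The second and third forms rely on Mark Logic letting a path start at the database root.

```xquery
(: Three roughly equivalent ways to reach the same elements.
   "oscars.xml" is an assumed document URI. :)

(: 1. Standard XQuery: start from a named document. :)
doc("oscars.xml")/oscars//person,

(: 2. Root the path at the top-level element. :)
/oscars//person,

(: 3. Search from the root for any matching descendant. :)
//person
```

Run as written, this returns the same people three times over; in practice you'd pick one form per query.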
|
(This one shows a few mistakes in my structure that make me laugh. If you look hard, you can spot them too. I'll leave them in for the moment as a salutary example of why it's important, if your database is in flux, to periodically run queries like this to check for surprises. You'll surmise of course that I don't have a schema to validate against, or if I did, haven't bothered to do so. This is the next best thing.) |
|
(The latest is 3.67.) |
|
You should see <person>José Rivera</person>, with an acute accent over the "e" in "Jose".
|
The previous query states that the winners have never been nominated before. This one states they've been nominated but never won. |
|
Which nominated songs had the same title as the movie they were nominated for? |
|
How many were there? |
|
Which winning songs had the same title as the movie they were nominated for? |
|
Which winning songs had the same title as the movie they were nominated for, where the movie also won for Best Picture? Results show both the winning song and the winning picture for any matches. |
|
How many awards has John Williams won vs. how many has he lost? |
|
Here's a first cut at answering this question. It doesn't give quite the correct answer, though. The Academy database says that "All about Eve" and "Titanic" tied for first place with 14 nominations each, while this XQuery says (Don't believe me? Try it for yourself!) that "A Star is Born" received 17 nominations, "Titanic" received 16, and "All about Eve" is in fifth place with 14 nominations. What's going on? |
|
The prior query over-counts because of the remakes problem, which the earlier query on "King Kong" already gave us a look at. "A Star is Born", for example, has been remade on three separate occasions, and the preceding XQuery doesn't discriminate between the versions, so the nomination counts shown are high. One way to correct for this is to pose the same query again, then look at each of the movies that survive this first step, further partitioning each remake by year and again eliminating any with fewer than the desired number of nominations. (This query may take up to a minute or so to execute, by the way. It should be executing somewhat faster than that, at least as fast as my desktop machine, but its configuration is a bit awry at the moment and needs some TLC.) |
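Here's a minimal, self-contained sketch of that two-step idea. The nomination elements and their film and year attributes are made up for illustration and are not the real database's structure; the threshold of 3 is likewise arbitrary.

```xquery
(: Hypothetical miniature data set; the real database's element
   names and structure may differ. :)
let $sample :=
  <oscars>
    <nomination year="1937" film="A Star is Born"/>
    <nomination year="1937" film="A Star is Born"/>
    <nomination year="1954" film="A Star is Born"/>
    <nomination year="1997" film="Titanic"/>
    <nomination year="1997" film="Titanic"/>
    <nomination year="1997" film="Titanic"/>
  </oscars>
let $threshold := 3
(: Step 1: group by title only, so remakes are lumped together. :)
for $title in distinct-values($sample/nomination/@film)
let $all := $sample/nomination[@film = $title]
where count($all) ge $threshold
return
  (: Step 2: split each surviving title by year and filter again. :)
  for $year in distinct-values($all/@year)
  let $n := count($all[@year = $year])
  where $n ge $threshold
  order by $n descending
  return <film title="{$title}" year="{$year}" nominations="{$n}"/>
```

With this toy data, "A Star is Born" survives step 1 with three nominations in total, but neither of its two versions reaches the threshold on its own, so only the 1997 "Titanic" entry is returned.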
|
This is almost identical to the prior query, except that the number is different and, while that one looked at nominated films, this one looks at winners. Both queries are great candidates for parameterization (see below). |
|
This query highlights a current deficiency in my database; it produces a slightly different list than the one produced by AMPAS. Mine shows Bette Davis receiving 11 nominations for example, while AMPAS says 10. The Academy's number is correct (which is not surprising, since they're nothing if not sticklers for accuracy). Mine is high because I'm including a nomination for Best Actress in 1934 ("Of Human Bondage"). The Academy notes that "THIS IS NOT AN OFFICIAL NOMINATION", while my DB doesn't take this into account (and should). There are a number of such notes in the AMPAS database (all nominations from 1929 for example are officially marked as "UNOFFICIAL"). A few solutions present themselves:
|
|
Let's take a bit of a technical digression for those interested in XQuery. We can parameterize the preceding query by implementing it as an XQuery function. In this case, we'll build a function called nominations() that takes a single integer argument to indicate how many nominations we're interested in looking at.
NOTE: Mark Logic's function declaration syntax differs slightly from that of the current XQuery specification. See Function syntax. |
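To give a flavor of what such a function can look like, here's a sketch written in the standard XQuery declaration syntax (declare function with the local: prefix) rather than Mark Logic's variant. The sample data, the person attribute, and the "at least $n" interpretation of the argument are all assumptions made for illustration.

```xquery
(: Hypothetical sample data; names and structure are placeholders. :)
declare variable $sample :=
  <oscars>
    <nomination person="Performer A"/>
    <nomination person="Performer A"/>
    <nomination person="Performer A"/>
    <nomination person="Performer B"/>
    <nomination person="Performer B"/>
    <nomination person="Performer C"/>
  </oscars>;

(: Return everyone with at least $n nominations, most-nominated first. :)
declare function local:nominations($n as xs:integer) as element(person)*
{
  for $name in distinct-values($sample/nomination/@person)
  let $count := count($sample/nomination[@person = $name])
  where $count ge $n
  order by $count descending
  return <person nominations="{$count}">{$name}</person>
};

local:nominations(2)
```

Changing the argument is now all it takes to ask the same question with a different number: local:nominations(3) keeps only Performer A, while local:nominations(1) returns all three.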
|
|
|
Title says it all, I think. |
|
This one was fun. Note there were two awards for Directing in 1928 (formally "1927/1928" in Academy parlance): one for Comedy Picture and one for Dramatic Picture. |
|
Poor directors! |
count attribute to the nominations element and removed the year attribute from the individual nominations, since it's been hoisted upwards into nominations. Finally, I've amalgamated the two won and lost elements into a single won attribute on the award name, which takes a boolean value of "true" or "false". |
musicSong elements into song elements and musicScore elements into score. The original "oscars.xml" document went through a number of such transformations while I was working out the final names of the award categories I wanted.
In this example, the default clause on typeswitch returns nothing (an empty sequence) for any award other than the above two. This is handy if you just want to view the results of a small, localized transformation such as this one. If you're doing a full and final award-name transformation, then it's useful to have default return something like <ILLEGAL-AWARD-NAME/> or such to pick up errors during development.
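A renaming transformation of this kind can be sketched roughly as follows. The function name local:rename-award and the sample data are hypothetical, and the function is written in standard XQuery syntax rather than Mark Logic's variant.

```xquery
(: Rename two award elements; everything else is dropped.
   local:rename-award is a hypothetical name for this sketch. :)
declare function local:rename-award($award as element()) as element()?
{
  typeswitch ($award)
    case element(musicSong)
      return element song { $award/@*, $award/node() }
    case element(musicScore)
      return element score { $award/@*, $award/node() }
    (: Swap () for <ILLEGAL-AWARD-NAME/> to flag unexpected names. :)
    default return ()
};

(: Hypothetical sample data. :)
let $sample :=
  <nominations year="1939">
    <musicSong>Song title</musicSong>
    <musicScore>Film title</musicScore>
    <cinematography>Another film title</cinematography>
  </nominations>
for $award in $sample/*
return local:rename-award($award)
```

Here the cinematography element falls through to default and disappears from the output, which is the "small, localized view" behavior described above.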
|
xdmp:query-meters()//*:elapsed-time/text(). This is a Mark Logic built-in function that returns the elapsed time required to return a query result. If you like exploring query performance, you can append this to the end of any query you're interested in timing. The clock starts ticking when the query begins.
xdmp:query-meters() and xdmp:estimate() (a replacement for fn:count() that can improve on that function's performance in certain situations) are the only two Mark Logic functions I'm exposing through the query interface, since a number of the others can let naughty people do nasty things to the database, and that's something I'd prefer not to have happen.
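For instance, appending the timing expression to a query might look like the sketch below. Both xdmp: functions are Mark Logic extensions, so this won't run on other XQuery processors, and the //person path is just an assumed example query.

```xquery
(: Mark Logic only: xdmp:* functions are not part of standard XQuery. :)
(
  (: The query being timed (an assumed example). :)
  count(//person),

  (: The same count via xdmp:estimate(), for comparison. :)
  xdmp:estimate(//person),

  (: Appended last: elapsed time since the query began. :)
  xdmp:query-meters()//*:elapsed-time/text()
)
```

Because the timer covers the whole query, time each variant in its own query if you want to compare count() against xdmp:estimate() fairly.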
define function is now called declare function
local: (or a prefix that's been equated to that namespace in your query). Mark Logic doesn't require this.
lost element have been removed (since in this hypothetical example they drove the node count over the allowed limit), and the containing cinematography element has been decorated with a "limit exceeded" warning:
<cinematography xmlns:oscar-results="http://www.fatdog.com/oscar-results"
oscar-results:warning="LIMIT [125000 item(s)] exceeded, 7 node(s) removed"
year="1939"
subcat="black and white">
<lost/>
</cinematography>