We've been using SPARQL, OWL, and N3 lately to prototype the reporting
of common research variables to the Society of Thoracic Surgeons'
(STS) National Database. The reports are being run against our large
RDF dataset of abstracted electronic patient records from the
Cleveland Clinic's Electronic Health Record system. Our dataset
consists of about 200,000 patients each represented as statements in
named RDF graphs. The STS variables we are responsible for deriving
are represented using a combination of OWL-DL and Notation 3. The
constraints that do not benefit from the restricted, tree-like nature
of description logic are captured using secondary plain Horn clauses
(or rules) represented in Notation 3.
>
We use FuXi, an open source logic reasoning system for the semantic
web, which converts the constraints and a SPARQL query for an RDF
dataset governed by these OWL-DL constraints into provably optimal
sets of rules used to calculate an entailed RDF graph (the specifics
of this method are the subject of a paper I'm working on for the
RuleML 2009
conference). Such an entailed RDF graph can then be targeted with
SPARQL queries to answer for the STS variables. A recent challenge
has been to try to capture the semantics of negation in order to
implement 'exclusion criteria'. These typically take the form of a
class of procedures that do not involve combinations of one or more
kinds of other procedures. A recent update to FuXi includes the
ability to convert OWL-DL expressions that use owl:complementOf into
general, stratified logic programs that can be evaluated using SPARQL
in order to implement the semantics of stable model negation (which is
quite different from the way owl:complementOf is meant to be
interpreted: according to the negation of first-order logic).
>
In particular, statements of classical (first-order) negation are
making assertions about the lack of existence of models that satisfy
the positive form of such a statement in a theory. I prefer this
explanation to the way the term 'open world' assumption is often
used to describe this interpretation of negated terms in a
description logic language. Database theory, of course, does not
interpret negated terms in this way, but instead (intuitively)
understands statements of negated terms to be 'true' if the (ground)
positive form is not in the set of known facts (the database). Our use
of negation, and the nature of knowledge recorded in a computer-based
patient record system, seem (so far) to
lend themselves more to the database interpretation, where there is an
understanding that a curated medical information system would have its
data entered under the governance of policies that would allow
medically useful inferences to be made from the absence of certain
facts about patient care.
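
The database reading of negation described above can be sketched in a few lines of Python. This is only a minimal illustration, not FuXi's implementation; the facts and the helper name `naf` are hypothetical.

```python
# Hypothetical ground facts recorded for one patient,
# written as (subject, predicate, object) tuples.
facts = {
    ("op1", "rdf:type", "Operation"),
    ("op1", "involves", "proc1"),
    ("proc1", "rdf:type", "CardiacProcedure"),
}


def naf(fact, known=facts):
    """Negation as failure: a negated ground fact holds when the
    positive form is absent from the set of known facts."""
    return fact not in known


# proc1 is recorded as a cardiac procedure, so its negation fails.
is_not_cardiac = naf(("proc1", "rdf:type", "CardiacProcedure"))

# Nothing records proc1 as a thoracic control bleeding procedure, so
# under the database (closed-world) reading the negated statement holds.
is_not_bleeding = naf(("proc1", "rdf:type", "ThoracicControlBleeding"))
```

Under the first-order reading, by contrast, the mere absence of the second fact would not license either conclusion.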
>
In particular, if a fact is known not to be true about a patient or
some activity involving a patient, it is not recorded. This common
understanding can be used to make inferences about whether facts in a
patient record satisfy an exclusion criterion. Below is an example of
this:
>
Consider the following OWL description of a class of operations:

  SubClassOf(
    IntersectionOf(
      Operation
      PostOpInHospitalEvent
      ObjectAllValuesFrom(
        involves
        ComplementOf( UnionOf( CardiacProcedure ThoracicControlBleeding ) ) )
    )
    sts:ReopForOtherNonCardiac )

The syntax above is the OWL 2 functional-style syntax. We can
paraphrase the general class inclusion (GCI) axiom above as saying:
"... all operations that followed another operation and do not involve
any cardiac procedures or thoracic control bleeding procedures."
The original documentation for this variable in the STS adult cardiac
database manual says:

  "Indicate whether the patient returned to the operating room for
  other non-cardiac reasons"

Now, if we assume that all operations of interest and the involved
procedures are explicitly recorded in our patient RDF dataset, this
general class inclusion axiom can be reduced to a set of rules that
use negated 'literals' (as they are called), understood to capture the
semantics of default negation (or the 'closed world assumption'). It
is worth noting that this is an example of a class of expressions
that tableaux-based description logic reasoning algorithms often have
problems with.
Conjunctive query answering for stratified datalog is a well-studied
class of problems in database theory. Drawing on this body of theory,
FuXi is now able to reduce OWL-DL expressions that use
owl:complementOf into sets of rules (or logic programs) that can be
efficiently processed in order to implement SPARQL entailment regimes
for combinations of OWL and rule-based representations for the
semantic web such as Notation 3 or RIF Core.
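
Whether a set of rules with negated literals is stratified can be checked by assigning each predicate a stratum, so that a rule head is never lower than any positive body predicate and is strictly higher than any negated one. The sketch below uses the textbook procedure and is not necessarily what FuXi does internally. It assumes rules are given as hypothetical (head, positive, negative) tuples of predicate names.

```python
def stratify(rules):
    """Return a {predicate: stratum} mapping, or None if the rules
    cannot be stratified (some predicate depends negatively on itself).

    rules: list of (head, positive_body_preds, negative_body_preds).
    """
    preds = set()
    for head, pos, neg in rules:
        preds.add(head)
        preds.update(pos)
        preds.update(neg)

    stratum = {p: 0 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            # Head must be >= positive dependencies and > negated ones.
            need = max(
                [stratum[head]]
                + [stratum[p] for p in pos]
                + [stratum[p] + 1 for p in neg]
            )
            if need > stratum[head]:
                if need >= len(preds):
                    # A stratum this high implies a cycle through negation.
                    return None
                stratum[head] = need
                changed = True
    return stratum


# Predicate names taken from the GCI above; the tuple encoding is an assumption.
gci_rules = [
    (
        "sts:ReopForOtherNonCardiac",
        ["PostOpInHospitalEvent", "Operation", "involves"],
        ["CardiacProcedure", "ThoracicControlBleeding"],
    ),
]
strata = stratify(gci_rules)

# A program where p and q are each defined by the other's negation.
unstratified = stratify([("p", [], ["q"]), ("q", [], ["p"])])
```

Evaluating the strata in increasing order means every negated predicate is fully computed before any rule that negates it is applied.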
>
The current FuXi implementation converts the GCI into the following
two RIF rules:

  Forall ?X ?QrjeKHuq961 (
    ?X # sts:ReopForOtherNonCardiac
      :- And(
           ?X # PostOpInHospitalEvent,
           ?X # Operation,
           Naf ?X[involves -> ?QrjeKHuq961] ) )

  Forall ?X ?QrjeKHuq961 (
    ?X # sts:ReopForOtherNonCardiac
      :- And(
           ?X # PostOpInHospitalEvent,
           ?X # Operation,
           ?X[involves -> ?QrjeKHuq961],
           Naf ?QrjeKHuq961 # CardiacProcedure,
           Naf ?QrjeKHuq961 # ThoracicControlBleeding ) )

Note that Naf is in the (current) 30 July 2008 version of the "RIF
Framework for Logic Dialects".
The first rule describes members of the class of ReopForOtherNonCardiac
as those post-operative operations (i.e., operations that follow
another operation in the same patient hospital visit or episode) that
do not involve other procedures. The second rule applies to those
post-operative operations that do involve other procedures, where
these other procedures are neither cardiac procedures nor thoracic
control bleeding procedures.
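
To make the two rules concrete, here is a small sketch that evaluates them literally over an in-memory set of triples, reading Naf as absence from the recorded facts. The data, the `rdf:type` encoding, and the function name are hypothetical. It does not use FuXi or rdflib.

```python
def reop_for_other_non_cardiac(facts):
    """Apply the two rules to (subject, predicate, object) facts."""

    def instances(cls):
        return {s for (s, p, o) in facts if p == "rdf:type" and o == cls}

    cardiac = instances("CardiacProcedure")
    bleeding = instances("ThoracicControlBleeding")
    result = set()
    for x in instances("PostOpInHospitalEvent") & instances("Operation"):
        involved = {o for (s, p, o) in facts if s == x and p == "involves"}
        if not involved:
            # Rule 1: no 'involves' statement is recorded for x.
            result.add(x)
        elif any(q not in cardiac and q not in bleeding for q in involved):
            # Rule 2 as written: some involved procedure is recorded as
            # neither a cardiac nor a thoracic control bleeding procedure.
            result.add(x)
    return result


facts = {
    # op1: post-operative operation with no recorded procedures
    ("op1", "rdf:type", "Operation"),
    ("op1", "rdf:type", "PostOpInHospitalEvent"),
    # op2: involves a procedure of some other (hypothetical) kind
    ("op2", "rdf:type", "Operation"),
    ("op2", "rdf:type", "PostOpInHospitalEvent"),
    ("op2", "involves", "proc_a"),
    ("proc_a", "rdf:type", "WoundCareProcedure"),
    # op3: involves a cardiac procedure
    ("op3", "rdf:type", "Operation"),
    ("op3", "rdf:type", "PostOpInHospitalEvent"),
    ("op3", "involves", "proc_b"),
    ("proc_b", "rdf:type", "CardiacProcedure"),
    # op4: an operation that is not a post-operative event
    ("op4", "rdf:type", "Operation"),
}

matches = reop_for_other_non_cardiac(facts)
```
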
These RIF rules can be exchanged with other RIF-compliant rule-based
systems that implement any of the well-accepted semantics for negated
formulas in Horn clause logic (stable models, well-founded models,
stratified models, etc.). A recent modification to FuXi makes
use of a programmatic SPARQL interface for Python that a colleague of
mine has been working on called telescope. It works with
rdflib (same as FuXi) and is used as a control layer that converts
negated RIF rules into a series of SPARQL queries involving
OPTIONAL/FILTER/!BOUND that are used to calculate "stratified models"
(i.e., the finite set of facts that can be inferred from the set of
rules that include negated literals).
>
Angles et al. (2008) and Polleres (2007) have demonstrated that the
expressive power of SPARQL coincides with that of datalog with
negation, so it comes as no surprise that certain datalog clauses (or
rules) can be converted into SPARQL queries using so-called
copy-patterns and the introduction of a MINUS operator. For the
details of how this operator works and how its semantics are
equivalent to those of datalog, the reader is urged to read either of
the above-mentioned papers.
>
telescope is used to programmatically convert MINUS operators into
SPARQL queries that answer for RIF rules with the corresponding
negated frame formulas, as shown below:

  SELECT ?X
  WHERE {
    ?X a PostOpInHospitalEvent .
    ?X a Operation
    # The post-operative operation does not involve any procedures
    OPTIONAL { ?X involves ?QrjeKHuq961 }
    FILTER (!bound(?QrjeKHuq961))
  }

  SELECT ?X
  WHERE {
    ?X a PostOpInHospitalEvent .
    ?X a Operation .
    ?X involves ?QrjeKHuq961
    # In the case where the post-operative operation involves a
    # procedure, it is *not* either a cardiac procedure or a thoracic
    # control bleeding procedure
    OPTIONAL {
      ?QrjeKHuq961 a CardiacProcedure .
      ?QrjeKHuq961 a ThoracicControlBleeding .
      ?QrjeKHuq16542 a CardiacProcedure .
      ?QrjeKHuq16542 a ThoracicControlBleeding
      FILTER (?QrjeKHuq961 = ?QrjeKHuq16542)
    }
    FILTER (!bound(?QrjeKHuq16542))
  }

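
The OPTIONAL / FILTER (!bound(...)) idiom in the first query can be read as a left outer join followed by a filter that keeps only the rows where the optional variable stayed unbound. The sketch below emulates that reading in plain Python over hypothetical triples. It illustrates the semantics only and does not use telescope or rdflib; the key "Q" stands in for the generated variable.

```python
def first_query(triples):
    """Emulate: ?X a PostOpInHospitalEvent . ?X a Operation
    OPTIONAL { ?X involves ?Q } FILTER (!bound(?Q))"""

    def instances(cls):
        return {s for (s, p, o) in triples if p == "rdf:type" and o == cls}

    solutions = []
    for x in sorted(instances("PostOpInHospitalEvent") & instances("Operation")):
        matched = sorted(o for (s, p, o) in triples if s == x and p == "involves")
        if matched:
            # The optional pattern matched: one solution per binding of ?Q.
            solutions.extend({"X": x, "Q": q} for q in matched)
        else:
            # The optional pattern failed: keep the row, leave ?Q unbound.
            solutions.append({"X": x, "Q": None})

    # FILTER (!bound(?Q)) keeps only the rows where ?Q was never bound.
    return [row["X"] for row in solutions if row["Q"] is None]


triples = {
    ("op1", "rdf:type", "Operation"),
    ("op1", "rdf:type", "PostOpInHospitalEvent"),
    ("op2", "rdf:type", "Operation"),
    ("op2", "rdf:type", "PostOpInHospitalEvent"),
    ("op2", "involves", "proc1"),
    ("op2", "involves", "proc2"),
}

answers = first_query(triples)
```
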
I'll be adding a wiki page shortly (on the python-dlp Google Code
wiki) describing the explicit APIs that can be used for this purpose,
but I wanted to give the feature some context in the recent work I've
been doing on applications of the semantic web to medical informatics.
>
-- Chimezie