Discussion:
Duplicate entries produced by SPARQL query
lookman sanni
2016-11-15 00:09:33 UTC
Permalink
Hi,

I have the following RDF:

<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:j.0="http://www.somelink.org/ofdf#">
<rdf:Description rdf:about="
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678">
<j.0:creditedToCust>ABCDEFG1_5#XYZ1</j.0:creditedToCust>
<j.0:createdOnDate>ABCDEFG1_5#20151208</j.0:createdOnDate>
<j.0:consumedOnDate>ABCDEFG1_5#20151111</j.0:consumedOnDate>
<j.0:creditedFor>ABCDEFG1_5#124.50</j.0:creditedFor>
<j.0:createdOnDate>ABCDEFG2#20151104</j.0:createdOnDate>
<j.0:creditedToCust>ABCDEFG2#XYZ2</j.0:creditedToCust>
<j.0:creditedFor>ABCDEFG2#124.50</j.0:creditedFor>
<j.0:consumedOnDate>ABCDEFG2#20151111</j.0:consumedOnDate>
</rdf:Description>
</rdf:RDF>

and the following SPARQL query:

String qs1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
"PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
"PREFIX ticket: <http://www.lookouster.org/ofdf#> " +
"SELECT ?lticket ?type ?bf1 ?cust1 ?tAmount1 ?dtCreation1 ?dtTravel1 ?bf2
?cust2 ?tAmount2 ?dtCreation2 ?dtTravel2 " +
"WHERE { " +
"?lticket ticket:creditedToCust ?_cust1 .\n" +
"?lticket ticket:creditedToCust ?_cust2 .\n" +
"?lticket ticket:creditedFor ?_tAmount1 .\n" +
"?lticket ticket:creditedFor ?_tAmount2 .\n" +
"?lticket ticket:createdOnDate ?_dtCreation1 .\n" +
"?lticket ticket:createdOnDate ?_dtCreation2 .\n" +
"?lticket ticket:consumedOnDate ?_dtTravel1 .\n" +
"?lticket ticket:consumedOnDate ?_dtTravel2 .\n" +
"FILTER ( strafter( ?_cust1, '#') != strafter( ?_cust2, '#') ) .\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_tAmount1, '#') ) .\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtCreation1, '#') )
.\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtTravel1, '#') ) .\n"
+
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_tAmount2, '#') ) .\n" +
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtCreation2, '#') )
.\n" +
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtTravel2, '#') ) .\n"
+
"FILTER ( strafter( ?_tAmount1, '#' ) = strafter( ?_tAmount2, '#' ) )
.\n" +
"bind(strafter( ?_cust1, '#' ) as ?cust1) .\n" +
"bind(strafter( ?_cust2, '#' ) as ?cust2) .\n" +
"bind(strbefore( ?_cust1, '#' ) as ?bf1) .\n" +
"bind(strbefore( ?_cust2, '#' ) as ?bf2) .\n" +
"bind(strafter( ?_tAmount1, '#' ) as ?tAmount1) .\n" +
"bind(strafter( ?_tAmount2, '#' ) as ?tAmount2) .\n" +
"bind(strafter( ?_dtCreation1, '#' ) as ?dtCreation1) .\n" +
"bind(strafter( ?_dtCreation2, '#' ) as ?dtCreation2) .\n" +
"bind(strafter( ?_dtTravel1, '#' ) as ?dtTravel1) .\n" +
"bind(strafter( ?_dtTravel2, '#' ) as ?dtTravel2) .\n" +
"values ?type { '2x Credit Notes' } .\n" +
"} ORDER BY ?lticket ?dtCreation1 " ;

The query ran well, but return the following 2 lines:

lticket,type,bf1,cust1,tAmount1,dtCreation1,dtTravel1,bf2,cust2,tAmount2,dtCreation2,dtTravel2
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG2,XYZ2,124.50,20151104,20151111,ABCDEFG1,XYZ1,124.50,20151208,20151111
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG1,XYZ1,124.50,20151208,20151111,ABCDEFG2,XYZ2,124.50,20151104,20151111


How do I manage to only output one line, given that the two lines are just
inverted ?

Thank you for your help.
--
Best Regards

Lookman SANNI
Brian McBride
2016-11-15 08:27:50 UTC
Permalink
Would adding a filter

FILTER (?bf1 <= ?bf2)

or similar do the trick?

Brian
Post by lookman sanni
Hi,
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:j.0="http://www.somelink.org/ofdf#">
<rdf:Description rdf:about="
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678">
<j.0:creditedToCust>ABCDEFG1_5#XYZ1</j.0:creditedToCust>
<j.0:createdOnDate>ABCDEFG1_5#20151208</j.0:createdOnDate>
<j.0:consumedOnDate>ABCDEFG1_5#20151111</j.0:consumedOnDate>
<j.0:creditedFor>ABCDEFG1_5#124.50</j.0:creditedFor>
<j.0:createdOnDate>ABCDEFG2#20151104</j.0:createdOnDate>
<j.0:creditedToCust>ABCDEFG2#XYZ2</j.0:creditedToCust>
<j.0:creditedFor>ABCDEFG2#124.50</j.0:creditedFor>
<j.0:consumedOnDate>ABCDEFG2#20151111</j.0:consumedOnDate>
</rdf:Description>
</rdf:RDF>
String qs1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
"PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
"PREFIX ticket: <http://www.lookouster.org/ofdf#> " +
"SELECT ?lticket ?type ?bf1 ?cust1 ?tAmount1 ?dtCreation1 ?dtTravel1 ?bf2
?cust2 ?tAmount2 ?dtCreation2 ?dtTravel2 " +
"WHERE { " +
"?lticket ticket:creditedToCust ?_cust1 .\n" +
"?lticket ticket:creditedToCust ?_cust2 .\n" +
"?lticket ticket:creditedFor ?_tAmount1 .\n" +
"?lticket ticket:creditedFor ?_tAmount2 .\n" +
"?lticket ticket:createdOnDate ?_dtCreation1 .\n" +
"?lticket ticket:createdOnDate ?_dtCreation2 .\n" +
"?lticket ticket:consumedOnDate ?_dtTravel1 .\n" +
"?lticket ticket:consumedOnDate ?_dtTravel2 .\n" +
"FILTER ( strafter( ?_cust1, '#') != strafter( ?_cust2, '#') ) .\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_tAmount1, '#') ) .\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtCreation1, '#') )
.\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtTravel1, '#') ) .\n"
+
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_tAmount2, '#') ) .\n" +
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtCreation2, '#') )
.\n" +
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtTravel2, '#') ) .\n"
+
"FILTER ( strafter( ?_tAmount1, '#' ) = strafter( ?_tAmount2, '#' ) )
.\n" +
"bind(strafter( ?_cust1, '#' ) as ?cust1) .\n" +
"bind(strafter( ?_cust2, '#' ) as ?cust2) .\n" +
"bind(strbefore( ?_cust1, '#' ) as ?bf1) .\n" +
"bind(strbefore( ?_cust2, '#' ) as ?bf2) .\n" +
"bind(strafter( ?_tAmount1, '#' ) as ?tAmount1) .\n" +
"bind(strafter( ?_tAmount2, '#' ) as ?tAmount2) .\n" +
"bind(strafter( ?_dtCreation1, '#' ) as ?dtCreation1) .\n" +
"bind(strafter( ?_dtCreation2, '#' ) as ?dtCreation2) .\n" +
"bind(strafter( ?_dtTravel1, '#' ) as ?dtTravel1) .\n" +
"bind(strafter( ?_dtTravel2, '#' ) as ?dtTravel2) .\n" +
"values ?type { '2x Credit Notes' } .\n" +
"} ORDER BY ?lticket ?dtCreation1 " ;
lticket,type,bf1,cust1,tAmount1,dtCreation1,dtTravel1,bf2,cust2,tAmount2,dtCreation2,dtTravel2
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG2,XYZ2,124.50,20151104,20151111,ABCDEFG1,XYZ1,124.50,20151208,20151111
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG1,XYZ1,124.50,20151208,20151111,ABCDEFG2,XYZ2,124.50,20151104,20151111
How do I manage to only output one line, given that the two lines are just
inverted ?
Thank you for your help.
lookman sanni
2016-11-15 21:51:59 UTC
Permalink
That's what I actually implemented. I was hoping there would be something
sexier.

Thanks though Brian.
Post by Brian McBride
Would adding a filter
FILTER (?bf1 <= ?bf2)
or similar do the trick?
Brian
Post by lookman sanni
Hi,
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:j.0="http://www.somelink.org/ofdf#">
<rdf:Description rdf:about="
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678">
<j.0:creditedToCust>ABCDEFG1_5#XYZ1</j.0:creditedToCust>
<j.0:createdOnDate>ABCDEFG1_5#20151208</j.0:createdOnDate>
<j.0:consumedOnDate>ABCDEFG1_5#20151111</j.0:consumedOnDate>
<j.0:creditedFor>ABCDEFG1_5#124.50</j.0:creditedFor>
<j.0:createdOnDate>ABCDEFG2#20151104</j.0:createdOnDate>
<j.0:creditedToCust>ABCDEFG2#XYZ2</j.0:creditedToCust>
<j.0:creditedFor>ABCDEFG2#124.50</j.0:creditedFor>
<j.0:consumedOnDate>ABCDEFG2#20151111</j.0:consumedOnDate>
</rdf:Description>
</rdf:RDF>
String qs1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
"PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
"PREFIX ticket: <http://www.lookouster.org/ofdf#> " +
"SELECT ?lticket ?type ?bf1 ?cust1 ?tAmount1 ?dtCreation1 ?dtTravel1 ?bf2
?cust2 ?tAmount2 ?dtCreation2 ?dtTravel2 " +
"WHERE { " +
"?lticket ticket:creditedToCust ?_cust1 .\n" +
"?lticket ticket:creditedToCust ?_cust2 .\n" +
"?lticket ticket:creditedFor ?_tAmount1 .\n" +
"?lticket ticket:creditedFor ?_tAmount2 .\n" +
"?lticket ticket:createdOnDate ?_dtCreation1 .\n" +
"?lticket ticket:createdOnDate ?_dtCreation2 .\n" +
"?lticket ticket:consumedOnDate ?_dtTravel1 .\n" +
"?lticket ticket:consumedOnDate ?_dtTravel2 .\n" +
"FILTER ( strafter( ?_cust1, '#') != strafter( ?_cust2, '#') ) .\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_tAmount1, '#') )
.\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtCreation1, '#') )
.\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtTravel1, '#') )
.\n"
+
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_tAmount2, '#') )
.\n" +
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtCreation2, '#') )
.\n" +
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtTravel2, '#') )
.\n"
+
"FILTER ( strafter( ?_tAmount1, '#' ) = strafter( ?_tAmount2, '#' ) )
.\n" +
"bind(strafter( ?_cust1, '#' ) as ?cust1) .\n" +
"bind(strafter( ?_cust2, '#' ) as ?cust2) .\n" +
"bind(strbefore( ?_cust1, '#' ) as ?bf1) .\n" +
"bind(strbefore( ?_cust2, '#' ) as ?bf2) .\n" +
"bind(strafter( ?_tAmount1, '#' ) as ?tAmount1) .\n" +
"bind(strafter( ?_tAmount2, '#' ) as ?tAmount2) .\n" +
"bind(strafter( ?_dtCreation1, '#' ) as ?dtCreation1) .\n" +
"bind(strafter( ?_dtCreation2, '#' ) as ?dtCreation2) .\n" +
"bind(strafter( ?_dtTravel1, '#' ) as ?dtTravel1) .\n" +
"bind(strafter( ?_dtTravel2, '#' ) as ?dtTravel2) .\n" +
"values ?type { '2x Credit Notes' } .\n" +
"} ORDER BY ?lticket ?dtCreation1 " ;
lticket,type,bf1,cust1,tAmount1,dtCreation1,dtTravel1,bf2,
cust2,tAmount2,dtCreation2,dtTravel2
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG2,XYZ2,124.50,20151104,20151111,ABCDEFG1,XYZ1,
124.50,20151208,20151111
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG1,XYZ1,124.50,20151208,20151111,ABCDEFG2,XYZ2,
124.50,20151104,20151111
How do I manage to only output one line, given that the two lines are just
inverted ?
Thank you for your help.
--
Best Regards

Lookman SANNI
Lorenz B.
2016-11-16 10:32:33 UTC
Permalink
No, that's the usual way how to avoid duplicates.

One comment for the future: Please

a) show the data in Turtle format instead of RDF/XML - RDF/XML is
verbose and difficult to read and Turtle is what SPARQL BGPs are made of.
b) show the "real" SPARQL query, i.e. not as Java String concatenation
- by the way, this can also be used to debug syntax errors more easily.

Both will help us to give faster and maybe also better support, although
in the current case it wasn't the case.


Cheers,
Lorenz
Post by lookman sanni
That's what I actually implemented. I was hoping there would be something
sexier.
Thanks though Brian.
Post by Brian McBride
Would adding a filter
FILTER (?bf1 <= ?bf2)
or similar do the trick?
Brian
Post by lookman sanni
Hi,
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:j.0="http://www.somelink.org/ofdf#">
<rdf:Description rdf:about="
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678">
<j.0:creditedToCust>ABCDEFG1_5#XYZ1</j.0:creditedToCust>
<j.0:createdOnDate>ABCDEFG1_5#20151208</j.0:createdOnDate>
<j.0:consumedOnDate>ABCDEFG1_5#20151111</j.0:consumedOnDate>
<j.0:creditedFor>ABCDEFG1_5#124.50</j.0:creditedFor>
<j.0:createdOnDate>ABCDEFG2#20151104</j.0:createdOnDate>
<j.0:creditedToCust>ABCDEFG2#XYZ2</j.0:creditedToCust>
<j.0:creditedFor>ABCDEFG2#124.50</j.0:creditedFor>
<j.0:consumedOnDate>ABCDEFG2#20151111</j.0:consumedOnDate>
</rdf:Description>
</rdf:RDF>
String qs1 = "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
"PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
"PREFIX ticket: <http://www.lookouster.org/ofdf#> " +
"SELECT ?lticket ?type ?bf1 ?cust1 ?tAmount1 ?dtCreation1 ?dtTravel1 ?bf2
?cust2 ?tAmount2 ?dtCreation2 ?dtTravel2 " +
"WHERE { " +
"?lticket ticket:creditedToCust ?_cust1 .\n" +
"?lticket ticket:creditedToCust ?_cust2 .\n" +
"?lticket ticket:creditedFor ?_tAmount1 .\n" +
"?lticket ticket:creditedFor ?_tAmount2 .\n" +
"?lticket ticket:createdOnDate ?_dtCreation1 .\n" +
"?lticket ticket:createdOnDate ?_dtCreation2 .\n" +
"?lticket ticket:consumedOnDate ?_dtTravel1 .\n" +
"?lticket ticket:consumedOnDate ?_dtTravel2 .\n" +
"FILTER ( strafter( ?_cust1, '#') != strafter( ?_cust2, '#') ) .\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_tAmount1, '#') )
.\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtCreation1, '#') )
.\n" +
"FILTER ( strbefore( ?_cust1, '#') = strbefore( ?_dtTravel1, '#') )
.\n"
+
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_tAmount2, '#') )
.\n" +
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtCreation2, '#') )
.\n" +
"FILTER ( strbefore( ?_cust2, '#') = strbefore( ?_dtTravel2, '#') )
.\n"
+
"FILTER ( strafter( ?_tAmount1, '#' ) = strafter( ?_tAmount2, '#' ) )
.\n" +
"bind(strafter( ?_cust1, '#' ) as ?cust1) .\n" +
"bind(strafter( ?_cust2, '#' ) as ?cust2) .\n" +
"bind(strbefore( ?_cust1, '#' ) as ?bf1) .\n" +
"bind(strbefore( ?_cust2, '#' ) as ?bf2) .\n" +
"bind(strafter( ?_tAmount1, '#' ) as ?tAmount1) .\n" +
"bind(strafter( ?_tAmount2, '#' ) as ?tAmount2) .\n" +
"bind(strafter( ?_dtCreation1, '#' ) as ?dtCreation1) .\n" +
"bind(strafter( ?_dtCreation2, '#' ) as ?dtCreation2) .\n" +
"bind(strafter( ?_dtTravel1, '#' ) as ?dtTravel1) .\n" +
"bind(strafter( ?_dtTravel2, '#' ) as ?dtTravel2) .\n" +
"values ?type { '2x Credit Notes' } .\n" +
"} ORDER BY ?lticket ?dtCreation1 " ;
lticket,type,bf1,cust1,tAmount1,dtCreation1,dtTravel1,bf2,
cust2,tAmount2,dtCreation2,dtTravel2
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG2,XYZ2,124.50,20151104,20151111,ABCDEFG1,XYZ1,
124.50,20151208,20151111
http://www.somelink.org/ofdf#ticket/supplier-123/ticket-12345678,2x Credit
Notes,ABCDEFG1,XYZ1,124.50,20151208,20151111,ABCDEFG2,XYZ2,
124.50,20151104,20151111
How do I manage to only output one line, given that the two lines are just
inverted ?
Thank you for your help.
--
Lorenz BÃŒhmann
AKSW group, University of Leipzig
Group: http://aksw.org - semantic web research center
Loading...