Discussion:
Big queries without Java
Mikael Pesonen
2016-06-20 08:20:00 UTC
Hi,

I have a graph with about 10 million triples and a separate Lucene index.
Currently I query the data from Jena first and then filter the results
against the Lucene query result.
The problem is that fetching ~100k results this way is too slow:

SELECT ?s ?p ?o WHERE {
  GRAPH <...graph...> {
    ?s dcterms:isPartOf <...collection...> .
    ?s ?p ?o
  }
}

Querying Lucene first and then sending its results to Jena could be
faster, but the SPARQL query size would be 10-100k, so the command line
won't work.
Is it possible to run large Jena queries without Java?

Thanks,
Mikael
--
www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: ***@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Linnankatu 10 A
FI-20100 Turku
FINLAND
Osma Suominen
2016-06-20 09:10:56 UTC
Hi Mikael!

There are many ways.

If you use Fuseki, then you can submit SPARQL queries over HTTP. The
easiest way is probably to use the rsparql command line tool (from the
Jena distribution) or s-query (from the Fuseki distribution).
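
As a rough illustration of the HTTP route, here is a minimal Python sketch that builds a large query from a list of subject IRIs (for example, hits from the Lucene index) and prepares a direct POST as defined by the SPARQL 1.1 Protocol. The endpoint URL, the graph IRI, the example IRIs and all function names are assumptions made up for this example, not anything from the original setup, and the IRIs are assumed to be well-formed (no escaping is done).

```python
import json
import urllib.request


def build_values_query(graph_iri, subject_iris):
    """Build a SELECT query limited to the given subjects with a VALUES block."""
    values = " ".join(f"<{iri}>" for iri in subject_iris)
    return (
        "SELECT ?s ?p ?o WHERE {\n"
        f"  GRAPH <{graph_iri}> {{\n"
        f"    VALUES ?s {{ {values} }}\n"
        "    ?s ?p ?o\n"
        "  }\n"
        "}\n"
    )


def build_request(endpoint, query):
    """Prepare a SPARQL 1.1 Protocol 'query via POST directly' request.

    The query travels in the request body, so its size is not bound by
    command-line or URL length limits.
    """
    return urllib.request.Request(
        endpoint,
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_query(endpoint, query):
    """Send the query and return the parsed JSON result (needs a live endpoint)."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return json.load(response)
```

With a running Fuseki server, something like `run_query("http://localhost:3030/ds/query", build_values_query(graph, iris))` would send it; the service URL here is hypothetical and depends on how the dataset is configured.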

If you only have a TDB but no Fuseki, then you can use the tdbquery
command. It takes a --query (or --file) parameter specifying a file with
the SPARQL query.
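
For the TDB-only case, the same idea can be sketched in Python: write the large query to a file and pass the file name to tdbquery, so that the query text itself never appears on the command line. This assumes tdbquery is on the PATH and that the database is selected with --loc; the helper names and the directory used below are hypothetical.

```python
import os
import subprocess
import tempfile


def write_query_file(query):
    """Write the query to a temporary .rq file and return its path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".rq", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(query)
        return handle.name


def tdbquery_command(tdb_dir, query_path):
    """Build the argument list; only the file name is passed, not the query."""
    return ["tdbquery", f"--loc={tdb_dir}", f"--query={query_path}"]


def run_tdbquery(tdb_dir, query):
    """Run tdbquery on the query and return its standard output as text."""
    path = write_query_file(query)
    try:
        completed = subprocess.run(
            tdbquery_command(tdb_dir, path),
            check=True,
            capture_output=True,
            text=True,
        )
        return completed.stdout
    finally:
        os.unlink(path)  # remove the temporary query file
```

Calling `run_tdbquery("/path/to/tdb", query)` would then print nothing itself and simply return whatever tdbquery writes to standard output.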

-Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
***@helsinki.fi
http://www.nationallibrary.fi
Mikael Pesonen
2016-06-20 11:10:06 UTC
Thanks Osma!

Looks like I had a false idea about the query size limit of the s-query
command line tool. It works fine with a 100k query.

Br,
Mikael