High CPU usage with Apache Jena Fuseki

Discussion:

Wen, Chen

2016-06-30 17:51:00 UTC

Hi,
I am having a problem with fuseki-server. Whenever I run an ontology-based query, or just click "count triples in all graphs", CPU usage climbs to almost 100% and stays there. I have to terminate the process to bring CPU usage back down.

I have a customized config.ttl for tdb:
<#tdb> rdf:type fuseki:Service ;
    fuseki:name "tdb" ; # http://host/inf
    fuseki:serviceQuery "sparql" ; # SPARQL query service
    fuseki:serviceQuery "query" ; # SPARQL query service (alt name)
    fuseki:serviceUpdate "update" ; # SPARQL update service
    fuseki:serviceUpload "upload" ; # Non-SPARQL upload service
    fuseki:serviceReadWriteGraphStore "data" ; # SPARQL Graph store protocol (read and write)
    # A separate read-only graph store endpoint:
    fuseki:serviceReadGraphStore "get" ; # SPARQL Graph store protocol (read only)
    fuseki:dataset <#dataset2> ; #select which set to
    .

tdb:GraphTDB rdfs:subClassOf ja:Model .

<#dataset2> rdf:type ja:RDFDataset ;
    ja:defaultGraph <#model2> ;
    .

And I also increased JVM memory as below in fuseki-server.bat:
java -cp jena-tdb-3.1.0.jar:jena-arq-3.1.0.jar -Xms1g -Xmx15g -XX:NewSize=4g -XX:MaxNewSize=4g -XX:SurvivorRatio=8 -jar fuseki-server.jar %*

I have only 124 tuples loaded, and it works if I run a query without any specific criteria, like:
select ?s ?p ?o
where
{
?s ?p ?o .
}
limit 100

However, if I run a simple ontology-specific query, CPU usage spikes and never recovers:
SELECT ?patient
WHERE
{
?patient <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://sample.org/dental-ontology/RIDO_0000083> .
}
limit 100

Am I missing anything? Can somebody advise?

Andy Seaborne

2016-06-30 20:34:44 UTC


Post by Wen, Chen
<#dataset2> rdf:type ja:RDFDataset ;
ja:defaultGraph <#model2>;
.

Where is <#model2> defined in the config?

Post by Wen, Chen
java -cp jena-tdb-3.1.0.jar:jena-arq-3.1.0.jar -Xms1g -Xmx15g -XX:NewSize=4g -XX:MaxNewSize=4g -XX:SurvivorRatio=8 -jar fuseki-server.jar %*

Only -jar is needed, not -cp (the JVM ignores -cp when -jar is given).

How big is the physical RAM in the machine?

If it is, say, 16G, then -Xmx15g is not a good idea, as it may force the
OS to swap the Java heap.

For TDB, much of the caching is off-heap, so -Xmx15g takes memory away
from that. Allow 2G per TDB database + 2G for Fuseki.


Wen, Chen

2016-07-01 13:33:39 UTC


Thank you, Andy. This machine has 64G of memory. Below is the model2 config. Do you see anything wrong?

<#model2> a ja:InfModel ;
    ja:baseModel
        [ a ja:MemoryModel ;
          ja:content [ ja:externalContent <file:///E:/sample-dental-ontology-rdfxml.owl> ] ] ;
    ja:reasoner
        [ ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLFBRuleReasoner> ] ;
    .


Andy Seaborne

2016-07-02 09:43:41 UTC


It is possible that the inference you are using requires a lot of
computation. How much depends on what is in sample-dental-ontology-rdfxml.owl.

Andy


Dave Reynolds

2016-07-02 14:44:58 UTC


One option to try would be to use the OWLMicro reasoner configuration:
http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner
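
As a sketch of that change, assuming the <#model2> definition posted earlier in the thread is otherwise left as it is, only the reasoner URL would differ. The ja: prefix declaration is added here so the fragment stands alone; it is assumed to match the one already present in the real config.ttl.

```turtle
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .

# Same base model as in the posted config; only the reasoner URL is changed.
<#model2> a ja:InfModel ;
    ja:baseModel [
        a ja:MemoryModel ;
        ja:content [ ja:externalContent <file:///E:/sample-dental-ontology-rdfxml.owl> ]
    ] ;
    ja:reasoner [
        # OWLMicro instead of the full OWLFBRuleReasoner
        ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner>
    ] .
```

Fuseki reads the config at startup, so the server would need a restart before re-running the rdf:type query.
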

If that is still too slow, and if your data is static, then
perform the inference ahead of time, store the inference closure and
serve that without further runtime inference.
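
A minimal sketch of serving a precomputed closure, assuming the inferred triples have already been written to a file. The file name below is hypothetical, and the thread does not say how the closure would be generated. Here <#model2> becomes a plain in-memory model with no ja:reasoner, so queries do no inference at run time.

```turtle
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .

# Hypothetical file: the ontology plus its inferred triples, produced
# once ahead of time. No ja:InfModel and no ja:reasoner are configured.
<#model2> a ja:MemoryModel ;
    ja:content [ ja:externalContent <file:///E:/sample-dental-ontology-inferred.ttl> ] .
```

With this in place, <#dataset2> can keep pointing at <#model2> through ja:defaultGraph unchanged. The closure file would have to be regenerated whenever the ontology changes.
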

Dave
