RDF/SPARQL and Flink

Tomas Knap
Good afternoon, 

Currently we are using UnifiedViews (unifiedviews.eu) for RDF data processing. In UnifiedViews you can define various RDF data processing tasks, e.g.: 1) extract data from a certain SPARQL endpoint A, 2) extract data from a certain folder and convert it to RDF, 3) merge the RDF data output by these two sources, 4) execute a series of SPARQL Update queries on top of that, 5) load the result into the target repository.
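To make the question concrete, here is a rough sketch of how we imagine steps 1), 2), 3) and 5) could look with Flink's Java DataSet API plus Apache Jena on the client side. Everything here (the endpoint URL, the paths, the idea of shipping N-Triples lines through Flink) is just our assumption about how it might be done, not working code from UnifiedViews:

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.rdf.model.Model;

import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

public class RdfPipelineSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // 1) Extract from a SPARQL endpoint on the client side with Jena,
        //    serialize the result to N-Triples, and parallelize the lines.
        //    (Endpoint URL and query are placeholders.)
        QueryExecution qe = QueryExecutionFactory.sparqlService(
                "http://example.org/sparql",
                "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }");
        Model extracted = qe.execConstruct();
        StringWriter sw = new StringWriter();
        extracted.write(sw, "N-TRIPLES");
        qe.close();
        List<String> endpointLines = Arrays.asList(sw.toString().split("\n"));
        DataSet<String> fromEndpoint =
                env.fromCollection(endpointLines).filter(line -> !line.isEmpty());

        // 2) Read N-Triples files from a folder. N-Triples is line-based,
        //    so the input splits cleanly across parallel workers.
        DataSet<String> fromFolder = env.readTextFile("hdfs:///data/rdf/");

        // 3) Merging graphs in N-Triples form is just set union of lines.
        //    (A standards-correct RDF merge would also have to keep blank
        //    nodes from different sources distinct; we gloss over that here.)
        DataSet<String> merged = fromEndpoint.union(fromFolder).distinct();

        // 5) "Load" here is simply writing N-Triples back out; loading into
        //    a triple store would need a custom sink.
        merged.writeAsText("hdfs:///data/rdf-merged/");
        env.execute("RDF pipeline sketch");
    }
}

N-Triples seemed like the natural exchange format to us, since it is line-based and Flink can split and shuffle it like plain text; but maybe there is a better way.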

We are thinking about using Apache Flink as a backend for executing these RDF data processing tasks (to scale out). What is still not clear to us is how tasks processing RDF data (such as the one above) can be supported in Flink. For example, how would you read RDF data, and how would you support plugins in Flink that execute SPARQL Update/CONSTRUCT queries?
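For step 4) (and for CONSTRUCT plugins), our naive idea would be to load each partition's triples into an in-memory Jena model inside a mapPartition and evaluate the query locally. We realize this is only sound when the query's joins never cross partitions (e.g. star-shaped queries after partitioning by subject); anything more general presumably needs the SPARQL query compiled into distributed joins. A sketch of what we mean, again just our assumption:

import org.apache.flink.api.common.functions.MapPartitionFunction;
import org.apache.flink.util.Collector;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFDataMgr;

import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

public class SparqlPerPartition implements MapPartitionFunction<String, String> {

    private final String constructQuery;  // a SPARQL CONSTRUCT query string

    public SparqlPerPartition(String constructQuery) {
        this.constructQuery = constructQuery;
    }

    @Override
    public void mapPartition(Iterable<String> ntripleLines, Collector<String> out) {
        // Collect this partition's N-Triples lines and parse them
        // into an in-memory Jena model.
        StringBuilder sb = new StringBuilder();
        for (String line : ntripleLines) {
            sb.append(line).append('\n');
        }
        Model model = ModelFactory.createDefaultModel();
        RDFDataMgr.read(model,
                new ByteArrayInputStream(sb.toString().getBytes(StandardCharsets.UTF_8)),
                Lang.NTRIPLES);

        // Run the CONSTRUCT query locally and emit the result as N-Triples.
        try (QueryExecution qe = QueryExecutionFactory.create(constructQuery, model)) {
            Model result = qe.execConstruct();
            StringWriter sw = new StringWriter();
            result.write(sw, "N-TRIPLES");
            for (String line : sw.toString().split("\n")) {
                if (!line.isEmpty()) out.collect(line);
            }
        }
    }
}

It would be used as merged.mapPartition(new SparqlPerPartition(query)), and we imagine a series of SPARQL Update queries could be applied the same way, with UpdateAction.parseExecute(update, model) mutating the local model before it is serialized back into lines. But we don't know whether this is the idiomatic way to do it in Flink.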

If you could share some materials on processing RDF data in Flink, support for SPARQL (Update) queries in Flink, etc., that would be great and would help us decide on our next steps!

Thanks, 
Tomas Knap



--
Tomas Knap, PhD
Technical Consultant & Researcher

Semantic Web Company GmbH 




Re: RDF/SPARQL and Flink

rmetzger0
Hi Tomas,

I'm really not an RDF processing expert, but since nobody has responded in 4 days, I'll try to give you some pointers:
I know that there've been discussions regarding RDF processing on this mailing list before.
Also, there seems to be a project that uses Flink: http://sansa-stack.net/user-guide/


Regards,
Robert

