Sunday, September 28, 2014

Loading JSON-LD Into Elasticsearch

From the elasticsearch mailing list

Amine Bouayad amine@***.com via googlegroups.com 




Thank you all for your responses and interesting conversation about RDF serialization into ES. With regards to my original post, I ended up using a solution based on RDFlib: 


It works as expected, and compacting the content by using @context does the trick and is flexible. It is an in-memory process however, which could be an issue for those with very large RDF files. When using Jena, I didn't find the ability to add @context mappings, but maybe I didn't dig enough.

On a side note, looks like the rdflib-jsonld solution already has support for XSD literals and lists, so perhaps it could be extended to map directly into ES _type if that is a good direction.

With my Json-ld file ready for ingestion into ES, I do have another question: are there utilities to bulk load such documents (the json-ld contains individual documents per ES, each with an _id), or do I just write a script that calls curl -XPUT for each record in the json-ld file? Seems like a pretty common use case.

Thanks again to all, interesting stuff. Happy to contribute to extending an existing solution.

Amine

No comments:

Post a Comment