rdf-io

Utilities to link Django to RDF stores and inferencers.

Why: Allows semantic data models and rules to be used to generate rich views of content, and to expose standardised access and query interfaces such as SPARQL and the Linked Data Platform. Conversely, it allows Django to be used to manage content in RDF stores :-)

compatibility

Tested with django 1.11 + python 2.7 and django 3.0 with python 3.8

Features

installation

get a working copy with

git clone https://github.com/rob-metalinkage/django-rdf-io
pip install -e (where you put it)

in your master django project:
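The remaining setup follows the standard Django app pattern; a minimal sketch, assuming the app label is rdf_io and a URL module rdf_io.urls (the /rdf_io/ prefix matches the URLs used throughout this README):

# settings.py - register the app
INSTALLED_APPS = [
    # ... your other apps ...
    'rdf_io',
]

# urls.py - mount the rdf_io views under the /rdf_io/ prefix (Django 2+ style)
from django.urls import include, path

urlpatterns = [
    # ... your other URL patterns ...
    path('rdf_io/', include('rdf_io.urls')),
]

then run manage.py migrate so the mapping tables are created.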

Automated publishing of updates to RDF

This is really only guaranteed for pushing additions and updates - deletions are not handled, although updates will tend to replace statements.

To enable automated publishing, visit the following URL on startup (necessary after each Django reload):

{SERVER_URL}/rdf_io/ctl_signals/sync

NB - TODO: find a way to force this to happen automatically - it needs to happen after both RDF_IO and the target models are installed, so it cannot go in the initialisation for either model.

to turn on publishing for a model class

1) check the Auto-push flag is checked in an ObjectMapping for that model class
2) save - this should register a post_save signal for the model class
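Under the hood this amounts to connecting a post_save handler for the mapped model; a rough sketch of the idea (publish_task and my_app are hypothetical names - rdf_io wires up an equivalent handler itself when Auto-push is checked, so you do not write this yourself):

from django.db.models.signals import post_save
from django.dispatch import receiver

from my_app.models import Task

@receiver(post_save, sender=Task)
def publish_task(sender, instance, **kwargs):
    # hypothetical handler: serialise the saved object via its ObjectMapping
    # and push the resulting statements to the configured RDF store
    pass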

to turn on/off publishing for all model classes

{SERVER_URL}/rdf_io/ctl_signals/(on/off)

Usage

Overview

1) Define mappings for your target models using the admin interface $SERVER/admin/rdf_io (see below)
2) To create an online resource use either
{SERVER_URL}/rdf_io/to_rdf/{model_name}/id/{model_id}
or
{SERVER_URL}/rdf_io/to_rdf/{model_name}/key/{model_natural_key}
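For example, with a model called task whose first instance has id 1 (placeholder values), the first form would be {SERVER_URL}/rdf_io/to_rdf/task/id/1.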

Object Mappings

Mappings to RDF are defined for Django models. Each mapping consists of:
1) an identifier mapping to generate the URI for the object
2) a set of AttributeMapping elements that map a list of values to an RDF predicate
3) a set of EmbeddedMapping elements that map a list of values to a complex object property (optionally wrapped in a blank node)
4) a filter to limit the set of objects the mapping applies to

More than one object mapping may exist for a Django model. The RDF graph is the union of all the configured Object Mapping outputs. (Note that a ServiceBinding may be bound to a specific mapping, but the default behaviour is to find all ServiceBindings for a given Django model type, and they all get the composite graph. This may be changed to support publishing different graphs to different RDF stores in future.)

Mapping syntax

Mapping is non-trivial because elements of your model may need to be extracted from related models.

Mapping is from elements in a Django model to an RDF value (a URI or a literal).

Source model elements may be defined using an XPath-like syntax, with nesting expressed using Django filter style (__), . (dot) or / notation, where each element of the path may support an optional filter.

path = (literal|element([./]element)*)

literal = "a quoted string" | 'a quoted string' | <a URI>  

element = (property|related_model_expr)([filter])?

property = a valid name of a property of a django model 

related_model_expr = model_name(\({property}\))? 

filter = (field(!)?=literal)((,| AND )field(!)?=literal)* | literal((,| OR )literal)*
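A few illustrative path expressions that fit this grammar (the model and field names are hypothetical):

name                        -- a simple property of the mapped model
metadata.title              -- a property of a related model, reached with dot notation
term(label)[language="en"]  -- a related model expression with a property, restricted by a filter
<http://www.w3.org/2004/02/skos/core#Concept>  -- a fixed URI literal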

Notes:

Status:

beta - functionally complete initial capability

todo:

API

Serialising within python

from django.contrib.contenttypes.models import ContentType
from rdflib import Graph

from rdf_io.views import build_rdf
from rdf_io.models import ObjectMapping

from my_app.models import Task

# This example assumes ...
#   * you have created a model called `Task` and there’s at least one task
#   * you have created a mapping for the Task model

object_to_serialize = Task.objects.first()

content_type = ContentType.objects.get(model='task')
obj_mapping_list = ObjectMapping.objects.filter(content_type=content_type)

graph = Graph()

build_rdf(graph, object_to_serialize, obj_mapping_list, includemembers=True)

print(graph.serialize(format="turtle"))
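rdflib can also write the graph straight to a file via the destination argument of serialize, e.g.

graph.serialize(destination="task.ttl", format="turtle")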

Serialising using django views:

{SERVER_URL}/rdf_io/to_rdf/{model_name}/{model_id}
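Any HTTP client will do; a small sketch using the requests library (the host, the model name task and the id 1 are placeholders):

import requests

# fetch the RDF serialisation of a single object from the rdf_io view
response = requests.get("http://localhost:8000/rdf_io/to_rdf/task/1")
print(response.text)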

Deprecated

Marmotta LDP

e.g.

# RDF triplestore settings
RDFSTORE = { 
    'default' : {
        'server' : "".join((SITEURL,":8080/marmotta" )),
        # model and slug are special - slug will revert to id if not present
        'target' : "/ldp/{model}/{slug}",
        # this could be pulled from settings
        'auth' : ('admin', 'pass123')
        },
    # define special patterns for nested models
    'scheme' : {
        'target' : "/ldp/voc/{slug}",
        },
    'concept' : {
        'target' : "/ldp/voc/{scheme__slug}/{term}",
        }
}        
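For example, with the settings above a concept whose scheme has slug mythemes and whose term is widget (placeholder values) would be pushed to {SITEURL}:8080/marmotta/ldp/voc/mythemes/widget.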

If auto_publish is set in an ObjectMapping then the RDF-IO mapping is triggered automatically whenever an object of that type is saved.

A bulk load to the RDF store can be achieved with /rdf_io/sync_remote/{model}(,{model})*

Note that containers need to be created in the right order - so for the SKOS example this must be /rdf_io/sync_remote/scheme,concept

Design Goals

We have four types of apps then:
1) the master project
2) the RDF serializer utility
3) imported apps that have default RDF serializations
4) imported apps that may or may not have RDF serialisations defined in the project settings

I suspect that this may all be a fairly common pattern - but I've only seen far more heavyweight approaches to RDF trying to fully model RDF and implement SPARQL - all I want to do is spit some stuff out into an external triple-store.

default RDF serialisations are handled by loading initial_data fixtures. RDF_IO objects are defined using natural keys to allow default mappings for modules to be loaded in any order. It may be more elegant to use settings so these defaults can be customised more easily.

Signals are registered when an ObjectMapping is defined for a model.