mpf-buf-rdf
The following describes usage and an example of the mpf-buf-rdf component. This component can be placed almost anywhere in a pipeline to generate RDF metadata about the GstBuffer buffers arriving at its input. Combined with a "tee" component, it can associate metadata with buffers (audio, video, etc.) as they flow through a pipeline.
The main use of this component is in testing other components that handle metadata (e.g. mpf-rdf-mux, which allows multiplexing of RDF metadata from various sources), but it could also evolve into a useful component in its own right to allow generation of metadata from buffers generated by other components.
Issues:
This component generates very simple RDF output: each buffer is described by three RDF triples (index, timestamp, and length), plus a dcterms:type triple. The units for these quantities are not stated in the RDF itself: is "timestamp" in milliseconds, microseconds, nanoseconds, or some other unit? Is the index zero- or one-based? It might be better if these questions were addressed in the RDF itself, either with a schema of RDF triples issued at the start of the output to define the terms, or with a schema of triples registered with the pipeline through an event mechanism into some central store.
We are interested in discussing these issues and developing this and similar components to meet the needs of the community.
Simple Pipeline (mpf-buf-rdf/tests/test-1)
Use a fakesrc component to generate 10 buffers, each containing 10 bytes. Feed the source into the mpf-buf-rdf transformation to create GRDF data. Convert the RDF to textual RDF/N3 using the mpf-rdf-ton3 component, and sink it to standard output.
#!/bin/sh
#
# Fire up one data-buffer source, and pass it through the mpf-buf-rdf element.
# This element generates RDF descriptions of the buffer index, length, and timestamp, which are then converted
# into readable N3 format and sent to stdout.
#
src="fakesrc num-buffers=10 sizetype=2 sizemax=10 filltype=4 datarate=10 do-timestamp=true"
$prefix $args gst-launch$version \
$src ! mpf-buf-rdf name=rdf-a \
rdf-a.rdf ! queue ! mpf-rdf-ton3 ! filesink location=/dev/stdout
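The fakesrc settings in the script fix the timing of the test. As far as we understand fakesrc, setting datarate causes each buffer to be timestamped from the number of bytes already produced, so 10-byte buffers at 10 bytes per second should land one second apart. A minimal Python sketch of that arithmetic, assuming this behaviour (the function name is ours and is not part of MPF or GStreamer):

```python
# Sketch: expected buffer timestamps for the fakesrc settings used above.
# Assumption: with datarate set, fakesrc stamps each buffer with
# bytes_already_sent / datarate, expressed in nanoseconds.

NS_PER_SECOND = 1_000_000_000


def expected_timestamps(num_buffers, buffer_size, datarate):
    """Return the timestamp (in ns) expected for each buffer index."""
    return [
        (index * buffer_size * NS_PER_SECOND) // datarate
        for index in range(num_buffers)
    ]


if __name__ == "__main__":
    # num-buffers=10 sizemax=10 datarate=10, as in the src line above
    for index, timestamp in enumerate(expected_timestamps(10, 10, 10)):
        print(index, timestamp)
```

Changing datarate or the buffer size in the src line should scale the spacing accordingly; this is worth confirming against a real run before relying on it in a test.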
Expected Output
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
_:rdf-a0 <urn:rdf:appscio.com/ver_1.0/buffer/timestamp> "0".
_:rdf-a0 <urn:rdf:appscio.com/ver_1.0/buffer/length> "10".
_:rdf-a0 <urn:rdf:appscio.com/ver_1.0/buffer/index> "0".
_:rdf-a0 <dcterms:type> <urn:rdf:appscio.com/ver_1.0/buffer>.
_:rdf-a1 <urn:rdf:appscio.com/ver_1.0/buffer/timestamp> "1000000000".
_:rdf-a1 <urn:rdf:appscio.com/ver_1.0/buffer/length> "10".
_:rdf-a1 <urn:rdf:appscio.com/ver_1.0/buffer/index> "1".
_:rdf-a1 <dcterms:type> <urn:rdf:appscio.com/ver_1.0/buffer>.
...
_:rdf-a9 <urn:rdf:appscio.com/ver_1.0/buffer/timestamp> "9000000000".
_:rdf-a9 <urn:rdf:appscio.com/ver_1.0/buffer/length> "10".
_:rdf-a9 <urn:rdf:appscio.com/ver_1.0/buffer/index> "9".
_:rdf-a9 <dcterms:type> <urn:rdf:appscio.com/ver_1.0/buffer>.
Got EOS from element "pipeline0".
Execution ended after 2766741 ns.
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...
Output consists of RDF/N3 triples containing timestamp (nanoseconds), length (bytes) and buffer index (zero based). The subject (e.g. _:rdf-a9) is generated automatically by the GRDF software based on the name assigned to the mpf-buf-rdf component, and is unique for the run of the pipeline.
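When checking this output in a test, it can help to separate the triples from the gst-launch status lines and group them by buffer. The following Python sketch does that with a regular expression written against the sample output above; it is not a general N3 parser, and the function and variable names are our own.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   _:rdf-a1 <urn:rdf:appscio.com/ver_1.0/buffer/length> "10".
# Only literal-valued triples are captured; the dcterms:type triple,
# whose object is a URI, is skipped along with gst-launch status lines.
TRIPLE = re.compile(r'^(_:\S+)\s+<([^>]+)>\s+"([^"]*)"\s*\.\s*$')


def parse_buffers(lines):
    """Group literal triples by blank-node subject.

    Returns {subject: {property_name: int_value}}, where property_name
    is the last path segment of the predicate URN (e.g. "timestamp").
    """
    buffers = defaultdict(dict)
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match is None:
            continue  # status output or a non-literal triple
        subject, predicate, value = match.groups()
        buffers[subject][predicate.rsplit("/", 1)[-1]] = int(value)
    return dict(buffers)


SAMPLE = '''Setting pipeline to PLAYING ...
_:rdf-a1 <urn:rdf:appscio.com/ver_1.0/buffer/timestamp> "1000000000".
_:rdf-a1 <urn:rdf:appscio.com/ver_1.0/buffer/length> "10".
_:rdf-a1 <urn:rdf:appscio.com/ver_1.0/buffer/index> "1".
_:rdf-a1 <dcterms:type> <urn:rdf:appscio.com/ver_1.0/buffer>.
'''

if __name__ == "__main__":
    for subject, props in parse_buffers(SAMPLE.splitlines()).items():
        seconds = props["timestamp"] / 1e9  # nanoseconds to seconds
        print(subject, props["index"], props["length"], seconds)
```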
Properties
Element Properties:
  name      : The name of the object
              flags: readable, writable
              String. Default: null Current: "appscio-mpf-buf-rdf0"
  loglevel  : Logging level (0-3 = DEBUG .. ERROR)
              flags: readable, writable
              Integer. Range: 0 - 3 Default: 2 Current: 2
  key       : Key to use for rdf triples
              flags: readable, writable
              String. Default: null Current: null

'timestamp' in the ontology
I'd like to promote the discussion of 'timestamp' past being a lexical property of the blank node for each buffer and get to the idea of a reusable 'time' in the pipeline. (Which responds to the issue of "what units".)
We have two cases where we need to understand the meaning of a time property: in the pipeline itself, and in the metadata store to which we report out the pipeline metadata; we should seek ease of translation between them.
In a general metadata store I think we will want to use the W3C Time-OWL ontology (http://www.w3.org/TR/owl-time/), so it would be good if the semantics of the 'timestamp' fit the model of a Time-OWL 'Instant' ... Instant uses xsd:duration strings to represent times. Since you're using a lexical string to pass the timestamp, I would write them as 'PT0S' (for time zero) and 'PT9S' for 9 seconds (or, suppose you were at time 9 seconds, 12345 ns, then you would write 'PT9.000012345S'). This handles the question of units directly in the lexical string. (If it's something we do a *lot* of, we might want to have our extension of Instant include a #NS binary property, but that would be premature optimization right now.)
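To make the proposed lexical form concrete, here is a small Python sketch that turns a nanosecond timestamp, as currently emitted by the component, into an xsd:duration string expressed in seconds. The function name is hypothetical and nothing like it exists in the component today; it only illustrates the formatting.

```python
NS_PER_SECOND = 1_000_000_000


def ns_to_xsd_duration(timestamp_ns):
    """Format a non-negative nanosecond count as an xsd:duration in seconds."""
    if timestamp_ns < 0:
        raise ValueError("timestamp must be non-negative")
    seconds, remainder = divmod(timestamp_ns, NS_PER_SECOND)
    if remainder == 0:
        return f"PT{seconds}S"
    # Zero-pad the fraction to nine digits, then drop trailing zeros.
    fraction = f"{remainder:09d}".rstrip("0")
    return f"PT{seconds}.{fraction}S"


if __name__ == "__main__":
    for ns in (0, 9_000_000_000, 9_000_012_345):
        print(ns, ns_to_xsd_duration(ns))
```

Integer arithmetic is used throughout so that nanosecond precision is not lost to floating-point rounding.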
The other question is what namespace this goes in. We will do a distinct blog-based discussion post about this, but I think 'time' stuff belongs in the 'core', which we haven't codified (but it's time to do that ;)
As a peek ahead: we wouldn't expect 'core' to change frequently and we'd want it to be efficiently fetched, so it will use a 'hash' namespace (see http://www.w3.org/TR/2008/NOTE-swbp-vocab-pub-20080828/#hash) ... so, http://appscio.org/ontologies/ver_1.0/core#timestamp, and we can work tomorrow on what would be the right properties/restrictions.