wwwL(inked)

Linked data, RDF, SPARQL and the future

Presentation by TheodorosPloumis / @theoploumis

Meetup No 29 - 16 March 2016 - TechMinistry.

Under Attribution 4.0 International license.

What is Semantic Web

  • Humans can understand web content
  • Machines (computers) need some help
  • In a semantic environment, machines can understand the meaning of a thing (document, port, text, system status, etc.) and link to or from it.
  • Metadata are a type of semantics

What do Humans read


            Drupal meetup Thessaloniki will take place
            at Techministry
            on Wednesday 16 March 2016
            from 19:00 to 21:00.

          

What do Machines read


            Word word link word word word
            word link
            word link
            word link.

          

What should Machines read


    -> Event description with url http://mydrupal.gr/node/2881

    Word word Link of a City word word word
    word Link of a Place
    word link of a Date
    word link of a 24h Time.
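
One way to hand machines the event in this form is a JSON-LD document. The sketch below is only an illustration: the choice of schema.org terms, the `+02:00` time zone offset and the helper name `to_json_ld` are assumptions, not part of the original page.

```python
import json

# Hypothetical description of the meetup using schema.org terms.
# The "+02:00" offset (Greek winter time) is an assumption.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "@id": "http://mydrupal.gr/node/2881",
    "name": "Drupal meetup Thessaloniki",
    "location": {
        "@type": "Place",
        "name": "Techministry",
        "address": {"@type": "PostalAddress", "addressLocality": "Thessaloniki"},
    },
    "startDate": "2016-03-16T19:00:00+02:00",
    "endDate": "2016-03-16T21:00:00+02:00",
}


def to_json_ld(data: dict) -> str:
    """Serialize the description as pretty-printed JSON."""
    return json.dumps(data, indent=2, ensure_ascii=False)


print(to_json_ld(event))
```

Every value a human reads as plain words (city, place, date, time) now carries an explicit type or property name.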

          

What is Linked Data

  • Really connected structured web
  • Bizer, Heath and Berners-Lee (2009)
  • Internet of Things not only of documents
  • Linked Data may not be Open Data
  • Automatic decisions
  • Find information faster, easier and with accuracy
  • The basic 'engine' behind Big Data
  • The key for Open Government/Cities/Health/...

The 5th star of Open Data

Linked data simplified

One giant global graph

Source by Manu Sporny

Tech behind linked data

  • URI (Uniform Resource Identifier)
  • RDF (Resource Description Framework)
  • http
  • Vocabularies - Ontologies
  • SPARQL

The 4 principles of Linked Data

Tim Berners-Lee, 2006

  • Use URIs to name (identify) things
  • Use HTTP URIs so that these things can be looked up (interpreted, dereferenced)
  • Provide useful information about what a name identifies when it's looked up, using open standards such as RDF, SPARQL etc
  • Refer to other things using their HTTP URI-based names when publishing data on the Web
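
The second and third principles can be sketched as an HTTP lookup that asks for an RDF representation through content negotiation. This is a minimal sketch: the function names are hypothetical, `text/turtle` is just one possible media type, and whether a server honours the `Accept` header depends on the publisher.

```python
import urllib.request


def build_lookup_request(uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Prepare an HTTP GET that asks for an RDF representation of `uri`."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def dereference(uri: str, timeout: float = 10.0) -> str:
    """Send the request and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_lookup_request(uri), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Only build the request here; calling dereference() would hit the network.
request = build_lookup_request("http://www.example.org/")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```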

RDF

  • A graph data model for describing things
  • Describe the relationships between things
  • W3C specification (first draft 1997, Recommendation 1999)
  • Only describes resources
  • It's a data model, a concept not a data format
  • It needs serialization
  • Current version RDF 1.1 (2014) after 1.0 (2004)

The RDF triples

  • [subject] [predicate] [object]
  • "John knows Mary"
  • We must state explicitly the nature of the connection
  • We can download or query triples from triplestores
  • Types of RDF triples: Literal Triples and RDF Links
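
A toy in-memory store shows the idea of triples and of querying them by pattern. It is a sketch only: the `example.org` IRIs and the `match` helper are made up for illustration, and a real triplestore does far more.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical IRIs for the people; knows and name are FOAF properties.
JOHN = "http://example.org/john"
MARY = "http://example.org/mary"
KNOWS = "http://xmlns.com/foaf/0.1/knows"
NAME = "http://xmlns.com/foaf/0.1/name"

STORE: List[Triple] = [
    (JOHN, KNOWS, MARY),     # RDF link: the object is another resource
    (JOHN, NAME, '"John"'),  # literal triple: the object is a plain value
    (MARY, NAME, '"Mary"'),
]


def match(
    store: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    pattern = (subject, predicate, obj)
    return [
        triple
        for triple in store
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


print(match(STORE, subject=JOHN, predicate=KNOWS))
print(match(STORE, predicate=NAME))
```

Leaving a position as `None` plays the role a variable plays in a query.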

The RDF Anatomy

RDF - URI


      Absolute IRI which may include a # fragment.
      <http://www.example.org/>
      <http://www.example.org/#fragment>

      Relative IRI resolved against base IRI.
      <abc.rdf>

      Base IRI, usually the query document IRI
      <>

      IRI shorthand using XML-style prefix ex and local name.
      Declared with PREFIX (SPARQL) or @prefix (Turtle)
      ex:name
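
Resolving a relative reference against a base can be sketched with the standard library. The base IRI below is hypothetical, and `urljoin` implements URL reference resolution, which is assumed here to be close enough for plain ASCII IRIs.

```python
from urllib.parse import urljoin

# Hypothetical base IRI, standing in for the address of the document being read.
BASE_IRI = "http://www.example.org/data/"


def resolve(reference: str, base: str = BASE_IRI) -> str:
    """Resolve a (possibly relative) IRI reference against the base IRI."""
    return urljoin(base, reference)


print(resolve("abc.rdf"))                   # relative IRI
print(resolve(""))                          # <> is the base IRI itself
print(resolve("http://www.example.org/"))   # absolute IRIs are left alone
```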
    

RDF - Literal


      A Unicode string with an optional language tag.
      "hello"
      "bonjour"@fr
      "1234"
          

RDF - Typed Literal

Literals with an XML schema datatype

      A Unicode string and datatype IRI for encoding datatypes.
      "1234"^^<http://www.w3.org/2001/XMLSchema#string>

      Abbreviated with an XML QName style as:
      "1234"^^xsd:string

      Short forms for several common datatypes:
      -10
      "-10"^^xsd:integer


      1.2345
      "1.2345"^^xsd:decimal

      true
      "true"^^xsd:boolean
          
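A consumer usually turns the lexical form plus datatype IRI into a native value. The mapping below is a rough sketch covering only the four datatypes shown above; the names `CONVERTERS` and `to_python` are hypothetical, and Python's `int` accepts a few forms that `xsd:integer` does not.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"


def parse_boolean(lexical: str) -> bool:
    """xsd:boolean allows the lexical forms true, false, 1 and 0."""
    if lexical in ("true", "1"):
        return True
    if lexical in ("false", "0"):
        return False
    raise ValueError(f"not an xsd:boolean: {lexical!r}")


# Datatype IRI -> function that converts the lexical form.
CONVERTERS = {
    XSD + "string": str,
    XSD + "integer": int,
    XSD + "decimal": Decimal,
    XSD + "boolean": parse_boolean,
}


def to_python(lexical: str, datatype: str = XSD + "string"):
    """Convert a typed literal; unknown datatypes keep their lexical form."""
    converter = CONVERTERS.get(datatype)
    return lexical if converter is None else converter(lexical)


print(to_python("1234"))
print(to_python("-10", XSD + "integer"))
print(to_python("1.2345", XSD + "decimal"))
print(to_python("true", XSD + "boolean"))
```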

Taxonomies, Vocabularies and Ontologies

  • Domain-specific terms for describing classes (groups) of things and how they relate to each other
  • Lightweight ontologies in RDF are often referred to as vocabularies
  • They borrow classes/properties from each other
  • We can create our own, extend or reuse ontologies
  • Popular: Schema.org, SKOS, FOAF, DCMI, SIOC etc.
  • Overview of Linked Open Vocabularies at LOV

RDF - Namespaces and Prefixes

Namespace             Prefix  Namespace URI
RDF                   rdf:    http://www.w3.org/1999/02/22-rdf-syntax-ns#
Dublin Core           dc:     http://purl.org/dc/elements/1.1/
FOAF                  foaf:   http://xmlns.com/foaf/0.1/
XML Schema Datatypes  xsd:    http://www.w3.org/2001/XMLSchema#
RDFS                  rdfs:   http://www.w3.org/2000/01/rdf-schema#
OWL                   owl:    http://www.w3.org/2002/07/owl#

Find prefixes at prefix.cc
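
Expanding a prefixed name is plain string concatenation of the namespace URI and the local name. A small sketch using the namespaces listed above; the helper names `expand` and `shorten` are hypothetical.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand(prefixed_name: str, prefixes: dict = PREFIXES) -> str:
    """Turn a prefixed name such as foaf:name into a full IRI."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return prefixes[prefix] + local_name


def shorten(iri: str, prefixes: dict = PREFIXES) -> str:
    """Turn a full IRI back into a prefixed name when a namespace matches."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri


print(expand("foaf:name"))
print(shorten("http://www.w3.org/2001/XMLSchema#integer"))
```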

The RDF Graph

  • A collection of statements about a thing
  • Starting with the same Subject
  • The URIs occurring as subject and object are the nodes in the graph
  • A real example of a webpage
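
How triples turn into a graph can be sketched by collecting nodes and labelled edges. The `example.org` IRIs and the `build_graph` helper are assumptions for illustration, and only IRI objects are used to keep the sketch short.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Hypothetical statements; each one becomes an edge in the graph.
TRIPLES = [
    ("http://example.org/john", KNOWS, "http://example.org/mary"),
    ("http://example.org/john", KNOWS, "http://example.org/anna"),
    ("http://example.org/mary", KNOWS, "http://example.org/anna"),
]


def build_graph(triples) -> Tuple[Set[str], Dict[str, List[Tuple[str, str]]]]:
    """Return the node set and, per subject, its outgoing (predicate, object) edges."""
    nodes: Set[str] = set()
    edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in triples:
        nodes.update((subject, obj))           # subjects and objects are nodes
        edges[subject].append((predicate, obj))  # the predicate labels the edge
    return nodes, dict(edges)


nodes, edges = build_graph(TRIPLES)
print(sorted(nodes))
print(edges["http://example.org/john"])
```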

Serializing RDF data

Example RDFa: demos, real webpage, webpage parsed
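
RDFa is one serialization among several. As a contrast, a line-based format such as N-Triples can be sketched in a few lines. The helper names are hypothetical, the `example.org` IRI is made up, and only the most common string escapes are handled.

```python
from typing import Optional


def iri(value: str) -> str:
    """Write an IRI term."""
    return f"<{value}>"


def literal(text: str, lang: Optional[str] = None, datatype: Optional[str] = None) -> str:
    """Write a literal term with an optional language tag or datatype IRI."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    term = f'"{escaped}"'
    if lang:
        return f"{term}@{lang}"
    if datatype:
        return f"{term}^^{iri(datatype)}"
    return term


def n_triple(subject: str, predicate: str, obj: str) -> str:
    """Join three already written terms into one statement line."""
    return f"{subject} {predicate} {obj} ."


john = iri("http://example.org/john")
print(n_triple(john, iri("http://xmlns.com/foaf/0.1/name"), literal("John")))
print(n_triple(john, iri("http://xmlns.com/foaf/0.1/name"), literal("Jean", lang="fr")))
```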

SPARQL 1.1 (2013)

SPARQL - Reference synopsis

Patterns              Modifiers  Query Forms
RDF terms             DISTINCT   SELECT
triple patterns       REDUCED    CONSTRUCT
Basic graph patterns  PROJECT    DESCRIBE
Groups                ORDER BY   ASK
OPTIONAL              LIMIT
UNION                 OFFSET
GRAPH
FILTER

SPARQL - Common Syntax


    # prefix declarations
    PREFIX foo: <http://example.com/resources/>
    ...
    # result clause
    SELECT ... ?variables
    # dataset definition
    FROM ...
    # query pattern
    WHERE {
        ... ?variables
    }
    # query modifiers
    ORDER BY ...
    LIMIT n OFFSET m
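
Sending such a query to an endpoint and reading the answer might look like the sketch below. The endpoint URL is hypothetical, the sample response is hand-written in the SPARQL JSON results layout, and nothing is sent over the network.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://example.org/sparql"  # hypothetical endpoint

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
    ?person foaf:name ?name .
}
ORDER BY ?name
LIMIT 10 OFFSET 0
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Prepare a GET request that carries the query in the `query` parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})


def rows(results_json: str) -> list:
    """Flatten a JSON result document into one dict of plain values per solution."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return [{name: term["value"] for name, term in binding.items()} for binding in bindings]


# Hand-written sample response, used instead of a live endpoint.
SAMPLE = json.dumps({
    "head": {"vars": ["person", "name"]},
    "results": {"bindings": [{
        "person": {"type": "uri", "value": "http://example.org/john"},
        "name": {"type": "literal", "value": "John"},
    }]},
})

print(build_request(ENDPOINT, QUERY).full_url)
print(rows(SAMPLE))
```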
          

SPARQL - Examples

SPARQL - Tips

  • Same results can be achieved with different queries
  • More accurate queries are faster
  • Prefer Group Graph patterns instead of FILTER
  • Avoid SELECT *
  • Avoid ORDER BY
  • Avoid DISTINCT (use REDUCED)
  • Use OFFSET, LIMIT to paginate results
  • Variable names are case sensitive (?var is not the same as ?VAR); keywords such as SELECT are case insensitive
  • A variable must keep the same name throughout a query
  • Variables can start with ? or $
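
Pagination with LIMIT and OFFSET can be sketched as a generator of page queries. The query text and the `paginate` helper are illustrative assumptions. The base query keeps an ORDER BY because, without a predictable order, successive pages are not guaranteed to be consistent, so this is one case where the cost of sorting is usually accepted.

```python
from typing import Iterator

BASE_QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
    ?person foaf:name ?name .
}
ORDER BY ?name"""


def paginate(base_query: str, page_size: int, pages: int) -> Iterator[str]:
    """Yield one query per page by appending LIMIT and OFFSET."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for page in range(pages):
        yield f"{base_query}\nLIMIT {page_size} OFFSET {page * page_size}"


for query in paginate(BASE_QUERY, page_size=10, pages=3):
    print(query.splitlines()[-1])
```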

SPARQL - Cheatsheets

The problems of Linked Data

  • Confusion of standardization (see W3C)
  • Hard for developers to understand
  • Which schema and ontology to use
  • Need for powerful/special software
  • Missing metadata for the datasets
  • Cannot really rely on endpoints (see SPARQL endpoint status from Datahub.io linked datasets)
  • Quality and accuracy of the data

The (linked) future

Follow them

Q&A

TheodorosPloumis.com

@theoploumis

Presentation code: github.com/theodorosploumis/linked-data