About

LSD Dimensions is an observatory of the current usage of Data Structure Definitions (DSDs), dimensions and codes in Linked Statistical Data (LSD).

LSD Dimensions is an aggregator of all qb:DataStructureDefinition and qb:DimensionProperty resources (and their associated triples), as defined in the RDF Data Cube vocabulary (the W3C Recommendation for publishing statistical data on the Web), that can currently be found in the Linked Data Cloud (read: the SPARQL endpoints in Datahub.io). Its purpose is to improve the reusability of statistical dimensions, codes and concept schemes in the Web of Data by providing an interface for users (future work: also for programs) to search for resources commonly used to describe open statistical datasets. Gathering all DSDs in one place is intended to support the development of ways to enhance the comparability of Linked Statistical Data by leveraging semantically rich descriptions of cube components.
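As a rough illustration of the kind of request such an aggregator could send to each endpoint, the sketch below builds a SPARQL query that lists Data Structure Definitions together with the dimensions they declare. It is a minimal sketch and not necessarily the query LSD Dimensions itself runs: the function names, the LIMIT parameter and the example.org endpoint are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Namespace of the RDF Data Cube vocabulary.
QB = "http://purl.org/linked-data/cube#"


def dimension_query(limit=100):
    """Build a SPARQL query listing DSDs and the dimensions they declare."""
    return f"""PREFIX qb: <{QB}>
SELECT DISTINCT ?dsd ?dimension WHERE {{
  ?dsd a qb:DataStructureDefinition ;
       qb:component ?component .
  ?component qb:dimension ?dimension .
}} LIMIT {int(limit)}"""


def request_url(endpoint, query):
    """Encode the query as a GET request, as in the SPARQL protocol."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical endpoint, used here for illustration only.
url = request_url("http://example.org/sparql", dimension_query(limit=10))
print(url)
```

Repeating the same request against every known endpoint and merging the results would give a combined view of the dimensions in use.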

Motivation

RDF Data Cube (QB) has boosted the publication of Linked Statistical Data (LSD) as Linked Open Data (LOD) by providing a means "to publish multi-dimensional data, such as statistics, on the web in such a way that they can be linked to related data sets and concepts". QB defines cubes as sets of observations affected by dimensions, measures and attributes. For example, the observation "the measured life expectancy of males in Newport in the period 2004-2006 is 76.7 years" has three dimensions (time period, with value 2004-2006; region, with value Newport; and sex, with value male), a measure (population life expectancy) and two attributes (the units of measure, years; and the metadata status, measured, to make explicit that the observation was measured instead of, for instance, estimated or interpolated). In some cases, it is useful to also define codes, a closed set of values taken by a dimension (e.g. sensible codes for the dimension sex could be male and female).
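The example observation above can be written down as a small data structure to make the three roles explicit. This is only an illustrative sketch in plain Python, not QB's RDF representation; the class name, field names and the SEX_CODES set are assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    dimensions: dict  # identify what the observation is about
    measure: tuple    # (name, value): the phenomenon observed
    attributes: dict  # qualify how the value should be interpreted


obs = Observation(
    dimensions={"time period": "2004-2006", "region": "Newport", "sex": "male"},
    measure=("population life expectancy", 76.7),
    attributes={"unit of measure": "years", "metadata status": "measured"},
)

# A code list is a closed set of values that a dimension may take.
SEX_CODES = {"male", "female"}
assert obs.dimensions["sex"] in SEX_CODES
```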

LSD can be published about a vast diversity of domains, and many dimensions and codes are heterogeneous, domain-specific and hardly comparable. To accommodate these, QB allows users to mint their own URIs to create arbitrary dimensions and associated codes. Conversely, other dimensions and codes are quite common in statistics and could easily be reused. However, publishers of LSD have no means to monitor the dimensions and codes currently used in other datasets published in QB as LOD, and consequently they can neither (a) link to them nor (b) reuse them.

This is the motivation behind LSD Dimensions: it monitors the usage of existing dimensions and codes in LSD. It allows users to browse, search and gain insight into these dimensions and codes. By depicting the diversity of statistical variables in LOD, it aims to improve their reusability.

How Does It Work?

Read our paper.

Future Extensions

  • Display not only the codes that are defined, but also the codes actually used in qb:Observation resources.

  • Model the retrieved data in RDF and serve it via a SPARQL endpoint.

  • Add other interesting dimension metadata (such as rdfs:subPropertyOf or rdfs:range).

  • Run interesting data analyses on all dimensions.