PlanetData
Deliverable Review Form
Part I - Comments List
Deliverable name: Best practices on how to provide self-describing data
Deliverable number: D4.2
Month deliverable due: 24
Lead participant: KIT
Responsible person: Maribel Acosta
Other participants:
Reviewer: Oscar Corcho (UPM)
Sent for review (date): 7.10.2012
Sent back to authors (date): 13.10.2012
SCIENTIFIC
Each comment is marked (C)ompulsory, (H)ighly advisable or (O)ptional [1].

1. [C] The executive summary still lacks a set of “take-home” messages,
rather than simply being a summary of the contents that are dealt with
in the deliverable. This is also evident because the abstract is simply
an excerpt from the executive summary.

2. [C] The section on types of data is a bit too skewed towards what is
going to be the focus of the deliverable. Although that is ok, the
introduction to this section, or even its title, should state clearly
that you are not providing a complete overview of the different types of
data (e.g., it is strange not to see a data warehouse listed in table 1
as structured and static), but are focusing on types of open data, or
simply on the types of data that are the main focus of PlanetData. Even
so, the table is still a bit too incomplete. For instance, table 1 may
have a better caption like “Some examples of different types of data” to
suggest that it is not aiming at being exhaustive.

3. [C] The organisation of section 2 is a bit strange, since different
dimensions are often mixed together without a clear structure. For
instance, section 2.1 talks about structured data and divides it into
static and streaming, but then the content is also about semi-structured
data (e.g., second paragraph of the section), which should probably be a
section 2.2, while the current section 2.2 may be moved to 2.3. Besides,
the distinction between static and streaming may be moved out of this
section. I propose the following organisation:
2.1 Characteristics related to structuredness
2.1.1 Structured data
2.1.2 Semistructured data
2.1.3 Unstructured data (here I would leave all the properties that are
currently in sections 2.2.1, not as subsections but mainly as bullets,
summarising those sections 2.2.x).
2.2 Characteristics related to dynamicity
2.2.1 Static data
2.2.2 Streaming data

4. [O] Funnily enough, I do not agree anymore with the contents of the
reference on “Sequeda and Corcho [33]”. Anyway, I think that the
reference can be left there.

5. [H] I don’t understand “Arguably, text is structured sound”.

6. [H] Section 3. Where is the first definition taken from? Is there a
reference to it? Besides, the title could probably be just
“Self-describing data”, removing “Definition of”.

7. [H] I don’t understand the first sentence of section 3.1 “Based on the
creation of XML documents…”

8. [H] Rename the title of section 3.2, “Advantages”, to state what they
are advantages of. Besides, this section would read better if the
subsections are converted into bullet points.

9. [H] Section 4 is in general ok. Only minor issues: in section 4.5, and
beyond, when you talk about the different syntaxes of RDF, I would just
leave RDF/XML and Turtle. The others are not relevant any more. Still,
the section is not completely aligned with the types of data sources
that you have been mentioning in section 2.

10. [C] In section 5.1 the lifecycle stages that you mention are not the
same as those that you identify in section 4 (this also happens in
5.1.2, etc.). This misalignment should be solved or explained. For
instance, interlinking, which is important in Linked Data, is not
clearly expressed in section 4. In section 5.1.4 you may refer to other
RDB2RDF tools, such as the ones that are described in the W3C WG on test
cases. Besides, maybe you want to rename Google Refine to Open Refine,
or mention that this will be the name, as announced this last week.

11. [H] In section 5.3 I think that it would be good to complement
footnote 35 with a reference saying that “this work has been jointly
done by UPM and Universidad Simón Bolívar (Edna Ruckhaus)”. Besides,
didn’t Jean Paul send you some text on the AirQualityEgg work that we
have been doing with people at COSM? I will talk to him.

12. [H] Big space before the start of section 5.4.2.

13. [C] Figure 10 does not really present a list of available datasets in
Windows Azure, as commented in page 41.

14. [H] I think that section 5.4 would benefit a lot from a final table
comparing characteristics of the different data markets that have been
analysed.

15. [H] Page 50 is empty.

16. [C] Conclusions are ok as a summary. Where are the lessons learned?
Which are the next steps to follow?

[1] Do the authors have to address the comment in order to make the deliverable final (Compulsory)? Is it
advisable but not compulsory to address the comment to make the deliverable final (Advisable)? Is it a minor
comment that is optional to be addressed by the authors for the final version (Optional)?

ADMINISTRATIVE (e.g. layout problems (empty pages, track changes/comments
visible), broken links, missing sections (Introduction, Conclusion, etc.), incomplete TOC,
spelling/grammar mistakes)
Each comment is marked (C)ompulsory, (H)ighly advisable or (O)ptional.

1. [C] Make sure that you complete for the final version the missing
information in the frontpage and in the document information sheet.
They are all empty, including the abstract.

Deliverable D4.2
Part II – Summary
Overall marking: G
VG (very good) / G (good) / S (generally satisfactory) / P (poor)
Comments
The deliverable has improved with respect to v0.1, which I read at Month 24. It now provides a better
analysis of the main requirements from open data scenarios in terms of self-descriptiveness, both for
publishers and consumers, with some current real examples on Linked Data, on Linked Sensor
Data, on news, and also on data marketplaces. I would have expected some additional
contributions on data licenses, privacy constraints and provenance, which are only briefly
mentioned in some places of the document, but I reckon that according to the description of this
deliverable in the DoW they are not compulsory.
After addressing the Quality Assessor’s comments, report back to him/her re-using this review form.