
A Protocol of a Systematic Mapping Study for
Domain-Specific Languages
Tomaž Kosar¹, Sudev Bohra², Marjan Mernik¹

¹ University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova ulica 17, 2000 Maribor, Slovenia
² Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213-3890, USA
1. Introduction
This document describes the protocol for a Systematic Mapping Study (SMS)
for Domain-Specific Languages (DSLs) and is meant as an electronic supplement to the paper [17]. Because DSL research is spreading into many software development methodologies (e.g., Generative Programming, Product Lines, Software
Factories, Language-Oriented Programming, and Model-Driven Engineering),
vast areas of application domains (e.g., control systems, data-intensive applications,
embedded systems, security, simulation, testing, web), and different development approaches (e.g., external and internal DSLs), it is hard to obtain complete knowledge of the DSL research field and to foresee DSL research trends.
Indeed, there is a substantial body of publications on DSL research (see, for
example, some expert literature reviews from the past [6, 20]). Therefore, the
main objective is to perform an SMS [13, 23] on DSLs for a better understanding of the
DSL research field, identifying research trends and possible open issues. The
work on this SMS started in July 2013 and finished in October 2014 (with some
corrections from the period of May-August 2015).
Several guidelines were studied before starting the SMS on DSLs ([13, 23]), as
well as reports on other SMSs (e.g., [8, 2, 1, 19, 9, 10, 18]). The guidelines for
performing Systematic Reviews (SRs) in Software Engineering (SE) [13] outlined
three phases: planning the review, conducting the review, and reporting the
review. The sub-tasks of planning the review are [13]:
• Identification of the need for a review. For this SMS see the discussion in
Section 2.
• Commissioning a review. According to [13] this step is not required for a
research team undertaking a review for their own needs, or for a review being
undertaken by a student during his/her project or thesis (BSc., MSc.,
PhD.).
• Specifying the research questions. For this SMS see Section 3.
Preprint submitted to Elsevier
October 28, 2015
• Developing a review protocol. For this SMS see Section 4.
• Evaluating the review protocol. One of the benefits of SRs is that the
review protocol is precisely defined before the review is conducted. This
assures that the results will be unbiased and reliable (at least to some
extent). However, due to a lack of appropriate funding, the review protocol
will not be re-examined by a group of independent experts for this SMS.
The sub-tasks of conducting the review are [13]:
• Identification of relevant primary studies relating to the research questions
using the search strategy defined in the review protocol (see Section 4).
• Selection of primary studies based on previously defined inclusion/exclusion
criteria identifying those primary studies that provide direct evidence
about the research questions. Identified primary studies for this SMS
will be published on a project web-page.
• Study quality assessment, which investigates whether quality
differences provide an explanation for differences in study results (e.g.,
results from controlled experiments can be more trustworthy than those from
an observational study), or serves as a means of weighting the importance of individual studies when results are being synthesised. This step is optional
for SMSs and will be skipped in this SMS, too. On the other hand, only
peer-reviewed papers will be considered, and hence a weak form of quality
assessment will still be achieved in this SMS.
• Data extraction and monitoring, where the data extraction forms will be
designed to collect the information needed to address the review questions and the study quality criteria (if included in the study). The data
extraction form is presented in Section 5.
• Data synthesis summarizing the results of the included primary studies.
The sub-tasks of reporting the review are [13]:
• Specifying dissemination mechanisms. As this is an academic study, we
assume that dissemination of the results will be achieved by publishing them in a
scientific journal and summarising them on a web-page, where the basic results
can be found by practitioners.
• Formatting the main report, either as technical/project report, BSc/MSc./
PhD. thesis, conference or journal papers. We will submit the work to a
journal.
• Evaluating the report. This will be achieved in our case by submitting
the work to a peer reviewed journal.
The outlined three phases (planning the review, conducting the review, and
reporting the review) with the aforementioned sub-phases were later simplified
in [23] into five stages:
• defining research questions,
• conducting a search for primary studies,
• screening primary studies based on inclusion/exclusion criteria,
• classifying the primary studies, and
• data extraction and aggregation.
This simplified structure has been adopted by many researchers (e.g., [8, 24,
7, 19, 10, 18]), and even by the authors of the original guidelines [15]. Hence,
we have also adopted it. We will also take into account the good practices and
lessons learned from applying the SR process within an SE domain [4].
2. Identification of the need for a review
The first step in performing an SR, according to [13], is to ensure that an SR is
necessary in the first place. Researchers should identify and review any existing
SRs of the research topic under investigation. Although the annotated DSL bibliography [6] is not an example of an SR, it summarises more than 70 publications. Furthermore, DSL terminology, DSL advantages and disadvantages,
as well as DSL design methodologies and DSL implementation techniques, are
discussed there. This work can be classified as an ’expert literature review’
paper. Yet another ’expert literature review’ paper is our own survey paper
on DSLs [20] where more than 150 primary studies were used to classify DSL
research work, and to find patterns during various phases of DSL development
(decision, analysis, design, implementation). However, this work was published
10 years ago, in 2005, hence it is reasonable to ask what has been the research
space of the literature in the field of DSLs after publishing the survey paper on
DSLs. The first attempt at an SMS for DSLs was the work in [22], which unfortunately was unsatisfactory and needed to be repeated. In particular, we did not
find study [22] very useful, as the authors classified the primary studies by a
research focus derived from keywords found in the primary studies rather than from
already-established research foci within the DSL field. Although the authors
of [22] followed the guidelines on how to perform SMSs in SE [23], the outcome, a classification of research focus (in the authors’ words, a DSL research type)
into ADL (Architecture Description Language), DSAL (Domain-Specific Aspect
Language), DSML (Domain-Specific Modeling Language), external DSL, internal DSL, method or process, technique, and tools, is far from satisfactory.
Our SMS will not be an exact replication of the SMS in [22], due to differences in
research questions and classifications, as well as in the inclusion of primary studies
(we want to concentrate solely on DSLs, and DSMLs will be excluded). Overall,
there is a need for an SMS on DSLs (due to their broad nature) for summarising
recent knowledge about DSLs, in order to draw more general conclusions about
DSL research and to discover promising directions for DSL research in the near
future.
3. Specifying the research questions
In this section we report on research questions and the rationale behind them.
RQ1 : What has been the research space of the literature within the field of
DSLs since the survey paper on DSLs [20] was published 10 years ago?
RQ2 : What have been the trends and demographics of the literature within
the field of DSLs after the survey on DSLs [20] was published 10 years ago?
The research question RQ1 will further be split into three sub-questions.
RQ1.1 Type of contribution: What is the main contribution of DSL studies
with respect to techniques/methods, tools, processes, and measurements?
The particular study would fall into:
• the ’DSL development techniques/methods’ category if the study’s main
contribution were to be a technique/method of any DSL development
phase: domain analysis, design, implementation (e.g., DSL compiler), validation, and maintenance (e.g., DSL evolution).
• the ’DSL development tools’ category if the study’s main contribution
were to be a tool that supported one or more phases of DSL development
(domain analysis, design, implementation, validation, and maintenance).
• the ’DSL processes’ category if the study’s main contribution were to be
the description of a flow from one phase into another (e.g., how outputs
from one phase could be used as inputs for another phase), or the DSL development process were to be discussed within a wider context of software
engineering (e.g., integration within a larger project), or a particular DSL
process were to be described (DSL debugging, DSL testing, DSL usability
test).
• the ’DSL measurement’ category if the study’s main contribution were
to be a proposal or application of metrics regarding the effectiveness of
DSL approaches (e.g., measuring comprehensibilities of DSL programs,
measuring the productivities of DSL users).
Note that we will not differentiate between ’techniques’ and ’methods’ to
support easier classification and replication of this study as was also suggested
in [26]. The rationale for this decision is that the difference is subtle
and as such might cause problems during classification. Moreover,
even the authors of primary studies might use different criteria, and hence the
result of such classification would be rather unreliable. We will apply the following definition from [26]: “A method or technique is a solution to one or
more problems. It might be argued that a method is usually more related to a
process which involves humans in the loop while a technique is usually more focused on the automated part of such a process. As for distinguishing between a
method and a technique, the standard glossaries for software engineering, e.g.
IEEE 729 (IEEE 1983), do not allow a sharp distinction between ’method’ and
’technique’.” On the other hand, the SMS on DSLs conducted by [22] used the
following classification: method/process, technique, and tools. Moreover, this
classification was mixed within a research focus area: ADL, DSAL, DSML, external DSL, internal DSL. Overall, such a classification was rather unconvincing.
RQ1.2 Type of research: What types of research methods have been used
in DSL studies?
Many different proposals for classifying the types of research methods exist
(e.g., [25], [11], [27]), but most SMSs (e.g., [8, 21, 24, 22, 18]) have followed
the guidelines [23] for classifying the types of research methods defined in [27],
due to their simplicity. Some other possible classifications are presented in [25] (formal theory, design and modelling, empirical work, hypothesis testing, other) or
in [11] (conceptual analysis, conceptual analysis/mathematical, concept implementation (proof of concept), case study, data analysis, field study, laboratory
experiment, literature review/analysis, mathematical proof, and simulation).
Indeed, some SMSs (e.g., [1]) have used a partial classification from [11] and classified papers into: experiment, case study, conceptual analysis/mathematical,
descriptive, literature review, and survey. Such a classification might even be
harder to use. We have decided to use the classification suggested for
SMSs in [23], which is based on [27]:
• Opinion paper, which reports on authors’ opinions as to whether a certain
technique/method/tool/process/measurement is good or bad.
• Experience paper, which reports on authors’ experiences of certain techniques/methods/tools/processes/measurements, as used in practice.
• Philosophical/conceptual paper, which provides the taxonomy of a research field or a conceptual framework for structuring phenomena under
investigation/design.
• Solution proposal, which proposes a certain technique/method/tool/process/measurement as a solution to a particular problem and explains it
with a small example or a good line of argumentation. However, the technique/
method/tool/process/measurement has not yet been implemented.
• Validation research, where a certain technique/method/tool/process/measurement is implemented as a solution to a problem and validated by simulations, prototyping, experiments, systematic mathematical analysis,
mathematical proof of properties, etc. However, the implementation has
not yet been evaluated in practice.
• Evaluation research, where a certain technique/method/tool/process/measurement is implemented as a solution to a problem and validated in practice. Its benefits and drawbacks are evaluated by controlled experiments,
observational studies, or case studies.
However, we decided to include some improvements to the classification process
with the aim of making it more reliable and replicable. In [23] it
was suggested that the types of research methods could be further classified into
empirical research (validation research and evaluation research) and non-empirical
research (opinion paper, experience paper, philosophical/conceptual
paper, and solution proposal). However, it seems that this coarse-grained classification has not gained acceptance amongst SR researchers so far. In our opinion,
this broader classification is very useful for obtaining a broader picture of the
research field and is more reliable than the fine-grained classification.
For classification to be as uniform as possible, we have also specified sufficient conditions for each particular research type as an example of agreed
interpretation [28]. A sufficient condition for an experience paper is that the authors’ opinion was gained from practical experience. A sufficient condition for
a philosophical/conceptual paper is that, with the gained knowledge, the authors were
able to propose a taxonomy or a new conceptual framework. A sufficient condition
for a solution proposal is that, with the gained knowledge, the authors have proposed
a new solution, which has not yet been implemented. A sufficient condition for validation research is that the proposed new solution is implemented
and hence validated at least by prototyping. Finally, a sufficient condition for
evaluation research is that a solution has been evaluated in practice by a controlled experiment (randomised experiment or quasi-experiment), observational
study, or case study. These conditions might also create an ordering amongst research types: opinion paper, experience paper, philosophical/conceptual paper,
solution proposal, validation research, evaluation research. It might be argued
that such ordering of research types would be too simplistic or unsuitable. For
example, one might value validation research more than evaluation research,
as in the former a new solution (technique/method/tool/process/measurement)
for a problem has been invented and scientifically validated (e.g., with mathematical proof), whilst in the latter this already implemented solution has only
been validated in practice, finding its true benefits (or drawbacks) in practice.
As yet another example, one might value a philosophical/conceptual paper more
than a solution proposal, as developing a good taxonomy of a research field
requires complete knowledge and often some generalisation skills, whilst
proposing a certain technique/method/tool/process/measurement as a solution
to a particular problem requires no such complete understanding of a research field.
Our aim regarding the ordering is not to trigger such debates or to
imply such conclusions (e.g., that evaluation research is valued more than validation research). Instead, our ordering naturally reflects a research life-cycle,
which should end with validation in practice. This is vividly described
in [25] as: “Quite the contrary - new ideas are needed more than ever. But computer scientists must find out how good these ideas are and use experimentation
to guide them to the profitable ones.” We are convinced that such ordering of
research types by stating sufficient conditions, in addition to the two-level classification, would improve the reliability of research type classification and of SMSs
in the future.
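The sufficient conditions and the two-level (empirical/non-empirical) classification above can be sketched as a simple decision procedure. The boolean facts about a study used below are hypothetical names introduced purely for illustration:

```python
# Sketch of the two-level research-type classification, driven by the
# sufficient conditions stated above. Field names are hypothetical.
from dataclasses import dataclass

@dataclass
class StudyFacts:
    from_practical_experience: bool = False  # opinion gained in practice
    proposes_taxonomy: bool = False          # taxonomy/conceptual framework
    proposes_new_solution: bool = False      # new technique/method/tool/...
    solution_implemented: bool = False       # at least a prototype exists
    evaluated_in_practice: bool = False      # controlled experiment, case study, ...

def research_type(s: StudyFacts) -> tuple:
    """Return (fine-grained research type, coarse-grained classification)."""
    # Check the strongest sufficient condition first, following the ordering:
    # opinion < experience < philosophical < solution < validation < evaluation.
    if s.evaluated_in_practice:
        return ("evaluation research", "empirical")
    if s.proposes_new_solution and s.solution_implemented:
        return ("validation research", "empirical")
    if s.proposes_new_solution:
        return ("solution proposal", "non-empirical")
    if s.proposes_taxonomy:
        return ("philosophical/conceptual paper", "non-empirical")
    if s.from_practical_experience:
        return ("experience paper", "non-empirical")
    return ("opinion paper", "non-empirical")
```

For example, a study that proposes and prototypes a new technique, but reports no evaluation in practice, classifies as validation research and hence as empirical research.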
RQ1.3 Focus area: Which research topics have been investigated in DSL
studies?
The following DSL development phases based on a DSL survey paper [20]
will be included: domain analysis, design phase, implementation phase, validation phase, and maintenance phase.
The research question RQ2, which is about the trends and demographics of
DSL research, will be further split into four sub-questions.
• RQ2.1 Publication count by year: What is the annual number of publications within this field?
• RQ2.2 Top cited papers: Which DSL primary studies used in this SMS
are cited the most?
• RQ2.3 Active institutions: Rather than identifying individual researchers within
the DSL field, we opted for identifying DSL groups working at particular institutions, measured by the number of published
papers. How are DSL groups connected to each other (i.e., by co-authoring the
same primary studies)?
• RQ2.4 Top venues: Which venues (e.g., journals, conferences, workshops)
are the main targets of DSL papers? Note that there have been no specialised DSL journals or conferences spanning many years. Hence, it would
be interesting to know where DSL researchers have mostly published.
4. The protocol
According to Kitchenham and Charters [13], a protocol is “a plan that describes the conduct of a proposed systematic literature review.” In this section
the protocol of our SMS is described: in particular, which search string is used,
how the search for primary studies is conducted, what the inclusion and exclusion
criteria are, and the rules for classifying primary studies.
4.1. The Search String
There are many synonyms for DSL such as: application-oriented language,
special purpose language, specialised language, task-specific language, application language, and little language. In the guidelines [13] it is suggested that
synonyms should be used within the search string in order to broaden the literature search. In order to eliminate some threats to the validity of this study
due to possible omissions of synonyms within the search string, a pilot literature
search was performed during protocol design to verify whether the following synonyms had still been used in the research literature during the period 2006–2012.
The results showed that the following synonyms are more or less no longer used
(the increase in the number of hits was less than 0.05%): application-oriented
language, application language, and task-specific language. The following synonyms had been used rarely (increase in the number of hits of 1%-2%): little
language and special purpose language. The synonym ’specialised language’ had
been used more often than the other synonyms (increase in the number of hits of
5.5%). However, most of those publications did not describe DSL research (they
were from the linguistic field), and the relevant number of hits was lower than
for ’little language’. Because these synonyms are rarely used nowadays we will
exclude them within the search string. As the acronym ’DSL’ is omnipresent
nowadays and used in some papers without introduction we will include it within
the search string. As this SMS was to be started during the Summer of 2013, it
would be possible to also include some recent primary studies published in 2013. However, in this case replication of this
study would be extremely hard to achieve, because publications from the
second half of 2013 would have to be excluded manually. In view of
the aforementioned reasons it has been decided to use the following elementary
search string:
("domain-specific language" OR "DSL")
AND year > 2005 AND year < 2013
The rationale for the search string is now fully explained.
4.2. Conducting the search for primary studies
According to the guidelines in [13], all relevant studies should be found whilst
performing an SR; this recommendation was later relaxed for SMSs [14].
Indeed, as indicated in [28] it is more likely that we will only deal with a subset
of all relevant publications. In order to have a rough indication as to how many
relevant publications exist and to decide how to conduct the search, we did a
preliminary search on the following Digital Libraries (DLs) (Table 1), which
were available to us. Preliminary screening has also shown that most of them
are relevant primary studies. For such a broad topic as DSLs, we became
convinced that all relevant primary studies cannot be identified, and that
the inclusion of statistical methods would in any case be needed to produce
proper generalisation. Due to the broadness of publications on DSL
research, we decided that our search for primary studies, automatic or manual, will be based
on the margin of error (confidence interval) [5]. We will include DLs until the
requested margin of error (confidence interval) is smaller than or equal to 5%. The
margin of error in other sciences is commonly set between 5%-10% [3, 12, 16].
Hence, our search for primary studies will be driven by the level of precision.
When the margin of error is small enough, the data can be interpreted
with high confidence. A manual search can be added into the suggested
process at any time (as well as when all DLs are exhausted). In the case that the
specified margin of error cannot be achieved and no new primary studies can be
found by automatic or manual search, we will stop the search process and report
on the achieved margin of error. The suggested process is described in Figure 1.

Figure 1: SMS procedure. The flowchart proceeds as follows: defining research questions; defining a search string and inclusion/exclusion criteria; then, while more relevant publications can be found, selecting a new digital library or adding new publications by manual search, screening relevant publications, and classifying primary studies with data extraction; once the requested margin of error is achieved, aggregating results and reporting results.
In the above process the order of DLs is unspecified. We have decided to
start with the ISI Web of Science as the number of hits was the highest. If the
number of DLs from Table 1 proves insufficient with respect to the requested
margin of error, we will add new DLs (e.g., Google Scholar) or perform a
manual search.
The search string specified in Section 4.1 has been customised for particular
DLs as follows:
ISI Web of Science:

(((TS=("domain-specific language") OR TS=(DSL)
OR TI=("domain-specific language") OR TI=(DSL))
AND PY=(2006-2012)) AND SU=(Computer Science))

IEEE Xplore:

(((("Publication Title": domain-specific language) OR "Publication Title": DSL)
OR "Author Keywords": domain-specific language) OR "Author Keywords": DSL)
additional constraint: Publication Year: 2006-2012

ACM Digital Library:

("domain-specific language" OR DSL) and
(Keywords: "domain-specific language" OR Keywords: DSL)
additional constraint: Publication Year: 2006-2012

Science Direct:

pub-date > 2005 and pub-date < 2013 and
TITLE-ABSTR-KEY("domain-specific language") or TITLE-ABSTR-KEY(DSL)
additional constraint: [All Sources(Computer Science)]
4.2.1. Inclusion/exclusion criteria
The following inclusion/exclusion criteria have been defined. Similar criteria
can be found in many other SMSs (e.g., [8, 21, 24, 22, 18]).
The inclusion criteria:
• study must have addressed DSL research,
Table 1: Preliminary identification of relevant publications

Digital Library        accessible at                     no. of publications
ISI Web of Science     http://sub3.webofknowledge.com    792
IEEE Xplore            http://ieeexplore.ieee.org        527
ACM Digital Library    http://dl.acm.org                 361
Science Direct         http://www.sciencedirect.com      135
                                                       Σ 1815
• study must have been peer reviewed and published in a journal, conference, or
workshop,
• study must be written in English,
• study must be accessible electronically, and
• computer science literature.
The exclusion criteria:
• irrelevant publications that lie outside the core DSL research field, which
also excludes DSML, modelware, and MDE publications, visual/graphical
languages (based on graph-grammars or other formalisms), and publications mentioning a DSL only as future work;
• non-peer reviewed studies (abstracts, tutorials, editorials, slides, talks,
tool demonstrations, posters, panels, keynotes, technical reports);
• peer-reviewed but not published in journals, conferences, workshops (e.g.,
PhD thesis, books, patents);
• publications not in English;
• electronically non-accessible; and
• non-computer science literature.
The inclusion and exclusion criteria will be applied to the titles, keywords,
and abstracts. In those cases where it is not completely clear from the title,
keywords, and abstract whether a publication really addresses DSL research,
the publication will be included temporarily and might be excluded during the next phase (the classification phase), when the whole publication (not only
the abstract) is read. Hence, only publications that are clearly outside
the scope will be excluded during this phase, and possible mistakes made during
this phase will be largely eliminated. The search for primary studies and the application of the
inclusion/exclusion criteria will be done by the first and second authors.
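Applied to publication metadata, the criteria above amount to a screening predicate. The record fields in the sketch below are hypothetical; as described, a publication is only flagged as clearly out of scope when that is evident from the title, keywords, and abstract:

```python
# Sketch of the screening step; field names are hypothetical.
from dataclasses import dataclass

PEER_REVIEWED_VENUES = {"journal", "conference", "workshop"}

@dataclass
class Publication:
    title: str
    abstract: str
    venue_type: str             # e.g. "journal", "technical report", "thesis"
    language: str               # e.g. "en"
    accessible: bool            # electronically accessible
    computer_science: bool      # from a computer-science source
    clearly_out_of_scope: bool  # e.g. DSML/MDE-only, or DSL only as future work

def passes_screening(p: Publication) -> bool:
    """Keep a publication for the classification phase; only publications
    that are clearly outside the scope are excluded at this stage."""
    return (p.venue_type in PEER_REVIEWED_VENUES
            and p.language == "en"
            and p.accessible
            and p.computer_science
            and not p.clearly_out_of_scope)
```

Borderline publications pass this predicate and are re-examined later, when the whole text is read during classification.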
4.2.2. Classifying the papers
Classification is one of the most critical and time-consuming steps when conducting SMSs. In regard to our research questions, it is not expected that all
relevant information will be inferred from the abstracts. Hence, the classification
will be done based on reading the whole primary study. It was recommended
[13] that the classification of primary studies is done by at least two authors,
and in the case of disagreement the authors need to decide how to resolve it.
However, such a process only increases the reliability of classification inside the
group, where probably the opinions of the more senior researchers would prevail
anyway. It was shown in [28] that this process didn’t improve the reliabilities of
classifications between different research groups. In the case of two independent
SMSs on product line testing 22 out of 33 papers were classified differently [28].
Due to these reasons we decided that classification of the primary studies will
be done by the third author, who is the most experienced in terms of published
relevant DSL research publications. We are convinced that the reliability of
classifications between different research groups achieved in this manner will not be lower
than that reported in [28], and can only be increased by more precise guidelines on how
to classify primary studies (e.g., having a “standardized classification scheme with
an agreed interpretation” [28]). However, we acknowledge that this decision
might introduce some bias, which is a valid threat to validity. To mitigate this
threat, at least partially, some improvements to the classification process have
been suggested: two-level classification and the introduction of sufficient conditions
(see Section 3).
During this phase similar studies will also be identified (e.g., if a conference
paper had a more recent journal version we will include only the latter). For this
reason we will start classifying backwards, from 2012 to 2006. To avoid
fatigue during classification (a possible threat to validity), classification
will be performed in blocks of at most two hours, followed by breaks of at least
one hour.
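A minimal sketch of this deduplication step, under the simplifying assumption that similar versions share a normalised title (in practice this judgement will be made while reading the studies):

```python
def deduplicate(studies):
    """Keep only the most recent version of similar studies, scanning
    backwards from 2012 to 2006 (e.g., prefer a journal version over an
    earlier conference version). Matching by normalised title is a
    simplifying assumption for illustration."""
    seen = set()
    kept = []
    for study in sorted(studies, key=lambda s: s["year"], reverse=True):
        # Normalise the title: lowercase, keep only alphanumeric characters.
        key = "".join(ch for ch in study["title"].lower() if ch.isalnum())
        if key not in seen:  # the first (most recent) version wins
            seen.add(key)
            kept.append(study)
    return kept
```

Scanning from the most recent year downwards guarantees that when two similar versions exist, only the later one is kept.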
5. The extraction form
Based on research questions (Section 3) and on the protocol specified in
Section 4 the following extraction form has been proposed (Figure 2), which
indicates what data will be extracted.
To support this extraction form a web based application has been developed
(Figure 3).
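The data captured by the extraction form and stored by the web application can be summarised as one record per primary study. The field names below are assumptions derived from the research questions in Section 3, not the literal fields of the form in Figure 2:

```python
# Sketch of an extraction record; field names are hypothetical,
# derived from RQ1.1-RQ1.3 and RQ2.1-RQ2.4.
from dataclasses import dataclass, field

@dataclass
class ExtractionRecord:
    # Bibliographic data (RQ2.1, RQ2.4)
    title: str
    year: int
    venue: str
    # Trends and demographics (RQ2.2, RQ2.3)
    citations: int
    institutions: list = field(default_factory=list)
    # Classification (RQ1.1-RQ1.3)
    contribution_type: str = ""  # techniques/methods, tools, processes, measurement
    research_type: str = ""      # opinion paper ... evaluation research
    focus_area: str = ""         # domain analysis, design, implementation, ...
```

One such record per primary study is what the aggregation step of the SMS would consume.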
6. Divergences from the original Protocol
In our SMS [17] we slightly diverged from the protocol as described in Section
4. During the reviewing process of [17] a mistake was found in computing the
margin of error. This mistake was corrected in the period from May-August
2015, by classifying another 86 primary studies from ACM Digital Library, thus
achieving the requested margin of error. However, given that it was impossible
to ensure a random sample of papers, the ambition to use the margin of error
was discarded [17].

Figure 2: The extraction form for DSL systematic mapping study
Figure 3: Web application for DSL systematic mapping study
References
[1] A. Ampatzoglou, S. Charalampidou, I. Stamelos. Research state of the art
on GoF design patterns: A mapping study. Journal of Systems and Software,
86(7), 1945–1964, 2013.
[2] S. Barney, K. Petersen, M. Svahnberg, A. Aurum, H. Barney. Software
quality trade-offs: A systematic map. Information and Software Technology,
54(7), 651–662, 2012.
[3] J.E. Bartlett II, J.W. Kotrlik, C.C. Higgins. Organizational Research: Determining Appropriate Sample Size in Survey Research. Information technology,
learning, and performance, 19(1), 43–50, 2001.
[4] P. Brereton, B. Kitchenham, D. Budgen, M. Turner, M. Khalil. Lessons
from applying the systematic literature review process within the software
engineering domain. Journal of Systems and Software, 80(4), 571–583, 2007.
[5] W.G. Cochran. Sampling techniques. John Wiley & Sons, New York, NY,
1977.
[6] A. van Deursen, P. Klint, J. Visser. Domain-specific languages: an annotated
bibliography. ACM SIGPLAN Notices, 35(6), 26–36, 2000.
[7] F. Elberzhager, A. Rosbach, J. Munch, R. Eschbach. Reducing test effort: A
systematic mapping study on existing approaches. Information and Software
Technology, 54(10), 1092–1106, 2012.
[8] E. Engström, P. Runeson. Software product line testing – a systematic mapping study. Information and Software Technology, 53(1), 2–13, 2011.
[9] A.M. Fernández-Sáez, M. Genero, M.R.V. Chaudron. Empirical studies concerning the maintenance of UML diagrams and their use in the maintenance
of code: A systematic mapping study. Information and Software Technology,
55(7), 1119–1142, 2013.
[10] V. Garousi, A. Mesbah, A. Betin-Can, S. Mirshokraie. A Systematic Mapping Study on Web Application Testing. Information and Software Technology, 55(8), 1374–1396, 2013.
[11] R. Glass, V. Ramesh, I. Vessey. An analysis of research in the computing
disciplines. Communication of the ACM, 47(6), 89–94, 2004.
[12] G.D. Israel. Determining Sample Size. Program Evaluation and Organizational Development, PEOD-6, University of Florida, Institute of Food and
Agriculture Sciences, 1992.
[13] B. Kitchenham, S. Charters. Guidelines for performing systematic literature reviews in software engineering. EBSE Technical Report, Keele University,
2007.
[14] B. Kitchenham, D. Budgen, P. Brereton. The value of mapping studies-a
participant-observer case study. Proceedings of the 14th International Conference on Evaluation and Assessment in Software Engineering, (EASE’10),
pages 25-33, 2010.
[15] B. Kitchenham, D. Budgen, P. Brereton. Using mapping studies as the
basis for further research - A participant-observer case study. Information
and Software Technology, 53(6), 638–651, 2011.
[16] S.S. Kohles, J.B. Roberts, M.L. Upton, C.G. Wilson, L.J. Bonassar, A.L. Schlichting. Direct perfusion measurements of cancellous bone
anisotropic permeability. Journal of Biomechanics, 34(9), 1197–1202, 2001.
[17] T. Kosar, S. Bohra, M. Mernik. Domain-Specific Languages: A Systematic
Mapping Study. Information and Software Technology, Submitted, 2015.
[18] M.A. Laguna, Y. Crespo. A systematic mapping study on software product
line evolution: From legacy system reengineering to product line refactoring.
Science of Computer Programming, 78(8), 1010–1034, 2013.
[19] A. Mehmood, D.N.A. Jawawi. Aspect-oriented model-driven code generation: A systematic mapping study. Information and Software Technology,
55(2), 395–411, 2013.
[20] M. Mernik, J. Heering, A.M. Sloane. When and how to develop domainspecific languages. ACM Computing Surveys, 37(4), 316–344, 2005.
[21] P.A. da Mota Silveira Neto, I. do Carmo Machado, J.D. McGregor,
E.S. de Almeida, S.R. de Lemos Meira. A systematic mapping study of
software product lines testing. Information and Software Technology, 53(5),
407–423, 2011.
[22] L.M. do Nascimento, D. Leite Viana, P.A.M. Silveira Neto, D.A.O. Martins, V. Cardoso Garcia, S.R.L. Meira. A Systematic Mapping Study on
Domain-Specific Languages. Proceedings of the 7th International Conference
on Software Engineering Advances (ICSEA’12), pages 179–187, 2012.
[23] K. Petersen, R. Feldt, S. Mujtaba, M. Mattsson. Systematic Mapping
Studies in Software Engineering. Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering (EASE’08),
pages 71–80, 2008.
[24] I.F. Silva, P.A. da Mota Silveira Neto, P. O’Leary, E. Santana de Almeida,
S.R. de Lemos Meira. Agile software product lines: a systematic mapping
study. Software: Practice and Experience, 41(8), 899–920, 2011.
[25] W. Tichy, P. Lukowicz, L. Prechelt, E. Heinz. Experimental evaluation in
computer science: a quantitative study. Journal of Systems and Software,
28(1), 9–18, 1997.
[26] P. Tonella, M. Torchiano, B. Du Bois, T. Systä. Empirical studies in reverse engineering: state of the art and future trends. Empirical Software
Engineering, 12(5), 551–571, 2007.
[27] R. Wieringa, N. Maiden, N. Mead, C. Rolland. Requirements engineering paper classification and evaluation criteria: a proposal and a discussion.
Requirements Engineering, 11(1), 102-107, 2006.
[28] C. Wohlin, P. Runeson, P.A. da Mota Silveira Neto, E. Engstrom,
I. do Carmo Machado, E. Santana de Almeida. On the reliability of mapping
studies in software engineering. The Journal of Systems and Software, 86(10),
2594-2610, 2013.