JOURNAL OF MULTIMEDIA, VOL. 5, NO. 4, AUGUST 2010
343
Baseball Digest Production System using
Inductive Logic Programming
Masaru Miyazaki, Masahiro Shibata, and Nobuyuki Yagi
Science and Technology Research Laboratories
Japan Broadcasting Corporation (NHK), Tokyo, JAPAN
Email: {miyazaki.m-fk, shibata.m-mg, yagi.n-iy}@nhk.or.jp
Abstract— We propose a technique for acquiring knowledge for baseball digest video production using an inductive inference approach. We integrated inductive logic programming (ILP) with baseball game metadata to enable learning of highlight scene definitions from digest videos produced by TV directors. ILP is a learning method at the intersection of machine learning and logic programming, and an ILP processor can acquire a highlight scene definition by inductive learning from scenes selected as highlights in sports news. This technique makes it possible to generate a semantic digest automatically, including not only scoring scenes but also attractive scenes that reflect the director's intention.
Index Terms— sports digest, inductive logic programming,
ontology, semantic web, Web Ontology Language, OWL
I. INTRODUCTION
When program directors make sports digests, they must select scenes considered attractive and edit them while watching the game video. This requires excellent knowledge of both highlight scene selection and the sport being played, such as baseball, as well as advanced editing skills. A system that extracts highlight scene candidates automatically and generates digest videos would therefore make program production more efficient.
This paper describes the trial production of a digest production system that can make baseball digest videos easily by inductive learning from information on director-selected highlight scenes. Section II introduces related work on automatic digest video generation in comparison with our proposed method. Section III describes a technique to extract highlight scenes by inductive learning: the first half introduces the baseball ontology and the Web Ontology Language (OWL) metadata framework we proposed, and the second half presents a technique to learn the concept of highlight scenes using the OWL metadata. Section IV describes the functions of our digest production system. Section V presents a system evaluation experiment using actual game data, and Section VI discusses the results of the experiment and future tasks.
This paper is based on “Baseball Digest Production System using
Inductive Logic Programming,” by Masaru Miyazaki, Masahiro Shibata, Nobuyuki Yagi, which appeared in the Proceedings of the IEEE
International Workshop on Data Semantics for Multimedia Systems
and Applications (DSMSA), Berkeley, USA, December 2008. © 2008 IEEE.
© 2010 ACADEMY PUBLISHER
doi:10.4304/jmm.5.4.343-351
II. RELATED WORK
For baseball, techniques that extract highlight scenes from audio and video data to allow automatic generation of digest video have been proposed. Rui et al. [12] focused on detecting highlights using the announcer's excited speech and bat-and-ball impact sounds. Xiong et al. [14] used MPEG-7 audio features and an entropic-prior HMM to detect audience applause and cheering. Han et al. [2] integrated multimedia features (image, audio, and speech cues) using a maximum-entropy-based method to detect and classify highlights. However, these techniques involve only the extraction of exciting scenes and do not provide a means of fully utilizing the director's knowledge.
Scene extraction techniques that use domain knowledge about the sport have also been proposed. Lao et al. [5] used audio cues and context knowledge to detect and classify various tennis events; however, the context knowledge describes only the hierarchical structure of a tennis game, based on heuristic rules for tennis event detection. Hashimoto et al. [4] [3] proposed a technique that uses particular game events as parameters and calculates the importance of the situation before an event happens. Situation importance makes it possible to judge whether a scene represents a chance of scoring, but the calculation method of the importance was fixed. Different directors have different criteria for judging whether a scene is important. Our digest production system uses the directors' knowledge to acquire and accumulate these criteria.
III. DIGEST SCENE EXTRACTION USING INDUCTIVE LEARNING
The use of metadata describing game situations is effective for automatic digest production. The metadata include tags that indicate scene information, such as hit, home run, etc., so digest video can be generated automatically by collecting these batting scenes. However, digest videos actually produced by directors often include not only home run scenes but also scoring chances, outs, and other scenes that may attract TV viewers. To extract highlight scenes that would excite the audience, a technique is required to judge the importance of a scene from the game situation.
A. OWL Metadata Framework
We previously proposed a metadata system that can add higher-level semantic information to game metadata automatically [6], applying ontology technology commonly used in semantic web research to baseball game metadata. In this system, all metadata are described as instances of the Web Ontology Language (OWL) [1], and a scene including one pitching event and one batting event is treated as the minimum scene unit (sceneUnit) in the metadata. We then add the time code, event information, ball/strike/out count, score, runner information, etc., to each scene unit. Fig. 1 shows part of the game metadata corresponding to the situation shown in Table I. This scene was selected as a highlight scene by the director.
Metadata describing the baseball game situation are produced manually in real time by well-trained annotators at sports data companies. These metadata are distributed commercially and used by broadcasting stations, sports web sites, etc. By translating the distributed metadata, it is easy to produce the OWL metadata shown in Fig. 1.

TABLE I.
GAME SITUATION OF SCENEUNIT0110102.

sceneUnit0110102
Batter ID       player11549
Pitcher ID      player11034
Ball Count      1
Strike Count    2
Out Count       1
Pitching        Screwball
Runner          First
Pitching No.    4
Batting Result  Homerun
...             ...

<rdf:RDF xmlns:rdf="http://www.w3.org/...rdf-syntax-ns#"
xmlns:bo="http://www.nhk.or.jp/baseballont.owl#"
......
<owl:Ontology rdf:about=.....>
<owl:imports rdf:resource="http://../baseballont.owl"/>
</owl:Ontology>
......
<bo:SceneUnit rdf:ID="sceneUnit0110102" highlight="TRUE">
<bo:isSceneUnit rdf:resource="#atBat_01101"/>
<bo:pitchingNo rdf:datatype="...">4</bo:pitchingNo>
<bo:strike rdf:datatype="...">2</bo:strike>
<bo:ball rdf:datatype="...">1</bo:ball>
<bo:out rdf:datatype="...">1</bo:out>
<bo:onBase rdf:resource=".../bbont.owl#first"/>
<bo:time rdf:datatype="..">2008-01-10T18:12:27</bo:time>
......
<bo:hasEvent>
<bo:Screwball rdf:ID="screwball_0110102_1">
<bo:agent rdf:resource=".../player.owl#player11034"/>
<bo:course rdf:resource=".../b_ont.owl#high_and_in"/>
</bo:Screwball>
</bo:hasEvent>
<bo:hasEvent>
<bo:HomeRun rdf:ID="homeRun_0110102_2">
<bo:agent rdf:resource=".../player.owl#player11549"/>
<bo:causedBy rdf:resource="#screwball_0110102_1"/>
</bo:HomeRun>
</bo:hasEvent>
......
</bo:SceneUnit>
As mentioned above, an attractive digest cannot be
produced by a technique that simply collects scenes, such
as hit scenes, using game metadata. In our system, game
situations of the highlight scenes are described as OWL
classes of ontology in advance. Input game metadata are
analyzed automatically using this ontology. The system
can add higher-level information, such as “chance for
outcome reversal,” to the metadata, and accumulate it into
the game knowledgebase.
B. Baseball Ontology
Our baseball ontology includes about 200 base classes
and 140 properties. Fig. 2 shows part of the baseball
ontology. It mainly includes upper-lower (i.e., subClassOf) relations and definitions of exciting game situations.
Complicated situations in the game can be described by
the director by combining these concepts. Fig. 3 shows an
example of a complex concept in our baseball ontology. It
defines the highlight scene “chance for outcome reversal”
in OWL. Using such high-level information, the system
can easily extract the complicated situations as highlight
scenes.
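To make the semantics of such a definition concrete, the "chance for outcome reversal" class of Fig. 3 can be approximated procedurally. The following is a toy Python sketch over plain feature dictionaries, not the system's actual OWL reasoning; the field names (onBase, difOfScore, events) merely mirror the properties in the figure.

```python
def chance_for_reversal(scene):
    """Approximate the Fig. 3 definition: a runner in scoring position,
    a one-run score difference, and some event that is a Hit, Grounder,
    or Fly (a stand-in for OWL classification of the metadata)."""
    in_scoring_position = bool({"second", "third"} & scene["onBase"])
    one_run_game = scene["difOfScore"] == 1
    ball_in_play = any(e in {"Hit", "Grounder", "Fly"} for e in scene["events"])
    return in_scoring_position and one_run_game and ball_in_play

scene = {"onBase": {"second"}, "difOfScore": 1, "events": ["Screwball", "Hit"]}
print(chance_for_reversal(scene))  # True
```

In the actual system this classification is performed by OWL reasoning over the metadata; the sketch only illustrates the logical shape of the definition.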
Thus, the director can retrieve applicable scenes easily
by making such definitions in advance. However, this
system has a number of drawbacks with regard to the
generality of the definition and the high cost of constructing these definitions manually. Therefore, an automated
technique is required that can learn the general concept
of a highlight scene from the features of a scene selected
by the director. This paper provides a novel technique to
acquire highlight definitions by introducing an inductive
learning method into this framework.
</rdf:RDF>

Figure 1. Example of game metadata.

Figure 2. Upper-lower (subClassOf) relations in the baseball ontology, covering classes such as Event, ScoringEvent, BattingEvent, PitchingEvent, Swing, Hit, Walk, ExtraBaseHit, Single, Double, Triple, Homerun, SoloHR, ThreeRunHR, BaseLoadedHR, GameEndingHit, and GameEndingHR.
<rdf:RDF xmlns:rdf="http://www.w3.org/...rdf-syntax-ns#"
xmlns:bo="http://www.nhk.or.jp/baseballont.owl#"
......
<owl:Class rdf:ID="ChanceForReversal">
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<owl:Class rdf:about="#ScoringPosition"/>
<owl:Restriction>
<owl:onProperty>
<owl:DatatypeProperty rdf:about="#difOfScore"/>
</owl:onProperty>
<owl:hasValue rdf:datatype="...">1</owl:hasValue>
</owl:Restriction>
<owl:Restriction>
<owl:onProperty>
<owl:TransitiveProperty rdf:about="#hasEvent"/>
</owl:onProperty>
<owl:someValuesFrom>
<owl:Class>
<owl:unionOf rdf:parseType="Collection">
<owl:Class rdf:about="#Hit"/>
<owl:Class rdf:about="#Grounder"/>
<owl:Class rdf:about="#Fly"/>
</owl:unionOf>
</owl:Class>
</owl:someValuesFrom>
</owl:Restriction>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
</owl:Class>
......
</rdf:RDF>
Figure 3. Example of highlight scene definition in baseball ontology.
C. OWL Metadata Framework using ILP
Concept extraction using director-produced digest
video can be considered inductive learning from examples. In this work, we applied ILP [7] as a method of
inductive learning of the highlight scene definition. ILP is a framework for inductive inference in predicate logic, which aims to obtain classification rules for various classification problems. We introduced the ILP system
Progol [8] into our system, and built a function that
can learn the knowledge of the director inductively. This
system can acquire the highlight definition by inductive
learning from user-selected scenes as training data without
the need to describe the definition manually. Fig. 4 shows
an overview of our system.
Figure 4. Digest production system overview.

D. Data for ILP

ILP requires a goal concept, examples of facts about the concept, and background knowledge as input. The examples are classified as either positive or negative examples, which respectively do and do not belong to the correct class as the goal concept. Background knowledge indicates features of the examples. A typical goal concept in learning the concept of a highlight scene is "What type of situation is suitable for a highlight scene?" Positive and negative examples are scenes (i.e., instances) that are or are not selected as highlights in broadcast sports news, respectively. Background knowledge gives the game situation of each scene (e.g., the ball/strike/out count, the names of the pitcher and batter, the batting result, etc.). Fig. 5 shows the data generation process for ILP.

In addition, Progol requires a bias for inference control as input data. The bias defines mode declarations and type information as well as other parameters. Mode declarations declare a goal concept and the various concepts that may define it. In this case, the goal concept is the definition of the highlight scene, so we set it to be described by concepts such as pitching, batting, scoring, and runner information. Type information expresses the domain of the instances and corresponds to the class declaration of each instance included in the metadata. All input is given to Progol as Prolog clauses.

The game metadata shown in Fig. 1 can be described by triples (subject-predicate-object expressions). The structure of the metadata is simple, so it can be translated uniquely into Prolog clauses by automatically extracting the parts available for learning from the OWL metadata in N-Triple form. Table II shows an example of game metadata described by triples.

TABLE II.
TRIPLES IN GAME DB.

Subject           Predicate    Object
sceneUnit0110102  highlight    TRUE
sceneUnit0120103  highlight    FALSE
sceneUnit0120103  rdf:type     SceneUnit
sceneUnit0110102  strike       2
sceneUnit0110102  ball         1
sceneUnit0110102  out          1
sceneUnit0110102  teamA_Score  1
sceneUnit0110102  onBase       First
...               ...          ...

In Table II, the first line shows that the instance sceneUnit0110102 was selected as a highlight scene by the director, while the second line shows that the instance sceneUnit0120103 was not selected. These are translated into clauses of positive and negative examples, respectively. The third line shows that sceneUnit0120103 is an instance of class SceneUnit in the baseball ontology; it is translated into a clause of type information. Lines 4 to 8 show the ball/strike/out count, runner information, etc., which indicate the situation of sceneUnit0110102; these are translated into clauses of background knowledge. Table III shows an example of Prolog clauses corresponding to a positive example, a negative example, background knowledge, and type information given as input to Progol.
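The triple-to-clause translation just described can be sketched as follows. This is a hypothetical Python helper, not the authors' implementation; the mapping follows Table II and the surrounding text (highlight triples become example clauses, rdf:type triples become type information, and everything else becomes background knowledge).

```python
def triples_to_clauses(triples):
    """Translate (subject, predicate, object) triples into Prolog-style
    clause strings: positive/negative examples, type information, and
    background knowledge, following the mapping described above."""
    positives, negatives, types, background = [], [], [], []
    for subj, pred, obj in triples:
        if pred == "highlight":                   # example clauses
            if obj == "TRUE":
                positives.append(f"highlight({subj}).")
            else:
                negatives.append(f":- highlight({subj}).")
        elif pred == "rdf:type":                  # type information
            types.append(f"{obj[0].lower()}{obj[1:]}({subj}).")
        else:                                     # background knowledge
            background.append(f"{pred}({subj}, {str(obj).lower()}).")
    return positives, negatives, types, background

pos, neg, typ, bg = triples_to_clauses([
    ("sceneUnit0110102", "highlight", "TRUE"),
    ("sceneUnit0120103", "highlight", "FALSE"),
    ("sceneUnit0120103", "rdf:type", "SceneUnit"),
    ("sceneUnit0110102", "strike", 2),
    ("sceneUnit0110102", "onBase", "First"),
])
print(pos)  # ['highlight(sceneUnit0110102).']
print(bg)   # ['strike(sceneUnit0110102, 2).', 'onBase(sceneUnit0110102, first).']
```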
E. Learning of the Highlight Scene Definition by ILP
Fig. 6 shows the ILP process in highlight definition
extraction. Progol generates a hypothesis space (lattice)
by adopting the most specific hypothesis (MSH) which is
derived from positive examples and background knowledge. Then, Progol searches the hypothesis space to obtain
hypotheses that explain most of the positive examples and
the fewest negative examples. In addition, it outputs the
hypothesis with the highest evaluation value as the highlight scene definition. Fig. 7 shows an output example of Progol corresponding to the game situation shown in Table III.
The first line of Fig. 7 is a definition that could readily be imagined, whereas the second line defines a remarkable scene regardless of the batting result. The system can thus extract highlight definitions not only for simple hit scenes but also for complicated situations such as scoring chances.
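Progol's search criterion can be illustrated with a toy sketch: each candidate hypothesis is scored by the number of positive examples (P) and negative examples (N) it covers. Here a hypothesis is simulated as a Python predicate over a scene's feature dictionary, a simplified stand-in for Progol's theorem-proving-based coverage test, with invented toy data.

```python
def coverage(hypothesis, positives, negatives):
    """Score a candidate hypothesis by the numbers of positive (P) and
    negative (N) examples it covers, as in the lattice search."""
    p = sum(1 for scene in positives if hypothesis(scene))
    n = sum(1 for scene in negatives if hypothesis(scene))
    return p, n

# Toy scenes: feature dicts standing in for the background knowledge.
positives = [{"onBase": {"second", "third"}, "strike": 2},
             {"onBase": {"second", "third"}, "strike": 2}]
negatives = [{"onBase": {"first"}, "strike": 1}]

# Candidate: highlight(A) :- onbase(A,second), onbase(A,third), strike(A,2).
candidate = lambda s: {"second", "third"} <= s["onBase"] and s["strike"] == 2
print(coverage(candidate, positives, negatives))  # (2, 0)
```

A hypothesis covering many positives and few negatives, such as the candidate above, receives a high evaluation and is output as the highlight scene definition.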
F. Highlight Scene Retrieval
This system accumulates the highlight definitions generated by ILP in the highlight knowledgebase, so the definitions can be reused for scene extraction from the game knowledgebase. However, the output of Progol consists of the Prolog clauses shown in Fig. 7 and is not directly usable for scene retrieval from the game knowledgebase. This system therefore translates the highlight definition from Prolog clauses into the query language SPARQL [9] automatically and applies it to the game knowledgebase. Fig. 8 shows an example of translation from Prolog clauses to SPARQL.
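A clause-to-SPARQL translation of this kind can be sketched as follows. This is a hypothetical Python helper, not NHK's implementation; the property/URI mapping and the integer-to-typed-literal rule are illustrative assumptions modeled on Fig. 8.

```python
def clause_to_sparql(body_literals, prefix="bo"):
    """Translate the body of a highlight clause, e.g.
    [("onBase", "second"), ("onBase", "third"), ("strike", 2)],
    into a SPARQL SELECT query over the game knowledgebase."""
    patterns = []
    for predicate, argument in body_literals:
        if isinstance(argument, int):
            obj = f'"{argument}"^^xsd:int'   # numbers become typed literals
        else:
            obj = f"{prefix}:{argument}"     # symbols become ontology resources
        patterns.append(f"  ?scene {prefix}:{predicate} {obj} .")
    return ("PREFIX bo: <http://www.nhk.or.jp/bo.owl#>\n"
            "SELECT ?scene\nWHERE {\n" + "\n".join(patterns) + "\n}")

print(clause_to_sparql([("onBase", "second"), ("onBase", "third"), ("strike", 2)]))
```

Each literal in the clause body becomes one triple pattern on the scene variable, so the query's solution set corresponds to the scenes the clause covers.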
IV. DIGEST PRODUCTION SYSTEM
We developed a prototype of a digest production system
that acquires a highlight definition automatically using
ILP. Fig. 9 shows the interface of the digest production
system.
This interface has the following functions.
• Metadata editing function:
The user can edit the game metadata using a dedicated editor, and can also make learning data for ILP
by adding the highlight information to the metadata.
• Highlight scene definition function:
Using the GUI, the user can make highlight scene
definitions by manually combining the concepts in
the baseball ontology.
• Inductive learning function:
The user can choose any game as the learning data,
and perform inductive learning processing of the
highlight scene definition. The highlight knowledgebase can accumulate the definitions extracted by
learning.
• Scene retrieval function:
The user can retrieve the scenes from any game in the
game knowledgebase by selecting baseball concepts
in the ontology or by using highlight definitions in the
highlight knowledgebase.

TABLE III.
CLAUSES FOR ILP.

Positive Example      highlight(sceneUnit0110102).
                      ...
Negative Example      :- highlight(sceneUnit0120404).
                      ...
Background Knowledge  strike(sceneUnit0120404, 2).
                      ball(sceneUnit0120404, 1).
                      onBase(sceneUnit0120404, first).
                      ...
Type Information      sceneUnit(sceneUnit0120404).
                      atBat(atBat01204).
                      inning(inning01).
                      ...

Figure 5. Data generation process for ILP.

highlight(A):-hasevent(A,B),batkind(B,homerun).
highlight(A):-onbase(A,second),onbase(A,third),strike(A,2).

Figure 7. Example of Progol output.

Figure 6. Process of ILP: Progol generates a hypothesis lattice (a subset of generalizations of the most specific hypothesis), ranging from the most general hypothesis highlight(A). (P=10, N=280) down to the most specific hypothesis (P=1, N=0), e.g.
highlight(A) :- hasEvent(A,B), hasEvent(A,C), hasEvent(A,D), onBase(A,second), onBase(A,first), onBase(A,third), strike(A,1), ball(A,0), out(A,2), ..., ballKind(D,slider), agent(B,player_10266), ..., batKind(B,triple), point(C,3), causedBy(C,B).
It evaluates each hypothesis by the numbers of positive (P) and negative (N) examples covered, and outputs the hypothesis with the highest evaluation value, e.g.
highlight(A) :- onbase(A,second), onbase(A,third), strike(A,2). (P=6, N=0)

Figure 8. Translation from clauses to SPARQL: the highlight definition
highlight(A) :- onbase(A,second), onbase(A,third), strike(A,2).
is translated automatically into
PREFIX bo: <http://www.nhk.or.jp/bo.owl#>
SELECT ?scene
WHERE {
?scene bo:onBase bo:third .
?scene bo:onBase bo:second .
?scene bo:strike "2"^^xsd:int .
}
and applied to the game knowledgebase to obtain highlight scene candidates.

Figure 9. Digest production interface overview.
Using these interfaces, the highlight definition from
the metadata of one game can be applied to retrieval of
highlight scenes from other games. This interface provides
a function to rearrange the highlight scene candidates and
produce a digest video. It is also possible to generate a
digest video automatically by playing back the highlight
scene candidates continuously.
V. EXPERIMENTS
A. Training & Test Data
We performed experiments using the learning method
to obtain highlight definitions from broadcast sports news.
First, we made a set of game metadata from 20 Japan
Professional Baseball games. In this experiment, all metadata were generated manually: it was necessary to
annotate each sceneUnit while watching the game video,
and therefore it took about 2 hours per game. However, our framework is not designed to depend on manually generated metadata: it is compliant with the game metadata format distributed by sports data companies, so in practice it is not necessary to generate metadata manually.
In addition, we added highlight information to the
scene that was chosen for digest video in broadcast sports
news. Using our digest production interface, we can add
the highlight information in about 10 minutes per
game. We extracted positive examples (scenes selected
as highlights), negative examples (scenes not selected
as highlights), and background knowledge (ball/strike/out
count, runner information, etc., for all scenes) from the
metadata of all games. Table IV shows the average
number of these training data in one game, and Table
V shows part of the background knowledge used in the
present study.
B. Highlight Scene Extraction using ILP
An outline of the experiments is presented below.
• Experiment 1
First, we used Progol and extracted highlight definitions from each game. In addition, we retrieved
scenes from the same game using these definitions.
We used all highlight scenes as positive examples, and selected the same number of negative examples at random.

TABLE IV.
NUMBER OF TRAINING DATA IN ONE GAME (AVERAGE OF 20 GAMES).

Training Data                            Number of data
Positive Example (Highlight Scene)       10
Negative Example (non-Highlight Scene)   282
Background Knowledge                     6872

TABLE V.
LIST OF BACKGROUND KNOWLEDGE.

Type                        Value Example
Pitching Event (Ball Type)  FastBall, CurveBall, ...
Batting Event (Result)      Hit, SwingAndMiss, ...
Point                       1, 2, 3, ...
Base Information            First, Second, Third
Ball count                  0, 1, 2, 3
Strike count                0, 1, 2
Out count                   0, 1, 2
Team A Score                0, 1, 2, ...
Team B Score                0, 1, 2, ...
Difference of Score         0, 1, 2, ...
Pitching No.                1, 2, 3, ...
Pitcher ID                  pid002, ...
Batter ID                   pid023, ...
...                         ...
• Experiment 2
We performed the same experiment according to the
procedure described for Experiment 1 but without
using negative examples.
• Experiment 3
Next, we selected 10 games as training data and another 10 games as test data at random, and extracted
highlight definition from the training data using
positive and negative examples. Then, we retrieved
the scenes by applying the highlight definitions of
training data to the test data. These processes were
repeated ten times. We used all highlight scenes as
positive examples, and selected the same number of
negative examples at random.
• Experiment 4
We performed the same experiment according to the
procedure described for Experiment 3 but without
using negative examples.
The highlight scene definitions acquired in Experiments 3 and 4 are shown in Figs. 10 and 11, respectively.
C. Highlight Scene Extraction using Decision Tree
We also extracted highlight scenes using decision trees
to confirm the advantage of ILP. We used the C4.5 [10]
[11] learning algorithm implemented in the data mining
tool WEKA [13] for rule extraction.
• Experiment 5
We used C4.5 and extracted a decision tree from each
game. We retrieved scenes from the same game using
these trees. We used all highlight scenes as positive
examples, and selected the same number of negative
examples at random.
• Experiment 6
We performed the same experiment according to the
procedure described for Experiment 5 but without
using negative examples.
• Experiment 7
Next, we selected 10 games as training data and another 10 games as test data at random, and extracted 10 decision trees from the training data. Then, we retrieved the scenes by applying all the decision trees of the training data to the test data. These processes were repeated ten times. We used all highlight scenes as positive examples, and selected the same number of negative examples at random.

highlight(A):-hasevent(A,B),batkind(B,homerun).
highlight(A):-onbase(A,third),pitchingNo(A,6).
highlight(A):-hasEvent(A,B),point(B,2).

Figure 10. Examples of highlight scene definitions extracted in Experiment 3.

highlight(A):-hasevent(A,B),teamAScore(A,6),
pitchingNo(A,4),ballKind(B,hit).
highlight(A):-hasEvent(A,B),strike(A,2),
batKind(B,swingAndMiss),strike(A,2).
highlight(A):-hasEvent(A,B),agent(B,pid20162),
batKind(B,grounder).

Figure 11. Examples of highlight scene definitions extracted in Experiment 4.
Table VI shows the parameters of decision tree learning, and Fig. 12 shows an example of a decision tree extracted in these experiments.
D. Evaluation
The average number of highlight definitions extracted
by ILP is shown in Table VII, and the average number of
leaves and size of the tree extracted by C4.5 are shown
in Table VIII except for Experiment 6.
TABLE VI.
PARAMETERS OF C4.5 DECISION TREE LEARNING.

Confidence threshold for pruning      0.25
Minimum number of instances per leaf  2

Figure 12. Example of a decision tree extracted from one game (number of leaves = 4, size of tree = 7): the root splits on Point (> 0 gives TRUE); for Point <= 0, the tree splits on Top/Bottom (Top gives FALSE) and then on Strike Count (<= 1 gives FALSE, > 1 gives TRUE).

We defined Precision, Recall, and F-measure of scene retrieval and used them to measure the performance of our system. Precision P, Recall R, and F-measure F are defined as

P = (number of correct highlight scenes retrieved) / (number of all highlight scenes retrieved)

R = (number of correct highlight scenes retrieved) / (actual number of highlight scenes)

F = 2PR / (P + R)
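These measures can be computed directly from sets of scene IDs; the following small Python helper simply follows the definitions above (the scene IDs are invented for illustration).

```python
def prf(retrieved, actual):
    """Precision, recall, and F-measure of highlight scene retrieval,
    following the definitions above; inputs are sets of scene IDs."""
    correct = len(retrieved & actual)
    p = correct / len(retrieved) if retrieved else 0.0
    r = correct / len(actual) if actual else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Four scenes retrieved, two of which are among the three actual highlights.
p, r, f = prf({"s01", "s02", "s03", "s04"}, {"s01", "s02", "s05"})
print(round(p, 2), round(r, 2), round(f, 2))  # 0.5 0.67 0.57
```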
The results of the experiments 1, 2, 5, and 6 are shown
in Table IX.
The results of Experiments 1 and 5 indicated poorer performance for ILP than for C4.5. Thus, the learning ability of C4.5 is slightly better than that of ILP when the test data are the same as the learning data and negative examples are used. In Experiment 2, the average number
of extracted highlight definitions was 3.45; the average
number of actual highlight scenes in one game was 10.
These definitions could properly retrieve about 58% of the
actual highlight scenes in the game. Compared with the
results obtained using decision trees (Experiments 5 and
6), the recall was low but ILP showed high precision and
F-measure. The decision tree showed better recall than
ILP. The results of Experiment 5 showed that the decision
tree could express each highlight scene. However, there
were trees that classified all highlight events using about
20 leaves. In Experiment 6, there were no negative
examples, and so all extracted decision trees had only
one leaf. This tree judged all scenes as highlight scenes,
and so the precision was very low.
Table X shows the results of Experiments 3 and 7. The results of Experiment 3 indicated that we could extract highlight definitions common to all games (e.g., Fig. 10), such as "scoring scenes," with retrieval precision above 60%. However, the recall was low because we could not extract highlight definitions expressing exciting situations peculiar to each game. Experiment 7 shows that it is difficult to apply such a tree to other games because of a lack of generality. Fig. 13 shows the tree with the most leaves (only "TRUE" leaves are shown), and Fig. 14 shows the definitions corresponding to the tree.
TABLE VII.
NUMBER OF EXTRACTED DEFINITIONS BY ILP.

              Average number of extracted highlight definitions
Experiment 1  2.3 (from 1 game)
Experiment 2  3.45 (from 1 game)
Experiment 3  4.44 (from 10 games)
Experiment 4  26.8 (from 10 games)

TABLE VIII.
SIZE OF DECISION TREE.

Average number of leaves  7.3
Average size of trees     9.1

TABLE IX.
RESULTS OF EXPERIMENTS 1, 2, 5, AND 6.

                                       P     R     F
Experiment 1 (ILP), avg. of 20 games   0.12  0.68  0.19
Experiment 2 (ILP), avg. of 20 games   1.0   0.58  0.7
Experiment 5 (C4.5), avg. of 20 games  0.23  0.88  0.26
Experiment 6 (C4.5), avg. of 20 games  0.03  1.0   0.06

TABLE X.
RESULTS OF EXPERIMENTS 3 AND 7.

                                       P     R     F
Experiment 3 (ILP), avg. of 10 times   0.61  0.44  0.51
Experiment 7 (C4.5), avg. of 10 times  0.18  0.60  0.20
Figure 13. Decision tree with the most leaves (only "TRUE" leaves shown): the tree splits on Point (> 0 gives TRUE) and, for Point <= 0, on BattingEvent (Sacrifice, Swing&Miss, Error, Grounder, LineDrive) and on OutCount (> 1).
highlight(A):-hasevent(A,B),batKind(B,error).
highlight(A):-hasevent(A,B),batKind(B,lineDrive).
highlight(A):-hasevent(A,B),point(B,1).
highlight(A):-hasevent(A,B),point(B,2).
highlight(A):-hasevent(A,B),batKind(B,Homerun).
highlight(A):-onBase(A,first),pitchingNo(A,9).
highlight(A):-hasEvent(A,B),pitchingNo(A,5),
agent(B,player_11774).
Figure 14. Highlight definitions corresponding to the above tree.
TABLE XI.
RESULTS OF EXPERIMENT 4.

                                      P     R     F
Experiment 4 (ILP), avg. of 10 times  0.13  0.65  0.21
Table XI shows the results of Experiment 4. In Experiment 4, we could extract many definitions expressing remarkable game situations peculiar to each game (e.g., Fig. 11). However, the precision was low because these definitions lacked the generality needed to apply to other games. These results indicate that knowledge of negative examples has a marked effect on the generality of the definitions obtained, and further studies on appropriate amounts and types of negative examples are necessary.
VI. CONCLUSIONS & FUTURE PLANS
In this paper, we proposed a method that learns highlight scene definitions automatically from the metadata
of real digest video in broadcast sports news using ILP.
The experimental results showed that ILP can extract
highlight scenes with greater accuracy than C4.5 decision
trees, and indicated that the intention of directors to select
highlight scenes from game video is greatly influenced
by the situation peculiar to each game. The results of
Experiments 3 and 7 showed that highlight definitions
extracted by ILP are general definitions, and therefore
can be applied to other games.
Some issues must be resolved to improve the accuracy
of the system. To increase the precision of retrieval, it is
important to use not only information from one sceneUnit,
but also relations between more than one sceneUnit.
The flow of the game is very important in the selection
of highlight scenes. Therefore, it is preferable to add parameters that express the context of game events to the background knowledge. Then, we can
expect to extract more general highlight definitions that
are common to all of the games.
In contrast, the experimental results indicated that recall is low in retrieval using ILP. In this experiment, we used only game metadata (OWL instances) to obtain highlight definitions by ILP; the OWL baseball ontology itself did not contribute to the ILP process. In comparison to decision trees, ILP has the advantage that it can make use of various types of background knowledge: not only the game situation but also player information unrelated to the game and advanced knowledge from the ontology. To increase the recall of retrieval, we are currently planning to introduce more appropriate concept definitions from our baseball ontology and more detailed game situations as background knowledge. For example, the concept "ScoringPosition," which is described in our baseball ontology, can be transformed into background knowledge as shown in Fig. 15.
Definition in Baseball Ontology:
<owl:Class rdf:ID="ScoringPosition">
<owl:equivalentClass>
<owl:Class>
<owl:intersectionOf rdf:parseType="Collection">
<owl:Class rdf:about="#SceneUnit"/>
<owl:Class>
<owl:unionOf rdf:parseType="Collection">
<owl:Restriction>
<owl:onProperty>
<owl:ObjectProperty rdf:about="#onBase"/>
</owl:onProperty>
<owl:hasValue rdf:resource="#second"/>
</owl:Restriction>
<owl:Restriction>
<owl:onProperty>
<owl:ObjectProperty rdf:about="#onBase"/>
</owl:onProperty>
<owl:hasValue rdf:resource="#third"/>
</owl:Restriction>
</owl:unionOf>
</owl:Class>
</owl:intersectionOf>
</owl:Class>
</owl:equivalentClass>
</owl:Class>
⇓
Background Knowledge:
scoringPosition(X):-sceneUnit(X),onBase(X,second).
scoringPosition(X):-sceneUnit(X),onBase(X,third).
Figure 15. Example of translation from definition to background knowledge.
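The translation of Fig. 15 can be mechanized. The following is a minimal sketch, not the paper's implementation: it handles only the specific pattern shown above (an intersection of a named class with a union of hasValue restrictions), using Python's standard xml.etree, and the head and body predicate names are supplied by the caller.

```python
import xml.etree.ElementTree as ET

OWL = "{http://www.w3.org/2002/07/owl#}"
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

def union_to_clauses(xml_text, head, body):
    """Translate an OWL class of the Fig. 15 shape into one Prolog clause
    per disjunct. A sketch for this one pattern only, not a general
    OWL-to-Prolog translator."""
    root = ET.fromstring(xml_text)
    clauses = []
    # Each hasValue restriction in the union becomes one clause body literal.
    for restr in root.iter(OWL + "Restriction"):
        prop = restr.find(OWL + "onProperty/" + OWL + "ObjectProperty")
        pred = prop.get(RDF + "about").lstrip("#")
        val = restr.find(OWL + "hasValue").get(RDF + "resource").lstrip("#")
        clauses.append(f"{head}(X):-{body}(X),{pred}(X,{val}).")
    return clauses


# The ScoringPosition definition of Fig. 15, with namespaces declared.
sample = """
<owl:Class rdf:ID="ScoringPosition"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <owl:equivalentClass><owl:Class>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="#SceneUnit"/>
      <owl:Class><owl:unionOf rdf:parseType="Collection">
        <owl:Restriction>
          <owl:onProperty><owl:ObjectProperty rdf:about="#onBase"/></owl:onProperty>
          <owl:hasValue rdf:resource="#second"/>
        </owl:Restriction>
        <owl:Restriction>
          <owl:onProperty><owl:ObjectProperty rdf:about="#onBase"/></owl:onProperty>
          <owl:hasValue rdf:resource="#third"/>
        </owl:Restriction>
      </owl:unionOf></owl:Class>
    </owl:intersectionOf>
  </owl:Class></owl:equivalentClass>
</owl:Class>
"""

for clause in union_to_clauses(sample, "scoringPosition", "sceneUnit"):
    print(clause)
```

Running this prints exactly the two background-knowledge clauses of Fig. 15.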
The situations in which a runner is on second or
third base are both scoring chances. By integrating these
situations into one event class, scoringPosition, we can
expect better generalization and a higher
recall value. In general, an OWL ontology includes many
anonymous classes, so it is not as easy to translate to
Prolog clauses as OWL instances are. We
will continue developing a technique to transform an
OWL ontology into unique Prolog clauses.
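One simple way to obtain unique Prolog predicate names for anonymous classes is to generate fresh identifiers for them. The anonClass_N naming scheme below is our own illustrative assumption, not part of the paper's method.

```python
import itertools

class PredicateNamer:
    """Map OWL classes to unique Prolog predicate names. Named classes keep
    a lower-cased version of their rdf:ID; anonymous classes (no rdf:ID)
    get a freshly generated name so they can still appear as the head of a
    clause. The anonClass_N scheme is an illustrative assumption."""

    def __init__(self):
        self._counter = itertools.count(1)

    def name(self, owl_class_id=None):
        if owl_class_id:
            # scoringPosition from ScoringPosition, etc.
            return owl_class_id[0].lower() + owl_class_id[1:]
        return f"anonClass_{next(self._counter)}"


namer = PredicateNamer()
print(namer.name("ScoringPosition"))  # scoringPosition
print(namer.name())                   # anonClass_1
print(namer.name())                   # anonClass_2
```

With such names assigned, each anonymous class can be translated into its own set of clauses and referenced from the clauses of its enclosing class.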
Our baseball ontology can be extended by translating
the highlight definitions obtained by inductive learning
into OWL concept definitions. Furthermore, if highlight
definitions can be shared among various systems, it
will also become possible to generate digest video in the
viewer’s TV set-top box. We are also planning to convert
the highlight definitions into a semantic web rule language such as SWRL.
This digest production system is fully compliant with
the real-time baseball game metadata format distributed by
a sports data company in Japan. Therefore, it is possible
to receive such metadata and extract the highlight scene
candidates automatically. In the future, we will study
the business model and issues regarding this framework
from the standpoint of practical application.
REFERENCES
[1] Sean Bechhofer, Frank van Harmelen, Jim Hendler,
Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider,
and Lynn Andrea Stein. OWL Web Ontology
Language reference. W3C Recommendation, 10 February
2004.
[2] Mei Han, Wei Hua, Wei Xu, and Yihong Gong. An
integrated baseball digest system using maximum entropy
method. In MULTIMEDIA ’02: Proceedings of the tenth
ACM international conference on Multimedia, pages 347–
350, New York, NY, USA, 2002. ACM.
[3] Takako Hashimoto, Takashi Katooka, Atsushi Iizawa, and
Hiroyuki Kitagawa. Significant scene extraction method
using situation importance. In ICDCSW ’04: Proceedings
of the 24th International Conference on Distributed Computing Systems Workshops - W7: EC (ICDCSW’04), pages
248–253, Washington, DC, USA, 2004. IEEE Computer
Society.
[4] Takako Hashimoto, Yukari Shirota, Atsushi Iizawa, and
Hiroyuki Kitagawa. A rule-based scheme to make personal
digests from video program meta data. In DEXA ’01: Proceedings of the 12th International Conference on Database
and Expert Systems Applications, pages 243–253, London,
UK, 2001. Springer-Verlag.
[5] Weilun Lao, Jungong Han, and Peter H. N. de With.
Automatic sports video analysis using audio clues and
context knowledge. In IMSA’06: Proceedings of the 24th
IASTED international conference on Internet and multimedia systems and applications, pages 198–202, Anaheim,
CA, USA, 2006. ACTA Press.
[6] Masaru Miyazaki, Takeshi Kobayakawa, Jun Goto,
Nobuyuki Hiruma, and Nobuyuki Yagi. OWL metadata
framework for a baseball Q&A system. In Proceedings of
the Poster Track. 5th International Semantic Web Conference (ISWC2006), 2006.
[7] Stephen Muggleton. Inductive logic programming. New
Generation Computing, 8(4):295–318, 1991.
[8] Stephen Muggleton. Inverse entailment and Progol. New
Generation Computing, Special issue on Inductive Logic
Programming, 13(3-4):245–286, 1995.
[9] Eric Prud’hommeaux and Andy Seaborne. SPARQL Query
Language for RDF. W3C Recommendation, January 2008.
[10] J. R. Quinlan. Induction of decision trees. Mach. Learn.,
1(1):81–106, 1986.
[11] J. R. Quinlan. C4.5: programs for machine learning.
Morgan Kaufmann Publishers Inc., San Francisco, CA,
USA, 1993.
[12] Yong Rui, Anoop Gupta, and Alex Acero. Automatically
extracting highlights for TV baseball programs. In MULTIMEDIA ’00: Proceedings of the eighth ACM international
conference on Multimedia, pages 105–115, New York, NY,
USA, 2000. ACM.
[13] Ian H. Witten and Eibe Frank. Data Mining: Practical machine learning tools and techniques, 2nd Edition. Morgan
Kaufmann, San Francisco, 2005.
[14] Ziyou Xiong, R. Radhakrishnan, A. Divakaran, and T. S.
Huang. Audio events detection based highlights extraction
from baseball, golf and soccer games in a unified framework. In ICME ’03: Proceedings of the 2003 International
Conference on Multimedia and Expo - Volume 3 (ICME
’03), pages 401–404, Washington, DC, USA, 2003. IEEE
Computer Society.
Masaru Miyazaki received the B.E. degree from Waseda
University in 1995 and the M.E. degree in computer science
from Tokyo Institute of Technology in 1997. He then joined
NHK (Japan Broadcasting Corporation) and has been with NHK
Science and Technology Research Laboratories since 2000. His
research interests include knowledge processing, information
retrieval technology, semantic web technology and contents
handling systems. He is a member of IEICE, JSAI, and ITEJ.
Masahiro Shibata is the Director of the Human & Information
Science Division of the Science and Technology Research Laboratories of
NHK (Japan Broadcasting Corporation). He graduated in
1979 from the Electronics Department, Faculty of Engineering,
Kyoto University, from which he also received his Master’s
degree in 1981 and his Ph.D. in 2003. He joined NHK in
1981, and has been engaged in research and development on
information retrieval technology, image databases, and contents
handling systems.
Nobuyuki Yagi is the Director of the Planning and Coordination Division at the Science and Technology Research
Laboratories of NHK (Japan Broadcasting Corporation). His
research interests include image and video signal processing,
multimedia processing, content production technology, computer
architecture, and digital broadcasting. He received the B.E.,
M.E., and Ph.D. degrees in electronic engineering from Kyoto
University, Japan, in 1978, 1980, and 1992. He joined NHK in
1980, and worked at the Kofu Broadcasting Station, the Science
and Technology Research Laboratories, and in the Engineering
Administration Department and Programming Department of
NHK. He was also an affiliate professor of the Tokyo Institute
of Technology from 2005 to 2008. He has contributed to
standardization activities at ITU, SMPTE, EBU, and ARIB. He
is a member of IEEE, IEICE, IPSJ, and ITEJ.