Herwin van Welbergen, Yuyu Xu,
Marcus Thiebaux, Wei-Wen Feng,
Jingqiao Fu, Dennis Reidsma, Ari Shapiro
Testing the adherence to the BML standard by
realizers
(and, by this, also testing for weaknesses in
the standard)
Better software engineering practices for
virtual humans
◦ Daily automated testing (regression, acceptance, …)
bml1 starts (globaltime=10)
gaze1 starts (localtime=0)
gaze1 ready (localtime=1)
speech1 start (localtime=1)
…
speech1 end (localtime=10)
bml1 ends (globaltime=20)
BML Realizers are complex software
components
They form the backbone of virtual human
applications by research groups
Yet their testing has so far been limited to
time consuming manual inspection of the
execution of BML scripts
Automatic tests are automatic
◦ No wasted developer time
Automatic tests can be run often
◦ Early error detection
A realizer provides a standardized interface
◦ BML goes in
◦ Feedback comes out
Several realizers implement this interface
We can make automatic tests for this
interface
These tests can be used for all realizers
implementing the interface
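The idea of writing a check once against the shared interface can be sketched in Java as follows. The names RealizerPort and performBML appear in the test code later in these slides, but the signatures used here are assumptions, and FeedbackListener, EchoRealizer and GenericRealizerCheck are hypothetical names introduced only for illustration. EchoRealizer is a toy stand-in, not a real realizer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical listener for feedback messages emitted by a realizer. */
interface FeedbackListener {
    void onFeedback(String message);
}

/** Assumed minimal realizer interface: BML goes in, feedback comes out. */
interface RealizerPort {
    void performBML(String bml);
    void addListener(FeedbackListener listener);
}

/** Toy realizer used only to exercise the sketch; it fakes block-level feedback. */
class EchoRealizer implements RealizerPort {
    private final List<FeedbackListener> listeners = new ArrayList<FeedbackListener>();

    public void addListener(FeedbackListener listener) {
        listeners.add(listener);
    }

    public void performBML(String bml) {
        // A real realizer would schedule and execute behaviors here.
        for (FeedbackListener l : listeners) {
            l.onFeedback("block start");
            l.onFeedback("block end");
        }
    }
}

/** A check written once against the interface, reusable for every implementation. */
class GenericRealizerCheck {
    static List<String> runAndCollect(RealizerPort realizer, String bml) {
        final List<String> received = new ArrayList<String>();
        realizer.addListener(new FeedbackListener() {
            public void onFeedback(String message) {
                received.add(message);
            }
        });
        realizer.performBML(bml);
        return received;
    }
}
```

Any other class implementing the assumed RealizerPort interface could be passed to runAndCollect without changing the check itself.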
Message flow and behavior execution
Adherence to time constraints
Error handling
Acceptance testing
Send BML to a realizer, capture all feedback
Wait until the realizer has executed the BML
Verify assertions on the received feedback
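The three steps above (send, wait, verify) might look roughly like this in Java. This is a minimal sketch under stated assumptions: FeedbackCollector and CaptureWaitVerifyDemo are hypothetical names, feedback is modeled as plain strings such as "bml1 end", and a background thread stands in for the realizer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical thread-safe store for feedback messages. */
class FeedbackCollector {
    private final List<String> feedback = new ArrayList<String>();

    /** Called from the realizer's thread for every feedback message. */
    synchronized void record(String message) {
        feedback.add(message);
        notifyAll(); // wake up any test thread waiting for the block to end
    }

    /** Blocks until "<bmlId> end" arrives; returns false on timeout or interruption. */
    synchronized boolean waitForBlockEnd(String bmlId, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        String wanted = bmlId + " end";
        while (!feedback.contains(wanted)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Returns a copy of everything received so far. */
    synchronized List<String> snapshot() {
        return new ArrayList<String>(feedback);
    }
}

class CaptureWaitVerifyDemo {
    static List<String> run() {
        final FeedbackCollector collector = new FeedbackCollector();
        // Step 1: "send BML"; a fake realizer thread emits feedback asynchronously.
        Thread fakeRealizer = new Thread(new Runnable() {
            public void run() {
                collector.record("bml1 start");
                collector.record("speech1 start");
                collector.record("speech1 end");
                collector.record("bml1 end");
            }
        });
        fakeRealizer.start();
        // Step 2: wait until the block has ended (or give up after 2 seconds).
        if (!collector.waitForBlockEnd("bml1", 2000)) {
            throw new IllegalStateException("bml1 did not end in time");
        }
        // Step 3: assertions are then verified on the captured feedback.
        return collector.snapshot();
    }
}
```

The timeout keeps a test from hanging forever when a realizer never reports the end of a block.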
<bml id="bml1">
<speech id="speech1" start="6">
<text>Hey punk <sync id="s1" />what do ya want?</text>
</speech>
<head id="nod1" action="ROTATION" rotation="X"
start="speech1:s1"/>
</bml>
What can we assert about the expected feedback?
- All default sync points remain in order for all behaviors
- The head nod will start within |ε| seconds of the start of “what”
- Speech1 starts within |ε| seconds of t=6
- There will be no exceptions or warnings
- The block will end
…
…
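The "within |ε| seconds" assertions above can be sketched as simple tolerance checks. SyncTimes and its methods are hypothetical names, and the sketch assumes that each sync point's time, relative to the start of the BML block, has already been extracted from the feedback.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical table of sync point times, in seconds relative to block start. */
class SyncTimes {
    private final Map<String, Double> times = new HashMap<String, Double>();

    void put(String behaviorId, String syncId, double time) {
        times.put(behaviorId + ":" + syncId, time);
    }

    double get(String behaviorId, String syncId) {
        Double t = times.get(behaviorId + ":" + syncId);
        if (t == null) {
            throw new IllegalArgumentException("no feedback for " + behaviorId + ":" + syncId);
        }
        return t;
    }

    /** True if the sync point occurred within epsilon seconds of the expected time. */
    boolean isAt(String behaviorId, String syncId, double expected, double epsilon) {
        return Math.abs(get(behaviorId, syncId) - expected) <= epsilon;
    }

    /** True if two sync points occurred within epsilon seconds of each other. */
    boolean areLinked(String b1, String s1, String b2, String s2, double epsilon) {
        return Math.abs(get(b1, s1) - get(b2, s2)) <= epsilon;
    }
}
```

For the script above, isAt("speech1", "start", 6, ε) would cover the "starts at t=6" assertion and areLinked("speech1", "s1", "nod1", "start", ε) the head nod constraint. The value of ε is a choice left to the test author.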
<bml id="bml1">
<speech id="speech1" start="6">
<text>Hey punk <sync id="s1" />what do ya want?</text>
</speech>
<head id="nod1" action="ROTATION" rotation="X" start="speech1:s1"/>
</bml>
@Test public void testSpeechNodTimedToSync() {
    // Send the BML script to the realizer and wait until the block has finished
    realizerPort.performBML(readTestFile("testspeech_nodtimedtosync.xml"));
    waitForBMLEndFeedback("bml1");
    // Sync points of each behavior must arrive in the expected order
    assertSyncsInOrder("bml1", "speech1", "start", "ready", "stroke_start",
        "stroke", "s1", "stroke_end", "relax", "end");
    assertAllBMLSyncsInBMLOrder("bml1", "nod1");
    assertBlockStartAndStopFeedbacks("bml1");
    // speech1 starts at t=6; the nod starts at speech1:s1
    assertRelativeSyncTime("bml1", "speech1", "start", 6);
    assertLinkedSyncs("bml1", "speech1", "s1", "bml1", "nod1", "start");
    assertNoExceptions();
    assertNoWarnings();
}
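To give an idea of what an ordering assertion such as assertSyncsInOrder has to check, here is a minimal sketch. It is not RealizerTester's actual implementation. SyncOrder and inOrder are hypothetical names, and the sketch assumes that the sync ids received for one behavior are already available as a list.

```java
import java.util.Arrays;
import java.util.List;

class SyncOrder {
    /**
     * True if every received sync id is listed in expectedOrder and the ids
     * arrive in that order. Syncs that were never received are tolerated here.
     */
    static boolean inOrder(List<String> received, String... expectedOrder) {
        List<String> expected = Arrays.asList(expectedOrder);
        int previous = -1;
        for (String sync : received) {
            int index = expected.indexOf(sync);
            if (index <= previous) {
                return false; // unknown id (-1), repeated id, or out of order
            }
            previous = index;
        }
        return true;
    }
}
```

Whether missing sync points should also fail the check is a design choice. This sketch accepts them, and a stricter variant would compare the two lists for equality.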
A generic testing framework that can test all
BML Realizers
Provides a set of tests + assertions
And functionality to easily author new tests
◦ Utility methods (e.g. waitForBMLEndFeedback)
◦ Custom assertions (e.g. assertLinkedSyncs)
Code and tests released on SourceForge, together with
a set of example movies that showcase
various realizers
Using RealizerTester
◦ Additional tests for Elckerlyc specific functionality
◦ Regression testing
◦ Acceptance testing
Running all tests takes some time (~3 min.)
◦ This might discourage frequent testing by
developers
◦ => Run tests automatically on a continuous
integration server
Notify developers who commit a build that fails tests
Keep track of test result history
RealizerTester does not help in identifying the
exact location of errors
◦ It tests a Realizer as a black box
◦ Elckerlyc uses additional white box testing at a
smaller granularity to identify the exact location of
errors
>1000 unit + mid-range tests, running in <10 s
RealizerTester helps in identifying locations that require
more unit testing
Regular visual inspection is still valuable!
Visual regression testing
◦ Record a baseline
◦ Does it (approximately) look
like the baseline?
◦ Automatically check difference
with baseline
◦ If different
Bug: fix
New functionality: create new
baseline
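The "approximately looks like the baseline" check could, for a single rendered frame, be sketched as a mean pixel difference compared against a tolerance. This is one possible metric, not necessarily the one used in practice. FrameDiff, meanAbsDiff and matchesBaseline are hypothetical names, and the tolerance value is an assumption to be tuned per realizer and character.

```java
import java.awt.image.BufferedImage;

class FrameDiff {
    /** Mean absolute per-channel difference, in [0, 255], of two equally sized frames. */
    static double meanAbsDiff(BufferedImage a, BufferedImage b) {
        int w = a.getWidth();
        int h = a.getHeight();
        if (w != b.getWidth() || h != b.getHeight()) {
            throw new IllegalArgumentException("frame sizes differ");
        }
        long sum = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                sum += Math.abs(((p >> 16) & 0xFF) - ((q >> 16) & 0xFF)); // red
                sum += Math.abs(((p >> 8) & 0xFF) - ((q >> 8) & 0xFF));   // green
                sum += Math.abs((p & 0xFF) - (q & 0xFF));                 // blue
            }
        }
        return sum / (3.0 * w * h);
    }

    /** True if the frame differs from the baseline by at most the given tolerance. */
    static boolean matchesBaseline(BufferedImage frame, BufferedImage baseline, double tolerance) {
        return meanAbsDiff(frame, baseline) <= tolerance;
    }
}
```

A frame that fails the check is then either a bug to fix or, for new functionality, a candidate for a new baseline.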
Visual regression testing:
◦ checks whether the correct motion is generated
rather than whether the correct signals are sent
◦ is Realizer and character dependent
◦ does not provide acceptance testing
RealizerTester is complementary to visual
regression testing
Dealing with BML versions
Took 1 day to implement
Found some ambiguities in BML (e.g. feedback
ordering), now addressed in 1.0
Found some minor implementation issues in
SmartBody
Did not find any interpretation differences in
the constraint satisfaction
Designing test cases + assertions
◦ Highlighted several cases in which the current BML
specification lacked detail or was unclear
Setting up a video corpus
◦ Highlighted some expressivity issues in the BML
standard
◦ Shows that several behaviors are executed in a
semantically equivalent manner
◦ Created a healthy competition between the
developers
◦ Motivated developers to move to better BML
compliance
The modularity proposed by SAIBA enables
the reuse of testing functionality
Designing a generic testing framework and a
shared video repository helps move the
standard forward
BMLRealizerTester provides a starting point
for a BML compliance test suite
An XML format for authoring tests, with both the BML and
the assertions expressed in XML?
Add more realizers (PLEASE CONTACT US!)
◦ Full BML compliance is not required
◦ Support for feedback is required
Include newest version of BML standard
discussed last Wednesday
Continue extending our repository of nice
example movies
http://realizertester.sourceforge.net/
contact: [email protected]