Epilogue - Donald Bren School of Information and Computer Sciences

Platforms for document creation & wrap up
What applications do people use?
What formats can be created?
What applications can read it?
LaTeX
Most computer science papers/journals/textbooks are written in an open-source
document preparation system called LaTeX
Based on the ‘TeX’ computer typesetting
system developed by Donald Knuth
Why bother with it?
1) It's free
2) It's a very common and powerful publication tool
3) Can generate PDFs (including presentations), PostScript, HTML, etc.
It takes a bit of effort to learn, but
thereafter is easier than Word / Pages / etc.
Example
Cartesian closed categories and the price of eggs
Jane Doe
September 1994
Hello world!
To produce this in most typesetting or word-processing systems, the author would have to decide what layout to
use, so would select (say) 18pt Times Roman for the title, 12pt Times Italic for the name, and so on. This has two
results: authors wasting their time with designs; and a lot of badly designed documents!
Example
\documentclass{article}
\title{Cartesian closed categories and the price
of eggs}
\author{Jane Doe}
\date{September 1994}
\begin{document}
\maketitle
Hello world!
\end{document}
Or, in English:
• This document is an article.
• Its title is Cartesian closed categories and the price of eggs.
• Its author is Jane Doe.
• It was written in September 1994.
• The document consists of a title followed by the text Hello world!
How hard would this be in Word?
\documentclass[12pt]{article}
\newcommand{\piRsquare}{\pi r^2}               % This is my own macro
\title{My Sample \LaTeX{} Document}            % used by \maketitle
\author{\L\"{a}rs Schl{\oe}ff\d{o}ng\"{e}n, }  % used by \maketitle
\date{July 14, 2005}                           % used by \maketitle
\begin{document}
\maketitle                                     % automatic title

I typed this file with a plain text editor.
(I used \textbf{pico} and \textbf{emacs}.)
End     of     paragraph.

This is my second paragraph.
The area of a circle is $\pi r^2$; again, that is $\piRsquare$.
My score on the last exam\footnote{May 23} was $95 \pm 5$.

\section{Formulae; inline vs. displayed}
I insert an inline formula by surrounding it with a pair of
single \$ symbols; what is $x = 3 \times 5$?
For a \emph{displayed} formula, use double-\$
before and after --- include no blank lines!
$$\mu^{\alpha+3} + (\alpha^{\beta}+\theta_{\gamma}+\delta+\zeta)$$

\subsection{Numbered formulae}
Use the \emph{equation} environment to get numbered formulae, e.g.,
\begin{equation}
  y_{i+1} = x_{i}^{2n} - \sqrt{5}x_{i-1}^{n} + \sqrt{x_{i-2}^7} -1
\end{equation}
\begin{equation}
  \frac{\partial u}{\partial t} + \nabla^{4}u + \nabla^{2}u +
  \frac12
  |\nabla u|^{2}~ =~ c^2
\end{equation}

\section{Acknowledgments}
Thanks to my buddies {\AE}schyulus and Chlo\"{e},
who helped me define the macro \verb|\piRsquare|
which is $\piRsquare$.

The end.
\end{document}
% End of document.
Philosophy
It encourages people to focus on
content rather than appearance
MS Office: WYSIWYG (what you see is what you get)
LaTeX: “WYSIWYM” (what you see is what you mean)
LaTeX looks like a markup language (like HTML)
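The difference between saying what text looks like and saying what it means can be sketched in a few lines. This is an illustrative example only, not taken from the slides; the section name and sentence are made up.

```latex
\documentclass{article}
\begin{document}

% Visual markup: the author hard-codes the appearance.
{\Large\bfseries Results}\par
This finding is {\itshape important}.

% Logical markup: the author states the meaning and
% the document class decides how it should look.
\section{Results}
This finding is \emph{important}.

\end{document}
```

With the logical version, switching to a different document class restyles every heading and emphasis at once; the hard-coded version would have to be edited by hand.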
Content vs style
[Figure: the first page of the same paper, “Object Detection with Discriminatively Trained Part Based Models” by Pedro F. Felzenszwalb, Ross B. Girshick, David McAllester and Deva Ramanan, shown in two layouts: a two-column journal format and a single-column draft format marked “May 1, 2009, DRAFT”.]
(1-line change in LaTeX file)
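The one-line change noted above is typically the \documentclass line. A minimal sketch follows, assuming the IEEEtran class and its journal, draftcls and onecolumn options; the slides do not say which class or options the authors actually used, so treat both lines as illustrative.

```latex
% Assumption for illustration: the IEEEtran class and these options.
\documentclass[10pt,journal]{IEEEtran}              % two-column journal layout
% \documentclass[12pt,draftcls,onecolumn]{IEEEtran} % single-column draft layout

\title{Object Detection with Discriminatively Trained Part Based Models}
\author{Pedro F. Felzenszwalb, Ross B. Girshick, David McAllester and Deva Ramanan}

\begin{document}
\maketitle
\begin{abstract}
We describe an object detection system based on mixtures of
multiscale deformable part models.
\end{abstract}
\section{Introduction}
Object recognition is one of the fundamental challenges in computer vision.
\end{document}
```

Swapping which of the two \documentclass lines is commented out changes the layout of the whole paper while the content below it stays untouched.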
Pros/Cons
Cons
• You can’t see the final result straight away.
• You need to know the necessary commands for LaTeX markup.
• It can sometimes be difficult to obtain a certain ‘look’.
Pros (of a markup language)
• The layout, fonts, tables and so on are consistent throughout.
• Mathematical formulae can be easily typeset.
• Indices, footnotes and references are generated easily.
• Your documents will be correctly structured.
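As a small sketch of how numbering and references are generated automatically, the example below uses \label, \ref and \footnote. The label names (sec:method, eq:area) and the text are made up for illustration.

```latex
\documentclass{article}
\begin{document}

\section{Method}\label{sec:method}
The area of a circle is given by Equation~\ref{eq:area}.\footnote{Footnotes
are numbered automatically.}
\begin{equation}\label{eq:area}
  A = \pi r^2
\end{equation}

\section{Results}
Section~\ref{sec:method} defines the area. Adding a section or an
equation earlier in the file renumbers every reference automatically.

\end{document}
```

LaTeX needs to be run twice on this file before the references resolve; after the first run they appear as ??.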
Packages
There are many packages on top of TeX or LaTeX that let you do almost anything: use
unusual symbols, generate diagrams such as trees and proofs, write sheet music, produce
CAD drawings, check that every acronym is spelled out at least (or exactly) once, or
simply make existing features easier to use.
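Packages are loaded in the preamble with \usepackage. A minimal sketch, using the widely available amsmath and amssymb packages as examples (the slides do not name specific packages):

```latex
\documentclass{article}
\usepackage{amsmath}   % extra math environments such as align
\usepackage{amssymb}   % extra symbols such as \mathbb
\begin{document}
For all $x \in \mathbb{R}$:
\begin{align}
  (x+1)^2 &= x^2 + 2x + 1 \\
          &\geq 2x + 1 .
\end{align}
\end{document}
```

Without the two \usepackage lines, the align environment and \mathbb are undefined and the file does not compile.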
A look back
Written and oral communication are underrated skills for computer scientists
...well, really everyone
(Department made this a computer science class to prove it!)
A look back
http://www.joelonsoftware.com/articles/CollegeAdvice.html
Extreme case: life-or-death consequences
“Morals” of class
1) Written mechanics are dull but important
People have an ‘ear’ for sentences that sound wrong
ESL students (unfortunately) have to work harder
2) Focus on understanding audience’s perspective
What works for you when listening / reading?
Explaining things in a clear manner is not easy
3) Revision is a key part of the writing process
“Teach Writing as a Process Not Product”
Donald M. Murray
“Writing is a way to end up thinking something you couldn’t have started out thinking”
Peter Elbow
Visual Communication
Good Design Improves Usability
London Underground [Beck 33]
Official London Underground Map
Geographic version of map
• Visualizations are common
• Newspapers, textbooks, training manuals, scientific papers, ...
How best to illustrate a concept to an audience?
• Creating effective designs is time-consuming
Visual Communication
Contamination of drinking water could affect millions of people
Contamination
Sensors
Simulator from EPA
Place sensors to detect contaminations
“Battle of the Water Sensor Networks” co...
How best to illustrate a concept to an audience?
Where should we place sensors to quickly dete...
Visual communication
How does one determine good principles?
http://www.smashingmagazine.com/2008/01/31/10-principles-of-effective-web-design/
‘User studies’
The secret to getting better?
Practice
Read and listen a lot
What makes a good news article versus a bad one?
What makes a good lecturer versus a bad one?
Write and present when you can
Webpages, blogs, journals, Wikipedia
UCI Toastmasters
Thanks
Please give feedback on evaluation forms
What worked?
What didn’t?
*Will give 1 percent extra credit to everyone who fills one out*