Constrained Sampling
and Counting
Moshe Y. Vardi
Rice University
Joint work with: Supratik Chakraborty (IIT Bombay), Daniel Fremont (UC Berkeley),
Kuldeep Meel (Rice), and Sanjit Seshia (UC Berkeley)
1
How do we guarantee that systems work
correctly ?
Functional Verification
Formal verification
Challenges: formal requirements, scalability
~10-15% of verification effort
Dynamic verification: dominant approach
2
Dynamic Verification
Design is simulated with test vectors
Test vectors represent different verification
scenarios
Results from simulation compared to
intended results
Challenge: Exceedingly large test spaces!
3
Motivating Example
b
a
64 bit
64 bit
c = f(a,b)
64 bit
How do we test the circuit works ?
• Try for all values of a and b
• 2128 possibilities
• Sun will go nova before done!
• Not scalable
c
4
Constrained-Random Simulation
b
a
64 bit
64 bit
c = f(a,b)
64 bit
Sources for Constraints
• Designers:
1. a +64 11 *32 b = 12
2. a <64 (b >> 4)
• Past Experience:
1. 40 <64 34 + a <64 5050
2. 120 <64 b <64 230
• Users:
1. 232 *32 a + b != 1100
2. 1020 <64 (b /64 2) +64 a <64 2200
c
Test vectors: solutions of constraints
Proposed by Lichtenstein, Malka, Aharon (IAAI 594)
Constrained-Random Simulation
b
a
64 bit
64 bit
c = f(a,b)
64 bit
Sources for Constraints
• Designers:
1. a +64 11 *32 b = 12
2. a <64 (b >> 4)
• Past Experience:
1. 40 <64 34 + a <64 5050
2. 120 <64 b <64 230
• Users:
1. 232 *32 a + b != 1100
2. 1020 <64 (b /64 2) +64 a <64 2200
c
Problem: How can we uniformly sample the values of a and b
satisfying the above constraints?
6
Problem Formulation
Set of Constraints
b
a
64 bit
64 bit
c = f(a,b)
SAT Formula
Sample satisfying assignments
uniformly at random
64 bit
c
SAT Sampling
7
Roadmap
SAT Sampling
SAT Counting
Extensions
Future Directions
8
Diverse Applications
Constrained
Random
Simulation
Searchbased
Synthesis
Automatic
Problem
Generation
SAT Sampling
Probabilistic
Inference
Planning
under
uncertainity
9
Prior Work
UniGen
BDD
Guarantees
BGP
MCMC
SAT-Based
Performance
10
Core Idea
11
Partitioning into cells
Cells should be roughly equal in size and small
enough to enumerate completely
12
Partitioning into cells
Pick a random cell
Pick a random solution from this cell
13
How to Partition?
How to partition into roughly equal
small cells of solutions without
knowing the distribution of
solutions?
Universal Hashing
[Carter-Wegman 1979]
14
Universal Hashing
Hash functions: mapping {0,1}n to {0,1}m
(2n elements to 2m cells)
Random inputs => All cells are roughly equal (in expectation)
Universal family of hash functions:
Choose hash function randomly from family
For arbitrary distribution on inputs => All cells are roughly equal
(in expectation)
15
Strong Universality
H(n,m,r): Family of r-universal hash functions mapping
{0,1}n to {0,1}m (2n elements to 2m cells)
r: degree of independence of hashed inputs
Higher r => Stronger guarantee on range of size of cells
r-wise universality => Polynomials of degree r-1
Stronger universality => Higher complexity
16
Hashing-based Approaches
Solution space
n-universal hashing
All cells required to be small
Uniform Generation
BGP Algorithm (Bellare et al, 2000)
17
Scaling to ~0.8M Variables
Solution space
n-universal hashing
3-universal hashing
From tens of variables to
~0.8M variables!
Random
BGP Algorithm
All cells are small
Uniform Generation
UniGen
Only a randomly chosen
cell needs to be “small”
Almost-Uniform Generation
18
Underlying Hash Functions
A cell can be represented as the conjunction of:
Input formula F
m random XOR constraints
2m is the number of cells desired
Use CryptoMiniSAT for CNF + XOR formulas
19
Strong Theoretical Guarantees
Uniformity
Almost- Uniformity
UniGen succeeds with probability > 0.52
20
18000
case47
case_3_b14_3
case105
case8
case203
case145
case61
case9
case15
case140
case_2_b14_1
case_3_b14_1
squaring14
squaring7
case_2_ptb_1
case_1_ptb_1
case_2_b14_2
case_3_b14_2
2-3 Orders of Magnitude Faster
Timeout: 18000 seconds
1800
180
Time(s)
18
UniGen
1.8
XORSample'
0.18
Benchmarks
21
Results: Uniformity
500
450
400
Frequency
350
300
250
200
150
100
50
0
184
208
228
248
268
288
#Solutions
• Benchmark: case110.cnf; #var: 287; #clauses: 1263
• Total Runs: 4x106; Total Solutions : 16384
22
Results: Uniformity
500
US
450
UniGen
400
Frequency
350
300
250
200
150
100
50
0
184
208
228
248
268
288
#Solutions
• Benchmark: case110.cnf; #var: 287; #clauses: 1263
• Total Runs: 4x106; Total Solutions : 16384
23
Roadmap
SAT Sampling
SAT Counting
Works inspired from core ideas
Future Directions
24
What is SAT Counting?
Given a SAT formula F
RF: Set of all solutions of F
Problem (#SAT): Estimate the number of solutions of
F (#F) i.e., what is the cardinality of RF?
E.g., F = (a v b)
RF = {(0,1), (1,0), (1,1)}
The number of solutions (#F) = 3
#P: The class of counting problems for
decision problems in NP!
25
Practical Applications
Wide range of applications!
Estimating coverage achieved
Probabilistic reasoning/Bayesian inference
Planning with uncertainty
Multi-agent/ adversarial reasoning
[Roth 96, Sang 04, Bacchus 04, Domshlak 07]
26
Counting through Partitioning
27
Counting through Partitioning
Pick a random cell
Total # of solutions= #solutions in the cell
* total # of cells
28
Strong Theoretical Results
ApproxMC (CNF: F, tolerance: e, confidence:d)
Suppose ApproxMC(F,e,d) returns C. Then,
ApproxMC runs in time polynomial in log (1-d)-1,
|F|, e-1 relative to SAT oracle
29
Mean Error: Only 4% (e: 0.75)
3.6E+16
1.1E+15
3.5E+13
1.1E+12
Count
3.4E+10
Cachet*1.75
1.1E+09
3.4E+07
Cachet/1.75
1.0E+06
ApproxMC
3.3E+04
1.0E+03
3.2E+01
1.0E+00
0
10
20
30
40
50
60
70
80
90
Benchmarks
Mean error: 4% – much smaller than the
theoretical guarantee of 75%
30
Roadmap
SAT Sampling
Model Counting
Extensions
Future Directions
31
Extensions
Improved performance of UniGen
Distributed sampling after preprocessing
~20 SAT calls per sample
Weighted (non-uniform) sampling and counting
32
Roadmap
SAT Sampling
Model Counting
Extensions
Future Directions
33
Extension to More Expressive
Domains (SMT, CSP, ASP)
Efficient strongly-universal hashing schemes
Solvers to handle F + Hash efficiently
CryptoMiniSAT has fueled progress for SAT
domain
Similar solvers for other domains?
34
Key Takeaways
Constrained sampling and counting are fundamental problems
with wide variety of applications
Prior methods failed to scale or offered very weak theoretical
guarantees
UniGen: The first scalable generator with theoretical
guarantees of almost-uniformity
ApproxMC: The first scalable approximate SAT counter
Extensions of underlying techniques in different contexts
Visit: www.cs.rice.edu/~kgm2/ for papers/tools/source code!
35
Backup Slides
36
Can Solve a Large Class of Problems
70000
Time (seconds)
60000
50000
40000
ApproxMC
30000
Cachet
20000
10000
0
0
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
Benchmarks
Large class of problems that lie beyond the exact
counters but can be computed by ApproxMC 37
Exploring CNF+XOR
Very little understanding as of now
Eager/Lazy approach for XORs?
How to reduce size of XORs further?
38
Weighted Counting
Ref: “Distribution-Aware Sampling and Weighted Model Counting for
SAT” (In Proc. of AAAI 2014 )
39
Weighted Counting
Given
CNF Formula F
Weight Function W over assignments
Problem
What is the sum of weights of satisfying assignments?
Example
F = (a ∨ b)
W([0,1]) = W([1,0]) = 1/3 W([1,1]) = W([0,0]) = 1/6
W(F) = 1/3 + 1/3 + 1/6 = 5/6
40
Partition into (weighted) equal
“small” cells
41
Partition into (weighted) equal
“small” cells
Pick a random cell
Pick (by weight) a random solution from this cell
42
Can you always achieve
partitioning?
What if one solution dominates the entire solution
space
Tilt = wmax/wmin
Small tilt → All solutions contribute
43
How to handle large tilt?
.001
.001
.001
.001
.002
.001
.002
.002
.992
Tilt = 992
44
Handling Large Tilt
.001
.001
Can be achieved
with
Pseudo-Boolean
.002
.001
.50 ≤ wtSolver
<1
Still
not
Optimization
.001a SAT problem
.001
.002
.002
.001 ≤ wt < .002
.992
45
© Copyright 2026 Paperzz