Uncertainty Quantification for Machine Learning and Statistical Models

David J. Stracuzzi
Joint work with: Max Chen, Michael Darling, Stephen Dauphin, Matt Peterson, and Chris Young

Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND2017-2966C

Inverse Problems

$D = M(T)$, $T = M^{-1}(D)$

§ M has a known form
§ Determining $M^{-1}$ is ill-posed (unstable, non-unique)
§ Example: determine Earth’s subsurface density (T) from gravitational field measurements (D)
  § M is based on Newton’s law, $F = Gm/r^2$
  § D is almost certainly insufficient to parameterize $M^{-1}$
  § Regularize and optimize to get a “best” solution
§ Uncertainty comes from
  § D
  § Regularization + Optimization (definition of “best”)
  § Model parameters

Machine Learning and Statistical Modeling

$D = M(T)$, $T = M^{-1}(D)$

§ M has an unknown form
§ Substitute statistics for domain knowledge
§ Uncertainty comes from
  § D
  § Regularization + Optimization (definition of “best”)
  § Model form (and parameters)
  § Inference
§ Lack of domain knowledge impacts selection of model structure, regularization methods, and definition of “best”

Example: Seismic Onset Detection

Given
§ Waveform data containing both noise and seismic signals
Produce
§ Signal onset time
§ Precision is critical to downstream processing
Known
§ Wave propagation theory
Unknown
§ Point of origin
§ Event size or type
§ Composition/density of medium (rock)

[Figure caption: Simplified: signal is 3D. Ground truth does not exist.]

Seismic Detection Models

Model
§ ARMA(p,q): $x_t = c + \varepsilon_t + \sum_{i=1}^{p} \phi_i x_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}$
Basic Approach
§ Model the noise (M1) and the signal plus noise (M2) separately
§ Optimize model parameters $\theta$, $\phi$ via maximum likelihood
§ Akaike information criterion (AIC) to select p, q
§ Point at which the two models meet is the “best guess” signal onset

Seismic Detection Uncertainty

Method 1: Parametric Bootstrap
§ Construct a sampling distribution for the input waveform
§ Sample M1 & M2 to create new waveforms
§ Fit a new model to each sample and record k
Method 2: Model Sampling
§ M1 = ARMA(p1,q1); M2 = ARMA(p2,q2)
§ <p1,q1,p2,q2> determines k and a likelihood
§ Sample from <p1,q1,p2,q2> and fit the data
§ Use the likelihoods to construct a probability distribution over k

Not clear that these produce similar distributions

Uncertainty Quantification for Statistical Models

[Diagram relating the classic machine learning problem to the inverse uncertainty quantification problem. Labels: indirect observations + measurement errors (input); model induction / calibration + regularization; inference; approximate solution for the unobservable state; inferred state uncertainty = inference errors + model-form uncertainty + regularization effects + input measurement errors.]

Key Question: How much do sensor observations really tell us about the world?
• Stats / ML research communities do not typically frame questions this way.
• Most work focuses on building a better statistical model
• “What” the model says is emphasized over “how well”

Impact on Downstream Analyses

§ Several analyses depend on onset time:
  § Location (hypocenter)
  § Event type (natural or man-made) & size
  § Subsurface tomography
  § Earth model
§ Provide additional information to human analysts
  § Distribution over onset times (vs. confidence interval)
  § Relative reliability of data points and sensors
§ Rely more on current data, less on historical data and modeling assumptions
§ Improve sensitivity?
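As a rough illustration of how a distribution over onset times (rather than a single pick) could be produced, the sketch below follows the shape of the parametric bootstrap from the seismic slides under a strong simplifying assumption: both the noise model and the signal-plus-noise model are zero-mean white Gaussian noise, i.e. order-0 stand-ins for the ARMA(p,q) models. The function names, the `margin` parameter, and the synthetic waveform are hypothetical and are not taken from the presentation.

```python
import math
import random
from collections import Counter


def pick_onset(x, margin=5):
    """Return the split index k that minimises a two-segment AIC.

    Assumption: each segment is zero-mean white Gaussian noise (an order-0
    stand-in for ARMA(p, q)). Up to an additive constant,
    AIC(k) = k * log(var(x[:k])) + (n - k) * log(var(x[k:])).
    """
    n = len(x)
    prefix = [0.0]  # prefix sums of squares give segment variances in O(1)
    for v in x:
        prefix.append(prefix[-1] + v * v)
    best_k, best_aic = margin, math.inf
    for k in range(margin, n - margin + 1):
        var1 = max(prefix[k] / k, 1e-12)
        var2 = max((prefix[n] - prefix[k]) / (n - k), 1e-12)
        aic = k * math.log(var1) + (n - k) * math.log(var2)
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k


def bootstrap_onsets(x, n_boot=200, seed=0):
    """Parametric bootstrap: simulate waveforms from the fitted models, re-pick k."""
    rng = random.Random(seed)
    n = len(x)
    k_hat = pick_onset(x)
    # Fitted standard deviations of the two segments (stand-ins for M1 and M2).
    sigma1 = math.sqrt(sum(v * v for v in x[:k_hat]) / k_hat)
    sigma2 = math.sqrt(sum(v * v for v in x[k_hat:]) / (n - k_hat))
    counts = Counter()
    for _ in range(n_boot):
        sim = [rng.gauss(0.0, sigma1) for _ in range(k_hat)]
        sim += [rng.gauss(0.0, sigma2) for _ in range(n - k_hat)]
        counts[pick_onset(sim)] += 1
    # Empirical probability of each re-picked onset index.
    return k_hat, {k: c / n_boot for k, c in sorted(counts.items())}


# Synthetic waveform (hypothetical): noise, then a stronger signal from sample 300.
_rng = random.Random(1)
wave = [_rng.gauss(0.0, 1.0) for _ in range(300)]
wave += [_rng.gauss(0.0, 4.0) for _ in range(200)]

k_hat, onset_dist = bootstrap_onsets(wave)
print("picked onset:", k_hat)
print("most probable bootstrap onsets:",
      sorted(onset_dist.items(), key=lambda kv: -kv[1])[:5])
```

Replacing the white-noise segments with fitted ARMA models, and selecting their orders by AIC, would bring the sketch closer to the approach on the slides; the bootstrap loop itself would keep the same structure.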
§ Near term:
  § Model selection (improve fit and uncertainty)
  § Validate against synthetic & known events
  § Compare uncertainty results against analyst picks

Example 2: Multimodal Image Analysis

§ Given
  § Co-registered imagery from multiple sources
§ Produce
  § Segmentation with associated class probabilities and uncertainties, or
  § Detection of specific objects
§ Motivation
  § Trend toward automated image analysis
  § Trend toward “layered” analysis and combining information sources
  § Open question: How reliable are the results?
  § Open question: Which data is useful?
§ Currently
  § As many analytic methods as there are tasks
  § Some ignore uncertainty
  § Others evaluate it in their own uniquely sane way

[Figure panels: Optical Image, Lidar Image]

Approach

1. Merge data sources
   § Pixel = R + G + B + Height
2. Sample the data to create a new “image”
   § Non-parametric bootstrap
3. Probabilistically cluster the pixels
   § Gaussian Mixture
   § Nonparametric Mixture
4. Record the class probabilities for each pixel
   § Keep a histogram for each class at each pixel
5. Bootstrap: Repeat 2, 3, 4 many times
   § Histograms represent the posterior probability distributions
6. Future work: Translate unsupervised classes into semantic labels

Multimodal Image Analysis

[Figure panels: Source Imagery, Most Likely Class, Probability of Most Likely Class, Class Entropy, Uncertainty. Top row: optical image only. Bottom row: combined optical and lidar imagery.]

Closer Look at the Uncertainty

[Figure panels: Optical Only, Combined Imagery]

Value of Information

§ Evaluate data sufficiency
  § How much does a given data source contribute to an analytic result?
  § Which data do we really need?
  § Which data do we wish we had?
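The bootstrap clustering steps listed under Approach, and the question of how much a data source contributes, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mixture is a diagonal-covariance Gaussian mixture fitted with plain EM, classes are aligned across bootstrap replicates by sorting on the first feature's mean, each pixel's histogram is summarised by its mean class probability, and the two-class synthetic scene is invented for the example. All function names are hypothetical.

```python
import math
import random


def posterior(pixel, weights, means, variances):
    """Class membership probabilities of one pixel under a diagonal GMM."""
    logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for x, m, v in zip(pixel, mu, var):
            ll -= 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # subtract the max for numerical stability
    probs = [math.exp(l - top) for l in logs]
    total = sum(probs)
    return [p / total for p in probs]


def fit_gmm(data, n_classes, n_iter=20):
    """Fit a diagonal-covariance Gaussian mixture with plain EM."""
    dim, n = len(data[0]), len(data)
    ordered = sorted(data)  # initial means: quantiles along the first feature
    means = [list(ordered[(2 * c + 1) * n // (2 * n_classes)])
             for c in range(n_classes)]
    variances = [[1.0] * dim for _ in range(n_classes)]
    weights = [1.0 / n_classes] * n_classes
    for _ in range(n_iter):
        resp = [posterior(p, weights, means, variances) for p in data]  # E-step
        for c in range(n_classes):                                       # M-step
            total = sum(r[c] for r in resp) + 1e-9
            weights[c] = total / n
            for d in range(dim):
                mu = sum(r[c] * p[d] for r, p in zip(resp, data)) / total
                var = sum(r[c] * (p[d] - mu) ** 2
                          for r, p in zip(resp, data)) / total
                means[c][d], variances[c][d] = mu, max(var, 1e-4)
    # Assumption: order classes by first-feature mean so that labels line up
    # across bootstrap replicates.
    order = sorted(range(n_classes), key=lambda c: means[c][0])
    return ([weights[c] for c in order], [means[c] for c in order],
            [variances[c] for c in order])


def bootstrap_class_probs(pixels, n_classes=2, n_boot=20, seed=0):
    """Steps 2-5: resample, cluster, record class probabilities, repeat."""
    rng = random.Random(seed)
    sums = [[0.0] * n_classes for _ in pixels]
    for _ in range(n_boot):
        sample = rng.choices(pixels, k=len(pixels))  # non-parametric bootstrap
        model = fit_gmm(sample, n_classes)
        for acc, p in zip(sums, pixels):
            for c, q in enumerate(posterior(p, *model)):
                acc[c] += q
    # Mean class probability per pixel (a summary of the per-pixel histograms).
    return [[s / n_boot for s in acc] for acc in sums]


def mean_entropy(prob_rows):
    """Average per-pixel class entropy, in nats."""
    total = sum(-sum(p * math.log(p) for p in row if p > 0) for row in prob_rows)
    return total / len(prob_rows)


# Hypothetical two-class scene: colours overlap, heights do not.
_rng = random.Random(42)
pixels = []
for i in range(100):
    tall = i >= 50
    pixels.append((_rng.gauss(0.6 if tall else 0.4, 0.15),   # R
                   _rng.gauss(0.5, 0.15),                     # G
                   _rng.gauss(0.5, 0.15),                     # B
                   _rng.gauss(5.0 if tall else 0.0, 1.0)))    # Height (step 1)

probs_optical = bootstrap_class_probs([p[:3] for p in pixels])
probs_combined = bootstrap_class_probs(pixels)
h_optical = mean_entropy(probs_optical)
h_combined = mean_entropy(probs_combined)
print("mean entropy, optical only:", h_optical)
print("mean entropy, optical + height:", h_combined)
print("delta uncertainty from adding height:", h_optical - h_combined)
```

The difference between the two entropies is one possible way to express how much the height channel contributes to the result. A library mixture model, or a nonparametric mixture as the Approach slide mentions, could replace `fit_gmm` without changing the surrounding loop.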
[Figure panels: Source, Max Likelihood Probability Map, Delta Probability, Delta Uncertainty]

Resource Tasking Under Uncertainty

Relax assumption that desired information is actually obtained from a scheduled collect

[Diagram labels: Uncertainty Representations; Stochastic Optimization; Risk; Additional Value; Optimum; Costs; Target / Environmental Conditions; Sensor Performance Characteristics; Quality/Validation/Compliance; Quality; Distribution of Acquired Information; axis from 0% to 100%]

Goal: Maximize utility of available data and resources
Approach: Quantify impact of data on detection quality
§ Use uncertainty as basis for value
§ Define value as improved label, geospatial, or temporal discrimination
§ Incorporate information values into optimization objective functions

Summary

§ Uncertainty analysis does not “build a better model”
  § It indicates how well a given model captures the data
§ Research is to bridge the theory-application gap
  § Identifying which information is useful
  § Digging it out from deep within algorithms
  § Propagating uncertainty through layers of analysis
§ Important questions
  § What’s the relationship between uncertainties generated by
    § Measurement errors & data sampling (bootstrap)
    § Model selection and induction processes (model sampling)
    § Inference (MCMC)
    § …and how do we combine them? Or should we?
  § What issues arise when we propagate uncertainties from one statistical inverse problem to another?

Uncertainty Quotes

“We demand rigidly defined areas of doubt and uncertainty!”
– Douglas Adams, The Hitchhiker's Guide to the Galaxy

“It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.”
– Uncertain Origin

“Uncertainty is an uncomfortable position. But certainty is an absurd one.”
– Voltaire

“Ignorance more frequently begets confidence than does knowledge…”
– Charles Darwin

“As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.”
– Albert Einstein

POC: David Stracuzzi [email protected]