
Cautious Classification with Data Missing Not at Random using Generative Random Forests

Missing data present a challenge for most machine learning approaches. When a generative probabilistic model of the data is available, an effective approach is to marginalize missing values out. Probabilistic circuits are expressive generative models that allow for efficient exact inference. However, data are often missing not at random, and marginalization can lead to overconfident and wrong conclusions. In this work, we develop an efficient algorithm for assessing the robustness of classifications made by probabilistic circuits to imputations of the non-ignorable portion of missing data at prediction time. We show that our algorithm is exact when the model satisfies certain constraints, which is the case for the recently proposed Generative Random Forests, which equip random forest classifiers with a full probabilistic model of the data. We also show how to extend our approach to handle non-ignorable missing data at training time.
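The abstract contrasts classification with marginalized missing values against classification with imputed values. As a minimal illustration of marginalization at prediction time, the sketch below uses a small discrete naive Bayes model standing in for a probabilistic circuit; the priors and feature parameters are invented for the example and are not from the paper, and the sketch handles only the missing-at-random case (it does not implement the paper's robustness analysis for non-ignorable missingness).

```python
import numpy as np

# Illustrative class priors and per-class Bernoulli parameters
# for two binary features (hypothetical values).
priors = np.array([0.6, 0.4])            # P(y)
theta = np.array([[0.9, 0.2],            # P(x_j = 1 | y = 0)
                  [0.3, 0.8]])           # P(x_j = 1 | y = 1)

def joint(y, x):
    """P(y, x) for a fully observed binary feature vector x."""
    p = priors[y]
    for j, xj in enumerate(x):
        p *= theta[y, j] if xj == 1 else 1.0 - theta[y, j]
    return p

def posterior_marginalized(x_obs):
    """P(y | observed features); entries of x_obs set to None
    are marginalized by summing the joint over their completions."""
    missing = [j for j, v in enumerate(x_obs) if v is None]
    scores = []
    for y in range(len(priors)):
        total = 0.0
        for bits in range(2 ** len(missing)):
            x = list(x_obs)
            for k, j in enumerate(missing):
                x[j] = (bits >> k) & 1
            total += joint(y, x)
        scores.append(total)
    scores = np.array(scores)
    return scores / scores.sum()

# Feature 1 observed as 1, feature 2 missing.
post = posterior_marginalized([1, None])
```

In a tractable generative model such as a probabilistic circuit, this sum over completions is computed efficiently rather than by brute-force enumeration; the point of the paper is that when missingness is not at random, no single such marginal is trustworthy, and one must instead check robustness over the possible imputations.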

Jan 1, 2021

Two Reformulation Approaches to Maximum-A-Posteriori Inference in Sum-Product Networks

Jan 1, 2020

Prediction of Environmental Conditions for Maritime Navigation using a Network of Sensors: A Practical Application of Graph Neural Networks

Jan 1, 2020

On the Performance of Planning Through Backpropagation

Jan 1, 2020

Learning Probabilistic Sentential Decision Diagrams by Sampling

Jan 1, 2020

Finding Feasible Policies for Extreme Risk-Averse Agents in Probabilistic Planning

Jan 1, 2020

Decision-Aware Model Learning for Actor-Critic Methods: When Theory Does Not Meet Practice

Jan 1, 2020

A Contact Network-Based Approach for Online Planning of Containment Measures for COVID-19

Jan 1, 2020

Robust Analysis of MAP Inference in Selective Sum-Product Networks

Jan 1, 2019

Exploring the Space of Probabilistic Sentential Decision Diagrams

Jan 1, 2019