At our organization we use R's lme4 for hierarchical mixed effects models. This requires the whole dataset to be loaded into memory. The data is very large, so we generally have to average over the lowest level of granularity (which is a random effect). There's also a push to adopt Bayesian approaches so we can report credible intervals, since those make the most sense to the business.
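For context, the models are basically lme4-style random-intercept regressions, which (if I have it right) translate to Bambi roughly like the sketch below. The column names y, x, group and the CSV path are just placeholders; Bambi still needs the whole data frame in memory, which is exactly the problem:

```python
import arviz as az
import bambi as bmb
import pandas as pd

df = pd.read_csv("data.csv")                # hypothetical file with y, x, group columns
model = bmb.Model("y ~ x + (1|group)", df)  # same formula style as lme4
idata = model.fit(draws=1000, chains=4)     # NUTS via PyMC under the hood
print(az.hdi(idata, hdi_prob=0.95))         # 95% credible intervals
```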
Given these requirements, I was wondering whether PyMC/Bambi or TensorFlow Probability supports batch processing and can scale well to big data?
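From what I can gather from the PyMC docs, minibatch ADVI (pm.Minibatch plus pm.fit) may be the relevant feature. Below is a rough sketch of what I think that would look like for a toy random-intercept model; the data and variable names are made up, and I'd appreciate confirmation that this is the intended way to scale:

```python
import numpy as np
import pymc as pm

# toy data standing in for the real tables: n_obs rows, n_groups random-effect levels
rng = np.random.default_rng(0)
n_groups, n_obs = 50, 200_000
group_idx = rng.integers(n_groups, size=n_obs)
x = rng.normal(size=n_obs)
y = 1.0 + 0.5 * x + rng.normal(size=n_groups)[group_idx] + rng.normal(0, 0.3, size=n_obs)

with pm.Model():
    # draw a random subset of rows at every gradient step
    y_mb, x_mb, idx_mb = pm.Minibatch(y, x, group_idx, batch_size=1_000)

    mu_a = pm.Normal("mu_a", 0, 1)
    sigma_a = pm.HalfNormal("sigma_a", 1)
    a = pm.Normal("a", mu_a, sigma_a, shape=n_groups)  # random intercepts
    b = pm.Normal("b", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)

    pm.Normal(
        "obs",
        mu=a[idx_mb] + b * x_mb,
        sigma=sigma,
        observed=y_mb,
        total_size=n_obs,  # rescale the minibatch likelihood to the full data size
    )

    approx = pm.fit(20_000, method="advi")  # stochastic variational inference
    idata = approx.sample(1_000)            # posterior draws for credible intervals
```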
I’ve been meaning to try it, but if I understand the idea correctly, I think you can sample from a histogram of your large data. Kind of like a bag of little bootstraps design…
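Roughly what I have in mind, using the plain bag of little bootstraps recipe on a toy statistic (the mean) with made-up data; plugging the actual mixed model in per subsample would be the harder part:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.lognormal(size=1_000_000)  # stand-in for the big data set
n = len(data)

b = int(n ** 0.6)                     # small subsample size, a few thousand rows here
n_subsamples, n_boot = 10, 100
lower, upper = [], []

for _ in range(n_subsamples):
    sub = rng.choice(data, size=b, replace=False)
    stats = []
    for _ in range(n_boot):
        # multinomial weights emulate resampling n points from the b-point subsample
        w = rng.multinomial(n, np.full(b, 1.0 / b))
        stats.append(np.average(sub, weights=w))
    lo, hi = np.percentile(stats, [2.5, 97.5])
    lower.append(lo)
    upper.append(hi)

# aggregate: average the interval endpoints across subsamples
print("approx 95% interval for the mean:", np.mean(lower), np.mean(upper))
```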