Do you think the number of treatment vs. non-treatment samples should also be balanced? The book does not discuss this with respect to the NSW study. In this case, the CPS samples far outnumber the NSW samples, and only a very small proportion of the CPS individuals would have been eligible for the NSW program (so they are mostly non-treated units). Does this cause the model to predict almost everyone in the joint dataframe as belonging to the control group, simply because of the sheer number of non-treated units?
I haven't run the code, but comparing against the additional Python reference I'm using while reading this book, your histograms look similar. My guess is that the bin size is simply different, which leads to very different-looking plots.
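To illustrate the bin-size point, here is a minimal sketch using only the standard library. The scores are synthetic draws from a skewed Beta distribution standing in for propensity scores; they are not the NSW/CPS data. The helper `bin_counts` is a hypothetical name made up for this example. The same 1,000 values are counted into 10 bins and into 50 bins, which is roughly what changing the `bins` argument of a plotting call would do.

```python
import random

def bin_counts(values, n_bins):
    """Count values in [0, 1] into n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        # Clamp so that a value of exactly 1.0 lands in the last bin.
        idx = min(int(v * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts

# Synthetic, heavily right-skewed scores (assumption: stand-in data only).
random.seed(0)
scores = [random.betavariate(0.5, 5) for _ in range(1000)]

coarse = bin_counts(scores, 10)  # wide bins: a few tall bars
fine = bin_counts(scores, 50)    # narrow bins: many shorter bars

# Same data, same total count, but different bar heights.
print("10 bins, tallest bar:", max(coarse))
print("50 bins, tallest bar:", max(fine))
```

Both versions summarize identical data, yet the tallest bar and the overall shape differ, so two plots of the same scores can look quite unlike each other until the bin settings are matched.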