Has anyone tried to replicate the steps for the NSW study? I am not able to get propensity scores in the same range as shown in Table 5.15. I am getting the following instead.
Do you think the number of treatment vs. non-treatment samples should also be balanced? The book does not discuss this with respect to the NSW study. In this case the CPS samples far outnumber the NSW samples, and only a very small proportion of individuals would be considered eligible for the NSW program (so, mostly non-treated units). Does this cause the model to predict almost everyone in the joint dataframe as belonging to the control group, just because of the sheer number of non-treated units?
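To make the imbalance question concrete, here is a minimal sketch on simulated data, not the NSW/CPS files. The group sizes, the single covariate `x`, and the helper `fit_logit` are all hypothetical and chosen only for illustration. It fits a plain logit by maximum likelihood and checks where the fitted propensity scores end up.

```python
import math
import random

random.seed(0)

# Hypothetical group sizes, chosen only to mimic a heavy imbalance.
N_TREATED, N_CONTROL = 200, 16000

# One made-up covariate: treated units are shifted up by 1 on average.
x = [random.gauss(1.0, 1.0) for _ in range(N_TREATED)] + \
    [random.gauss(0.0, 1.0) for _ in range(N_CONTROL)]
d = [1] * N_TREATED + [0] * N_CONTROL


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(x, d, iters=25):
    """Fit P(d=1|x) = sigmoid(a + b*x) by Newton-Raphson (maximum likelihood)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, di in zip(x, d):
            p = sigmoid(a + b * xi)
            w = p * (1.0 - p)
            g0 += di - p            # score for the intercept
            g1 += (di - p) * xi     # score for the slope
            h00 += w                # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * step = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


a, b = fit_logit(x, d)
scores = [sigmoid(a + b * xi) for xi in x]

treated_share = N_TREATED / (N_TREATED + N_CONTROL)
mean_score = sum(scores) / len(scores)
share_below_half = sum(s < 0.5 for s in scores) / len(scores)

print("treated share:      ", round(treated_share, 4))
print("mean fitted score:  ", round(mean_score, 4))
print("share of scores<0.5:", round(share_below_half, 4))
```

For a logit with an intercept fit by maximum likelihood, the average fitted score equals the treated share, so very small scores for most units are what heavy imbalance produces and are not by themselves a sign of a coding error. Whether that accounts for the gap with the book's table is something this sketch cannot settle.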
I haven't run the code, but checking the additional Python reference I'm using while reading this book, your histograms look similar. My guess is that the bin size is different, which leads to very different-looking plots.
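A quick way to check the bin-size guess is to count the same scores with two different bin counts. This is only a sketch: the scores are drawn from a made-up skewed distribution rather than from the NSW data, and `bin_counts` is a hypothetical helper, not something from the book or the repo.

```python
import random

random.seed(1)

# Made-up scores in [0, 1], skewed toward zero; not the NSW/CPS scores.
scores = [random.betavariate(0.5, 20.0) for _ in range(5000)]


def bin_counts(values, n_bins, lo=0.0, hi=1.0):
    """Count values into n_bins equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == hi lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


coarse = bin_counts(scores, 5)
fine = bin_counts(scores, 50)

# Share of all units that falls into the first (leftmost) bar.
first_bar_coarse = coarse[0] / len(scores)
first_bar_fine = fine[0] / len(scores)

print("5 bins, first bar share: ", round(first_bar_coarse, 3))
print("50 bins, first bar share:", round(first_bar_fine, 3))
print("non-empty bins (5 / 50): ",
      sum(c > 0 for c in coarse), "/", sum(c > 0 for c in fine))
```

With few bins, a skewed set of scores tends to collapse into one tall bar near zero, while finer bins spread the same data over several bars. Plotting both versions with identical bin edges is the fairest way to compare two histograms.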
Thanks for your answer and for sharing the git repo. I do have a question, though:
The range of propensity scores in the notebook you shared and the range I got are roughly the same. But both are very different from the Mixtape book. Am I missing anything?