NSW study code replication?

Has anyone tried to replicate the steps for the NSW study? I am not able to get propensity scores in the same range as shown in Table 5.15. I am getting the following instead:

Also, Figure 5.3 comes out like this:


Let me know if anyone was able to replicate it.


Do you think the numbers of treated and non-treated samples should also be balanced? The book does not discuss this for the NSW study. Here the CPS samples far outnumber the NSW samples, and only a very small proportion of individuals would be considered eligible for the NSW program (so, mostly non-treated units). Does this cause the model to predict almost everyone in the joint dataframe as belonging to the control group, simply because of the sheer number of non-treated units?
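To make the imbalance question concrete, here is a minimal sketch on synthetic data (not the NSW/CPS files). The sample sizes, the single covariate `x`, and the helper `fit_logit` are all made up for illustration; the logit is fitted with a hand-written Newton-Raphson loop so that only NumPy is needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample sizes, chosen only to mimic a heavy imbalance.
n_treated, n_control = 200, 16000

# One synthetic covariate whose mean differs slightly between the groups.
x = np.concatenate([rng.normal(0.5, 1.0, n_treated),
                    rng.normal(0.0, 1.0, n_control)])
d = np.concatenate([np.ones(n_treated), np.zeros(n_control)])

X = np.column_stack([np.ones_like(x), x])  # intercept + covariate


def fit_logit(X, d, n_iter=25):
    """Fit a logit by Newton-Raphson and return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (d - p)
        # Hessian of the negative log-likelihood: X' diag(p(1-p)) X
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


beta = fit_logit(X, d)
pscore = 1.0 / (1.0 + np.exp(-X @ beta))

treated_share = d.mean()
print(f"treated share:      {treated_share:.4f}")
print(f"mean propensity:    {pscore.mean():.4f}")
print(f"share with p > 0.5: {(pscore > 0.5).mean():.4f}")
```

For a maximum-likelihood logit with an intercept, the average fitted score equals the treated share, so with a pool this lopsided most scores will sit near zero. That is expected behaviour for an estimated probability rather than a sign that the fit has failed, and by itself it would not explain a gap relative to the book's table.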

I haven't run the code, but judging by the additional Python reference I'm using while reading this book, your histograms look similar. My guess is that the bin size is different, which leads to very different-looking plots.
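A quick way to check the bin-size guess is to bin the same scores twice. This is only a sketch with synthetic, right-skewed values standing in for propensity scores; the beta parameters and the bin counts (10 and 50) are arbitrary assumptions, not the settings used in the book or the notebook.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, heavily right-skewed values in [0, 1] standing in for scores.
scores = rng.beta(0.5, 8.0, size=5000)

# Same data, same range, two different bin counts.
coarse, coarse_edges = np.histogram(scores, bins=10, range=(0.0, 1.0))
fine, fine_edges = np.histogram(scores, bins=50, range=(0.0, 1.0))

print("tallest bar, 10 bins:", coarse.max())
print("tallest bar, 50 bins:", fine.max())
print("total count, 10 bins:", coarse.sum())
print("total count, 50 bins:", fine.sum())
```

Both histograms hold the same observations, but the bar heights differ, so two plots of identical scores can look quite unlike each other. Passing the same `bins` and `range` to the plotting call on both sides should make the figures comparable.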

That’s not how regression should work with OLS. OLS regression should return the same coefficients.

Thanks for your answer and for sharing the git repo. I do have a question, though:

  1. The range of propensity scores in the notebook you shared and the range I got are roughly the same. But both are very different from the Mixtape book. Am I missing anything?

I noticed that as well. To be honest, I don't know either; I don't know how to read Stata code effectively, let alone run it…

Sorry I can't give you a better answer. I hope someone else is able to chime in.