Monday, April 4, 2022

Train Horn Reflection





The Train Horn post from last month felt like a significant moment in StatistaGilfix’s history. I like most of my other longer posts, but the final product of Train Horns felt much more polished and potentially readable for a wider audience. And while some of my other posts took a ridiculous amount of time, the time that went into Train Horns was probably an order of magnitude above average. It was also the first “buddy post” (i.e., co-written) we’ve had here. And it was the first time I shared the blog anywhere beyond Twitter. So this seems like a good time to take a step back. Plus, it’s never a bad time for reflection*


*not true**; I constantly overthink things. 


**Also if you’re a soldier in a war you need to focus and not think too much about the guy you just killed (my fun-hating editor was not happy that this footnote made the final draft)



Soooo. Much. Time.


It was shocking how long it took to research everything and turn it into a post, especially considering two of us were working on it. Very roughly, researching the history of train laws took 10 combined hours, researching and coming up with numbers for collisions and the costs of train horns took around 15 hours, the writing took around 10 hours, and the editing and memes took another 5 hours. So that’s 40 hours! And it honestly felt like much more than that, but maybe that’s because of how much “real-world time” elapsed. The idea for the blog came to me in June or July, then an outline was written in September, and then the writing started on November 26th and ended on February 8th.


Each of the steps above felt like it should have gone at least twice as quickly, but I think the reality is that it takes a long time to do a good job finding everything that might be relevant. We easily could have spent much more time going down new rabbit holes, looking for better papers, and reading the ones we found more closely!



Teamwork


Working together was certainly an experience. Pretty much every writing/research session ended in a heated argument (I don’t remember the reason for a single one of these) which was super fun. And yet we want to do more buddy posts, so I guess we think we’ve found a way to make it work better.


Aside from team chemistry issues, it was hard to decide the division of labor. In the end, the train law research was mostly split, SJ did most of the writing, I did most of the research that went into the cost-benefit numbers, and we both contributed quite a bit to editing the final product. I wonder how co-written books and articles usually do things. It seems really challenging to have different people writing different sections and have it read as one voice. Maybe we did a good enough job editing for this to not be an issue. 


My only division of labor complaint: we would have been better off with a little more teamwork on the thought process behind the cost-benefit numbers. The numbers we came up with were mostly from my research and my thoughts...which isn’t ideal since this was really the most important part of the post. 



Writing Approach


Trying to find a balance between detail and readability/engagement is a challenge. I think using memes (thanks to Nikhil Krishnan/Out Of Pocket for the motivation) is a good way to up the engagement without having to make any sacrifices to the actual text. But ultimately it’s hard to do a Serious Rigorous Analysis while keeping things short and interesting enough for someone who comes in with only mild interest. Every time I write something I end up with more and more respect for the people on my blogroll whose writing I just kind of take for granted.



Cost-Benefit Calculations


Coming up with a bunch of different parameter estimates was really fun. I’ve always liked Fermi problems so I suppose I shouldn’t have been surprised. It definitely made me want to do more analyses like this though, and I think there are a few things I would improve on with more experience. For example, I had never thought about creating confidence intervals outside of statistical models and that ended up being interesting in a couple ways. 


First, it’s really hard to come up with “real” confidence intervals for each parameter. When a parameter estimate is based on a paper or series of papers (ex. our estimate of property value decreases from train honks), you can probably take the confidence intervals from those papers and come up with some reasonable joint distribution based on how much you trust each paper. Even that is pretty subjective and fuzzy. For parameters that are mostly based on “original” research (ex. how many people live “close” to a train crossing), there’s really no obvious method to use. I think doing lots of Fermi problems and forecasting/prediction market stuff is a pretty good way to get intuition on how to calibrate your confidence interval width. But even if that helps you construct better confidence intervals, it still leaves you with no explanation for your readers beyond “trust me on this one, my Brier score on Metaculus is ELITE”. We didn’t spend much time thinking about this, but in future iterations it would be worth coming up with a better system.
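For the paper-based case, here is one possible way to turn several published intervals into a single pooled interval. This is only a sketch, not what we did for Train Horns. It assumes each paper's 95% interval can be treated as a symmetric normal distribution, and it uses a "trust" weight per paper that is purely subjective. The three example papers and all of their numbers are made up for illustration.

```python
import random

def sample_pooled_estimate(papers, n_samples=50_000, seed=0):
    """Draw samples from a trust-weighted mixture of per-paper normals.

    papers: list of (low, high, trust) tuples, where (low, high) is a
    paper's reported 95% interval and trust is a subjective weight.
    Assumption: each interval is symmetric and roughly normal.
    """
    rng = random.Random(seed)
    weights = [trust for _, _, trust in papers]
    draws = []
    for _ in range(n_samples):
        # Pick a paper with probability proportional to how much we trust it.
        low, high, _ = rng.choices(papers, weights=weights, k=1)[0]
        mean = (low + high) / 2
        sd = (high - low) / (2 * 1.96)  # a 95% interval spans about +/- 1.96 sd
        draws.append(rng.gauss(mean, sd))
    draws.sort()
    return draws

def interval(sorted_draws, lower_q=0.025, upper_q=0.975):
    """Read an interval off sorted samples by nearest rank."""
    n = len(sorted_draws)
    lower = sorted_draws[int(lower_q * n)]
    upper = sorted_draws[min(n - 1, int(upper_q * n))]
    return lower, upper

# Hypothetical papers: (low, high, trust). These are NOT real estimates.
EXAMPLE_PAPERS = [(2.0, 6.0, 0.5), (4.0, 10.0, 0.3), (1.0, 3.0, 0.2)]

if __name__ == "__main__":
    low, high = interval(sample_pooled_estimate(EXAMPLE_PAPERS))
    print(f"pooled 95% interval: {low:.2f} to {high:.2f}")
```

The pooled interval comes out wider than any single paper's interval when the papers disagree, which seems like the right behavior. The trust weights are still the "subjective and fuzzy" part, but at least they are written down where readers can argue with them.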


Secondly, the correct way to combine parameter estimates is unclear. We multiplied the lower bound estimates of several different parameters to get our “overall” lower bound on the costs/benefits of train honking. This gave us really wide confidence intervals, which I thought was just the unavoidable reality of multiplying several estimates. Now that I think more about it I’m not sure that multiplying the 95th percentile estimates on parameter A and B together (which is what we did) gives you an overall 95th percentile estimate. 


If A and B were completely independent, it seems like this would actually give you a 99.75th percentile estimate (1-0.05^2) because the chance of both A and B happening to fall in the top 5% of their distributions is much smaller than the chance of one of them falling there. If there is some independence between A and B, you should be able to take smaller values of these to get the 95th percentile. Definitely something to think about and look into when we do something like this again. Also, this discussion is a great example of the kind of thing that is difficult to include in an engaging blog.
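A quick simulation is one way to check the intuition above. The sketch below draws two independent parameters, multiplies their individual 95th percentiles together, and asks what percentile of the product's distribution that "naive" bound actually lands on. The lognormal shape and the spread (`sigma=0.5`) are assumptions picked for illustration, not anything estimated in the Train Horns post.

```python
import random
from bisect import bisect_left

def percentile(sorted_values, q):
    """Nearest-rank quantile (0 <= q <= 1) of an already sorted list."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]

def simulate(n_samples=100_000, sigma=0.5, seed=0):
    """Compare the product of two 95th percentiles with the product's own percentiles.

    Assumption: A and B are independent lognormal variables with the same spread.
    """
    rng = random.Random(seed)
    a = [rng.lognormvariate(0.0, sigma) for _ in range(n_samples)]
    b = [rng.lognormvariate(0.0, sigma) for _ in range(n_samples)]

    a95 = percentile(sorted(a), 0.95)
    b95 = percentile(sorted(b), 0.95)
    naive_bound = a95 * b95  # multiplying the two upper bounds together

    product = sorted(x * y for x, y in zip(a, b))
    # Fraction of simulated products that fall below the naive bound.
    coverage = bisect_left(product, naive_bound) / n_samples
    # The actual 95th percentile of the product.
    true_95 = percentile(product, 0.95)
    # How often both parameters land in their own top 5% at the same time.
    both_top = sum(1 for x, y in zip(a, b) if x > a95 and y > b95) / n_samples

    return {
        "naive_bound": naive_bound,
        "true_95": true_95,
        "coverage": coverage,
        "both_top": both_top,
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, `both_top` should come out near 0.0025, matching the 0.05^2 calculation. `coverage` should land around 0.99 rather than 0.9975, because the product can also exceed the naive bound when only one parameter is very large. Either way, the naive bound is well above the product's true 95th percentile, so multiplying upper bounds overstates the width of the interval.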


I’m also curious how many applications there are for similar cost-benefit analyses. It seems useful to try to put some bounds on almost any big decision a person/company/government is making, but I would guess that in most cases the parameter estimates would be so uncertain and there would be enough multiplication (though see above) that you’d end up with some useless final numbers. The clearest cases I can think of where this would be useful are evaluating other “straightforward” laws and evaluating different charities à la Effective Altruism/GiveWell.



What’s The Point


I am constantly thinking about what I hope to get out of blogging and why I’m doing it. On the most basic level, I think as long as I am enjoying it that’s all that really matters. The other things competing for my time also don’t matter (reading and watching sports, mostly), so it’s not like I’m missing anything important by spending time writing. Some of the writing and research is fun (Train Horns was especially fun), but editing and making things flow better is definitely not. So hopefully there is more to this than just doing it for personal enjoyment.


One big goal of the blog is to improve my writing and thinking. I still haven’t decided if the point of improving these things is just that they will be helpful in life (career-wise, relationship-wise, epistemology-wise), or if I’m hoping I can improve enough to feel “worthy” of a wider audience. If I’m being honest, I do think it would be cool to have people reading and commenting on things I write. I think I can come up with enough ideas and analyses that would garner some interest, but I’m fairly doubtful that I will ever get to a point where my writing is at the level of other people I read online (especially since I’m only getting writing feedback from one person…). Maybe the buddy posts can help get around that to some extent (assuming SJ gets better at writing too). 


While writing to random people online would be fun, I would also be very happy to just have a few friends that consistently read and comment on stuff. Having friends who write for themselves would also be fun, motivating, good for bouncing ideas around, and would give me more perspective on this “what is the point” conundrum (peers are really important!). I’ve sent a handful of articles to a couple friends and gotten some feedback on those which has been nice. Sadly, I sent the Train Horn blog to 3 people and didn’t get any reads, which was pretty frustrating.


I also posted the link on an ACX open thread, which is the first time I’ve ever put the link in a place for randos to find it (I’ve posted a few on Twitter but only gotten one comment across 3 posts). Surprisingly I got 3 people commenting on that which was really cool (see 2 of them below)! In the future I think I’ll try the SSC sub-reddit and/or other similar ones for posts that I think deserve a real audience.



 



My Next Noise Nightmare


It’s been almost 6 months since I moved away from the SLP train tracks to Salt Lake City. And now I have another noise problem! We live next to a fairly busy road and every 15 minutes or so an extremely loud car drives by. At first this reallllllyyyyy bothered me. Now it’s just very annoying and stress-inducing. And it has me thinking…why are we letting people drive around in intentionally loud cars?


I haven’t decided if I want to write a post on this yet, but we could apply the same cost-benefit framework that we used for Train Horns to this problem. I think it would be much more challenging, though. Instead of using property values to figure out the benefits of banning loud cars (and enforcing that ban), we would actually have to estimate the costs to quality of life and health from loud cars. 


The costs would be much more complicated as well. With Train Horns, we just had to estimate the number of lives saved by horns and the value of a life. Here, we would need to estimate the costs of enforcing the ban on loud cars. Enforcement could mean:


There are other potential costs:

  • Whatever utility the psychopaths get out of having loud cars

  • More interactions with police could mean more dangerous interactions for both parties

  • The enforcement mechanism means taking money from people who might be poor and unable to pay for their car to be fixed


If I go through the effort of another post I will again attempt to remain unbiased in my calculations. But I would be extremely shocked if the number crunching leads to anything other than a policy recommendation of: “If car goes brrrrrrr, car owner’s money printer go brrrrrr”.


