• A Trip To Green Bank

    I took a trip out to Green Bank recently, home of the Green Bank Telescope, to do some upgrades on the onsite computer systems that process incoming data from the telescope, but also to take a cool tour and learn more about the workings of radio telescopes.

    It’s really big! The highest point is about 480 feet off the ground, and the dish is 300 feet across. It feels like going up the Eiffel Tower - you’re usually surrounded by metal frameworky parts as you go up, and it’s hard to take it all in. Imagine a 35 story building, but the top half can rotate around to point at different things in the sky.

    Here’s what it looks like from the ground nearby.

    gbt from below

    This picture is from the next morning - to go up to the top, the telescope has to be in “maintenance position”, where they point it so that the top tower part is vertical, so you can go up the elevator.

    Basically, the way it works is it needs two degrees of freedom so that it can point anywhere in the sky. The bottom is built on 16 legs that rest on a circular train-like thing that rotates the whole thing around. Then the “C” shape in the middle is like the edge of a big gear that rotates the dish up and down.

    When there’s too much snow on the dish, they just tilt the thing sideways to dump the snow off.

    Radio waves come in from places in the universe, bounce off the big dish, and then bounce off the little dish you can see way up at the top, called the “subreflector”. The platform right below the subreflector has the receivers, which are the sensors that measure incoming radio waves.

    gbt receivers

    The teacups are the receivers. The big circle on the floor is like the part on a microscope that you can rotate to use different lenses - it rotates so that you can put different receivers in the beam. Different receivers measure different ranges of frequency of radio waves.

    In the room below this, each of these circle things is connected to basically a refrigerator. Sometimes it’s a refrigerator with another refrigerator inside it, with another refrigerator inside that. The sensors have to be really cold because they are so sensitive, they pick up background heat as radio noise. No pictures in that room because of secret stuff.

    me over gbt

    Here’s a selfie of me standing at the lower level, by the big dish. Yeah I gotta trim my beard. You can just walk out onto the dish. But it doesn’t have a railing or anything, occasionally a panel falls off, and it’s hundreds of feet off the ground. Usually I consider myself “not afraid of heights” but this is the sort of trip that makes you reflect on where your limits are.

    Once the receivers measure the radio signals, they output an analog signal, it goes to a datacenter nearby, and then the things that happen there are pretty normal things, from a software engineering point of view. The telescope produces data at a much faster rate than there is bandwidth out of the facility, so you have to do your heavy processing onsite.

    It was a fun trip. I really appreciated the opportunity to see the telescope and to have a captive audience of engineers and astronomers that I could pepper with questions.

  • Books On Africa

    I’ve been reading various things related to Africa in one way or another recently, I have three strong book recommendations on the topic, and I thought I’d write a bit about them.

    Dancing In The Glory Of Monsters

    It’s rare that I read a book and end up feeling that I liked the book, I tried to pay close attention, and yet all the information flowed right through my head and trying to hold onto it was like trying to catch a river in my fingers.

    There are just so many details. So many ethnic groups, so many rebel groups, so many acronyms, so many refugee camps, so many cities where an atrocity happened. And yet the overall picture steadily becomes a bit clearer. Millions of people died in this series of wars in the Congo. It’s like a fractal story of chaos and loss.

    The book is gripping. Like Game of Thrones, new characters enter, fight for power, characters exit with gory violence and death. It’s hard to stop reading.

    I have three takeaways from this book. One is simply that Westerners underrate the amount of disaster and evil that has happened in central Africa because it’s so distant. Two is that many organizations in the Congo have an intertwined mix of elections, war-violence, and making money. In Western countries these things are much more separate.

    The third takeaway is how impossible it is for NGOs to operate in the Congo without themselves becoming intertwined in the status quo. Paying taxes and fees in the Congo isn’t like paying taxes and fees in Denmark. Sometimes it is directly funding some local warlord who is spending the money ramping up violence. The outcome of spending money in war-torn regions of Africa is very non-obvious.

    If you want even more of this sort of book, try The War That Doesn’t Say Its Name. Similar stuff, more academic, less un-put-downable.

    Being Good In A World Of Need

    This book is aiming at the philosophical question of, should we donate aid to Africa via NGOs? I found it to be a pretty compelling critique of the Effective Altruism idea that it’s possible to find a few specific charities that are more effective than others, whether it be malaria nets, deworming, or other causes.

    The fundamental problem is that the GiveWell analysis of “this charity saves one life for every $3000 you spend” is not believable. They do a couple statistical studies that show malaria nets saved lives in one case. But overall, aid to Africa doesn’t really seem to be lifting African countries out of poverty. We know that many, many studies in social sciences don’t replicate. Why should we believe the small amount of research that suggests these particular charities work over the research that shows the opposite?

    GiveWell pretty much ignores the political side effects of funding NGOs in Africa. Corrupt governments end up taking in a lot of money through taxing and charging fees on NGO activities. These organizations aren’t just funding health care, they are also funding violent dictatorships.

    There is also a question of substitution. This isn’t really in the book, it’s just something I’ve noticed after working with nonprofits for a little while. Often there are donors who really want their money to be spent with particular conditions. But the people running the nonprofit have their own ideas of what’s important. Often you have a set of nonprofits that are essentially run by the same group, and they raise money in different ways, but it all goes essentially into the same budget. Your most restricted money, you allocate that first. The money you have for the overall organization, you allocate that last, for whatever areas you need to fill in the gaps.

    In this world, donors simply don’t get to choose where the marginal dollar goes. Constraints on a funding source that’s less than half the budget just waste management time. The whole idea of making a choice about the marginal dollar is a construct designed to encourage people to donate. When you’re donating money to nonprofit causes, you should understand that. Either fund a new organization, or let the leadership allocate money.

    I worry that NGOs in a particular country essentially merge to become the same nonprofit organization. They have to work together, right? What sense does it make for multiple organizations to distribute malaria nets to the same small town? But that means you can’t really have an influence on how they behave via donations. You have to trust the overall group of cooperating NGOs to be doing the right thing.

    So these two books together worry me that so much of the Against Malaria Foundation work is basically sending more resources into the Congo.

    A Bend In The River

    Different sort of book. Fiction! This book is a masterpiece.

    Salim grows up in a coastal region of Africa, and the book follows his attempts to run a business in central Africa. It’s a different angle toward a similar theme, to me, which is that so many different cultures and interactions between the cultures happen within the region that I often just lump as “Africa” in my mind.

    Chaos, politics, love, violence, just like the nonfiction this book left me feeling like I didn’t necessarily understand anything any better, but I started to realize how much I didn’t know. The range of emotions Salim feels is completely real but as if the author tapped into a different set of emotions than you will read about elsewhere.


    If you are interested in the history of Africa, read the first book. If you are interested in donating money to African causes, read the second book. If you aren’t interested in either of those things, read the third book. Enjoy!

  • On 'Talent' and Software Engineers

    Sometimes a person sticks in your mind at the age you last knew them well. In my imagination, sometimes I remember my youngest sister as 11 years old, the age she was when I left for college, even though she’s an adult now, lives near me, we see each other and do adult things.

    Similarly, to me Daniel Gross is a 19-year-old getting ready for Demo Day, full of nervous energy, pitching Greplin with as much confidence as he can muster. But, as time goes on, this memory is getting further and further away from the reality, in which Daniel is a successful investor, cowriting a book with Tyler Cowen.

    The Book

    It’s a very interesting book. Talent. Nominally it is about how to “discover talent”. Part interviewing people to determine whether they have talent, part discovering them in the first place.

    The striking thing to me is that the sort of talent Dani and Tyler are looking for is a somewhat unique sort. Daniel is looking for start founders, and Tyler is looking for different positions but often for Emergent Ventures which is sort of like venture capital but not necessarily so capitalistic. They want unusual, creative, leaders who have the potential to do something that nobody else has done, the potential to change the world in some way.

    So, personally I have done a lot of interviewing, and I have a lot of opinions about it. And as I read this book, I compared it to my own experience, and I found myself wanting to be that sort of talent rather than wanting to be able to identify that sort of talent.

    What browser tabs do I have open? That’s an interview question? Well, it seems like a fair one. Maybe, in my life, I should occasionally evaluate myself on how interesting the browser tabs are, that I have open.

    Can I imagine the world being very different than it is today? I want to be able to imagine the world being very different than it is today.

    Software Engineers

    Nevertheless, reading this book makes me contrast it to hiring software engineers, the sort of hiring I have done the most of. When you hire a software engineer, it’s really important to evaluate whether they can code pretty well. You ask them to code in some way, and evaluate how well they do it. All these other issues - do they engage with other cultures, are they thoughtful, do they spend their spare time obsessing about work - I have met great software engineers on both sides of these issues and I feel like they are secondary.

    Great software engineers are just not very alike in personality. Some of them are quiet, don’t engage much in issues outside of software engineering, and simply write good code. Some of them are intellectually engaging, loud, quick-talking, disagreeable, and write good code. Many of them do not speak English well or are nervous talking to strangers and it will be very hard to assess any of their personality traits in an interview. There is some sense in which you want maturity - you don’t want someone whose interpersonal issues will tank a team - but most people are just fine on that front, your job as an interviewer is more to filter out the occasional psychopath.

    There are two points about hiring software engineers that I think are underrated. One is that how your team interviews quickly becomes more important than how you personally interview. Once you have a few software engineers, most of your interview-time will be performed by people who are not you. So you have to spend a lot of time teaching, encouraging people to share good practices, helping out the people who are not great at interviewing, and figuring out a process for combining the interview feedback from different interviewers. Those things quickly become more important than how you yourself interview.

    The other point about hiring software engineers is that for most great software engineers, the interview is not that important, because it’s pretty obvious you’re going to hire them. I would say 6/8 of the best 8 engineers I have ever hired, it was just an easy consensus from all the interviewers that this person was great and we should hire them. Instead, the important thing is “top of funnel”. You need to get those great engineers talking to you and interested in applying for a job with you in the first place.

    It isn’t easy to do this, and I am not the best at doing this, and this post is getting long, so I’ll just say… good luck.

    Hiring Talent-Style

    This book seems pretty relevant if you are a CEO or a VC. If you’re looking for a company founder to invest in, you are betting on a future where they do something unique. You want to bet on someone doing something unique.

    Similarly, as a CEO you are often hiring someone to be the head of some functionality in your company, and you want people who can take control of weird new initiatives. Your first product manager, or a head of a functional group of the company, you need unusual sorts of leadership here. Maybe all sorts of product management could use some of this.

    And it’s worth reading anyway if you are not hiring for any of these roles. Because maybe you want to be the sort of person who can achieve a weird new thing. Maybe reflecting on these qualities will help you realize that you want some of them.

    By the way, I have two tabs open right now. The GitHub repo for this blog, and how to install Jekyll on Ubuntu. Take that for what it’s worth!

  • Reason And Revelation

    Earlier I was reading this biography of Abraham Lincoln and the section about Deism was interesting. Lincoln was attacked by his opponents at some points for being an atheist, because he briefly dabbled with considering himself a “Deist”. I remembered reading sometime before that some founding father types were Deists - as I check up via Google it sounds like Benjamin Franklin, Thomas Jefferson and George Washington were. So there is a bit of American president tradition there.

    So I was curious, why bother? Because deism doesn’t seem like a big thing nowadays; people simply call themselves atheists or just don’t really talk about it and answer surveys “no religion” while also not really calling themselves atheists. I thought I’d read a pro-deism book to see.

    The Age Of Reason

    Man, this book is really something. This is the most anti-Bible book I ever read. “New atheism” isn’t even talking about the Bible any more, Dawkins and Sam Harris aren’t really arguing against the Bible, they’re arguing against people who feel abstractly like “religion is good” but even so are loosely affiliated.

    The Age Of Reason is much more like, hey we need to stop treating this book like it’s the word of God! The book of Mark disagrees with the book of Luke in these seven points! You can tell by an analysis of pronouns used that the gospel of Matthew was not actually written by Matthew! Moses advocates the rape and slaughter of thousands of innocent children in Numbers 31:13-47! Thomas Paine doesn’t literally use a lot of exclamation points but it feels like he wants to.

    This isn’t really a debate I can even engage with. It’s no longer in the Overton window, at least where I live in California and with the set of people I pay attention to on Twitter. It’s more like a portal into the past, giving me a glimpse of a world where the most intelligent American and European statesmen and scientists would frequently refer to the Bible as a source of factual knowledge.

    There is an interesting idea of “reason vs revelation”. Revelation is another concept I rarely think of nowadays. Thomas Paine is primarily making the point that revelation to a character in history is no longer revelation for someone else; now it’s trusting the word of another.

    Sometimes you read two seemingly unrelated books and a concept ties them together. Later on I read…


    Wilson’s thesis in this book is roughly that math, science, the social sciences, arts, ethics, religion, all of these things should fuse into one unified endeavour. It’s an interesting book and I recommend it to people interested in science but not only interested in science.

    The part that reminded me of the “reason vs revelation” concept was a discussion of why scientists liked doing science. Science is fundamentally about reason, about using reason to solve problems and figure out the natural world. But why you do science can’t be justified in the same way. In practice different scientists have very different motivations for wanting to do it in the first place. Some are religious, some aren’t. But the combination basically works.

    Perhaps some amount of revelation is necessary. With reason alone, can you ever get an “ought” from an “is”? Even if you rationally piece together why some course of action is correct, you find yourself working back to axioms where you get stuck. How much should we worry about animal welfare? You can boil it down to a question of one human life is equal to X chicken lives. What is X? How can this possibly be answered with reason alone?

    Aumann’s Agreement Theorem

    This “theorem” has always bugged me. I notice that I blogged about it seven years ago. It states that rational people should never disagree, and tries to prove it with some Bayesian reasoning and mathematical arguments.

    In practice, groups of smart people very often disagree with each other. A group of 100 people where everyone agrees on everything, that isn’t a gathering of smart, rational people, those people are groupthinking in some way.

    Perhaps this too is related to the reason vs revelation breakdown. If a person cannot operate entirely on reason, if they need at least a little bit of revelation as well, then you cannot entirely agree with other people. This little bit you disagree with them on, you don’t know if it’s because they’re smart and rational and they know something you don’t, or if their revelation is at work, putting its finger on the scale. And they don’t really know either. And it’s okay. You just need to figure out some pluralist way to work together despite not entirely agreeing on everything.

    Startups need a little bit of revelation. You need to be a little too excited, to believe in your idea a little too much, and it doesn’t work to rationally think “Startups require you to believe in your idea a little too much, so rationally I’m going to take my belief level and kick it up a notch.” It’s like Pascal’s wager; real belief doesn’t work that way.

    So What?

    Do you need more revelation? Do you want more revelation? It’s hard to just sit down and get excited about something. But sometimes it’s contagious, sometimes it’s worth seeking out other people who are a little too excited about their weird ideas. Sometimes it will spread to you.

  • A Math AI Minisurvey

    I’m curious about the idea of using AI to discover math proofs, so I thought I’d write up my thoughts on some of the relevant papers. This is very much not a complete list, but I feel like these papers define my “mental frontier” in some way.

    If there are other papers you think are equally cool and relevant, I’m interested to hear about it.


    The general strategy here is to hook up a text-based transformer to a proof assistant. They chose Metamath for the proof assistant, and it’s interesting that rather than use an existing Metamath implementation they basically wrote their own in Python. There’s a search 128 steps into the graph, with each step guided by the transformer.

    The “graph” here is like the move tree in chess or go. Each step is something like, prove statement X given all this stuff we know to be true. So you don’t really change the state of the search as time goes on, you just add to it. The tree in chess is also not really a tree once you consider that some boards you can reach via multiple move orders, but it’s logically more tree-like.

    Choosing Metamath is tough. Metamath is kind of like the assembly language of proof assistants. A single step in a language like Lean can take many steps in Metamath. The core problem here is that it’s annoying and buggy to hook up any external logic to a proof assistant. All the proof assistants are using their own weird languages, their own weird compilers, these really strange toolchains. So this does make a ton of sense as a practical decision - Metamath was simple enough they could just implement their own.

    Still, if you can only search for 128 steps, there are single Lean tactics that will expand to more Metamath steps than that.

    Polu 2022

    This one is a cooler successor. It’s using Lean instead of Metamath - presumably someone did a bunch of grunt work to hook it up. Props.

    There’s a transformer here trained on CommonCrawl and the “WebMath” corpus. I’m impressed that anything like this works at all. Maybe it’s just fine to train on a huge blob of data that contains a nonsense superset, there’s a bunch of extra space used but if you have the space then logically I guess the capability is the same.

    They train the model to find the next tactic used in a proof, and to minimize the size of the resulting proof. Makes sense to minimize the size of a proof - proofs have a weird property where technically you can add all sorts of extraneous statements into a proof, and the proof is equally valid. If you aren’t sure what to do, you do have some incentive to just write down a slew of true statements. (I got a 3/7 partial credit on an IMO geometry problem for doing this, and accidentally doing something useful.) When learning you really should prefer a shorter, more beautiful proof, though.

    Fine-tuning used a dataset which I believe is derived from lean’s mathlib. The lean mathlib community is quite impressive. 20 commits per day in March. I’m not quite sure how it compares to the other theorem proving communities, but the other ones tend to be less “one open source project with a swarm of activity” and more “many different groups doing different things” so I feel like Lean is the most likely to win out.

    It’s tough that they had to manually formalize some training problems. Getting reasonable training data is really hard for these math proofs.

    One aspect of this prover that I think is under-recognized is how much it uses complex Lean tactics. nlinarith, field_simp, norm_num, these are each really complicated manual theorem-proving tactics. There might be more code in nlinarith than in the AI-training components of this system. In one sense, this is good because as people make more tactics it will make this sort of approach stronger. In another sense, this approach seems like it will be restricted to the areas where we have really good tactics.

    Transformers overall

    I feel like transformers for proof generation are really hamstrung by a lack of training data. Transformers would love to have billions of proofs to learn patterns from. This seems achievable in some cases, like solving some systematic variety of problem by generating a lot of them. But many problems don’t work this way.

    I am curious whether a more reductionist approach would be useful, where you specifically try to generate “interesting problems”, in a sub-area like algebraic inequalities or Euclidean geometry. In practice, mathematicians do seem capable of differentiating interesting problems from boring problems. You generally want to study interesting problems to get better at math.

    The Gowers Manifesto

    Sort of a different angle on the problem entirely. Gowers poses a few questions in the blog post that I think would have a “standard machine learning” answer. I imagine a dialogue like this.

    Question. What is it about the proofs that mathematicians actually find that makes them findable in a reasonable amount of time?

    Answer. The state of a partial proof has many different features that each contribute a little bit to this. We can model these properties with deep learning.

    Question. Humans seem to be able to find proofs with a remarkably small amount of backtracking. How do they prune the search tree to such an extent?

    Answer. A mathematician’s intuition tells them what a good next step is, given the current state of the proof. We can approximate this intuition with a deep learning model.

    Problem. Come up with a precise definition of a “proof justification”.

    Answer. A proof justification is a deep learning model that takes a partial proof state and gives you possibilities for a proof tactic. It justifies any proof for which the combination of the scores for each of its tactics is high.

    To me, the problem with these answers is that deep learning alone has trouble with the minimal amount of training data that human mathematicians get. We need to be able to bootstrap. We need a lot of good proofs to learn what good proofs look like. More than human mathematicians are going to provide, Lean community notwithstanding. So I think we really do need better answers to these interwoven-but-related questions that Gowers is posing.

    At the heart of it is a basic principle that shorter proofs are more elegant. But that can’t be everything - it isn’t elegant to prove that 234 + 678 = 912 even if the proof is very short.

    The Lottery Ticket Hypothesis

    I hope this hypothesis is true because it makes my mental model of how deep learning works a lot simpler. Why are transformers so good? Because the previous methods for handling sequences had no way to model a function that jumps around the input like

    # Let a be an array of a bunch of smallish integers
    i = 0
    for _ in range(100):
      i = a[i]
    return i

    If that’s your function, the ideal circuit to solve it just isn’t a subset of the deep learning networks that you build if you use something like LSTM. With a transformer you can in fact make a network that accesses different locations. Voila.

    How much can a transformer model? I feel like a transformer is still limited, by its depth. Can you teach a transformer how to multiply two numbers? I think there is some sort of algorithmic limit, for functions that cannot be embedded in a transformer network, and perhaps these mathematical heuristics cannot be embedded in a transformer network. I’m not sure though.


    What is the simplest function of sequential data that transformers cannot learn?

    Is there any heuristic for whether a math problem is interesting or not?

    Let me know if you have a good answer…