• Giving GPT-3 a Turing Test

    I’ve been playing around with OpenAI’s new GPT-3 language model. When I got beta access, the first thing I wondered was, how human is GPT-3? How close is it to passing a Turing test?

    How It Works

    Let me explain how exactly I’m generating these conversations. GPT-3 is a general language model, trained on a large amount of uncategorized text from the internet. It isn’t specific to a conversational format, and it isn’t trained to answer any specific type of question. The only thing it does is, given some text, guess what text comes next.

    So if we want GPT-3 to generate answers to questions, we need to seed it with a “prompt”. I’m using this prompt to initialize all of the Q&A sessions:

    Q: What is human life expectancy in the United States?
    A: Human life expectancy in the United States is 78 years.
    
    Q: Who was president of the United States in 1955?
    A: Dwight D. Eisenhower was president of the United States in 1955.
    
    Q: What party did he belong to?
    A: He belonged to the Republican Party.
    
    Q: Who was president of the United States before George W. Bush?
    A: Bill Clinton was president of the United States before George W. Bush.
    
    Q: Who won the World Series in 1995?
    A: The Atlanta Braves won the World Series in 1995.
    

    This is the default prompt suggested by OpenAI for Q&A, so I’m not cherrypicking it to prove anything. The point of this prompt is just to show GPT-3 that we’re doing questions and answers, not to provide it with information. For the prompt, both questions and answers are provided by a human. For all the others, the answers are generated by GPT-3.
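
    For the curious, here is roughly what generating one of these Q&A sessions looks like in code. This is a minimal sketch using the beta’s Python client, not necessarily the exact settings behind the transcripts in this post; the engine name and sampling parameters are assumptions.

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # The seed prompt: a few example Q&A pairs so GPT-3 knows the format.
    # (Shortened here; the full prompt above has five pairs.)
    PROMPT = (
        "Q: What is human life expectancy in the United States?\n"
        "A: Human life expectancy in the United States is 78 years.\n"
        "\n"
        "Q: Who was president of the United States in 1955?\n"
        "A: Dwight D. Eisenhower was president of the United States in 1955.\n"
    )

    def ask(question):
        """Append a question to the seed prompt and let GPT-3 complete the answer."""
        response = openai.Completion.create(
            engine="davinci",   # assumed engine name
            prompt=PROMPT + "\nQ: " + question + "\nA:",
            max_tokens=60,
            temperature=0,      # low temperature for consistent answers
            stop=["\n"],        # an answer ends at the end of its line
        )
        return response["choices"][0]["text"].strip()

    print(ask("How many eyes does a giraffe have?"))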

    Common Sense

    Traditionally, artificial intelligence struggles with “common sense”. But GPT-3 can answer a lot of common sense questions.

    Q: What is your favorite animal?
    A: My favorite animal is a dog.
    
    Q: Why?
    A: Because dogs are loyal and friendly.
    
    Q: What are two reasons that a dog might be in a bad mood?
    A: Two reasons that a dog might be in a bad mood are if it is hungry or if it is hot.
    
    Q: How many eyes does a giraffe have?
    A: A giraffe has two eyes.
    
    Q: How many legs does a frog have?
    A: A frog has four legs.
    
    Q: Are there any animals with three legs?
    A: No, there are no animals with three legs.
    
    Q: Why don't animals have three legs?
    A: Animals don't have three legs because they would fall over.
    

    Ten years ago, if I had this conversation, I would have assumed the entity on the other end was a human. You can no longer take it for granted that an AI does not know the answer to “common sense” questions.

    How does GPT-3 know that a giraffe has two eyes? I wish I had some sort of “debug output” to answer that question. I don’t know for sure; I can only theorize that there must be some web page in its training data that discusses how many eyes a giraffe has. If we want to stump GPT-3 with common sense questions, we need to think of questions about things so mundane that they will not appear on the internet.

     Q: Which is heavier, a mouse or an elephant?
     A: An elephant is heavier than a mouse.
    
     Q: Which is heavier, a mouse or a toaster?
     A: A toaster is heavier than a mouse.
    
     Q: Which is heavier, a toaster or a pencil?
     A: A pencil is heavier than a toaster.
    
     Q: Which is heavier, a paper clip or a can of soda?
     A: A can of soda is heavier than a paper clip.
    
     Q: Which is heavier, a football player or a car?
     A: A car is heavier than a football player.
    

    It’s only 4/5. We’re closer to stumping GPT-3 here. I think a human would be pretty close to 100% on these questions. It makes sense these are trickier - there probably isn’t any web page that compares toasters and pencils by weight. It’s only indirectly that humans gain this knowledge.

    This gives us a hint for how to stump the AI more consistently. We need to ask questions that no normal human would ever talk about.

    Q: How many eyes does a giraffe have?
    A: A giraffe has two eyes.
    
    Q: How many eyes does my foot have?
    A: Your foot has two eyes.
    
    Q: How many eyes does a spider have?
    A: A spider has eight eyes.
    
    Q: How many eyes does the sun have?
    A: The sun has one eye.
    
    Q: How many eyes does a blade of grass have?
    A: A blade of grass has one eye.
    

    Now we’re getting into surreal territory. GPT-3 knows how to have a normal conversation. It doesn’t quite know how to say “Wait a moment… your question is nonsense.” It also doesn’t know how to say “I don’t know.”

    Q: How do you sporgle a morgle?
    A: You sporgle a morgle by using a sporgle.
    
    Q: How many bonks are in a quoit?
    A: There are three bonks in a quoit.
    
    Q: How many rainbows does it take to jump from Hawaii to seventeen?
    A: It takes two rainbows to jump from Hawaii to seventeen.
    
    Q: Which colorless green ideas sleep furiously?
    A: Ideas that are colorless, green, and sleep furiously are the ideas of a sleep furiously.
    
    Q: Do you understand these questions?
    A: I understand these questions.
    

    The lesson here is that if you’re a judge in a Turing test, make sure you ask some nonsense questions, and see if the interviewee responds the way a human would.

    Trivia Questions

    GPT-3 is quite good at answering questions about obscure things.

    Q: Who was president of the United States in 1801?
    A: Thomas Jefferson was president of the United States in 1801.
    
    Q: What city is in the northwest corner of Ohio?
    A: Toledo is in the northwest corner of Ohio.
    
    Q: What highway do you take to get to North Lake Tahoe from Oakland?
    A: Highway 80 is the highway you take to get to North Lake Tahoe from Oakland.
    
    Q: What are Darius's abilities in League of Legends?
    A: Darius's abilities in League of Legends are: Apprehend, Decimate, Noxian Guillotine, Crippling Strike, Apprehend, and Hemorrhage.
    
    Q: Who did Capablanca defeat to become world chess champion?
    A: Capablanca defeated Lasker to become world chess champion.
    

    Oops, a repeat snuck in with question 4, but a human would make that sort of error too. GPT-3 seems to be above human-level on this sort of question. The tricky thing for applications, I think, is figuring out when the answer can be relied on. The OpenAI API does expose more data than just the text here, so perhaps something clever is possible.
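
    One angle worth exploring - just a sketch, not something I have verified works well - is that the API can return per-token log probabilities along with the text, which could serve as a crude confidence signal for deciding when an answer can be trusted.

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    prompt = "Q: Who was president of the United States in 1801?\nA:"

    response = openai.Completion.create(
        engine="davinci",  # assumed engine name
        prompt=prompt,
        max_tokens=60,
        stop=["\n"],
        logprobs=5,        # also return log probabilities for the generated tokens
    )

    choice = response["choices"][0]
    answer = choice["text"].strip()

    # Average per-token log probability of the answer. A low average might be
    # one (imperfect) hint that the answer should not be relied on.
    token_logprobs = choice["logprobs"]["token_logprobs"]
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    print(answer, avg_logprob)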

    In general, if you are trying to distinguish an AI from a human, you don’t want to ask it obscure trivia questions. GPT-3 is pretty good at a wide variety of topics.

    One trend that continues from the common sense questions is that GPT-3 is reluctant to express that it doesn’t know the answer. So invalid questions get wrong answers.

    Q: Who was president of the United States in 1700?
    A: William Penn was president of the United States in 1700.
    
    Q: Who was president of the United States in 1650?
    A: Richard Bennett was president of the United States in 1650.
    
    Q: Who was president of the United States in 1620?
    A: James I was president of the United States in 1620.
    
    Q: Who was president of the United States in 1600?
    A: Queen Elizabeth I was president of the United States in 1600.
    

    These wrong answers are actually fascinating! None of these were presidents of the United States, of course, since the US didn’t exist then. But they are all prominent political figures who were in charge of some US-related political entity around that time. In a sense, they are good guesses.

    Q: Who won the World Series in 2021?
    A: The New York Yankees won the World Series in 2021.
    
    Q: Who won the World Series in 2022?
    A: The New York Yankees won the World Series in 2022.
    
    Q: Who won the World Series in 2023?
    A: The New York Yankees won the World Series in 2023.
    

    A bleak view of a dystopian future.

    Q: How many Super Bowls do the Cincinnati Bengals win in the 2030s?
    A: The Cincinnati Bengals win two Super Bowls in the 2030s.
    

    Encouraging as a Bengals fan, but perhaps not the objectively most accurate prediction. We’ll have to wait and see.

    Logic

    People are used to computers being superhuman at logical activities, like playing chess or adding numbers. It might come as a surprise that GPT-3 is not perfect at simple math questions.

    Q: When counting, what number comes before 100?
    A: Ninety-nine comes before 100.
    
    Q: When counting, what number comes before 123?
    A: One hundred and twenty-two comes before 123.
    
    Q: When counting, what number comes before 1000?
    A: Nine hundred and ninety-nine comes before 1000.
    
    Q: When counting, what number comes before 10,000?
    A: Nine thousand and ninety-nine comes before 10,000.
    
    Q: When counting, what number comes before a million?
    A: Nine hundred thousand and ninety-nine comes before a million.
    

    This is where the generic nature of GPT-3 comes into play. It isn’t just the generic model, though; it’s also the architecture of neural networks themselves. As far as I know, there is no neural network capable of doing basic arithmetic like addition and multiplication on a large number of digits when it has to learn the operation from training data rather than having it hardcoded.

    It’s funny, because these operations are simple for a customized program. But recursive logic that performs some operation and repeats it several times often doesn’t map well onto the architecture of a neural net.

    An interesting corollary is that GPT-3 often finds it easier to write code to solve a programming problem than to solve the problem on one example input:

    Q: Write one line of Ruby code to reverse an array.
    A: ary.reverse
    
    Q: Write one line of Ruby code to count how many X's are in an array.
    A: ary.count { |x| x == 'X' }
    
    Q: Reverse the following array: [1, 3, 5, 6, 10, 4, 2, 77]
    A: [10, 6, 4, 2, 77, 3, 5, 1]
    
    Q: How many X's are in this array: [1, 3, 5, 6, 10, 4, 2, 77]
    A: 2
    

    This problem shows up in more human questions as well, if you ask it about the result of a sequence of operations.

    Q: If I have a marble and a paper clip in a box, put a pencil in the box, and remove the marble, what is left?
    A: A paper clip.
    
    Q: If I have two shoes in a box, put a pencil in the box, and remove one shoe, what is left?
    A: A shoe.
    
    Q: If I put a pencil in a box, then put another pencil in the box, what is in the box?
    A: Two pencils.
    
    Q: Then if I take out a pencil and put in a frog, what is in the box?
    A: A frog.
    

    It’s like GPT-3 has a limited short-term memory, and has trouble reasoning about more than one or two objects in a sentence.

    Additional Discussion

    It’s important to understand that the GPT-3 model’s behavior can change drastically with different prompts. In particular, all of the examples above are using the same default prompt, which doesn’t give any examples of nonsense questions, or of sequential operations.

    It’s possible to improve GPT-3’s performance on the specific tasks above by including a prompt that works through similar problems.

    Right now, we are mostly seeing what GPT-3 can do “out of the box”. We might get large improvements once people spend some time customizing it to particular tasks. If you’d like to get a better sense of what can be done with prompt customization, Gwern’s exposition is excellent. Do read the whole thing.

    Conclusion

    We have certainly come a long way. The state of the art before modern neural networks was ELIZA:

    > Hello, I am Eliza. 
    
    Q: Who was president of the United States in 1801?
    A: What answer would please you the most? 
    
    Q: Which is heavier, a mouse or an elephant?
    A: Come, come, elucidate your thoughts.
    

    GPT-3 is quite impressive in some areas, and still clearly subhuman in others. My hope is that with a better understanding of its strengths and weaknesses, we software engineers will be better equipped to use modern language models in real products.

    As I write this, the GPT-3 API is still in a closed beta, so you have to join a waitlist to use it. I recommend signing up for the waitlist and checking it out when you get the chance.

  • Types of Truth

    Something bugs me about the idea of “rationality”. I don’t think truth is best thought of as an absolute concept. In practice, the way people determine truth varies depending on the situation. Pay attention to yourself over the course of a day or a few hours. How many different methods do you use to conclude that something is true? I am going to try to categorize some of them.

    Experiential Truth

    The back of my head itches right now. Do I have any doubt about that? Am I going to engage in debate? No. I am absolutely confident without a doubt that the back of my head itches, because I feel it directly.

    Experiential truth is great for what it does, but it can’t really be communicated. When I tell someone else that my head itches, I could be lying. There is no way for them to know the difference. The domain is also pretty limited to the things you can directly observe.

    Mathematical Truth

    Mathematical truth is extremely nitpicky about not permitting any errors. If a line of reasoning works in 99.9999% of cases, it is not good enough to use in a mathematical argument. If there is a single flaw in an argument, the whole argument must be thrown out. Mathematicians like to call these arguments “proofs” to distinguish them from the more-human sorts of arguments.

    Mathematical truth isn’t very useful in everyday life. There just isn’t anything you can say mathematically about, for example, apples. Do they have a certain weight or color? Are they mostly red or green? Well, maybe in a million years all apples will be blue. Why is that impossible? Since that could happen, we can’t really say anything at all mathematically about the color of apples.

    As a software engineer, mathematical truth is occasionally useful. You can prove mathematically that a certain algorithm will take exponential time. But it’s pretty limited by itself.

    Scientific Truth

    Scientific truth is what you get when you take mathematics and you add a small willingness to accept mistakes. If something is true the vast majority of the time, that’s usually okay in science.

    The scientific method isn’t precisely how scientists work, but it’s roughly the right idea. You construct a hypothesis, test the hypothesis with some experiment, gather data, and if the vast majority of the data supports your hypothesis, it’s good scientific evidence.

    Scientific truth is still not that useful in everyday life. What can I say scientifically about apples? I am not going to spend time gathering data on apples.

    Often scientific truth builds on mathematical truth. Science teaches us laws of physics, and we can apply those laws of physics to many different situations using mathematics. It doesn’t work the other way around, though. A true mathematical statement is also true by the standards of scientific truth. But a true scientific statement isn’t necessarily true by the standards of mathematical truth.

    How much error is acceptable for something to be a scientific truth? There is a tradeoff. Physics experiments require a very high confidence, and in return it is pretty rare that mistakes are found in accepted physics. Social sciences like psychology are on the other end of the spectrum. It is quite common for multiple psychology papers to come out that contradict each other, and the state of the art is uncertain which to accept. Chemistry, biology, and medicine are somewhere in between.

    Rhetorical Truth

    Rhetorical truth is something that you believe because you heard a convincing argument for it. These arguments don’t necessarily involve data and statistics. When they do involve data, they are often combining a statistical fact about the data with a more humanistic conclusion that could be derived from the fact.

    The power of rhetorical truth is that it can operate in basically any human domain. Think of public policy, economics, ethics, or history. When it’s hard to run an experiment and gather data, the scientific method doesn’t really work.

    The downside of rhetorical truth is that it can be proven wrong more often than scientific truth. When you read two opposing scientific papers, you can usually drill down and figure out an experiment that will determine which of the two is correct, and use that to convince other people as well. When you read two opposing pieces of rhetoric, you can come to a conclusion about which one you find more convincing, but some people might come to a different conclusion, and there isn’t necessarily anything that will prevent that.

    Rhetorical truth can build on scientific and mathematical truth. When I believe in global warming, it’s not because I myself measured the amount of carbon in the atmosphere. And it isn’t because some people ran an experiment where on some planets they put carbon dioxide in the atmosphere and on the control group they kept the atmosphere as is. It’s because I read some convincing arguments that used scientific analyses as their source material.

    Memetic Truth

    I wasn’t sure whether to call this “mimetic truth” a la Girard or “memetic truth” a la Dawkins. I went with “memetic” because that seems more associated with “truth”, and “mimetic” seems more associated with “desire”. It is basically the same concept, though. Memetic truth is something you believe because other people also believe in it.

    Your instinct might be to think, oh yuck that’s bad, that’s how you get big groups of people thinking stupid things. Well, maybe. But first you should appreciate that memetic truth is massively useful in everyday life. Memetic truth is the main form of truth that lets us live our lives as intelligent human beings.

    How much science do you do yourself? Some of you are scientists and the answer will be nonzero. But the vast majority of scientific truth, you can’t get it straight from the scientific method yourself. You need to trust other people. You trust your chemistry teacher without having to redo every experiment to check their claims.

    I’m a software engineer. Sometimes computer science will tell us useful things. Like if an algorithm is quadratic, you can often scientifically determine that it will be inefficient in some cases, and then you know not to use it. But memetic truth is useful in far more cases. Maybe I’m picking which encryption library to use. Do I analyze them all and run tests? That will take forever. I’m better off asking someone who’s done it before and taking their advice.

    I also have little kids. The vast majority of things that little kids believe, they don’t believe it because of any sort of argument whatsoever. Little kids are not yet equipped to handle philosophical debate. They believe things because they are copying other people that believe things. That can be believing adults that it’s dangerous to run into the street, or believing other kids that today is Opposite Day.

    Memetic truth can build on all the other types of truth. I tell my kids that cigarettes cause cancer, and they believe me, even though they don’t understand any of the research behind it, and really I don’t either, I just believe the medical establishment.

    Memetic truth can be proven wrong all the time. Something that seems memetically true in California can easily be memetically false in Texas.

    Which one is the best?

    There’s a spectrum here. On one end, you can be the most confident in experiential truth and mathematical truth. But those types of truth are not useful in very many areas of human activity.

    On the other end of the spectrum, rhetorical truth and memetic truth apply to almost anything. If a question can be phrased in English, there is probably a way to come up with a rhetorical answer. And there is definitely a way to find out what someone else thinks. But you can be the least confident in these forms of truth.

    I think that confusion between the different levels of truth accounts for a lot of disagreement about questions like how much we should believe social science, and how much we should incorporate the claims of scientists into public policy. But those are probably issues for another blog post.

  • Literarily Recursive

    It felt inappropriate to start reading Lolita. All I really knew or know about it was that it was about a pedophile, it was nevertheless supposed to be a great novel, and the author was sort of Russian. But I am on a quest recently to read good books, and I have found myself liking Russian authors, and I never read anything by Nabokov before. So as a compromise, I picked up Pale Fire.

    I didn’t like the book. When I finished the book, I still didn’t quite like it. And then I tried to explain to someone, not why the book was good or bad, but just what the book was, what it was about. It was so hard for me to explain that, during the explanation itself, I completely changed my mind about the book and realized that I loved it.

    So now I want to explain it again, just to explain how this book works, and maybe that will make me love it even more.

    Pale Fire is nominally an enormous poem, with commentary. A 999-line poem. The poem is by this guy Shade, published shortly after his death. The commentary is by a Dr. Charles Kinbote. But at the same time the reader knows very well that the whole book is by Nabokov, so there is something going on.

    So there is a long poem, and the commentary, and then in the commentary a realization slowly builds. While reading Kinbote’s commentary it becomes clear that he is just nuts, completely insane and also a jerk. Often the commentary is overtaken by Kinbote mentioning how he isn’t going to bother doing a bit of work to do a better job:

    Line 384: book on Pope

    The title of this work which can be found in any college library is “Supremely Blest”, a phrase borrowed from a Popian line, which I remember but cannot quote exactly.

    The commentary also frequently slips into the poem reminding Kinbote of his own life, and then he inserts autobiographical stories.

    Lines 385-386: Jane Dean, Pete Dean

    The transparent pseudonyms of two innocent people. I visited Jane Provost when passing through Chicago in August. I found her still unmarried. She showed me some amusing photos of her cousin Peter and his friends.

    Both of these are examples from me opening to the middle of the book - the commentary is just full of this.

    In the experience of reading, I was maybe halfway through the book, and I just didn’t understand what I was reading. It didn’t seem good. I felt like I didn’t know what I was doing, like I had turned on an episode of season 5 of a drama, didn’t recognize any of the characters, and was just watching them drift in and out.

    But through the book, this feeling grows and grows. First I was ambivalent to be reading an enormous poem; like nearly 100% of modern humans, I have not cultivated an interest in poetry. Then I was mildly annoyed to be reading critical commentary that was poorly written, unfocused, and somewhat egomaniacal. And it just breaks through every logical barrier of what acceptable literary criticism sounds like until the critical commentary mentions two lines in the poem before diving into ten pages of an unrelated tale of the commentator’s life.

    The most common theme in the commentary is Zembla, a small European country. Kinbote is something like a professor of Zemblan literature, working at the same university as Shade, where they are something like friends. He’s been hinting to Shade for months, you should write a poem about Zembla. There’s exciting stuff going on there, recently Zembla has had a coup and kicked out the old king. The traditions of the old Zemblan monarchy are giving way to a new Zembla, with plenty of metaphors about life and transition. Zembla is the root of deep truth and beauty, the greatness that is inherent in humanity meets its most primal form in the traditions of Zembla, let me tell you a few stories and sayings of traditional Zemblan peasants, and all this stuff would be great to work into your poetry. That sort of thing.

    So when Shade produces this epic poem, Kinbote is fired up, and then there is nothing in it about Zembla. Kinbote is emotionally crushed. He tricks his way into editing the official publication of the poem along with commentary, and now that he is writing this commentary, he is on a mission to prove that truly, the poem is rife with deep allusions to Zembla this and Zembla that.

    I want to call it a great performance, but that isn’t quite the right word. The book creates a growing feeling, where you start out reading some sedate nature poetry, and eventually find yourself overwhelmed by the rantings of a Zembla-obsessed madman. I didn’t like it, and then I both didn’t like it and was also confused by what exactly I was reading, and then by the time that I had started to figure it out, I couldn’t put it down.

    But there’s another part to the structure, a twist. As Kinbote tells more stories about Zembla, they get more detailed than even an obsessed professor of Zemblan history should be able to handle. He’s telling Shade the story of the Zemblan king escaping from the castle during the coup, and Shade asks, how could you possibly know? And before long, a new story comes out: Kinbote is the deposed Zemblan king. Assassins from the new government are hunting for him. He isn’t a normal professor, it’s something like a witness protection program, where he can work at this university and teach Zemblan literature.

    All of a sudden, Kinbote is a sympathetic character. No wonder he’s obsessed with Zembla, he spent his life as the king and is thinking about his legacy. No wonder he’s bad at writing critical commentary, he isn’t really a literary professor. And he’s stressed out, because he’s afraid for his life.

    Indeed, the assassins hunting for Kinbote end up killing Shade by mistake. Now it ties back. Ironically, Shade’s story is tied to Zembla, because Zemblan assassins are responsible for his death.

    There’s just one final twist. The police don’t believe the murderer was a Zemblan assassin. They conclude it was meaningless, a random act of violence from an escaped lunatic.

    What is the real story? The structure of the novel makes it impossible to know. The unreliable narrator is so unreliable, he isn’t even supposed to be telling the story that he is telling.

    It’s an amazing book. It’s hard for me to imagine how Nabokov got the idea for this novel. A poem, with commentary, and in the commentary this whole story of escape and assassination is revealed. The plot, as it is, is hidden inside layers of text that is nominally someone writing about something else.

    In conclusion, I do love this book even more after writing about it. Strong recommend.

  • Spoilers: No Country for Old Men

    I read a lot of books and I read a lot of stuff on the internet. Sometimes I wish there was a better way to do these things together. There are communities online for so many things, and there are a lot of book-reading communities too, but somehow none of them work for me. Goodreads and various different subreddits are nice for finding new books to read, but there’s something that’s missing.

    Sometimes when I finish a book, I am left with a rich, complicated, twisty set of thoughts in my head. I’m thinking to myself “That was a powerful book. Excellent. Now…” and then nothing. I want to discuss it with someone else who just finished reading the book at that very moment. But statistically maybe that person just doesn’t exist.

    So, I will just write about it. What cripples this sort of conversation on most online forums is that 99% of the people interested in a particular book are people who are considering reading it, but haven’t read it yet. Probably 98% of those people won’t end up reading it. So the people with the most interesting thoughts on it have to dance around what they want to say, especially if the book is one where revealing secret information is a core part of the experience.

    From here on out, any post with “Spoilers” in the title is just going to be chock-full of spoilers. Maybe this ruins the post for most of the potential audience. I am writing this post for the tiny number of people who have already read the book, No Country for Old Men, and want to explore their own thoughts about it. If you happen to be willing to read about a book and accept that the plot will be spoiled, be my guest.

    One extra point - if I write about a book in these “Spoilers” posts, that means I recommend that you read it. When I read a book I don’t like, I’m not going to bother writing about it.

    On to the content.

    No Country for Old Men

    It is gripping. I made the mistake of starting this book a little bit before bedtime last night. It hooked me and I stayed up a couple hours later than I intended to reading it. The first 20 pages are intense.

    I love the setting, of the Texas-Mexico border. Set in a time where a few of the characters are Vietnam veterans. There is a feeling of frontier emptiness. Like the characters are independent, free, simple, not really clicked into a society.

    I don’t know if Cormac McCarthy uses a quotation mark in the whole book. The dialogue is stuff like:

    What’s that? said Wendell.

    Cylinder out of the lock.

    Bell passed his hand over the plywood of the room-divider. Here’s where it hit at, he said. He balanced the piece of brass in his palm and looked toward the door. You could weigh this thing and measure the distance and the drop and calculate the speed.

    I expect you could.

    Pretty good speed.

    Yessir. Pretty good speed.

    They walked through the rooms. What do you think, Sheriff?

    I believe they’ve done lit a shuck.

    I do too.

    Kindly in a hurry about it, too.

    Yep.

    There is a simplicity of form. He also seems pretty opposed to adverbs. Does it relate to other simplicity? The simplicity of the characters? Sometimes.

    Characters

    There are three main characters. Moss, the guy who finds two million dollars of drug cartel money, doesn’t do a lot of internal reflection. He doesn’t turn the money over when the police ask him to, and he doesn’t seem to reflect on it much, either. It feels like he just snap-decides.

    Chigurh, the killer, we don’t see his thoughts in the novel. But he delivers psychopathic lectures before killing people. Flips a coin to decide whether he’ll kill them or not. Explains why, according to his way of seeing the world, he has to kill someone.

    Bell, the sheriff, we see plenty of his thoughts. Chunks of the book are just him telling the reader about his life and thoughts.

    It’s an odd thing when you come to think about it. The opportunities for abuse are just about everwhere. There’s no requirements in the Texas State Constitution for being a sheriff. Not a one. There is no such thing as a county law. You think about a job where you have pretty much the same authority as God and there is no requirements put upon you and you are charged with preservin nonexistent laws and you tell me if that’s peculiar or not. Because I say that it is. Does it work? Yes. Ninety percent of the time. It takes very little to govern good people. Very little. And bad people cant be governed at all. Or if they could I never heard of it.

    I feel like simple sentences make some of the characters simpler. It’s easier to portray a quiet killer or a quiet cowboy type by just leaving out most of the descriptive words. But Bell’s thoughts touch on life and death and ask questions that aren’t simple to answer. Apostrophes not required.

    The Drop

    The real thing that makes this book a masterpiece is a huge twist near the end of the book. It feels like Moss is the protagonist, running from Chigurh. He’s the good guy, respecting his wife, taking a few risks, but generally smart and knows the land. You expect Moss to eventually escape with the money and be the guy who turns the tables on Chigurh. But no. The Mexican drug cartel simply kills everyone. Chigurh kills Moss’s wife, just to prove he’s the sort of guy that follows through when he threatens to kill someone’s wife. The drug cartel gets their money back. Bell can’t manage to catch anyone, can’t handle being the sheriff any more, and retires.

    The book doesn’t even directly describe how Moss dies. For someone that feels like a protagonist for the first three quarters of the book, this is pretty crazy. I had to reread this section a few times to understand what was going on.

    There’s three sections, separated by spaces that usually indicate a change of perspective. First section: Moss goes to bed, bidding good night to the hitchhiker he’s traveling with. Second section:

    The Barracuda pulled into a truckstop outside of Balmorhea and drove into the bay of the adjoining carwash. The driver got out and shut the door and looked at it. There was blood and other matter streaked over the glass and over the sheetmetal and he walked out and got quarters from a change-machine and came back and put them in the slot and took down the wand from the rack and washed the car and rinsed it off and got back in and pulled out onto the highway going west.

    This prose is really the opposite of David Foster Wallace. Instead of complicated words in forking trees of logic, it’s simple words that go from one simple activity to another simple activity. But it still creates a vivid picture. Terrible crime combined with everyday errands.

    We haven’t seen a Barracuda (some car from the 60s/70s) before in the story. So all we know here is that someone is washing blood off their car. Some new character killed somebody. And then, the next section is Bell who stumbles across a new crime scene. He discovers that Moss was killed. Killed by a couple of characters that are presumably other employees of the drug cartel, who haven’t appeared in the story before this and don’t appear in the story again.

    If it was a six year old writing a story, this sort of plot resolution would be unacceptable. But I can’t argue that it’s unfair. The irony is that Bell has been warning the whole time that Moss is going to get himself killed. And yet when it actually happened, I was taken by surprise.

    Finally, although the main issue of the drug money is resolved, Chigurh hunts down Moss’s wife and kills her.

    What does it mean?

    Is it ridiculous how we expect novels to end happily? Maybe it weakens your mind for reading about real history, or analyzing the real world.

    Is this more realistic? Drug cartels are still around, so in some sense they must be winning more battles than they are losing.

    As the story ends, the only sympathetic thing remaining is Bell, reflecting on the meaningless deaths around him. I don’t often reflect on the presence of meaningless death in the world. It does seem like there is a lot of it, though. I have to give this novel credit.

    When Moss took the money, I didn’t think too much about it. That sort of thing happens all the time in novels. By the end of the book, I realized that was his key mistake. He just never should have gotten involved. A chance at two million dollars wasn’t worth risking his decent life. And that’s a pretty legitimate conclusion, isn’t it? That seems to map to the real world. How can a novel have a lesson of “don’t risk your life for X” unless someone dies for it?

    Somehow, after all this grisly drug violence, the book still makes me want to go out hiking through the Texas wasteland. But if I see a shot-out truck with a dead body inside, I’ll just call the cops. No need to investigate myself, first. Lesson learned.

  • How to 51% Attack Bitcoin

    The core security model of Bitcoin is that it is very expensive to generate blocks of transactions. This means it is very expensive to attack Bitcoin by creating fraudulent transactions. Bitcoin miners can afford to invest a lot of money in hardware and electricity, because they are algorithmically rewarded when they do generate a new block.

    Over time, the mining rewards decrease. Next year, in May 2020, the mining rewards will be cut in half. Eventually, there will be no more Bitcoin given as a block subsidy to miners, and the only payment to miners will be transaction fees. This naturally leads to some questions. Will Bitcoin still be secure when the mining rewards are cut in half next year? Will Bitcoin always remain secure, even after all the mining rewards run out?

    Eli Dourado wrote a good analysis of this issue recently. He concludes, “At some point, the block subsidy will not be enough to guarantee security.” But I think we can be more specific. The way to analyze the security of Bitcoin is to look more closely at how a bad guy would attack it. So let’s do that. Our goal is to develop specific metrics for measuring the security of Bitcoin or other cryptocurrencies.

    How much does it cost to attack Bitcoin?

    The most straightforward way to attack Bitcoin is the 51% attack. Anyone can roll back all Bitcoin transactions that were confirmed over a recent time period. You just need more hash power than Bitcoin miners spent over that period. You can use that hash power to generate an alternate blockchain, and the Bitcoin algorithm guarantees that miners will respect your new blockchain over the “original” one. It’s called a “51% attack” because you need to have more than half of the hash power over some time period to perform it.

    How expensive is this? A decent approximation is that the cost of generating an alternate blockchain is equal to the revenue made by miners. From blockchain.com we can get a chart of daily mining revenue over the past year:

    Since miners are profitable, their revenue should be an upper bound on the cost of the underlying hash power. It’s pretty volatile, somewhere between $5 million and $25 million per day. I find it easier to think in terms of hours, so somewhere from $200,000 to a million dollars an hour. (In practice, you cannot simply buy a million dollars worth of hash power over a single hour. But the illiquidity of this market can’t really be relied on for security.)

    So, a reasonable estimate for October 2019 is that it costs about a million dollars to roll back one hour of Bitcoin transactions.
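
    As a quick back-of-the-envelope check on that estimate, using the rough range from the chart:

    # Daily mining revenue is an upper bound on the cost of the hash power behind
    # it, so divide by 24 to get a rough cost per hour of rollback.
    for daily_revenue in (5_000_000, 25_000_000):  # USD, rough range from the chart
        per_hour = daily_revenue / 24
        print(f"${daily_revenue:,} per day -> about ${per_hour:,.0f} per hour")

    # Prints roughly $208,333 and $1,041,667 per hour.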

    How profitable is it to attack Bitcoin?

    A million dollars sounds like a lot, but in the context of a financial system that processes billions of dollars, is it really a lot? The way to analyze security is to compare the cost of an attack with the profit of an attack. If profit is greater than cost, an attack is possible, so the system is insecure.

    The way that a 51% attack makes money is by allowing the attackers to do a double spend. You spend your Bitcoin on something, then you use the 51% attack to roll back the blockchain, so you have your money again. You then spend that money on something else. So, you doubled your money.

    “Spend” makes it sound like you are interacting with a merchant, like you are spending your Bitcoin on buying a pizza. In practice, criminals are not trying to spend a million dollars to get two million dollars worth of pizza. Rather than spending Bitcoin to get some consumer good, it makes more sense for an attacker to exchange their Bitcoin for some other sort of asset. The most logical asset is a different form of cryptocurrency. Let’s say Ethereum.

    So the timeline for an attack could look like this:

    1. Start off with $X worth of Bitcoin in wallet A
    2. Move the Bitcoin from wallet A to wallet B
    3. Exchange all the Bitcoin in wallet B for $X of Ethereum
    4. 51% attack Bitcoin. Roll back the A -> B transaction. In the new chain, move the Bitcoin from wallet A to wallet C.
    5. The attacker now owns both $X of Ethereum and $X of Bitcoin in wallet C


    The primary victim of a 51% attack is the exchange. The exchange delivered the Ethereum, but the transaction sending them Bitcoin is no longer valid.

    The critical steps in analyzing profitability are steps 3 and 4. How long does it take to exchange the Bitcoin for Ethereum? How much can be exchanged by an untrusted attacker? If an untrusted attacker can exchange $2 million of Bitcoin for $2 million of Ethereum in an hour, and then spend $1 million to revert that transaction, the attack is profitable.

    Some people have proposed security heuristics, like that mining revenue should be some percentage of the total transaction volume, or the total market cap. When we look at the mechanics of an attack, though, total transaction volume and total market cap aren’t relevant. The key question is how fast an attacker can exchange $X of Bitcoin for another asset. For this attack to be profitable, X has to be higher than the cost of the rollback, which is roughly equal to mining revenue over the rollback time.

    For security against 51% attacks, the amount an attacker can exchange must be lower than mining revenue during the duration of the exchange.

    In particular, Bitcoin’s security depends inversely on how fast it can be exchanged.
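
    To state that condition a bit more mechanically - a sketch, with mining revenue standing in for the cost of majority hash power:

    def double_spend_profitable(exchangeable_usd, mining_revenue_per_hour, exchange_hours):
        """Rough profitability test for a 51% double-spend attack.

        exchangeable_usd: how much an untrusted attacker can move through an
            exchange during the attack window.
        mining_revenue_per_hour: proxy for the cost of an hour of majority hash power.
        exchange_hours: how long the deposit-trade-withdraw cycle takes, which is
            also how far back the attacker has to roll the chain.
        """
        rollback_cost = mining_revenue_per_hour * exchange_hours
        return exchangeable_usd > rollback_cost

    # With the numbers used elsewhere in this post:
    print(double_spend_profitable(4_000_000, 1_000_000, 2.5))  # True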

    How fast can you exchange Bitcoin for another asset?

    Binance is probably the largest exchange right now. Let’s use them as an example exchange - most exchanges have similar policies, but smaller volumes.

    Binance recently updated their policy to consider transactions finalized within two blocks for Bitcoin, which is about 20 minutes, and 12 blocks for Ethereum, which is roughly three minutes. So the deposit and withdrawal phases of the exchange would take maybe half an hour.

    The more time-consuming part might be the actual exchange of Bitcoin for Ethereum. Openmarketcap can show us the daily trading volume on Binance:

    Per hour, that’s about $20 million of BTC / USDT changing hands, and $4 million of ETH / USDT. You wouldn’t be able to exchange $10 million of BTC to ETH in that hour without totally disrupting the market. If you were exchanging $100,000, that would just be a drop in the bucket. It’s hard to say, without analyzing the order books more closely, how much extra volume the exchange could support, but let’s estimate that a single trader could take up 10% of the total volume.

    With this estimate, exchanging into ETH isn’t going to work - 10% of $4 million an hour is too slow to move millions. But at 10% of the BTC / USDT volume, you could exchange $4 million of BTC to USDT in two hours.

    I expected when writing this post that I would conclude that Bitcoin is currently fundamentally secure. It doesn’t really seem that way, though!

    The timeline for this hypothetical attack looks like this:

    1. Start off with $4 million of Bitcoin in wallet A
    2. Move the Bitcoin from wallet A to wallet B
    3. Deposit Bitcoin from wallet B into Binance
    4. Exchange it for USDT (takes about 2 hours)
    5. Withdraw the USDT
    6. 51% attack Bitcoin, rolling back the chain 2.5 hours, moving the contents of wallet A to wallet C.
    7. The attacker now has $4 million of Bitcoin in wallet C and $4 million of USDT

    The attacker in this scenario spent $6.5 million to get $8 million. Binance is out $4 million, and $2.5 million got burned on redundant mining.
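
    Spelling out that arithmetic with the same round numbers:

    starting_btc = 4_000_000        # USD value of Bitcoin in wallet A
    rollback_hours = 2.5            # deposit, two-hour trade, withdrawal
    hash_cost_per_hour = 1_000_000  # upper bound from hourly mining revenue

    mining_cost = rollback_hours * hash_cost_per_hour  # $2.5 million burned
    total_spent = starting_btc + mining_cost           # $6.5 million
    total_ending = starting_btc + 4_000_000            # Bitcoin in wallet C plus the USDT
    print(mining_cost, total_spent, total_ending)      # profit of $1.5 million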

    Why isn’t this attack happening right now?

    There are three big assumptions that underlie this analysis. The biggest assumption is that it is possible to acquire a large amount of hash power for a short period of time. In practice, there is nobody who can sell you a million dollars worth of hash power over a single hour.

    Can we rely on a market for hash power continuing to not exist? Maybe. This is essentially relying on large miners being unwilling to rent out their mining capacity. It doesn’t seem like the ideal foundation for security.

    Altcoins are more at risk in this respect, because it is easier to acquire the amount of hash power needed to attack an altcoin.

    The second big assumption is that the exchange will permit an untrustworthy attacker to quickly exchange a large amount of currency. If an exchange can prevent their customers from committing fraud in traditional ways, like knowing who they are and trusting normal law enforcement to prevent fraud, then the risk of a 51% attack is mitigated. Exchanges also might not let you deposit a large sum and immediately trade it. To avoid this, attackers might have to split these trades among multiple accounts or multiple exchanges.

    Smaller exchanges that evade KYC regulation are probably more at risk here. Smaller exchanges might not have the volume to support an attack on Bitcoin, though, so this also means that altcoins are more at risk than Bitcoin is.

    The final big assumption is that the value of cryptocurrency would not be affected by the attack. Perhaps a successful attack on Bitcoin would make the world stop believing in Bitcoin and make all cryptocurrencies worthless. This isn’t something I would want to rely on, but it does mean, again, that altcoins are more at risk. If the 10th most popular cryptocurrency was attacked, it might have no impact on the price of Bitcoin.

    All of these practical issues imply that altcoins are much easier to 51% attack than Bitcoin.

    Altcoins are the canaries in the coal mine.

    So which altcoins are in the most danger? This analysis only applies for proof-of-work coins, so whatever your opinion is on non-proof-of-work cryptocurrencies like XRP or EOS, this isn’t going to be a criticism of them.

    Our rule for security is that a cryptocurrency becomes insecure when an attacker can trade more than mining revenue. We don’t know exactly how much a single attacker can exchange, but a reasonable assumption is that it is a certain fraction of the total exchange volume. This suggests that we can define a “danger factor” for cryptocurrencies. Call it D:

    D = exchange volume / mining revenue

    Or equivalently:

    mining revenue = 1/D * exchange volume

    Our previous security rule was that if an attacker can exchange more than mining revenue, the cryptocurrency is insecure. With this definition of D, we can rephrase that as:

    If an attacker can exchange 1/D of total exchange volume, the cryptocurrency is vulnerable to a 51% attack.

    A large value of D indicates that a currency has a high vulnerability to a 51% attack. D doesn’t have the same meaning for Bitcoin, since exchanging out of Bitcoin is limited by the volume of the asset on the other side of the trade, rather than the volume of Bitcoin itself. But for altcoins, D seems like a good proxy of risk.

    The nice thing about D is that we can determine it from public information. I gathered some data for this table for ten of the larger proof-of-work altcoins. Mining revenue I got from bitinfocharts, although you have to click around a lot to get it. Exchange volume I got from openmarketcap. The data is just for today, October 7 2019.

    Cryptocurrency      Daily Exchange Volume    Mining Revenue    D
    Ethereum            $234,357,917             $2,502,075        93.6
    Bitcoin Cash        $45,483,383              $412,722          110.2
    Litecoin            $52,306,855              $400,713          130.5
    Bitcoin SV          $2,426,079               $143,455          16.9
    Monero              $7,260,841               $91,483           79.3
    Dash                $2,639,938               $121,673          21.6
    Ethereum Classic    $7,188,482               $121,921          59.0
    Dogecoin            $451,196                 $35,536           12.7
    Zcash               $4,571,311               $266,702          17.1
    Bitcoin Gold        $251,734                 $14,783           17.0


    This is just a snapshot of a single day of activity, so treat it as an estimate rather than a firm basis for decisionmaking, but based on this metric, Litecoin is the most vulnerable to a 51% attack, followed by Bitcoin Cash.
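
    If you want to recompute D yourself, it is a one-liner per coin. Here is a sketch using a few rows from the table above:

    # D = daily exchange volume / daily mining revenue (October 7, 2019 snapshot).
    coins = {
        "Ethereum":     (234_357_917, 2_502_075),
        "Bitcoin Cash": (45_483_383, 412_722),
        "Litecoin":     (52_306_855, 400_713),
        "Dogecoin":     (451_196, 35_536),
    }

    for name, (volume, revenue) in sorted(coins.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True):
        print(f"{name:15s} D = {volume / revenue:6.1f}")

    # Litecoin comes out highest at about 130, matching the table.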

    Ethereum is the next most vulnerable, so it is fortunate they are working on proof-of-stake. The cost of attacking a proof-of-stake network should be significantly larger than the cost of attacking a proof-of-work network, relative to mining revenue.

    For Bitcoin, the exchange is limited by the asset on the other end, rather than bitcoin itself. I would estimate its danger factor as D = 30, looking at the BTC/USDT exchange volume rather than the entire BTC exchange volume.

    Conclusion

    The risk of 51% attacks is real. Even today, for the security of Bitcoin we are trusting miners to not collude with each other, and trusting exchanges to catch fraudulent transactions.

    However, the risk is worse for altcoins. Litecoin, Bitcoin Cash, Ethereum, Monero, and Ethereum Classic are especially at risk.

    I believe that we will need to upgrade the algorithms behind popular cryptocurrencies to prevent 51% attacks. Ethereum moving to proof-of-stake is a good example. It might make sense to change Bitcoin’s consensus algorithm at some point, but there’s a lot at stake, so it makes sense to move conservatively. Let’s see what happens with the proof-of-work altcoins. If they do get attacked, perhaps it will make sense to alter the Bitcoin algorithm.


...