The Virtuosi

New Home

2014-05-12T10:53:00.003-04:00

We've moved to http://thephysicsvirtuosi.com.

Tragedy of Great Power Politics? Modeling International War

2013-06-23T12:25:00.000-04:00

Recently I finished reading John Mearsheimer's excellent political science book The Tragedy of Great Power Politics. In this book, Mearsheimer lays out his ``offensive realism'' theory of how countries interact with each other in the world. The book is quite readable and well-thought-out -- I'd recommend it to anyone who has an interest in political history and geopolitics. However, as I was reading this book, I decided that there was a point of Mearsheimer's argument which could be improved by a little mathematical analysis.

The main tenet of the book is that states are rational actors who act to maximize their standing in the international system. However, states don't seek to maximize their absolute power, but instead their relative power as compared to the other states in the system. In other words, according to this logic the United Kingdom's position in the early 19th century -- when its army and navy could trounce most of the other countries on the globe -- was better than it is now -- when many other countries' armies and navies are comparable to those of the UK, despite the UK's current army and navy being much better than they were in the early 19th century. According to Mearsheimer, the main determinant of a state's international actions is simply maximizing its relative power in its region. All other considerations -- capitalist or communist economy, democratic or totalitarian government, even desire for economic growth -- matter little in a state's choice of what actions it will take. (Perhaps it was this simplification of the problem which made the book really appeal to me as a physicist.)

Most of Mearsheimer's book is spent exploring the logical corollaries of his main tenet, along with some historical examples. He claims that his idea has three different predictions for three different possible systems. 1) A balanced bipolar system (one where two states have roughly the same amount of power and no other state has much to speak of) is the most stable. War will probably not break out since, according to Mearsheimer, each state has little to gain from a war. (His example is the Cold War, which didn't see any actual conflict between the US and the USSR.) 2) A balanced multipolar system (N>2 states each share roughly the same amount of power) is more prone to war than a bipolar system, since a) there is a higher chance that two states are mismatched in power, allowing the more powerful to push the less around, and b) there are more states to fight. (One of his examples is Europe between 1815 and 1900, when there were several great-power wars but nothing that involved the entire continent at once.) 3) An unbalanced multipolar system (N>2 states with power, but one that has more power than the rest) is the most prone to war of all. In this case, the biggest state on the block is almost able to push all the other states around. The other states don't want that, so two or more of them collude to stop the big state from becoming a hegemon -- i.e. they start a war. Likewise, the big state is also looking to make itself more relatively powerful, so it tries to start wars with the little states, one at a time, to reduce their power. (His examples here are Europe immediately before and leading up to the Napoleonic Wars, WWI, and WWII.) There is another case, which is unipolarity -- one state has all the power -- but there's nothing interesting there. The big state does what it wants.

While I liked Mearsheimer's argument in general, something irked me about the statement about bipolarity being stable. I didn't think that the stability of bipolarity (corollary 1 above) actually followed from his main hypothesis. After spending some extra time thinking in the shower, I decided how I could model Mearsheimer's main tenet quantitatively, and found that it actually suggested that bipolarity was unstable!

Let's see if we can't quantify Mearsheimer's ideas with a model. Each state in the system has some power, which we'll call P_i. Obviously in reality there are plenty of different definitions of power, but in accordance with Mearsheimer's definition, we'll define power simply in a way that if State 1 has power P_1 > P_2, the power of State 2, then State 1 can beat State 2 in a war. [1] Each state does not seek to maximize its total power P_i, but instead its relative power R_i, measured as its share of the total power of all the states in the system. So the relative power R_i would be
\[
R_i = P_i / \left( \sum_{j=1}^N P_j \right) \qquad ,
\]
where we take the sum over the relevant players in the system. If there was some action that changed the power of some of the players in the system (say a war), then the relative power would also change with time t:
\[
\frac {dR_i}{dt} = \frac {dP_i} {dt} \times \left( \sum_{j=1}^N P_j \right)^{-1} - P_i \times \left( \sum_{j=1}^N P_j \right)^{-2} \times \left(\sum_{j=1}^N \frac {dP_j}{dt} \right) \qquad (1)
\]
A state will pursue an action that increases its relative power R_i. So if we want to decide whether or not State A will go to war with State B, we need to know how war affects each state's individual power. While this seems intractable, since we can't even precisely define power, a few observations will help us narrow down the allowed possibilities to make definitive statements on when war is beneficial to a state:

1. War always reduces a state's absolute power. This is simply a statement that in general, war is destructive. Many people die and buildings are bombed, neither of which is good for a state. Mathematically, this statement is that in wartime, dP_i/dt < 0 always. Note that this doesn't imply that dR_i/dt is always negative.

2. The change in power of two states A & B in a war should depend only on how much power A & B have. In addition, it should be independent of the labeling of states. Mathematically, dP_a/dt = f(P_a, P_b), and dP_b/dt = f(P_b, P_a) with the same function f. [2]

3. If State A has more absolute power than State B, and both states are in a war, then State B will lose power more rapidly than State A. This is almost a re-statement of our definition of power. We defined power such that if State A has more absolute power than State B, then State A will win a war against State B. So we'd expect that power translates to the ability to reduce another state's power, and more power means the ability to reduce another state's power more rapidly.

4. For simplicity, we'll also notice that the decrease of a State A's absolute power in wartime is largely dependent on the power of State B attacking it, and is not so much dependent on how much power State A has.

In general, I think that assumptions 1-3 are usually true, and assumption 4 is pretty reasonable. But to simplify the math a little more, I'm going to pick a definite form for the change of power. The simplest possible behavior that captures all 4 of the above assumptions is:
\[
\frac {dx}{dt} = -y \qquad \frac {dy}{dt} = -x \qquad (2) \quad ,
\]
where x is the absolute power of State X and y is the absolute power of State Y. (I'm switching notation because I want to avoid using too many subscripts.) [3] Here I'm assuming that the rate of change of State X's power is directly proportional to State Y's power, and depends on nothing else (including how much power State X actually has). We'll also call r the relative power of State X, and s the relative power of State Y. [4] Now we're equipped to see when war is a good idea, according to our hypotheses.

Let's examine the case that was bothering me most -- a balanced bipolar system. Now we have only two states in the system, X and Y. For starters, let's address the case where both states start out with equal power (x = y). If State X goes to war with State Y, how will the relative powers r = x/(x+y) & s = y/(x+y) change? Looking at Eq. (2), we see that by symmetry both states have to lose absolute power equally, so x(t) = y(t) always, and thus r(t) = s(t) always. In other words, from a relative power perspective it doesn't matter whether the states go to war! For our system to be stable against war, we'd expect that a state will get punished if it goes to war, which isn't what we have! So our system is a neutral equilibrium at best.
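As a quick check of the symmetric case, one can substitute Eq. (2) with x = y into Eq. (1) written for two states. This is only a worked restatement of the argument above, not a new result:
\[
\frac {dr}{dt} = \frac{1}{x+y}\frac {dx}{dt} - \frac{x}{(x+y)^2} \left( \frac {dx}{dt} + \frac {dy}{dt} \right) = -\frac{x}{2x} + \frac{x \cdot 2x}{(2x)^2} = -\frac{1}{2} + \frac{1}{2} = 0 \qquad .
\]
The same cancellation happens for s, so neither state's relative power moves.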

But it gets worse. For a real balanced bipolar system, both states won't have exactly the same power, but will instead be approximately equal. Let's say that the absolute powers of the two states differ by some small (positive) number e, such that x(0) = x0 and y(0) = x0 + e. Now what will happen? Substituting Eq. (2) into Eq. (1), we see that, at t=0,
\[
\frac {dr}{dt} = -\frac{x_0 + e}{2x_0 + e} + \frac{x_0(2x_0 + e)}{(2x_0 + e)^2} = -\frac{e}{2x_0 + e}
\]
\[
\frac {ds}{dt} = -\frac{x_0}{2x_0 + e} + \frac{(x_0+e)(2x_0 + e)}{(2x_0 + e)^2} = +\frac{e}{2x_0 + e} \qquad .
\]
In other words, if the power balance is slightly upset, even by an infinitesimal amount, then the more powerful state should go to war! For a balanced bipolar system, peace is unstable, and the two countries should always go to war according to this simple model of Mearsheimer's realist world.
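For readers who would rather see this numerically, here is a minimal sketch that integrates Eq. (2) with a simple forward Euler step and tracks State X's relative power r. The function name, the step size, and the starting values (x0 = 1, e = 0.01) are illustrative choices of mine, not part of the model itself.

```python
def relative_power_history(x0, y0, dt=1e-3, steps=500):
    """Euler-integrate dx/dt = -y, dy/dt = -x and return r = x/(x+y) at each step."""
    x, y = x0, y0
    history = [x / (x + y)]
    for _ in range(steps):
        # Both updates use the powers from the start of the step.
        x, y = x - y * dt, y - x * dt
        if x <= 0 or y <= 0:
            break  # one state has been wiped out; the model stops applying
        history.append(x / (x + y))
    return history


balanced = relative_power_history(1.0, 1.0)   # exactly equal powers
tilted = relative_power_history(1.0, 1.01)    # Y starts with e = 0.01 more power

print("balanced: r starts at", balanced[0], "and ends at", balanced[-1])
print("tilted:   r starts at", tilted[0], "and ends at", tilted[-1])
```

In the balanced run r never moves, while in the tilted run the weaker state's share shrinks at every step. The initial slope should be close to -e/(2 x0 + e), which is what substituting Eq. (2) into Eq. (1) gives at t = 0.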

Of course, we've just considered the simplest possible case -- only two states in the system (whereas even in a bipolar world there are other, smaller states around) who act with perfect information (i.e. both know the power of the other state) and can control when they go to war. Also, we've assumed that relative power can change only through a decrease of absolute power, and in a deterministic way (as opposed to something like economic growth). To really say whether bipolarity is stable against war, we'd need to address all of these in our model. A little thought should convince you which of these either a) makes a bipolar system stable against war, or b) makes a bipolar system more or less stable compared to a multipolar system. Maybe I'll address these, as well as balanced and unbalanced multipolar systems, in another blog post if people are interested.

[1] P_i has some units (not watts). My definition of power is strictly comparative, so it might seem that any new scale of power p_i = f(P_i) with an arbitrary monotonic function f(x) would also be an appropriate definition. However, we would like a scale that facilitates power comparisons if multiple states gang up on another. So we would need a new scale such that p_(i+j) = f(P_i + P_j) = f(P_i) + f(P_j) = p_i + p_j for all P_i, P_j . The only function that behaves like this is a linear function of P: p_i = A*P_i , where A is some constant. So our definition of power is basically fixed up to what ``units'' we choose. Of course, defining P_i in terms of tangibles (e.g. army size or GDP or population size or number of nuclear warheads) would be a difficult task. Incidentally, I've also implicitly assumed here that there is a power scale, such that if P_1 > P_2 , and P_2 > P_3, then P_1 > P_3. But I think that's a fairly benign assumption.

[2] This implicitly assumes that it doesn't matter which state attacked the other, or where the war is taking place, or other things like that.

[3] Incidentally this form for the rate-of-change of the power also has the advantage that it is scale-free, which we might expect since there is no intrinsic ``power scale'' to the problem. Of course there are other forms with this property that follow some or all of the assumptions above. For instance, something of the form dx/dt = -xy = dy/dt would also be i) scale-invariant, and ii) in line with assumptions 1 & 2 and partially in line with assumption 3. However I didn't use this since a) it's nonlinear, and hence a little harder to solve the resulting differential equations analytically, and b) the rate of decrease of both states' power is the same, in contrast to my intuitive feeling that the state with less power should lose power more rapidly.

[4] Homework for those who are not satisfied with my assumptions: Show that any functional form for dP_i/dt that follows assumptions 1-3 above does not change the stability of a balanced bipolar system.

Re-evaluating the values of the tiles in Scrabble™

2013-01-20T22:52:00.002-05:00

Recently I have seen quite a few blog posts written about re-evaluating the points values assigned to the different letter tiles in the Scrabble™ brand Crossword Game. The premise behind these posts is that the creator and designer of the game assigned point values to the different tiles according to their relative frequencies of occurrence in words in English text, supplemented by information gathered while playtesting the game. The points assigned to different letters reflected how difficult it was to play those letters: common letters like E, A, and R were assigned 1 point, while rarer letters like J and Q were assigned 8 and 10 points, respectively. These point values were based on the English lexicon of the late 1930’s. Now, some 70 years later, that lexicon has changed considerably, having gained many new words (e.g.: EMAIL) and lost a few old ones. So, if one were to repeat the analysis of the game designer in the present day, would one come to different conclusions regarding how points should be assigned to various letters?

I’ve decided to add my own analysis to the recent development because I have found most of the other blog posts to be unsatisfactory for a variety of reasons.* One article calculated letters’ relative frequencies by counting the number of times each letter appeared in each word in the Scrabble™ dictionary. But this analysis is faulty, since it ignores the probability with which different words actually appear in the game. One is far less likely to draw QI than AE during a Scrabble™ game (since there’s only one Q in the bag, but many A's and E's). Similarly, very long words like ZOOGEOGRAPHICAL have a vanishingly small probability of appearing in the game: the A’s in the long words and the A’s in the short words cannot be treated equally. A second article I saw calculated letter frequencies based on their occurrence in the Scrabble™ dictionary and did attempt to weight frequencies based on word length. The author of this second article also claimed to have quantified the extent to which a letter could “fit well” with the other tiles given to a player. Unfortunately, some of the steps in the analysis of this second article were only vaguely explained, so it isn’t clear how one could replicate the article’s conclusions. In addition, as far as I can tell, neither of these articles explicitly included the distribution of letters (how many A’s, how many B’s, etc) included in a Scrabble™ game. Also, neither of these articles accounted for the fact that there are blank tiles (that act as wild cards and can stand in for any letter) that appear in the game.

So, what does one need to do to improve upon the analyses already performed? We’re given the Scrabble™ dictionary and bag of 100 tiles with a set distribution, and we’re going to try to determine what a good pointing system would be for each letter in the alphabet. We’re also armed with the knowledge that each player is given 7 letters at a time in the game, making words longer than 8 letters very rare indeed. Let’s say for the sake of simplicity that words 9 letters long or shorter account for the vast majority of words that are possible to play in a normal game.

Based on these constraints, how can one best decide what points to assign the different tiles? As stated above, the game is designed to reward players for playing words that include letters that are more difficult to use. So, what makes an easy letter easy, and what makes a difficult letter difficult? Sure, the number of times the letter appears in the dictionary is important, but this does not account for whether or not, on a given rack of tiles (a rack of tiles is to Scrabble™ as a hand of cards is to poker), that letter actually can be used. The letter needs to combine with other tiles available either on the rack or on the board in order to form words. The letter Q is difficult to play not only because it is used relatively few times in the dictionary, but also because the majority of Q-words require the player to use the letter U in conjunction with it.

So, what criterion can one use to say how useful a particular tile is? Let’s say that letters that are useful have more potential to be used in the game: they provide more options for the players who draw them. Given a rack of tiles, one can generate a list of all of the words that are possible for the player to play. Then, one can count the number of times that each letter appears in that list. Useful letters, by this criterion, will combine more readily with other letters to form words and so appear more often in the list than un-useful letters.

(I would also like to take a moment to preempt criticism from the competitive Scrabble™ community by saying that strategic decisions made by the players need not be brought into consideration here. The point values of tiles are an engineering constraint of the game. Strategic decisions are made by the players, given the engineering constraints of the game. Words that are “available to be played” are different from “words that actually do get played.” The potential usefulness of individual letter tiles should reflect whether or not it is even possible to play them, not whether or not a player decides that using a particular group of tiles constitutes an optimal move.)

To give an example, suppose I draw the rack BEHIWXY. I can generate** the full list of words available to be played given this rack: BE, BEY, BI, BY, BYE, EH, EX, HE, HEW, HEX, HEY, HI, HIE, IBEX, WE, WEB, WHEY, WHY, WYE, XI, YE, YEH, YEW. Counting the number of occurrences of each letter, I see that the letter E appears 18 times, while the letter W only appears 7 times. This example tells me that the letter E is probably much more potentially useful than the letter W.
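The counting step for this one rack is easy to reproduce. The snippet below is a small sketch that takes the word list above as given (it does not generate the list from a dictionary) and tallies how often each letter appears.

```python
from collections import Counter

# Words playable from the rack BEHIWXY, copied from the list above.
words = (
    "BE BEY BI BY BYE EH EX HE HEW HEX HEY HI HIE IBEX "
    "WE WEB WHEY WHY WYE XI YE YEH YEW"
).split()

# Count every letter across every playable word.
letter_counts = Counter("".join(words))

for letter, count in letter_counts.most_common():
    print(letter, count)
```

The tally gives 18 for E and 7 for W, matching the counts quoted above.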

The example above is only one of the many, many possible racks that one can see in a game of Scrabble™. I can use a Monte Carlo-type simulation to estimate the average usefulness of the different letters by drawing many example racks. Monte Carlo is a technique used to estimate numerical properties of complicated things without explicit calculation. For example, suppose I want to know the probability of drawing a straight flush in poker.*** I can calculate that probability explicitly by using combinatorics, or I can use a Monte Carlo method to deal a large number of hypothetical possible poker hands and count the number of straight flushes that appear. If I deal a large enough number of hands, the fraction of hands that are straight flushes will converge upon the correct analytic value. Similarly here, instead of explicitly calculating the usefulness of each letter, I use Monte Carlo to draw a large number of hypothetical racks and use them to count the number of times each letter can be used. Comparing the number of times that each tile is used over many, many possible racks will give a good approximation of how relatively useful each tile is on average. Note that this process accounts for the words acceptable in the Scrabble™ dictionary, the number of available tiles in the bag, as well as the probability of any given word appearing.

In my simulation, I draw 10,000,000 racks, each with 9 tiles (representing the 7 letters the player actually draws plus two tiles available to be played through to form longer words). I perform the calculation two different ways: once with a 98-tile pool with no blanks, and once with a 100-tile pool that does include blanks. In the latter case, I make sure to not count the blanks used to stand in for different letters as instances of those letters appearing in the game. The results are summarized in the table below.
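The overall shape of such a simulation can be sketched in a few lines. This is not the code used for the results here: the word list is a tiny stand-in made of words mentioned in this post rather than the Scrabble™ dictionary, the function names are my own, the number of racks is far smaller, and only the 98-tile pool without blanks is modeled.

```python
import random
from collections import Counter

# Standard English tile distribution without the two blanks (98 tiles).
TILE_COUNTS = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2, "I": 9,
    "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2, "Q": 1, "R": 6,
    "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1, "Y": 2, "Z": 1,
}

# Stand-in word list (an assumption for illustration only).
WORDS = ["QI", "AE", "BE", "HEX", "WHEY", "IBEX", "EMAIL"]


def can_play(word, rack_counts):
    """True if the rack holds enough copies of every letter in the word."""
    return all(rack_counts[ch] >= n for ch, n in Counter(word).items())


def letter_usefulness(n_racks, rack_size=9, seed=0):
    """Draw random racks and count how often each letter appears in playable words."""
    rng = random.Random(seed)
    bag = [ch for ch, n in TILE_COUNTS.items() for _ in range(n)]
    usage = Counter()
    for _ in range(n_racks):
        rack = Counter(rng.sample(bag, rack_size))  # draw without replacement
        for word in WORDS:
            if can_play(word, rack):
                usage.update(word)
    return usage


usage = letter_usefulness(2000)
print(usage.most_common())
```

With a real dictionary in place of WORDS and many more racks, the relative sizes of the counts in usage are the "usefulness" scores described above. Handling blanks would need an extra rule so that a blank standing in for a letter is not counted as that letter.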

There are two key observations to be made here. First, it does not seem to matter whether or not there are blanks in the bag! The results are very similar in both cases. Second, it would be completely reasonable to keep the tile point values as they are. Only the Z, H, and U appear out of order. It’s only if one looks very carefully at the differences between the usefulness of these different tiles that one might reasonably justify re-pointing the different letters.

For fun, I have included in the table my own suggestions for what these tiles’ values might be changed to based on the simulation results. (Note: here's where any pretensions of scientific rigor go out the window.) I have kept the scale of points between 1 and 10, as in the current pointing system. I have assigned groups of letters the same number of points based on whether they have a similar usefulness score.

Here are the significant changes: L and U, which are significantly less useful than the other 1-point tiles, may be bumped up to 2 points, comparable to the D and G. The letter V is clearly less useful than any of the other three 4-point tiles (W, Y, and F, all of which may be used to form 2-letter words while the V forms no 2-letter words), and so is undervalued. The H is comparable to the 3-point tiles, and so is currently overvalued. Similarly, the Z is overvalued when one considers how close to the J it is. Unlike in the previous two articles that I mentioned, I don't find any strong reason to change the value of the letter X compared to the other 8-point tiles. I suppose one could lower its value from 8 points to 7, but I have (somewhat arbitrarily) chosen not to do so.

One may also ask the question whether or not the fact that a letter forms 2- or 3-letter words is unfairly biasing that letter. In particular, is the low usefulness of the C and V compared to comparably-pointed tiles due to the fact that they form no 2-letter words? Performing the simulation again without 2-letter words, I found no changes in the results in any of the letters except for C, which increased in usefulness above the B and the H. The letter V's ranking, however, did not change at all, indicating that unlike the C the V is difficult to use even when combining with letters to make longer words. Repeating the simulation yet again without 2- or 3-letter words yielded the same results.

As a final note, I would like to respond directly to Stefan Fatsis's excellent article about the so-called controversy surrounding re-calculating tile values and say that I am fully aware that this is indeed a "statistical exercise," motivated mostly by my desire to do the calculation made by others in a way that made sense in the context of the game of Scrabble. Similarly, I realize that these recommendations are unlikely to actually change anything. Given that the original points values of the tiles are still justifiably appropriate by my analysis, it's not like anybody at Hasbro is going to jump to "fix" the game. Lastly, my calculations have nothing to do with the strategy of the game whatsoever, and cannot be used to learn how to play the game any better. (If anything, I've only confirmed some things that many experienced Scrabble players already know about the game, such as that the V is a tricky tile, or that the H, X, and Z tiles, in spite of their high point values, are quite flexible.)

* To state my own credentials, I have played Scrabble™ competitively for 4 years, and am quite familiar with the mechanics of the game, as well as contemporary strategy.

** Credit where credit is due: Alemi provided the code used to generate the list of available words given any set of tiles. Thanks Alemi!

*** Monte Carlo has a long history of being used to estimate the properties of games. As recounted by George Dyson in Turing’s Cathedral, in 1948 while at Los Alamos the mathematician Stanislaw Ulam suffered a severe bout of encephalitis that resulted in an emergency trepanation. While recovering in the hospital, he played many games of solitaire and was intrigued by the question of how to calculate the probability that a given deal could result in a winnable game. The combinatorics required to answer this question proved staggeringly complex, so Ulam proposed the idea of generating many possible solitaire deals and merely counting how many of them resulted in victory. This proved to be much simpler than an explicit calculation, and the rest is history: Monte Carlo is used today in a wide variety of applications.

Additional References:

The photo at top of a Scrabble™ board was taken during the 2012 National Scrabble™ Championship. Check out the 9-letter double-blank BINOCULAR.

For anyone interested in learning more about the fascinating world of competitive Scrabble™, check out Word Freak, also by Stefan Fatsis. This book has become more or less the definitive documentation upon this subculture. If you don't have enough time to read, check out Word Wars, a documentary that follows many of the same people as Fatsis's book. (It still may be available streaming on Netflix if you hurry.)

The Skeleton Supporting Search Engine Ranking Systems

2013-01-01T19:09:00.000-05:00

A lot of the research I’m interested in relates to networks – measuring the properties of networks and figuring out what those properties mean. While doing some background reading, I stumbled upon some discussion of the algorithm that search engines use to rank search results. The automatic ranking of the results that come up when you search for something online is a great example of how understanding networks (in this case, the World Wide Web) can be used to turn a very complicated problem into something simple.

Ranking search results relies on the assumption that there is some underlying pattern to how information is organized on the WWW: there are a few core websites containing the bulk of the sought-after information, surrounded by a group of peripheral websites that reference the core. Recognizing that the WWW is a network representation of how information is organized and using the properties of the network to detect where that information is centered are the key components to figuring out what websites belong at the top of the search page.

Suppose you look something up on Google (looking for YouTube videos of your favorite band, looking for edifying science writing, tips on octopus pet care, etc): the search service returns a whole spate of results. Usually, the pages that Google recommends first end up being the most useful. How on earth does the search engine get it right?

First I’ll tell you exactly how Google does not work. When you type something into the search bar and hit enter, a message is not sent to a guy who works for Google about your query. That guy does not then look up all of the websites matching your search, does not visit each website to figure out which ones are most relevant to you, and does not rank the pages accordingly before sending a ranked list back to you. That would be a very silly way to make a search engine work! It relies on an individual human ranking the search results by hand with each search that’s made. Maybe we can get around having to hire thousands of people by finding a clever way to automate this process.

So here’s how a search engine does work. Search engines use robots that crawl around the World Wide Web (sometimes these robots are referred to as “spiders”) finding websites, cataloguing key words that appear on those webpages, and keeping track of all the other sites that link into or away from them. The search engine then stores all of these websites and lists of their keywords and neighbors in a big database.

Knowing which websites contain which keywords allows a search engine to return a list of websites matching a particular search. But simply knowing which websites contain which keywords is not enough to know how to order the websites according to their relevance or importance. Suppose I type “octopus pet care” into Google. The search yields 413,000 results, far too many for me to comb through at random looking for the web pages that best describe what I’m interested in.

Knowing the ways that different websites connect to one another through hyperlinks is the key to how search engine rankings work. Thinking of a collection of websites as an ordinary list doesn’t say anything about how those websites relate to one another. It is more useful to think of the collection of websites as a network, where each website is a node and each hyperlink between two pages is a directed edge in the network. In a way, these networks are maps that can show us how to get from one website to another by clicking through links.

Here is an example of what a network visualization of a website map of a large portion of the WWW looks like. (Original full-size image here.)

Here is a site map for a group of websites that connect to the main page of English Wikipedia. (Original image from here.) This smaller site map is closer to the type of site map used when making a search using a search engine.

So, how does knowing the underlying network of the search results help one to find the best website on octopus care (or any other topic)? The search engine assumes that behind the seemingly random, hodgepodge collection of files on the WWW, there is some organization in the way they connect to one another. Specifically, the search engine assumes that finding the websites most central to the network of search results is the same as finding the search results with the best information. Think of a well-known, trusted source of information, like the New York Times. The NY Times website will have many other websites referencing it by linking to it. In addition, the NY Times website, being a trusted news source, is likely to refer to the best references for other sources that it wants to refer to, such as Reuters. High-quality references will also probably have many incoming links from websites that cite them. So not only does a website like the NY Times sit at the center of many other websites that link to it, but it also frequently connects to other websites that themselves are at the center of many other websites. It is these most central websites that are probably the best ones to look at when searching for information.

When I search for “octopus pet care” using Google I am necessarily assuming that the search results are organized according to this core-periphery structure, with a group of important core websites central to the network surrounded by many less important peripheral websites that link to the core nodes. The core websites may also connect to one another. There may also be websites disconnected from the rest, but these will probably be less important to the search simply because of the disconnection. Armed with the knowledge of the connections between the different relevant websites and the core-periphery network structure assumption, we may now actually find which of the websites are most central to the network (in the core), and therefore determine which websites to rank highly.

Let’s begin by assigning a quantitative “centrality” score to each of the nodes (websites) in the network, initially guessing that all of the search results are equally important. (This, of course, is probably not true. It’s just an initial guess.) Each node then transfers all* of its centrality score to the neighbors it links to, dividing it evenly between them. (Starting with a centrality score of 1 with three neighbors, each of those neighbors receives 1/3.) Each node also receives some centrality from each neighbor that links in to it. Following this first step, we find that nodes with many incoming edges will have higher centrality than nodes with few incoming edges. We can repeat this process of dividing and transferring centrality again. Nodes with many incoming links will have more centrality to share with their neighbors, and nodes with many incoming links will themselves also receive more centrality.

After repeating this process many times, we begin to see a difference between which nodes have the highest centrality scores: nodes with high centrality are the ones that have many incoming links, or have links to other central nodes, or both. This algorithm therefore differentiates between the periphery and the core of the network. Core nodes receive lots of centrality because they link to one another and because they have lots of incoming links from the periphery. Peripheral nodes have fewer incoming links and so receive less centrality than the nodes in the core. Knowing the centrality scores of search results on the WWW makes it pretty straightforward for us to quantitatively rank which of those websites belongs at the top of the list.
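For readers who like to see this sort of thing spelled out, here is a minimal sketch of the divide-and-transfer procedure described above. The toy network, the page names, and the function name `spread_centrality` are all made up for illustration, and the rule that a page with no outgoing links simply keeps its score is my own assumption (the description above doesn't say what to do in that case). This is the bare-bones version only, not PageRank or HITS.

```python
def spread_centrality(links, steps=50):
    """Repeatedly split each node's score evenly among the nodes it links to.

    `links` maps each node to the list of nodes it links to.
    """
    score = {node: 1.0 for node in links}      # initial guess: all pages equal
    for _ in range(steps):
        new_score = {node: 0.0 for node in links}
        for node, targets in links.items():
            if not targets:
                # Assumption: a page with no outgoing links keeps its score.
                new_score[node] += score[node]
                continue
            share = score[node] / len(targets)
            for target in targets:
                new_score[target] += share
        score = new_score
    return score


# Hypothetical toy network: two core pages that link to each other,
# plus three peripheral pages that link in to the core.
toy_links = {
    "core_a": ["core_b"],
    "core_b": ["core_a"],
    "blog_1": ["core_a"],
    "blog_2": ["core_a", "core_b"],
    "blog_3": ["core_b"],
}

scores = spread_centrality(toy_links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
print(scores)
```

On this toy network the two core pages end up holding all of the centrality, and the peripheral pages (which nobody links to) end up with none. The total amount of centrality is conserved at every step. One caveat: this plain version can slosh back and forth forever on some networks instead of settling down, and variations like the one in the footnote below can help with that.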

Of course, there are more sophisticated ways to extend and improve this procedure. Google’s algorithm PageRank (named for founder Larry Page, not because it is used to rank web pages) and the HITS algorithm developed at Cornell are two examples of more advanced ways of ranking search engine results. We can go even further: a search engine can keep track of the links that users follow whenever a particular search is made. (This is almost the same as the company hiring someone to order sought-after web pages automatically whenever a search is made, except the company lets the users do it for free.) Over time, search engines can improve their methods for helping us find what we need by learning directly from the way users themselves prioritize which search results they pursue. These different ranking systems may operate using slightly different methods, but all of them depend on understanding the list of search results within the context of a network.

* It's not always all - there are other variations where nodes only transfer a fraction of their centrality score at each step.

Sources (and further reading):
I wanted to include no mathematics in this post simply because I cannot explain the mathematics behind these algorithms and their convergence properties better than my sources can. For those of you who want to see the mathematical side of the argument for yourselves (which involves treating the network adjacency matrix as a Markov process and finding its nontrivial steady state eigenvector), do consult the following two textbooks:

• Easley, David, and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning about a Highly Connected World. Cambridge University Press, 2010 (Chapter 14 in particular)
• Newman, Mark. Networks: an Introduction. Oxford University Press, 2010 (Chapter 7 in particular)

A popular book on the early development of network science that contains a lot of information on the structure of the WWW:

• Barabasi, Albert-Laszlo. Linked: How Everything is Connected to Everything Else and What It Means. Plume, 2003.

A book on the history of modern computing that contains an interesting passage on how search engines learn adaptively from their users (that deserves a shout-out in this blog post).

• Dyson, George. Turing's Cathedral. Pantheon, 2012.

When will the Earth fall into the Sun?

2012-11-29T22:58:00.000-05:00

The time I spent making this poster could have been spent doing research.

Since December 2012 is coming up, I thought I'd help the Mayans out with a look at a possible end of the world scenario. (I know, it's not Earth Day yet, but we at the Virtuosi can only go so long without fantasizing about wanton destruction.) As the Earth zips around the Sun, it moves through the heliosphere, which is a collection of charged particles emitted by the Sun. Like any other fluid, this will exert drag on the Earth, slowly causing it to spiral into the Sun. Eventually, it will burn in a blaze of glory, in a bad-action-movie combination of Sunshine meets Armageddon.

Before I get started, let me preface this by saying that I have no idea what the hell I'm talking about. But, in the spirit of being an arrogant physicist, I'm going to go ahead and make some back-of-the-envelope calculations, and expect that this post will be accurate to within a few orders of magnitude.

Well, how long will the Earth orbit the Sun before drag from the heliosphere stops it? This seems like a problem for fluid dynamics. How do we calculate what the drag is on the Earth? Rather than solve the fluid dynamics equations, let's make some arguments based on dimensional analysis.

What can the drag on the Earth depend on? It certainly depends on the speed of the Earth v -- if an object isn't moving, there can't be any drag. We also expect that a wider object feels more drag, so the drag force should depend on the radius of the Earth R. Finally, the density of the heliosphere rho might have something to do with it. If we fudge around with these, we see that there is only 1 combination that gives units of force:
\[
F_{drag} \sim \rho v^2 R^2
\]

Now that we have the force, the energy dissipated from the Earth to the heliosphere after moving a distance d is E_lost = F*d. If the Earth moves with speed v for time t, then we can write E_lost = F*v*t. So we can get an idea of the time scale over which the Earth starts to fall into the Sun by taking E_lost = E_Earth ~ 1/2 M_Earth v^2. Rearranging and dropping factors of 1/2 gives
\[
T_\textrm{Earth burns} \sim M_{Earth} v^2 / (F_{drag}\times v) \\ \qquad \sim M_{Earth} / (\rho R^2 v)
\]
Using the velocity of the Earth as 2*pi * 1 Astronomical unit/year, Googlin' for some numbers, and taking the density of the heliosphere to be 10^-23 g/cc we get... \[ T \approx 10^{19} \textrm{ years} \] Looks like this won't be the cause of the Mayan apocalypse. (By comparison, the Sun will burn out after only ~10^9 years.)
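If you'd rather not Google for the numbers yourself, here is a quick sketch of the arithmetic. The heliosphere density is the 10^-23 g/cc used above; the mass and radius of the Earth, the astronomical unit, and the length of a year are rough textbook values that I've plugged in as assumptions, and the variable names are just for illustration.

```python
import math

# Rough inputs in CGS units. Only RHO_HELIO comes from the text above;
# the other values are approximate textbook numbers (assumptions).
M_EARTH = 5.97e27     # mass of the Earth, g
R_EARTH = 6.37e8      # radius of the Earth, cm
AU = 1.496e13         # astronomical unit, cm
YEAR = 3.156e7        # one year, s
RHO_HELIO = 1e-23     # density of the heliosphere, g/cm^3

v = 2 * math.pi * AU / YEAR                 # orbital speed, cm/s
f_drag = RHO_HELIO * v**2 * R_EARTH**2      # F_drag ~ rho v^2 R^2, in dyn
t_seconds = M_EARTH * v**2 / (f_drag * v)   # T ~ M v^2 / (F_drag * v)
t_years = t_seconds / YEAR

print(f"v ~ {v:.2e} cm/s")
print(f"F_drag ~ {f_drag:.2e} dyn")
print(f"T ~ {t_years:.1e} years")
```

Since this is a dimensional-analysis estimate with all the factors of order 1 dropped, only the power of ten in the answer means anything.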

Creating an Earth

2012-10-27T19:07:00.000-04:00

A while ago I decided I wanted to create something that looks like the surface of a planet, complete with continents & oceans and all. Since I've only been on a small handful of planets, I decided that I'd approximate this by creating something like the Earth on the computer (without cheating and just copying the real Earth).

Where should I start? Well, let's see what the facts we know about the Earth tell us about how to create a new planet on the computer.

Observation 1: Looking at a map of the Earth, we only see the heights of the surface.

So let's describe just the heights of the Earth's surface.

Observation 2 : The Earth is a sphere.

So (wait for it) we need to describe the height on a spherical surface.

Now we can recast our problem of making an Earth more precisely, in mathematical terms. We want to know the height of the planet's surface at each point on the planet. So we're looking for a field (the height of the planet) defined on the surface of a sphere (the different spots on the planet).

Just like a function on the real line can be expanded in terms of its Fourier components, almost any function on the surface of a sphere can be expanded as a sum of spherical harmonics Ylm. This means we can write the height h of our planet's surface as

\[
h(\theta, \phi) = \sum A_{lm}Y_l^m(\theta, \phi) \quad (1)
\]
If we figure out what the coefficients A of the sum should be, then we can start making some Earths! Let's see if we can use some other facts about the Earth's surface to get a handle on what coefficients to use.

Observation 3: I don't know every detail of the Earth's surface, whose history is impossibly complicated.

I'll capture this lack-of-knowledge by describing the surface of our imaginary planet as some sort of random variable. Equation (1) suggests that we can do this by making the coefficients A random variables. At some point we need to make an executive decision on what type of random variable we'll use. For various reasons [1], I decided I'd use a Gaussian random variable with mean 0 and standard deviation a_lm:
\[
A_{lm} = a_{lm} N(0,1)
\]
(Here I'm using the notation that N(m,v) is a normal or Gaussian random variable with mean m and variance v. If you multiply a Gaussian random variable by a constant a, it's the same as multiplying the variance by a^2, so a*N(0,1) and N(0,a^2) are the same thing.)

Observation 4: The heights of the surface of the Earth are more-or-less independent of their position on the Earth.

In keeping with this, I'll try to use coefficients a_lm that will give me a random field that is isotropic on average. This seems hard at first, so let's just make a hand-waving argument. Looking at some pretty pictures of spherical harmonics, we can see that each spherical harmonic of degree l has about l stripes on it, independent of m. So let's try using a_lm's that depend only on l, and are constant if just m changes [2]. Just for convenience, we'll pick this constant to be l to some power -p:
\[
a_{lm} = l^{-p} \quad \textrm{ or}
\]\[
h(\theta, \phi) = \sum_{l,m} N_{lm}(0,1) l^{-p} Y_l^m(\theta, \phi) \quad (2)
\]

At this point I got bored & decided to see what a planet would look like if we didn't know what value of p to use. So below is a movie of a randomly generated "planet" with a fixed choice of random numbers, but with the power p changing.

As the movie starts (p=0), we see random uncorrelated heights on the surface [3]. As the movie continues and p increases, we see the surface smooth out rapidly. Eventually, after p=2 or so, the planet becomes very smooth and doesn't look at all like a planet. So the "correct" value for p is somewhere above 0 (too bumpy) and below 2 (too smooth). Can we use more observations about Earth to predict what a good value of p should be?

Observation 5: The elevation of the Earth's surface exists everywhere on Earth (duh).

So we're going to need our sum to exist. How the hell are we going to sum that series though! Not only is it random, but it also depends on where we are on the planet!

Rather than try to evaluate that sum everywhere on the sphere, I decided that it would be easiest to evaluate the sum at the "North Pole" at theta=0. Then, if we picked our coefficients right, this should be statistically the same as any other point on the planet.

Why do we want to look at theta = 0? Well, if we look back at the wikipedia entry for spherical harmonics, we see that
\[
Y_l^m = \sqrt{ \frac{2 l + 1}{4\pi}\frac{(l-m)!}{(l+m)!}} e^{im\phi}P^m _ l(\cos\theta) \quad (3)
\]

That doesn't look too helpful -- we've just picked up another special function Plm that we need to worry about. But there is a trick with these special functions Plm: at theta = 0, Plm is 0 if m isn't 0, and Pl0 is 1. So at theta = 0 this is simply:
\[
Y_l^m(\theta = 0) = \begin{cases} \sqrt{(2l+1)/4\pi}, & m = 0 \\ 0, & m \ne 0 \end{cases}
\]

Now we just have, from every equation we've written down:
\[
h(\theta = 0) = \sum_l l^{-p} \sqrt{(2l+1)/4\pi} \, N_{l0}(0,1) \]\[
\quad \qquad = \frac 1 {\sqrt{4\pi}} \sum_l N(0,l^{-2p}(2l+1)) \]\[
\quad \qquad = \frac 1 {\sqrt{4\pi}} N\Big(0,\sum_l l^{-2p}(2l+1) \Big) \]\[
\quad \qquad = \frac 1 {\sqrt{4\pi}} \sqrt{\sum_l l^{-2p}(2l+1)} \, N(0,1) \]\[
\quad \qquad \sim \sqrt{\sum_l l^{-2p+1}} \, N(0,1) \qquad (4)
\]

So for the surface of our imaginary planet to exist, we had better have that sum converge, or -2p+1 < -1 (p > 1).

And we've also learned something else!!! Our model always gives back a Gaussian height distribution on the surface. Changing the coefficients changes the variance of the distribution of heights, but that's all it does to the distribution. Evidently if we want to get a non-Gaussian distribution of heights, we'd need to stretch our surface after evaluating the sum.

Well, what does the height distribution look like from my simulated planets? Just for the hell of it, I went ahead and generated ~400 independent surfaces at ~40 different values for the exponent p, looking at the first 22,499 terms in the series. From these surfaces I reconstructed the measured distributions; I've combined them into a movie which you can see below.

As you can see from the movie, the distributions look like Gaussians. The fits from Eq. (4) are overlaid in black dotted lines. (Since I can't sum an infinite number of spherical harmonics with a computer, I've plotted the fit I'd expect from just the terms I've summed. ) As you can see, they are all close to Gaussians. Not bad. Let's see what else we can get.
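You can check the variance in Eq. (4) without generating a whole planet. Here is a rough Monte Carlo sketch that only evaluates the sum at the "North Pole," where just the m=0 terms survive. It is not the code behind the movies: the function names are made up, the choice p=1.3 and the number of samples are arbitrary, and I'm assuming the truncated series runs over l = 1 to 149 (which is what gives 22,499 terms in total).

```python
import math
import random

L_MAX = 149   # assumption: l = 1..149, i.e. 22,499 (l, m) terms in total


def north_pole_height(p, rng, l_max=L_MAX):
    """One random draw of h(theta=0); only the m=0 harmonics contribute."""
    return sum(
        rng.gauss(0.0, 1.0) * l ** (-p) * math.sqrt((2 * l + 1) / (4 * math.pi))
        for l in range(1, l_max + 1)
    )


def predicted_variance(p, l_max=L_MAX):
    """Variance from Eq. (4), keeping the 1/(4 pi) and (2l+1) factors."""
    return sum(l ** (-2 * p) * (2 * l + 1) for l in range(1, l_max + 1)) / (4 * math.pi)


rng = random.Random(0)   # fixed seed so the run is repeatable
p = 1.3
samples = [north_pole_height(p, rng) for _ in range(4000)]

mean = sum(samples) / len(samples)
measured = sum((h - mean) ** 2 for h in samples) / len(samples)
predicted = predicted_variance(p)

print(f"measured variance  ~ {measured:.3f}")
print(f"predicted variance ~ {predicted:.3f}")
```

With a few thousand samples the two numbers should agree to within a few percent, which is about all the sampling noise allows.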

Observation 6: According to some famous people, the Earth's surface is probably a fractal whose coastlines are non-differentiable.

This means that we want a value of p that will make our surface rough enough so that its gradient doesn't exist (the derivative of the sum of Eq. (2) doesn't converge). At this point I'm getting bored with writing out calculations, so I'm just going to make some scaling arguments.

From Eq. (3), we know that each of the spherical harmonics Ylm is related to a polynomial of degree l in cos(theta). So if we take a derivative, I'd expect us to pick up another factor of l each time. Following through all the steps of Eq. (4) we find
\[
\vec{\nabla}h \sim \sqrt{\sum_l l^{-2p+3}}\vec{N}(0,1) \quad ,
\]
which converges for p > 2. So for our planet to be "fractal," we want 1<p<2. Looking at the first movie, this seems reasonable.
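The two convergence conditions are easy to poke at numerically. Below is a small sketch (the function names are mine, purely for illustration) that compares partial sums of the height series, sum of l^(-2p+1), and of the gradient series, sum of l^(-2p+3), at two different cutoffs. A sum that converges barely changes when the cutoff goes up; one that diverges keeps growing.

```python
def partial_sum(exponent, l_max):
    """Sum of l**exponent for l = 1 .. l_max."""
    return sum(l ** exponent for l in range(1, l_max + 1))


def height_sum(p, l_max):
    """Partial sum controlling the variance of the height, Eq. (4)."""
    return partial_sum(-2 * p + 1, l_max)


def gradient_sum(p, l_max):
    """Partial sum controlling the variance of the gradient."""
    return partial_sum(-2 * p + 3, l_max)


for p in (0.5, 1.5, 2.5):
    for l_max in (100, 10_000):
        print(
            f"p={p}, l_max={l_max}: "
            f"height sum ~ {height_sum(p, l_max):.4g}, "
            f"gradient sum ~ {gradient_sum(p, l_max):.4g}"
        )
```

For p=1.5, right in the middle of the "fractal" window, the height sum settles down while the gradient sum just counts terms and grows with the cutoff. For p=0.5 both blow up, and for p=2.5 both settle.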

Observation 7: 70% of the Earth's surface is under water.

On Earth, we can think of the points underwater as all those points below a certain threshold height. So let's threshold the heights on our sphere. If we want 70% of our generated planet's surface to be under water, Eq. (4) and the cumulative distribution function of a Gaussian distribution tell us that we want to pick a critical height H such that
\[
\frac 1 2 \left[ 1 + \textrm{erf}(H/\sqrt{2\sigma^2}) \right] = 0.7 \quad \textrm{or}
\] \[
H = \sqrt{2\sigma^2}\textrm{erf}^{-1}(0.4)
\] \[
\sigma^2 = \frac 1 {4\pi} \sum_l l^{-2p}(2l+1) \quad (5)\, ,
\]
where erf() is a special function called the error function, and erf^{-1} is its inverse. We can evaluate these numerically (or by using some dirty tricks if we're feeling especially masochistic).
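Evaluating Eq. (5) numerically takes only a few lines. The sketch below uses Python's standard-library `statistics.NormalDist`, whose `inv_cdf` is the inverse of the Gaussian cumulative distribution function, so that H = sigma * inv_cdf(0.7) is the same thing as sqrt(2 sigma^2) erf^{-1}(0.4). The function names and the cutoff l = 1 to 149 are my assumptions for illustration.

```python
import math
from statistics import NormalDist


def height_variance(p, l_max=149):
    """sigma^2 from Eq. (5), truncated at l_max (assumed cutoff)."""
    return sum(l ** (-2 * p) * (2 * l + 1) for l in range(1, l_max + 1)) / (4 * math.pi)


def sea_level(p, ocean_fraction=0.7, l_max=149):
    """Critical height H below which `ocean_fraction` of the surface lies."""
    sigma = math.sqrt(height_variance(p, l_max))
    return sigma * NormalDist().inv_cdf(ocean_fraction)


for p in (0.0, 1.0, 1.3, 2.0):
    sigma2 = height_variance(p)
    H = sea_level(p)
    # Sanity check: plugging H back into the erf form of Eq. (5) should give 0.4.
    check = math.erf(H / math.sqrt(2 * sigma2))
    print(f"p={p}: sigma^2 ~ {sigma2:.4g}, H ~ {H:.4g}, erf(H/sqrt(2 sigma^2)) ~ {check:.3f}")
```

Since H is just a fixed multiple of sigma, the sea level always sits about half a standard deviation above the mean height, whatever p is.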

So for our generated planet, let's call all the points with a height larger than H "land," and all the points with a height less than H "ocean." Here is what it looks like for a planet with p=0, p=1, and p=2, plotted with the same Sanson projection as before.

Top to bottom: p=0, p=1, and p=2. I've colored all the "water" (positions with heights < H as given in Eq. (5) ) blue and all the land (heights > H) green.

You can see that the total amount of land area is roughly constant among the three images, but we haven't fixed how it's distributed. Looking at the map above for p=0, there are lots of small "islands" but no large contiguous land masses. For p=2, we see only one contiguous land mass (plus one 5-pixel island), and p=1 sits somewhere in between the two extremes. None of these look like the Earth, where there are a few large landmasses but many small islands. From our previous arguments, we'd expect something between p=1 and p=2 to look like the Earth, which is in line with the above picture. But how do we decide which value of p to use?

Observation 8: The Earth has 7 continents

This one is more vague than the others, but I think it's the coolest of all the arguments.

How do we compare our generated planets to the Earth? The Earth has 7 continents that comprise 4 different contiguous landmasses. In order, these are 1) Europe-Asia-Africa, 2) North and South America, 3) Antarctica, and 4) Australia, with a 5th, Greenland, barely missing out. In terms of fractions of the Earth's surface, Google tells us that Australia covers 0.15% of the Earth's total surface area, and Greenland covers 0.04%. So let's define a "continent" as any contiguous landmass that accounts for more than 0.1% of the planet's total area. Then we can ask: What value of p gives us a planet with 4 continents?

I have no idea how to calculate exactly what that number would be from our model, but I can certainly measure it from the simulated results. I went ahead and counted the number of continents in the generated planets.
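Counting continents boils down to finding connected blobs of land and keeping the ones that are big enough. Here is one way it could be done, as a sketch rather than a record of the actual code: a flood fill on a rectangular grid of land/water pixels. The function name and the toy map are made up, neighbors are the 4 adjacent pixels, the grid wraps around in longitude, and every pixel is treated as having the same area (which is not true for a latitude-longitude grid on a real sphere).

```python
from collections import deque


def count_continents(land, min_fraction=0.001):
    """Return (number of continents, number of landmasses) on a boolean grid.

    A landmass is a 4-connected region of land pixels; a continent is a
    landmass covering more than `min_fraction` of all pixels.
    """
    rows, cols = len(land), len(land[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not land[r0][c0] or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, size = deque([(r0, c0)]), 0
            while queue:
                r, c = queue.popleft()
                size += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, (c + dc) % cols   # longitude wraps around
                    if 0 <= rr < rows and land[rr][cc] and not seen[rr][cc]:
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            sizes.append(size)
    total = rows * cols
    continents = sum(1 for s in sizes if s > min_fraction * total)
    return continents, len(sizes)


# Hypothetical toy map: '#' is land, '.' is water.
toy_map = [
    "#....#",
    "#..#.#",
    "......",
    "..##..",
]
land = [[ch == "#" for ch in row] for row in toy_map]
print(count_continents(land, min_fraction=0.05))
```

On the toy map the land on the left and right edges joins up through the wraparound, so there are three landmasses in total, and how many count as continents depends on the size threshold.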

The results are plotted above. The solid red line is the median values of the number of continents, as measured over 400 distinct worlds at 40 different values of p. The red shaded region around it is the band containing the upper and lower quartiles of the number of continents. For comparison, in black on the right y-axis I've also plotted the log of the total number of landmasses at the resolution I've used.

The number of continents has a resonant value of p -- if p is too small, then there are many landmasses, but none are big enough to be continents. Conversely, if p is too large, then there is only one huge landmass. Somewhere in the middle, around p=0.5, there are about 20 continents, at least when only the first ~23000 terms in the series are summed.

Looking at the curve, we see that there are roughly two places where there are 4 continents in the world -- at p=0.1 and at p=1.3. Since the sum doesn't converge at p=0.1, and since p=0.1 will have way too many landmasses, it looks like a generated Earth will look the best if we use a value of p=1.3.

And that's it. For your viewing pleasure, here is a video of three of these planets below, complete with water, continents, and mountains. [4]

[1] Since I wanted a random surface, I wanted to make the mean of each coefficient 0. Otherwise we'd get a deterministic part of our surface heights. I picked a distribution that's symmetric about 0 because on Earth the bottom of the oceans seems roughly similar to the land in terms of changes in elevation. I wanted to pick a stable distribution & independent coefficients because it makes the sums that come up easier to evaluate. Finally, I picked a Gaussian, as opposed to another stable distribution like a Lorentzian, because the tallest points on Earth are finite, and I wanted the variance of the planet's height to be defined.

[2] We could make this rigorous by showing that a rotated spherical harmonic is orthogonal to other spherical harmonics of a different degree l, but you don't want to see me try.

[3]Actually p=0 should correspond to completely uncorrelated delta-function noise. (You can convince yourself by looking at the spherical harmonic expansion for a delta-function.) The reason that the bumps have a finite width is that I only summed the first 22,499 terms in the series (l=150 and below). So the size of the bumps gives a rough idea of my resolution.

[4] For those of you keeping score at home, it took me more than 6 days to figure out how to make these planets.

A Curious Footprint

2012-08-04T04:56:00.000-04:00

Lasers! Credit: JPL/Caltech

In less than two days, NASA's Mars Science Laboratory (MSL) / Curiosity rover will begin its harrowing descent to the Martian surface. If everything goes according to the kind-of-crazy-what-the-heck-is-a-sky-crane plan, this process will be referred to as "landing" (otherwise, more crashy/explodey gerunds will no doubt be used).

The MSL mission is run through NASA's Jet Propulsion Laboratory where, by coincidence, I happen to be at the moment. Now, I'm not working on this project, so I don't have a lot to add that isn't available elsewhere. BUT I do feel an authority-by-proximity kind of fallacy kicking in, so how about a post why not?

Preliminaries

Before we get started, I feel obligated to link to NASA's Seven Minutes of Terror video. If you haven't seen it yet, I highly recommend watching it right now (my favorite part is the subtitles). It has over a million views on YouTube and seems to have done a pretty good job at generating interest in the mission. Although, it's a shame they had to interview the first guy in what appears to be a police interrogation room. Oh well.

About the Rover

This thing is big. It's the size of a car and is jam-packed with scientific equipment. There's a couple different spectrometers, a bunch of cameras, a drill for collecting rock samples, and radiation detectors. Probably the coolest instrument onboard Curiosity is called the ChemCam. The ChemCam uses a laser to vaporize small regions of rock, which allows it to study the composition of things about 20 feet away.

In addition to the scientific payload, Curiosity also needs some way to generate power. Previous rovers had been powered by solar panels, but there don't appear to be any here. Instead, Curiosity is powered by the heat released from the radioactive decay of about 10 pounds of plutonium dioxide. This source will power the rover for ~~about a Martian year~~ well beyond the currently planned mission duration of one Martian year (about 687 Earth days) [Thanks to Nathan in the comments for pointing this out!].

To summarize, the rover is a nuclear-powered lab-on-wheels that shoots lasers out of its head. This is pretty cool.

In non-SI units, the MSL is roughly one handsome man (1 hm) tall

A Curious Footprint

There's been a lot of preparation at JPL this week for the upcoming landing. All the shiny rover models have been taken out of the visitor's center and put in a tent outside, presumably so there will be a pretty backdrop for press reports and the like.

Anyway, I was out taking pictures of the rovers at the end of the day today when someone pointed out something cool about the tires on Curiosity.

Here's a close-up:

Hole-y Tires

Each tire on the rover has "JPL" punched out in Morse code! Makes sense, though. If you're going to spend $2.5 billion on something, you might as well put your name on it.

Watch the Landing

If you want to watch the landing, check out the NASA TV stream. The landing is scheduled for Sunday night at 10:31 pm PDT (1:31 am EDT). Until then, it looks like they are showing a lot of interviews and other cool behind-the-scenes kind of stuff.

A Homemade Viscometer I

2012-07-24T22:32:00.000-04:00

Stirring a bowl of honey is much more difficult than stirring a bowl of water. But why? The mass density of the honey is about the same as that of water, so we aren't moving more material. If we were to write out Newton's equation, ma would be about the same, but yet we still need to put in much more force. Why? And can we measure it?

The reason that honey is harder to stir is of course that the drag on our spoon depends on more than just the density of the fluid. The drag also depends on the viscosity of the fluid -- loosely speaking, how thick it is -- and the viscosity of honey is about 400 times that of water, depending on the conditions. In fact, a quick perusal of the Wikipedia article on viscosity shows that viscosities can vary by a fantastic amount -- some 13 orders of magnitude, from easy-to-move gases to thick pitch that behaves like a solid except on long time scales. The situation is even more complicated than this, as some fluids can have a viscosity that changes depending on the flow.

I wanted to find a way to measure the viscosities of the stuff around me, so I made the viscometer pictured below for about $1.75 (the vending machines in Clark Hall are pretty expensive). How? Well, I

My homemade viscometer, taking data on the viscosity of water.

• Enjoyed the crisp, refreshing taste of Diet Pepsi from a 20 oz bottle (come on, sponsorships).
• Cut the top and bottom off the bottle, so all that was left was a straight tube.
• Mounted the bottle on top of a small piece of flat plastic.
• Mounted a single-tubed coffee stirrer horizontally out of the bottle (I placed the end towards the middle of the bottle to avoid end effects).
• Epoxied or glued the entire edge shut.
• Marked evenly-spaced lines on the side of the bottle.

I can load my "sample" fluid in the top of the Pepsi bottle, and time how long it takes for the sample level to drop to a certain point. A more viscous fluid will take more time to leave the bottle, with the time directly proportional to the viscosity. (This is a consequence of Stokes flow and the equation for flow in a pipe. It will always be true, as long as my fluid is viscous enough and my apparatus isn't too big.)

So we're done! All we need to do is calibrate our viscometer with one sample, measure the times, and then we can go out and measure stuff in the world! No need to stick around for the boring calculations! We can do some fun science over the next few blog posts!

But this is a physics blog written by a bunch of grad students, so I'm assuming that a few of you want to see the details. (I won't judge you if you don't though.) If we think about the problem for a bit, we basically have flow of a liquid through a pipe (i.e. the coffee stirrer), plus a bunch of other crap which hopefully doesn't matter much.

We first need to think about how the fluid moves. We want to find the velocity of the fluid at every position. This is best cast in the language of vector calculus -- we have a (vector) velocity field u at a (vector) position x. There are two things we know: 1) We don't (globally) gain or lose any fluid, and 2) Newton's laws F=ma hold. We can write these equations as the Navier-Stokes equations: \[ \vec{\nabla}\cdot \vec{u} = 0 \quad (1) \] \[ \rho \left( \frac {\partial \vec{u}} {\partial t} + (\vec{u}\cdot\vec{\nabla})\vec{u} \right) = - \vec{\nabla}p + \eta \nabla^2 \vec{u} \quad (2) \] The first equation basically says that we don't have any fluid appearing or disappearing out of nowhere, and the second is basically ma=F, except written per unit volume. (The fluid's mass-per-unit-volume is rho, the rate of change of our velocity is du/dt, and our force per unit volume is -grad(p), plus a viscous term eta*laplacian(u). The only complication is that du/dt is a total derivative, which we need to write as du/dt + du/dx*dx/dt.)

I won't drag you through the gory details, unless you want to see them, but it turns out that for my system the height of the fluid h (measured from the coffee stirrer) versus time t is \[ h(t) = h(0)e^{- t/T}, \quad T= 60.7 \textrm{sec} \times [\eta / \textrm{1 mPa*s}] \times [\textrm{ 1 g/cc} / \rho] \] [For my viscometer, the coffee stirrer has length 13.34 cm and inside diameter 2.4 mm, and the Pepsi bottle has a cross-sectional area of 36.3 square centimeters (3.4 cm inner radius). You can see how the timescale scales with these properties in the gory details section.]
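As a sanity check on that time scale, here is a sketch of where a number like 60.7 sec can come from. I'm assuming the standard textbook route: steady (Hagen-Poiseuille) pipe flow through the stirrer, Q = pi r^4 rho g h / (8 eta L), driven by the hydrostatic pressure rho g h, with the bottle draining as A dh/dt = -Q. That gives an exponential decay with T = 8 eta L A / (pi r^4 rho g). The function name and the value of g are my own choices; the dimensions are the ones quoted above.

```python
import math


def drain_time_constant(eta, rho, tube_length, tube_radius, bottle_area, g=9.81):
    """T = 8*eta*L*A / (pi * r**4 * rho * g), everything in SI units.

    Assumes Hagen-Poiseuille flow in the tube and ignores end effects.
    """
    return 8 * eta * tube_length * bottle_area / (math.pi * tube_radius**4 * rho * g)


# Dimensions quoted above, converted to SI; reference fluid with
# eta = 1 mPa*s and rho = 1 g/cc.
T_reference = drain_time_constant(
    eta=1.0e-3,            # Pa*s
    rho=1000.0,            # kg/m^3
    tube_length=0.1334,    # m  (13.34 cm)
    tube_radius=1.2e-3,    # m  (2.4 mm inside diameter)
    bottle_area=36.3e-4,   # m^2 (36.3 cm^2)
)
print(f"T ~ {T_reference:.1f} s for eta = 1 mPa*s, rho = 1 g/cc")
```

The result lands very close to the 60.7 sec quoted above. Notice the r^4 in the denominator: a small error in the stirrer's inside diameter turns into a big error in T, which is worth keeping in mind when thinking about systematic uncertainties.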

A run with measured heights vs times & error bars. The majority of the uncertainty turns out to come from not knowing the exact proportions of the viscometer. I don't know exactly why the heights are systematically deviating from the fit, but I suspect it's that my gridlines aren't perfectly lined up with the bottom of my viscometer (it looks like ~5 mm off would do it, which I can totally believe looking at the picture of my viscometer). However, because of the linearity of the equations for steady flow in a pipe, we know that the time scales linearly with the viscosity, so we should be able to accurately measure relative viscosities.

Well, how well does it work? Above is a plot of the height of water in my viscometer versus time, with a best-fit value from the equations above. To get a sense of my random errors (such as how good I am at timing the flow), I measured this curve 5 separate times. If I take into account the uncertainties in my apparatus setup as systematic errors, I find a value for my viscosity as \[ \eta \approx 1.429 \textrm{mPa* s} \pm 0.5 \% \textrm{Rand.} \pm 55\% \textrm{Syst.} \] The actual value of the viscosity of water at room temperature (T=25 C) is about 0.89 mPa*s, which is more-or-less within my systematic errors. So it looks like I won't be able to measure absolute values of viscosity accurately without a more precise apparatus. But if I look at the variation of my measured viscosity, I see that I should probably be able to measure changes in viscosity to 0.5% !! That's pretty good!
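For anyone who wants to try this at home, here is a sketch of one way to turn height-versus-time readings into a viscosity: fit a straight line to ln(h) versus t, read off the decay time T from the slope, and invert the formula T = 60.7 sec x [eta / 1 mPa*s] x [1 g/cc / rho]. This is not necessarily how the fit in the plot was done. The function names are made up, and the "data" below are synthetic, noise-free numbers generated from a known T just to show that the fit recovers it.

```python
import math


def fit_decay_time(times, heights):
    """Least-squares line through ln(h) vs t; returns T in h = h0*exp(-t/T)."""
    n = len(times)
    logs = [math.log(h) for h in heights]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    slope = (
        sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
        / sum((t - t_mean) ** 2 for t in times)
    )
    return -1.0 / slope


def viscosity_mPa_s(T, density_g_cc=1.0, T_ref=60.7):
    """Invert T = T_ref * (eta / 1 mPa*s) * (1 g/cc / rho) for eta."""
    return T / T_ref * density_g_cc


# Synthetic readings (NOT measured data): a made-up fluid with T = 121.4 s.
times = [0, 20, 40, 60, 80, 100]                          # seconds
heights = [10.0 * math.exp(-t / 121.4) for t in times]    # cm

T_fit = fit_decay_time(times, heights)
eta = viscosity_mPa_s(T_fit)
print(f"T ~ {T_fit:.1f} s, eta ~ {eta:.2f} mPa*s")
```

With real readings the heights have to be measured from the level of the coffee stirrer, otherwise ln(h) versus t will not be a straight line.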

Hopefully over the next couple weeks I'll try to use my viscometer to measure some interesting physics in the viscosity of fluids.

Batman, Helicopters, and Center of Mass

2012-06-26T11:10:00.000-04:00

A couple weeks ago, I came home after a long day at work looking for a break. I thought to myself, “What’s more fun than physics? Batman*.” I sat down to play the latest Batman videogame, in which Batman’s current objective was to use his grappling hook to jump onto an enemy helicopter to steal an electronic MacGuffin. As awesome as this was, it occurred to me that something was very wrong about the way the helicopter moved while Batman zipped through the air.

See if you can spot it too. (Watch for about 5 seconds after the video starts. Ignore the commentary. Note: The grunting noises are the sounds that Batman makes if you shoot him with bullets.)

What occurred to me was this: If the helicopter’s rotors provided enough lift to balance the force of gravity, wouldn’t Batman’s sudden additional weight cause the helicopter to fall out of the sky? Also, to get lifted up into the air, the helicopter must be pulling up on Batman: shouldn’t Batman also be pulling down on the helicopter? By how much should we expect to see the helicopter’s altitude change?

To address the first question, let's go to Newton's second law:
\[ \sum \vec{F} = m\vec{a} \]
Let’s assume that the helicopter is hovering stationary, minding its own business, when Batman jumps onto it. Let's also assume the helicopter pilots are totally oblivious to Batman and make no flight corrections after Batman jumps onto it. For the helicopter to hover, the lift from its rotors must exactly match the pull of gravity.

\[ \sum \vec{F} = \vec{F}_{rotors} - \vec{F}_{gravity} = 0 \]

Batman's sudden additional weight would cause the helicopter to start falling, as the forces would no longer balance:

\[ \sum \vec{F} = \vec{F}_{rotors} - \vec{F}_{gravity} - \vec{F}_{Batman} < 0 \]

So the helicopter does accelerate (and move) when Batman jumps onto it. How much does it move? Let’s assume there are no crazy winds or other external forces acting on the helicopter or Batman while Batman grapples onto the helicopter. “No external forces” means that momentum of helicopter + Batman does not change during Batman's flight.

Let's make things a little simpler and assume that neither Batman nor the helicopter had any vertical momentum before Batman used his grappling hook. (I can choose to approach this problem from a reference frame where the center of mass is stationary. Choosing a frame where the center of mass moves won't change the results, it will just make the calculation more complicated.) Because the momentum of helicopter + Batman does not change, then the center of mass does not move while Batman zips through the air:

\[ \frac{d}{dt} y_{COM} = \frac{d}{dt} \frac{m_{Bat} y_{Bat} + m_{Copter} y_{Copter}}{m_{Bat} + m_{Copter}} = \frac{p}{m_{Bat} + m_{Copter}} = 0 \]

The center of mass must remain stationary, so we can find how much the helicopter's height changes by if Batman starts on the ground (y = 0) and both end up at the same height with Batman hanging from the helicopter:

\[ y_{COM} = \frac{m_{Copter} y_{Copter} + m_{Bat} (0)}{m_{Bat} + m_{Copter}} = \frac{m_{Copter} y_{final} + m_{Bat} y_{final}}{m_{Bat} + m_{Copter}} \]
\[ \Delta y = y_{Copter} - y_{final} = \frac{m_{Bat}}{m_{Bat} + m_{Copter}} y_{Copter}\]

Now, some numbers: The police helicopters in the game are pretty small, probably about 1500 kg. Batman is a big guy who works out and probably weighs around 100 kg (220 lb). Plus, he’s wearing body armor (hence surviving when bullets hit him) and a utility belt and all of those other Bat-gadgets, which probably adds about 30 kg (~30 lb for the gadgets, ~30 lb for the armor). If Batman has to grapple onto a helicopter 30 meters above him, then the helicopter should drop out of the air by about 2.4 m. This is greater than the height of Batman himself, and would be noticeable if the helicopter physics in the game were perfect.
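Here is the same arithmetic as a few lines of code, in case you want to try your own superhero or your own helicopter. The function name is made up; the masses and the 30 m height are the estimates from the paragraph above, and the formula is just the center-of-mass result for the change in the helicopter's height.

```python
def helicopter_drop(m_bat, m_copter, height_above_batman):
    """Drop in helicopter height: m_Bat / (m_Bat + m_Copter) * y_Copter."""
    return m_bat / (m_bat + m_copter) * height_above_batman


m_batman = 100 + 30   # kg: Batman plus armor and gadgets
drop_police = helicopter_drop(m_batman, 1500, 30)
print(f"Small police helicopter (1500 kg): drops ~ {drop_police:.1f} m")

# Heavier helicopters move less.
for m_copter in (5000, 10000):
    drop = helicopter_drop(m_batman, m_copter, 30)
    print(f"Helicopter of {m_copter} kg: drops ~ {drop:.2f} m")
```

The drop scales linearly with how far Batman has to travel, so a longer grapple makes the effect even harder to ignore.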

Of course, if the helicopters appearing in the game were the giant army helicopters (they do carry rockets, after all), their mass would be much larger (~5000-10000 kg) so the effect of Batman’s additional weight would be much smaller.

None of these considerations detracted from the fun I had playing the game, but it did seem odd that the helicopters appeared to be nailed to the sky instead of moving freely through the air. I’ll be writing the game developers a strongly-worded letter directly.

* The DC superhero, not the city or the fish.

Tales from the Transit of Venus

2012-06-05T00:50:00.002-04:00

Sad Old Sun

Today is the transit of Venus, which, aside from being a totally rad astronomical event, is also the perfect excuse to tell my favorite story of an unlucky Frenchman (I have many).

This is by no means new and, if you've ever taken an astronomy course, you've probably already heard it. It is perhaps the closest thing Astronomy has to a ghost story, told through the glow of a flashlight on moonless nights to scare the children.

This is the story of Guillaume Le Gentil, a dude that just couldn't catch a break.

------

Guillaume Joseph Hyacinthe Jean-Baptiste Le Gentil de la Galaisière was a Frenchman with an incredibly long name. He was also an astronomer, though he hadn't started out that way. Monsieur Le Gentil (as his friends called him and so, then, shall we) had originally intended to enter the priesthood. However, he soon began sneaking away to hear astronomy lectures and quickly switched from studies of Heaven to those heavens more readily observed in a telescope.

Le Gentil happened to get into the astronomy game at a very exciting time. The next pair of Venus transits was imminent and astronomers were giddy with anticipation. Though the previous transit of 1639 had been predicted, it was met with little fanfare and only a few measurements. But the transits of 1761 and 1769 would be different. People would be ready.

And the stakes were higher this time, too. Decades after the 1639 transit, Edmund Halley (he of the-only-comet-people-can-name fame) calculated that with enough simultaneous measurements, the distance from the Earth to the Sun (the so-called astronomical unit, or AU) could be measured fairly accurately. Since almost all other astronomical distances were known in terms of the AU, knowing its precise value would essentially set the scale for the cosmos.

Brand new telescopes in hand, the astronomers of Europe set sail for locations all over the world.

Le Gentil had been assigned to observe the transit from Pondicherry, a French holding on the eastern side of India. On March 26th, 1760, he began his long sea voyage around the Cape of Good Hope towards India.

The voyage from France to India was a bit too long for the ship Le Gentil hitched a ride on and he only made it as far as Mauritius (a small island off Madagascar). Dropped off with all his equipment, Le Gentil was left waiting for any ship at all to take him to Pondicherry.

Perhaps it was the Curse of the Dodo or perhaps it was just bad luck, but while he was waiting, Le Gentil learned that war had broken out between the French and the British, making a trip to British India very difficult for a Frenchman.

Then the monsoon season started, meaning that even if he could find a ship, it would have to take a much longer route to India than initially planned and that it would be very difficult to make the journey before the transit occurred.

Then, he caught dysentery for the first time.

Finally, after months of waiting, Le Gentil (barely recovered from his illness) left Mauritius for India in February of 1761. Though time appeared to be running out, the captain of the ship he was on promised he would be there to observe the transit in June. About halfway to India, the winds switched directions and the ship was forced to turn back to Mauritius.

Le Gentil dutifully observed the transit of Venus in 1761 from a rocky ship in the middle of the Indian Ocean. The data were useless and he never attempted any analysis.

Although he missed the first transit, these things come in pairs separated by eight years. There was still another chance. And with all this time to prepare, there was no way he was going to miss the second one.

In fact, there was a bit too much time. But as a world-traveling 18th century man of science, Le Gentil had plenty of other interests to fill his days. He was particularly interested in surveying the region around Madagascar.

So he made a really nice map of Madagascar. And then he ate a bad bit of some kind of animal and came down with a terrible sickness. He describes this illness and its "cure" in his journals:

This sickness was a sort of violent stroke, of which several very copious blood-lettings made immediately on my arm and my foot, and emetic administered twelve hours afterwards, rid me of it quite quickly. But there remained for seven or eight days in my optic nerve a singular impression from this sickness; it was to see two objects in the place of one, beside each other; this illusion disappeared little by little as I regained my strength...

After recovering from both his sickness and the treatment, Le Gentil decided to begin his preparations for the 1769 transit of Venus. He calculated that either Manila or the Mariana Islands would be the ideal spot to observe. The Sun would be relatively high in the sky at both places when Venus passed by, meaning that the view would be through less atmosphere with a reduced chance of clouds passing through the line of sight.

Le Gentil packed up his stuff and headed off to Manila, where he could catch another ship to get to the Mariana Islands. Arriving in Manila in 1766, the astronomer found himself exhausted from months of sickness and sea-voyage. So, when he was offered passage on a ship heading to the Mariana Islands, he quickly declined.

That he chose not to depart Manila at that time was perhaps his one stroke of good luck in the entire journey. The ship sank. Writing in his journal, Le Gentil appears to have developed that particular sense of humor that generally accompanies constant disappointment:

It is true that only three or four people were drowned, those who were the most eager to save themselves, which is what almost always happens in shipwrecks. I cannot answer that I would not have increased the number of persons eager to save themselves.

In any case, Le Gentil was in Manila with plenty of time to prepare for the next transit.

Unfortunately, the astronomer may have over-prepared. Having arrived three years before the event, he now had three years to worry and second-guess his decision. It didn't help that the Spanish governor of Manila was kind of a crazy person. Not wanting to miss the observation of a lifetime owing to the whims of a mildly insane strongman, Le Gentil packed up his stuff and headed to Pondicherry.

Finally in Pondicherry, Le Gentil worked tirelessly to construct his observatory and make plenty of astronomical observations in preparation for the event. He had state-of-the-art equipment and had fully calibrated and double-checked everything. It was now nine years since his journey began and only a few days until the transit was scheduled to occur at sunrise on June 4th.

The entire month of May was beautiful weather and pristine observing conditions, as were the first few days of June. Le Gentil likely went to bed on the 3rd of June fully confident that the next morning would be no different.

He woke up early in the morning to begin preparations for his sunrise observations only to find clouds on the horizon. The clouds remained, obscuring the sun, all through the duration of the transit. A few hours after the end of the transit, the sun broke through the clouds and remained visible for the rest of the day. Le Gentil had missed his second transit in Pondicherry.

He sums it up in his journal:

That is the fate which often awaits astronomers. I had gone more than ten thousand leagues; it seemed that I had crossed such a great expanse of seas, exiling myself from my native land, only to be the spectator of a fatal cloud which came to place itself before the sun at the precise moment of my observation, to carry off from me the fruits of my pains and of my fatigues

In Manila, the Sun rose in perfectly clear skies.

Distraught, Le Gentil remained in bed for some weeks afterward. He soon caught a fever and missed the ship that was supposed to take him home.

He recovered, but then came down with dysentery again.

Barely recovered from his various illnesses, he managed to get a ride back to Mauritius. He caught a ship leaving the island in November of 1770.

The ship was struck by a hurricane and almost completely destroyed. It managed to limp back to Mauritius.

The second attempt proved more successful and Le Gentil finally "set foot on France at nine o'clock in the morning, after eleven years, six months and thirteen days of absence."

Though he had finally made it home, he was not out of the woods quite yet. In his absence, Le Gentil's heirs had tried to declare him dead to gain their inheritance, his accountant had mishandled (and lost) a large chunk of his holdings, and the Academy of Sciences, which had sent him on his 11 year mission, had given his seat to someone else.

It was not quite the welcome home he had hoped for.

Despite his seemingly never-ending misfortune, things did turn around for Le Gentil. He married, had a daughter, and was reinstated into the Academy of Sciences. Presumably, he lived out the rest of his days in relative happiness.

Le Gentil died in 1792. Keeping true to his style, this man who missed two of the most important astronomical events of his time fortunately managed to also miss the most important (and violent) political event of his time.

------

References:

I have mainly used a very nice series of historical papers on Le Gentil's misadventures with the transit of Venus written by Helen Sawyer Hogg. The papers were originally published in the Journal of the Royal Astronomical Society of Canada and can be accessed through NASA's ADS (Part 1, Part 2, Part 3, Part 4).

More Transit of Venus:

If you want to see the Transit of Venus without having to go on an eleven year voyage (or even leaving your room), check out the NASA live-feed from Mauna Kea.

How Cold is the Ground II

2012-05-26T21:28:00.000-04:00


Images from Wikipedia

Last week (ok, it was a little more than a few days ago....) I used dimensional analysis to figure out how the ground's temperature changes with time. But although dimensional analysis can give us information about the length scales in the problem, it doesn't tell us what the solution looks like. From dimensional analysis, we don't even know what the solution does at large times and distances. (Although we can usually see the asymptotic behavior directly from the equation.) So let's go ahead and solve the heat equation exactly:
\[
\frac {\partial T}{\partial t} = a \frac {\partial ^2 T}{\partial x^2} \quad (1)
\]

Well, what type of solution do we want to this equation? We want the temperature at the Earth's surface x=0 to change with the days or the seasons. So let's start out modeling this with a sinusoidal dependence -- we'll look for a solution of the form
\[
T(x,t) = A(x)e^{i wt}
\]
for some function A(x), then we can take the real part for our solution.

Plugging this into Eq. (1) gives A'' = iw/a * A, or

\[
A(x) = e^{ \pm \sqrt{w/2a } (1+i) x}
\]
Since we have a second-order ordinary differential equation for A, we have two possible solutions, which are like exp(+x) or exp(-x). Which one do we choose? Well, we want the temperature very far away from the surface of the ground to be constant, so we need the solution that decays with distance, A~exp(-x). Taking the real part of this solution, we find [1]
\[T(x,t) = T_0 \cos (wt - \sqrt{w/2a}\times x ) e^{-\sqrt{w/2a}x} \quad (2)
\]
Well, what does this solution say? As we expected from our scaling arguments last week, the distance scale depends on the square root of the time scale -- if we decrease our frequency by a factor of 4 (say, looking at changes over a season vs over a month), the ground gets cooler only 2x deeper. We also see that the temperature oscillation drops off quite rapidly as we go deeper into the ground, and that there is a "lag" the farther you go into the ground. In particular, we see that at distances deep into the ground, the temperature drops to its average value at the surface. You can see this all in the pretty plot below (generated with Python):
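If you'd like to poke at the solution yourself, here is a minimal sketch in plain Python. It takes the phase as wt - kx with k = sqrt(w/2a), which is what the decaying choice A ~ exp(-(1+i)kx) gives when you take the real part. The function names and the value a = 0.5 mm^2/s are illustrative assumptions, not the code that made the plot.

```python
import math


def ground_temperature(x, t, period, a, T0=1.0):
    """Oscillating part of the ground temperature at depth x (m) and time t (s).

    period : period of the surface oscillation, in seconds
    a      : thermal diffusivity, in m^2/s
    T0     : amplitude of the oscillation at the surface
    """
    w = 2.0 * math.pi / period        # angular frequency
    k = math.sqrt(w / (2.0 * a))      # inverse decay length, also the phase lag per meter
    return T0 * math.exp(-k * x) * math.cos(w * t - k * x)


def decay_length(period, a):
    """Depth at which the oscillation amplitude has dropped by a factor of e."""
    w = 2.0 * math.pi / period
    return math.sqrt(2.0 * a / w)


a = 0.5e-6                # m^2/s (0.5 mm^2/s, an assumed soil-like value)
day = 86400.0             # seconds
year = 365.25 * day

print(f"daily oscillation decays over  {decay_length(day, a):.2f} m")
print(f"yearly oscillation decays over {decay_length(year, a):.2f} m")
```

Quadrupling the period doubles `decay_length`, which is the square-root scaling described above.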

Let's recap. To model the temperature of the ground, we looked for a solution to the heat equation which had a sinusoidally oscillating temperature at x=0, and decayed to 0 at large x. We found such a solution, and it shows that the temperature decays rapidly as we go far into the ground. At this point, there are two questions that pop into mind: 1) Is the solution that we found unique? Or are there other possible solutions? 2) This is all well and good, but what if our days or seasons aren't perfect sines? Can we find a solution that describes this behavior?

I'll give one (1) VirtuosiPoint to the first commenter who can prove to what extent the above solution is unique [2]. But how about the second point? Can we solve this for non-sinusoidal time variations?

Well, at this point most of the readers are rolling their eyes and shouting "Use a Fourier series and move on." So I will. Briefly, it turns out that (more or less) any periodic function can be written as a sum of sines & cosines. So we can just add a bunch of sines and cosines together and construct our final solution.
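Here is a rough sketch of that recipe in plain Python: take a discrete Fourier transform of the surface temperature, then let each harmonic decay and lag with depth exactly as in Eq. (2). The function names are made up for illustration, and the surface signal is a synthetic square wave (six "warm" months at 20 C, six "cold" months at 0 C), not real data.

```python
import cmath
import math


def surface_coefficients(samples):
    """Discrete Fourier coefficients c_n of equally spaced surface samples."""
    N = len(samples)
    return [sum(s * cmath.exp(-2j * math.pi * n * m / N)
                for m, s in enumerate(samples)) / N
            for n in range(N)]


def ground_temperature_series(x, t, samples, period, a):
    """Temperature at depth x (m) and time t (s) for a periodic surface signal."""
    N = len(samples)
    c = surface_coefficients(samples)
    T = c[0].real                                # the mean does not decay with depth
    for n in range(1, N // 2 + 1):
        w = 2.0 * math.pi * n / period           # frequency of the n-th harmonic
        k = math.sqrt(w / (2.0 * a))             # its inverse decay length
        weight = 1.0 if 2 * n == N else 2.0      # the Nyquist term appears only once
        # c[n] and its conjugate partner c[N-n] combine into one real term
        T += weight * (c[n] * cmath.exp(1j * (w * t - k * x))).real * math.exp(-k * x)
    return T


YEAR = 365.25 * 86400.0
a = 0.5e-6                                       # m^2/s, assumed
samples = [20.0] * 6 + [0.0] * 6                 # synthetic square-wave "seasons"

for depth in (0.0, 1.0, 5.0, 20.0):
    print(depth, round(ground_temperature_series(depth, 0.0, samples, YEAR, a), 2))
```

At x = 0 the sum reproduces the samples, and far below the surface only the mean survives, because the higher harmonics die off even faster than the fundamental.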

So just for fun, here is a plot of the temperature of the ground in Ithaca (data from Wikipedia) over a year. (I used a discrete Fourier transform to compute the coefficients.)

The temperature (colorbar) is in degrees C, assuming a=0.5 mm^2/s.

Looks pretty boring, but I swear that all the frequencies are in that plot. It just turns out that the seasons in Ithaca are pretty sinusoidal. So about 20 meters below Ithaca, the temperature is a pretty constant 8 C.

While I was postponing writing this, I wondered what the temperature on Mercury's rocks would be. If we dig deep enough, can we find an area with habitable temperatures? Some quick Googlin' shows that the daytime and nighttime temperatures on Mercury are ~550-700 K and ~110 K at the "equator." While I don't think that Mercury's temperature varies symmetrically, let's assume so for lack of better data.[3] Then we'd expect that deep into the surface, the temperature would be fairly constant in time, at the average of these two extremes. Plugging in the numbers (assuming a~0.52 mm^2/s and using a Mercurial solar day as 176 days), we get

T=94 C, at 2.75 meters into the surface.

[1] More precisely, since the heat equation is linear and real, if T(x,t) is a solution to the equation, then so are 1/2(T+T*) or 1/2i(T-T*).

[2] Hint: It's not unique. For instance, here is another solution that satisfies the constraints, with no internal heat sources or sinks (I'll call it the "freshly buried" solution):

Can you prove that all the other solutions decay to the original solution? Or is there a second or even a spectrum of steady state solutions?

[3] If someone provides me with better data of the time variation of Mercury's surface at some specific latitude, I'll update with a full plot of the temperature as a function of depth and time.

How Cold is the Ground?

2012-05-18T00:20:00.000-04:00

It snowed in Ithaca a few weeks ago. Which sucked. But fortunately, it had been warm for the previous few days, and the ground was still warm so the snow melted fast. Aside from letting me enjoy the absurd arguments against global warming that snow in April birthed, this got me thinking: How cold is the ground throughout the year? At night vs. during the day? And the corollary: How cold is my basement? If I dig a deeper basement, can I save on heating and cooling? (I'm very cheap.)

Well, we want to know the temperature distribution T of the ground as a function of time t and position x. So some googlin' or previous knowledge shows that we need to solve the heat equation. For our purposes, we can treat the Earth as flat (I don't plan on digging a basement deep enough to see the curvature of the Earth), so we can assume the temperature only changes with the depth into the ground x:
\[
\frac {\partial T}{\partial t} = a \frac {\partial^2 T} {\partial x^2} \qquad (1)
\]
where a is the thermal diffusivity of the material, in units of square meters per second. It looks like we're going to have to solve some partial differential equations!

Or will we? We can get a very good estimate of how much the temperature changes with depth just by dimensional analysis. Let's measure our time t in terms of a characteristic time of our problem w (it could be 1 year if we were trying to see the change in the ground's temperature from summer to winter, or 1 day if we were looking at the change from day to night). Then we can write:
\[
\frac {\partial T } {\partial t} = \frac 1 w \frac {\partial T} {\partial t/w}
\]
... plugging this in Eq. (1), rearranging, and calling l= sqrt(w*a) gives....
\[
\frac {\partial T}{\partial (t/w)} = \frac {\partial ^2 T} {\partial (x/ l )^2}
\]
Now let's say we didn't know how to or didn't want to solve this equation. (Don't worry, we do & we will). From rearranging this equation, we see right away there is only one "length scale" in the problem, l. So if we had to guess, we could guess that the ground changes temperature a distance l into the ground. A quick look at Wikipedia for thermal diffusivities gives us the following table, for materials we'd find in the ground:

Material	a (mm^2/s)	l (cm), w = 1 day	l (m), w = 1 year
Polycrystalline Silica (glass, sand)	0.83	27	5.1
Crystalline Silica (quartz)	1.4	35	6.6
Sandstone	1.15	32	6.0
Brick	0.52	21	4.0
Soil	0.3-1.25	16-33	3.1-6.3

So we would expect that the temperature of the ground doesn't change much on a daily basis a foot or so below the ground, and doesn't ever change about 15-20 feet into the ground.
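The table entries are just l = sqrt(w*a) with the units untangled. Here is a small sketch that recomputes them, using the diffusivities quoted above; the helper name `length_scale` and the choice of 365.25 days per year are my own.

```python
import math

DAY = 86400.0              # seconds
YEAR = 365.25 * DAY        # seconds


def length_scale(a_mm2_per_s, w_seconds):
    """l = sqrt(w * a) in meters, for a diffusivity given in mm^2/s."""
    return math.sqrt(a_mm2_per_s * 1e-6 * w_seconds)


# Thermal diffusivities in mm^2/s, as listed in the table
materials = {
    "sand/glass": 0.83,
    "quartz": 1.4,
    "sandstone": 1.15,
    "brick": 0.52,
}

for name, a in materials.items():
    daily_cm = 100.0 * length_scale(a, DAY)
    yearly_m = length_scale(a, YEAR)
    print(f"{name:10s}  {daily_cm:4.0f} cm per day   {yearly_m:4.1f} m per year")
```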

Just to pat ourselves on the back for our skills at dimensional analysis, a quick check shows that permafrost penetrates 14.6 feet into the ground after 1 year. So our dimensional estimate looks pretty good!

In the next few days I'll solve this equation exactly and throw up a few pretty graphs, and maybe talk a little about PDE's and Fourier series in the process.

End of the Earth VII: The Big Freeze

2012-04-22T19:34:00.000-04:00

http://tinyurl.com/7rdj996

It is traditional here at The Virtuosi to plot the destruction of the earth. We also are making secret plans for our volcano lair and death ray. However, since it is earth day, we will only share with you the plans for the total doom of the earth, not the cybernetically enhanced guard dogs we're building for our moon base. The plan I reveal today is elegant in its simplicity. I intend to alter the orbit of the earth enough to cause the earth to freeze, thus ending life as we know it.

According to the internet at large, the average surface temperature of the earth is ~15 C. This average surface temperature is directly related to the power output of the sun. More precisely, it is directly related to the radiated power from the sun that the earth absorbs. Assuming that the earth's temperature is not changing (true enough for our purposes), the power radiated by the earth must be equal to the power absorbed from the sun. In symbols:

\[ P_{rad,earth}=P_{abs,sun}\]

Now, the radiated power goes as

\[P_{rad}=\epsilon \sigma A_{earth} T^4 \]

where A_earth is the surface area of the earth, T is the temperature of the earth, and epsilon and sigma are constants. I'll be conservative and say that I want to cool the temperature of the earth down to 0 C. The ratio of the power the earth will emit is

\[\frac{P_{new}}{P_{old}}=\frac{T_{new}^4}{T_{old}^4} \approx .81\]

Note that the temperature ratio must be done in Kelvin.

The power received from the sun (or any star) drops off as the inverse square of the distance from the sun to the point of interest:

\[P_{sun} \sim \frac{1}{r^2} \]

To reduce the power the earth receives from the sun to 81% of the current value would require

\[\frac{P_{sun,new}}{P_{sun,old}}=\frac{r_{old}^2}{r_{new}^2}=.81 \]

This tells us that the new earth-sun distance must be larger than the old (a good sanity check). In fact, it gives

\[r_{new}=1.11 r_{old} \]

So I'll need to move the earth by 11% of the current distance from the earth to the sun. No small task!
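As a sanity check on the doom arithmetic so far, here is a short sketch that goes from the two temperatures to the required orbit. The function name is illustrative, and it assumes (as above) that emitted power goes as T^4 and absorbed power as 1/r^2.

```python
def orbit_scale_for_temperature(T_old_C, T_new_C):
    """Return (power ratio, orbital radius ratio) needed to change the
    equilibrium temperature from T_old_C to T_new_C (both in Celsius)."""
    T_old = T_old_C + 273.15               # the ratio must be taken in Kelvin
    T_new = T_new_C + 273.15
    power_ratio = (T_new / T_old) ** 4     # P ~ T^4
    radius_ratio = power_ratio ** -0.5     # P ~ 1/r^2
    return power_ratio, radius_ratio


power_ratio, radius_ratio = orbit_scale_for_temperature(15.0, 0.0)
print(f"P_new/P_old = {power_ratio:.2f}, r_new/r_old = {radius_ratio:.2f}")
```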

The earth is in a circular orbit (or close enough). To change to a circular orbit of larger radius requires two applications of thrust at opposite points in the orbit. It turns out that the required boost in speed (the ratio of the speeds just before and after applying thrust) for the first boost of an object changing orbits is given by

\[\frac{v_{f}}{v_{i}}=\sqrt{\frac{2R_{f}}{R_i+R_f}}=1.026\]

To move from the transfer orbit to the final circular orbit requires

\[\frac{v_{f}}{v_{i}}=\sqrt{\frac{R_{i}+R_f}{2R_i}}=1.027\]

Note that despite the fact that we boost the velocity at both points, the velocity of the final orbit is less than that of the initial.
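The two boost factors, and the claim that the final orbit is nevertheless slower, can be checked with a few lines. This is only a sketch of the two formulas above plus the standard result that circular orbital speed goes as 1/sqrt(R); the function name is my own.

```python
import math


def transfer_boosts(R_i, R_f):
    """Speed ratios (after/before) for the two burns of a two-burn transfer
    from a circular orbit of radius R_i to one of radius R_f."""
    first = math.sqrt(2.0 * R_f / (R_i + R_f))      # leave the inner circular orbit
    second = math.sqrt((R_i + R_f) / (2.0 * R_i))   # circularize at the outer radius
    return first, second


# Radii in units of the current earth-sun distance
first, second = transfer_boosts(1.0, 1.11)

# Circular orbital speed scales as 1/sqrt(R)
final_over_initial = math.sqrt(1.0 / 1.11)

print(f"first boost:  {first:.3f}")
print(f"second boost: {second:.3f}")
print(f"final circular speed / initial circular speed: {final_over_initial:.3f}")
```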

Now, how could we apply that much thrust? Well, the change in momentum for the earth from each stage is roughly (ignoring the slight velocity increase of the transfer orbit)

\[\Delta p = .03M_E v_E \]

The mass of the earth is ~6*10^24 kg, the orbital velocity is ~30 km/s, so

\[\Delta p = 5\cdot 10^{27} kg*m/s\]

A solid rocket booster (the booster rocket used for shuttle launches, when those still happened) can apply about 12 MN of force for 75 s (thank you wikipedia). That's a net momentum change of ~9*10^8 kg*m/s (900 million!). So we would only need

\[\frac{2*5\cdot 10^{27}}{9\cdot 10^{8}} \approx 1.1 \cdot 10^{19}\]

That's right, only 11 billion billion booster rockets! With those I can freeze the earth. I assure you that this plan is proceeding on schedule, and will be ready shortly after we have constructed our volcano lair.

Earth Day 2012: Escape to the Moon

2012-04-22T15:12:00.000-04:00

It is now Earth Day 2012, and, according to the Mayan predictions, The Virtuosi will destroy the earth. In a futile attempt to fight my own mortality, I decided to send something to the Moon. It seems, for a poor graduate student trying to get to the Moon, the most difficult part is the Earth holding me back. So first I'll focus on escaping the Earth's gravitational potential well, and if that's possible, then I'll worry about more technical problems, such as actually hitting the moon. Moreover, in honor of the destructive spirit of The Virtuosi near Earth Day, I'll try to do this in the most Wile E. Coyote-esque way possible.

Preliminaries

If we want to get to the Moon, we need to first figure out how much energy we'll need to escape the Earth's gravitational pull. "That's easy!" you say. "We need to escape a gravitational well, and we know from Newton's law that the potential from the gravity of a spherical mass ME for a test mass m is:
\begin{equation}
\Phi = - \frac {G M_{E} m}{r}
\label{eqn:gravpot}
\end{equation}
We're currently sitting at the radius of the Earth RE, so we simply need to plug this value in and we'll find out how much energy we need." This is all well and good, but i) I can never remember what the gravitational constant G is, and ii) I have no idea what the mass of the Earth ME is. So let's see if we can recast this in a form that's easier to do mental arithmetic in.

Well, we know that the force of gravity is related to the potential by:
\[
\vec{F}(r) = - \vec{\nabla} \Phi = - \frac {d\Phi}{dr} \hat{r} \\
\vec{F} = - \frac {G m M_E } {r^2}
\label{eqn:gravforce}
\]
Moreover, we all know that the force of gravity at the Earth's surface is F(r=RE)=-mg. Substituting this in gives:
\[
\frac {G m M_E} {R_E^2} = m g \quad \textrm{, or}
\]
\begin{equation}
\frac {G m M_E}{R_E} = m g {R_E} \quad .
\label{eqn:betterDef}
\end{equation}
So the depth of the Earth's potential well at the Earth's surface is mgRE. If we use g = 9.8 m/s^2 ~ 10 m/s^2 and RE = 6378 km ~ 6x10^6 m, then we can write this as
\begin{equation}
\Delta \Phi = m g {R_E} \approx m \times 6 \times 10^7 \textrm{m}^2/\textrm{s}^2 \quad \textrm{(1)},
\end{equation}
give or take.

How fast do we need to go if we're going to make it to the Moon? Well, at the minimum, we need the kinetic energy of our object to be equal to the depth of the potential well [1], or
\[
\frac 1 2 m v^2 = 6 m \times 10^7 \textrm{m}^2/\textrm{s}^2 \quad \textrm{or} \\
v \approx 1.1 \times 10^4 \textrm{ m/s (2)} .
\]
So we need to go pretty fast -- this is about Mach 33 (33 times the speed of sound in air). At this speed, we'd get from NYC to LA in under 7 minutes. Looks difficult, but let's see just how difficult it is.
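Here is the same estimate as a quick sketch, once with the mental-arithmetic values and once with the less rounded ones. The speed of sound (340 m/s) is my assumption for converting to Mach number.

```python
import math

SPEED_OF_SOUND = 340.0     # m/s near sea level (assumed for the Mach conversion)


def escape_speed(g, R):
    """Speed at which (1/2) m v^2 equals the well depth m g R."""
    return math.sqrt(2.0 * g * R)


v_rounded = escape_speed(10.0, 6.0e6)     # the mental-arithmetic values
v_exact = escape_speed(9.8, 6.378e6)      # the less rounded values

print(f"rounded: {v_rounded:.3g} m/s, Mach {v_rounded / SPEED_OF_SOUND:.0f}")
print(f"exact:   {v_exact:.3g} m/s, Mach {v_exact / SPEED_OF_SOUND:.0f}")
```

Both versions land at roughly 1.1 x 10^4 m/s, so the rounding costs us very little.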

Attempt I: Shoot to the Moon

What goes fast? Bullets go fast. Can we shoot our payload to the moon? Let's make some quick estimates. First, can we shoot a regular bullet to the moon? Well, we said that we need to go about Mach 33, and a fast bullet only goes about Mach 2, so we won't even get close. Since energy is proportional to velocity squared, we'll only have (2/33)^2 ~ 0.4 % of the necessary kinetic energy. [2]

So let's make a bigger bullet. How big does it need to be? Well, loosely speaking, we have the chemical potential energy of the powder being converted into kinetic energy of the bullet. Let's assume that the kinetic energy transfer ratio of the powder is constant. If a bullet receives kinetic energy 1/2mbvb^2 from a mass mP of powder, then for our payload to have kinetic energy 1/2 M V^2, we need a mass of powder MP such that
\begin{equation}
\frac {M_P} {m_P} = \frac M {m_b} \times \frac {V^2}{v_b^2}
\end{equation}
A quick reference to Wikipedia for a 7.62x51mm NATO bullet shows that ~25 grams of powder propels a ~10 gram bullet at a speed of ~Mach 2.5. We need to get our payload moving at Mach 33, so (V/vb)^2 ~ 175. If we send a 10 kg payload to the Moon, we have M/mb ~ 1000. So we'll need about 1.75 x 10^5 times the amount of powder of a bullet to get us to the Moon, or about 4400 kg, which is 4.8 tons (English) of powder.
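The scaling argument fits in one function. This sketch just applies the ratio above with the bullet numbers quoted in the text as defaults; the function name is illustrative, and it keeps the text's assumption that powder converts to kinetic energy at a constant efficiency.

```python
def powder_mass(payload_kg, payload_mach,
                bullet_kg=0.010, bullet_mach=2.5, bullet_powder_kg=0.025):
    """Powder (kg) needed to launch the payload, scaled up from a rifle round.

    Assumes the same fraction of the powder's energy ends up as kinetic
    energy, so powder mass scales as (mass ratio) * (speed ratio)^2.
    """
    mass_ratio = payload_kg / bullet_kg
    speed_ratio_sq = (payload_mach / bullet_mach) ** 2
    return bullet_powder_kg * mass_ratio * speed_ratio_sq


moon_powder = powder_mass(10.0, 33.0)
print(f"powder needed: about {moon_powder:.0f} kg")
```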

That's a lot of gunpowder to get us to the Moon. For comparison, if we are going to construct a tube-like "case" for our 10 kg bullet-to-the-Moon, it will have to be about half a meter in diameter and 17 feet tall. So I'm not going to be able to shoot anything to the Moon anytime soon.

Attempt II: Charge to the Moon

OK, shooting something to the Moon is out. Can we use an electric field to propel something to the Moon? Well, we would need to pass a charged object through a potential difference such that
\begin{equation}
q \Delta \Phi_E = m g R_E = 6 m \times 10^7 \textrm{m}^2/\textrm{s}^2 \quad .
\label{eqn:chargepot}
\end{equation}
After the humiliation of the last section, let's start out small. Can we send an electron to the Moon? We could plug numbers into this equation, but I'm too lazy to look up all those values. Instead, we know that we need to get our electron (rest mass 511 keV) to a speed which is (Eq. 2)
\[v \approx 1.1 \times 10^4 \textrm{m/s} \approx 4 \times 10^{-5} c.
\]
So an electron moving at this velocity will have a kinetic energy of
\[ \textrm{KE} = m c^2 \times \frac 1 2 \frac {v^2}{c^2} = 511 \textrm{ keV} \times \frac 1 2 \frac {v^2}{c^2} \\
\qquad \approx 511 \textrm{ keV} \times 0.8 \times 10^{-9} \approx 0.4 \times 10^{-3} eV.
\]
So we can give an electron enough kinetic energy to get to the moon with a voltage difference of 0.4 mV, assuming it doesn't hit anything on the way up (it will).

We can send an electron to the Moon! How about a proton? Well, the mass of a proton is 1836x that of an electron, but with the same charge, so we'd need 1836 * 0.4 mV ~ 0.73 V to get a proton to the Moon -- again, pretty easy. Continuing this logic, we can send a singly-charged ion with mass 12 amu (i.e. C-) with a 9V battery, and a singly-charged ion with mass 150 amu (something like caprylic acid) using a 110V voltage drop. (Again, assuming these don't hit anything on the way up.)
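For the lazy (like me), here is a sketch that turns a rest energy into the required voltage for a singly-charged particle. It uses the rounded v/c of 4 x 10^-5 from above as the default; the rest energies for the proton and the atomic mass unit are standard values that I've added, and the function name is hypothetical.

```python
BETA = 4e-5                     # v/c for escape, rounded as in the text

ELECTRON_REST_EV = 511e3        # eV
PROTON_REST_EV = 938.3e6        # eV (added, standard value)
AMU_REST_EV = 931.5e6           # eV per atomic mass unit (added, standard value)


def launch_voltage(rest_energy_eV, beta=BETA):
    """Voltage (V) giving a singly-charged particle the kinetic energy
    (1/2) m v^2 = m c^2 * beta^2 / 2 (non-relativistic)."""
    return 0.5 * rest_energy_eV * beta ** 2


v_electron = launch_voltage(ELECTRON_REST_EV)
v_proton = launch_voltage(PROTON_REST_EV)
v_12amu = launch_voltage(12 * AMU_REST_EV)
v_150amu = launch_voltage(150 * AMU_REST_EV)

print(f"electron: {1e3 * v_electron:.2f} mV")
print(f"proton:   {v_proton:.2f} V")
print(f"12 amu:   {v_12amu:.1f} V")
print(f"150 amu:  {v_150amu:.0f} V")
```

Using the unrounded v/c of about 3.7 x 10^-5 instead makes every voltage roughly 15% smaller, which doesn't change the conclusions.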

How about our 10 kg object? Let's say we can miraculously charge it with 0.01 C of charge. [3] Then from Eq. (1), we'd need
\[ 0.01 C \times \Delta \Phi_E \approx 6 \times 10^8 \textrm{ J ,}
\]
or a potential difference of
\[
\Delta \Phi_E = 6 \times 10^{10} \textrm{ V. }
\]
That is a HUGE potential drop. For comparison, if we have 2 parallel plates with a surface charge of 0.01 C/m^2 (again, a huge charge density), they'd have to be a distance
\[
d = 6 \times 10^{10} \textrm{V} \times \epsilon_0 / (0.01 \textrm{C/m}^2) \approx 53 \textrm{ meters apart}
\]

It looks like I won't be able to send something to the Moon using tools from my basement anytime soon.

[1] We'll ignore both air resistance and the Moon's gravitational attraction for simplicity.

[2] Since the potential U ~ - 1/r, if we increase our potential energy by 0.4%, this is (to 1st order) the same as increasing r by 0.4%. So we'll get 0.004 * 6378 km ~ 25 km above the Earth's surface. Of course air resistance slows it down a lot.

[3] According to Wikipedia, this is 0.04% of the total charge of a thundercloud. And if our object is a uniformly charged sphere with a radius of 1 m, it will have an electrical self-energy of
\[
U = \frac 1 2 \int \epsilon_0 E^2 dV = \frac 3 5 \frac {Q^2}{4 \pi \epsilon_0 R} \approx 540 \textrm{ kJ}
\]

Money for (almost) Nothing

2012-03-29T01:27:00.001-04:00

Five Hundred Mega Dollars, to be precise.
(Image from Wikipedia)

I am not typically interested in lotteries. They seem silly and I am seriously beginning to question their usefulness in bringing about a good harvest. But this morning I read in the news that the Mega Millions lottery currently has a world record jackpot up for grabs. In fact, the jackpot is so big...

Tonight Show Audience: HOW BIG IS IT?

It is so big that I decided to do a little bit of analysis on the expected returns. Zing!

Some Background

First, a little background. The Mega Millions lottery is an aptly named lottery in which numbered ping pong balls are pulled from a giant rotating tub of randomization. Five of these are drawn from one tub of 56 balls, with no replacement. The sixth ball (the so-called "Mega Ball") is drawn from a separate tub of 46 balls.

To play, one picks 5 different numbers (1-56) for the regular draws and one number (1-46) for the Mega Ball. The first five can match in any order, but the last ball has to match with the Mega Ball. Prizes are given out based on how many numbers you match.

Stolen from the Mega Millions website, the prizes and odds are given in the table below. The current jackpot is listed at $500 million (if taken in annuity) or $359 million if taken in an up-front lump sum. It costs $1 to play.

Don't worry about the asterisk. It just says CA is lame.
(Source: Mega Millions)

Hot diggity daffodil, we're ready to get going!

Expected Winnings

Alright, so it costs $1 to play and we could potentially win $500 million. It sure feels like it is worth it to play (what's the harm?). But we can do better than feelings, we have... MATH!

Since we have an exhaustive list of outcomes and their probabilities (each of which is just the inverse of the big number in the "chances" column), we can calculate the expectation value for our winnings. The expectation value is just the sum over all the possible prize values times the probability of winning that prize. In other words,

\[\langle W \rangle = \sum_i W_i \times p_i, \]

where we denote our expected winnings in angled brackets.

In essence, this value represents the average prize you would win if you played this lottery over and over and over again (or played all the combinations of numbers).

Setting the jackpot to $500 million, we can now compute the expected winnings as

\[ \langle W \rangle = \frac{\$ 500,000,000}{175,711,536} + \frac{\$ 250,000}{3,904,701} +\frac{\$ 10,000}{689,065} + \frac{\$ 150}{15,313} + \frac{\$ 150}{13,781} \]

\[+ \frac{\$ 10}{844} + \frac{\$ 7}{306} + \frac{\$ 3}{141} + \frac{\$ 2}{75}\]

A few flicks of the abacus later, we find that the expectation value of our prize is

\[\langle W \rangle = \$ 3.02,\]

which means that after we subtract the dollar we paid for the ticket, our expected return is $2.02.

But what if we had chosen to take our winnings as a lump sum of $359 million instead of the $500 million paid out over a span of 26 years? In that case we find

\[\langle W \rangle = \$ 2.22,\]

which results in a $1.22 gain when we subtract the dollar we paid for the ticket.
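For anyone who would rather not trust an abacus, here is a minimal Python sketch of the same sum. The prize values and odds are copied from the expression above; the names `SMALLER_PRIZES`, `JACKPOT_ODDS`, and `expected_winnings` are made up for illustration.

```python
# Prize table from the expression above: (prize in dollars, "1 in N" odds).
# The jackpot is passed in separately so both payout options can be compared.
SMALLER_PRIZES = [
    (250_000, 3_904_701),
    (10_000, 689_065),
    (150, 15_313),
    (150, 13_781),
    (10, 844),
    (7, 306),
    (3, 141),
    (2, 75),
]
JACKPOT_ODDS = 175_711_536


def expected_winnings(jackpot):
    """Sum of prize * probability over every winning outcome."""
    total = jackpot / JACKPOT_ODDS
    for prize, odds in SMALLER_PRIZES:
        total += prize / odds
    return total


for label, jackpot in [("annuity", 500e6), ("lump sum", 359e6)]:
    w = expected_winnings(jackpot)
    print(f"{label}: expected winnings ${w:.2f}, net of ticket ${w - 1:.2f}")
```

The results should land within about a cent of the $3.02 and $2.22 quoted above; the last digit depends on whether you round or truncate.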

At least in a statistical sense for this particular jackpot, one is better off playing than not playing. But are we forgetting anything?

The Taxman

If you win a $500 million jackpot, do you really get a $500 million jackpot? Well, no. For winnings in a lottery over $5000, the IRS withholds 25% in federal income taxes. Additionally, the winnings are subject to state taxes as well. For example, if I were to win, the great state of New York would be entitled to about 6.8% (apparently also just for winnings above $5000).

After applying federal and state taxes to the prizes above $5000, we now have an expected winnings of

\[ \langle W \rangle = \left[1-(0.25 + 0.068)\right]\times\left(\frac{\mbox{Jackpot}}{175,711,536} + \frac{\$ 250,000}{3,904,701} +\frac{\$ 10,000}{689,065}\right) \]

\[+ \frac{\$ 150}{15,313} + \frac{\$ 150}{13,781}+ \frac{\$ 10}{844} + \frac{\$ 7}{306} + \frac{\$ 3}{141} + \frac{\$ 2}{75},\]

which gives an expected net win (minus the $1 for the ticket) of $1.10 for the $500 million annuity prize and $0.55 for the $359 million up-front lump sum.

We're still in the black, but it's slowly slipping away. Is there anything else we need to factor in? Well, yes. For one thing, winning the jackpot qualifies us for the top tax bracket, so most of the winnings would be taxed at the top marginal tax rate of 35%. Welcome to the 1%, kids! [1].

Changing the federal tax rate on the jackpot from 25% to 35% and recalculating, we find net expected winnings of $0.81 for the $500 million annuity and $0.34 for the $359 million lump sum. Surprisingly, it is still worth it in a statistical sense.
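The taxed version is easy to script as well. The sketch below follows the formula as written above, which means it assumes a single combined rate (federal plus 6.8% state) applied to all three prizes over $5000, and no tax on the smaller ones. The function and constant names are hypothetical.

```python
JACKPOT_ODDS = 175_711_536
TAXED_PRIZES = [(250_000, 3_904_701), (10_000, 689_065)]    # above $5000
UNTAXED_PRIZES = [(150, 15_313), (150, 13_781), (10, 844),
                  (7, 306), (3, 141), (2, 75)]               # below $5000
STATE_RATE = 0.068
TICKET_PRICE = 1.00


def net_expected_return(jackpot, federal_rate):
    """Expected winnings after tax, minus the price of the ticket."""
    keep = 1.0 - (federal_rate + STATE_RATE)
    taxed = jackpot / JACKPOT_ODDS + sum(p / n for p, n in TAXED_PRIZES)
    untaxed = sum(p / n for p, n in UNTAXED_PRIZES)
    return keep * taxed + untaxed - TICKET_PRICE


for federal_rate in (0.25, 0.35):
    for label, jackpot in [("annuity", 500e6), ("lump sum", 359e6)]:
        net = net_expected_return(jackpot, federal_rate)
        print(f"federal {federal_rate:.0%}, {label}: net ${net:.2f}")
```

Under that assumption the four printed values should agree with the figures quoted above to within about a cent.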

Is it always like this?

One thing to keep in mind as we make these estimates is that this is a historically large jackpot. So even though it may be favorable to play this time, this will not always be the case. In fact, we can find the minimum jackpot value for which this is the case.

The condition in which our expected return is a gain (rather than a loss) is

\[ \langle W \rangle - \$1.00 > 0. \]

For simplicity, let's ignore the top marginal tax rate and just factor in the 25% withholding and the 6.8% state tax. Solving for the minimum jackpot using the expression for the expected winnings that we found in the last section, we see that

\[ \mbox{Jackpot}_{min} = \$217~\mbox{million}.\]
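For those who want to check the threshold, here is a sketch of the algebra. The shorthand is introduced only for this check and is not part of the original calculation: A collects the two taxed non-jackpot terms and B collects the six untaxed ones.

\[ A = \frac{\$ 250,000}{3,904,701} + \frac{\$ 10,000}{689,065} \approx \$ 0.0785, \qquad B = \frac{\$ 150}{15,313} + \frac{\$ 150}{13,781} + \frac{\$ 10}{844} + \frac{\$ 7}{306} + \frac{\$ 3}{141} + \frac{\$ 2}{75} \approx \$ 0.1033 \]

Setting the taxed expected winnings equal to the $1 ticket price and solving for the jackpot gives

\[ 0.682 \times \left( \frac{\mbox{Jackpot}_{min}}{175,711,536} + A \right) + B = \$ 1 \quad \Rightarrow \quad \mbox{Jackpot}_{min} = 175,711,536 \times \left( \frac{\$ 1 - B}{0.682} - A \right) \approx 175,711,536 \times \$ 1.236 \approx \$217~\mbox{million}. \]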

Technically, this would have to be the amount actually awarded by the payment method of your choice. The stated jackpot is always the annuity method (because it looks higher). The lump sum offering is at most about 70% of the stated jackpot. So if you want to take the lump sum offering the stated jackpot will need to be

\[ \mbox{Jackpot}_{min} = \$217~\mbox{million} / 0.7 = \$310~\mbox{million}.\]

In fact, these values are likely a bit low, since we have not included the increase to the marginal tax rate, nor have we included other effects like having to split a prize (which seems to happen a lot) or inflation effects if you take the prize in yearly installments.

In any case, a quick look through the jackpot history shows that these threshold values are only met occasionally. An eyeball estimate suggests that about one jackpot per year exceeds the (absolute) minimum $217 million threshold.

So am I going to win?

No. No, you will not. BUT if you played record setting lotteries hundreds of millions of times, you might see decent (~10%) returns. Although, it may just be easiest to, you know, invest that money.

Only One Useless Footnote

[1] Although, to be fair, the top marginal tax rate is currently at historical lows. It could always be worse...

Pi storage

2012-03-14T15:13:00.001-04:00

Let me share my worst "best idea ever" moment. Sometime during my undergraduate years I thought I had solved all the world's problems.

You see, on this fateful day, my hard drive was full. I hate it when my hard drive fills up, it means I have to go and get rid of some of my stuff. I hate getting rid of my stuff. But what can someone do?

And then it hit me, I had the bright idea:

What if we didn't have to store things, what if we could just compute files whenever we wanted them back?

Sounds like an awesome idea, right? I know. But how could we compute our files? Well, as you may know pi is conjectured to be a normal number, meaning its digits are probably random. We also know that it is irrational, meaning pi never ends....

Since its digits are random, and they never end, in principle any sequence you could ever imagine should show up in pi eventually. In fact there is a nifty website here that will let you search for arbitrary strings (using a 5-bit format) in the first 4 billion digits; for example, "alemi" seems to show up at around digit 3149096356.

So in principle, I could send you just an index, and a length, and you could compute the resulting file.

But wait you cry, isn't computing digits of pi hard, don't people work really hard to compute pi farther and farther? Hold on I claim, first of all, I'm imagining a future where computation is cheap. Secondly, there is a really neat algorithm, the BBP algorithm, that enables you to compute the kth binary digit of pi without knowing any of the preceding digits. In other words, in principle if you wanted to know the 4 billionth digit of pi, you can compute it without having to first compute the first 4 billion other digits.
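For the curious, here is a minimal sketch of BBP-style digit extraction in Python. A few caveats: it works in base 16 (each hexadecimal digit is four binary digits), it uses ordinary floating point and so should only be trusted for modest positions, and the function name `pi_hex_digit` is made up for this illustration.

```python
def pi_hex_digit(n):
    """Hex digit of pi at position n after the point (n = 0 is the first one)."""

    def series(j):
        # Fractional part of the sum over k of 16^(n-k) / (8k + j).
        total = 0.0
        for k in range(n + 1):
            denom = 8 * k + j
            # Modular exponentiation keeps the large powers of 16 small.
            total = (total + pow(16, n - k, denom) / denom) % 1.0
        # Tail of the series: terms with k > n shrink by 16x each step.
        k = n + 1
        while True:
            term = 16.0 ** (n - k) / (8 * k + j)
            if term < 1e-17:
                break
            total += term
            k += 1
        return total % 1.0

    frac = (4 * series(1) - 2 * series(4) - series(5) - series(6)) % 1.0
    return int(frac * 16)


digits = "".join(f"{pi_hex_digit(n):X}" for n in range(8))
print(digits)
```

Nothing before position n is ever stored; the modular exponentiation throws away everything except the fractional part that matters for the digit being asked about.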

Cool, this is beginning to sound like a really good idea. What's the catch?

Perhaps you've already gotten a taste of it. Let's try to estimate just how far along in pi we would have to look before our message of interest shows up.

Let's assume we have written our file in binary, and are computing pi in binary e.g.

11.00100100 00111111 01101010 10001000 10000101 10100011 00001000 11010011

etc. So, if the sequence is random, there is a 1/2 chance that at any point we get the right starting bit of our file, and then a 1/2 chance we get the next one, etc. So the chance that we would create our file if we were randomly flipping coins would be
\[ P = \left( \frac{1}{2} \right)^N = 2^{-N} \]
if our file was N bits long.

So where do we expect this sequence to first show up in the digits of pi? Well, this turns out to be a subtle problem, but we can get a feel for it by assuming that we compute N digits of pi at a time and see if it's right or not. If it's not, we move on to the next group of N digits; if it's right, we're done. If this were the case, we should expect to have to draw about
\[ \frac{1}{P} = 2^N \]
times until we have a success, and since each trial ate up N digits, we should expect to see our file show up after about
\[ N 2^N \]
digits of pi.
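The block-at-a-time estimate is easy to test with a quick simulation. This is only a sketch under a loud assumption: fair coin flips from Python's `random` module stand in for the binary digits of pi, and the function name and parameters are invented for illustration.

```python
import random


def blocks_until_match(n_bits, rng):
    """Draw random n_bits-wide blocks until one equals the target; return the count."""
    target = rng.getrandbits(n_bits)
    count = 1
    while rng.getrandbits(n_bits) != target:
        count += 1
    return count


rng = random.Random(0)   # fixed seed so the run is repeatable
n_bits = 8
trials = 2000
mean_blocks = sum(blocks_until_match(n_bits, rng) for _ in range(trials)) / trials

print(f"average blocks drawn: {mean_blocks:.1f} (2^N = {2 ** n_bits})")
print(f"average digits used:  {mean_blocks * n_bits:.0f} (N 2^N = {n_bits * 2 ** n_bits})")
```

With N = 8 the average number of blocks should hover around 2^8 = 256, give or take a few percent from run to run.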

Great, so instead of handing you the file, I could just hand you the index at which the file is located. But how many bits would I need to tell you that index? Well, just like we know that 10^3 takes 4 digits to express in decimal, and 6 x 10^7 takes 8 digits to express, in general it takes
\[ d = \lfloor \log_b x \rfloor + 1 \]
digits to express a number x in base b. Dropping the floor (which changes the answer by less than one digit), in this case it takes about
\[ d = \log_2 ( N 2^N ) + 1= \log_2 2^N + \log_2 N + 1 = N + \log_2 N + 1 \]
digits to express this index in binary.

And there's the rub. Instead of sending you N bits of information contained in the file, all my genius compression algorithm has managed to do is replace the N bits of information in the file with a number that takes $\sim N + \log_2 N$ bits to express. I've actually managed to make the files larger, not smaller!

You may have noticed above, that even for the simple case of "alemi", all I managed to do was swap the binary message

alemi -> 0000101100001010110101001
with the index 3149096356 -> 10111011101100110110010110100100

which is longer in binary!
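The bit counting can be checked in a few lines of Python. The 5-bit alphabet (a = 1 through z = 26) is inferred from the bit string shown above rather than taken from the search site's documentation, so treat `five_bit` as an assumption.

```python
def five_bit(text):
    """Encode lowercase letters as 5-bit codes with a=1, b=2, ..., z=26."""
    return "".join(format(ord(ch) - ord("a") + 1, "05b") for ch in text)


message_bits = five_bit("alemi")
index = 3149096356
index_bits = format(index, "b")

print(message_bits, len(message_bits))   # the message and its length in bits
print(index_bits, len(index_bits))       # the index and its length in bits

# Size of index predicted by the N * 2^N estimate, for comparison.
n = len(message_bits)
estimated_index_bits = (n * 2 ** n).bit_length()
print(estimated_index_bits)
```

The message is 25 bits and the index that replaces it is 32 bits, while the rough estimate predicts about 30 bits. Either way, the "compressed" version is longer.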

As an aside, you may have felt uncomfortable with my estimation for how long we have to wait to see our message, and you would be right. Just because all N digits I draw at a time don't match up doesn't mean that the second half isn't useful. For instance, if I was looking for 010, let's say some of the digits are 101,010. While neither of those blocks matches, if I was looking at every digit at a time, I would have found a match. Smarter people than I have computed just how long you should have to wait, and end up with the better estimation
\[ \text{wait time} \sim 2^N N \log 2 \]
which is pretty darn close to our silly estimate.

Calculator Pi

2012-03-14T14:16:00.004-04:00

There is a very fast converging algorithm for computing pi that you can do on a desktop calculator.

Set x = 3
Now set x = x + sin(x)
Repeat

This converges ridiculously fast: after 1 step you get 4 digits right, after 2 steps you get 11 correct, and in general we find:

# steps	Digits right
1	4
2	11
3	33
4	100
5	301
6	903
7	2708
8	8124

Of course, on a pocket calculator you only need to do 2 steps to have an accuracy greater than the calculator can display. To make this chart I had to trick a computer into doing high-precision arithmetic; the code is here.

Granted, this approximation is really cheating, since sin is a hard function to compute, and basically being able to compute sin means you know what pi is already. Really, this is just Newton's method for computing the root of sin(x) in disguise: the Newton step is x - sin(x)/cos(x), and near pi cos(x) is very nearly -1, which turns the step into x + sin(x).
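The tripling of correct digits in the table has a short explanation: if x = pi - e for some small error e, then x + sin(x) = pi - e + sin(e), which is roughly pi - e^3/6, so each step cubes the error. Here is a minimal sketch using ordinary double-precision floats (not the high-precision code mentioned above), which run out of digits after three steps:

```python
import math

x = 3.0
errors = [abs(math.pi - x)]          # error before any steps
for step in range(1, 4):
    x = x + math.sin(x)
    errors.append(abs(math.pi - x))
    print(step, x, errors[-1])

# Compare the first step with the e^3 / 6 prediction.
predicted_first_error = errors[0] ** 3 / 6
print(predicted_first_error, errors[1])
```

After the third step the result is as close to pi as a double can represent, so further steps change nothing.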

A Clarification

2012-03-14T12:20:00.004-04:00

As there seems to be some confusion among my fellow Virtuosi, I wanted to point out that Pi day occurs on July 22nd or, in the year 4159, on January 3rd.

Today is in fact Seventh Power Day.

Pi-rithmetic

2012-03-14T11:52:00.001-04:00

Fun fact: pi squared is very close to 10. How close? Well, Wolfram Alpha tells me that it is only about 1% off.

I first realized this fact when looking at my slide rule, pictured to the left (click to embiggen), just another reason why slide rules are awesome.

It turns out I use this fact all of the time. How's that, you ask? Well, I use this fact to enable me to do very quick mental arithmetic.

It goes like this. For every number you come across in a calculation, drop all of the information save two parts, first, what's its order of magnitude, that is, how many digits does it have, and second, is it closest to 1, pi, or 10?

The first part amounts to thinking of every number you come across as it looks in scientific notation, so a number like 2342 turns into 2.342 x 10^3, so that I've captured its magnitude in a power of 10. As for the next part, the rules I usually use are:

If the remaining bit is between 1 and 2, make it 1
If it's between 2 and 6.5, make it pi
If it's bigger than 6.5, make it another 10

Another way to think of this is to estimate every number to be a power of ten, and then either 1, a few, or 10. The reason I choose pi is that I know how the rest of the arithmetic should work, since I only need to know a few rules (below). Plus, when I use this to estimate answers to physics formulae, making a bunch of pis show up tends to help me cancel other natural pis that are in the formulae.

\[ \pi \times \pi \sim10 \qquad \frac{1}{\pi} \sim \frac{\pi}{10} \qquad \sqrt{10} \sim \pi \]

Which you might notice is just the same approximation written in 3 different ways.

Let's work an example

\[ \begin{align*} 23 \times 78 / 13 \times 2133 &= ? \\ \pi \times 10 \times 100 / 10 \times \pi \times 10^3 &= ? \\ \pi^2 \times 10^5 &\sim 10^6 \\ \end{align*}\]

Of course the real answer is 294,354, so you'll notice I got the answer wrong, but I only got it wrong by a factor of 3, which is pretty good for mental arithmetic, and in particular mental arithmetic that takes no time flat.
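The rounding rules are simple enough to hand to a computer, which defeats the purpose but makes the example easy to check. In this sketch the function name `pi_round` is made up, and the handling of the exact boundaries (2 and 6.5 both go to pi) is an arbitrary choice the rules above leave open.

```python
import math


def pi_round(x):
    """Snap a positive number to 1, pi, or 10 times its power of ten."""
    exponent = math.floor(math.log10(x))
    mantissa = x / 10 ** exponent        # now 1 <= mantissa < 10
    if mantissa < 2:
        lead = 1.0
    elif mantissa <= 6.5:
        lead = math.pi
    else:
        lead = 10.0                      # "make it another 10"
    return lead * 10 ** exponent


exact = 23 * 78 / 13 * 2133
estimate = pi_round(23) * pi_round(78) / pi_round(13) * pi_round(2133)
print(exact, estimate, estimate / exact)
```

The estimate comes out as pi^2 x 10^5, a little over three times the exact answer.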

In fact, the average error I introduce by using this approximation is just 30% or so for each number, which I've shown below [the script that produced this plot for those interested is here].

So, there you go, now you can impress all of your friends with a simple mental arithmetic that gets you within a factor of 3 or so on average.

Moving Pi-ctures

2012-03-14T08:35:00.000-04:00

My TV celebrates without me.

Today, as I'm sure you're aware, is Pi Day - a day for the festive consumption of pies and quiet self-reflection. In the spirit of the holiday, I'd like to present a point for discussion:

Everyone has a great talent for at least one thing.

That this is true for at least some people is seen through even a cursory glance at a history book: George Washington was really good at leading revolutions, Michelangelo was an outstanding ceiling painter [1], and Batman was the best at solving complex riddles (especially in English, pero especialmente en español).

But I'm certain that this holds for everyone. What's your talent? Mine, as those of you who read this blog should know very well by now, is certainly not doing physics. Nope, my talent is watching TV. Seriously guys, I watch TV like a boss [2]. In light of this talent, I thought I would describe a few instances in which I have seen pi represented (for better or for worse) in TV and movies.

Over the last few months, I have been re-watching a lot of the TV show Psych with my good friend and fellow Virtuosi contributor, Matt "TT" Showbiz [3]. For the uninitiated, Psych is a detective show where the main characters (Shawn and Gus) run a (fake) psychic detective agency, which allows them to solve mysteries, engage in various shenanigans, and make an inordinate number of references to Tears for Fears frontman Curt Smith [4].

In one of the episodes, Shawn and Gus enter a room where a long train of digits is written across the top of the wall. It soon becomes evident that these are the digits of pi and the camera is sure to zoom in on the famous first few digits to reassure us.

But there are hundreds of digits written out and I have very little faith in TV prop people when it comes to background mathematical expressions. So I decided to check it out.

Pi on the Wall (click to enhance for texture)

Using a neat little pi searcher, I checked to see if (and where) this sequence appeared in pi. Turns out it's legit and (almost!) correct. The first 105 digits of pi (counting after the three) are:

3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067982148

where the 99th, 100th, and 101st digits (7, 9, and 8) are the ones to watch. Looking back at the writing on the wall, we see that the 100th digit has been duplicated.

Very Almost Pi

So close! Oh well, nobody is perfect. Even though there is an error here, I very much appreciate that whoever was doing the set design decided to use the actual digits of pi. All too often I see nonsensical equations in the background of TV shows and movies when it would take exactly the same amount of work to put real equations there. So congratulations to you, O nameless prop-making intern!, for giving an accurate (well, to a part in 10^100) value of pi.

Neat, so are there any other TV shows or movies that have pi in them? Well, there's Pi. Pi is a film by Darren Aronofsky (Requiem for a Dream, etc.) about a mathematician looking for patterns in the stock market. It's a pretty good movie with a really cool soundtrack by Clint Mansell. It also appears to display the digits of pi in the opening credits. But does it? To the Youtube-mobile!

You can watch the opening credits here if you like and here is a still image of the relevant section.

Pi?

Looks pretty cool, huh? But once we get past the slick aesthetics, we see that something doesn't seem right. This number they are showing appears at first glance to be our good friend pi, but after the 8th digit the cover is blown and we see that this is actually some impostor number!

More like Darren Aron-wrong-sky.

Now, I fully understand that this has no bearing whatsoever on the film and, in the grand scheme of things, is not a Big Deal. But it would have been just as easy to put the real digits of pi here instead of just random filler.

The only way that this could possibly be better than the real deal would be if it is actually a secret code. I have not yet ruled this out, as the movie is entirely about looking for meaning in seemingly random numbers. Unfortunately, the difficulty in transcribing the numbers from the screen greatly outweighs the very small chance that this isn't just gibberish. Four hundred Quatloos to anyone who can tell me if this is a code or not!

[1] And an above average Ninja Turtle to boot.

[2] Yes, I am putting my TV watching skills on par with the talents of George Washington. In fact, the stoic way in which I persevered through the entirety of The Sarah Connor Chronicles in under two weeks was described by historian David McCullough as "Washingtonian." These are simply facts.

[3] The extra "T" is for extra talent.

[4] A duo can absolutely have a frontman. For evidence, feel free to ask the not-George-Michael-guy from Wham! or the not-Paul-Simon-guy from Simon & Garfunkel.

A Very Small Slice of Pi

2012-03-14T08:22:00.000-04:00

Rhubarb pie (Source: Wikipedia)

Some people know a suspiciously large number of the digits of pi. Perhaps you have met one of these people. They can typically be found hiding behind bushes and under the counters at pastry shops, just... waiting.

At the slightest hint of a mention of pi, they will jump out and start reciting the digits like there's a prize at the end. After rattling off numbers for a few minutes they abruptly come to an end, grin like an idiot, and walk away. It is an unpleasant encounter.

The sheer uselessness of this kind of thing has always bothered me, so I'd like to set a preliminary upper bound on the number of digits of pi that could ever possibly potentially kind of be useful (maybe). For those following along at home, now would be a good time to put on your numerology hats.

Alright, so I hear this thing pi is fairly useful when dealing with circles. Let's say we want to make a really big circle and have its diameter only deviate by a very small amount from the correct value. To do this successfully, we will have to know pi fairly well.

Let's take this to extremes now. Suppose I want to put a circle around the entire visible universe such that the uncertainty in the diameter is the size of a single proton. What would be the fractional uncertainty in the circumference in this case?

If we know pi exactly, then we have that

\[\delta C = \frac{\partial C}{\partial d} \delta d = \pi \delta d = C \frac{\delta d}{d}, \]

where d is the diameter and C is the circumference. In other words, the fractional uncertainty in the circumference is just

\[\frac{\delta C}{C} = \frac{\delta d}{d}. \]

Using a femtometer for the size of a proton and 90 billion light years for the size of the Universe [1], we get

\[\frac{\delta C}{C} = \frac{\delta d}{d} = \frac{10^{-15}\mbox{m}}{(90\times10^9)(3\times10^7\mbox{s})(3\times10^8\mbox{m s}^{-1})} \sim\frac{10^{-15}}{10^{27}}\sim10^{-42}.\]

Alright, so how well do we need to know pi to get a similar fractional uncertainty? Well, we have that

\[\frac{\delta \pi}{\pi} = \frac{\delta C}{C} = 10^{-42}, \]

so we can afford an uncertainty in pi of

\[ \delta \pi = \pi \times 10^{-42}\]

and thus we'll need to know pi to about 42 digits. How's that for an answer?

So if we have a giant circle the size of the entire visible universe, we can find its diameter to within the size of a single proton using pi to 42 digits. Therefore, I adopt this as the maximal number of digits that could ever prove useful in a physical sense (albeit under a somewhat bizarre set of circumstances).
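The order-of-magnitude estimate can be reproduced in a few lines. The inputs are the same rounded values used above; the variable names are only for illustration.

```python
import math

proton_size_m = 1e-15          # one femtometer
universe_ly = 90e9             # diameter of the visible universe, light years
seconds_per_year = 3e7         # rounded, as in the estimate above
speed_of_light = 3e8           # meters per second

universe_m = universe_ly * seconds_per_year * speed_of_light
fractional_uncertainty = proton_size_m / universe_m
digits_needed = -math.log10(fractional_uncertainty)

print(universe_m, fractional_uncertainty, digits_needed)
```

Keeping the factors of 3 and 9 instead of rounding everything to a power of ten barely moves the answer: the number of digits still comes out just shy of 42.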

If reciting hundreds of digits is what makes you happy, go for it. But 42 digits is more than enough pi for me.

[1] "But I thought the Universe was only 13.7 billion years! What voodoo is this!?" Yeah, I know. See here for a nice explanation.

Primes in Pi

2012-03-14T03:14:00.000-04:00

Recently, I've been concerned with the fact that I don't know many large primes. Why? I don't know. This has led to a search for easy to remember prime numbers. I've found a few good ones, namely

867-5309 - Jenny's number
the digit 1 - 1031 times, in the style of the picture above, and the largest known repunit prime
1987 (my birth year), 2011 (last year), 1999 (the party year)

But then I remembered that I already know 50 digits of pi, memorized one boring day in grade school, so this got me wondering whether there were any primes among the digits of pi.

Lo and behold, I wrote a little script, and found a few:

Found one, with 1 digits, is: 3
Found one, with 2 digits, is: 31
Found one, with 6 digits, is: 314159
Found a rounded one with 12 digits, is: 314159265359
Found one, with 38 digits, is: 31415926535897932384626433832795028841
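The original script isn't reproduced here, but a search along these lines can be sketched as follows. This is not the author's code: it checks each leading chunk of the first 50 digits with a Miller-Rabin test on a fixed set of bases, which is a probable-prime test for numbers this large, and it skips the "rounded" variant. All names are hypothetical.

```python
PI_DIGITS = "31415926535897932384626433832795028841971693993751"  # first 50 digits


def is_probable_prime(n):
    """Miller-Rabin test with a fixed set of small prime bases."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # base a proves n is composite
    return True


prime_lengths = []
for length in range(1, len(PI_DIGITS) + 1):
    candidate = int(PI_DIGITS[:length])
    if is_probable_prime(candidate):
        prime_lengths.append(length)
        print(f"Found one, with {length} digits, is: {candidate}")
```

Python's arbitrary-precision integers and three-argument `pow` do all the heavy lifting, so even the 38-digit candidate is checked instantly.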

I think it's usual for most science geeks to know pi to at least 3.14159. If you're one of those people, now you know a 6-digit prime, for free!

F-91 Revisited

2012-03-12T00:57:00.000-04:00

Farmer Uncle Sam...with a rifle.
(Image Credit: Wikipedia)

Today was a sunny exception to the grey overcast rule of weather in Ithaca. I should be overjoyed by this anomaly, spending the day outside flying a kite or playing frisbee with a border collie in a bandanna.

Unfortunately, today was also the beginning of Daylight Savings Time (DST) - my least favorite day of the year. For my colleagues unfamiliar with this temporal travesty (I'm looking at you Arizona), let me briefly explain DST.

Once a year, the time lords steal a single hour from us and place it in an escrow account for future disbursement, presumably in some elaborate scheme to gain the favor of hat-throwing farmer-clock hybrids (see image left). The details are a bit murky, but the net result is that today I had one less hour to do my very favorite thing in the whole wide world - sleep.

It also means that I have to set my watch, so I figured I'd check in and see how well my previous model for time-loss in my watch has held up.

About a month ago, I looked at how my watch slowly deviated from the official time (the original post can be found here and a helpful clarification by Tom can be found here). Based on a little over 50 days worth of data, I found that my watch lost about 0.35 seconds per day against the official time. About 50 days have passed between my last measurement and today, when I set the watch, so I thought it would be interesting to see how well my model fit the new data.

The old data are presented in Figure 1 in blue, the old best fit line is in red, and the new data point (taken this morning) is in green. As always, click through the plot for a larger version.

Figure 1

The new data point appears to be in fair agreement with the old best-fit model, but it's a little hard to see here. Zooming in a bit, though, we see that the model lies outside the error bar of the new data point.

Figure 2

So is this a big deal? Not really. But if it will help you sleep at night, we can redo the fitting with the new data point included to see how much things change. The plots with the updated model to include all data points are provided below with the old data in blue and the new point in green.

Figure 3

The new model looks a whole lot like the old one, except the best fit line now appears to go through the new data point. Zooming in a little, we see that it does indeed fall within the error bars of our new point.

Figure 4

Alright, so the new model fits with our new point, but how much did the model have to change? Well, the fit to just the old data gave an offset of 0.36 seconds and a loss rate of 0.35 seconds per day. The new model has an offset of 0.40 seconds and a loss rate of 0.348 seconds per day. Overall, not a significant change.

It looks as though I may continue to not worry about the accuracy of my watch. I have set it to match the official time and have no intention of fiddling with it until I have to set it again at the end of Daylight Savings Time - my favorite day of the year.

Proofiness: A look into how mathematics relates to American political life

2012-03-06T14:37:00.000-05:00

Dearest readers,

This is my first post on The Virtuosi, so I thought I’d take a moment to introduce myself. I’m a first-year physics graduate student at Cornell, recently joined after 2 years working as an engineer first at a private firm and then at a national lab. I myself have had lots of fun following the exploits of my estimable colleagues here on The Virtuosi, and I thought I could bring a new angle to the content here. I would like to use this space to discuss how science interacts with everyday life in a cultural sense. How does science appear in popular culture? How do political or social issues relate back to science? Those sorts of questions. (I understand that there are plenty of other resources elsewhere that offer far more intelligent insight into these matters than I can, but at the very least this will give people a chance to point them out to me as they yell at me in the forum below.)

Enough intro, here begins my very first blog post:

Being interested in how science is communicated to the public, I am an avid reader of popular science. While academic types sometimes dismiss this kind of writing as shallow or otherwise uninteresting, I think science writers perform a very important function serving as a way to convey information about conceptually challenging topics to a general audience. At their best, I find that these books serve as examples for how I can communicate my own ideas better, and in addition challenge my understanding of how science relates back to society in general.

This being said, I cannot recommend Charles Seife’s Proofiness enough. The basic premise of this book is to explore the way that good mathematics is hijacked, twisted, or ignored in everyday life, and the ugly consequences of the tendency to misunderstand numbers and measurements.

Seife gives a number of fascinating examples of the ways in which numbers and math connect to American democracy. American government functions through representation, and so the “enumeration” of citizens and their opinions through the Census and elections is an essential part of the democratic process. This “enumeration” is a counting measurement, subject to errors like any other. And yet, the laws that govern how Censuses and elections are run ignore this fact. Seife’s discussion of elections (and in particular Bush v. Gore) is fascinating, but I won’t spoil that here. Here’s my take on the discussion of the Census that appears in Proofiness:

Consider a (vague) physics experiment. I want to know how many particles are inside a box. To figure this out, I have a detector that goes *ping* every time a particle passes through it. I set up my detector inside the box and count the number of times that it goes *ping* in a certain amount of time. I can then use that count to guess at the number of particles that I have in my box. My measurement will let me estimate N to within some margin of error. This process is perhaps unnecessary if I have only five particles in my box (in which case, I might just open the box and count what I see inside), but if I have 300 million particles in my box, it would be totally impractical for me to reach into the box 300 million times and count each one individually.

We can consider the Census to be just like this physics experiment. I have N inhabitants (particles) living in my country (box), and I can use my detector (census replies) to count a certain number of people. In principle, using well-understood statistical techniques of regression and error analysis, I can estimate to within a very good margin of error how many people live in each region of the country. Instead, what the Census requires is that we reach inside the box (send representatives to every household that doesn’t reply by mail) and count every single person. The whole process ignores the fact that even if we send a representative to every single household there will still be some margin of error in our counting measurements. No such measurement can be made without errors.

The consequences of ignoring these errors, says Seife, can be that we waste money in attempting the impossible and trying to count everybody. From a civic-minded perspective, this attitude towards the perfection of the Census can backfire. For example, if undercounting occurs (i.e., certain households do not respond for some reason), the Census has no mechanism for correcting that miscount. Counter-intuitively, the Census laws actually prohibit the use of any statistical techniques to correct miscounting. The result is that those slow to respond are ignored and not taken into account when allotting seats in the legislature to represent them.

Proofiness is a fascinating book and a fun read, and I recommend you all look it up. In addition, it serves as an excellent example of science writing that helped me to rethink how scientific ideas relate to everyday life. I hope to invite consideration of these topics here and in future posts.

If you want to know more about the inspiration for this post, go here.

Time Keeps On Slippin'

2012-02-12T22:00:00.000-05:00

This is a picture of a watch. (Source: Wikipedia)

A couple of months ago, the Virtuosi Action Team (VAT) assembled for lunch and the discussion quickly and unexpectedly turned to watches. As Nic and Alemi argued over the finer parts of fancy-dancy watch ownership, I looked down at my watch: the lowly Casio F-91W. Though it certainly isn't fancy, it is inexpensive, durable, and could potentially win me an all-expense paid trip to the Caribbean.

But how good of a watch is it? To find out, I decided to time it against the official U.S. time for a couple of months. Incidentally, about half-way in I found out that Chad over at Uncertain Principles had done essentially the same thing already. No matter, science is still fun even if you aren't the first person to do it. So here's my "new-to-me" analysis.

Alright, so how do we go about quantifying how "good" a watch is? Well, there seem to be two main things we can test. The first of these is accuracy. That is, how close does this watch come to the actual time (according to some time system)? If the official time is 3:00 pm and my watch claims it is 5:00 am, then it is not very accurate. The second measure of "good-ness" is precision or, in watch parlance, stability. This is essentially a measure of the consistency of the watch. If I have a watch that is consistently off by 5 minutes from the official time, then it is not accurate but it is still stable. In essence, a very consistent watch would be just as good as an accurate one, because we can always just subtract off the known offset.

To test any of the above measures of how "good" my cheap watch is, we will need to know the actual time. We will adopt the official U.S. time as provided on the NIST website. This time is determined and maintained by a collection of really impressive atomic clocks. One of these is in Colorado and the other is secretly guarded by an ever-vigilant Time Lord (see Figure 1).

Figure 1: Flavor Flav, Keeper of the Time

At 9:00:00 am EST on November 30th, I synchronized my watch with the time displayed on the NIST website. For the next 54 days, I kept track of the difference between my watch and the NIST time. On the 55th day, I forgot to check the time and the experiment promptly ended. The results are plotted below in Figure 2 (and, as with all plots, click through for a larger version).

Figure 2: Best-fit to time difference

As you can see from Figure 2, the amount of time the watch lost over the timing period appears to be fairly linear. There does appear to be a jagged-ness to the data, though. This is mainly caused by the fact that both the watch and the NIST website only report times to the nearest second. As a result, the finest time resolution I was willing to report was about half a second.

Adopting an uncertainty of half a second, I did a least-squares fit of a straight line to the data and found that the watch loses about 0.35 seconds per day.
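For anyone who wants to try a fit like this at home, here is a minimal sketch in Python. It is not the script I used, and the data in it are synthetic stand-ins (a made-up watch that loses 0.35 seconds per day, read once a day on a half-second grid), not my actual measurements. The variable names are likewise just for illustration.

```python
import numpy as np

# Synthetic stand-in data, NOT the measurements from this post:
# a watch losing 0.35 s/day, read once a day for 55 days.
days = np.arange(55)                # days since synchronization
assumed_rate = -0.35                # seconds per day (assumed for the demo)
# Snap each reading to a 0.5 s grid to mimic the coarse time resolution.
offsets = np.round(assumed_rate * days * 2) / 2

sigma = 0.5  # adopted uncertainty on each reading, in seconds

# Weighted least-squares fit of offset = rate * day + intercept.
# np.polyfit takes weights of 1/sigma (not 1/sigma**2) for Gaussian errors.
weights = np.full(days.shape, 1.0 / sigma)
rate, intercept = np.polyfit(days, offsets, deg=1, w=weights)

print(f"fitted rate: {rate:.3f} s/day, intercept: {intercept:.3f} s")
```

Since every point gets the same uncertainty here, the weights don't change the best-fit line at all. They only matter if you go on to estimate uncertainties on the fitted parameters.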

As far as accuracy goes, that's not bad! No matter what, I'll have to set my watch at least twice a year to appease the Daylight Savings Gods. The longest stretch between resetting is about 8 months. If I synchronize my watch with the NIST time to "spring forward" in March, it will only lose about

\[ t_{loss} = 8~\mbox{months} \times 30\frac{\mbox{days}}{\mbox{month}} \times 0.35 \frac{\mbox{sec}}{\mbox{day}} = 84~\mbox{sec} \]

before I have to re-synchronize to "fall back" in November. Assuming the loss rate is constant, I'll never be more than about a minute and a half off the "actual" time. That's good enough for me.

Furthermore, if the watch is consistently losing 0.35 seconds per day and I know how long ago I last synchronized, I can always correct for the offset. In this case, I can always know the official time to within a second (assuming I can add).
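In case adding is not your strong suit, the correction can be sketched in a few lines of Python. This assumes the 0.35 seconds per day loss rate stays constant, and the function name and the dates in the example are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

LOSS_RATE_S_PER_DAY = 0.35  # fitted loss rate from this post, assumed constant


def corrected_time(watch_reading: datetime, last_sync: datetime) -> datetime:
    """Estimate the official time from what the watch currently shows."""
    # Elapsed time as measured by the watch itself; the error this introduces
    # is a few parts per million, far below a second.
    days_elapsed = (watch_reading - last_sync).total_seconds() / 86400.0
    # The watch runs slow, so the official time is ahead of the reading.
    return watch_reading + timedelta(seconds=LOSS_RATE_S_PER_DAY * days_elapsed)


# Hypothetical example: synchronized 100 days before the current reading.
sync = datetime(2000, 1, 1, 9, 0, 0)
reading = sync + timedelta(days=100)
print(corrected_time(reading, sync) - reading)
```

After 100 days the correction comes to 35 seconds, which is just 100 days times 0.35 seconds per day.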

But is the watch consistent? That's a good question. The simplest means of finding the stability of the watch would be to look at the timing residuals between the data and the model. That is, we will consider how "off" each point is from our constant rate-loss model. A plot of the results is shown below in Figure 3.

Figure 3: Timing residuals

From Figure 3, we see that the data fit the model pretty well. There's a little bit of a wiggle going on there and we see some strong short-term correlations (the latter is an artifact of the fact that I could only get times to the nearest second).

To get some sense of the timing stability from the residuals, we can calculate the standard deviation, which will give us a figure for how "off" the data typically are from the model. The standard deviation of the residuals is

\[ \sigma_{res} = 0.19~\mbox{sec}. \]

A good guess at the fractional stability of the watch would then just be the standard deviation divided by the sampling interval,

\[ \frac{\sigma_{res}}{T} = 0.19~\mbox{sec} \times \frac{1}{1~\mbox{day}} \times \frac{1~\mbox{day}}{24\times3600~\mbox{sec}} \approx 2\times10^{-6}.\]

In words, this means that each "tick" of the watch is consistent with the average "tick" value to about 2 parts in a million.
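The same estimate can be sketched in Python. This is an illustration rather than my original analysis code: the function names are made up, and I'm assuming the plain (population) standard deviation of the residuals about a fitted straight line.

```python
import numpy as np

SECONDS_PER_DAY = 24 * 3600


def residual_scatter(days, offsets_s):
    """Standard deviation (in seconds) of the residuals about a fitted line."""
    days = np.asarray(days, dtype=float)
    offsets_s = np.asarray(offsets_s, dtype=float)
    rate, intercept = np.polyfit(days, offsets_s, deg=1)
    residuals = offsets_s - (rate * days + intercept)
    return np.std(residuals)


def fractional_stability(sigma_res_s, sampling_interval_s=SECONDS_PER_DAY):
    """Residual scatter divided by the sampling interval (dimensionless)."""
    return sigma_res_s / sampling_interval_s


# Plugging in the value quoted above: 0.19 s of scatter, one sample per day.
print(f"{fractional_stability(0.19):.1e}")  # 2.2e-06
```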

That's nice...but isn't there something fancier we could be doing? Well, I have been wanting to learn about Allan variance for some time now, so let's try that.

The Allan variance (refs: original paper and a review) can be used to find the fractional frequency stability of an oscillator over a wide range of time scales. Roughly speaking, the Allan variance tells us how averaging our residuals over different chunks of time affects the stability of our data. The square root of the Allan variance, essentially the "Allan standard deviation," is plotted against various averaging times for our data below in Figure 4.

Figure 4: Allan deviation of our residuals

From Figure 4, we see that as we increase the averaging time from one day to ten days, the Allan deviation decreases. That is, the averaging reduces the amount of variation in the frequency of the data, making it more stable. However, at around 10 days of averaging time it seems as though we hit a floor in how low we can go. Since the error bars get really big here, this may not be a real effect. If it is real, though, this would be indicative of some low-frequency noise in our oscillator. For those who prefer colors, this would be "red" noise.
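For the curious, here is a rough sketch of how an Allan deviation can be computed from evenly sampled timing residuals. It uses the standard overlapping estimator built from second differences of the time error, which may not be exactly the estimator behind Figure 4. The residuals in the example are synthetic white noise with a 0.19 second scatter, not my data, and the function name is my own.

```python
import numpy as np


def allan_deviation(time_error_s, tau0_s, m):
    """Overlapping Allan deviation at averaging time tau = m * tau0_s.

    time_error_s: clock time error x_i in seconds, sampled every tau0_s seconds.
    m: averaging factor (number of samples per averaging chunk).
    """
    x = np.asarray(time_error_s, dtype=float)
    if m < 1 or x.size < 2 * m + 1:
        raise ValueError("need m >= 1 and at least 2*m + 1 samples")
    tau = m * tau0_s
    # Second difference x[i+2m] - 2*x[i+m] + x[i] for every valid i.
    second_diff = x[2 * m:] - 2.0 * x[m:-m] + x[:-2 * m]
    allan_variance = np.mean(second_diff ** 2) / (2.0 * tau ** 2)
    return np.sqrt(allan_variance)


# Synthetic example: 55 daily residuals of pure white noise (assumed, not measured).
rng = np.random.default_rng(0)
residuals = rng.normal(0.0, 0.19, size=55)
for m in (1, 2, 4, 8):
    print(f"{m:2d} days: {allan_deviation(residuals, 86400.0, m):.2e}")
```

A perfectly linear drift has zero second differences, so it drops out of the Allan deviation entirely. That is why the constant 0.35 seconds per day loss doesn't count against the watch's stability.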

Since the Allan deviation gives the fractional frequency stability of the oscillator, we have that

\[\sigma_A = \frac{\delta f}{f} = \frac{\delta(1/t)}{1/t} = \frac{\delta t}{t}. \]

Looking at the plot, we see that with an averaging time of one day, the fractional time stability of the watch is

\[\frac{\delta t}{t} \approx 2\times10^{-6}, \]

which corresponds nicely to our previously calculated value. If we average over chunks that are ten days long instead, we get a fractional stability of

\[\frac{\delta t}{t} \approx 10^{-7}, \]

which would correspond to a deviation from our model of about 0.008 seconds. Not bad.

The initial question that started this whole ordeal was "How good is my watch?" and I think we can safely answer that with "as good as I'll ever need it to be." Hooray for cheap and effective electronics!