Your instinct tells you that running is the best choice, but is that really the case? Let’s analyze the problem from a mathematical point of view.
To simplify the situation, let’s imagine that your body is a rectangular cuboid: a solid that has six rectangular sides.
Imagine that there is no wind, so the rain falls vertically. You can thus define the following parameters of the problem:
d = distance to be covered
v = speed of the person
u = falling speed of the rain
ρ = rain density in liters per cubic meter
The last two parameters describe rain intensity; in particular, ρ measures how much water is present in a volume of one cubic meter of air if we imagine time has frozen.
Suppose you have to go from point A to point B, covering a distance d. Let’s try to understand how much water hits the vertical face and how much water hits the horizontal face of the rectangular parallelepiped that represents “you”.
The rain hitting the vertical face during the movement is the rain found, at the initial time, inside the volume of a prism whose base is the vertical face and whose height is the distance to be traveled.
In the following image you can see two different prisms depending on whether you move slower (upper prism) or faster (lower prism).
But pay attention! By Cavalieri’s principle the volumes of these prisms are the same, so the amount of water hitting the vertical face does not change whether we go slower or faster!
Let’s call the area of the vertical face S_v. The volume of the prism is equal to S_v · d, so the amount of water hitting the vertical face is given by this volume multiplied by ρ:

W_v = ρ · S_v · d
What happens if we consider the horizontal face? The situation is similar, in the sense that the amount of rain hitting the face is always that found inside a certain prism, but now the volume of the prism depends on the speed.
(To make the following image look clearer the distance AB has been represented smaller than before.)
The height of the prism is no longer fixed but corresponds to the distance traveled by the rain in the time spent in going from A to B. The time taken to travel the route is t = d/v. To find the distance traveled by the rain we multiply this time by the speed of the rain, u, so the prism height is:

h = u · d/v
Calling S_h the area of the horizontal face, the prism volume is S_h · u · d/v, and the amount of rain hitting the horizontal face is obtained by multiplying this volume by the rain density ρ:

W_h = ρ · S_h · u · d/v
We see that for this component, as speed increases, the quantity of water is reduced more and more (going infinitely fast will prevent even a single drop hitting the horizontal face).
Now, adding the two contributions W_v and W_h, we obtain the total quantity of water:

W = ρ · S_v · d + ρ · S_h · u · d/v
And collecting common terms, we obtain the final formula:

W = ρ · d · (S_v + S_h · u/v)
For very low speeds the amount of water hitting you is very large. For large speeds the second term inside the brackets becomes very small, but you can never go below the quantity ρ · d · S_v.
Within the brackets you find a sort of effective area, S_v + S_h · u/v, which you can change by modifying your speed.
Here is the graph of the effective area at different speeds.
Above a certain speed, going faster hardly reduces the effective area. Whether you are a normal person running at 5 m/s or a world-class sprinter running at more than 10 m/s, the amount of water hitting you changes very little.
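The final formula is easy to play with in code. Here is a minimal sketch; all the numeric parameters (distance, rain speed, rain density, body areas) are made-up values chosen only to illustrate the diminishing returns:

```python
# Total water collected, W = rho * d * (S_v + S_h * u / v), for a walk of
# d = 100 m in rain falling at u = 9 m/s. All parameter values are invented
# for illustration: rho in L/m^3, areas in m^2, speeds in m/s.
def water_collected(v, d=100.0, u=9.0, rho=1e-4, s_v=0.75, s_h=0.12):
    """Liters of water hitting you when you move at speed v (m/s)."""
    return rho * d * (s_v + s_h * u / v)

for v in (1, 2, 5, 10):
    print(f"v = {v:2d} m/s -> {1000 * water_collected(v):.1f} mL")
```

Doubling your speed from 5 to 10 m/s saves far less water than going from 1 to 2 m/s, and no speed can push the total below the floor ρ · d · S_v.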
Summing up: running is better but not much better, and running won’t change the amount of water hitting your torso and legs, but will reduce only the amount of water hitting your head and shoulders.
Real life is often much more complicated than the models we use to analyze it. This problem is a good example of how even a seemingly simple question can become complex to deal with rigorously.
For example, we could add some wind, which would provide the rain with a horizontal speed component. If the wind blows against you then increasing the speed will always lead to a decrease in the water hitting you, just like before.
If the wind blows in your favor then, depending on the ratio between the horizontal and vertical areas of your body, it may be better to run as fast as possible or to proceed at a speed equal to the horizontal speed of the rain (in this way you get wet neither on your front nor on your back; for details see the following article by David E. Bell: Walk or Run in the Rain?).
The wind is the easiest variable to add to the model; other variables are more complicated, for example:

- a possible increase in the intensity of the rain along the way
- the type of clothes you are wearing
- the danger of slipping while running
It’s interesting to notice that when we find ourselves in these situations, our instinct, unconsciously and without making precise calculations, can take into account many variables (distance to be covered, possible increase in intensity of the rain, type of clothes, danger of slipping) and finally choose the strategy that seems the most sensible.
As a last remark, remember that you often have the option to find a safe shelter in a café and wait for the rain to stop. That would give you time to catch up on your favorite blog posts too.
If you liked this post, consider sharing it or sign up to our newsletter so you never miss any of our future posts.
Conic sections are found in many real-world situations; one of them is the calculation of celestial-body orbits.
In 2017 they helped us understand where a strange asteroid named Oumuamua came from, so let’s see how.
Conic sections are the curves obtained by sectioning a cone with a plane.
To visualize ‘conics’ there is a very simple way: just project the light of a flashlight onto a wall.
The light created by the flashlight has the shape of a cone and the wall represents the plane that intersects this cone.
If we hold the flashlight perpendicular to the wall, we get a circle (top left image).
If we slightly turn the flashlight while continuing to project the whole beam onto the wall, we obtain ellipses (top right image).
When the part of the beam that is farthest from the wall is parallel to the wall, the light forms a parabola (bottom left image).
If we further turn the flashlight outward, we obtain a hyperbola (bottom right image).
A body attracted by the gravitational field of another body with a much larger mass moves along trajectories that are conic sections.
But what’s the reason why, sometimes, the orbit is an ellipse, as with planets, while other times it’s a different conic?
It’s not easy to explain this without going into technical details; to keep things simple, we can say that the type of conic depends on how much energy the body has. For example:
1. If the body does not have enough kinetic energy to escape the gravitational field of the heavier body, then it will follow a circle or an ellipse.
2. If the body has enough kinetic energy to completely escape the gravitational field of the heavier body moving further and further away, then it will follow a hyperbola.
3. If the body has a precise energy which represents the critical value between the first and second scenario above, then it will follow a parabola, moving further and further away, but with its speed tending towards zero.
Here are some examples of these three cases.
Planet motion belongs to case 1. Planets do not travel fast enough to escape the gravitational field of the Sun, so they follow elliptical orbits. The same consideration applies to artificial satellites orbiting the Earth, so they also follow circular or elliptical orbits.
Space probes performing a flyby of a planet follow hyperbolic orbits because they have enough energy to escape the planet’s gravitational field, so their path belongs to case 2.
Non-periodic comets are examples of case 3. They are comets that originate at the most extreme boundaries of the solar system, complete a single orbit around the Sun and then return just as far from where they started following a parabolic trajectory.
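The three cases can be told apart from a single snapshot of the motion, using the specific orbital energy ε = v²/2 − μ/r. Here is a small sketch; the threshold test on ε is the standard textbook criterion, while the example speeds are round numbers chosen for illustration, not real mission data:

```python
MU_SUN = 1.32712440018e20  # Sun's standard gravitational parameter, m^3/s^2

def trajectory_type(r, v, mu=MU_SUN):
    """Classify the conic from distance r (m) and speed v (m/s).

    eps < 0 -> ellipse (case 1), eps > 0 -> hyperbola (case 2),
    eps == 0 -> parabola (case 3, the borderline between the two).
    """
    eps = v**2 / 2 - mu / r
    if eps < 0:
        return "ellipse"
    if eps > 0:
        return "hyperbola"
    return "parabola"

AU = 1.496e11  # one astronomical unit, in meters
print(trajectory_type(AU, 29_780))  # Earth's orbital speed: ellipse
print(trajectory_type(AU, 50_000))  # faster than escape speed at 1 au: hyperbola
```

This is exactly the check astronomers performed for Oumuamua: its measured speed put ε firmly above zero.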
In October 2017 an asteroid with an anomalous orbit was sighted. After a month of observations, astronomers found it didn’t follow a common elliptical orbit around the Sun but it was traveling on a hyperbolic orbit.
The object was initially considered an asteroid, then in June 2018 more detailed analysis clarified it was a comet.
It was the first time anyone had observed a comet with this kind of orbit, and the conclusion was clear: the comet came from outside our solar system.
This 230-meter-long piece of rock is making an incredible journey inside our galaxy, visiting different star systems from time to time.
Having been discovered by an astronomical observatory located in the Hawaiian archipelago, the comet was named “Oumuamua”, which in the local language roughly means “first distant messenger”.
Oumuamua is now moving away from our solar system at great speed and who knows how many other planetary systems it will visit in its wanderings through our galaxy.
And here they are again. The good old conics that made us struggle at school, yet which appear every time we use a flashlight, or we see the beam of a street light against a wall. They help us understand the world around us and draw conclusions which, however bizarre, represent the logical consequence of our observations.
If you liked this post, consider sharing it or sign up to our newsletter so you never miss any of our future posts.
Subscribe me to the newsletter!
NB: Regarding the three examples of conics (planets, long-period comets, and the artificial satellite flyby), the trajectories are very close to but not exactly ellipses, parabolas, and hyperbolas. The main reason is that each body doesn’t interact only with the heavier body it is orbiting around, but also with all the other planets and celestial objects in the solar system!
What are they and what caused this renewed interest?
Dyson spheres would be alien megastructures built around a star with the aim of collecting a large amount of energy emitted by the star itself.
The first to speculate about this kind of structure was the philosopher and science fiction writer Olaf Stapledon, in a novel published in 1937.
Later the idea was popularized by the renowned physicist Freeman Dyson. In an article published in 1960 he considered this kind of structure the most natural way in which a highly advanced alien civilization could meet its energy needs.
He hypothesized that these structures could be observable thanks to a couple of clues: the dimming of the star’s visible light and the infrared radiation emitted by the heated structure itself.
In the following years, many different geometric shapes were hypothesized for this kind of structure, for example Dyson swarms, Dyson bubbles, and Dyson shells.
Our current technology is far from being able to build even small versions of these structures, but other extraterrestrial civilizations may already be capable of building something similar.
In recent years a couple of astronomical observations received significant media coverage because, among the many hypotheses made to explain some weird phenomena, the presence of a Dyson sphere was also considered.
The most interesting case is a star called KIC 8462852 or Tabby’s star, 1280 light years away. The star, not visible to the naked eye, had already been discovered in the 19th century but its strange properties had never been noticed.
Between 2009 and 2013 the Kepler spacecraft monitored the brightness of many stars, searching for evidence of exoplanets. Later these data were analyzed in a citizen-science project, and an odd variability in the brightness of Tabby’s star was noticed.
The star has small and irregular brightness oscillations on a daily scale and much bigger brightness decreases with intervals of a few years.
During these periods its brightness can drop by up to 22%. This is a very high percentage considering that Jupiter, the largest planet in the solar system, would obscure only about 1% of Tabby’s star.
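That ~1% figure is just the square of the radius ratio between planet and star. A quick back-of-the-envelope check, assuming for simplicity a Sun-sized star (Tabby’s star is actually somewhat larger, which would make the fraction even smaller):

```python
# A transiting body blocks a fraction (R_body / R_star)^2 of the star's disk.
R_JUPITER = 69_911   # km
R_SUN = 696_340      # km

depth = (R_JUPITER / R_SUN) ** 2
print(f"Jupiter-sized transit depth: {depth:.1%}")  # about 1%
```

To block 22% of the light, an opaque object would need a radius roughly half that of the star itself, far bigger than any planet.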
For this reason some have suggested that the change in brightness is due to the presence of a giant alien structure.
The SETI Institute has pointed radio antennas at the star, and no particular signal has been found, although the star is so far away that alien signals could be too weak to be detected.
Not even the infrared emission caused by the heating of the structure, as hypothesized by Dyson, has been observed.
So far, however, none of the other proposed natural explanations completely satisfies scientists either.
The most plausible cause seems to be the presence of a large cloud formed by the debris of a comet or planet, as a result of an impact with an asteroid.
The probability of this phenomenon being caused by a Dyson sphere is rather low, and we must not fall into the trap of believing something is true just because we like that kind of explanation!
Something similar happened when pulsars were observed for the first time. It was 1967, and among the various possible explanations of their highly regular pulsating electromagnetic emissions, the presence of an alien civilization was also taken into account.
Then, in a short time, it became clear that the signal was due to a natural phenomenon.
We must learn from past experiences and always remember the golden rule popularized by Carl Sagan which states that extraordinary claims require extraordinary evidence.
One day maybe we will have stronger evidence and will be able to say that we have identified an alien civilization.
At the moment we have nothing more than a strange change in a star’s brightness, which is hardly strong evidence for such an extraordinary claim.
If you liked this post, consider sharing it or signing up for the newsletter.
P.S. I’m currently working on the 2018 math and physics calendar, stay tuned!
A few days ago while I was preparing dinner, I was peeling carrots and noticed how much of the original thickness of the carrot I was peeling away.
The peeled carrots were significantly more slender than the carrots I started out with.
Suddenly I had a flash of intuition from the “math-obsessed” side of my brain: the more spherical a vegetable is, the less volume is proportionally removed when it’s peeled.
Let me explain what I mean.
We can imagine that peeling is equivalent to removing a thin layer of thickness dx from a vegetable with a surface S and volume V.
The volume peeled off can be approximated by dx · S, so the proportion of removed volume relative to the total volume is given by:

dx · S / V
If you always use the same peeler, the dx value is fixed. Let’s say that normally dx may be equal to a couple of millimeters.
So, given the peeler, the removed volume is proportional to the S/V ratio.
It’s well known that the geometric figure with the lowest S/V ratio is the sphere.
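To see the effect in numbers, here is a small sketch comparing a carrot-shaped cylinder with a sphere of the same volume. The carrot dimensions and the 2 mm peeler thickness are made-up but plausible values:

```python
import math

DX = 0.2  # peeler thickness in cm (about 2 mm)

def fraction_peeled_cylinder(r, h, dx=DX):
    """Fraction of volume removed when peeling a cylinder of radius r and height h (cm)."""
    volume = math.pi * r**2 * h
    surface = 2 * math.pi * r * h + 2 * math.pi * r**2
    return dx * surface / volume

def fraction_peeled_equal_volume_sphere(r, h, dx=DX):
    """Same volume reshaped into a sphere: the S/V ratio, and hence the waste, drops."""
    volume = math.pi * r**2 * h
    radius = (3 * volume / (4 * math.pi)) ** (1 / 3)
    surface = 4 * math.pi * radius**2
    return dx * surface / volume

# A carrot roughly 15 cm long with a 1.5 cm radius:
print(f"cylinder: {fraction_peeled_cylinder(1.5, 15):.0%} wasted")
print(f"sphere:   {fraction_peeled_equal_volume_sphere(1.5, 15):.0%} wasted")
```

With these numbers the carrot wastes roughly a third more than a spherical vegetable of the same volume would.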
This is connected to the circle’s analogous property of being the planar shape that maximizes the area for a given perimeter.
Proving this property is not as trivial as it may seem. The first to achieve results was the mathematician Jacob Steiner in 1838, and later mathematicians completed the proof.
The two main ideas behind the proof are the following:
1) If a planar figure is concave, then there is another figure with the same perimeter but with greater area.
2) A planar figure that is not fully symmetrical can be deformed into another planar shape with the same perimeter but with larger area.
As a result of 1) and 2), the planar figure that maximizes the area for a given perimeter must be convex and must have the greatest possible symmetry, and therefore it is the circle (obviously this is just a sketch of the proof).
Formally the result is known as the isoperimetric inequality: each closed curve of length L enclosing an area A satisfies:

L² ≥ 4πA

and the equality holds only for the circle.
Then I would like to suggest the following result.
Peeler corollary to the isoperimetric inequality: given two vegetables with equal volume, the one whose shape is closer to a sphere is the one that minimizes the volume wasted peeling it.
Clearly this statement is somewhat vague because I haven’t defined what “closer to a sphere” means.
Things get complicated, however, if you have to buy a whole bag of potatoes of different sizes. In this case, in fact, to minimize the waste you should evaluate which bag has, given the same weight, the smallest total surface area (obtained from the sum of all the potatoes’ surfaces).
A bag with two large, not-at-all-spherical potatoes could have a smaller total area than a bag with many tiny, perfectly spherical potatoes.
Reasoning in this way, in addition to reducing waste, will also minimize the time required to peel all those potatoes (which is roughly proportional to the surface area).
Take into account these considerations the next time you choose a sack of potatoes!
In May 1997, for the first time, a reigning world chess champion was beaten by a computer in a match under tournament conditions.
In 2016, another significant milestone was reached: a program defeated one of the best Go players in the world. This event had less media coverage than Kasparov’s defeat, but it has aroused a great deal of wonder among artificial intelligence experts.
Why did it take almost 20 years to go from the chess victory to victory in the game of Go? Will machines overtake humans in all activities, even the most complex? Can we now say that machines think?
The leading characters in the 1997 match were Garry Kasparov, then the undisputed World Chess Champion, and Deep Blue, a supercomputer designed in its hardware and software components by IBM. The match consisted of six games and, as usually happens in chess tournaments, after each game the winner gained one point or, in case of a draw, half a point was given to each player.
The previous year a similar match had been held in which, although Kasparov lost one game, he won with a score of 4-2.
The second time the computer prevailed, winning two games, drawing three and losing just one (the first) so the final score was 3.5-2.5 for Deep Blue.
How did Deep Blue choose its moves?
The computer based its analysis on an algorithm that took a board position as an input and returned as an output a value that quantified the advantage (or disadvantage) with respect to the opponent player.
This algorithm was created by IBM engineers with the help of professional chess players and took into account material advantages (as a result of piece captures) or positional advantages (as a consequence of placing a piece in a key square).
Given this algorithm, Deep Blue implemented a brute-force approach. The supercomputer calculated this function on all the possible following positions up to a certain depth and chose the move that guaranteed the best result.
Deep Blue evaluated 200 million positions per second. This enabled it to analyze a position up to a depth of 6-8 moves when the chessboard presented many pieces, or up to 20 or more moves in the presence of only a few pieces.
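The brute-force idea can be sketched as a plain depth-limited minimax search. This is not IBM’s actual code, of course: `evaluate`, `legal_moves`, and `apply_move` are hypothetical stand-ins for the engine’s real evaluation function and move generator:

```python
def minimax(position, depth, maximizing, evaluate, legal_moves, apply_move):
    """Return the best evaluation reachable from `position` within `depth` moves."""
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)
    values = (
        minimax(apply_move(position, m), depth - 1, not maximizing,
                evaluate, legal_moves, apply_move)
        for m in moves
    )
    # The side to move picks its best option; the opponent picks the worst for us.
    return max(values) if maximizing else min(values)

# Toy "game" to exercise the search: a position is an integer, each move adds
# -1, 0 or +1, and the evaluation is the integer itself.
best = minimax(0, 2, True,
               evaluate=lambda p: p,
               legal_moves=lambda p: [-1, 0, 1],
               apply_move=lambda p, m: p + m)
print(best)  # 0: the opponent cancels whatever the maximizer gains
```

Real engines add heavy pruning on top of this skeleton, but the alternation of max and min is the core of the brute-force approach.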
Since then, chess programs have become a bit smarter and instead of searching through all possible moves, they only analyze the most promising variants. That’s more similar to how humans decide what the next move is: they evaluate fewer positions, but based on experience they know how to choose the most significant sequences.
For example, in 2006 the program Deep Fritz won against the world champion Vladimir Kramnik running on a standard personal computer that allowed it to evaluate “only” eight million positions per second, a lot fewer than Deep Blue.
Now, no human is able to win a chess game against the smartest computer program.
Go is a board game with very simple rules. The players take turns placing pieces called “stones” on a square board with 19×19 intersections. If a player occupies all the intersections adjacent to an opponent’s group with his stones, the opponent’s group is captured and removed from the board.
The player who gains more points adding up the stones he captured and the number of intersections surrounded with his own stones wins the game.
Despite its simple rules (much simpler than chess), the game is extremely complex for these reasons:

1. the number of legal moves available at each turn is far larger than in chess, so a brute-force search quickly becomes unmanageable;
2. it is very hard to write a function that reliably evaluates how good a board position is.
As a consequence, the creation of a program capable of competing with the best Go players has been considered an ambitious challenge in the field of artificial intelligence.
AlphaGo was developed by Google DeepMind and, unlike Deep Blue, was programmed according to the machine learning approach. This approach consists in submitting examples to a program in such a way that it learns how to make decisions based on those examples, without being given explicit instructions.
More specifically, AlphaGo consists of two neural networks and has been trained with the submission of many professional player games for a total of 60 million moves. One of the two neural networks is used to figure out what the most promising future moves are (policy network), while the other one assigns a value to a position to represent the probability of victory for either player (value network).
After the first phase of learning based on human games, AlphaGo was trained by playing against different versions of itself, further improving its way of playing.
The result of the match against Lee Sedol, one of the strongest Go players in the world, was quite clear-cut: AlphaGo won 4 out of 5 games.
Recently a new match has been held, this time between AlphaGo and Ke Jie, considered the world’s strongest Go player. AlphaGo won this match too, with a score of 3-0.
Given these stunning results, we should now begin to ask ourselves one question:
From the point of view of the results they have achieved there is no doubt that in a certain way, machines are thinking.
Compared to humans, they think differently, that’s for sure. But planes also fly differently from birds, and we still describe what they do as flying. Why shouldn’t we say that computers think?
Machines are not yet able to perform more artistic tasks like composing music or writing coherent texts, but that’s just a matter of reaching higher levels of complexity that sooner or later will be achieved. I’m pretty sure in some years we’ll be listening to the first symphony completely composed by a computer.
For now, let’s make do with the first pop song written by a computer imitating the style of the Beatles, and then performed and mixed by humans.
Some argue that machines cannot create anything genuinely new, because whatever they do is an imitation of some human activity or something they have been taught to do.
I’d like to offer a couple of considerations to counterpose to this reasoning.
First, consider AlphaGo. The program began by trying to imitate the moves of professional players, but then also improved by playing against itself, just as if it had “studied” to find stronger-than-human strategies.
In a sense, AlphaGo’s playing style is new and in fact, according to professional players, AlphaGo occasionally chooses moves considered rather original.
Second, even human creativity doesn’t come out of nowhere. In an artist’s style you can find influences of other artists, elements linked to the place where the artist grew up, events they took part in, things they heard or read during their life. Artists basically transform personal experiences into pictures, sounds, or words.
Nothing prevents us from imagining a process whereby a very complex neural network could start imitating the works of one or more artists and then form a personal style, maybe with random elements inserted into its evolution or with the influence of something similar to “personal experiences” that could be linked to images, text, or music it takes its cue from.
The situation is quite clear to me. Machines are going to perform every human task the way we do, or better. In the coming years we’ll become more and more aware of this trend.
And yes, I find no reason to say that machines don’t think.
What’s your opinion?
At this time I am particularly interested in human points of view, but if some machine wants to post a comment or join the newsletter, it is welcome!
In the previous post we discussed a Monte Carlo method to calculate an approximate value for the area of a circular sector. The trick was counting how many randomly drawn points within the unit square fell inside the circle.
You can easily guess how this approach can be generalized to calculate areas with different shapes.
From calculus we know that an area can also be calculated using integrals:

A = ∫_a^b f(x) dx
But then when we use the Monte Carlo method to find areas, we are just calculating an approximation of an integral!
This approach is not very useful when calculating integrals in a single dimension. In this case, there are more efficient deterministic methods.
However, if you want to calculate integrals over many dimensions, deterministic methods become less efficient, while Monte Carlo methods become more useful.
What causes this strange phenomenon?
Given the degree of precision you want to achieve in the calculation, deterministic methods require a number of elementary steps that grows exponentially with the number of dimensions. On the other hand, the error you expect from an estimation done with a Monte Carlo method depends only on the total number of extractions, not on the number of dimensions.
For example, to calculate a 100-dimensional integral, deterministic approaches would require such a huge number of calculations as to be practically impossible, while Monte Carlo methods are still applicable.
But wait! In what cases is it necessary to calculate integrals with so many dimensions?
It typically happens in statistical mechanics, a branch of theoretical physics that studies systems with many degrees of freedom, for example, a system composed of many particles enclosed in a box. To calculate the value taken by macroscopic quantities such as temperature or pressure, it is necessary to perform integrals on a space with a number of dimensions proportional to the number of particles.
You can understand that the number of dimensions of these statistical mechanics integrals can be very large. In cases like these, Monte Carlo methods are used more than deterministic methods.
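A minimal sketch of the idea: the same estimator works unchanged whether the integrand lives in 1 or 100 dimensions, and its cost is driven only by the number of samples (the integrand below is a toy example chosen because its exact integral is known):

```python
import random

def mc_integrate(f, dim, n_samples=20_000, seed=0):
    """Monte Carlo estimate of the integral of f over the unit cube [0, 1]^dim."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += f([rng.random() for _ in range(dim)])
    return total / n_samples

# Mean of the coordinates over the 100-dimensional unit cube.
# The exact value of this integral is 0.5.
print(mc_integrate(lambda x: sum(x) / len(x), dim=100))
```

A deterministic grid with even just 2 points per axis would already need 2¹⁰⁰ evaluations for this integral; the Monte Carlo estimate gets close to 0.5 with a few thousand samples.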
Various algorithms are used to find the local minima of a function. Typically, these algorithms:

1. start from a point in the search domain;
2. determine a direction in which the function decreases;
3. move a small step in that direction and repeat.
Applying these steps many times, these algorithms finally reach a minimum (at that point, in step 2, no direction of decrease can be found).
However, what if we have a function with many local minima and want to find the point that globally minimizes the function?
A local search process may stop at any of the many local minima of the function. How can an algorithm understand if it has found one of the many local minima or the global minimum?
Actually, there is no way to exactly establish this. The only option is to explore different areas of the search domain to increase the likelihood of finding the global minimum among the various local minima.
Many methods have been developed to realize this idea of “exploring” the domain. Very popular are those known as genetic algorithms, which are inspired by the evolution of species.
These are Monte Carlo methods in which a starting population of points is created and subsequently evolved. An algorithm pairs couples of points to generate new ones: the coordinates of the two parent points are combined to obtain the coordinates of a new point. In this pairing process random “genetic mutations” occur, so the pairing function gives a randomized result. As different generations of points are simulated, a process of natural selection intervenes, keeping only the best points (those giving the lowest values of the function to be minimized).
In the meantime, the algorithm also keeps track of which point represents the best “individual” ever.
Continuing with this process, the points tend to move toward the local minima while exploring many areas of the optimization domain. At some point the process is stopped (usually by setting a limit on the number of generations) and the best individual is taken as the estimate of the global minimum (usually it is then used as the starting point for a local optimization algorithm that refines the result).
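A compact sketch of such a genetic algorithm, tried on the Rastrigin function (a standard multimodal test function, not from the original discussion; the population size, mutation width, and generation count are arbitrary choices):

```python
import math
import random

def rastrigin(p):
    """Classic test function with many local minima; the global minimum is 0 at the origin."""
    return 10 * len(p) + sum(x * x - 10 * math.cos(2 * math.pi * x) for x in p)

def genetic_minimize(f, dim=2, pop_size=60, generations=200,
                     bounds=(-5.12, 5.12), mutation=0.3, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=f)  # best "individual" ever seen
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            # Pairing: the child mixes its parents' coordinates...
            child = [rng.choice(pair) for pair in zip(a, b)]
            # ...plus a small random "genetic mutation" on each coordinate.
            child = [x + rng.gauss(0, mutation) for x in child]
            children.append(child)
        # Natural selection: keep only the fittest points.
        pop = sorted(pop + children, key=f)[:pop_size]
        best = min(best, pop[0], key=f)
    return best, f(best)

point, value = genetic_minimize(rastrigin)
print(point, value)
```

The fixed mutation width keeps the population hopping between basins, which is exactly the “exploration” that a pure local search lacks.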
The last kind of application concerns the generation of probability distributions that can’t be derived through analytical methods.
Example: estimate the probability distribution of damage caused by tornadoes in the United States over a period of one year.
In this type of analysis there are two sources of uncertainty: how many tornadoes will happen over one year and how much damage each tornado causes. Even if you are able to assign a probability distribution to these two logical levels, it’s not always possible to put them together to get an annual loss distribution with analytical methods.
It’s much simpler to do a Monte Carlo simulation like this:

1. draw the number of tornadoes occurring during the year from the frequency distribution;
2. for each tornado, draw the damage it causes from the damage distribution;
3. add up the damages to obtain the total annual loss.
Repeating these three steps many times you can generate a sample of annual losses you can use to estimate the probability distribution that you couldn’t derive analytically.
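Those three steps can be sketched in code. The distributions and their parameters are invented for illustration (a Poisson count of tornadoes per year, lognormal damage per tornado); since the Python standard library has no Poisson sampler, a small one is included:

```python
import math
import random

def poisson(lam, rng):
    """Sample a Poisson-distributed count (Knuth's algorithm)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def annual_loss(rng, freq=1.2, mu=1.0, sigma=1.5):
    """One simulated year: draw a tornado count, then a damage for each tornado."""
    n = poisson(freq, rng)
    return sum(rng.lognormvariate(mu, sigma) for _ in range(n))

rng = random.Random(0)
losses = sorted(annual_loss(rng) for _ in range(100_000))
print("mean annual loss:", sum(losses) / len(losses))
print("99th percentile: ", losses[99_000])
```

The sorted sample is itself the empirical annual-loss distribution: quantiles, tail probabilities, and so on can all be read directly from it.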
If you liked this post, consider sharing it or signing up for the newsletter.
At first glance, it seems the answer is no. We usually think that a mathematical algorithm is deterministic, so that if you run the calculation again with the same input you should get the same result.
Yet there is a large class of algorithms that use random numbers to calculate their results. For that kind of algorithm, every time you repeat the calculation you get a different result. But what’s the usefulness of these algorithms?
The point is that in the field of applied mathematics the goal is to solve concrete problems. In these cases, if you can’t find the theoretical solution, finding a good enough solution may be acceptable. For this reason, if the next time you run the algorithm you obtain a different but still good enough result, that’s OK (what good enough means exactly depends on the particular application).
Algorithms of this kind are called Monte Carlo methods or Monte Carlo simulations, and they may be loosely defined as all those algorithms that make use of random number generators.
Let’s see a classic example, just to understand the usefulness and the main features of these methods: the calculation of an approximation of π through a Monte Carlo simulation.
Suppose you know how to generate random numbers uniformly distributed in the interval [0, 1] (all high-level programming languages can do that).
If we draw a couple of values (x, y), both uniformly distributed in [0, 1], we can interpret (x, y) as a random point drawn inside the unit square with vertices (0, 0), (1, 0), (1, 1), (0, 1).
Generating many such points, we can check which of them are inside the circle of unit radius centered at the origin: it is sufficient to check whether the distance from the origin is less than 1.
Note: points are uniformly distributed inside the square so the probability of a random point falling inside the circle is equal to the ratio between the area of the circular sector and the area of the square.
The area of the square is 1 while the circular sector area is π/4 (one-fourth of the area of a unit circle), so the probability of a random point falling inside the circle is given by:

P = π/4
As a consequence, the ratio between the number of points falling inside the circle and the total number of points will tend toward this value:

N_inside / N_total → π/4
But then we could:

1. generate a large number of random points in the unit square;
2. count the fraction of them falling inside the circle;
3. multiply that fraction by 4;

obtaining in this way an approximation of π.
Here you have an animated gif that clarifies what’s going on.
You can easily make this simulation also in Excel, here you have an example file: MonteCarloPi.xlsx.
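The same simulation takes only a few lines of Python (a sketch equivalent to the spreadsheet, with a fixed seed so the run is reproducible):

```python
import random

def estimate_pi(n_points, seed=0):
    """Estimate pi from the fraction of random points inside the quarter circle."""
    rng = random.Random(seed)
    inside = sum(
        1 for _ in range(n_points)
        if rng.random() ** 2 + rng.random() ** 2 < 1
    )
    return 4 * inside / n_points

for n in (100, 10_000, 1_000_000):
    print(n, estimate_pi(n))
```

Running it shows the estimate wandering around π and tightening as n grows, exactly the convergence behavior discussed below.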
I have to underline that the purpose of this method is only educational, because much more efficient deterministic algorithms exist that can be used to calculate the value of π with a given precision.
Nevertheless, this basic example is very useful to understand the logic behind Monte Carlo methods: the final result is random but gradually converges to the theoretical value as the number of simulations increases.
It’s possible to demonstrate that the estimation error in a Monte Carlo simulation goes as 1/√N, where N is the number of simulations. To halve the error you have to quadruple the number of simulations.
Buffon’s needle experiment, designed in the 1700s, could be considered the oldest Monte Carlo simulation.
Suppose you have a sheet on which parallel, equidistant lines are drawn. You also have many sticks that you randomly throw onto the sheet. Each stick can fall between two lines or intersect one or more of them (if it is long enough).
If the stick is shorter than the distance between the lines, you can demonstrate that the probability of a stick intersecting a line is given by:

P = 2L / (π · d)

where L is the stick length and d is the distance between the parallel lines. Since the formula contains π, this experiment can be used to calculate approximations of π in a way similar to the preceding one.
It’s just a matter of throwing the sticks many times to estimate the probability P. The values of L and d are known, so you can invert the formula and obtain an estimate of π.
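Instead of throwing physical sticks, the whole experiment can be simulated. A minimal Python sketch (stick length and line spacing are arbitrary choices of mine, with L ≤ d):

```python
import math
import random

def buffon_pi(n_throws, stick_len=0.8, line_gap=1.0, seed=0):
    """Estimate pi via Buffon's needle: count line crossings and
    invert P = 2 * L / (pi * d), valid when L <= d."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_throws):
        # A throw is described by the distance from the stick's center
        # to the nearest line, and the angle it forms with the lines.
        center = rng.uniform(0, line_gap / 2)
        angle = rng.uniform(0, math.pi / 2)
        # The stick crosses the nearest line when its projected
        # half-length reaches past that line.
        if center <= (stick_len / 2) * math.sin(angle):
            crossings += 1
    p = crossings / n_throws  # estimated crossing probability
    return 2 * stick_len / (p * line_gap)

print(buffon_pi(100_000))  # close to 3.14
```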
Experiments similar to this one of Buffon’s needle had no real application, because to obtain results with an acceptable degree of approximation, they required an unreasonably large number of repetitions.
However, around the middle of the last century, the advent of computers drastically changed the scenario. The new technology made it possible to run simulations with sufficient speed to solve practical problems that could not be solved in other ways.
The pioneers of Monte Carlo methods were Stanislaw Ulam and John von Neumann. In 1946 they were both working on the Manhattan Project to build the atomic bomb, and they used Monte Carlo methods to carry out calculations related to neutron absorption that they couldn’t solve with more conventional approaches.
Ulam had the idea while he was recovering from an illness. Playing solitaire, he asked himself what the chance was of successfully finishing a game. The rules of solitaire made this probability difficult to calculate.
He realized it was possible to simulate many different games with a computer, using random deck arrangements, and check how many of them could be finished. In this way the probability of winning could be calculated empirically.
Because the results were part of the secret plans for building the atomic bomb, it was necessary to assign a code name to the project. Because chance played a fundamental role in the estimation method, the codename Monte Carlo was chosen, and this is why even today we use this terminology.
Since then, these methods have been used in many fields: weather forecasting, elementary particle physics, astrophysics, molecular chemistry, electronics, fluid dynamics, biology, computer graphics, artificial intelligence, finance, project evaluation… and many others!
Despite their many applications, all Monte Carlo methods fall roughly into three families: numerical integration, optimization, and generation of probability distributions.
In the next post we will see an example for each of these families of Monte Carlo methods.
Stay tuned!
Subsequent investigations revealed that it would be too simplistic to attribute the failure of the mission only to this problem. The biggest problem was not the unit mismatch itself, but the failure to detect and correct this mistake. This failure was caused by some imprudent choices in the way the mission was managed.
The Mars Climate Orbiter was launched on 11 December 1998 from Cape Canaveral, Florida. Along with the Mars Polar Lander, the probe was part of a project to study Martian meteorology and climate.
In particular, the Mars Climate Orbiter was designed to monitor the evolution of daily weather conditions, to study the distribution of water both on the ground and in the atmosphere, and to measure the temperature of the atmosphere.
On 23 September 1999 the probe began its final maneuvers to enter Mars orbit.
The probe had to pass behind the planet, meaning that a temporary loss of radio signal was expected. However, the radio contact was lost 49 seconds earlier than expected and was never restored.
The subsequent investigation clarified that the spacecraft was much closer to the planet than planned, so close as to be destroyed by friction with the Martian atmosphere.
Why was the probe so close to Mars?
One part of the trajectory control software, developed by Lockheed Martin, produced numerical output in English units. These results were then sent to another part of the software, developed by NASA, that interpreted them as if they were expressed in International System (SI) units.
More precisely, the first piece of software passed an impulse expressed in pound-force seconds, while the second expected it in newton seconds.
The result was that instead of being 226 km from the planet, the probe was only 57 km from the Martian surface.
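To get a feeling for the size of the error, here is a minimal Python sketch of the unit mismatch (the 10 lbf·s impulse is a hypothetical value of mine; only the conversion factor is real):

```python
LBF_S_TO_N_S = 4.448222  # 1 pound-force second expressed in newton seconds

def to_newton_seconds(impulse_lbf_s):
    """Convert an impulse from pound-force seconds to newton seconds."""
    return impulse_lbf_s * LBF_S_TO_N_S

# A 10 lbf*s burn read as 10 N*s underestimates the real impulse
# by a factor of about 4.45 -- on every single maneuver.
misread_ratio = to_newton_seconds(10.0) / 10.0
print(misread_ratio)  # 4.448222
```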
In addition to the unit mismatch, there were many other secondary factors that led to the disaster.
Just a few months earlier, in April, a bug had been fixed in the trajectory management software. At that time the need to use the new code in the mission was urgent, so there wasn’t enough time to thoroughly test the changes.
Some members of the navigation team noticed signals indicating that the trajectory could be wrong. Although they discussed the discrepancy in meetings, they failed to report it following the available formal process.
The navigation team was following three different missions at the same time and due to budget cuts, the team was not adequately trained.
On the other hand, project managers required engineers to prove something was going wrong while, given the uncertainties in the trajectory, it was not even possible to prove that all was going right.
Due to uncertainties concerning the probe’s position, the team even considered the possibility of a trajectory correction. It seems that project managers decided to forgo the correction trusting in the more optimistic estimates. Nevertheless, it was not really clear who should decide to perform the correction.
This case should be studied by project managers, who need to understand how things can go terribly wrong when they’re dealing with big, challenging projects.
In particular there are 6 important lessons to learn:
For a deeper analysis of the Mars Climate Orbiter mission failure, take a look at this really well-written account by James Oberg for the IEEE Spectrum magazine.
If you liked this post, consider sharing it or signing up for the newsletter!
Let’s see what they are and how they’re used in applications.
A wavelet is a real function which represents a wavelike oscillation localized in a limited range of its domain.
Here are some examples:
Given a mother wavelet ψ, we can define a set of child wavelets through two parameters a and b:

ψ_{a,b}(t) = (1/√a) · ψ((t − b)/a), with a > 0

Parameter a scales the function while b shifts it. In applications it is common to consider a discrete set of pairs (a, b), so you can index the child functions as ψ_{j,k} with discrete parameters j, k ∈ ℤ.
The general idea behind wavelets is that a function f can be represented as a linear combination of child wavelets:

f(t) = Σ_{j,k} c_{j,k} · ψ_{j,k}(t)
The function could be, for example, the sound of a musical instrument or the signal of a seismograph or electrocardiogram.
At first, the signal is recorded by sampling it at a certain frequency: in practice, for each sampling interval the value of the function is recorded. If the sampling frequency is high, a signal stored in this way can occupy a lot of memory.
Through wavelets it is possible to store the signal using only the values of the main coefficients of the wavelet expansion.
The truncation of the wavelet series results in a small loss of precision in representing the function, but also in a huge saving in the amount of information to be stored: this is what we call compression.
In the JPEG-2000 and MPEG-4 standards, images and videos are represented through a wavelet expansion. In addition to data compression, the main advantage of using wavelets in this field is to manage different resolutions of the image with a single file.
Once an image is saved as a wavelet expansion, if you want to create a low-resolution preview of the same image, it is sufficient to use fewer elements of the summation.
Different image resolutions are obtained by simply truncating the wavelet series at different depths.
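As a minimal sketch of this idea in Python, here is the Haar wavelet, the simplest one (JPEG 2000 actually uses more sophisticated wavelets; function names are mine). Compression keeps only the largest coefficients and discards the rest:

```python
import math

def haar_forward(signal):
    """Full Haar wavelet decomposition of a signal whose length is a
    power of two: returns the list of wavelet coefficients."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        averages = [(coeffs[2*i] + coeffs[2*i+1]) / math.sqrt(2) for i in range(half)]
        details  = [(coeffs[2*i] - coeffs[2*i+1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = averages + details
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the original signal."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        averages, details = out[:n], out[n:2*n]
        restored = []
        for a, d in zip(averages, details):
            restored += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
        out[:2*n] = restored
        n *= 2
    return out

def compress(signal, keep):
    """Truncate the expansion: keep only the `keep` largest-magnitude
    coefficients, zero the others, and reconstruct."""
    coeffs = haar_forward(signal)
    threshold = sorted((abs(c) for c in coeffs), reverse=True)[keep - 1]
    truncated = [c if abs(c) >= threshold else 0.0 for c in coeffs]
    return haar_inverse(truncated)
```

Calling `compress` with smaller and smaller `keep` values is exactly the "truncating at different depths" described above: coarser approximations from the same set of coefficients.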
Experienced readers will have noticed the similarity between the wavelet decomposition and discrete Fourier transforms.
The Fourier transforms have many properties that make them interesting from a theoretical point of view. However, wavelets have some significant advantages in applied mathematics.
1) Personalization: Fourier transforms always make use of sine and cosine functions. On the other hand, depending on the particular application, you can choose the wavelets that better adapt to deal with that problem.
2) Localization: signals that are analyzed in applications often consist of several blocks of information separated by intervals of near-zero signal (for example, in the case of the electrocardiogram). As a consequence, it’s more natural to decompose this kind of signal through wavelets that represent localized waves.
3) More control over Gibbs phenomenon: Fourier transforms present some problems in describing discontinuous signals. I’m referring to the so-called Gibbs phenomenon.
The classic example is that of a square wave that alternately takes the values 0 and 1. The discrete Fourier expansion of this signal presents a peak near the discontinuity with a value of about 1.09.
The left image shows the approximation using 25 harmonics, while the right image uses 125 harmonics. The height of the peak remains stable even as the number of terms in the Fourier series increases!
This is somewhat counterintuitive, because you would expect the series to converge to the function and so to the value 1.
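You can check the size of the overshoot numerically. A minimal Python sketch of the partial Fourier sum of the 0/1 square wave (the sampling choices are mine):

```python
import math

def square_wave_partial_sum(x, harmonics):
    """Partial Fourier sum of a square wave alternating between 0 and 1:
    f(x) = 1/2 + (2/pi) * sum over odd n of sin(n*x)/n."""
    s = 0.5
    for k in range(harmonics):
        n = 2 * k + 1
        s += (2 / math.pi) * math.sin(n * x) / n
    return s

def peak_height(harmonics, samples=2000):
    """Maximum of the partial sum just after the jump at x = 0."""
    xs = (i * (math.pi / 2) / samples for i in range(1, samples + 1))
    return max(square_wave_partial_sum(x, harmonics) for x in xs)

# The overshoot stays near 1.09 no matter how many harmonics we add.
print(peak_height(25), peak_height(125))
```

The peak narrows as harmonics are added, but its height does not shrink: that is the Gibbs phenomenon.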
The wavelet expansion also exhibits this kind of phenomenon, but to a lesser extent than the discrete Fourier transform.
Another difference between wavelets and the Fourier transform is geometric. The sine and cosine functions used in Fourier series form a basis of the space of square-integrable functions L²(ℝ).
This means they are linearly independent vectors that span the whole space of functions.
Often, wavelets used in applications are frames rather than bases. A frame is a set of vectors that span the vector space but that are not linearly independent.
As a consequence, the decomposition of a vector in terms of wavelets is not unique. This feature, which might seem a problem, is instead a further computational advantage, contributing to the improved numerical stability of wavelets with respect to the Fourier transform.
In 1915 Einstein published the field equations of general relativity. These formulas create a link between the presence of matter and energy and the curvature of space-time. Gravitation was then explained as a consequence of the space-time curvature caused by matter and energy.
In 1917, Einstein applied these equations in a physical model for the entire universe and realized that it was not possible to have a static universe within that model. The universe should expand or contract, but it couldn’t stay still.
At that time the idea that the universe could evolve was considered so bizarre that Einstein introduced a new term, called the cosmological constant, into the field equations, just to make a static universe a feasible solution.
In 1929 Edwin Hubble made one of the most sensational discoveries of the century. He found that the galaxies beyond our local group were moving away from us, and that they were receding at a speed proportional to their distance. This meant that our universe is expanding.
The cosmological constant was introduced just to fit a static universe in the theory. With the discovery of the universe’s expansion the constant no longer seemed to be a necessary hypothesis.
As a consequence, from the early 30’s almost all research in the field of cosmology hypothesized that the cosmological constant was equal to zero.
Einstein realized that, starting from the equations, he could have predicted the universe’s expansion before it was experimentally discovered, and he called the introduction of the cosmological constant his biggest blunder.
During the 1990s, many cosmological observations began to suggest that the expansion of the universe is accelerating. In particular, in 1998 two groups of cosmologists, the Supernova Cosmology Project and the High-Z Supernova Search Team, independently came to this conclusion by observing the redshift of supernovae.
The discovery was a huge breakthrough because most cosmologists expected to find that the expansion was decelerating.
That was one of those fascinating moments in physics history when everybody expects A, and B just happens, making it clear that something deep in our theory needs to be better understood.
This acceleration was not compatible with zero cosmological constant models. After more than 60 years, scientists began again to consider the presence of this term in the equations of general relativity.
The reasons for the constant’s comeback were completely different from the ones that led Einstein to introduce it, but finally the constant regained its position in the equations.
Is everything clear now? Not at all. The interpretation of the cosmological constant is still one of the biggest mysteries in physics.
In the field equations of general relativity you can identify two parts: the physical terms, which describe the distribution of matter and energy, and the geometrical terms, related to the curvature of space-time.
It’s not clear whether the cosmological constant should be considered an element of the geometrical part, a term of the energy/matter part generated by some physical process not yet identified, or even the result of the sum of both of these components.
By now there are several hypotheses but no certainty. So young physicists, come forward! This is a problem still waiting for someone to explain it!
For more mathematical details on the cosmological constant, take a look at this nice post by Peter Coles: One Hundred Years of the Cosmological Constant.