Computing Integrals
Abstract
We now turn our attention to solving mathematical problems through computer programming. There are many reasons to choose integration as our first application. Integration is well known already from high school mathematics. Most integrals are not tractable by pen and paper, and a computerized solution approach is both much simpler and much more powerful: you can essentially treat all integrals \(\int_{a}^{b}f(x)dx\) in 10 lines of computer code (!). Integration also demonstrates the difference between exact mathematics by pen and paper and numerical mathematics on a computer. The latter approaches the result of the former without any worries about rounding errors due to finite precision arithmetic in computers (in contrast to differentiation, where such errors prevent us from getting a result as accurate as we desire on the computer). Finally, integration is thought of as a somewhat difficult mathematical concept to grasp, and programming integration should greatly help with the understanding of what integration is and how it works. Not only shall we understand how to use the computer to integrate, but we shall also learn a series of good habits to ensure that our computer work is of the highest scientific quality. In particular, we have a strong focus on how to write Matlab code that is free of programming mistakes.
The method (3.1) provides an exact or analytical value of the integral. If we relax the requirement of the integral being exact, and instead look for approximate values, produced by numerical methods, integration becomes a very straightforward task for any given \(f(x)\) (!).
The downside of a numerical method is that it can only find an approximate answer. Leaving the exact for the approximate is a mental barrier in the beginning, but remember that most real applications of integration will involve an \(f(x)\) function that contains physical parameters, which are measured with some error. That is, \(f(x)\) is very seldom exact, and then it does not make sense to compute the integral with a smaller error than the one already present in \(f(x)\).
Another advantage of numerical methods is that we can easily integrate a function \(f(x)\) that is only known as samples, i.e., discrete values at some x points, and not as a continuous function of x expressed through a formula. This is highly relevant when f is measured in a physical experiment.
3.1 Basic Ideas of Numerical Integration
Proceeding from (3.5), the different integration methods will differ in the way they approximate each integral on the right-hand side. The fundamental idea is that each term is an integral over a small interval \([x_{i},x_{i+1}]\), and over this small interval, it makes sense to approximate f by a simple shape, say a constant, a straight line, or a parabola, which we can easily integrate by hand. The details will become clear in the coming examples.
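Equation (3.5) is not reproduced in this section. Presumably it is the standard splitting of the integral over mesh points \(a=x_{0}<x_{1}<\cdots<x_{n}=b\); under that assumption it reads

\[
\int_{a}^{b} f(x)\,dx = \sum_{i=0}^{n-1}\int_{x_{i}}^{x_{i+1}} f(x)\,dx .
\]

As one illustration of the idea, replacing f on \([x_{i},x_{i+1}]\) by the straight line through its two end-point values turns each term into the area of a trapezoid:

\[
\int_{x_{i}}^{x_{i+1}} f(x)\,dx \approx \frac{x_{i+1}-x_{i}}{2}\left(f(x_{i})+f(x_{i+1})\right).
\]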
Computational example
3.2 The Composite Trapezoidal Rule
3.2.1 The General Formula
Composite integration rules
The word composite is often used when a numerical integration method is applied with more than one subinterval. Strictly speaking then, writing, e.g., ‘‘the trapezoidal method’’, should imply the use of only a single trapezoid, while ‘‘the composite trapezoidal method’’ is the most correct name when several trapezoids are used. However, this naming convention is not always followed, so saying just ‘‘the trapezoidal method’’ may point to a single trapezoid as well as the composite rule with many trapezoids.
3.2.2 Implementation
Specific or general implementation?
Suppose our primary goal was to compute the specific integral \(\int_{0}^{1}v(t)dt\) with \(v(t)=3t^{2}e^{t^{3}}\). First we played around with a simple hand calculation to see what the method was about, before we (as one often does in mathematics) developed a general formula (3.17) for the general or ‘‘abstract’’ integral \(\int_{a}^{b}f(x)dx\). To solve our specific problem \(\int_{0}^{1}v(t)dt\) we must then apply the general formula (3.17) to the given data (function and integral limits) in our problem. Although simple in principle, the practical steps are confusing for many because the notation in the abstract problem in (3.17) differs from the notation in our special problem. Clearly, the f, x, and h in (3.17) correspond to v, t, and perhaps \(\Delta t\) for the trapezoid width in our special problem.
The programmer’s dilemma
1. Should we write a special program for the special integral, using the ideas from the general rule (3.17), but replacing f by v, x by t, and h by \(\Delta t\)?
2. Should we implement the general method (3.17) as it stands in a general function trapezoidal(f, a, b, n) and solve the specific problem at hand by a specialized call to this function?
The first alternative in the box above sounds less abstract and therefore more attractive to many. Nevertheless, as we hope will be evident from the examples, the second alternative is actually the simplest and most reliable from both a mathematical and programming point of view. These authors will claim that the second alternative is the essence of the power of mathematics, while the first alternative is the source of much confusion about mathematics!
Implementation with functions
For the integral \(\int_{a}^{b}f(x)dx\) computed by the formula (3.17) we want the corresponding Matlab function trapezoidal to take any f, a, b, and n as input and return the approximation to the integral.
We write a Matlab function trapezoidal in a file trapezoidal.m as close as possible to the formula (3.17), making sure variable names correspond to the mathematical notation:
This function must be placed in a file trapezoidal.m to be reused in other programs and in interactive sessions.
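The listing itself is missing from this section. A minimal sketch of what trapezoidal.m could contain is given here, assuming (3.17) is the usual composite formula \(h\left[\frac{1}{2}f(x_{0})+\sum_{i=1}^{n-1}f(x_{i})+\frac{1}{2}f(x_{n})\right]\) with \(h=(b-a)/n\) and \(x_{i}=a+ih\); the variable names result and integral are illustrative choices:

```matlab
function integral = trapezoidal(f, a, b, n)
    % Composite trapezoidal rule for the integral of f from a to b,
    % using n subintervals (trapezoids) of equal width h.
    h = (b - a)/n;
    result = 0.5*f(a) + 0.5*f(b);      % end points have weight 1/2
    for i = 1:(n-1)
        result = result + f(a + i*h);  % interior points have weight 1
    end
    integral = h*result;
end
```

With this file on the path, a call such as trapezoidal(@(t) 3*t^2*exp(t^3), 0, 1, 4) should return approximately 1.9227, to be compared with the exact value \(e-1\approx 1.718\).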
Solving our specific problem in a session
An interactive session can make use of the trapezoidal function in trapezoidal.m to solve our particular problem \(\int_{0}^{1}v(t)dt\):
Let us compute the exact expression and the error in the approximation:
Is this error convincing? We can try a larger n:
Fortunately, many more trapezoids give a much smaller error.
Solving our specific problem in a program
Instead of computing our special problem in an interactive session, we can do it in a program. As always, a chunk of code doing a particular thing is best isolated as a function even if we do not see any future reason to call the function several times and even if we have no need for arguments to parameterize what goes on inside the function. In the present case, we just put the statements we otherwise would have put in a main program, inside a function:
Now we compute our special problem by calling application() as the only statement in the main program. The application function and its call are in the file trapezoidal_app.m, which can be run by typing trapezoidal_app at the Matlab prompt.
3.2.3 Alternative Flat Special-Purpose Implementation
Let us illustrate the implementation implied by alternative 1 in the Programmer’s dilemma box in Sect. 3.2.2. That is, we make a special-purpose code where we adapt the general formula (3.17) to the specific problem \(\int_{0}^{1}3t^{2}e^{t^{3}}dt\).
Basically, we use a for loop to compute the sum. Each term with \(f(x)\) in the formula (3.17) is replaced by \(3t^{2}e^{t^{3}}\), x by t, and h by \(\Delta t\). A first try at writing a plain, flat program doing the special calculation is
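The flat program is not reproduced in this section. A sketch of what such a special-purpose script might look like follows; the variable names (dt, numerical, exact_value, err) are illustrative assumptions, and the integrand is deliberately written out in every term:

```matlab
% Flat, special-purpose computation of the integral of
% 3*t^2*exp(t^3) from 0 to 1 with the trapezoidal method.
a = 0;  b = 1;
n = 4;                     % number of trapezoids
dt = (b - a)/n;            % trapezoid width

% End points (weight 1/2), then interior points (weight 1)
numerical = 0.5*3*(a^2)*exp(a^3) + 0.5*3*(b^2)*exp(b^3);
for i = 1:(n-1)
    numerical = numerical + 3*((a + i*dt)^2)*exp((a + i*dt)^3);
end
numerical = numerical*dt;

% Exact value from the antiderivative V(t) = exp(t^3)
exact_value = exp(b^3) - exp(a^3);
err = abs(exact_value - numerical);
rel_err = 100*err/exact_value;
fprintf('n=%d: %.16f, error: %g (%.1f %%)\n', n, numerical, err, rel_err);
```

Note how the formula for the integrand appears three times; changing the problem means editing every one of them.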
1. We need to reformulate (3.17) for our special problem with a different notation.
2. The integrand \(3t^{2}e^{t^{3}}\) is inserted many times in the code, which quickly leads to errors.
3. A lot of edits are necessary to use the code to compute a different integral – these edits are likely to introduce errors.
Unfortunately, the two other problems remain and they are fundamental.

the formula for v must be replaced by a new formula

the limits a and b

the antiderivative V is not easily known and can be omitted, and therefore we cannot write out the error

the notation should be changed to be aligned with the new problem, i.e., t and dt changed to x and h
With the previous code in trapezoidal.m, we can compute the new integral \(\int_{1}^{1.1}e^{x^{2}}dx\) without touching the mathematical algorithm. In an interactive session (or in a program) we can just do
When you now look back at the two solutions, the flat special-purpose program and the function-based program with the general-purpose function trapezoidal, you hopefully realize that implementing a general mathematical algorithm in a general function requires somewhat more abstract thinking, but the resulting code can be used over and over again. Essentially, if you apply the flat special-purpose style, you have to retest the implementation of the algorithm after every change of the program.
The present integral problems result in short code. In more challenging engineering problems the code quickly grows to hundreds and thousands of lines. Without abstractions in terms of general algorithms in general reusable functions, the complexity of the program grows so fast that it will be extremely difficult to make sure that the program works properly.
Another advantage of packaging mathematical algorithms in functions is that a function can be reused by anyone to solve a problem by just calling the function with a proper set of arguments. Understanding the function’s inner details is not necessary to compute a new integral. Similarly, you can find libraries of functions on the Internet and use these functions to solve your problems without specific knowledge of every mathematical detail in the functions.
This desirable feature has its downside, of course: the user of a function may misuse it, and the function may contain programming errors and lead to wrong answers. Testing the output of downloaded functions is therefore extremely important before relying on the results.
3.3 The Composite Midpoint Method
The idea
Rather than approximating the area under a curve by trapezoids, we can use plain rectangles. It may sound less accurate to use horizontal lines and not skew lines following the function to be integrated, but an integration method based on rectangles (the midpoint method) is in fact slightly more accurate than the one based on trapezoids!
With \(f(t)=3t^{2}e^{t^{3}}\), the approximation becomes 1.632. Compared with the true answer (1.718), this is about 5 % too small, but it is better than what we got with the trapezoidal method (10 %) with the same subintervals. More rectangles give a better approximation.
3.3.1 The General Formula
3.3.2 Implementation
We follow the advice and lessons learned from the implementation of the trapezoidal method and make a function midpoint(f, a, b, n) (in a file midpoint.m) for implementing the general formula (3.21):
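The listing is missing here. A minimal sketch of midpoint.m is shown below, assuming (3.21) is the usual composite midpoint formula \(h\sum_{i=0}^{n-1}f(x_{i})\) with \(h=(b-a)/n\) and \(x_{i}=a+\frac{h}{2}+ih\):

```matlab
function result = midpoint(f, a, b, n)
    % Composite midpoint rule for the integral of f from a to b,
    % using n rectangles of equal width h.
    h = (b - a)/n;
    result = 0;
    for i = 0:(n-1)
        % f is sampled at the midpoint of subinterval number i
        result = result + f((a + h/2) + i*h);
    end
    result = h*result;
end
```

For example, midpoint(@(t) 3*t^2*exp(t^3), 0, 1, 4) should give roughly 1.619.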
We can test the function as we explained for the similar trapezoidal method. The error in our particular problem \(\int_{0}^{1}3t^{2}e^{t^{3}}dt\) with four intervals is now about 0.1 in contrast to 0.2 for the trapezoidal rule. This is in fact not accidental: one can show mathematically that the error of the midpoint method is a bit smaller than for the trapezoidal method. The differences are seldom of any practical importance, and on a laptop we can easily use \(n=10^{6}\) and get the answer with an error of about \(10^{-12}\) in a couple of seconds.
3.3.3 Comparing the Trapezoidal and the Midpoint Methods
The next example shows how easily we can combine the trapezoidal and midpoint functions to make a comparison of the two methods in the file compare_integration_methods.m:
Note the efforts put into nice formatting – the output becomes
A visual inspection of the numbers shows how fast the digits stabilize in both methods. It appears that 13 digits have stabilized in the last two rows.
Remark
3.4 Testing
3.4.1 Problems with Brief Testing Procedures
Testing of the programs for numerical integration has so far employed two strategies. If we have an exact answer, we compute the error and see that increasing n decreases the error. When the exact answer is not available, we can (as in the comparison example in the previous section) look at the integral values and see that they stabilize as n grows. Unfortunately, these are very weak test procedures and not at all satisfactory for claiming that the software we have produced is correctly implemented.
To see this, we can introduce a bug in the application function that calls trapezoidal: instead of integrating \(3t^{2}e^{t^{3}}\), we write “accidentally” \(3t^{3}e^{t^{3}}\), but keep the same antiderivative \(V(t)=e^{t^{3}}\) for computing the error. With the bug and n = 4, the error is 0.1, but without the bug the error is 0.2! It is of course completely impossible to tell if 0.1 is the right value of the error. Fortunately, increasing n shows that the error stays about 0.3 in the program with the bug, so the test procedure with increasing n and checking that the error decreases points to a problem in the code.
Let us look at another bug, this time in the mathematical algorithm: instead of computing \(\frac{1}{2}(f(a)+f(b))\) as we should, we forget the second \(\frac{1}{2}\) and write 0.5*f(a) + f(b). The error for n = 4,40,400 when computing \(\int_{1.1}^{1.9}3t^{2}e^{t^{3}}dt\) goes like 1400, 107, 10, respectively, which looks promising. The problem is that the right errors should be 369, 4.08, and 0.04. That is, the error should be reduced faster in the correct than in the buggy code. The problem, however, is that it is reduced in both codes, and we may stop further testing and believe everything is correctly implemented.
Unit testing
A good habit is to test small pieces of a larger code individually, one at a time. This is known as unit testing. One identifies a (small) unit of the code, and then one makes a separate test for this unit. The unit test should be standalone in the sense that it can be run without the outcome of other tests. Typically, one algorithm in scientific programs is considered as a unit. The challenge with unit tests in numerical computing is to deal with numerical approximation errors. A fortunate side effect of unit testing is that the programmer is forced to use functions to modularize the code into smaller, logical pieces.
3.4.2 Proper Test Procedures
1. Comparing with hand-computed results in a problem with few arithmetic operations, i.e., small n.
2. Solving a problem without numerical errors. We know that the trapezoidal rule must be exact for linear functions. The error produced by the program must then be zero (to machine precision).
3. Demonstrating correct convergence rates. A strong test when we can compute exact errors, is to see how fast the error goes to zero as n grows. In the trapezoidal and midpoint rules it is known that the error depends on n as \(n^{-2}\) as \(n\rightarrow\infty\).
Hand-computed results
Solving a problem without numerical errors
The best unit tests for numerical algorithms involve mathematical problems where we know the numerical result beforehand. Usually, numerical results contain unknown approximation errors, so knowing the numerical result implies that we have a problem where the approximation errors vanish. This feature may be present in very simple mathematical problems. For example, the trapezoidal method is exact for integration of linear functions \(f(x)=ax+b\). We can therefore pick some linear function and construct a test function that checks equality between the exact analytical expression for the integral and the number computed by the implementation of the trapezoidal method.
A specific test case can be \(\int_{1.2}^{4.4}(6x-4)dx\). This integral involves an “arbitrary” interval \([1.2,4.4]\) and an “arbitrary” linear function \(f(x)=6x-4\). By “arbitrary” we mean expressions where we avoid the special numbers 0 and 1 since these have special properties in arithmetic operations (e.g., forgetting to multiply is equivalent to multiplying by 1, and forgetting to add is equivalent to adding 0).
Demonstrating correct convergence rates
Normally, unit tests must be based on problems where the numerical approximation errors in our implementation remain unknown. However, we often know or may assume a certain asymptotic behavior of the error. We can do some experimental runs with the test problem \(\int_{0}^{1}3t^{2}e^{t^{3}}dt\) where n is doubled in each run: n = 4, 8, 16. The corresponding errors are then 12 %, 3 % and 0.77 %, respectively. These numbers indicate that the error is roughly reduced by a factor of 4 when doubling n. Thus, the error converges to zero as \(n^{-2}\) and we say that the convergence rate is 2. In fact, this result can also be shown mathematically for the trapezoidal and midpoint method. Numerical integration methods usually have an error that converges to zero as \(n^{-p}\) for some p that depends on the method. With such a result, it does not matter if we do not know what the actual approximation error is: we know at what rate it is reduced, so running the implementation for two or more different n values will put us in a position to measure the expected rate and see if it is achieved.
The idea of a corresponding unit test is then to run the algorithm for some n values, compute the error (the absolute value of the difference between the exact analytical result and the one produced by the numerical method), and check that the error has approximately correct asymptotic behavior, i.e., that the error is proportional to \(n^{-2}\) in case of the trapezoidal and midpoint method.
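To make the rate measurement concrete, suppose the error behaves as \(E=Cn^{-r}\) for constants C and r (this model and the symbols \(E_{i}\), \(n_{i}\), and r are introduced here for illustration and may differ in form from the formula used elsewhere in the chapter). Two consecutive experiments then give

\[
\frac{E_{i-1}}{E_{i}}=\left(\frac{n_{i}}{n_{i-1}}\right)^{r}
\quad\Rightarrow\quad
r=\frac{\ln (E_{i-1}/E_{i})}{\ln (n_{i}/n_{i-1})} .
\]

With the errors 12 %, 3 %, and 0.77 % quoted above for n = 4, 8, 16, this yields

\[
r\approx\frac{\ln(12/3)}{\ln 2}=2,
\qquad
r\approx\frac{\ln(3/0.77)}{\ln 2}\approx 1.96 ,
\]

both close to the expected rate 2.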
3.4.3 Finite Precision of Floating-Point Numbers
The test procedures above lead to comparison of numbers for checking that calculations were correct. Such comparison is more complicated than what a newcomer might think. Suppose we have a calculation a + b and want to check that the result is what we expect. We start with \(1+2\):
Then we proceed with \(0.1+0.2\):
So why is \(0.1+0.2\neq 0.3\)? The reason is that real numbers cannot in general be exactly represented on a computer. They must instead be approximated by a floating-point number that can only store a finite amount of information, usually about 17 digits of a real number. Let us print 0.1, 0.2, 0.1 + 0.2, and 0.3 with 17 decimals:
We see that all of the numbers have an inaccurate digit in the 17th decimal place. Because \(0.1+0.2\) evaluates to 0.30000000000000004 and 0.3 is represented as 0.29999999999999999, these two numbers are not equal. In general, real numbers in Matlab have (at most) 16 correct decimals.
When we compute with real numbers, these numbers are inaccurately represented on the computer, and arithmetic operations with inaccurate numbers lead to small rounding errors in the final results. Depending on the type of numerical algorithm, the rounding errors may or may not accumulate.
If we cannot make tests like 0.1 + 0.2 == 0.3, what should we then do? The answer is that we must accept some small inaccuracy and make a test with a tolerance. Here is the recipe:
Here we have set the tolerance for comparison to \(10^{-15}\), but calculating 0.3 - (0.1 + 0.2) shows that it equals -5.55e-17, so a lower tolerance could be used in this particular example. However, in other calculations we have little idea about how accurate the answer is (there could be accumulation of rounding errors in more complicated algorithms), so \(10^{-15}\) or \(10^{-14}\) are robust values. As we demonstrate below, these tolerances depend on the magnitude of the numbers in the calculations.
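The session listings for this subsection are missing. The following short script is a sketch of the comparison recipe described above; the variable names are illustrative assumptions:

```matlab
% Comparing floating-point numbers: exact equality versus a tolerance.
a = 0.1;  b = 0.2;  expected = 0.3;
computed = a + b;

% Show the stored values with 17 decimals
fprintf('%.17f\n%.17f\n', computed, expected);

exact_match = (computed == expected);     % false because of rounding

tol = 1e-15;                              % tolerance for the comparison
difference = abs(expected - computed);    % about 5.55e-17
match_with_tol = difference < tol;        % true

fprintf('== gives %d, tolerance test gives %d\n', exact_match, match_with_tol);
```

The tolerance test succeeds where the == test fails, because the difference between the two stored numbers is far below \(10^{-15}\).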
3.4.4 Constructing Unit Tests and Writing Test Functions
Software testing in other languages often applies comprehensive test frameworks to automatically run through large numbers of tests. This is very advantageous as one can at any time check that the code works. It is a good habit to run the test suite after every edit of the source code files.

the name must start with test_

the test function cannot have any arguments

the tests inside test functions must be boolean expressions

a boolean expression b must be tested with assert(b, msg), where msg is an optional object (string or number) to be written out when b is false
A corresponding test function might then be
Test functions and their calls are conveniently placed in files whose names start with test_. A simple script can be made to search for such files and run them automatically (essentially, this is what testing frameworks do).
As long as we add integers, the equality test in the test_add function is appropriate, but if we try to call add(0.1, 0.2) instead, we will face the rounding error problems explained in Sect. 3.4.3, and we must use a test with tolerance instead:
Below we shall write test functions for each of the three test procedures we suggested: comparison with hand calculations, checking problems that can be exactly solved, and checking convergence rates. We stick to testing the trapezoidal integration code and collect all test functions in one common file by the name test_trapezoidal.m.

The numerical method (to be tested) must be available as a function in a file with the same name as the function.

The test functions are put in separate files.
Hand-computed numerical results
Our previous hand calculations for two trapezoids can be checked against the trapezoidal function inside a test function (in a file test_trapezoidal.m):
Note the importance of comparing the computed result against the exact one with a tolerance: rounding errors from the arithmetic inside trapezoidal will not make the result exactly like the hand-computed one. The size of the tolerance is here set to \(10^{-14}\), which is a kind of all-round value for computations with numbers not deviating much from unity.
Solving a problem without numerical errors
We know that the trapezoidal rule is exact for linear integrands. Choosing the integral \(\int_{1.2}^{4.4}(6x-4)dx\) as test case, the corresponding test function for this unit test may look like
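The test function is not reproduced in this section. A sketch of how it might be written is given below. Two things are assumptions made here: the tolerance \(10^{-12}\) (chosen a bit looser than \(10^{-14}\) because the numbers involved are around 40 rather than 1), and the local copy of trapezoidal, which is included only so the file runs on its own (normally the function lives in trapezoidal.m):

```matlab
function test_trapezoidal_linear()
    % The trapezoidal rule should integrate linear functions exactly.
    f = @(x) 6*x - 4;
    F = @(x) 3*x^2 - 4*x;          % antiderivative of f
    a = 1.2;  b = 4.4;
    expected = F(b) - F(a);
    tol = 1e-12;                   % assumed tolerance, see lead-in
    for n = [2 20 21]
        computed = trapezoidal(f, a, b, n);
        err = abs(expected - computed);
        assert(err < tol, 'n = %d, err = %g', n, err);
    end
end

function integral = trapezoidal(f, a, b, n)
    % Local copy of the composite trapezoidal rule (assumed form).
    h = (b - a)/n;
    result = 0.5*f(a) + 0.5*f(b);
    for i = 1:(n-1)
        result = result + f(a + i*h);
    end
    integral = h*result;
end
```

Testing several n values, including an odd one, guards against bugs that only show up for particular numbers of subintervals.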
Demonstrating correct convergence rates
In the present example with integration, it is known that the approximation errors in the trapezoidal rule are proportional to \(n^{-2}\), n being the number of subintervals used in the composite rule.
for \(i=1,2,\ldots,q\):
    \(n_{i}=2^{i}\)
    compute the integral with \(n_{i}\) intervals
    compute the error \(E_{i}\)
    estimate \(r_{i}\) from (3.24) if \(i>1\)
Making a test function is a matter of choosing f, F, a, and b, and then checking the value of \(r_{i}\) for the largest i:
Running the test shows that all \(r_{i}\), except the first one, equal the target limit 2 within two decimals. This observation suggests a tolerance of \(10^{-2}\).
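The corresponding listings are missing. The sketch below follows the algorithm above; it assumes the rate is estimated as \(r=\ln(E_{i-1}/E_{i})/\ln(n_{i}/n_{i-1})\), which may differ in form from (3.24). The helper name convergence_rates, the choice of test integral, and the local copy of trapezoidal are illustrative assumptions:

```matlab
function test_trapezoidal_conv_rate()
    % Check that the measured convergence rate approaches 2.
    v = @(t) 3*t^2*exp(t^3);
    V = @(t) exp(t^3);             % antiderivative of v
    a = 0;  b = 1;
    r = convergence_rates(v, V, a, b, 10);
    tol = 0.01;
    assert(abs(r(end) - 2) < tol, 'last rate = %g', r(end));
end

function r = convergence_rates(f, F, a, b, num_experiments)
    % Run the trapezoidal rule with n = 2, 4, 8, ... and estimate rates.
    expected = F(b) - F(a);
    n = zeros(num_experiments, 1);
    E = zeros(num_experiments, 1);
    for i = 1:num_experiments
        n(i) = 2^i;
        E(i) = abs(expected - trapezoidal(f, a, b, n(i)));
    end
    r = zeros(num_experiments - 1, 1);
    for i = 2:num_experiments
        r(i-1) = log(E(i-1)/E(i))/log(n(i)/n(i-1));
    end
end

function integral = trapezoidal(f, a, b, n)
    % Local copy of the composite trapezoidal rule (assumed form).
    h = (b - a)/n;
    result = 0.5*f(a) + 0.5*f(b);
    for i = 1:(n-1)
        result = result + f(a + i*h);
    end
    integral = h*result;
end
```

Only the last rate is checked, since the first few estimates are computed on coarse meshes where the asymptotic behavior has not yet set in.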
Remark about version control of files
Having a suite of test functions for automatically checking that your software works is considered as a fundamental requirement for reliable computing. Equally important is a system that can keep track of different versions of the files and the tests, known as a version control system. Today’s most popular version control system is Git, which the authors strongly recommend the reader to use for programming and writing reports. The combination of Git and cloud storage such as GitHub is a very common way of organizing scientific or engineering work. We have a quick intro to Git and GitHub that gets you up and running within minutes.
1. Before you start working with files, make sure you have the latest version of them by running git pull.
2. Edit files, remove or create files (new files must be registered by git add).
3. When a natural piece of work is done, commit your changes by the git commit command.
4. Implement your changes also in the cloud by doing git push.
3.5 Vectorization
The functions midpoint and trapezoidal usually run fast in Matlab and compute an integral to a satisfactory precision within a fraction of a second. However, long loops in Matlab may run slowly in more complicated implementations. To increase the speed, the loops can be replaced by vectorized code. The integration functions constitute a simple and good example to illustrate how to vectorize loops.
We have already seen simple examples on vectorization in Sect. 1.4 when we could evaluate a mathematical function \(f(x)\) for a large number of x values stored in an array. Basically, we can write
The result y is the array that would be computed if we ran a for loop over the individual x values and called f for each value. Vectorization essentially eliminates this loop in Matlab (i.e., the looping over x and application of f to each x value are instead performed in a library with fast, compiled code).
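As a small illustration of this equivalence, the script below evaluates a function once on a whole array and once element by element. The particular function f is a hypothetical example chosen here, not taken from the text:

```matlab
% Vectorized evaluation versus an explicit loop over the x values.
f = @(x) exp(-x).*sin(x) + 5*x;   % hypothetical example function
x = linspace(0, 4, 101);          % 101 points in [0, 4]

y = f(x);                         % vectorized: one call for all points

y_loop = zeros(size(x));          % loop version: one call per point
for i = 1:length(x)
    y_loop(i) = f(x(i));
end

max_diff = max(abs(y - y_loop));
fprintf('largest difference between the two versions: %g\n', max_diff);
```

The element-wise operator .* is what allows f to accept an array argument.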
Vectorizing the midpoint rule
1. compute all the evaluation points in one array x
2. call f(x) to produce an array of corresponding function values
3. use the sum function to sum the f(x) values
The code is found in the file midpoint_vec.m . An interactive test reads
Note the need for the vectorized operator .* in the function expression since v(x) will be called with array arguments x.
The vectorized code performs all loops very efficiently in compiled code, resulting in much faster execution. Moreover, many readers of the code will also say that the algorithm looks clearer than in the loop-based implementation.
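The contents of midpoint_vec.m are not shown in this section. Following the three steps for vectorizing the midpoint rule, a sketch could look like this (assuming the same composite midpoint formula as in the loop version):

```matlab
function result = midpoint_vec(f, a, b, n)
    % Vectorized composite midpoint rule with n subintervals.
    h = (b - a)/n;
    x = linspace(a + h/2, b - h/2, n);  % step 1: all midpoints in one array
    result = h*sum(f(x));               % steps 2 and 3: evaluate and sum
end
```

A call then requires an integrand written with element-wise operators, e.g. midpoint_vec(@(x) 3*x.^2.*exp(x.^3), 0, 1, 10).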
Vectorizing the trapezoidal rule
We can use the same approach to vectorize the trapezoidal function. However, the trapezoidal rule performs a sum where the end points have different weight. If we do sum(f(x)), we get the end points f(a) and f(b) with weight unity instead of one half. A remedy is to subtract the error from sum(f(x)): sum(f(x)) - 0.5*f(a) - 0.5*f(b). The vectorized version of the trapezoidal method then becomes (code in trapezoidal_vec.m)
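Since the listing is missing, here is a sketch of what trapezoidal_vec.m might contain, using the end-point correction just described:

```matlab
function result = trapezoidal_vec(f, a, b, n)
    % Vectorized composite trapezoidal rule with n subintervals.
    h = (b - a)/n;
    x = linspace(a, b, n+1);            % all n+1 mesh points
    % sum(f(x)) weights every point by 1; subtract half of each
    % end-point value so the end points get weight 1/2.
    s = sum(f(x)) - 0.5*f(a) - 0.5*f(b);
    result = h*s;
end
```

For the running example, trapezoidal_vec(@(x) 3*x.^2.*exp(x.^3), 0, 1, 4) should agree with the loop-based trapezoidal function up to rounding errors.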
3.6 Measuring Computational Speed
Now that we have created faster, vectorized versions of functions in the previous section, it is interesting to measure how much faster they are. The purpose of the present section is therefore to explain how we can record the time consumed by a function so we can answer this question. The “stop watch” in Matlab is the function pair tic (start) and toc (read the elapsed time). Here is an interactive session measuring the effect of midpoint_vec versus midpoint:
The vectorized version is 100 times faster!
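The timing session is not reproduced here. The script below sketches how tic and toc can be used for such a comparison. To stay self-contained it inlines the loop and the vectorized midpoint computations instead of calling midpoint and midpoint_vec, and the value of n is an arbitrary choice; the measured ratio will depend on the machine and the Matlab version:

```matlab
% Timing a loop-based versus a vectorized midpoint computation.
v = @(x) 3*x.^2.*exp(x.^3);
a = 0;  b = 1;  n = 1e5;
h = (b - a)/n;

tic;                                   % start the stop watch
s = 0;
for i = 0:(n-1)
    s = s + v((a + h/2) + i*h);
end
I_loop = h*s;
t_loop = toc;                          % elapsed seconds for the loop

tic;
x = linspace(a + h/2, b - h/2, n);
I_vec = h*sum(v(x));
t_vec = toc;                           % elapsed seconds, vectorized

fprintf('loop: %.3g s, vectorized: %.3g s, ratio: %.1f\n', ...
        t_loop, t_vec, t_loop/t_vec);
```

Both versions compute the same approximation, so any difference in I_loop and I_vec is due to rounding only.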
3.7 Double and Triple Integrals
3.7.1 The Midpoint Rule for a Double Integral
Derivation via onedimensional integrals
Direct derivation
Programming a double sum
The formula (3.25) involves a double sum, which is normally implemented as a double for loop. A Matlab function implementing (3.25) may look like
With this function, which is available in the file midpoint_double1.m, we may now compute the integral \(\int_{0}^{2}\int_{2}^{3}(2x+y)dydx=9\) in an interactive shell and demonstrate that the function computes the right number:
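Neither the function nor the session is reproduced in this section. A sketch of midpoint_double1.m is shown below, assuming (3.25) is the double midpoint sum \(h_{x}h_{y}\sum_{i}\sum_{j}f(x_{i},y_{j})\) over cell midpoints; the argument order (f, a, b, c, d, nx, ny), with x in \([a,b]\) and y in \([c,d]\), is an assumption:

```matlab
function result = midpoint_double1(f, a, b, c, d, nx, ny)
    % Midpoint rule for the double integral of f(x, y) over
    % [a, b] x [c, d] with nx times ny cells.
    hx = (b - a)/nx;
    hy = (d - c)/ny;
    I = 0;
    for i = 0:(nx-1)
        for j = 0:(ny-1)
            xi = a + hx/2 + i*hx;      % midpoint in the x direction
            yj = c + hy/2 + j*hy;      % midpoint in the y direction
            I = I + hx*hy*f(xi, yj);
        end
    end
    result = I;
end
```

With f = @(x, y) 2*x + y, the call midpoint_double1(f, 0, 2, 2, 3, 5, 5) should reproduce the value 9 up to rounding errors, since the rule is exact for linear integrands.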
Reusing code for one-dimensional integrals
It is very natural to write a two-dimensional midpoint method as we did in function midpoint_double1 when we have the formula (3.25). However, we could alternatively ask, much as we did in the mathematics, can we reuse a well-tested implementation for one-dimensional integrals to compute double integrals? That is, can we use function midpoint from Sect. 3.3.2 “twice”? The answer is yes, if we think as we did in the mathematics: compute the double integral as a midpoint rule for integrating \(g(x)\) and define \(g(x_{i})\) in terms of a midpoint rule over f in the y coordinate. The corresponding function has very short code:
The important advantage of this implementation is that we reuse a well-tested function for the standard one-dimensional midpoint rule and that we apply the one-dimensional rule exactly as in the mathematics.
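The short function referred to is missing from this section. One way it might be written is sketched here; the name midpoint_double2 matches the file naming used for the triple integral later, but the argument order is an assumption, and the local copy of midpoint is included only so the file runs on its own:

```matlab
function result = midpoint_double2(f, a, b, c, d, nx, ny)
    % Double integral as a 1D midpoint rule in x applied to g(x),
    % where g(x) is itself a 1D midpoint rule in y.
    g = @(x) midpoint(@(y) f(x, y), c, d, ny);
    result = midpoint(g, a, b, nx);
end

function result = midpoint(f, a, b, n)
    % Local copy of the one-dimensional composite midpoint rule.
    h = (b - a)/n;
    result = 0;
    for i = 0:(n-1)
        result = result + f((a + h/2) + i*h);
    end
    result = h*result;
end
```

The nested anonymous function @(y) f(x, y) freezes x at the current midpoint, which mirrors the definition of \(g(x_{i})\) in the mathematics.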
Verification via test functions
How can we test that our functions for the double integral work? The best unit test is to find a problem where the numerical approximation error vanishes because then we know exactly what the numerical answer should be. The midpoint rule is exact for linear functions, regardless of how many subintervals we use. Also, any linear two-dimensional function \(f(x,y)=px+qy+r\) will be integrated exactly by the two-dimensional midpoint rule. We may pick \(f(x,y)=2x+y\) and create a proper test function that can automatically verify our two alternative implementations of the two-dimensional midpoint rule. To compute the integral of \(f(x,y)\) we take advantage of SymPy to eliminate the possibility of errors in hand calculations. The test function becomes
Let test functions speak up?
If we call the above test_midpoint_double function and nothing happens, our implementations are correct. However, it is somewhat annoying to have a function that is completely silent when it works – are we sure all things are properly computed? During development it is therefore highly recommended to insert a print statement such that we can monitor the calculations and be convinced that the test function does what we want. Since a test function should not have any print statement, we simply comment it out as we have done in the function listed above.
The trapezoidal method can be used as alternative for the midpoint method. The derivation of a formula for the double integral and the implementations follow exactly the same ideas as we explained with the midpoint method, but there are more terms to write in the formulas. Exercise 3.13 asks you to carry out the details. That exercise is a very good test on your understanding of the mathematical and programming ideas in the present section.
3.7.2 The Midpoint Rule for a Triple Integral
Theory
Implementation
We follow the ideas for the implementations of the midpoint rule for a double integral. The corresponding functions are shown below and found in the files midpoint_triple1.m , midpoint.m , midpoint_triple2.m , test_midpoint_triple.m .
3.7.3 Monte Carlo Integration for Complex-Shaped Domains
Repeated use of one-dimensional integration rules to handle double and triple integrals constitutes a working strategy only if the integration domain is a rectangle or box. For any other shape of domain, completely different methods must be used. A common approach for two- and three-dimensional domains is to divide the domain into many small triangles or tetrahedra and use numerical integration methods for each triangle or tetrahedron. The overall algorithm and implementation are too complicated to be addressed in this book. Instead, we shall employ an alternative, very simple and general method, called Monte Carlo integration. It can be implemented in half a page of code, but requires orders of magnitude more function evaluations in double integrals compared to the midpoint rule.
However, Monte Carlo integration is much more computationally efficient than the midpoint rule when computing higher-dimensional integrals in more than three variables over hypercube domains. Our ideas for double and triple integrals can easily be generalized to handle an integral in m variables. A midpoint formula then involves m sums. With n cells in each coordinate direction, the formula requires \(n^{m}\) function evaluations. That is, the computational work explodes as an exponential function of the number of space dimensions. Monte Carlo integration, on the other hand, does not suffer from this explosion of computational work and is the preferred method for computing higher-dimensional integrals. So, it makes sense in a chapter on numerical integration to address Monte Carlo methods, both for handling complex domains and for handling integrals with many variables.
The Monte Carlo integration algorithm
The idea of Monte Carlo integration of \(\int_{a}^{b}f(x)dx\) is to use the mean-value theorem from calculus, which states that the integral \(\int_{a}^{b}f(x)dx\) equals the length of the integration domain, here \(b-a\), times the average value of f, \(\bar{f}\), in \([a,b]\). The average value can be computed by sampling f at a set of random points inside the domain and taking the mean of the function values. In higher dimensions, an integral is estimated as the area/volume of the domain times the average value, and again one can evaluate the integrand at a set of random points in the domain and compute the mean value of those evaluations.
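In one dimension, the idea fits in a handful of lines. The sketch below is only an illustration: the integrand \(x^{2}\) on \([0,3]\) (exact integral 9), the number of points, and the seed are assumptions, and rng is used to fix the seed (older Octave versions may need rand('seed', ...) instead).

```matlab
% Sketch: Monte Carlo integration in one variable.
rng(1);                           % fix the seed so the run is reproducible
f = @(x) x.^2;                    % illustrative integrand
a = 0; b = 3;                     % integration limits, exact integral is 9
n = 100000;                       % number of random points (assumed)

x = a + (b - a)*rand(n, 1);       % uniformly distributed points in [a,b]
f_mean = mean(f(x));              % estimate of the average value of f
I = (b - a)*f_mean;               % length of domain times average value
fprintf('Monte Carlo estimate: %.4f (exact: 9)\n', I);
```

The estimate is close to 9 but not equal to it, and a different seed gives a slightly different number.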
For a two-dimensional domain Ω, the algorithm consists of the following steps:
1. embed the geometry Ω in a rectangular area R
2. draw a large number of random points \((x,y)\) in R
3. count the fraction q of points that are inside Ω
4. approximate \(A(\Omega)/A(R)\) by q, i.e., set \(A(\Omega)=qA(R)\)
5. evaluate the mean of f, \(\bar{f}\), at the points inside Ω
6. estimate the integral as \(A(\Omega)\bar{f}\)
To get an idea of the method, consider a circular domain Ω embedded in a rectangle as shown below. A collection of random points is illustrated by black dots.
Implementation
A Matlab function implementing \(\int_{\Omega}f(x,y)dxdy\) is found in the file MonteCarlo_double.m.
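The six steps of the algorithm can be sketched as a short script. This is not the contents of MonteCarlo_double.m, only a minimal illustration under stated assumptions: the domain is the unit disk, it is described by a function g with the assumed convention \(g(x,y)\geq 0\) inside Ω, the integrand is \(x^{2}+y^{2}\) (exact integral \(\pi/2\)), and each random point is drawn independently.

```matlab
% Sketch: Monte Carlo integration of f over a domain given by g >= 0.
% Domain, integrand, seed and number of points are illustrative assumptions.
rng(2);                               % fixed seed, reproducible run
f = @(x, y) x^2 + y^2;                % integrand, exact integral is pi/2
g = @(x, y) 1 - (x^2 + y^2);          % g >= 0 inside the unit disk (assumed convention)
x0 = -1; x1 = 1; y0 = -1; y1 = 1;     % step 1: rectangle R that contains the disk
n = 20000;                            % number of random points

% Step 2: draw n random points (x,y) in R
x = x0 + (x1 - x0)*rand(n, 1);
y = y0 + (y1 - y0)*rand(n, 1);

num_inside = 0;                       % number of points inside the domain
f_sum = 0;                            % sum of f values at those points
for i = 1:n
    if g(x(i), y(i)) >= 0
        num_inside = num_inside + 1;
        f_sum = f_sum + f(x(i), y(i));
    end
end

q = num_inside/n;                     % step 3: fraction of points inside
area = q*(x1 - x0)*(y1 - y0);         % step 4: A(Omega) = q*A(R)
f_mean = f_sum/num_inside;            % step 5: mean of f inside the domain
I = area*f_mean;                      % step 6: estimate of the integral
fprintf('area = %.4f (exact: %.4f)\n', area, pi);
fprintf('integral = %.4f (exact: %.4f)\n', I, pi/2);
```

Describing the domain through the sign of a function g keeps the integration code independent of the shape of Ω: only g changes when the geometry changes.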
Verification
A simple test case is to check the area of a rectangle \([0,2]\times[3,4.5]\) embedded in a rectangle \([0,3]\times[2,5]\). The right answer is 3, but Monte Carlo integration is, unfortunately, never exact so it is impossible to predict the output of the algorithm. All we know is that the estimated integral should approach 3 as the number of random points goes to infinity. Also, for a fixed number of points, we can run the algorithm several times and get different numbers that fluctuate around the exact value, since different sample points are used in different calls to the Monte Carlo integration algorithm.
The area of the rectangle can be computed by the integral \(\int_{0}^{2}\int_{3}^{4.5}dydx\), so in this case we identify \(f(x,y)=1\), and the g function can be specified as (e.g.) 1 if \((x,y)\) is inside \([0,2]\times[3,4.5]\) and −1 otherwise. The MonteCarlo_double function can then be used to compute the area for different numbers of samples.
A one-line definition of g can exploit the fact that each of the boolean tests (in parentheses, separated by &&) evaluates to either 0 (if false) or 1 (if true). If all of them evaluate to true, the whole parenthesis will evaluate to 1 and the number 1 (from \(-1+2*1\)) is returned. On the other hand, if any single one of the boolean tests evaluates to false, the parenthesis will evaluate to 0 and the number \(-1\) (from \(-1+2*0\)) is returned. Running the computation shows values that fluctuate around 3, a fact that supports a correct implementation, but in principle, bugs could be hidden behind the inaccurate answers.
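A possible one-line g of this kind, together with an area computation for a few sample sizes, is sketched below. The Monte Carlo step is written inline rather than through MonteCarlo_double, and the seed, the sample sizes, and the use of arrayfun to apply the scalar g to every point are choices made for this illustration.

```matlab
% Sketch: area of [0,2]x[3,4.5] embedded in [0,3]x[2,5] by Monte Carlo.
rng(0);                                           % fixed seed (assumed)
% g is 1 inside the small rectangle and -1 outside
g = @(x, y) -1 + 2*(0 <= x && x <= 2 && 3 <= y && y <= 4.5);
x0 = 0; x1 = 3; y0 = 2; y1 = 5;                   % embedding rectangle R

n_values = [100 1000 5000 20000];                 % numbers of random points
areas = zeros(size(n_values));
for k = 1:numel(n_values)
    n = n_values(k);
    x = x0 + (x1 - x0)*rand(n, 1);
    y = y0 + (y1 - y0)*rand(n, 1);
    inside = arrayfun(g, x, y) >= 0;              % g works on scalars only
    % With f = 1, the integral equals the estimated area q*A(R)
    areas(k) = nnz(inside)/n*(x1 - x0)*(y1 - y0);
    fprintf('n = %6d: area = %.4f\n', n, areas(k));
end
```

Because && only accepts scalar operands, g is applied point by point; a vectorized variant would use & instead.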
It is mathematically known that the standard deviation of the Monte Carlo estimate of an integral decreases as \(n^{-1/2}\), where n is the number of samples. This kind of convergence rate estimate could be used to verify the implementation, but this topic is beyond the scope of this book.
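Without going into the theory, the \(n^{-1/2}\) behavior can at least be observed experimentally. The sketch below repeats a one-dimensional Monte Carlo estimate many times for three sample sizes and measures how the spread of the estimates changes. The integrand, the sample sizes, the number of repetitions, and the seed are all assumptions, and the measured rates are themselves noisy.

```matlab
% Sketch: observe how the standard deviation of the estimate depends on n.
rng(4);                               % fixed seed (assumed)
f = @(x) x.^2;                        % illustrative integrand
a = 0; b = 3;
reps = 200;                           % repetitions per sample size
ns = [1000 4000 16000];               % sample sizes, each 4 times the previous
s = zeros(size(ns));
for k = 1:numel(ns)
    est = zeros(reps, 1);
    for r = 1:reps
        x = a + (b - a)*rand(ns(k), 1);
        est(r) = (b - a)*mean(f(x));  % one Monte Carlo estimate
    end
    s(k) = std(est);                  % spread of the estimates for this n
end
% Observed rate p in s ~ n^p between consecutive sample sizes
rates = log(s(2:end)./s(1:end-1))./log(ns(2:end)./ns(1:end-1));
fprintf('standard deviations: %s\n', mat2str(s, 3));
fprintf('observed rates:      %s\n', mat2str(rates, 3));
```

Rates in the vicinity of −0.5 are consistent with the stated behavior: four times as many points roughly halves the standard deviation.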
Test function for function with random numbers
To make a test function, we need a unit test that has identical behavior each time we run the test. This seems difficult when random numbers are involved, because these numbers are different every time we run the algorithm, and each run hence produces a (slightly) different result. A standard way to test algorithms involving random numbers is to fix the seed of the random number generator. Then the sequence of numbers is the same every time we run the algorithm. Assuming that the MonteCarlo_double function works, we fix the seed, observe a certain result, and take this result as the correct result. Provided the test function always uses this seed, we should get exactly this result every time the MonteCarlo_double function is called. A test function built on this idea is found in the file test_MonteCarlo_double_rectangle_area.m.
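The effect of fixing the seed can be demonstrated in a few lines. The sketch below does not reproduce the test file mentioned above and does not rely on any stored reference number; it simply takes the first run as the reference and checks that a second run with the same seed gives the identical value. The seed value 10, the helper names in_rect and area_estimate, and the use of rng are assumptions.

```matlab
% Sketch: two runs with the same seed give exactly the same estimate.
% in_rect is a vectorized inside-test for the rectangle [0,2]x[3,4.5]
in_rect = @(x, y) (0 <= x) & (x <= 2) & (3 <= y) & (y <= 4.5);
% Estimated area: fraction of points inside times area of [0,3]x[2,5]
area_estimate = @(x, y) 9*mean(in_rect(x, y));
n = 10000;                                 % number of random points (assumed)

rng(10);                                   % fix the seed
x = 3*rand(n, 1);  y = 2 + 3*rand(n, 1);   % points in [0,3]x[2,5]
first = area_estimate(x, y);               % taken as the reference value

rng(10);                                   % same seed again
x = 3*rand(n, 1);  y = 2 + 3*rand(n, 1);
second = area_estimate(x, y);

assert(first == second, 'Same seed should reproduce the same estimate');
fprintf('Both runs gave area = %.6f\n', first);
```

A test of this kind only detects changes in behavior; it cannot tell whether the reference value itself is correct, which is why the fluctuation around the exact answer 3 is checked first.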
Integral over a circle
The test above involves a trivial function \(f(x,y)=1\). We should also test a non-constant f function and a more complicated domain. Let Ω be a circle at the origin with radius 2, and let \(f=\sqrt{x^{2}+y^{2}}\). This choice makes it possible to compute an exact result: in polar coordinates, \(\int_{\Omega}f(x,y)dxdy\) simplifies to \(2\pi\int_{0}^{2}r^{2}dr=16\pi/3\). We must be prepared for quite crude approximations that fluctuate around this exact result. As in the test case above, we experience better results with a larger number of points. When we have such evidence for a working implementation, we can turn the test into a proper test function (see the file test_MonteCarlo_double_circle_r.m).
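The circle case can be tried out with a vectorized sketch like the one below. It is an illustration only: the seed, the number of points, the bounding square \([-2,2]\times[-2,2]\), and the convention that g is non-negative inside the circle are assumptions.

```matlab
% Sketch: Monte Carlo estimate of the integral of r over a circle of radius 2.
rng(3);                                    % fixed seed (assumed)
f = @(x, y) sqrt(x.^2 + y.^2);             % integrand
g = @(x, y) 2^2 - (x.^2 + y.^2);           % g >= 0 inside the circle (assumed convention)
x0 = -2; x1 = 2; y0 = -2; y1 = 2;          % bounding square R
n = 200000;                                % number of random points

x = x0 + (x1 - x0)*rand(n, 1);
y = y0 + (y1 - y0)*rand(n, 1);
inside = g(x, y) >= 0;                     % logical mask of points in the circle

area = nnz(inside)/n*(x1 - x0)*(y1 - y0);  % estimated area of the circle
f_mean = mean(f(x(inside), y(inside)));    % mean of f over points inside
I = area*f_mean;
exact = 16*pi/3;
fprintf('estimate = %.4f, exact = %.4f, error = %.2e\n', I, exact, abs(I - exact));
```

The estimated area should also be close to \(4\pi\), which gives a second, independent check of the same run.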
3.8 Exercises
Exercise 3.1 (Hand calculations for the trapezoidal method)
Compute by hand the area composed of two trapezoids (of equal width) that approximates the integral \(\int_{1}^{3}2x^{3}dx\). Make a test function that calls the trapezoidal function in trapezoidal.m and compares the return value with the hand-calculated value.
Filename: trapezoidal_test_func.m.
Exercise 3.2 (Hand calculations for the midpoint method)
Compute by hand the area composed of two rectangles (of equal width) that approximates the integral \(\int_{1}^{3}2x^{3}dx\). Make a test function that calls the midpoint function in midpoint.m and compares the return value with the hand-calculated value.
Filename: midpoint_test_func.m.
Exercise 3.3 (Compute a simple integral)
Apply the trapezoidal and midpoint functions to compute the integral \(\int_{2}^{6}x(x-1)dx\) with 2 and 100 subintervals. Compute the error too.
Filename: integrate_parabola.m.
Exercise 3.4 (Hand calculations with sine integrals)
a) Let \(b=\pi\) and use two intervals in the trapezoidal and midpoint method. Compute the integral by hand and illustrate how the two numerical methods approximate the integral. Compare with the exact value.
b) Do a) when \(b=2\pi\).
Filename: integrate_sine.pdf.
Exercise 3.5 (Make test functions for the midpoint method)
Modify the file test_trapezoidal.m such that the three tests are applied to the function midpoint implementing the midpoint method for integration.
Filename: test_midpoint.m.
Exercise 3.6 (Explore rounding errors with large numbers)
The trapezoidal method integrates linear functions exactly, and this property was used in the test function test_trapezoidal_linear in the file test_trapezoidal.m. Change the function used in Sect. 3.4.2 to \(f(x)=6\cdot 10^{8}x-4\cdot 10^{6}\) and rerun the test. What happens? How must you change the test to make it useful? How does the convergence rate test behave? Any need for adjustment?
Filename: test_trapezoidal2.m.
Exercise 3.7 (Write test functions for \(\int_{0}^{4}\sqrt{x}dx\))
We want to test how the trapezoidal function works for the integral \(\int_{0}^{4}\sqrt{x}dx\). Two of the tests in test_trapezoidal.m are meaningful for this integral. Compute by hand the result of using 2 or 3 trapezoids and modify the test_trapezoidal_one_exact_result function accordingly. Then modify test_trapezoidal_conv_rate to handle the square root integral.
Filename: test_trapezoidal3.m.
Remarks
The convergence rate test fails. Printing out r shows that the actual convergence rate for this integral is −1.5 and not −2. The reason is that the error in the trapezoidal method ^{6} is proportional to \((b-a)^{3}n^{-2}f^{\prime\prime}(\xi)\) for some (unknown) \(\xi\in[a,b]\). With \(f(x)=\sqrt{x}\), \(f^{\prime\prime}(\xi)\rightarrow-\infty\) as \(\xi\rightarrow 0\), pointing to a potential problem in the size of the error. Running a test with a > 0, say \(\int_{0.1}^{4}\sqrt{x}dx\), shows that the convergence rate is indeed restored to \(-2\).
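The two convergence rates can be observed with a compact experiment. The sketch below does not use the functions in test_trapezoidal.m; it defines its own one-line composite trapezoidal rule as an anonymous function, and the choice of n values is an assumption.

```matlab
% Sketch: observed convergence rates of the trapezoidal rule for sqrt(x).
% Composite trapezoidal rule with n equal subintervals (own one-line version)
trapezoidal = @(f, a, b, n) (b - a)/n* ...
    (sum(f(linspace(a, b, n + 1))) - 0.5*(f(a) + f(b)));
f = @(x) sqrt(x);
F = @(x) 2/3*x.^1.5;                   % antiderivative, for the exact value
b = 4;
ns = 2.^(4:12);                        % numbers of subintervals (assumed)
a_values = [0 0.1];                    % lower limits to compare
r_last = zeros(size(a_values));
for i = 1:numel(a_values)
    a = a_values(i);
    exact = F(b) - F(a);
    E = zeros(size(ns));
    for k = 1:numel(ns)
        E(k) = abs(exact - trapezoidal(f, a, b, ns(k)));
    end
    % Rate r in E ~ n^r, estimated from consecutive experiments
    r = log(E(2:end)./E(1:end-1))./log(ns(2:end)./ns(1:end-1));
    r_last(i) = r(end);
    fprintf('a = %.1f: rate from the two finest grids = %.3f\n', a, r_last(i));
end
```

With the lower limit at 0 the rate settles near −1.5, while moving the limit to 0.1 brings it back to about −2.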
Exercise 3.8 (Rectangle methods)
a) Write a function rectangle(f, a, b, n, height=’left’) for computing an integral \(\int_{a}^{b}f(x)dx\) by the rectangle method with height computed based on the value of height, which is either left, right, or mid.
b) Write three test functions for the three unit test procedures described in Sect. 3.4.2. Make sure you test for height equal to left, right, and mid. You may call the midpoint function for checking the result when height=mid.
Hint
Edit test_trapezoidal.m.
Filename: rectangle_methods.m.
Exercise 3.9 (Adaptive integration)
Suppose we want to use the trapezoidal or midpoint method to compute an integral \(\int_{a}^{b}f(x)dx\) with an error less than a prescribed tolerance ϵ. What is the appropriate size of n?
To answer this question, we may enter an iterative procedure where we compare the results produced by n and \(2n\) intervals, and if the difference is smaller than ϵ, the value corresponding to \(2n\) is returned. Otherwise, we double n and repeat the procedure.
Hint
It may be a good idea to organize your code so that the function adaptive_integration can be used easily in future programs you write.
a) Write a function adaptive_integration that implements the idea above (eps corresponds to the tolerance ϵ, and method can be midpoint or trapezoidal).
b) Test the method on \(\int_{0}^{2}x^{2}dx\) and \(\int_{0}^{2}\sqrt{x}dx\) for \(\epsilon=10^{-1},10^{-10}\) and write out the exact error.
c) Make a plot of n versus \(\epsilon\in[10^{-1},10^{-10}]\) for \(\int_{0}^{2}\sqrt{x}dx\). Use logarithmic scale for ϵ.
Filename: adaptive_integration.m.
Remarks
The type of method explored in this exercise is called adaptive, because it tries to adapt the value of n to meet a given error criterion. The true error can very seldom be computed (since we do not know the exact answer to the computational problem), so one has to find other indicators of the error, such as the one here where the changes in the integral value, as the number of intervals is doubled, is taken to reflect the error.
Exercise 3.10 (Integrating x raised to x)
Hint
Use ideas from Exercise 3.9.
Filename: integrate_x2x.m.
Exercise 3.11 (Integrate products of sine functions)
a) Plot \(\sin(x)\sin(2x)\) and \(\sin(2x)\sin(3x)\) for \(x\in]-\pi,\pi]\) in separate plots. Explain why you expect \(\int_{-\pi}^{\pi}\sin x\sin 2x\,dx=0\) and \(\int_{-\pi}^{\pi}\sin 2x\sin 3x\,dx=0\).
b) Use the trapezoidal rule to compute \(I_{j,k}=\int_{-\pi}^{\pi}\sin(jx)\sin(kx)\,dx\) for \(j=1,\ldots,10\) and \(k=1,\ldots,10\).
Filename: products_sines.m.
Exercise 3.12 (Revisit fit of sines to a function)
a) Compute the partial derivative \(\partial E/\partial b_{1}\) and generalize to the arbitrary case \(\partial E/\partial b_{n}\), \(1\leq n\leq N\).
b) Show that $$b_{n}=\frac{1}{\pi}\int_{-\pi}^{\pi}f(t)\sin(nt)\,dt\thinspace.$$
c) Write a function integrate_coeffs(f, N, M) that computes \(b_{1},\ldots,b_{N}\) by numerical integration, using M intervals in the trapezoidal rule.
d) A remarkable property of the trapezoidal rule is that it is exact for integrals \(\int_{-\pi}^{\pi}\sin nt\,dt\) (when subintervals are of equal size). Use this property to create a function test_integrate_coeff to verify the implementation of integrate_coeffs.
e) Implement the choice \(f(t)=\frac{1}{\pi}t\) as a Matlab function f(t) and call integrate_coeffs(f, 3, 100) to see what the optimal choice of \(b_{1},b_{2},b_{3}\) is.
f) Make a function plot_approx(f, N, M, filename) where you plot f(t) together with the best approximation \(S_{N}\) as computed above, using M intervals for numerical integration. Save the plot to a file with name filename.
g) Run plot_approx(f, N, M, filename) for \(f(t)=\frac{1}{\pi}t\) for N = 3, 6, 12, 24. Observe how the approximation improves.
h) Run plot_approx for \(f(t)=e^{-(t-\pi)}\) and N = 100. Observe a fundamental problem: regardless of N, \(S_{N}(-\pi)=0\), not \(e^{2\pi}\approx 535\). (There are ways to fix this issue.)
Filename: autofit_sines.m.
Exercise 3.13 (Derive the trapezoidal rule for a double integral)
Use ideas in Sect. 3.7.1 to derive a formula for computing a double integral \(\int_{a}^{b}\int_{c}^{d}f(x,y)dydx\) by the trapezoidal rule. Implement and test this rule.
Filename: trapezoidal_double.m.
Exercise 3.14 (Compute the area of a triangle by Monte Carlo integration)
Use the Monte Carlo method from Sect. 3.7.3 to compute the area of a triangle with vertices at \((1,0)\), \((1,0)\), and \((3,0)\).
Filename: MC_triangle.m.
Footnotes
1. Replacing h by \(\Delta t\) is not strictly required as many use h as interval also along the time axis. Nevertheless, \(\Delta t\) is an even more popular notation for a small time interval, so we adopt that common notation.
2. You cannot integrate \(e^{-x^{2}}\) by hand, but this particular integral appears so often in so many contexts that the integral is a special function, called the Error function (http://en.wikipedia.org/wiki/Error_function) and written \(\mbox{erf}(x)\). In a code, you can call erf(x).
 3.
 4.
 5.
 6.
Copyright information
Open Access This chapter is distributed under the terms of the Creative Commons Attribution‐NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, a link is provided to the Creative Commons license and any changes made are indicated.
The images or other third party material in this chapter are included in the work’s Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work’s Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.