# Numerical Computing Formalism

• Sandeep Nagar

## Abstract

Numerical computation enables us to compute solutions for numerical problems, provided we can frame them in a proper format. This requires certain considerations. For example, if we digitize continuous functions, we introduce errors because the sampling is done at a finite frequency. Hence, a very accurate result requires a very fast sampling rate. When a large dataset needs to be processed, this becomes a computationally intensive and time-consuming task. Also, users must understand that numerical solutions are, at best, an approximation when compared to analytical solutions. The onus of finding their physical meaning and significance lies on us. The art of discarding solutions that have no meaning in a real-world scenario is something that a scientist/engineer develops over the years. Furthermore, a computational device is only as intelligent as its operator. The law of GIGO (garbage in, garbage out) applies very strictly in this domain.

## 10.1 Introduction

In this chapter, we will consider some of the important steps in solving a physical problem using numerical computation. Defining a problem in proper terms is just the first step. Making the right model and then using the right method (solver) to solve the problem distinguishes an experienced scientist/engineer from a novice.

## 10.2 Physical Problems

Everything in our physical world is governed by physical laws. Because of the men and women of science who toiled under difficult circumstances to come up with fine explanations for the natural events happening around us, we have mathematical theories for physical laws. To test these mathematical formalisms of physical laws, we use numerical computation. If a computation yields the same results as a real experiment, the two validate each other. Numerical simulations can remove the need for an experiment altogether, provided we have a well-tested mathematical formalism. For example, the nuclear powers of our times don’t have to test real nuclear bombs anymore. The data obtained during real nuclear explosions have enabled scientists to model these physical systems quite accurately, thus eliminating the need for a real test.

In addition to applications such as simulating a real experiment, modeling physical problems is a good educational exercise. Hands-on modeling exercises enable students to explore the subject in depth and give proper meaning to the topic under study. Solving numerical problems and visualizing the results make the learning permanent and can also spark research into flaws in a mathematical theory, ultimately leading to new discoveries.

## 10.3 Defining a Model

Modeling is defined as writing equations for a physical system. As its name suggests, an equation pertains to equating two sides. An equation is written using an = sign, where the terms on the left-hand side are equal to the terms on the right-hand side. The terms on either side of an equation can be numbers or expressions. For example:
$$3x+4y+9z=10$$

This is an equation having an expression, 3x + 4y + 9z, on the left-hand side (LHS) and a term, 10, on the right-hand side (RHS). Please note that while the LHS is an algebraic expression, the RHS is a number.

Expressions are written using functions that show a relation between two domains. For example, f(x) = y illustrates a relationship of y to x using rules of algebra. Mathematics has a rich library of functions that can be used to make expressions. Choosing the proper function depends on the problem. Some functions describe some situations better than others. For example, oscillatory behavior can be described in a reasonable manner using trigonometric functions such as sin(x) and cos(x). Objects moving in straight lines can be described using linear equations such as y = mx + c, where x is the independent variable (for example, elapsed time), y is the present position, m is the constant rate of change of y with respect to x, and c is the offset position. Objects moving in a curved fashion can be described by various nonlinear functions, where the power of the independent variable, like x in the previous sentence, is not 1.

In real life, we can have situations that are a mixture of these scenarios. For example, an object can oscillate and move in a curved fashion at the same time. In such cases, we write an expression using a mixture of functions or find new functions that could explain the behavior of the object. The functions are verified by finding solutions to the equations describing the behavior and matching them with observations made about the object. If they match perfectly, we obtain perfect solutions. In most cases, an exact solution might be difficult to obtain. In these cases, we get an “approximate” solution. If the errors involved in obtaining an approximate solution are within tolerance limits, the model can be accepted.
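To make the idea of an acceptable approximate model concrete, the following minimal sketch fits the linear model y = mx + c to a handful of observations and checks the largest deviation against a tolerance. The observation values and the tolerance are made up purely for illustration, and `fit_line` and `max_abs_error` are hypothetical helper names, not part of any library.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and offset c for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def max_abs_error(xs, ys, m, c):
    """Largest absolute difference between the observations and the model."""
    return max(abs(y - (m * x + c)) for x, y in zip(xs, ys))


# Hypothetical observations (made-up values scattered around a straight line)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

m, c = fit_line(xs, ys)
error = max_abs_error(xs, ys, m, c)

tolerance = 0.5  # hypothetical tolerance limit chosen for this illustration
acceptable = error <= tolerance
print(f"m = {m:.3f}, c = {c:.3f}, max error = {error:.3f}, acceptable: {acceptable}")
```

Whether a tolerance of this size is reasonable depends entirely on the physical problem; the sketch only shows the mechanics of comparing a model with observations.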

As previously discussed, physical situations can be analytically solved by writing mathematical expressions in terms of functions involving dependent variables. The simplest problems have simple functions between dependent variables with a single equation. There can be situations where multiple equations are needed to explain a physical behavior. In the case of multiple equations being solved, the theory of matrices comes in handy.

Suppose the following equations define the physical behavior of a system:
$$-x+3y=4$$
(10.1)
$$2x-4y=-3$$
(10.2)
This system of two equations can be represented by a matrix equation, as follows:
$$\left[\begin{array}{cc}-1& 3\\ 2& -4\end{array}\right]\left[\begin{array}{c}x\\ y\end{array}\right]=\left[\begin{array}{c}4\\ -3\end{array}\right]$$

Now, using matrix algebra, the values of variables x and y can be found such that they satisfy both equations. These values are called the solution (or roots) of the system. The solution is a point in 2-D space (because we had two unknowns) where, according to the model, the system will find stability for that physical problem. In this way, we can predict the behavior of the system without actually doing an experiment.
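As a small worked illustration, Equations 10.1 and 10.2 can be solved with Cramer’s rule using only the standard library. The function name `solve_2x2` is a hypothetical helper introduced here; for larger systems, a library routine such as `numpy.linalg.solve` would normally be the more practical choice.

```python
from fractions import Fraction


def solve_2x2(a, b):
    """Solve a 2x2 linear system a * [x, y] = b with Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular; no unique solution exists")
    # Replace one column at a time by b and divide by the determinant
    x = Fraction(b[0] * a22 - a12 * b[1], det)
    y = Fraction(a11 * b[1] - b[0] * a21, det)
    return x, y


# Coefficients and right-hand side of Equations 10.1 and 10.2
A = [[-1, 3], [2, -4]]
b = [4, -3]

x, y = solve_2x2(A, b)
print("x =", x, "y =", y)

# Substitute back into both equations as a check
assert -x + 3 * y == 4
assert 2 * x - 4 * y == -3
```

Exact fractions are used here only so that the check by substitution holds without rounding error; the solution is x = 7/2 and y = 5/2.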

The mathematical concepts of differentiation and integration become very important when we work with a dynamic system. When the system is constantly changing the values of its variables to produce a scenario, it is important to know the rate of change of these variables. When a quantity depends on a single independent variable, we use ordinary derivatives to define its rate of change. When it depends on several independent variables, we use partial derivatives.
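On a computer, a rate of change is usually estimated from function values at nearby points. The following sketch uses a central difference, one common choice among several, to approximate the derivative of sin(x) and compares it with the known result cos(x). The step size `h` is an assumed value picked for illustration.

```python
import math


def central_difference(f, x, h=1e-5):
    """Approximate df/dx at x from two nearby function values."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 1.0
approx = central_difference(math.sin, x0)
exact = math.cos(x0)
print(f"approximate: {approx:.10f}, exact: {exact:.10f}")
```

Making `h` smaller reduces the truncation error only up to a point, after which floating-point round-off starts to dominate.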

For example, Newton’s second law of motion says that the rate of change of velocity of an object is directly proportional to the force applied on it. We can show this concept mathematically:
$$F\propto \frac{dv}{dt}$$
(10.3)
where v is the velocity of the object and t is time. The proportionality is turned into an equality by introducing a constant of proportionality m (the mass of the object) such that the following is true:
$$F=m\times \frac{dv}{dt}$$
(10.4)

If we know the values or expressions for F, this equation can be solved analytically and solutions can be found for this equation. However, in some cases, the analytical solution may be too difficult to obtain. In such cases, we digitize the system and find a numerical solution.

There are many methods to digitize and numerically solve a given function. A program that implements a particular method to solve a function numerically is called a solver. Many solvers exist. The choice of solver is critical to successfully obtaining a solution. For example, Equation 10.4 is a differential equation. It is a first-order ordinary differential equation. A number of solvers exist to solve it, including the Euler and Runge-Kutta methods. The choice of a particular solver depends on the accuracy of its solution, the time taken to obtain a solution, and the amount of memory used during the process. The latter is important where memory is not a freely expendable commodity, as when using microcomputers with limited memory storage.
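The trade-off between solvers can be sketched for Newton’s second law with a drag force F = -kv, for which the analytical solution v(t) = v0 exp(-kt/m) is known. The values of the mass, drag constant, initial velocity, and end time below are assumptions chosen only for illustration, and the function names are hypothetical. The sketch compares the Euler method with the classical fourth-order Runge-Kutta method for several step counts.

```python
import math


def euler_step(f, t, v, h):
    """One step of the (first-order) Euler method."""
    return v + h * f(t, v)


def rk4_step(f, t, v, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    k1 = f(t, v)
    k2 = f(t + h / 2, v + h * k1 / 2)
    k3 = f(t + h / 2, v + h * k2 / 2)
    k4 = f(t + h, v + h * k3)
    return v + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def integrate(step, f, v0, t_end, n_steps):
    """Advance v from t = 0 to t = t_end using n_steps equal steps."""
    h = t_end / n_steps
    t, v = 0.0, v0
    for _ in range(n_steps):
        v = step(f, t, v, h)
        t += h
    return v


# Assumed illustrative values (not taken from the text)
MASS = 2.0    # m
DRAG = 0.5    # k in F = -k * v
V0 = 10.0     # initial velocity
T_END = 4.0   # end time


def acceleration(t, v):
    """dv/dt = F / m with the drag force F = -k * v."""
    return -DRAG * v / MASS


exact = V0 * math.exp(-DRAG * T_END / MASS)
errors = {}
for n in (10, 100, 1000):
    errors["euler", n] = abs(integrate(euler_step, acceleration, V0, T_END, n) - exact)
    errors["rk4", n] = abs(integrate(rk4_step, acceleration, V0, T_END, n) - exact)
    print(f"n = {n:5d}  Euler error = {errors['euler', n]:.2e}  RK4 error = {errors['rk4', n]:.2e}")
```

The Euler error shrinks roughly in proportion to the step size, while the Runge-Kutta method reaches a much smaller error with far fewer steps, at the cost of four function evaluations per step.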

The advantage of using Python to perform a numerical computation lies in the fact that it has a very rich library of modules to perform the various tasks required. The predefined functions have been optimized for speed and accuracy (in some cases, the desired accuracy can be specified). This enables the user to rapidly prototype the problem instead of concentrating on writing functions to do basic tasks and optimizing them for speed, accuracy, and memory usage.

## 10.4 Python Packages

A number of packages exist to perform numerical computation in particular scientific domains. The web site  gives a list of packages. Packages can be installed by writing the command

on the Linux command line. Users are encouraged to check out the following packages for scientific computation:
• numpy: For numerical computation

• scipy: Builds on numpy and adds routines for scientific computing (for example, integration, optimization, linear algebra, and signal processing) beyond the general mathematics in numpy

• scikit-learn: Machine learning

• tensorflow: Machine learning

• pandas: Statistical data analytics

• scikit-image: Image processing

• bokeh: Interactive plotting
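Before starting a project, it can be useful to check which of these packages are already available in the current Python environment. The following sketch does this with the standard library only. The mapping from package names to import names reflects the names commonly used (for example, scikit-learn is imported as `sklearn`), and `installed_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Package name on the left, name used in an import statement on the right
IMPORT_NAMES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "tensorflow": "tensorflow",
    "pandas": "pandas",
    "scikit-image": "skimage",
    "bokeh": "bokeh",
}


def installed_packages(import_names):
    """Report, for each package, whether its top-level module can be found."""
    return {
        package: find_spec(module) is not None
        for package, module in import_names.items()
    }


status = installed_packages(IMPORT_NAMES)
for package, available in status.items():
    print(f"{package:14s} {'available' if available else 'not installed'}")
```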

## 10.5 Python for Science and Engineering

Computers are used in both theoretical as well as experimental studies in science and engineering. In theoretical studies, computers are mainly used for solving problems where actions are iteratively performed for a large set of similar or different data points within a model of a real-world problem. Experimental investigations utilize computers for instrumentation and control. Hence, an ideal programming language for scientific investigation must perform these tasks in an efficient manner. Efficiency here encompasses the following:
• Ability to write the problem in simple, intuitive, and minimalistic syntax

• Minimum time of execution

• Ability to store and retrieve large amounts of data in an error-free manner

• Ability to process data using a parallel processing paradigm

• Ability to handle a wide variety of datasets

• Object-oriented programming

• Wide graphics capability

• Architecture independence

• Instrumentation and control system for a variety of platforms

• Networking

• Security

• Less time for prototyping a problem

Let’s evaluate Python on these parameters to judge its usage for various scientific tasks. This exercise will give important clues to users before using Python for a scientific problem.

## 10.6 Prototyping a Problem

Python is an interpreted language. Interpretation removes the separate step of compiling the whole program to machine code before it can run. The interpreter reads the code and executes it statement by statement (the standard implementation first translates it to an intermediate bytecode rather than to machine code). Interpreted languages have great advantages when prototyping a problem, but they are generally slower in execution than compiled languages.
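In the reference implementation of Python (CPython, assumed here), the source text is first translated into bytecode, which the interpreter then executes. The following sketch makes this visible with the standard `dis` module. The small source string is a made-up example, and the exact bytecode listing differs between Python versions.

```python
import dis

# A tiny made-up program held as a string
source = "total = 0\nfor value in (1, 2, 3):\n    total += value\n"

# Translate the source into a code object containing bytecode
code = compile(source, "<sketch>", "exec")

# Show the bytecode instructions the interpreter will execute
dis.dis(code)

# Execute the code object in a fresh namespace and inspect the result
namespace = {}
exec(code, namespace)
print("total =", namespace["total"])
```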

### 10.6.1 What Is Prototyping?

Prototyping a problem involves formulating a mathematical model for a real-life problem and then coding the mathematical model using the syntax of a particular programming language. Mathematical modeling involves the following:
• Formulating variables

• Marking them dependent and independent in nature

• Devising functional relationships between them by assigning known mathematical functions as assumptions

Once a mathematical model has been devised, it needs to be tested using a computer program to produce output, which is then analyzed in terms of meaningful predictions and/or conclusions for the real-life problem under study. Prototyping involves repeating the process of making models and testing them several times to assess the feasibility of a particular model before choosing it and building a complete model. At this stage of prototyping, the user may choose to ignore other efficiency-related parameters, like faster execution.

The act of converting a mathematical model to a computer program involves expressing mathematical architecture in terms of syntax of the programming language. This can be done part-wise for a model or as a whole. Part-wise modeling would require compatibility of various parts. Python provides a series of advantages in this regard.

### 10.6.2 Python for Fast Prototyping

The advantage of an interpreted architecture is a shorter debugging time. As the code is interpreted line by line, it runs fine until it encounters a problem. This helps in identifying and isolating the problematic part of the code, thus enabling faster debugging. The overall result is a dramatic reduction in the time needed to prototype a solution, as most of the time for devising a solution is spent in debugging. In addition, Python is designed around a simple, intuitive, and minimalistic syntax, which further accelerates the prototyping process, saving time for the people involved in solving the scientific problem. The ability to easily visualize problems with powerful graphics libraries such as matplotlib and mayavi adds value to the coding process because mistakes can be found more easily and results can be presented in a better manner.
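The following sketch illustrates this behavior. A small made-up prototype, held as a string, contains a division by zero on its third line. The statements before it execute normally, and the traceback reports exactly where execution stopped. The names `PROTOTYPE` and `run_prototype` are hypothetical.

```python
import traceback

# Made-up prototype code with a fault on its third line
PROTOTYPE = """\
samples = [2.0, 4.0, 0.0]
inverse_first = 1.0 / samples[0]
inverse_last = 1.0 / samples[2]
total = inverse_first + inverse_last
"""


def run_prototype(source):
    """Run source; return its namespace, the failing line (if any), and the error."""
    namespace = {}
    try:
        exec(compile(source, "<prototype>", "exec"), namespace)
    except Exception as exc:
        # The last traceback entry belongs to the prototype code itself
        failing_line = traceback.extract_tb(exc.__traceback__)[-1].lineno
        return namespace, failing_line, exc
    return namespace, None, None


namespace, failing_line, error = run_prototype(PROTOTYPE)
print("stopped at line", failing_line, "with", repr(error))
print("values computed before the fault:", namespace["inverse_first"])
```

Because the earlier results are still available in the namespace, the faulty statement can be examined in isolation without rerunning everything that came before it.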

Being open source and modular in structure, Python makes it faster to develop a model part by part, because existing code can be reused instead of reinventing the wheel by writing it again. Within the same major version of Python (Python 2 or Python 3), modules are generally compatible. Python also allows code written in some other languages to be used from within a Python program. Using the Cython package, you can call C code from a Python program. Similarly, using Jython, an implementation of Python that runs on the Java platform, you can use Java code from a Python program. This allows programmers to choose various programming languages as per their abilities and still develop their model in Python, taking advantage of what their programming languages do not offer. This also enables programmers to use legacy code instead of writing it again in Python.

## 10.7 Large Dataset Handling

The h5py package enables developers to handle the HDF5 binary data format, which is widely used to store large amounts of data efficiently. Numerical computation using Python packages like numpy and scipy can then be applied to these data in a vectorized manner. The array-based computing paradigm used by numpy and scipy becomes advantageous here, since a large dataset accessed through h5py can be operated on as if it were an array. Thousands of datasets can be categorized, tagged, and then saved in a single file.

In the age of the Internet, users might like to fetch data from database servers, perform computation on a smaller chunk of data at a time, and send results back to application servers for further processing and report generation. For this purpose, Python provides a variety of packages. Python has a standard mechanism for accessing databases called the Database API (DB-API). The Python DB-API specifies a way to connect to databases and issue commands to them. It fits nicely into existing Python code and allows Python programmers to easily store and retrieve data from databases.

DB-API includes the following:
• Connections that encompass the guidelines about connecting to databases

• Executing statements and stored procedures to query, update, insert, and delete data with cursors

  • A cursor is a Python object that points to a particular location in the database.

  • Once a cursor is obtained, various operations like inserting, updating, and deleting data as well as querying data can be performed.

• Transactions with support for committing or rolling back a transaction

  • A transaction is a sequence of operations performed as a single logical unit of work having four distinct properties (ACID: Atomicity, Consistency, Isolation, and Durability).

  • The possibility to roll back a transaction is crucial for securing very important data.

• Examining metadata on the database module as well as on a database and table structure

  • Metadata associated with a database describes the features of that database.

  • The ability to access metadata allows developers to use the database judiciously.

• Defining the types of errors and providing exceptions using Python

DB-API modules are available for a large list of databases, which enables developers to interact with multiple databases within a single program.
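The elements listed above can be sketched with the `sqlite3` module from the standard library, which follows the DB-API. The table name, column names, and values are made up for illustration, and an in-memory database is used so that no file is created.

```python
import sqlite3

# Connection: here to a temporary in-memory database
connection = sqlite3.connect(":memory:")

# Cursor: used to execute statements and fetch results
cursor = connection.cursor()
cursor.execute("CREATE TABLE readings (sensor TEXT PRIMARY KEY, value REAL)")
cursor.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("t1", 21.5), ("t2", 19.0)],
)
connection.commit()  # make the inserted rows permanent

# Transaction: the second insert violates the primary key,
# so the whole unit of work is rolled back
try:
    cursor.execute("INSERT INTO readings VALUES (?, ?)", ("t3", 30.0))
    cursor.execute("INSERT INTO readings VALUES (?, ?)", ("t1", 99.9))
    connection.commit()
except sqlite3.IntegrityError as exc:  # errors are reported as Python exceptions
    connection.rollback()
    print("rolled back:", exc)

# Query the data and examine metadata about the result columns
cursor.execute("SELECT sensor, value FROM readings ORDER BY sensor")
rows = cursor.fetchall()
columns = [column[0] for column in cursor.description]
connection.close()

print(columns)
print(rows)
```

Because the failed transaction was rolled back as a whole, the row for the made-up sensor `t3` is not stored either; only the two committed rows remain.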

Specialized packages like pandas are designed to perform vectorized operations on large datasets in an efficient manner. These are used extensively nowadays in the field of big data. Most often, statistical analysis is needed for a dataset. The pandas package provides most of the statistical functions required for basic as well as advanced statistical analysis. Coupled with simple plotting libraries like matplotlib, it allows quick verification of an analysis and the production of publication-quality graphs, which is a great help to developers.

## 10.8 Instrumentation and Control

Python cannot connect to hardware directly unless the hardware runs an operating system (OS) on which Python is installed. There are microcomputers like the Raspberry Pi (RPi) that do this. On such devices, Python code can interact directly with the connectors to which sensors and actuators are attached. Since Python is an open source language, running it on open source hardware gives a lot of advantages to developers, since ready-made solutions are available in most cases. This reduces the total development time drastically. An RPi running a Linux OS supports Python through the RPi.GPIO library, which performs read and write operations at the GPIO (general-purpose input/output) pins. Thus, using Python code, developers can read electrical signals from sensors connected to the RPi. Developers can also design complicated electrical systems with actuators like motors connected to GPIO pins and program them in Python. In most cases, scientific instruments work in a feedback loop configuration where sensors read some physical parameters, and these values are used to drive actuators to perform tasks. Python can perform this task with ease. With this whole package being available under open source licenses, developers are free to reconfigure it in any way desired. This enables scientists to develop customized equipment as needed for their experiments.
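The feedback loop configuration can be sketched without any hardware. In the following simulation, the class `SimulatedHeater` is a hypothetical stand-in for a temperature sensor and a heater attached to GPIO pins; on a real RPi, its two methods would be replaced by calls to a GPIO library. The plant model, the gain, and the setpoint are assumed values chosen for illustration.

```python
class SimulatedHeater:
    """Hypothetical stand-in for a sensor and an actuator on GPIO pins."""

    def __init__(self, temperature=20.0, ambient=20.0):
        self.temperature = temperature
        self.ambient = ambient

    def read_sensor(self):
        """Return the current temperature (a real sensor would be read here)."""
        return self.temperature

    def drive_actuator(self, power):
        """Apply heater power in [0, 1] for one time step of a toy model."""
        heating = 0.5 * power
        cooling = 0.1 * (self.temperature - self.ambient)
        self.temperature += heating - cooling


def control_loop(plant, setpoint, gain=0.8, steps=50):
    """Proportional control: drive the actuator according to the error."""
    for _ in range(steps):
        error = setpoint - plant.read_sensor()
        power = min(max(gain * error, 0.0), 1.0)  # clamp to the valid range
        plant.drive_actuator(power)
    return plant.read_sensor()


heater = SimulatedHeater()
final_temperature = control_loop(heater, setpoint=22.0)
print(f"temperature after control: {final_temperature:.3f}")
```

With these assumed values, the temperature settles at about 21.6, slightly below the setpoint of 22. Such a steady-state offset is typical of purely proportional control, and more elaborate controllers are used when it must be removed.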

In the case of hardware not running an OS, Python cannot directly access the underlying hardware. It cannot interface directly with the software library modules provided by most hardware vendors either. In these cases, Python code can be written to tap communication at a serial port or at a USB device that provides what is referred to as a virtual serial port. Developers then have two options: writing a C extension in the form of a DLL (dynamic-link library) or using the ctypes library, which provides methods to directly access functions in an external DLL. If a DLL is already available from a vendor, ctypes can access its functionality from within Python code.
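Since a vendor DLL is specific to each device, the following sketch uses the C standard library as a stand-in to show how ctypes calls a function in a shared library. It assumes a POSIX system such as Linux; on Windows, the vendor's DLL would be loaded by its path instead. The helper name `load_c_library` is hypothetical.

```python
import ctypes
import ctypes.util


def load_c_library():
    """Load the C standard library (assumes a POSIX system)."""
    name = ctypes.util.find_library("c")
    # CDLL(None) falls back to the symbols already loaded into the process
    return ctypes.CDLL(name) if name else ctypes.CDLL(None)


libc = load_c_library()

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"sensor")
print("strlen returned", length)
```

Declaring `argtypes` and `restype` is optional, but it lets ctypes check the arguments and convert the return value correctly, which matters once functions take or return anything other than plain integers.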

Another great module is PyVISA. PyVISA is a Python package that enables developers to control all kinds of measurement devices independently of the interface (for example, GPIB, RS232, USB, and Ethernet). This is a great relief for complicated machines, since a number of different protocols are used to send data over many different interfaces and bus systems. Sometimes the programming language that a developer wishes to use might not have libraries that support both the device and its bus system. To address this, the Virtual Instrument Software Architecture (VISA) was devised. VISA is a standard for configuring, programming, and troubleshooting instrumentation systems comprising GPIB, VXI, PXI, Serial, Ethernet, and/or USB interfaces.

In any case, the responsibility of understanding the signal output from electronic devices as well as managing data flow lies with developers. Such developers must have sound knowledge of electronics, data communication, and Python programming as well as C/C++ programming. Developers must have a clear understanding of transmission media and its limits in terms of data transfer rates, signal configuration, and data blocks. They must also understand the type of connection, whether it be serial or parallel in nature.

## 10.9 Parallel Processing

As opposed to a single task being performed at a time in a serial processing paradigm, parallel processing devises ways to perform two or more tasks at the same time. Loosely speaking, each task is a unit of computation carried out on a processor. Single-core processors can only perform serial processing in the traditional sense. Multicore processors are affordable and can be found in most hardware nowadays, where the CPU has more than one core. Python provides a number of modules that enable multicore processing.

Two important concepts make up most of the parallel programming framework: threads and processes. A process, in the simplest terms, is an executing program. One or more threads run within a process. A thread is the basic unit to which the operating system allocates processor time. Two approaches can be employed in parallel programming:

• Running code via threads

• Running code via multiple processes

In the first approach, a number of “jobs” are submitted to different threads. Jobs can be considered as “subtasks” of a single process. Threads usually have access to the same memory areas (in other words, shared memory). This approach requires proper synchronization to avoid conflicts. For example, if two threads wish to write to the same memory location at the same time, the conflict will result in errors. Hence, a safer approach is to submit multiple processes that work on completely separate memory locations (that is, distributed memory). In this scenario, each process runs completely independently of the others. The multiprocessing module ( https://docs.python.org/2/library/multiprocessing.html ) provides a simple way to allocate processes to parts of the code that perform similar tasks so that they can be performed in a parallel fashion. For example, sorting operations on large datasets can be performed in a parallel fashion to reduce time. Similar mathematical calculations on different data points can be performed in a parallel fashion.
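The sorting example can be sketched with the multiprocessing module as follows. The data are split into chunks, each chunk is sorted in a separate process, and the sorted chunks are merged. The helper names `split` and `parallel_sort` are hypothetical. The sketch prefers the "fork" start method where the platform offers it, which is an assumption made so that the code also runs when pasted into a plain script; other platforms fall back to their default.

```python
import heapq
import multiprocessing
import random


def split(data, parts):
    """Cut data into at most `parts` chunks of nearly equal size."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sort(data, workers=2):
    """Sort chunks of data in separate processes, then merge the results."""
    if len(data) < 2:
        return list(data)
    # Assumption: use "fork" if available, else the platform default
    methods = multiprocessing.get_all_start_methods()
    context = multiprocessing.get_context("fork" if "fork" in methods else None)
    chunks = split(data, workers)
    with context.Pool(processes=workers) as pool:
        sorted_chunks = pool.map(sorted, chunks)  # one chunk per task
    return list(heapq.merge(*sorted_chunks))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    values = [rng.random() for _ in range(10_000)]
    result = parallel_sort(values, workers=4)
    print("sorted correctly:", result == sorted(values))
```

For small inputs, the cost of starting processes and copying data between them outweighs the gain, so this approach pays off only when each chunk involves a substantial amount of work.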

## 10.10 Summary

The various features of Python that we have discussed suggest that it is a worthy candidate for an all-in-one solution for scientific tasks. Since Python is open source, the vast number of libraries helps to reduce development times and costs. The ability to connect to a large variety of databases, to hardware of varied configurations, and to sources on the Internet makes Python a favored option for scientific computation. In this chapter, we have not illustrated the usage of individual modules for numerical computing and scientific work in general; instead, we have covered the various facilities in broad terms, owing to the limited scope of one book.

In this book, we have illustrated the usage of the Python 3 programming language for beginners, specifically targeting engineers and scientists. Almost all branches of science and engineering require numerical computation. Python is one option for performing numerical computation. Python has a library of optimized functions for general computation. Also, it has a variety of packages to perform specialized jobs. This makes it an ideal choice for prototyping a numerical computation problem efficiently. Moreover, it has thousands of libraries for specific scientific tasks, both software- and hardware-oriented. For this reason, Python is being taught at many universities to students of engineering and science. The community of developers is growing rapidly. In the near future, don’t be surprised if Python takes over the world!