Python Basics

Updating a Spreadsheet

Your browser needs to be JavaScript capable to view this video

Try reloading this page, or reviewing your browser settings

This video segment shows how to work with spreadsheets in Python, how to open, read and write spreadsheet data.

Keywords

  • Python basics
  • spreadsheet
  • reading a spreadsheet
  • reading cell values
  • adding cells
  • saving

About this video

Author(s)
Coen de Groot
First online
17 March 2020
DOI
https://doi.org/10.1007/978-1-4842-5831-6_20
Online ISBN
978-1-4842-5831-6
Publisher
Apress
Copyright information
© Coen de Groot 2020

Video Transcript

Now that you have seen the basics of Python, let’s do something practical. We will update the spreadsheet, adding a new column based on another column. You could do this in your spreadsheet program, of course. But what if you had dozens or hundreds of spreadsheets to update?

Or maybe you want to take some data from a website and add to a spreadsheet daily. That may be fun for a while, but after a few days or weeks you may want to automate this.

Openpyxl is a powerful library which lets you use spreadsheets based on the Excel.xlsx format. This includes Excel and LibreOffice spreadsheets. This is not part of Python’s standard library. It doesn’t automatically come with Python when you install Python. Instead, it is one of the more than 200,000 packages listed in the Python Package Index at pypi.org.

To install it, run pip install openpyxl from a terminal or console window. We start by importing the openpyxl library so we can use it. We have a simple spreadsheet with a few columns. It lists some tall buildings from all around the world with a height in meters. We load the spreadsheet using the Load Workbook function.

The spreadsheet acts like a dictionary. Each sheet is one item in the dictionary. The name of the sheets are the keys. The sheets themselves– the values.

Each sheet acts like another dictionary. To access a single cell, use its column name and row number as the dictionary key. This gives us the whole cell, the value, any formula, formatting, et cetera. To get the value, use .value.

As you can see, cell A1 contains the text name. Next, I’d like to loop over all rows and show the data. That is, rows 2, 3, 4, and 5.

However, we can’t use numbers to access the cells directly. Let’s say we have a row number as an integer. How do we use the row number to get our cell? This doesn’t work. This also fails because we can’t add a string and a number.

Python doesn’t know how to handle this. This is the simplest way based on what you’ve already seen. We convert the row number to a string, and then concatenate the two strings.

This is another way of doing it. This uses something called F strings. This was introduced late 2016 in Python 3.6. Any variable name or other Python code between the curly brackets gets replaced by its value and converted to a string, if necessary.

We don’t have time to cover this in detail, but we will use it in this example, because I think it is the best way of doing this. Here are both methods again for comparison.

Now let’s list all rows. For each of the data rows, so for rows 2, 3, 4, and 5, get the name, country, and height in meters, and print it. This shows us the values of the cells in our spreadsheet.

Now let’s create a new column for the height in feet. This will be column D. First, we add a header for our new column in cell D1.

Again, we treat our sheet as if it is a dictionary. The key is the cell address, D1. The value is the cell value, height in feet.

Now for the actual height. To convert from meters to feet, we multiply it by 3.3. Note that this is a simplification. The real value is closer to 3.281.

We take the height in meters from column C and multiply it by 3.3, then create a new cell right next to it in column D. Let’s see what this looks like. To save our spreadsheet, we use the Save function and give it a file name.

Here is the result. As you can see, there’s now an extra column D with a height in feet. There is a lot more you can do with spreadsheets. If this is something you’re interested in, I suggest you check out the openpyxl documentation.

Just do an online search for openpyxl. Let’s see what else we can do with Python.