Excel tips | Xiamen Data Journalism
Terminology
An Excel file is called a workbook.
A workbook can have one or more worksheets.
Each worksheet is a list, with columns and rows of information. We often call this a table, a spreadsheet or a
data set.
Each box in a worksheet is called a cell. Every cell has an address, like A1 (Column A, Row 1 -- the top left
corner of the worksheet).
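To make the addressing scheme concrete, here is a minimal Python sketch that splits an address such as A1 into a column number and a row number. The function name `parse_cell_address` is made up for this illustration; it is not part of Excel.

```python
import re


def parse_cell_address(address):
    """Split an address such as 'A1' into (column_number, row_number), both counted from 1."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not a cell address: {address!r}")
    letters, digits = match.groups()
    column = 0
    for letter in letters.upper():
        # Column letters behave like base-26 digits: A=1 ... Z=26, then AA=27, and so on.
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return column, int(digits)


print(parse_cell_address("A1"))    # (1, 1): Column A, Row 1
print(parse_cell_address("B236"))  # (2, 236): Column B, Row 236
```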
A table usually has one row containing the labels for each column. We often call this the "header row." The
header row in this example is on Row 1:
Before you work with data, always make a copy of the worksheet. Do this by right-clicking on the tab and
selecting copy:
Then rename your worksheets, so you will know which sheet is the original data, and which sheet is your copy:
When you have a big table with a lot of data, it’s easy to get lost. It’s a good idea to “hide” the columns you are
not interested in – like Column A (the country code). To hide a column, select the entire column by clicking on the
A. Then, right-click and select “hide”:
Now Column A is no longer taking up space – but it still exists; it hasn’t been deleted:
But look what happens when you scroll down on this table:
The header row disappears! And that’s bad (and dangerous), because we don’t know if Column C is for 2014 or
for 2013. So, let’s lock in place the header row. This is called “freeze panes.” Put your cursor in Cell A2, and
then select …
Now when you scroll down, you’ll always see what is in each column:
-------------------------------------------------------------------------------------------------------------------------------
OK – now we’re ready to really do some analysis! First, let’s make sure we understand what is in this
spreadsheet. It looks like each row is one country, and the columns show how many patents the inventors
from that country received each year. Let’s scroll to the bottom of the spreadsheet. (You can get there quickly
by pressing the Ctrl key and the End key at the same time.)
The last row of data is for the country Zimbabwe. This table doesn’t have any totals for all the patents issued
to all the inventors in all of the countries. Let’s add a row for “Global totals”. Type the row name in B236. Then
click on …
Do the “SUM” for the other years. Now your spreadsheet looks like this:
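For readers who like to see the arithmetic spelled out, here is a rough Python sketch of what the “Global totals” row does: it adds up every country's number, one year column at a time. The three countries and their figures are made up for illustration and are not taken from the worksheet; the name `global_totals` is also just an assumption for this example.

```python
# Made-up figures for illustration only; they are not taken from the worksheet.
rows = [
    {"country": "Country A", "2013": 120, "2014": 150, "2015": 180, "2016": 60},
    {"country": "Country B", "2013": 40, "2014": 35, "2015": 50, "2016": 20},
    {"country": "Country C", "2013": 0, "2014": 2, "2015": 5, "2016": 1},
]
YEARS = ["2013", "2014", "2015", "2016"]


def global_totals(rows, years):
    """Add up each year's column across all rows, like a SUM placed under each column."""
    totals = {"country": "Global totals"}
    for year in years:
        totals[year] = sum(row[year] for row in rows)
    return totals


totals = global_totals(rows, YEARS)
print(totals)
```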
Let’s sort all of the countries to see which ones got the most patents in 2015. First, select the header row (Row
1) by clicking on the “1”:
Then, scroll to the bottom of the spreadsheet so that you can see the last rows of data. Hold your Shift key
down and select Row 235 (by clicking on the “235”):
Now you have selected the data you want to sort – everything except the row that says “Global totals”
(because that row is not a country).
Then, on the Menu bar, go to Data and Sort:
Let’s tell Excel to sort the data by the “Year 2015” column … and to put the results in order from the highest
number to the lowest number:
Here’s what we get:
So Chinese inventors ranked No. 5 among all foreign countries in 2015. Not bad!
Now sort the data for 2014 … and 2013. How is China doing so far in 2016? Do you see a pattern regarding
China?
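The same sort can be sketched in a few lines of Python. The figures below are invented for illustration, and the totals row is simply left out of the list, just as it was left out of the selection in Excel.

```python
# Made-up figures for illustration only; the "Global totals" row is deliberately not included.
country_rows = [
    {"country": "Country A", "2015": 180},
    {"country": "Country B", "2015": 50},
    {"country": "Country C", "2015": 410},
]

# Sort by the 2015 column, highest number first.
ranked = sorted(country_rows, key=lambda row: row["2015"], reverse=True)

for rank, row in enumerate(ranked, start=1):
    print(rank, row["country"], row["2015"])
```

To rank by another year, only the column name in the `key` function needs to change.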
-------------------------------------------------------------------------------------------------------------------------------
Now let’s add up the total number of patents each country got for all four years (2013 through 2016). First, in
Cell G1, type a column heading, such as “Total, 2013-16.” Then, in Cell G2, type
=sum(C2:F2)
(Instead of typing “C2”, you can click on cell C2. In other words, you can build the formula by clicking on the
cells or by typing the cell address.)
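As a rough equivalent outside Excel, the Python sketch below adds a total to every row by summing that row's four year values, which is what the formula does once it is filled in for each row. The data is made up for illustration, and `add_row_totals` is a hypothetical helper name.

```python
# Made-up figures for illustration only; they are not taken from the worksheet.
rows = [
    {"country": "Country A", "2013": 120, "2014": 150, "2015": 180, "2016": 60},
    {"country": "Country B", "2013": 40, "2014": 35, "2015": 50, "2016": 20},
]
YEARS = ["2013", "2014", "2015", "2016"]


def add_row_totals(rows, years, label="Total, 2013-16"):
    """Give every row a new column holding the sum of its year columns."""
    for row in rows:
        row[label] = sum(row[year] for year in years)
    return rows


add_row_totals(rows, YEARS)
for row in rows:
    print(row["country"], row["Total, 2013-16"])
```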
When you hit “Enter,” you’ll see the results of the formula:
Now, click on the cell containing the formula. You’ll see a black dot at the bottom right corner of the cell. This
dot is called the AutoFill handle. If you put your cursor on this dot, your cursor changes from a big white
cross …
… to a thin black cross:
When that happens, double-click, and Excel will copy your formula down the entire column:
Let’s do another formula. Let’s calculate the percentage change in patents for each country from 2013 to 2015.
Type a column header in Cell H1:
In our formula, we need to take the “new number” (2015) … subtract the “old number” (2013) … and then
divide by the “old number” (2013). We must tell Excel to do the subtraction first. So in Cell H2 (the formula for
Row 2), our formula will be:
=(F2-E2)/E2
Press Enter, and you’ll see the result. But the result is not formatted as a percent yet. So click on:
Once you have the number formatted correctly, you can copy the formula down the column:
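The percentage-change arithmetic can also be checked with a small Python sketch. It follows the same order of operations (subtract first, then divide by the old number) and formats the result as a percent. The function names and sample numbers are assumptions made for this illustration, and the sketch returns no value when the old number is zero because the division is undefined in that case.

```python
def percent_change(old, new):
    """Return (new - old) / old as a fraction, or None when old is zero."""
    if old == 0:
        return None  # division by zero is undefined, so there is no answer to report
    return (new - old) / old


def as_percent(value):
    """Format a fraction such as 0.25 as '25.0%'."""
    return "n/a" if value is None else f"{value:.1%}"


# Made-up numbers for illustration only.
print(as_percent(percent_change(200, 250)))  # 25.0%
print(as_percent(percent_change(80, 60)))    # -25.0%
print(as_percent(percent_change(0, 5)))      # n/a
```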
Now let’s sort all of the countries by the “percentage change” column. First, select the data you want to sort
(from Row 1 through Row 235) … then go to Data > Sort.
As you can see, we have some error messages; I’m sure you can figure out why!
Data journalists warn about the “law of small numbers”: You can’t draw important conclusions from small
numbers. So let’s look only at countries that had at least 1,000 patents in 2013. To do this, we must filter our
list. Make sure your cursor is on the header row – Row 1. Then click on this button (a picture of a funnel):
Now you have a drop-down arrow on each column label. Let’s tell Excel to show us only countries in which the
2013 number was greater than or equal to 1,000:
What do you see? Is that newsworthy? How could you turn this information into a news story?
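For comparison, the filter above can be sketched in Python as keeping only the rows whose 2013 value is at least 1,000. The figures are invented for illustration, and `at_least` is a hypothetical helper name. Like the Excel filter, it hides nothing permanently: the original list is left unchanged.

```python
# Made-up figures for illustration only; they are not taken from the worksheet.
rows = [
    {"country": "Country A", "2013": 1500, "2015": 1800},
    {"country": "Country B", "2013": 999, "2015": 2400},
    {"country": "Country C", "2013": 1000, "2015": 900},
]


def at_least(rows, column, minimum):
    """Return only the rows whose value in `column` is greater than or equal to `minimum`."""
    return [row for row in rows if row[column] >= minimum]


big_enough = at_least(rows, "2013", 1000)
for row in big_enough:
    print(row["country"], row["2013"])
```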