Contents
1 Introduction
As an introduction to averages, I decided to give the students a data set to play with. To make it more interesting for them, they were to gather the data themselves. Thus each student provided one set of data, and then the data were amalgamated and distributed amongst the class. I used Google Sheets to gather and share the data.
A brief outline of the activities is as follows:
-
Each student had a spreadsheet in which to record their data.
The actual data they recorded was various measurements of their hands. I managed to find about 40 distinct measurements for them to take.
-
The different sets of data were gathered into a single spreadsheet and processed slightly. They were then randomised and anonymised and shared back with the class.
-
The students then worked on the shared data, using spreadsheet functions to calculate the three different types of average.
Although the spreadsheet has functions that calculate these directly, the students were led to develop them in slightly more detail. For example, instead of using the
AVERAGE
function for the mean, they first calculated the sum, usingSUM
, and the number, usingCOUNT
, and then the division. -
There was an extension activity which was to find extreme values, and then to find those students with at least one extreme value.
The twist here is that there were enough distinct measurements that the probability is that every student would be extreme on at least one of them.
2 Using Sheets
The strategy with using Google Sheets was as follows:
-
Start with a template sheet containing a list of all the measurements that were to be taken.
-
Make a copy of the template for each student and share it with that student.
Note that the sheet would remain owned by me. This made the next step easier, and meant that I could easily correct their data (some recorded it in the wrong column, some included units).
-
Make a master copy of all of the data by pulling in the relevant column from each of the students' sheets.
-
Process, anonymise, and randomise that data.
This involved producing a copy of the data where each student's name was replaced by a generic
Student N
, the measurements were normalised by dividing by hand length, and the order of the measurements was randomised. -
Share the data with the students.
In actual fact, the data set was split in two and half shared with one half of the class and the other half with the other.
Again, a copy of the data was made for each student and shared with them. This meant that each student could work on the data without any other student seeing what they did.
It was important that I retained ownership of all of the documents. Not only did this make the workflow easiest, it also meant that if necessary I could quickly correct someone's work.
The interaction possible in Google Sheets meant that all of the above could be accomplished in such a way that when a student entered more data, the whole data set would update. This meant that those who finished their measurements first were not impeded in starting on the analysis by having only a small data set to work with.
3 The Great Share
The first step is to create a template spreadsheet. The students should record their measurements in a specific column, but it is up to you which column they should use.
If you put lengthy explanations in one column as to the measurements then be aware of the fact that the default setting for word wrapping is "overwrite". This means that if the next cell is empty, the text will overflow into it. To people not familiar with spreadsheets, this can give the erroneous impression that those cells have content in them. Thus it is better to set the word wrapping to "Wrap". To do this, select the column with your explanations, then from the "Format" menu choose "Text wrapping" and select the "Wrap" option.
Once the template is how you want it, you need to share it with your students. For this, you need a list of the students' email address and names. It is best to put this in a spreadsheet. For this particular situation, it is good to have a fresh spreadsheet as it is useful to be able to add more details. The following script assumes that the details are in two columns with names in the first column and emails in the second.
The following script will do the sharing. Note that the ids are the long strings of letters and numbers (and other symbols) in the URL when you open a file or folder in the Google ecosystem.
This script, and the later one, can be downloaded here.
function createMeasurementSheets() {
var studentId = ""; // id of spreadsheet with students' names and emails
var folderId = ""; // id of folder where final spreadsheets should be put
var templateId = ""; // id of template spreadsheet
var sheetName = " Measurements"; // title of spreadsheet (to be appended to the students' name)
var studentsCell = 'A2'; // upper-left cell of student details block
// Spreadsheet with students' usernames
var studentSheet = SpreadsheetApp.openById(studentId);
// Top left cell of student data
var studentUL = studentSheet.getRange(studentsCell);
// Array of student data
var students = studentUL.offset(0,0,studentSheet.getLastRow() - studentUL.getRow()+1,2).getValues();
// Folder where the sheets should end up
var folder = DriveApp.getFolderById(folderId);
// Template spreadsheet
var templateSS = SpreadsheetApp.openById(templateId);
var ss, sfile, root, cell;
for (var i=0; i<students.length; i++) {
// Copy the template spreadsheet, naming it "<Student Name> <sheetName>"
ss = templateSS.copy(students[i][0] + sheetName);
// Get the corresponding file
sfile = DriveApp.getFileById(ss.getId());
// Get the current folder containing the file (this will be the root of your Drive folder)
root = sfile.getParents().next();
// Add the file to the folder where it should be put
folder.addFile(sfile);
// Remove it from the original folder
root.removeFile(sfile);
// Now set the global permissions, make it so that the students can't reshare this file
sfile.setShareableByEditors(false);
// Add the given student as editors
ss.addEditor(students[i][1]);
// We'll store the id in the third column of the student's details
cell = studentUL.offset(i,2,1,1);
cell.setValue(ss.getId());
}
}
The configuration variables are at the start:
-
studentId
This is the id of the spreadsheet holding the students' details. -
folderId
This is the id of the folder where you want to store the created spreadsheets. -
templateId
This is the id of the template spreadsheet. -
sheetName
The created spreadsheets will have a name consisting of the student's name with this string appended to it (no space is added automatically). -
studentsCell
This is the top left cell of the block containing the students' names and emails. This allows you to have a header row, or other details in the spreadsheet.
The script makes a third column in the details spreadsheet with the ids of the created spreadsheets. This is useful for referring to them later.
4 Amalgamating the Data
Amalgamating the data is quite straightforward and depends on a neat feature of Google Sheets: the ability to pull selected data from one sheet into another. This is not a one-time pull, it links the data so that updates to the master sheet are reflected in the copy (the link is only one-way, editing the copy doesn't work - in fact, putting anything in a cell that contains copied data breaks the copying process).
As we already have the ids in the details spreadsheet, it is easiest to work in that. To maintain a clean spreadsheet, it is best to make a new sheet for the data.
-
Click on the
+
in the lower left of the browser window to create a new sheet. -
Right-click (or two-finger tap if using a touchpad) on the new sheet's name to rename it something meaningful.
Actually, it's probably a good idea to name the original sheet something meaningful as well. In the following I'll assume that it is called Class List
.
The key to importing data is the importrange
function. The syntax of this is:
=importrange("spreadsheet id","range to import")
It'll be useful to have the template imported as well to see what the measurements mean.
-
In cell
A1
of the new sheet, type:=importrange("template id","A1:B")
Replace
template id
by the id of the template spreadsheet. In the range, I've typedA1:B
. This effectively means "import the first two columns" as in my template I had two columns of explanations. If only one, useA1:A
, if more then increment theB
as far as you want. -
When you import data from one sheet to another, you need to explicitly authorise the link. After hitting return on the above, it will first say
Loading...
and then#REF!
. Hover the mouse, or click on, the cell. A dialogue will pop-up asking you to authorise access. Click on the blue box to do this.
As I imported two columns, my first free one is the C
column. So to import the data from the first student, I do the following.
-
In cell
C1
of the new sheet, type:=importrange(indirect("'Class List'!R" & column()-2 & "C3",false),"C1:C")
If you are in cell
B1
then in place ofcolumn()-2
you would typecolumn()-1
, if in cellD1
it would becolumn()
, and so on.Also, in this I'm assuming that the ids are in the 3rd column of the
Class List
sheet and start at the first row. If a different column, change the3
in the"C3"
to a different number (theC
here refers to "column" not to theC
column specifically). If a different row, adjust thecolumn()-2
accordingly. Basically,column()
expands to the column number of the given cell (withA
begin and so on).I'm also assuming that the data in the students' sheets is in the third column. If a different column, change the
C
s in theC1:C
part accordingly. -
Now copy that from cell
C1
across the top row as many times as you need. -
As before, you will need to authorise access for each of these imports (unfortunately, there's no way that I know of to automate this).
And there you have your data! Moreover, as students take more measurements it will automatically update.
5 Processing the Data
You may want to process the data before resharing it with the class. One reason for this is to allow the students to focus properly on the task at hand, which is analysing the data. If they can identify whose measurements are whose they might focus more on that.
What I did (remember that my measurements were of their hands) was to divide each measurement by the hand length. Partly this was to normalise it, and partly to anonymise it. I also randomly sorted the measurements (my student list was already in a random order).
To accomplish these ends, I created a third sheet in the spreadsheet where I'd imported all the data. I created a list of random numbers in one column and then the numbers to in the second (I had measurements other than the hand length).
-
In cell
A1
type=rand()
. This will enable us to randomise the data by sorting on this column. -
In cell
B1
type=row()
. This will mean that the teacher can still match the data to its original meaning. -
Copy these down as many rows as you need.
-
Select the region you've just created (including the first row) and copy it.
-
Right-click (or two-finger tap) on cell
A1
and from the menu select "Paste Special" and then "Paste values only". This replaces all of your formulae by their actual values. This will be important when sorting the data.
In each students' measurements, the first one was the hand length and thus the one I wanted to divide all the others by. Unfortunately, not all students recorded this measurement, and so sometimes dividing by it would give the inevitable "divide by zero" error. Also, even if a student did record this measurement, not all students recorded every measurement so some data would be missing, but spreadsheets often treat empty cells as being . In either of those cases, I wanted the corresponding cell to be empty rather than an error or zero. The following formula accomplishes that end:
=if(indirect("Measurements!R" & $B1 & "C" & Column(),false) = 0, "", iferror(indirect("Measurements!R" & $B1 & "C" & Column(),false)/Measurements!C$2,""))
Let's look at that in more detail.
-
indirect("Measurements!R" & $B1 & "C" & Column(),false)
The bit inside this,
"Measurements!R" & $B1 & "C" & Column()
is a reference to the corresponding original measurement. We pick it out using theRC
notation whereinR2C4
is the cell in the nd row and th column. So we pick out the row corresponding to the value inB1
(this makes it robust against reordering) and the column according to the current column. The outerindirect
then gets the cell contents. -
=if(... = 0, "", ...)
This is a conditional. It tests if the cell contents are . If so, it displays the empty string,
""
. If not, it goes on to the next part. -
=iferror(...,"")
The … in this are dividing the cell by the hand length. If that produces an error (i.e., if the hand length is zero) then the surrounding
iferror
catches this and displays just""
. Otherwise it displays the result of the calculation.
Note the dollars at various parts in this. We'll copy this into a block the size of our original data (less one row). Normally, when a formula is copied then it adjusts the cell references within it. The dollars inhibit this for various bits.
At some point, we reorder the data based on the random numbers in column A
. To do this, click on a cell in column A
, and then from the "Data" menu select "Sort sheet by column A, A Z".
6 Sharing the Data
Sharing the data is pretty much like sharing the original template: we make a copy for each student and share that with the individual student. But we don't want to copy the current spreadsheet as that contains the original data as well. So we make a new spreadsheet to share that contains just the processed data. (I actually split my data into two groups at this point.)
-
Create a new spreadsheet.
-
In cell
B2
type:=importrange("id of data spreadsheet","Data!C1:Z")
In this, use the id of the spreadsheet with the data. The
Data!C1:Z
refers to the processed data, hereData
is the name of the sheet containing the processed data andC1:Z
says that it starts in columnC
and goes on to columnZ
.I put this in
B2
so that the first row can be a header row and the first column also a label column. -
Do the "allow access" thing as before.
-
If you want to split the data, do the above for each grouping and only import that part you want for each group.
In the following, I'll assume that we're using multiple sheets to show how that would work. Other than that, the script is very similar to the initial one.
function createDataSheets() {
var studentId = ""; // id of spreadsheet with students' names and emails
var folderId = ""; // id of folder where final spreadsheets should be put
var dataIds = [
"",
""
]; // array of ids of data spreadsheets
var sheetName = " Data"; // title of spreadsheet (to be appended to the students' name)
var studentsCell = 'A2'; // upper-left cell of student details block
// Spreadsheet with students' usernames
var studentSheet = SpreadsheetApp.openById(studentId);
// Top left cell of student data
var studentUL = studentSheet.getRange(studentsCell);
// Array of student data
var students = studentUL.offset(0,0,studentSheet.getLastRow() - studentUL.getRow()+1,2).getValues();
// Folder where the sheets should end up
var folder = DriveApp.getFolderById(folderId);
// Data spreadsheets
var dataSS = [];
for (var i=0; i<dataIds.length; i++) {
dataSS.push(SpreadsheetApp.openById(dataIds[i]));
}
var ss, sfile, root, cell, n;
for (var i=0; i<students.length; i++) {
n=i%dataSS.length;
// Copy the nth data spreadsheet, naming it "<Student Name> <sheetName>"
ss = dataSS[n].copy(students[i][0] + sheetName);
// Get the corresponding file
sfile = DriveApp.getFileById(ss.getId());
// Get the current folder containing the file (this will be the root of your Drive folder)
root = sfile.getParents().next();
// Add the file to the folder where it should be put
folder.addFile(sfile);
// Remove it from the original folder
root.removeFile(sfile);
// Now set the global permissions, make it so that the students can't reshare this file
sfile.setShareableByEditors(false);
// Add the given student as editors
ss.addEditor(students[i][1]);
// We'll store the id in the forth column of the student's details
cell = studentUL.offset(i,3,1,1);
cell.setValue(ss.getId());
}
}
The new configuration detail here is that we have a list of ids in dataIds
instead of the single id in the previous script.
The one irritation about this is that the new sheets have to be individually authorised to access the data. This means opening each one in turn and allowing access. Irritating as this is, it is a small price to pay for the fact that the data is now linked from start to finish and as students input more measurements, so the shared data sheets will update automatically.
7 Calculating the Averages
Although this has been primarily about how to use Google Apps Scripts, I may as well finish off with the instructions given to the students as to how to analyse the data. The aim of this activity was to introduce the students to how to use spreadsheets to calculate things like averages, but also to remind them of how those averages are actually calculated. Thus they aren't instructed to use functions like AVERAGE
directly, but are led through the building blocks of each type of average and only after calculating it do they compare it with the in-built spreadsheet function.
The shared data filled the region B2:M43
, with the first row as a header row.
These instructions can also be downloaded as a PDF.
7.1 Introduction
There are three types of "average" that are used in statistics:
-
Mean: to calculate the mean, you add up all the values and divide by how many there are.
This is useful when the actual values of the data are important.
-
Median: to find the median, you sort the values and pick the middle one. If there is no "middle one", you find the mean of the two middle values.
This is useful if you want to divide your data into two equal groups.
-
Mode: to find the mode, you find the value that appears the most.
This is useful if you want the most common value.
In the following, you will calculate each of these averages for each row of your data. You will do this by calculating them for one row and then copying the formulae to the other rows.
Not all your rows may contain data initially, so you might want to do the initial calculations on a row that has some data. In which case, in all of the following formulae change the 2
for the row that you are actually working with.
7.2 Calculating the Mean
To calculate the mean, we add up the entries in a row and divide by the number of entries in that row.
-
In cell
N1
typeMean
. -
In cell
N2
type=sum(B2:M2)
to calculate the total of the data in the second row. -
In cell
O2
type=count(B2:M2)
to count the number of entries in that row. -
In cell
P2
type=N2/O2
to divide the total by the count. -
In cell
Q2
type=average(B2:M2)
to display what the spreadsheet thinks the average is. -
Copy the formulae from
N2:Q2
into rows3
to43
to compute the means for the other rows.
7.3 Calculating the Median
To calculate the median, we make a copy of each row with the data in order and then pick out the middle value (if there is one, if not the mean of the two middle values).
-
In cell
R1
typeMedian
. -
Click on the
+
at the bottom left of the spreadsheet window to create a new sheet. -
Click on the
Sheet2
tab at the bottom to select it. -
In cell
B2
type=transpose(sort(transpose(Sheet1!B2:M2),1,true))
to make a sorted copy of the second row of data. -
Copy this into cells
B3
toB43
to make a sorted copy of all the rows of data. -
Click on the
Sheet1
tab at the bottom to go back to the first sheet. -
In cell
R2
type (all on one line):
index(Sheet2!B2:M2,O2/2+1))/2)
This picks out the middle value in the list, if the list has an odd number of entries, or the mean of the middle two if even.
-
In cell
S2
type=median(B2:M2)
to display what the spreadsheet thinks the median is. -
Copy the formulae from
R2:S2
into rows3
to43
to compute the medians for the other rows.
7.4 Calculating the Mode
To calculate the mode we first put the entries into "bins" of width . That is, we put all the values that are between, say, and together.
-
In cell
T1
typeMode
. -
Click on the
+
at the bottom left of the spreadsheet window to create a new sheet. -
Click on the
Sheet3
tab at the bottom to select it. -
In cell
A1
type0
. -
In cell
B1
type=A1+.1
and copy this into cellsC1
toU1
. These define the "bins" into which we'll put our data. -
In cell
A2
type=countif(Sheet1!$B2:$M2,"<"&B$1)-countif(Sheet1!$B2:$M2,"<"&A$1)
(note the dollars) and copy this into cellsB2:T43
.This counts how many entries in a given row lie between the value at the top of the column and the value at the top of the next.
-
Click on the
Sheet1
tab at the bottom to go back to the main sheet. -
In cell
T2
type=max(Sheet3!A2:T2)
to find the largest bin size. -
In cell
U2
type=match(T2,Sheet3!A2:T2,0)
to find which bin this occurs in. -
In cell
V2
type=index(Sheet3!A$1:T$1,1,U2)
to find the label on that bin, which is the mode for this data. -
In cell
W2
type=mode(arrayformula(if(isnumber(B2:M2),rounddown(B2:M2,1),)))
to find what the spreadsheet thinks the mode of the data is.
7.5 Finding the Special People
Let's say someone is special if one of their measurements is either the largest or smallest of that measurement.
-
In cell
X1
typeMaximum
and in cellAA1
typeMinimum
. -
In cell
X2
type=max(B2:M2)
and in cellAA2
type=min(B2:M2)
to find the maximum and minimum values in the row. -
In cell
Y2
type=match(X2,B2:M2,0)
to find which column the maximum lies in and in cellAB2
type=match(Y2,B2:M2,0)
. -
In cell
Z2
type=index(B$1:M$1,1,Y2)
to get the corresponding person's "name" and in cellAC2
type
=index(B$1:M$1,1,AB2)
. -
Right-click (or two-finger tap) on cell
B1
and selectConditional Formatting
. -
In the new panel, in the menu
Format cells if...
selectCustom formula is
and in the box underneath type=match(B1,$Z2:$Z43,0) > 0
. ClickDone
. -
Click
Add new rule
and do the same but in the box type=match(B1,$AC2:$AC43,0) > 0
. -
Click again on cell
B1
and useCtrl+C
to copy it. Then select cellsC1:M1
, right-click (two-finger tap), and hover overPaste special
to bring up the second menu from which selectPaste conditional formatting only
. This will mean that the special people are highlighted.
How many people are special?