Plotly tutorial (a beginner exercise)
A plotly “Heatmap” Tutorial: Starting From Scratch
My nighttime decompression time lately has been used to play with data sets to familiarize myself with some of the functionality of plotly. I’ve worked with a handful of sets of data, but I thought I’d choose one and make a quick tutorial. I am no expert. But if you’re beginning like I am, sometimes you want advice from a novice.
So, here’s a sample plotly exercise I had some fun with. I say it’s been fun, but my idea of fun is orthogonal to the plane . . . And no, physics friends, I do not mean “normal” by that reference!
Obtain a data set from somewhere. Preferably a data set that has meaning to you–or you’ll run the risk of not being motivated enough to produce a great graph (it can take some time). I chose MCAT performance data, since the end of the semester has me thinking of pre-med students and their progress toward their application processes. For a heatmap, you’ll want a typical two-variable data set, with some kind of third measure of a magnitude or scalar quantity. I know, that last sentence ended with a redundancy.
Notice the initial score along the y-axis and the change in score along the x-axis. Importantly for a heatmap, the values in the table span a range rather than being binary (1’s or 0’s). This is what makes these data a good candidate for generating a heatmap.
This one’s easy. Copy and paste the source/URL somewhere handy. Why not in an extra tab in the same Excel file used to store the data? Put it somewhere, though. Because if you’re like me, soon you’ll have who knows how many browser tabs open and none of them will have the page you snagged the data from. And of course, you’ll have forgotten to bookmark the page. Link to data.
Enter the data into a spreadsheet. For the table above, I entered it in Excel by hand since there wasn’t much data and I couldn’t copy and paste the data from the pdf. It is possible to enter it directly into a plotly “grid,” but I like having a backup copy of the data on my physical hard drive. If you’re lucky, you can just copy the data from the source and paste it in Excel. That’s what I did for many of my data sets, though I often still had to do some transposing. If you find that pasted data in Excel ends up being all crammed into a single cell, try the “Text to Columns…” command pictured below. It just might save you time!
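If you’d rather script the “everything crammed into one cell” fix than click through Excel, here is a minimal Python sketch of the same idea: split each pasted line into cells, then transpose if the rows and columns came in the wrong way around. The pasted text, the helper names `text_to_columns` and `transpose`, and the numbers are all made up for illustration; they are not the MCAT data.

```python
# Hypothetical pasted text: each line arrived as one long string.
# The numbers are placeholders, not the MCAT data.
pasted = """Initial Minus1 Zero Plus1
20 5 10 15
25 4 12 9"""


def text_to_columns(text, delimiter=None):
    """Split each non-empty line into cells (delimiter=None means any whitespace)."""
    return [line.split(delimiter) for line in text.splitlines() if line.strip()]


def transpose(rows):
    """Swap rows and columns; assumes every row has the same number of cells."""
    return [list(column) for column in zip(*rows)]


rows = text_to_columns(pasted)
columns = transpose(rows)

print(rows)     # one list per pasted line
print(columns)  # one list per original column
```

For comma- or tab-separated text, passing `delimiter=","` or `delimiter="\t"` should do the same job.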
Upload your data to plotly. This is simple, really.
Click on the grid file and hang on! [“Grid” refers to the little four square icon. That’s plotly’s term for “data file.”]
A few quick things right off
- You can share your data table right away if you like by clicking the Share button at the upper right.
- Changing column titles is easy. Just click and type. I found the cursor would occasionally jump off the intended header without warning. Just click on a data cell and then the column header to prevent weird cursor action.
- For some reason, plotly did not like my original column headers -4, -3, -2, . . . +6, +>7. When I tried to plot, strange things happened. That’s why I wrote out “Minus/Plus” for each column.
- plotly may select some columns as x- and y-axes variables for you. Just deselect them by clicking on them if you want to change those.
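Renaming a dozen signed headers by hand gets tedious, so here is a small sketch of how the “Minus/Plus” renaming could be scripted before uploading. The function name `spell_out_header` and the exact wording it produces (for example “Plus over 7” for +>7) are assumptions for illustration, not the labels used in the original grid.

```python
def spell_out_header(header):
    """Turn a signed header such as '-4' or '+6' into words.

    Hypothetical helper: the output wording is an assumption, chosen only
    to avoid headers that start with a sign character.
    """
    header = header.strip()
    if header.startswith("-"):
        prefix, rest = "Minus", header[1:]
    elif header.startswith("+"):
        prefix, rest = "Plus", header[1:]
    else:
        return header  # leave unsigned headers untouched
    rest = rest.replace(">", "over ").strip()
    return f"{prefix} {rest}"


# A few of the headers mentioned above (not the full list).
original_headers = ["-4", "-3", "-2", "+6", "+>7"]
renamed = [spell_out_header(h) for h in original_headers]
print(renamed)
```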
The heart of plotly, of course, is in the “Make A Plot” function. Click this and (for this tutorial) select Heatmaps.
Notice there are lots of options here. It’s easy to switch back and forth. When viewing the grid, just click the Make a Plot function at any time to select a different type of plot.
Play around with which columns to plot along the axes.
- If you’ve been wise in titling your columns, then select “Column Names” for your x-axis.
- I didn’t use the N column. It’s ok to have columns that aren’t plotted. Really. It is. You might do a different plot with the same file later in which you do want to use previously inactive columns.
- The z-axis is what is plotted along a color spectrum emulating “heat” (I hate that term from a pedagogical standpoint, by the way–but that’s another blog entry).
How I did the MCAT plot is below. Your plot, naturally, will be different. So select and deselect away. If the output is undesirable, just close the plot and try different selections.
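For anyone curious how the same x, y, and z choices look outside the web interface, here is a rough sketch that builds a heatmap description as a plain Python dictionary in plotly’s figure format (a `data` list holding a trace of type `heatmap`, plus a `layout`). The labels and numbers are placeholders, not the MCAT values, and `build_heatmap_figure` is a hypothetical helper name.

```python
import json

# Placeholder values only; these are NOT the MCAT numbers.
x_labels = ["Minus 1", "Zero", "Plus 1"]   # column names go along the x-axis
y_labels = ["Low", "Middle", "High"]       # row labels go along the y-axis
z_values = [                               # one row per y label, one value per x label
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]


def build_heatmap_figure(x, y, z, title):
    """Return a figure dictionary with a single heatmap trace."""
    if len(z) != len(y) or any(len(row) != len(x) for row in z):
        raise ValueError("z needs one row per y label and one value per x label")
    return {
        "data": [{"type": "heatmap", "x": x, "y": y, "z": z}],
        "layout": {"title": {"text": title}},
    }


figure = build_heatmap_figure(x_labels, y_labels, z_values, "Placeholder heatmap")
print(json.dumps(figure, indent=2))

# Optional, only if the plotly Python package is installed:
# import plotly.graph_objects as go
# go.Figure(figure).show()
```

The z values are what get mapped onto the color scale, which is why the shape check matters: a missing value in one row would shift every color after it.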
Touch up your graph by adding Notes and editing your title and axes. Just click on the existing labels to edit them. It’s really easy.
To add notes, click on the Notes icon. These are created in layers. So to edit them once created, just click on the Notes icon again and select the specific note you’d like to edit from the drop down box.
The + adds a new note. However many times you click the + is how many new annotations will appear on your graph. If you add way too many (like I did initially), then just click the – . Be careful that the annotation you want to be deleted is selected from the drop down box first though. Otherwise you may delete an annotation you didn’t want to (I did this a couple times, too).
Many aesthetic things about the arrow, text and box itself can be edited pretty easily from the Notes icon. And, you can choose to link the annotations to the data or the page. So if an arrow points to a data point of importance, select the link to data. When viewers zoom in and out, the arrow will still point at what you’d intended.
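The data-versus-page choice has a counterpart in plotly’s figure format, where each annotation carries `xref` and `yref` settings. In this sketch, `"x"` and `"y"` tie the note to axis coordinates, while `"paper"` positions it as a fraction of the plotting area. Treating “page” in the web interface as equivalent to `"paper"` is my assumption, and `make_annotation` is a hypothetical helper.

```python
def make_annotation(text, x, y, link_to_data=True):
    """Build one annotation dictionary.

    link_to_data=True  -> x and y are axis values, so the arrow follows the data on zoom.
    link_to_data=False -> x and y are fractions (0 to 1) of the plotting area.
    """
    ref_x, ref_y = ("x", "y") if link_to_data else ("paper", "paper")
    return {
        "text": text,
        "x": x,
        "y": y,
        "xref": ref_x,
        "yref": ref_y,
        "showarrow": True,
    }


# Placeholder coordinates: axis labels for the data-linked note,
# fractions of the plotting area for the fixed one.
layout = {
    "annotations": [
        make_annotation("Interesting cell", "Plus 1", "High", link_to_data=True),
        make_annotation("Source: see data tab", 0.5, 1.0, link_to_data=False),
    ]
}
print(layout)
```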
Share your plot! Clicking the blue Share button brings up a box for you to share your plot via email or social media. This is what plotly’s about: sharing data and plots. When plots are shared, those who can see the plots will also have access to the data. However, they will not be able to change your plot (unless you specifically give them that level of access). Instead, they can save the data to their own file system and modify the plot as a separate file.
Turn this over to your students! Challenge them to find data of significance to them and then work to represent that data in a way that paints a clear picture of its meaning. I don’t care what discipline you teach–this can be a powerful tool for you and your students.