Friday, December 29, 2017

Visualizing a distribution

Understanding how the students in my course did on an exam includes visualizing the distribution of their scores.  In the past I used two standard visualizations: a histogram of the letter grades and a scatter plot that showed the distribution of the scores (similar to a cumulative distribution function).

For example, consider a final exam that 49 students took.  The exam was worth 35 points, so the thresholds for letter grades (90% = "A," etc.) were 21, 24.5, 28, 31.5.  After converting each student's score to a letter grade, I generated a histogram of the letter grades (Figure 1), which clearly shows that the most common grade was a "B."  To get more details, I also generated the full distribution of the exam scores (Figure 2).
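The threshold arithmetic and the letter-grade tally can be sketched in a few lines of Python. This is only an illustration: the 60/70/80/90% cutoffs for "D" through "A" are inferred from the thresholds listed above, the rule that a score exactly on a threshold earns the higher grade is an assumption, and the sample scores are made up (they are not the 49 actual exam scores).

```python
from bisect import bisect_right
from collections import Counter

MAX_POINTS = 35
CUTOFF_PERCENTS = [60, 70, 80, 90]      # assumed cutoffs for D, C, B, A
LETTERS = ["F", "D", "C", "B", "A"]     # one more letter than cutoffs

# 35 points at 60%, 70%, 80%, 90% gives 21, 24.5, 28, 31.5
thresholds = [MAX_POINTS * p / 100 for p in CUTOFF_PERCENTS]

def letter_grade(score):
    """Map a score to a letter grade.

    Assumption: a score exactly on a threshold earns the higher grade.
    """
    return LETTERS[bisect_right(thresholds, score)]

# Made-up scores, for illustration only
sample_scores = [19, 22, 23.5, 25, 27, 27.5, 28, 29, 29, 30.5, 32, 33]

# The letter-grade histogram is just a count of scores per letter
histogram = Counter(letter_grade(s) for s in sample_scores)
for letter in LETTERS:
    print(letter, histogram[letter])
```

With real data, `sample_scores` would be replaced by the list of exam scores, and the counts in `histogram` would be the bar heights in the histogram.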
Figure 1.  The histogram.


Figure 2. The full distribution (each dot is one score).
Although the vertical gridlines in the full distribution are on the thresholds for letter grades, there is a great deal of wasted space, and it is not obvious on this chart by itself which letter grade was most common.  (In a larger data set, the vertical distance between points would have to shrink and could become too small to distinguish the markers.)  To overcome this limitation, I changed the cumulative count to restart at each threshold, which yielded the chart (which I am calling a "distrogram") in Figure 3.
Figure 3. The distrogram.
There are now four "curves," one for each letter grade, and each curve shows the distribution of scores in that letter grade.  The height of each curve shows the total number of scores in that letter grade (as the histogram does).  Thus the distrogram clearly shows that the most common grade was a "B."  It also shows that the scores that correspond to "C" (between 24.5 and 28) were near the threshold for a "B"  and that no one earned a perfect score (a 35).  Thus, this distrogram provides the same information as the histogram plus additional information in a layout that is easier to navigate than the full distribution (the second chart).   Because there are multiple curves, there is more vertical space between markers.  Compared with the full distribution, however, the distrogram does require more operations to answer a distribution question such as "How many students earned at least 30 points on the exam?" because one would have to add the counts for multiple bins.

A distrogram should be useful for numerical data where the individual values and their grouping into categories (such as letter grades) based on these values are important.  The key feature is that the simple bar in the histogram is replaced by a scatter plot showing the distribution of the values in that bin.  Markers are needed to show the individual values; lines connecting the markers are not necessary; if they are used, they should be light so that the markers are easy to see.  Horizontal and vertical gridlines should also be light if used.

To create a distrogram, define the thresholds for the bins and sort the values in ascending order.  Determine the bin for each value.  Add a cumulative count that starts at 1 and increases by 1 at each value (even if this value and the previous one are equal).  Reset the cumulative count to 1 when the new value and the previous value are in different bins.  Create a scatter plot, with the values on the horizontal axis and the cumulative count on the vertical axis.
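The steps above can be sketched in Python as follows. This is a minimal sketch, not the spreadsheet I used: the function names `distrogram_points` and `plot_distrogram` are hypothetical, the scores are made up, a value exactly on a threshold is assumed to fall in the higher bin, and the plotting helper assumes matplotlib is installed.

```python
from bisect import bisect_right

def distrogram_points(values, thresholds):
    """Return (value, bin_index, count_within_bin) for each value, in ascending order."""
    cuts = sorted(thresholds)
    points = []
    previous_bin = None
    count = 0
    for value in sorted(values):
        # Assumption: a value equal to a threshold belongs to the higher bin.
        current_bin = bisect_right(cuts, value)
        # Reset the count to 1 when the bin changes; otherwise add 1,
        # even if this value equals the previous one.
        count = 1 if current_bin != previous_bin else count + 1
        points.append((value, current_bin, count))
        previous_bin = current_bin
    return points

def plot_distrogram(points, thresholds):
    """Scatter plot: values on the horizontal axis, within-bin count on the vertical axis."""
    import matplotlib.pyplot as plt  # assumed installed; only needed for plotting
    fig, ax = plt.subplots()
    ax.scatter([p[0] for p in points], [p[2] for p in points])
    for cut in thresholds:
        ax.axvline(cut, color="lightgray", linewidth=0.8)  # light gridlines on the thresholds
    ax.set_xlabel("Score")
    ax.set_ylabel("Count within bin")
    plt.show()

# Made-up scores, for illustration only
scores = [19, 22, 23.5, 25, 27, 27.5, 28, 29, 29, 30.5, 32, 33]
thresholds = [21, 24.5, 28, 31.5]
points = distrogram_points(scores, thresholds)
for value, bin_index, count in points:
    print(value, bin_index, count)
```

Calling `plot_distrogram(points, thresholds)` draws the chart.  The largest count in each bin equals the height of the corresponding histogram bar, which is why the distrogram carries the same information as the histogram.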




Wednesday, July 12, 2017

Japanese food that minimizes the risk of choking

NPR posted a piece about food that minimizes the risk of choking.  In Japan, more people die from choking than from traffic accidents, and the difficulty that elderly people have swallowing is a leading cause of choking.  For these dishes, the cooked food is pureed and then re-formed (with a thickener) into a dish that looks like regular food but is easier to swallow (no chewing required).

Meanwhile, The Washington Post had an article about the importance of knowing the Heimlich maneuver and CPR, both of which can help someone who is choking. 

These highlight both sides of managing risk: (1) preventing a potential problem (choking) by eating foods that are less likely to cause choking, and (2) following a contingency plan (the Heimlich maneuver) if someone does start choking.

Bill Murray, who played a weatherman who saves someone from choking in the 1993 movie Groundhog Day, saved a man from choking in a Phoenix restaurant in 2016 by using the Heimlich maneuver, which he learned while making the movie.

Monday, June 19, 2017

Educating Future Engineers by Learning about Design Processes

I had the pleasure of attending the Clive L. Dym Mudd Design Workshop at Harvey Mudd College earlier this month. There were many good talks about design research and some discussion of the skills that engineering students in the future need to learn.

I was given the opportunity to present a poster based on an abstract that I submitted.  You can find the extended abstract here.  The short abstract follows: 
Engineers in the near future should have strong analytical skills, practical ingenuity, and creativity. These engineers should also be dynamic, agile, resilient, and flexible; that is, they should be able to adapt. Consistent with this view, this paper presents a specific vision of what these engineers should be able to do in the years ahead to adapt to the ever-changing needs of society and a vision of how design education should adapt as well.

Saturday, March 4, 2017

Article in Wiley StatsRef

My article Rational Decision Making was published in Wiley StatsRef: Statistics Reference Online.  The article discusses the following topics:
  • decision making;
  • decision theory;
  • decision analysis;
  • game theory;
  • multicriteria decision making;
  • risk;
  • uncertainty; and
  • rationality.
Here is the abstract:
Rational decision making requires executing an appropriate decision-making process to select the best alternative. This can be challenging when information is uncertain or when time is limited. This article describes three important perspectives on decision making: (i) the problem-solving perspective, (ii) the decision-making process perspective, and (iii) the decision-making system perspective. This article describes important concepts from these three perspectives, including rationality, multicriteria decision making, group decision making, decision making under uncertainty, game theory, contexts for decision making, decision-making processes, and techniques for modeling and improving decision-making systems. Understanding and applying these concepts can improve decision making. The article will first consider the challenge of selecting the best alternative, which is the problem-solving perspective. Then, the article will discuss the decision-making process perspective: how people make decisions. Finally, the article will describe decisions from the decision-making system perspective by considering the decision-making behaviors and information flow within organizations and how to improve those decision-making systems. This article also provides numerous references to sources that provide additional details about rational decision making.