Friday, December 22, 2023

Metareasoning for Robots

 

Cover image: Springer Nature.

Earlier this year, Springer Nature published my book Metareasoning for Robots: Adapting in Dynamic and Uncertain Environments in its Synthesis Lectures on Computer Science series.  It is a short book with five chapters.

From the website:

This book is a state of the art resource that robotics researchers and engineers can use to make their robots and autonomous vehicles smarter. Readers will be able to describe metareasoning, select an appropriate metareasoning approach, and synthesize metareasoning policies. Metareasoning for Robots adopts a systems engineering perspective in which metareasoning is an approach that can improve the overall robot or autonomous system, not just one component or subsystem. 

 This book introduces key concepts, discusses design options for metareasoning approaches and policies, and presents approaches for testing and evaluation of metareasoning policies. After considering the conceptual design phase, it discusses how to implement metareasoning in the robot’s software architecture and how to synthesize metareasoning policies.  

Every chapter has references to valuable works on robotics and metareasoning, and the book uses examples from the author's own research and from other research groups to illustrate these ideas. In addition, this book provides links to books and papers for readers who wish to investigate these topics further.  


 


Friday, March 3, 2023

Competence and Trust for Human-Autonomy Teams

 
I recently attended two talks on how autonomous systems interact with people.  The first considered the human operator who helps an autonomous system; the second studied the human whom the autonomous system is helping.

The 37th AAAI Conference on Artificial Intelligence included a "Bridge Session" on Artificial Intelligence and Robots.  In that session, Connor Basich, a Ph.D. student at the University of Massachusetts, gave a talk with the title "Competence-Aware Autonomy: An Essential Skill for Robots in the Real World."  According to Basich, a competence-aware autonomous system has "the ability to know, reason about, and act on the extent of its own capabilities in any situation in the context of different sources of external assistance."  In his talk, Basich referred to work by himself and Shlomo Zilberstein (his advisor) as well as papers by Sadegh Rabiee and Joydeep Biswas (at the University of Texas) and others.

Their work on competence-aware autonomy is motivated by the need for safe autonomy, including safe autonomous vehicles.  Some autonomous systems ask a human operator to intervene when they cannot determine a safe action, which is good, but sometimes an autonomous system relies too much upon the operator, which is undesirable.  Thus, the autonomous system needs a better way to determine whether the operator's intervention is needed.

Competence-aware perception (also called introspective perception) can predict which parts of the sensed input are inaccurate or will produce erroneous results downstream in the perception pipeline.  Rabiee and Biswas developed introspective vision for SLAM (IV-SLAM), an approach that uses a convolutional neural network to predict the reliability of each part of the input image.  IV-SLAM then directs the SLAM algorithm to sample more features from the reliable parts of the image.
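To make the idea concrete, here is a minimal sketch of introspection-guided feature sampling; the reliability_net model and the proportional sampling rule are illustrative stand-ins, not the actual IV-SLAM implementation.

```python
import numpy as np

def select_patches(image_patches, reliability_net, n_features=200):
    """Bias feature extraction toward patches the introspection model trusts.

    reliability_net is a stand-in for a trained CNN that maps an image
    patch to a reliability score in (0, 1].
    """
    scores = np.array([reliability_net(patch) for patch in image_patches])
    probs = scores / scores.sum()  # sample in proportion to predicted reliability
    rng = np.random.default_rng()
    chosen = rng.choice(len(image_patches), size=n_features, p=probs)
    return chosen  # indices of patches from which to extract SLAM features
```

The key point is that the metareasoning layer does not replace the SLAM algorithm; it only reallocates the algorithm's attention toward input regions that are predicted to be reliable.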

More generally, a competence-aware system has multiple levels of autonomy; each level calls the human operator under different conditions.  At the lowest level, manual operation, the operator does everything.  At a higher level of autonomy, the system first attempts the selected action but calls the operator if the attempt fails.  A competence-aware system can select the optimal level of autonomy for the current situation (Basich, 2020; Basich et al., 2023).
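As a rough illustration of how such a selection might work, consider the following sketch; the level names, situation features, and utility model are hypothetical and are not the formulation used in Basich's papers.

```python
# Hypothetical levels of autonomy, ordered from most to least operator involvement.
LEVELS = ["manual", "supervised", "autonomous"]

def expected_utility(level, situation):
    """Trade off the cost of operator involvement against the risk of failure."""
    p_success = situation["p_success"][level]       # estimated from experience
    operator_cost = situation["operator_cost"][level]
    return (p_success * situation["task_value"]
            - (1 - p_success) * situation["failure_cost"]
            - operator_cost)

def select_level(situation):
    """Choose the level of autonomy with the highest expected utility."""
    return max(LEVELS, key=lambda level: expected_utility(level, situation))

situation = {
    "task_value": 10.0,
    "failure_cost": 50.0,
    "p_success": {"manual": 0.99, "supervised": 0.95, "autonomous": 0.80},
    "operator_cost": {"manual": 8.0, "supervised": 4.0, "autonomous": 0.0},
}
print(select_level(situation))  # -> "supervised"
```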

It appears to me that competence-aware autonomy is an innovative type of metareasoning.  It's metareasoning because it monitors and controls the system's reasoning processes.  For instance, the IV-SLAM approach uses a metareasoning policy to control the SLAM algorithm.  Basich's competence-aware system uses a metareasoning policy to determine the level of autonomy (which encompasses the entire reasoning process).

Dawn Tilbury, Herrick Professor of Engineering and Department Chair of Robotics at the University of Michigan, gave a seminar at the University of Maryland about her research on how much drivers trust automated systems.  In her work, the human subjects drove a simulated car with level 3 autonomy that kept the vehicle in a lane, maintained a constant speed, warned the driver of obstacles, and stopped the vehicle to avoid collisions (emergency braking).  In addition, the subjects were rewarded for their performance on a visual search task that they did while driving, but they were penalized for emergency stops, so they needed to steer the vehicle around obstacles when necessary.  In this setup, the warning system was inaccurate: it sometimes gave false alarms, and it sometimes failed to warn the driver.  If a subject trusted the system too much, he spent too much time on the search task and failed to react when the system missed an obstacle.  If a subject trusted the system too little, he spent too little time on the search task and too much time driving.

After developing a way to measure trust, Tilbury and her collaborators gave the automated system the ability to monitor the level of trust and to tell the subject when he needed to pay more attention and when he needed to trust the system more.  They found that this successfully nudged the subjects to trust the system appropriately.
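One can imagine the trust-calibration loop as something like the following sketch; the trust scale, thresholds, and messages are invented for illustration and do not come from Tilbury's work.

```python
from typing import Optional

def nudge(trust_estimate: float, lower: float = 0.4, upper: float = 0.8) -> Optional[str]:
    """Suggest a message when the estimated trust leaves the calibrated band.

    trust_estimate is assumed to be a number in [0, 1] produced by the
    system's trust-measurement model.
    """
    if trust_estimate < lower:
        return "The warnings are usually accurate; you can rely on the system more."
    if trust_estimate > upper:
        return "Please watch the road; the warning system sometimes misses obstacles."
    return None  # trust appears well calibrated, so say nothing
```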

Saturday, October 8, 2022

The Cone on the Map: Hurricane Risk Management

Hurricane Ian (photo: NOAA, via Wikipedia)


In the wake of Hurricane Ian, risk communication is in the news again.

The Washington Post had an article about the limitations of the hurricane forecast cone (Scott Dance and Amudalat Ajasa, "The confusion and controversy over the forecast cone," October 5, 2022).  Rebecca Morss, from the National Center for Atmospheric Research, said "Some people think the cone represents the size of the storm, which it doesn't.  Some people think it represents the area of impact, which it doesn't."

5-day Forecast Track, 8:00 P.M., September 25, 2022 (Source: NOAA).

The deficiencies of this single graphic were significant because the forecast hurricane track changed rapidly in the days before landfall, as the National Hurricane Center's archive shows.  At 8:00 P.M. on Sunday, September 25, 2022, the line in the middle of the five-day cone crossed the Big Bend area of Florida, approximately 300 miles north of Fort Myers.  At 8:00 P.M. on Monday, as the hurricane approached Cuba, the line in the middle of the three-day cone crossed the west coast of Florida just north of Tampa Bay, and Fort Myers was just outside the edge of the cone.  By the next morning, that line crossed the coast just south of Tampa Bay.  By Tuesday evening, the line had moved to Port Charlotte, and Fort Myers was inside the cone.

The miscommunication was preventable.  As mentioned in The Washington Post article, the forecast cone graphic includes the disclaimer: "Note: The cone contains the probable path of the storm center but does not show the size of the storm.  Hazardous conditions can occur outside of the cone."

Also, the National Hurricane Center publishes numerous other graphics that do indicate the expected size of the storm; the Hurricane-Force Wind Speed Probabilities graphic is a particularly useful one.  On Sunday evening, this graphic showed that the probability of hurricane-force winds in Fort Myers was at least 10%.  By Monday evening, this probability had increased to at least 20%.  By Tuesday morning, it was at least 40%.

To me, the hurricane forecast cone is useful because I am familiar with it, but it clearly has limitations, and I don't rely on it alone, especially when my family is in harm's way.  The media can better inform the public by showing the wind-speed probability and storm surge graphics and by emphasizing the reach of the storm instead of focusing on where the center of the storm makes landfall.

The Hurricane Ian archive includes a Graphics Archive that will present a slideshow of the graphics, which is useful for seeing how the forecasts changed.

Friday, December 31, 2021

Frequencies and human rationality

This post was inspired by reading a number of articles and books about whether or not humans can make rational decisions.  Different definitions of rationality lead to different answers.  Although a popular view is that certain "cognitive illusions" show that humans can be irrational, Gerd Gigerenzer has argued over the years that humans are indeed rational and that human cognition is more adapted to certain representations of information than others.  He and his collaborators have shown that changing the format of the information changes how well experimental subjects do on judgment tasks (which are used to test human rationality).  Herbert Simon made a similar argument about how the representation of a problem can affect how one solves it.

Gigerenzer identified two types of information that are called "probability": subjective probabilities (about single events) and frequencies (within a set).  He repeated judgment experiments by asking for frequency judgments instead of probabilities about single events; in his experiments, fewer subjects committed the errors that were found in the original judgment experiments.  Humans seem irrational if one considers only a certain algorithm or norm as describing true rationality, but the environment and the representation of the information are also relevant when assessing someone's judgment.
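A worked example (with made-up numbers, in the style of the problems that this literature discusses) shows that the two formats carry the same information, although the frequency format makes the answer easier to read off.

```python
# Probability format: P(disease) = 0.01, P(positive | disease) = 0.9,
# P(positive | healthy) = 0.09.  Bayes' rule gives the posterior.
p_disease, sensitivity, false_positive = 0.01, 0.9, 0.09
posterior = (p_disease * sensitivity) / (
    p_disease * sensitivity + (1 - p_disease) * false_positive)

# Natural frequency format: of 1000 people, 10 have the disease; 9 of
# them test positive, and about 89 of the 990 healthy people do too.
sick_positive, healthy_positive = 9, 89
posterior_freq = sick_positive / (sick_positive + healthy_positive)

print(round(posterior, 3), round(posterior_freq, 3))  # 0.092 0.092
```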

To explain human judgment, he proposed the theory of probabilistic mental models (PMM), a model of bounded rationality.  In this theory, a person uses different cues to help make judgments.  In situations with time pressure, a person may use the first useful cue, a heuristic that corresponds to bounded rationality.

Heuristics such as Take the Best and recognition are "short cuts that can produce efficient decisions" and help humans adapt to a complex, dynamic environment.   In 1991 Gigerenzer noted that the heuristics of similarity, availability, and anchoring and adjustment "are largely undefined concepts" and are merely well-known principles of the mind.  Gigerenzer has proposed ecological rationality to explain human decision making.  "Ecological rationality can refer to the adaptation of mental processes to the representation of information ... It can also refer to the adaptation of mental processes to the structure of information in an environment."
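Here is a minimal sketch of the Take the Best heuristic; the cues, validities, and objects are invented for illustration.

```python
# Cues ordered by validity (how often the cue points to the right answer).
CUES = [("recognized", 0.85), ("has_team", 0.75), ("is_capital", 0.60)]

def take_the_best(a, b):
    """Decide between a and b on the first cue that discriminates."""
    for cue, _validity in CUES:
        if a[cue] != b[cue]:
            return "a" if a[cue] else "b"
    return "guess"  # no cue discriminates, so guess

city_a = {"recognized": True, "has_team": True, "is_capital": False}
city_b = {"recognized": True, "has_team": False, "is_capital": True}
print(take_the_best(city_a, city_b))  # -> "a" ("has_team" discriminates first)
```

The heuristic is frugal: it stops searching as soon as one cue settles the choice instead of weighing all of the available evidence.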

Gigerenzer cited Herbert Simon to argue that our cognitive abilities are limited and that our minds "should be understood relative to the environment in which they evolved, rather than to the tenets of classical rationality."  Because that environment presented natural frequencies, not single-event probabilities, fewer subjects make errors in experiments with frequencies.  Humans acquire and update natural frequencies through experience.  Moreover, humans can recognize that the environment has changed and will ignore data about the past when it does.

Recently, Gigerenzer has argued that the study of behavioral economics has a "bias bias"; that is, economists are looking for evidence of systematic errors and bias while ignoring psychological research that contradicts the view that humans are systematically irrational.

Related work

  1. Gigerenzer, Gerd, "How to make cognitive illusions disappear: Beyond heuristics and biases," European Review of Social Psychology, Volume 2, Number 1, pages 83-115, 1991.
  2. Gigerenzer, Gerd, "The bounded rationality of probabilistic mental models," in K.I. Manktelow and D.E. Over, editors, Rationality: Psychological and Philosophical Perspectives, Routledge, London, 1993.
  3. Gigerenzer, Gerd, "Ecological intelligence: An adaptation for frequencies," in D.D. Cummins and C. Allen, editors, The Evolution of Mind, pages 9-29, Oxford University Press, Oxford, 1998.
  4. Gigerenzer, Gerd, Adaptive Thinking, Oxford University Press, Oxford, 2000.
  5. Gigerenzer, Gerd, "The bias bias in behavioral economics," Review of Behavioral Economics, Volume 5, Number 3-4, pages 303-336, 2018.

Friday, October 22, 2021

The Price of Safety

 

A recent paper by Chao Chen, Genserik Reniers, Nima Khakzad, and Ming Yang discusses safety economics.  Safety economics is concerned with the costs of safety measures, and an important objective is to minimize the sum of two costs:  (1) the expected cost of the harms due to accidents in the future and (2) the current and future cost of safety measures.  Safety economics is a tool for making "decisions that are as good as possible (or 'optimal')" in order to both optimize the use of resources and maximize safety.  The paper discusses the importance of cost modeling, which includes direct costs, indirect costs, and "non-economic costs" that need to be monetized.  The value of statistical life and willingness to pay are mentioned in this context.

A common approach in safety economics is risk-based safety optimization, which is a type of risk management process.  This includes hazard identification, risk analysis, risk evaluation (i.e., whether the risk is acceptable), and risk mitigation.  The last step is accomplished by safety cost optimization, which evaluates the costs of the different safety strategies and selects the one with the minimal cost.
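A toy example (with invented strategies, costs, and an assumed risk acceptance threshold) illustrates the structure of this optimization: screen out strategies whose residual risk is unacceptable, then select the cheapest option once the expected accident cost is included.

```python
strategies = [
    # (name, strategy_cost, residual_accident_probability, accident_loss)
    ("do nothing",         0.0,     0.020, 5_000_000),
    ("basic safeguards",   40_000,  0.008, 5_000_000),
    ("full redundancy",    250_000, 0.001, 5_000_000),
]

RISK_THRESHOLD = 0.01  # maximum acceptable residual probability (assumed)

def total_cost(strategy_cost, p_accident, loss):
    """Strategy cost plus the expected cost of future accidents."""
    return strategy_cost + p_accident * loss

acceptable = [s for s in strategies if s[2] <= RISK_THRESHOLD]
best = min(acceptable, key=lambda s: total_cost(s[1], s[2], s[3]))
print(best[0])  # -> "basic safeguards" (80,000 total vs. 255,000)
```

Note that the threshold itself is a human input, which anticipates the point made below about the judgment required to choose and configure these approaches.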

The paper also discusses the minimal total safety cost approach (which considers both the safety strategy cost and the potential accident cost), cost-benefit analysis, cost-effectiveness analysis, multi-objective optimization, and game theory approaches.

To me, the variety of approaches suggests that one must first engage in metareasoning to decide which decision-making process should be used.  Moreover, all of the approaches require human input in the form of setting thresholds (for risk acceptance criteria or cost-effectiveness ratios), weighing criteria, and making tradeoffs.  In practice, as with many decision models, a "decision calculus" (Little, 1970) may emerge in which the decision-maker asks the analyst to "find the solution," but the two iterate as the decision-maker asks "what if?" in response to the results that the analyst generates.

Finally, the paper's focus on minimizing costs suggests that safety economics is based on substantive rationality, in which a decision-maker should choose the optimal alternative (Stirling, 2003).  Because bounded rationality better describes human decision-making, approaches that focus on finding satisfactory (not necessarily optimal) solutions may be more practical (Simon, 1981).

Cited sources:
Chen, Chao, Genserik Reniers, Nima Khakzad, and Ming Yang, "Operational safety economics: Foundations, current approaches and paths for future research," Safety Science, Volume 141, 2021.
Little, John D.C., “Models and managers: the concept of a decision calculus,” Management Science, Volume 16, Number 8, pages B-466-485, 1970.
Simon, Herbert A., The Sciences of the Artificial, second edition, The MIT Press, Cambridge, Massachusetts, 1981.
Stirling, Wynn C., Satisficing Games and Decision Making, Cambridge University Press, Cambridge, 2003.
 

Image source: https://www.gov.uk/government/news/venues-required-by-law-to-record-contact-details

Monday, July 26, 2021

Reducing Noise and Improving Decision Making

 

Cover page image: https://www.littlebrownspark.com/

Noise: A Flaw in Human Judgment. By Daniel Kahneman, Olivier Sibony, and Cass R. Sunstein

Kahneman, Sibony, and Sunstein have written a book that is both valuable and frustrating.

This book presents multiple ideas related to human judgment and decision making: (1) a review of studies that have described the variability in judgments in many domains, (2) approaches for reducing that variability, (3) an approach for making decisions when there are multiple factors that should be considered, and (4) an appeal for better procedures in the legal system.  According to the authors, the book offers an understanding of the psychological foundations of disparities in judgments, which are here classified as noise and bias.

The book’s strengths include its review of the literature on the variability of judgments and its distinction between noise and bias.  It presents examples of judgments from many domains (including medicine, business, and the legal system).  It strongly supports systematic decision-making processes (a topic that is important to me) and emphasizes the importance of accurate judgments.  It acknowledges the difficulties of reducing noise.  Finally, its notes provide references to original studies that provide context and details to the book’s discussion.

The book describes a range of best practices for judgment and decision making: employing persons who are better at making judgments, aggregating multiple judgments, using judgment guidelines, using a shared scale grounded in an outside view, and structuring complex decisions.  The first three items are meant to reduce judgment errors due to noise and bias.  The last item is a practical multi-criteria decision-making process that (a) decomposes the decision into a set of assessments, (b) collects information about each assessment independently, and (c) presents this evidence to the decision-maker(s), who may use intuition to synthesize this information and select an alternative.  In an appendix, the authors apply their own recommendations: the book provides a checklist (a guideline) for evaluating a decision-making process.
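To make the structured process concrete, here is a minimal sketch; the candidates, assessments, and scores are invented and do not come from the book.

```python
# Each assessment is scored independently before the final choice is made.
candidates = {
    "candidate_1": {"technical": 4, "communication": 2, "leadership": 3},
    "candidate_2": {"technical": 3, "communication": 4, "leadership": 3},
}

for name, scores in candidates.items():
    # Present the full profile rather than collapsing it to one number,
    # so the decision-maker sees each dimension separately.
    profile = ", ".join(f"{dim}={score}" for dim, score in scores.items())
    print(f"{name}: {profile}")

# The final selection is still a decision, not a computation: the
# decision-maker weighs the profiles (perhaps with intuition) and chooses.
```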

When discussing ratings, the book wisely recommends that “performance rating scales must be anchored on descriptors that are sufficiently specific to be interpreted consistently.”  Scales with undefined terms such as “poor” and “good” and “excellent” should be discarded unless they are well-understood in the group of persons who are using them as a common language.

Unfortunately, two weaknesses (one minor, one major) frustrated me.  The first is that the text contains no superscripts, citations, or other marks to indicate the notes that are available in the Notes section at the end of the book.  In the Notes section, each note has only a page number and a brief quote that suggests the text to which the note applies.  This unreasonable scheme reduces the value of the notes' many citations and explanations by making them harder to find and use.

The second weakness is more significant.  The book does not distinguish between judgment and decision making.  It tends to treat them as the same thing.  Indeed, a note explains that the authors “regard decisions as a special case of judgment” (page 403). 

For example, the book discusses the judgments that an insurance company’s employees make.  One example is a claims adjuster’s estimate of the cost of a future claim, which is indeed a judgment.  The other example is the premium that an underwriter quotes, which is a decision, not a mere judgment.  It is based on numerous judgments, of course, but the underwriter chooses the premium amount.  The book states that making a judgment is similar to measuring something, which is appropriate, but then goes on to say that the premium in the underwriter’s quote is also a judgment, which is not appropriate, because it is the result of a decision, not a measurement.

Elsewhere, the book claims that “an evaluative judgment determines the choice of an acceptable safety margin” for an elevator design (page 67).  This is inappropriate, however, for choosing the safety margin is a decision, not a measurement.

The book states that the process of judgment involves considering the given information, engaging in computation, consulting one’s intuition, and generating a judgment; that is, judgment is “an operation that assigns a value on a scale to a subjective impression (or to an aspect of an impression)” (page 176).  This is not the same as decision making.  Decision making is a more comprehensive process that defines relevant objectives, identifies (or develops) alternatives, evaluates the alternatives, selects one, and implements it.  In this process, judgment is an activity that may be used to evaluate the alternatives.  The book provides a relevant example that shows the distinction: job candidates get ratings, but only one gets hired.  The ratings are judgments, but choosing and hiring someone is a decision.

Those seeking to improve decision making in their organizations will find many useful suggestions in this book, but they should keep in mind that decision making is a process, not a judgment.