Data Challenges in Building a Facial Recognition Model and How to Mitigate Them

Link: https://www.soa.org/resources/research-reports/2023/data-facial-rec/

PDF: https://www.soa.org/49022b/globalassets/assets/files/resources/research-report/2023/dei107-facial-recognition-challenges.pdf

Excerpt:

This paper is an introduction to AI technology designed for actuaries to understand how the technology works, the potential risks it could introduce, and how to mitigate risks. The author focuses on data bias as it is one of the main concerns of facial recognition technology. This research project was jointly sponsored by the Diversity Equity and Inclusion Research and the Actuarial Innovation and Technology Strategic Research Programs.

Author(s): Victoria Zhang, FSA, FCIA

Publication Date: Jan 2023

Publication Site: SOA Research Institute

More and Better Uses Ahead for Governments’ Financial Data

Link: https://www.governing.com/finance/more-and-better-uses-ahead-for-governments-financial-data

Excerpt:

In its lame duck session last month, Congress tucked a sleeper section into its 4,000-page omnibus spending bill. The controversial Financial Data Transparency Act (FDTA) swiftly came out of nowhere to become federal law over the vocal but powerless objections of the state and local government finance community. Its impact on thousands of cities, counties and school districts will be a buzzy topic at conferences all this year and beyond. Meanwhile, software companies will be staking claims in a digital land rush.

The central idea behind the FDTA is that public-sector organizations’ financial data should be readily available for online search and standardized downloading, using common file formats. Think of it as “an http protocol for financial data” that enables an investor, analyst, taxpayer watchdog, constituent or journalist to quickly retrieve key financial information and compare it with other numbers using common data fields. Presently, online users of state and local government financial data must rely primarily on text documents, often in PDF format, that don’t lend themselves to convenient data analysis and comparisons. Financial statements are typically published long after the fiscal year’s end, and the widespread online availability of current and timely data is still a faraway concept.

…..

So far, so good. But the devil is in the details. The first question is just what kind of information will be required in this new system, and when. Most would agree that a complete download of every byte of data now formatted in voluminous governmental financial reports and their notes is overwhelming, unnecessary and burdensome. Thus, a far more incremental and focused approach is a wiser path. For starters, it may be helpful to keep the initial data requirements skeletal and focus initially on a dozen or more vital fiscal data points that are most important to financial statement users. Then, after that foundation is laid, the public finance industry can build out. Of course, this will require that regulators buy into a sensible implementation plan.

The debate over information content requirements should focus first on “decision-useful information.” Having served briefly two decades ago as a voting member of the Governmental Accounting Standards Board (GASB), contributing my professional background as a chartered financial analyst, I can attest that almost every one of their meetings included a board member reminding others that required financial statement information should be decision-useful. A key question, of course, is “useful to whom?”

Author(s): Girard Miller

Publication Date: 17 Jan 2023

Publication Site: Governing

Government Financial Reporting – Data Standards and the Financial Data Transparency Act

Link: https://xbrl.us/events/230124/

Date and Time of upcoming event: 3:00 PM ET Tuesday, January 24, 2023 (60 Minutes)

Description:

The U.S. Congress passed legislation on December 15, 2022 that includes requirements for the Securities and Exchange Commission to adopt data standards related to municipal securities. The Financial Data Transparency Act (FDTA) aims to improve transparency in government reporting, while minimizing disruptive changes and requiring no new disclosures. The University of Michigan’s Center for Local State and Urban Policy (CLOSUP) has partnered with XBRL US to develop open, nonproprietary financial data standards that represent government financial reporting which could be freely leveraged to support the FDTA. The Annual Comprehensive Financial Reporting (ACFR) Taxonomy today represents general purpose governments, as well as some special districts, and can be expanded upon to address all types of governments that issue debt securities. CLOSUP has also conducted pilots with local entities including the City of Flint.

Attend this 60-minute session to explore government data standards, find out how governments can create their own machine-readable financial statements, and discover what impact this legislation could have on government entities. Most importantly, discover how machine-readable data standards can benefit state and local government entities by reducing costs and increasing access to time-sensitive information for policy making.

Presenters:

  • Marc Joffe, Public Policy Analyst, Public Sector Credit
  • Stephanie Leiser, Fiscal Health Project Lead, Center for Local, State and Urban Policy (CLOSUP), University of Michigan’s Ford School of Public Policy
  • Campbell Pryde, President and CEO, XBRL US
  • Robert Widigan, Chief Financial Officer, City of Flint

Publication Site: XBRL.us

The most common restaurant cuisine in every state, and a chain-restaurant mystery

Link: https://www.washingtonpost.com/business/2022/09/29/chain-restaurant-capitals/

Excerpt:

The places that drive the most tend to have the same high share of chain restaurants regardless of whether they voted for Trump or Biden. As car commuting decreases, chain restaurants decrease at roughly the same rate, no matter which candidate most residents supported.

If the link between cars and chains transcends partisanship, why does it look like Trump counties have more chain restaurants? It’s at least in part because he won more of the places with the most car commuters!

About 83 percent of workers commute by car nationally, but only 80 percent of folks in Biden counties do so, compared with 90 percent of workers in Trump counties. The share of car commuters ranges from 55 percent in the deep-blue New York City metro area to 96 percent around bright red Decatur, Ala.
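The three percentages in the paragraph above pin down how workers are split between the two groups of counties. As a rough check, and assuming the national figure is simply a worker-weighted average of the Biden-county and Trump-county figures (an assumption made here, not something the article states), the implied split can be backed out. The function name `implied_weight` is hypothetical.

```python
def implied_weight(overall, group_a, group_b):
    """Share of the population in group A implied by a weighted average.

    Solves overall = w * group_a + (1 - w) * group_b for w.
    """
    if group_a == group_b:
        raise ValueError("group rates must differ to identify a weight")
    return (overall - group_b) / (group_a - group_b)


# Rounded car-commuting shares quoted in the excerpt.
biden_share = implied_weight(overall=0.83, group_a=0.80, group_b=0.90)
print(f"Implied share of workers in Biden counties: {biden_share:.0%}")
```

With the rounded inputs, the implied share of workers living in Biden counties comes out at roughly 70 percent. Because the quoted figures are themselves rounded, this is only a ballpark.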

Author(s): Andrew Van Dam

Publication Date: 1 Oct 2022

Publication Site: WaPo

The amazing power of “machine eyes”

Link: https://erictopol.substack.com/p/the-amazing-power-of-machine-eyes

Excerpt:

Today’s report on AI of retinal vessel images to help predict the risk of heart attack and stroke, from over 65,000 UK Biobank participants, reinforces a growing body of evidence that deep neural networks can be trained to “interpret” medical images far beyond what was anticipated. Add that finding to last week’s multinational study of deep learning of retinal photos to detect Alzheimer’s disease with good accuracy. In this post I am going to briefly review what has already been gleaned from 2 classic medical images, the retina and the electrocardiogram (ECG), as representative of the exciting capability of machine vision to “see” well beyond human limits. Obviously, machines aren’t really seeing or interpreting and don’t have eyes in the human sense, but they sure can be trained from hundreds of thousands (or millions) of images to come up with outputs that are extraordinary. I hope when you’ve read this you’ll agree this is a particularly striking advance, which has not yet been actualized in medical practice, but has enormous potential.

Author(s): Eric Topol

Publication Date: 4 Oct 2022

Publication Site: Eric Topol’s substack, Ground Truths

Using First Name Information to Improve Race and Ethnicity Classification

Link: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2763826

Abstract:

This paper uses a recent first name list to improve on a previous Bayesian classifier, the Bayesian Improved Surname Geocoding (BISG) method, which combines surname and geography information to impute missing race and ethnicity. The proposed approach is validated using a large mortgage lending dataset for whom race and ethnicity are reported. The new approach results in improvements in accuracy and in coverage over BISG for all major ethno-racial categories. The largest improvements occur for non-Hispanic Blacks, a group for which the BISG performance is weakest. Additionally, when estimating disparities in mortgage pricing and underwriting among ethno-racial groups with regression models, the disparity estimates based on either BIFSG or BISG proxies are remarkably close to those based on actual race and ethnicity. Following evaluation, I demonstrate the application of BIFSG to the imputation of missing race and ethnicity in the Home Mortgage Disclosure Act (HMDA) data, and in the process, offer novel evidence that race and ethnicity are somewhat correlated with the incidence of missing race/ethnicity information.
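To make the idea of combining surname, geography, and first name concrete, here is a minimal sketch of a naive-Bayes-style update. It assumes the three pieces of information are conditionally independent given the category, and it is not taken from the paper's code. The function name `combine_proxies`, the category labels, and every probability below are made up for illustration.

```python
def combine_proxies(p_cat_given_surname, p_geo_given_cat, p_first_given_cat=None):
    """Combine proxy information into normalized posterior probabilities.

    p_cat_given_surname: dict of P(category | surname)
    p_geo_given_cat:     dict of P(geography | category)
    p_first_given_cat:   optional dict of P(first name | category)
    """
    scores = {}
    for cat, prior in p_cat_given_surname.items():
        score = prior * p_geo_given_cat[cat]
        if p_first_given_cat is not None:
            score *= p_first_given_cat[cat]
        scores[cat] = score
    total = sum(scores.values())
    if total == 0:
        raise ValueError("every category has zero probability")
    # Normalize so the posterior probabilities sum to 1.
    return {cat: score / total for cat, score in scores.items()}


# Hypothetical inputs for one person (illustrative numbers only).
surname = {"group_a": 0.5, "group_b": 0.5}
geography = {"group_a": 0.2, "group_b": 0.8}
first_name = {"group_a": 0.8, "group_b": 0.2}

surname_geo_only = combine_proxies(surname, geography)
with_first_name = combine_proxies(surname, geography, first_name)
print(surname_geo_only)
print(with_first_name)
```

In this toy case the surname is uninformative, geography alone tilts the estimate toward `group_b`, and the first name pulls it back to an even split. That is the sense in which an additional proxy can change, and ideally improve, the classification.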

Author(s):

Ioan Voicu
Office of the Comptroller of the Currency (OCC)

Publication Date: February 22, 2016

Publication Site: SSRN

Suggested Citation:

Voicu, Ioan, Using First Name Information to Improve Race and Ethnicity Classification (February 22, 2016). Available at SSRN: https://ssrn.com/abstract=2763826 or http://dx.doi.org/10.2139/ssrn.2763826

Embedded Bias: How Medical Records Sow Discrimination

Link: https://khn.org/news/article/electronic-medical-records-doctor-bias-open-notes-treatment-discrimination/

Excerpt:

Narrow or prejudiced thinking is simple to write down and easy to copy and paste over and over. Descriptions such as “difficult” and “disruptive” can become hard to escape. Once so labeled, patients can experience “downstream effects,” said Dr. Hardeep Singh, an expert in misdiagnosis who works at the Michael E. DeBakey Veterans Affairs Medical Center in Houston. He estimates misdiagnosis affects 12 million patients a year.

Conveying bias can be as simple as a pair of quotation marks. One team of researchers found that Black patients, in particular, were quoted in their records more frequently than other patients when physicians were characterizing their symptoms or health issues. The quotation mark patterns detected by researchers could be a sign of disrespect, used to communicate irony or sarcasm to future clinical readers. Among the types of phrases the researchers spotlighted were colloquial language or statements made in Black or ethnic slang.

“Black patients may be subject to systematic bias in physicians’ perceptions of their credibility,” the authors of the paper wrote.

That’s just one study in an incoming tide focused on the variations in the language that clinicians use to describe patients of different races and genders. In many ways, the research is just catching up to what patients and doctors knew already, that discrimination can be conveyed and furthered by partial accounts.

Author(s): Darius Tahir

Publication Date: 26 Sept 2022

Publication Site: Kaiser Health News

An Actuarial View of Correlation and Causation—From Interpretation to Practice to Implications

Link: https://www.actuary.org/sites/default/files/2022-07/Correlation.IB_.6.22_final.pdf

Excerpt:

Examine the quality of the theory behind the correlated variables. Is there good reason to believe, as validated by research, the variables would occur together? If such validation does not exist, then the relationship may be spurious. For example, is there any validation to the relationship between the number of driver deaths in railway collisions by year (the horizontal axis), and the annual imports of Norwegian crude oil by the U.S., as depicted below? This is an example of a spurious correlation. It is not clear what a rational explanation would be for this relationship.
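A short sketch can show how easily a high correlation coefficient arises between series with no plausible link. The two series below are invented for illustration (they are not the railway or crude-oil data from the issue brief); both simply drift downward over time, and that shared drift is enough to produce a Pearson coefficient close to 1.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical annual counts for two unrelated quantities.
series_a = [10, 9, 9, 7, 6, 6, 4, 3]
series_b = [100, 95, 88, 80, 77, 65, 60, 52]

r = pearson_r(series_a, series_b)
print(f"Pearson r = {r:.2f}")
```

A coefficient this high says nothing about whether a causal or even a theoretical link exists, which is why the excerpt asks for validation of the underlying theory first.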

Author(s): Data Science and Analytics Committee

Publication Date: July 2022

Publication Site: American Academy of Actuaries

BIG DATA AND ALGORITHMS IN ACTUARIAL MODELING AND CONSUMER IMPACTS

Link: https://www.actuary.org/sites/default/files/2022-08/IABAAug2022_Sandberg_Presentation.pdf

Excerpt:

Systemic Influences and Socioeconomics
❑ Checking for and removing of systemic biases is difficult.
❑ Systemic biases can creep in at every step of the modeling process: data, algorithms, and validation of results.
❑ Human involvement in designing and coding algorithms, where there is a lack of diversity among coders
❑ Biases embedded in training datasets
❑ Use of variables that proxy for membership in a protected class
❑ Statistical discrimination profiling shopping behavior, such as price optimization
❑ Technology-facilitated advertising algorithms used in ad targeting and ad delivery

Author(s): David Sandberg, Data Science and Analytics Committee, AAA

Publication Date: August 2022

Publication Site: American Academy of Actuaries

Consumer Watchdog Calls on Insurance Commissioner Lara to Reject Allstate’s Job-Based Insurance Rate Discrimination, Adopt Regulations to Stop the Practice Industrywide

Link: https://www.prnewswire.com/news-releases/consumer-watchdog-calls-on-insurance-commissioner-lara-to-reject-allstates-job-based-insurance-rate-discrimination-adopt-regulations-to-stop-the-practice-industrywide-301631577.html

Additional: https://consumerwatchdog.org/sites/default/files/2022-09/2022-09-22%20Ltr%20to%20Commissioner%20re%20Allstate%20Auto%20Rate%20Application%20w%20Exhibits.pdf

Excerpt:

Insurance Commissioner Ricardo Lara should reject Allstate’s proposed $165 million auto insurance rate hike and its two-tiered job- and education-based discriminatory rating system, wrote Consumer Watchdog in a letter sent to the Commissioner today. The group called on the Commissioner to adopt regulations to require all insurance companies industrywide to rate Californians fairly, regardless of their job or education levels, as he promised to do nearly three years ago. Additionally, the group urged the Commissioner to notice a public hearing to determine the additional amounts Allstate owes its customers for premium overcharges during the COVID-19 pandemic, when most Californians were driving less.

Overall, the rate hike will impact over 900,000 Allstate policyholders, who face an average $167 annual premium increase.

Under Allstate’s proposed job-based rating plan, low-income workers such as custodians, construction workers, and grocery clerks will pay higher premiums than drivers in the company’s preferred “professional” occupations, including engineers with a college degree, who get an arbitrary 4% rate reduction.

Author(s): Consumer Watchdog

Publication Date: 22 Sept 2022

Publication Site: PRNewswire

Avoiding Unfair Bias in Insurance Applications of AI Models

Link: https://www.soa.org/resources/research-reports/2022/avoid-unfair-bias-ai/

Report: https://www.soa.org/4a36e6/globalassets/assets/files/resources/research-report/2022/avoid-unfair-bias-ai.pdf

Excerpt:

Artificial intelligence (“AI”) adoption in the insurance industry is increasing. One known risk as adoption of AI increases is the potential for unfair bias. Central to understanding where and how unfair bias may occur in AI systems is defining what unfair bias means and what constitutes fairness.

This research identifies methods to avoid or mitigate unfair bias unintentionally caused or exacerbated by the use of AI models and proposes a potential framework for insurance carriers to consider when looking to identify and reduce unfair bias in their AI models. The proposed approach includes five foundational principles as well as a four-part model development framework with five stage gates.

Smith, L.T., E. Pirchalski, and I. Golbin. Avoiding Unfair Bias in Insurance Applications of AI Models. Society of Actuaries, August 2022.

Author(s):

Logan T. Smith, ASA
Emma Pirchalski
Ilana Golbin

Publication Date: August 2022

Publication Site: SOA Research Institute

What can go wrong? Exploring racial equity dataviz and deficit thinking, with Pieta Blakely.

Link: https://3iap.com/what-can-go-wrong-racial-equity-data-visualization-deficit-thinking-VV8acXLQQnWvvg4NLP9LTA/

Excerpt:

For anti-racist dataviz, our most effective tool is context. The way that data is framed can make a very real impact on how it’s interpreted. For example, this case study from the New York Times shows two different framings of the same economic data and how, depending on where the author starts the X-Axis, it can tell 2 very different — but both accurate — stories about the subject.

As Pieta previously highlighted, dataviz in spaces that address race / ethnicity are sensitive to “deficit framing.” That is, when it’s presented in a way that over-emphasizes differences between groups (while hiding the diversity of outcomes within groups), it promotes deficit thinking (see below) and can reinforce stereotypes about the (often minoritized) groups in focus.

In a follow-up study, Eli and Cindy Xiong (of UMass’ HCI-VIS Lab) confirmed Pieta’s arguments, showing that even “neutral” data visualizations of outcome disparities can lead to deficit thinking (and therefore stereotyping) and that the way visualizations are designed can significantly impact these harmful tendencies.

Author(s): Eli Holder, Pieta Blakely

Publication Date: 2 Aug 2022

Publication Site: 3iap