Varun Bhatnagar


Biased black-box algorithms have drawn increasing public scrutiny. This is especially true of black-box algorithms with the potential to harm protected or vulnerable populations.1 One type of black-box algorithm, the neural network, is both opaque and capable of high accuracy. However, neural networks do not provide insight into the relative importance of the predictors or covariates, or into the structure of their underlying relationships with the modelled outcomes.2 There are methods to combat a neural network’s lack of transparency: globally or locally interpretable post-hoc explanatory models.3 However, the threat of such measures usually does not bar an actor from deploying a black-box algorithm that generates unfair outcomes along racial, class, or gender lines.4

Fortunately, researchers have recognized this issue and developed interpretability frameworks to better understand such black-box algorithms. One of these remedies, the Shapley Additive Explanations (“SHAP”) method, ranks the determinative factors that led to the algorithm’s final decision and measures the partial effects of the independent variables used in the model.5 Another, the Local Interpretable Model-agnostic Explanations (“LIME”) method, takes a similar approach to reverse-engineering the determinative factors harnessed by the algorithm.6 Both SHAP and LIME have the potential to shine light into even the most accurate, precise black-box algorithms.
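To make the idea of ranking determinative factors concrete, the following is a minimal sketch of the Shapley attribution that SHAP approximates, computed exactly by enumerating every coalition of features. It does not use the SHAP library, and the "braking risk" model, the feature names, and the all-zero baseline are hypothetical values chosen purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution for one prediction.

    Features inside a coalition take the instance's values; features
    outside it are held at the baseline (an assumed reference input).
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for feature in names:
        others = [k for k in names if k != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of `feature` when added to coalition s.
                total += weight * (value(s | {feature}) - value(s))
        phi[feature] = total
    return phi


# Hypothetical toy model: a "braking risk" score that ignores the hour of day.
def toy_model(x):
    return 2.0 * x["speed"] + 1.0 * x["rain"] + 0.5 * x["speed"] * x["rain"]


instance = {"speed": 3.0, "rain": 1.0, "hour": 5.0}
baseline = {"speed": 0.0, "rain": 0.0, "hour": 0.0}

phi = shapley_values(toy_model, instance, baseline)
# Rank features by the magnitude of their contribution to this one decision.
ranking = sorted(phi, key=lambda k: abs(phi[k]), reverse=True)
print(ranking, phi)
```

In this toy case the attributions sum to the difference between the model's output for the instance and for the baseline, and the unused "hour" feature receives an attribution of zero. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations.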

These black-box algorithms can harm people’s physical well-being and property interests.7 However, algorithm developers currently hide behind the nominally impenetrable nature of the algorithm to shield themselves from liability. These developers claim that black-box algorithms are the industry standard because of the increased accuracy and precision that these algorithms typically possess. SHAP/LIME, however, can ascertain which factors might be clouding the judgment of the algorithm and therefore causing harm. As such, SHAP/LIME may lower the foreseeability threshold currently set by tort law and help consumer-rights advocates combat institutions that recklessly foist malevolent algorithms upon the public.

Part II will provide an overview of the SHAP/LIME methods and apply them to a tort scenario involving a self-driving car accident. Part III will cover the potential tort claims that may arise out of the self-driving car accident, and how SHAP/LIME would advance each of these claims. SHAP/LIME’s output has not yet been compared to the foreseeability threshold under negligence or product/service liability. Numerous factors weigh both for and against SHAP/LIME reaching that threshold. The implications are severe: if the foreseeability threshold is not reached, a finder of fact might not find fault with the algorithm generator. Part IV will cover the evidentiary objections that might arise when submitting SHAP/LIME-generated evidence for admission. Reverse-engineering an algorithm mirrors crime scene re-creation. Thus, the evidentiary issues involved in re-creating crime scenes also arise when reverse-engineering algorithms.8 Important questions on relevance, authenticity, and accessibility to the algorithm directly affect the viability of submitting evidence derived using either the SHAP or LIME methods.9 Part V will conclude by contextualizing the need for transparency within an increasingly algorithm-driven society.

I conclude that tort law’s foreseeability threshold is currently not fit for purpose when it comes to delivering justice to victims of biased black-box algorithms. As for compliance with the Federal Rules of Evidence, the admissibility of SHAP/LIME-generated evidence depends on the statistical confidence level of the method’s results. I also conclude that SHAP/LIME have generally been properly tested and accepted by the scientific community, so statistically relevant SHAP/LIME-generated evidence can probably be admitted.10