\section{Related Work}
\label{sec:relatedwork}
\subsection{Studies on Accountable ML Models}
\paragraph{Validating Accountability.}
There is a vast body of research on validating and holding ML models accountable. One of the earliest validation techniques, proposed by Pulina and Tacchella \cite{pulina2010abstraction}, utilizes abstraction to explain a special type of neural network, the multi-layer perceptron (MLP). That work introduced a refinement approach to study the safety of MLPs and repair them; however, it is only applicable to a network with six neurons. Gehr et al. \cite{gehr2018ai2} proposed an infrastructure, $AI^{2}$, that converts a neural network with convolutional and fully connected layers into an abstract representation to address the safety and robustness of ML models. This work tackles issues such as the trade-off between precision and scalability and presents an abstract representation of robustness for the convolution operation; however, the infrastructure does not work for complex models and suffers from scalability issues. Du et al. \cite{du2018techniques} surveyed models and research papers and found that a vast majority of work in machine learning has focused on increasing the explainability of models. Their survey provides a clear categorization and a complete overview of prevalent techniques for increasing the interpretability of ML models, aiming to help the community better understand the capabilities and weaknesses of different accountability approaches. The study concludes that prior work does not answer some questions developers have, e.g., \emph{``why Q, not R''}, or, in our case, why accuracy changes under the same experimental setup and how we can trust an ML model if it does not provide a concrete contract to the end user. Another survey \cite{abdul2018trends} examined similar research artifacts and distilled the overall themes for HCI researchers in holding an ML model accountable.
\paragraph{Holding Accountability.}
Jia et al. \cite{jia2019taso} proposed a programming language for pruning neural networks to explain and interpret model behavior and to find bugs in graph-based operations. RELUVAL \cite{wang2018formal} proposed a symbolic interval analysis that provides formal guarantees for deep neural network-based models, formalizing how input dependency information propagates through the network's operations.
%Input dependency and output estimation related problems are addressed by providing an interval that is similar to our work where we demonstrate the output accuracy in terms of an interval based on the input dependencies propagating through the network.
Similar to RELUVAL, Katz et al. \cite{katz2017reluplex} built a framework based on an SMT solver to verify neural networks. Another line of research increases robustness by crafting attacks and building defenses against the adversarial attacks \cite{papernot2016towards} that pose a threat to ML models. Other studies \cite{anderson2019optimization,pan2019static} propose verification processes to defend against such attacks.
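The core idea behind interval-based verification can be illustrated with naive interval propagation, which symbolic approaches such as RELUVAL then refine. The following is a minimal sketch (not RELUVAL's actual algorithm) that pushes an input box through a toy two-layer ReLU network with illustrative, hand-picked weights:

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate an input box [lo, hi] through x -> W @ x + b.

    Positive weights map lower bounds to lower bounds; negative
    weights swap them, so we split W by sign.
    """
    W_pos = np.maximum(W, 0.0)
    W_neg = np.minimum(W, 0.0)
    new_lo = W_pos @ lo + W_neg @ hi + b
    new_hi = W_pos @ hi + W_neg @ lo + b
    return new_lo, new_hi

def interval_relu(lo, hi):
    """ReLU is monotone, so it maps interval bounds directly."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Toy 2-2-1 network with fixed weights (illustrative values only).
W1 = np.array([[1.0, -1.0], [0.5, 2.0]]); b1 = np.array([0.0, -1.0])
W2 = np.array([[1.0, 1.0]]);              b2 = np.array([0.0])

lo, hi = np.array([0.0, 0.0]), np.array([1.0, 1.0])  # input box [0,1]^2
lo, hi = interval_relu(*interval_affine(lo, hi, W1, b1))
lo, hi = interval_affine(lo, hi, W2, b2)
print(lo, hi)  # sound (possibly loose) bounds on the network output
```

Such bounds are sound but can be loose because interval arithmetic forgets input dependencies; tracking symbolic bounds, as RELUVAL does, tightens them.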
\subsection{Accuracy Validation}
Ribeiro et al. \cite{ribeiro2016should} focused on validating the trustworthiness of ML models and proposed LIME, a modular and interpretable approach to explaining the predictions of any model in an accountable manner.
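The local-surrogate idea behind LIME can be sketched in a few lines. The snippet below is a simplified illustration (not the LIME library or its exact algorithm): it samples perturbations around an instance, weights them by proximity, and fits a weighted linear model to the black-box predictions, whose coefficients serve as local feature importances:

```python
import numpy as np

def lime_style_explanation(predict_fn, x, n_samples=500, sigma=0.5, seed=0):
    """LIME-style local surrogate (simplified sketch): sample points
    around x, weight by an exponential proximity kernel, and fit a
    weighted linear model to the black-box outputs."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0, sigma, size=(n_samples, x.size))   # perturbations
    y = predict_fn(Z)                                        # black-box outputs
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * sigma**2))
    sw = np.sqrt(w)                                          # scale rows for WLS
    A = np.hstack([np.ones((n_samples, 1)), Z])              # intercept + features
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]   # per-feature local importance (drop intercept)

# Black box: f(x) = 3*x0 - 2*x1 (linear, so the surrogate recovers it).
f = lambda Z: 3 * Z[:, 0] - 2 * Z[:, 1]
print(lime_style_explanation(f, np.array([1.0, 1.0])))
```

For a nonlinear black box, the recovered coefficients approximate the model's local behavior around `x` rather than its global structure, which is exactly the modularity LIME exploits.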
In another study \cite{du2018towards}, the authors proposed an explainable DNN model that validates the prediction outcome, demonstrating that an abstract representation of DNN-based models can diagnose and interpret the working mechanism of the prediction task.
Zhang et al. \cite{zhang2018interpretable} proposed alternative convolutional neural network (CNN) based models that encode more information than traditional ones, making the CNN model more accountable for its classification accuracy, even though accuracy may decline in some cases. Accuracy validation is therefore crucial when building an interpretable CNN model.
\subsection{Accountability of the DNN Model}
According to \cite{veale2018fairness}, accountability in decision making entails an explanation of the ``ongoing strategy''. In \S\ref{sec:motivation}, we found that a single model structure can exhibit different decision-making capabilities due to the probabilistic distribution asserted in the initialization process. Our preliminary evidence shows that a DNN model can end up achieving different accuracy in different settings. Two simple questions follow: how can we really say that a DNN model performs what it advertises, and is there a reliable solution that does not change with the settings?
To answer these questions and hold the DNN model accountable for its reported accuracy, we propose an approach named ADNN, or accountable DNN.
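The initialization sensitivity motivating ADNN is easy to reproduce. The following sketch (a toy NumPy network, not our actual experimental setup) trains the same one-hidden-layer architecture on the same data while varying only the seed that draws the initial weights; the reported accuracy is then a function of the seed:

```python
import numpy as np

def train_mlp(seed, X, y, hidden=8, epochs=30, lr=0.5):
    """Train a tiny one-hidden-layer network with full-batch gradient
    descent; only the weight initialization depends on `seed`."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 1, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 1, (hidden, 1));          b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)
        p = 1 / (1 + np.exp(-(h @ W2 + b2)))     # sigmoid output
        g = (p - y[:, None]) / len(X)            # dL/dlogit for BCE loss
        gh = (g @ W2.T) * (1 - h**2)             # back through tanh
        W2 -= lr * h.T @ g;  b2 -= lr * g.sum(0)
        W1 -= lr * X.T @ gh; b1 -= lr * gh.sum(0)
    h = np.tanh(X @ W1 + b1)
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))
    return ((p[:, 0] > 0.5).astype(int) == y).mean()

# Fixed synthetic data: an XOR-like task where initialization matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = ((X[:, 0] * X[:, 1]) > 0).astype(int)

for seed in (1, 2, 3):
    print(f"init seed={seed}: accuracy={train_mlp(seed, X, y):.3f}")
```

Each run is deterministic given its seed, yet the advertised accuracy varies with the initialization alone, which is precisely the gap between a reported number and a contract that ADNN targets.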
\subsection{Hyper-parameter Optimization}
Zhao et al. \cite{zhao2017towards} proposed a technique for improving accuracy in iris recognition by utilizing features associated with spatial context, tuning the hyperparameters to obtain the maximum performance. Our proposed work differs in this context: rather than tuning hyperparameters to reach near-optimal accuracy, we focus on user-intent-based search, restricting the best possible modification of the weight and bias initialization by a maximum allowable time, number of trials, and accuracy gain.
Prior work \cite{bergstra2011algorithms} demonstrated that hyper-parameter optimization via random search is sufficient for accurately learning deep neural networks on various datasets. However, to the best of our knowledge, no closely related work confines the search process to a user-provided maximum allowable time, number of trials, and accuracy gain in order to make a DNN model's accuracy accountable.
In our proposed approach, we learn how the initialization parameters affect accuracy through a manual study of the \emph{Keras} documentation. We implement a search-and-update approach that modifies the weights and biases, restricted by the user's intent, i.e., maximum time, gain, and number of trials.
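The budget-restricted search loop can be sketched as follows. This is a hypothetical helper, not our actual implementation: `accountable_search` and `acc_of_seed` are illustrative names, and the toy `evaluate` function stands in for retraining the model under a candidate initialization:

```python
import time
import random

def accountable_search(evaluate, baseline, max_time, max_trials, min_gain,
                       rng=None):
    """Try random weight/bias initialization seeds, keeping the best
    accuracy, and stop as soon as any user-intent budget is met:
    wall-clock time, trial count, or desired accuracy gain.
    (Hypothetical sketch; the paper's actual interface may differ.)"""
    rng = rng or random.Random(0)
    best_seed, best_acc = None, baseline
    start = time.monotonic()
    trials = 0
    while (time.monotonic() - start < max_time      # max time budget
           and trials < max_trials                  # max trial budget
           and best_acc - baseline < min_gain):     # desired gain reached?
        seed = rng.randrange(2**31)
        acc = evaluate(seed)   # stand-in for re-initializing and retraining
        if acc > best_acc:
            best_seed, best_acc = seed, acc
        trials += 1
    return best_seed, best_acc, trials

# Toy stand-in for retraining: accuracy is a fixed function of the seed.
acc_of_seed = lambda s: 0.80 + (s % 100) / 1000.0   # in [0.800, 0.899]
best_seed, best_acc, trials = accountable_search(
    acc_of_seed, baseline=0.82, max_time=1.0, max_trials=50, min_gain=0.05)
print(best_acc, trials)
```

Whichever budget is exhausted first ends the search, so the user's stated intent, not the optimizer, bounds the cost of making the reported accuracy accountable.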