build trust with black box
- jimli44
- Dec 26, 2024
- 2 min read
Updated: Dec 27, 2024
Putting a black box in a product requires courage, a few ways to turn some of the courage into confidence.

A NN model is pretty much a black box. Feed input to it and it produces an output, we barely have visibility about what’s going on inside, sometimes this might make us nervous.
We could dissect a model to try to make sense of it, like my earlier post “from FFT to SVD” has done. This approach is not a very practical one and likely requires the model to be architected in interpretable stages, which could be a limitation.
There are more practical ways to build trust with a black box.
The old fashion testing
Still the sharpest tool in the toolbox, stress testing, corner cases etc. We all familiar with this therefore let's focus on the other two.
Understand how the model is trained
Model training essentially is an optimization process and the optimization algorithm is incredibly efficient. If we assume the model training process will digest all the training material fully, then the model is as good (or bad) as the training material. Looking deep inside the data and code to check is a key step to gain confidence, not just correctness, but also comprehensiveness.
Mistakes in data or code. The high tolerance to noise and error characteristic of NN makes mistakes harder to be spotted from the output. Unlike conventional algo, a mistake often leads to unexpected result (because we know what to expect), a mistake in NN often just cause sub-optimal performance. In a development scenario that we don't know what performance level should be expected, NN output would not give hint about any mistake.
Coverage of elements the model will be exposed to in practice. A good example is, a keyword recognition model trained with a lot of the popular dataset works really well with recorded material, but then perform miserably in practice simply due to users don’t speak right in front of the device.
Choose the right model architecture
Model architecture is not just a performance consideration, it could help with adding confidence too. Take following two as example.

The left one has total freedom to re-create a new frame, therefore it has potential to be able to correct corrupted information, but it also has the potential to produce unwanted content, which will require more effort to verify.
While the mask-based architecture on the right puts the thinking onto producing a mask, if the mask is within range [0,1], then we know for sure the model would not add anything to the original input, but will only remove unwanted components. More peace of mind, but less room for creativity.
Kommentare