Dear colleagues,
We have developed a toolkit to interpret deep NLP models with a focus on *neuron interpretation*.
The toolkit is available as: pip install neurox

Project Website: https://neurox.qcri.org/
Documentation: https://neurox.qcri.org/docs/
Git: https://github.com/fdalvi/NeuroX
NeuroX implements a number of features that facilitate model interpretation, such as:
- Word-level activation extraction (contextualized embeddings)
- Integration with Hugging Face
- Implementations of several neuron probing methods (identifying neurons that learn a linguistic property, such as noun)
- Visualization of the behavior of a neuron across a set of examples
- and many more
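
To give a rough idea of what identifying neurons for a property involves, here is a minimal, self-contained sketch. It does not use the NeuroX API and is not one of the toolkit's methods: it swaps in a simple correlation-based ranking over synthetic activations, and every name in it (`rank_neurons`, the planted neuron 3, the data sizes) is an assumption made for illustration only.

```python
import random
from statistics import mean, pstdev


def rank_neurons(activations, labels):
    """Rank neurons by absolute Pearson correlation with a binary label.

    activations: list of rows, one per word, each a list of neuron values.
    labels: list of 0/1 values (e.g. 1 if the word is a noun).
    Returns (neuron indices sorted from most to least correlated, scores).
    """
    n_neurons = len(activations[0])
    label_mean, label_std = mean(labels), pstdev(labels)
    scores = []
    for j in range(n_neurons):
        column = [row[j] for row in activations]
        col_mean, col_std = mean(column), pstdev(column)
        if col_std == 0 or label_std == 0:
            # A constant neuron (or constant label) carries no signal.
            scores.append(0.0)
            continue
        cov = mean((a - col_mean) * (y - label_mean)
                   for a, y in zip(column, labels))
        scores.append(abs(cov / (col_std * label_std)))
    ranking = sorted(range(n_neurons), key=lambda j: scores[j], reverse=True)
    return ranking, scores


# Synthetic stand-in for extracted activations: 200 "words", 8 "neurons".
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
activations = []
for y in labels:
    row = [rng.gauss(0.0, 1.0) for _ in range(8)]
    row[3] += 3.0 * y  # hypothetical: neuron 3 is made to track the label
    activations.append(row)

ranking, scores = rank_neurons(activations, labels)
print("Most label-correlated neuron:", ranking[0])
```

In practice the activations would come from a real model rather than a random generator, and the ranking method would be one of those documented on the project website.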
If you have any questions or feedback, feel free to reach out to me or open an issue on the project's GitHub repository.
Thank you,