Keep your ML data on the down low with TensorFlow Privacy
Worried your ML models might blab about proprietary data? With TensorFlow Privacy, developers can keep their machine learning models from leaking the data they were trained on. This Python library uses differential privacy to train ML models while limiting data security and privacy risks.
There’s a fine line between learning from data and memorizing it. Sometimes, our machine learning models have a problem with that. Enter TensorFlow Privacy, a new library that uses differential privacy to protect the initial dataset with little loss in performance.
Machine learning can do amazing things, but it often relies on training models on proprietary or sensitive data. Although these models should encode general patterns rather than specific examples, that’s not always the case. By following the theory of differential privacy, TensorFlow Privacy helps your trained models keep their training data private, backed by strong mathematical guarantees.
ML on the DL
No one likes their business broadcast across the internet without their consent. So, how does TensorFlow Privacy make sure that the training data is secure?
Differential privacy is the promise to users that they “will not be affected, adversely or otherwise, by allowing their data to be used in any study or analysis, no matter what other studies, data sets, or information sources, are available.” Essentially, differential privacy addresses the paradox of learning nothing about an individual while learning useful information about a population.
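To make that promise a little more concrete, here is a minimal sketch of one classic differentially private technique, the Laplace mechanism, applied to a simple counting query. It is not taken from TensorFlow Privacy. The function names, the sample data, and the epsilon value are all assumptions chosen for illustration. The idea is that adding or removing one person changes a count by at most 1, so noise scaled to 1 / epsilon hides any single individual while the overall count stays useful.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that match `predicate`."""
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: ages of eight people (illustrative only).
rng = random.Random(0)
ages = [23, 35, 41, 29, 62, 57, 33, 48]
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of people aged 40 or older: {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy. A larger epsilon means answers closer to the true count and weaker privacy.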
TensorFlow Privacy makes sure that rare details in a training set won’t be memorized, while the model still learns the general patterns that another, less responsible ML model would. Sometimes, there are just statistical outliers that should not be taken into account, you know? The model will only pay attention to such details if they are statistically representative and repeated multiple times.
The library comes with tutorials and analysis tools for computing the privacy guarantees provided. It also comes with a research directory, full of the code necessary to reproduce results from research papers related to privacy in machine learning.
Responsible ML development is possible for developers of all skill levels with TensorFlow Privacy. You don’t need to be a data security expert. Plus, developers with existing TensorFlow models won’t need to change their model architecture, training procedures, or processes. Just make a few simple code changes, tune the privacy hyperparameters, and voilà! Your secrets are safe with TensorFlow Privacy.
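What might those privacy hyperparameters control? The article does not spell out the mechanism, so the following is only a sketch of one widely used approach to differentially private training, often called DP-SGD: clip each example's gradient so no single record can dominate, then add Gaussian noise before averaging. It is written in plain Python, not with the TensorFlow Privacy API. The names l2_norm_clip and noise_multiplier, the sample gradients, and the chosen values are assumptions for illustration.

```python
import math
import random


def clip_gradient(grad, l2_norm_clip):
    """Scale `grad` down so its L2 norm is at most `l2_norm_clip`."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, l2_norm_clip / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def private_mean_gradient(per_example_grads, l2_norm_clip, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    dim = len(per_example_grads[0])
    clipped = [clip_gradient(g, l2_norm_clip) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # The noise is calibrated to the clipping bound, which limits how much
    # any single example can contribute to the sum.
    stddev = noise_multiplier * l2_norm_clip
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    return [value / len(per_example_grads) for value in noisy]


# Hypothetical per-example gradients; the last one is an extreme outlier.
rng = random.Random(0)
grads = [[0.1, -0.2], [0.3, 0.1], [30.0, -40.0]]
update = private_mean_gradient(grads, l2_norm_clip=1.0, noise_multiplier=1.1, rng=rng)
print(update)
```

In this sketch, the outlier's gradient is scaled down to the same maximum norm as everyone else's, which is why a single unusual record has limited influence on the update. Raising the noise multiplier strengthens the privacy protection at the cost of noisier training.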
Getting TensorFlow Privacy
Don’t expect this Python library to kiss and tell. It’s all available on GitHub, but there are some things you need to know before you get started. Prerequisites include TensorFlow, scipy, mpmath, and tensorflow_datasets. Only time will tell if this becomes a basic part of TensorFlow 2.0!
There are a few tutorials available out of the box in the tutorials/ folder. These are mostly scripts demonstrating how to use the TensorFlow Privacy library features.
Contributions to the library are extremely welcome! Send your bug fixes and new features through GitHub pull requests. Check out the contribution guidelines for more information.