Private ML marketplaces
Fixing tradeoffs between various private ML strategies
- Transact additional data between the model owner and the data owner.
- Fairly price the transaction
- Preserve model and data details
Model owners want to improve their models with additional training data, and data owners want to be compensated fairly. We discuss various previously proposed approaches, including smart contracts, data encryption, transformation and approximation, and federated learning. We propose a Model-Data Efficacy approach based on model approximation, and give an example using model extraction.
Consider the inference process with respect to model M. Encrypting all operations conceals the model; the data owner can perform inference on the data and updates on the model. Fully homomorphic encryption of M preserves computational correctness without revealing model details, at the expense of efficiency.
Additionally, a scaling function can be overlaid on M to facilitate fair pricing and secure transactions.
Yet the encryption and computation are too slow to be practical.
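To make the encrypted-inference idea concrete, here is a minimal sketch using a toy Paillier cryptosystem (additively homomorphic rather than fully homomorphic, so it only covers an encrypted linear layer). The tiny primes, weights, and features are all illustrative and insecure; a real deployment would use a vetted FHE library.

```python
import math
import random

# Toy Paillier cryptosystem with tiny, insecure primes, just to illustrate
# inference on encrypted data. Not a real FHE scheme: it supports addition
# and plaintext scaling only, enough for an encrypted dot product.
p, q = 61, 53
n = p * q                      # public modulus
n2 = n * n
lam = math.lcm(p - 1, q - 1)   # Carmichael function of n
mu = pow(lam, -1, n)           # decryption constant (for generator g = n + 1)

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    # E(m) = (n+1)^m * r^n mod n^2
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    u = pow(c, lam, n2)
    return ((u - 1) // n) * mu % n

# Model owner's plaintext weights; data owner's encrypted features.
w = [3, 1, 2]
x = [5, 7, 2]
enc_x = [encrypt(xi) for xi in x]

# Homomorphically compute w . x, using E(a)^k = E(k*a) and E(a)*E(b) = E(a+b).
acc = 1
for wi, ci in zip(w, enc_x):
    acc = (acc * pow(ci, wi, n2)) % n2

print(decrypt(acc))  # 26 == 3*5 + 1*7 + 2*2
```

The model owner never sees the plaintext features, and the data owner never sees the weights applied per feature, which is the privacy split the poster describes; the cost is that every operation runs over large modular exponentiations.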
Figure 1: Federated learning: distributed learning with differential privacy guarantees, especially useful for simple models and many users.
- M is a trained model with parameters θ, owned by the model owner. Details regarding M and θ are valuable.
- The data owner owns additional training data D that may improve M. The data owner wants to protect the data's details, lest they be shared.
- ΔM, the resulting update, proxies the benefit M gains from the additional training data D.
- Useful for compliance and pure inference, but not private against black-box models if the updates are visible; requires customized networks.
- Distributed, collaborative learning.
- Differentially private update aggregation.
- Complicated setup: on-device training (especially useful for simple models); customized protocol design and optimization for integrating classifiers.
- Requires many users to be private (using random rotation, etc., to ensure privacy).
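The aggregation step above can be sketched as federated averaging with noise added to the aggregate, in the spirit of differentially private update aggregation. The clipping norm, noise scale, and client updates below are illustrative placeholders, not a calibrated DP mechanism.

```python
import random

# Minimal sketch of federated averaging with Gaussian noise on the aggregate.
# Clipping bounds each client's influence; noise masks any single client's
# contribution. Parameters are illustrative, not calibrated for a DP budget.
def clip(update, max_norm=1.0):
    norm = sum(u * u for u in update) ** 0.5
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]

def aggregate(client_updates, noise_std=0.1, max_norm=1.0):
    clipped = [clip(u, max_norm) for u in client_updates]
    d = len(clipped[0])
    avg = [sum(u[i] for u in clipped) / len(clipped) for i in range(d)]
    # Server-side Gaussian noise on the averaged update.
    return [a + random.gauss(0.0, noise_std / len(clipped)) for a in avg]

updates = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]]   # one update per client
new_delta = aggregate(updates)
print(new_delta)  # noisy average of the clipped client updates
```

This also shows why the approach needs many users: with few clients, the noise required to hide any one contribution overwhelms the signal in the average.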
Figure 3: How a pricing function is composed.
Data: black-box model M, data D, additional data D′, ideal model size s. Result: price of D′ w.r.t. M.
Let T ← ∅ // learn a decision tree approximating M
while T does not approximate M well do
  refine T on queries to M
end while
while T exceeds size s do
  trim or compress T // for optional encryption
end while
Algorithm 1: Model extraction as MDE. Draws properties from any model; black boxes can be handled in escrow. Applies to interpretability and model testing. Trades accuracy for size; encrypt if tiny.
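A minimal sketch of the extraction step, assuming only query access to the black box: approximate it with a depth-1 decision tree (a stump) fit to its answers on probe points. The hidden decision rule, probe grid, and stump learner are illustrative stand-ins for the escrowed model and the tree learner in Algorithm 1.

```python
# Model extraction sketch: learn a surrogate from query access alone.
def black_box(x):
    # Hidden model: the owner's decision rule (unknown to the extractor).
    return 1 if 0.3 * x + 1.0 > 2.0 else 0   # i.e. x > 3.333...

queries = [i / 10 for i in range(0, 100)]     # probe points in [0, 10)
labels = [black_box(x) for x in queries]

def fit_stump(xs, ys):
    # Exhaustively pick the threshold minimizing disagreement with the labels.
    best = None
    for t in xs:
        errs = sum(1 for x, y in zip(xs, ys) if (x > t) != y)  # predict x > t
        if best is None or errs < best[1]:
            best = (t, errs)
    return best[0]

threshold = fit_stump(queries, labels)

def surrogate(x):
    return 1 if x > threshold else 0

agreement = sum(surrogate(x) == black_box(x) for x in queries) / len(queries)
print(round(agreement, 3))  # fraction of probes where the stump matches
```

The surrogate is small (one threshold), so it can be audited, priced against, or even encrypted cheaply, which is the accuracy-for-size trade the algorithm caption describes.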
- Data that is useful can be priced highly, and vice versa.
- Due to a mismatch in representation between training data and test data (e.g., insufficient data), duplicate data may be priced as if it reduced error.
Solution: overfit the approximation to D until the resulting approximation does not price duplicate data.
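A toy illustration of this fix, assuming a maximally overfit nearest-neighbour surrogate: price offered data by the surrogate's error on it, so exact duplicates of the original training data, which the surrogate has memorized, price at zero. The datasets and the linear price rate are hypothetical.

```python
# Duplicate-pricing fix: overfit the surrogate to the model owner's data D,
# then price offered data by the surrogate's error rate on it.
train = {(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)}   # D: (feature, label)

def surrogate(x):
    # 1-nearest-neighbour over D: a maximally overfit approximation.
    return min(train, key=lambda p: abs(p[0] - x))[1]

def price(offered, rate=10.0):
    # Price proportional to how much the offered data disagrees with the
    # surrogate, i.e. how much it could still teach the model.
    errors = sum(surrogate(x) != y for x, y in offered)
    return rate * errors / len(offered)

duplicates = [(2.0, 0), (3.0, 1)]   # already in D -> memorized -> worthless
novel = [(2.4, 1), (2.6, 0)]        # contradicts the surrogate -> valuable
print(price(duplicates), price(novel))
```

Because the surrogate already classifies every point of D correctly, resubmitted duplicates contribute zero error and hence zero price, while genuinely novel points still register.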
To trade additional training data fairly and practically, we introduce Model-Data Efficacy (MDE) approaches, based on model approximation of black-box models, that price the data without training it on the original model.
Approximating the effect of data on the model through model approximation (Model-Data Efficacy) is a moderately practical solution that preserves both model and data privacy. Model extraction, for example, can be used for fair pricing: useless data can be priced minimally, while useful data can be priced high.
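One way to realize this pricing, sketched under strong simplifying assumptions: retrain the surrogate with the offered data and price the accuracy gain on a held-out set. The majority-vote "model", the datasets, and the linear price scale are all hypothetical stand-ins for the extracted surrogate and a real pricing function.

```python
# MDE pricing sketch: price offered data by the accuracy gain it produces
# on the surrogate, never touching the original black-box model.
def majority_label(data):
    ones = sum(y for _, y in data)
    return 1 if ones * 2 >= len(data) else 0

def accuracy(label, heldout):
    return sum(label == y for _, y in heldout) / len(heldout)

def mde_price(extracted, offered, heldout, rate=100.0):
    before = accuracy(majority_label(extracted), heldout)
    after = accuracy(majority_label(extracted + offered), heldout)
    return max(0.0, after - before) * rate   # useless data -> price 0

extracted = [(0.1, 0), (0.2, 0), (0.3, 1)]        # surrogate's training view
heldout = [(0.5, 1), (0.6, 1), (0.7, 1), (0.8, 0)]
useful = [(0.9, 1), (1.0, 1)]                     # shifts the majority to 1
useless = [(0.15, 0)]                             # reinforces the status quo
print(mde_price(extracted, useful, heldout), mde_price(extracted, useless, heldout))
```

Useless data leaves the surrogate's behavior unchanged and prices at zero, while data that measurably improves held-out accuracy earns a proportional price, matching the fairness goal above.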
|Strategy|Data exposure|Model exposure|Practicality|Fairness|Examples|
|---|---|---|---|---|---|
|Giving up data|High|Low|High|Low|Default ML|
|Giving up model|Low|High|High|Low|Academic researchers|
|Escrow smart contract|Medium|Medium|Low|High|Numerai, Enigma|
|Encrypting the model|High|Low|Low|N/A|Corti, PySyft|
|Encrypting the data|Medium|Low|Low|Medium|Microsoft SEAL|
|Federated learning|Low|Low|Low|High|Google (for Android data)|
Against black-box models, encrypting or approximating data has flaws regarding privacy. While federated learning with differential privacy achieves privacy for both the model owner and the data owner, it is less practical for one-time transactions.
- Pre-training on data synthesized from existing data eliminates tuning on the original model and refines the resulting update into a pricing metric.
- Stronger transactional security against adversarial attacks on the model owner.
- Aono, Yoshinori, et al. “Privacy-preserving deep learning via additively homomorphic encryption.” IEEE Transactions on Information Forensics and Security 13.5 (2017): 1333-1345.
- Bastani, Osbert, Carolyn Kim, and Hamsa Bastani. “Interpretability via model extraction.” arXiv preprint arXiv:1706.09773 (2017).