Tuesday, August 13, 2019

Differential Privacy and Federated Learning 101

A glossary of vocabulary for differential privacy, secure AI, and federated learning on sensitive datasets. All concepts are theoretical and for discussion purposes only; they are NOT intended for production or any professional usage and should NOT be used as such. This is a very new, experimental area of AI.

Before you read our article, please read our disclaimer page. It is very important to note that this article, and all articles and content on our website and affiliated websites, are for discussion / entertainment purposes only. They should NOT be considered professional advice. Content on this site, our affiliated sites, and our social media is NOT intended for commercial purposes, NOT for production purposes, and NOT for professional usage.

What is a scenario where differential privacy is useful? When a researcher wants to analyze a sensitive dataset, such as one containing patient data; or wants to train a model that learns sensitive features; or wants to make a sensitive prediction, e.g. whether a patient has a health condition such as HIV.

Differential Privacy

"mathematical definition of privacy. In the simplest setting, consider an algorithm that analyzes a dataset and computes statistics about it (such as the data's mean, variance, median, mode, etc.). Such an algorithm is said to be differentially private if by looking at the output, one cannot tell whether any individual's data was included in the original dataset or not. In other words, the guarantee of a differentially private algorithm is that its behavior hardly changes when a single individual joins or leaves the dataset " - Harvard Differential Privacy Group

Query:
A computation run against the database, e.g. the mean, variance, median, or mode (see the quote above).
Advanced: training a machine learning or deep learning model.
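A simple query like the mean can be made differentially private with the Laplace mechanism: clip each value to a known range, compute the mean, and add Laplace noise scaled to the query's sensitivity divided by epsilon. A minimal sketch (function name and parameters are our own illustration, not any library's API; this is a toy, not a production mechanism):

```python
import random

def dp_mean(values, lower, upper, epsilon):
    """Toy differentially private mean via the Laplace mechanism.

    Each value is clipped to [lower, upper], so the sensitivity of the
    mean over n records is (upper - lower) / n.
    """
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # Laplace(0, scale) noise, built as the difference of two
    # exponential draws with mean `scale` each.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_mean + noise

ages = [34, 45, 29, 61, 50, 38]
print(dp_mean(ages, lower=0, upper=100, epsilon=1.0))
```

Smaller epsilon means more noise and stronger privacy; larger epsilon means a more accurate answer but a weaker guarantee.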

Differentially private tools

Sensitive database

Anonymization: the old way, removing sensitive, personally identifiable information, has been shown to sometimes fail. It offers no guarantee.

Example where data thought to be anonymous fails to protect privacy:

Latanya Sweeney showed that gender, date of birth, and ZIP code together can identify many Americans. This is the famous example in which a governor's medical record was re-identified. This is known as a linkage attack.
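The attack works by joining an "anonymized" table with a public one on shared quasi-identifiers. A toy sketch with hypothetical data (names and records are made up for illustration):

```python
# Toy linkage attack: the medical table has names removed, but joining it
# with a public voter roll on quasi-identifiers re-identifies everyone.
medical = [
    {"zip": "02138", "dob": "1945-07-31", "sex": "F", "diagnosis": "flu"},
    {"zip": "02139", "dob": "1962-01-15", "sex": "M", "diagnosis": "HIV"},
]
voters = [  # publicly available records, with names
    {"name": "Alice", "zip": "02138", "dob": "1945-07-31", "sex": "F"},
    {"name": "Bob",   "zip": "02139", "dob": "1962-01-15", "sex": "M"},
]

quasi_identifiers = ("zip", "dob", "sex")
reidentified = [
    {"name": v["name"], "diagnosis": m["diagnosis"]}
    for m in medical
    for v in voters
    if all(m[k] == v[k] for k in quasi_identifiers)
]
print(reidentified)
```

No column in the medical table is a name, yet every row is linked back to a person.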

Linkage Attack:
See above

"too many innocuous (even completely random) queries about a database inherently violates the privacy of its individual contributors. ... tradeoff between statistical utility and privacy." - Harvard Privacy Group

Owner: the worker that initially owns a piece of data; by default the local worker, "me".
Pointer: an API for send()-ing data to, and get()-ing data back from, VirtualWorkers.
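The send()/get() pointer idea can be illustrated with a toy sketch. This is NOT the real PySyft API; the classes and function below are our own simplified stand-ins to show the semantics: after send(), the data lives on the remote worker and you hold only a pointer, and get() brings it back.

```python
class VirtualWorker:
    """Toy stand-in for a remote worker that stores objects by id."""
    def __init__(self, id):
        self.id = id
        self._store = {}

class Pointer:
    """Toy pointer: a local reference to data living on another worker."""
    def __init__(self, worker, obj_id):
        self.worker = worker
        self.obj_id = obj_id

    def get(self):
        # Retrieve the data back from the remote worker.
        return self.worker._store.pop(self.obj_id)

def send(data, worker, obj_id):
    # Move the data to the worker; keep only a pointer locally.
    worker._store[obj_id] = data
    return Pointer(worker, obj_id)

bob = VirtualWorker(id="bob")
ptr = send([1, 2, 3], bob, obj_id="x")
print(ptr.worker.id)  # the data now lives on "bob"
print(ptr.get())      # brings it back: [1, 2, 3]
```

In PySyft itself the same pattern is applied to tensors, so a model can be trained on data that never leaves the data owner's worker.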

PySyft: https://github.com/OpenMined/PySyft

Deep Learning using PySyft

Machine Learning Workflow

Data cleaning:
Missing data
Outliers
Others: duplicates, typos, special characters
Strategy for missing data: imputation with the mean, median, etc.
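Imputation for missing data can be sketched in a few lines. The function below is a hypothetical helper of our own (not from any library) that fills missing entries with the mean or median of the observed values:

```python
from statistics import median

def impute_missing(values, strategy="median"):
    """Toy imputation: fill None entries with the mean or median
    of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    if strategy == "median":
        fill = median(observed)
    else:  # "mean"
        fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]

print(impute_missing([3, None, 7, 5, None]))  # -> [3, 5, 7, 5, 5]
```

Real pipelines (e.g. scikit-learn's imputers) offer the same strategies plus more sophisticated ones, but the core idea is exactly this substitution.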