10MA – Security and Machine Learning

When training ML models, several security aspects can be important. Here are some examples:

Some security goals.

  1. Training set privacy. An adversary who is familiar with the model cannot get “any” information about the data points in the training set.
  2. Model secrecy. An adversary who can query the model as a black box, getting predictions for any input, cannot obtain information about the model parameters.
  3. Model reliability. The model should behave in a way that humans can predict.
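
To make the model secrecy goal concrete, here is a minimal sketch of how it can fail in the simplest possible case. It assumes the black box is a plain linear model that returns its raw score; `make_black_box` and `extract_linear_model` are hypothetical names for this illustration and are not taken from the linked papers. With one query at the zero vector and one per unit vector, the adversary reads off the bias and every weight.

```python
def make_black_box(weights, bias):
    """Hypothetical prediction API: the caller sees outputs, never the parameters."""
    def predict(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return predict


def extract_linear_model(predict, dim):
    """Recover (weights, bias) of a linear model using dim + 1 queries."""
    bias = predict([0.0] * dim)           # f(0) = bias
    weights = []
    for i in range(dim):
        unit = [0.0] * dim
        unit[i] = 1.0
        weights.append(predict(unit) - bias)  # f(e_i) - f(0) = w_i
    return weights, bias


# Assumed secret parameters, chosen only for the demonstration.
oracle = make_black_box([2.0, -1.0, 0.5], 3.0)
stolen_weights, stolen_bias = extract_linear_model(oracle, dim=3)
print(stolen_weights, stolen_bias)
```

Real models are not this easy to recover, but the sketch shows why exposing exact predictions for arbitrary inputs already leaks information about the parameters.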

Links to related attacks

  1. Membership Inference Attacks against Machine Learning Models
  2. Stealing Machine Learning Models via Prediction APIs
  3. Breaking Linear Classifiers on ImageNet

10MA – Should I do short analyses?

I’m trying to do a 10MA – “10 Minute Analysis”. The goal is to write the post in just 10 minutes, and see what comes out of it.

This post is about the benefits vs the downsides of making 10MAs. Hopefully we’ll reach some conclusion.

So why do I write these analyses anyway? The first, and most important, reason is my own self-improvement. The second reason is that I am a big believer in sharing knowledge and in openness, and I hope some of what I plan to write here will be of use to other people later on.

Effects on my self improvement:

  • Trains intuitive analysis, and coming up with a variety of ideas, as opposed to thorough and more linear thinking. In general I am better at this kind of thinking.
  • Trains writing quickly, and getting more thoughts into text. This is very important to me.
  • Less time to learn how to formulate things correctly.

Effects on what others will read:

  • Quantity instead of quality. Probably not too bad, if I want to spread ideas and let others think for themselves.

Conclusion:

This is fun, and it can be helpful to do alongside deep analysis.

Single-use code for 3D printing

Once 3D printers are capable and cheap enough, they could bring about an enormous economic change. In this post I discuss the main reasons for this economic change, and ponder some technological concepts which may restrict it. I am not sure if this restriction is beneficial or not, as we’ll discuss in the summary.

Digitization ⇒ duplicability

If the information of the product is entirely digital, then there are two main consequences:

  • It will be easy to share the product p2p. We see this today in many areas, such as music, film or electronic books, where downloaded copies can be shared freely as torrents or in file sharing sites.
  • It will be easy to “use” the product more than once. We usually take it for granted that this has to be the case, as music, books and the like can be used repeatedly once owned. Note that it is not a necessity, and in fact there are many alternatives, such as leasing or radio.

Economic implications

The impact of digitization is obviously huge, as can be seen in the case of the music industry. The analysis here is important and must be data driven, so it requires more careful research on the topic, which I will postpone.

A relevant question which is not analogous to the case of the music industry is “what are the implications of being able to generate an object more than once?” I’ll leave it open as well.

Single use code

The challenge is to find a way for users to download a design online and use it immediately to print the object, but in such a way that the majority of users cannot print the design again.

If the printer is stateless (that is, has no intrinsic memory), then sending the same packet to the printer will always result in the same action. Hence, no matter how cleverly the printer's driver behaves, there is a simple way to print the same thing many times: sniff the communication during the first “legal” print, and replay it for the next prints. This can be automated fairly easily, and the program for doing so can be made simple enough that many users will use it. Thus, we need some level of sophistication in the driver-printer protocol to avoid this attack. It is also clear that the printer’s code and internal state need to be unmalleable, that is, tamper-proof.
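
The replay attack, and one common way to defeat it, can be sketched as follows. This is only an illustration under stated assumptions: the printer shares a secret key with whoever authorizes prints, it issues a fresh random nonce for each job, and the authorization is an HMAC over the nonce and the job. The names `StatelessPrinter`, `NoncePrinter` and `authorize` are hypothetical and do not describe any real printer protocol.

```python
import hashlib
import hmac
import secrets


class StatelessPrinter:
    """Acts on every packet it receives and keeps no memory between packets."""

    def receive(self, packet: bytes) -> str:
        return "printed " + hashlib.sha256(packet).hexdigest()[:8]


class NoncePrinter:
    """Keeps one piece of state: a single-use nonce that each job must answer."""

    def __init__(self, key: bytes):
        self._key = key
        self._pending_nonce = None

    def new_nonce(self) -> bytes:
        self._pending_nonce = secrets.token_bytes(16)
        return self._pending_nonce

    def receive(self, job: bytes, tag: bytes) -> bool:
        # Consume the nonce whether or not the job is accepted.
        nonce, self._pending_nonce = self._pending_nonce, None
        if nonce is None:
            return False
        expected = hmac.new(self._key, nonce + job, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)


def authorize(key: bytes, nonce: bytes, job: bytes) -> bytes:
    """Hypothetical authorization step, run by the party that holds the key."""
    return hmac.new(key, nonce + job, hashlib.sha256).digest()


job = b"G1 X10 Y10"  # stand-in for the sniffed print data

# Stateless printer: replaying the sniffed packet reproduces the print.
stateless = StatelessPrinter()
legal_print = stateless.receive(job)
replayed_print = stateless.receive(job)

# Nonce printer: the same (job, tag) pair only works once.
key = secrets.token_bytes(32)
printer = NoncePrinter(key)
tag = authorize(key, printer.new_nonce(), job)
first_ok = printer.receive(job, tag)               # accepted
replay_ok = printer.receive(job, tag)              # rejected: nonce consumed
printer.new_nonce()
replay_after_new_nonce = printer.receive(job, tag)  # rejected: tag is for the old nonce
```

Note that this only moves the problem: the scheme is as strong as the secrecy of the key and the integrity of the printer's state.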

The naive idea is to have the printer remember which models it has already printed (say, by storing their hash values), and refuse to print the same model again. This is not good enough, as it is easy to make minor changes to the model that alter its hash without noticeably changing the printed object. Even if the printer had a clever algorithm which could tell whether two models are the same, which is very hard to do efficiently, these kinds of protections can always be overcome.
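
A minimal sketch of the naive scheme and its bypass, assuming the model is just a byte string (here a made-up G-code-like snippet) and that the printer stores SHA-256 digests. `HashMemoryPrinter` is a hypothetical name for this illustration.

```python
import hashlib


class HashMemoryPrinter:
    """Refuses to print a model whose hash it has already seen."""

    def __init__(self):
        self._seen = set()

    def print_model(self, model: bytes) -> bool:
        digest = hashlib.sha256(model).hexdigest()
        if digest in self._seen:
            return False  # exact same bytes were printed before
        self._seen.add(digest)
        return True


printer = HashMemoryPrinter()
model = b"G1 X10.000 Y10.000\nG1 X20.000 Y10.000\n"

first = printer.print_model(model)    # accepted
second = printer.print_model(model)   # rejected: same hash

# A change that does not affect the object (here, an appended comment)
# yields a different hash, so the "same" model prints again.
tweaked = model + b"; harmless comment\n"
third = printer.print_model(tweaked)  # accepted
```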

We can try to use cryptography to make sure that the printer will not use the same code twice. Assume that the printer has a secret key shared with the printing company. Then whoever wants to publish their design for single-use printing sends it to the printing company, which runs a platform for selling designs, and anyone who buys the design gets it encrypted and signed so that only their printer can decrypt and authenticate the code for the model. In this case, the model cannot be shared, and the hashing solution above can protect from duplication. This solution assumes that the vast majority of users will not open their printers and obtain the secret key (extracting it can be made extremely difficult). Another version is to sign the model together with the printer ID using public-key cryptography, and have the printer only print what is verified as coming from the company and carrying the correct ID. This version is problematic, as the code itself will be visible.
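
The shared-key version can be sketched roughly as follows. To stay within the Python standard library, this sketch shows only the authentication half: the company issues a per-printer license tag (an HMAC over the model's hash), and encryption of the model is left out. The names `issue_license` and `LicensedPrinter` are hypothetical, and a real scheme would need more, for example a purchase serial number so that the same design can be bought twice.

```python
import hashlib
import hmac
import secrets


def issue_license(printer_key: bytes, model: bytes) -> bytes:
    """Company side: bind a model to one printer's secret key."""
    digest = hashlib.sha256(model).digest()
    return hmac.new(printer_key, digest, hashlib.sha256).digest()


class LicensedPrinter:
    """Prints only models licensed for its own key, and each one only once."""

    def __init__(self, key: bytes):
        self._key = key
        self._used = set()

    def print_model(self, model: bytes, license_tag: bytes) -> bool:
        digest = hashlib.sha256(model).digest()
        expected = hmac.new(self._key, digest, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, license_tag):
            return False  # wrong printer, or the model was altered
        if digest in self._used:
            return False  # this licensed model was already printed
        self._used.add(digest)
        return True


alice_key = secrets.token_bytes(32)
bob_key = secrets.token_bytes(32)
alice = LicensedPrinter(alice_key)
bob = LicensedPrinter(bob_key)

model = b"G1 X10.000 Y10.000\n"
tag = issue_license(alice_key, model)

printed_once = alice.print_model(model, tag)                # accepted
printed_twice = alice.print_model(model, tag)               # rejected: already used
tweaked_print = alice.print_model(model + b"; x\n", tag)    # rejected: tag no longer matches
shared_print = bob.print_model(model, tag)                  # rejected: licensed for Alice's key
```

The design point is that the tag ties the hash check to the exact bytes that were sold, so the small-tweak trick against the naive hash memory no longer works without the key.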

The main technical problem with the above solution is that it does not allow for printing of free models, or home-generated ones, and here is where it gets interesting. Simply allowing unencrypted models to be printed has an inherent problem: it only takes one person who manages to recover their own key to decrypt a model and spread it. However, that person would still have to pay for the model, so the scheme can still be quite good. Another problem is managing the keys, but it should be fine.

Conclusion

The above scheme is probably fine, but I think a better solution is possible. Ultimately, the biggest problem for any such solution is that the printer manufacturer and the platform for single-use printing of models need to work together, and create a large enough community of buyers and sellers that new people will choose to buy these specific printers.