10MA – Evolution strategies VS reinforcement learning

First, go and read this OpenAI blog post. Read it? Good!

In the next 10 minutes, I’ll write as much as I can about my thoughts on the claims made in that post.

I have a slight cognitive dissonance. I got used to thinking that RL is very good, and that the results obtained on the Atari games, for example, are extremely strong. However, Evolution Strategies (ES), like any type of “local search” method, are so generic and simple that they should be the lowest bar for any machine learning algorithm.
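To show just how simple such a method can be, here is a minimal sketch of an ES-style update in plain Python. It is only an illustration under my own assumptions, not OpenAI's implementation: the toy reward, the target vector, the function names and all hyperparameters are made up for this example.

```python
import random

def evolution_strategy(reward, theta, sigma=0.1, alpha=0.05,
                       population=50, iterations=200, seed=0):
    """Maximize `reward` by sampling Gaussian perturbations of `theta`
    and stepping along the reward-weighted average of the noise."""
    rng = random.Random(seed)
    theta = list(theta)
    dim = len(theta)
    for _ in range(iterations):
        # Sample one noise vector per population member.
        noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                 for _ in range(population)]
        # Evaluate the black-box reward at each perturbed point.
        rewards = [reward([t + sigma * e for t, e in zip(theta, eps)])
                   for eps in noise]
        # Subtract the mean reward as a baseline to reduce variance.
        baseline = sum(rewards) / population
        for i in range(dim):
            grad_i = sum((r - baseline) * eps[i]
                         for r, eps in zip(rewards, noise)) / (population * sigma)
            theta[i] += alpha * grad_i
    return theta

# Hypothetical toy problem: the reward is highest when params equal TARGET.
TARGET = [0.5, 0.1, -0.3]

def toy_reward(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))

solution = evolution_strategy(toy_reward, [0.0, 0.0, 0.0])
print(solution)  # should land close to TARGET
```

Note that the loop never looks inside the reward function. It only needs reward values, which is what makes the method so generic.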

Is it correct to take away from this that RL overall is just not very good, and that its success is mostly a story of fast supercomputers?

OpenAI mentions that these kinds of local search methods are not good for supervised learning. This means that we do have some tools which are much better than local search, but that they do not transfer easily to the RL setting.

A different explanation could simply be that the Atari games and OpenAI Gym-type games are specific examples where RL algorithms are not working well. Maybe due to their small action space?

10MA – Security and Machine Learning

When training ML models, there can be some security aspects which are important. Here are some examples:

Some security goals.

  1. Training set privacy. An adversary who is familiar with the model cannot get “any” information on the data points in the training set.
  2. Model secrecy. An adversary who can query the model as a black box, getting predictions for any input, cannot obtain information about the model parameters.
  3. Model reliability. The model should behave in a way that humans can predict.
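
To get a feel for why model secrecy (goal 2) is hard, here is a toy sketch in Python. It assumes the simplest possible case, which is my own assumption for illustration and not the method of any of the papers linked here: the model is linear and the black box returns raw real-valued scores. The names `make_black_box` and `extract_linear_model` and the secret parameters are hypothetical.

```python
def make_black_box(weights, bias):
    """Return a prediction function that hides the model parameters."""
    def predict(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return predict

def extract_linear_model(predict, dim):
    """Recover a linear model's parameters using dim + 1 queries."""
    # Querying the zero vector reveals the bias.
    bias = predict([0.0] * dim)
    weights = []
    for i in range(dim):
        # Querying the i-th unit vector reveals weight i plus the bias.
        unit = [0.0] * dim
        unit[i] = 1.0
        weights.append(predict(unit) - bias)
    return weights, bias

# Hypothetical secret parameters, known only to the model owner.
secret_weights = [2.0, -1.0, 0.5]
secret_bias = 0.25
api = make_black_box(secret_weights, secret_bias)

stolen_weights, stolen_bias = extract_linear_model(api, dim=3)
print(stolen_weights, stolen_bias)
```

With only four queries the adversary has the whole model. Real models and real APIs are more complicated, but this is the basic reason why answering arbitrary queries leaks information about the parameters.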

Links to related attacks

  1. Membership Inference Attacks against Machine Learning Models
  2. Stealing Machine Learning Models via Prediction APIs
  3. Breaking Linear Classifiers on ImageNet

10MA – Should I do short analyses?

I’m trying to do a 10MA – “10 Minute Analysis”. The goal is to write the post in just 10 minutes, and see what comes out of it.

This post is about the benefits vs the downsides of making 10MAs. Hopefully we’ll reach some conclusion.

So why do I write these analyses anyway? The first, and most important, reason is my own self-improvement. The second reason is that I am a big believer in sharing knowledge and openness, and I hope some of what I plan to write here will be of use to other people later on.

Effects on my self-improvement:

  • Trains intuitive analysis, and coming up with a variety of ideas, as opposed to thorough and more linear thinking. In general I am better at this kind of thinking.
  • Trains writing things down quickly, and getting more thoughts into text. This is very important to me.
  • Less time to learn how to formulate things correctly.

Effects on what others will read:

  • Quantity instead of quality. Probably not too bad, if I want to spread ideas and let others think for themselves.


This is fun, and it can be helpful to do alongside deep analysis.