Monday

because we are greedy


So in one of previous posts I was talking about "Performance vs. Convenience" and need for a new programming language, but WHY? 

because we are greedy.

We are power Matlab users. Some of us are Lisp hackers. Some are Pythonistas, others Rubyists, still others Perl hackers. There are those of us who used Mathematica before we could grow facial hair. There are those who still can’t grow facial hair. We’ve generated more R plots than any sane person should. C is our desert island programming language.
We love all of these languages; they are wonderful and powerful. For the work we do — scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing — each one is perfect for some aspects of the work and terrible for others. Each one is a trade-off.
We are greedy: we want more.

We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

(Did we mention it should be as fast as C?)

While we’re being demanding, we want something that provides the distributed power of Hadoop — without the kilobytes of boilerplate Java and XML; without being forced to sift through gigabytes of log files on hundreds of machines to find our bugs. We want the power without the layers of impenetrable complexity. We want to write simple scalar loops that compile down to tight machine code using just the registers on a single CPU. We want to write A*B and launch a thousand computations on a thousand machines, calculating a vast matrix product together.

We never want to mention types when we don’t feel like it. But when we need polymorphic functions, we want to use generic programming to write an algorithm just once and apply it to an infinite lattice of types; we want to use multiple dispatch to efficiently pick the best method for all of a function’s arguments, from dozens of method definitions, providing common functionality across drastically different types. Despite all this power, we want the language to be simple and clean.
All this doesn’t seem like too much to ask for, does it?
Even though we recognize that we are inexcusably greedy, we still want to have it all.

What you just read was the answer for "Why Julia was created".
You may read one of the developers interview's here and a presentation here.

Julia seems to be a promising language for scientists and researchers.
Time will show how performance and convenience will meet each other finally. 


SVD film (1976)


"Exploration vs. Exploitation" & "Performance vs. Convenience"

If you are using optimization in your research you probably heard about "Exploration vs. Exploitation". You cant increase both of them together. The exploration's goal is to select the samples that explore and stretch the search space as much as possible. However, in exploitation phase we are trying to reduce  the search space and focus to select samples near the optimizer. Often the cost of evaluating the objective function is high and in many cases we need a trade-off between them.  

This idea is true in programming too. In many researches, we are looking for higher performance, then we need to program in a low level languages (like C or Fortran), anyhow, implementing and working in this languages and testing different ideas are not easy, and the research will turn to be more in computer science and the ability to code in low level languages. This makes many researchers to go for convenience and try to find a trade-off between performance and convenience.

Many researchers are using Matalb and R. In many cases people are using Matlab for linear algebra and also R for statistics. You can use Matlab for statistics too, but to be honest its frustrating (at least in my experience)  because R has many ready to go packages for almost every statistical algorithms, functions, distributions and etc. You can find codes online for all of them which makes it very easy to implement and change these codes to your costume codes. 

Matlab and R have some drawbacks. The most important impediment of both of them is speed. Beside of that Matlab is a commercial software and for R you need to install some IDE environment for coding (like RStudio) for better development and debugging. 
So you may say OK, why not try some tricks in Matlab like creating MEX files, vectorization and parallelization. The truth is, you can do all of that and its different from case to case, however, in some instances will not help that much. 

If you are a Matalb user and you like to keep programming in your syntax, you can try Armadillo, which is a C++ linear algebra library with Matlab's syntax. Its free and it uses LAPACK (so should be fast). Other option could be using ROOT which is developed in CERN (Switzerland). CERN  is famous for its particle physics laboratory. I don't know by using ROOT how much you can  speed up your program, although the ROOT's syntax is not like Matlab and is written in C++.  What about DylanGOO and NewtonScript?
NumPy could be a good program language which combines the functionality of both Matlab and R. 
But which one is better?
The bottom line is, at some point you will need the third language (like C or Fortran) for speed up and higher performance.

In the next post I will talk about new language which seems to be a promising  language for scientists and researchers. 
See you soon!  :)




Thursday

Matalb "out of memory" problem


If you are working and coding with Matlab, you probably encounter this problem especially when your code has many iterations or you are dealing with libraries you want to use inside of your code.

Here are some suggestions:

1) The interface of Matlab written in Java so one option could be increasing the heap size of "Java Heap Memory" (File->Preferences->Java Heap Memory).

2) Try "-nodesktop" mode. You need to go to cmd and run Matlab in the directory you want to run your program. Also in your command line type: -nodesktop. You will have just a window (like cmd) to type the code you want to run.

3) If the problem you are encountering is related to the library you are calling or using in your Matlab code, the other option could be using the MEX file. You need to create a MEX file in Matlab using that library. After compiling and creating the MEX file , you can use it like other build-in functions in Matlab. Creating the MEX file sometimes is tricky a little bit but I think its completely worth it to have your own costume function instead of loading the library each time.

4) Use machines with bigger RAMs. Many universities have HPC (High Performance Computing) lab.  The amount of RAM you can get over there is usually huge. In our university I tried to see how much RAM I can get by creating a vary big matrix (5e4*5e4), and I could get almost 21 GB RAM.



I would be happy if you let me know about other methods to deal with this problem.

Friday

Change Blindness and MCMC

You might saw the awareness test in youtube. I like this video and honestly I didn't see the bear in the first time! Right now in my research I'm using MCMC (Markov chain Monte Carlo) method, which is a sampling method and is the family of random walk and the goal is to get the target distribution by sampling. The idea of this method is very simple and first developed by people in Sandia national laboratories. 
You probably used monte carlo method for calculating the integral, now we are sampling the probability distribution (using monte carlo) based on constructing a Markov chain and in the Markov chain the current state is just dependent on previous state. Thats it! easy! 
For sure there are many details included when you want to implement the algorithm but in this post I just would like to talk about the process of our mind which is very close to Monte Carlo method. The way we are seeing the world is base of how we are sampling our surroundings. Actually we are sampling our environment and because of different backgrounds and mind states, people can see different thing when you show them a certain picture! We see things which are more important for us and we fill the rest of the picture by our mind. This ability of our mind helps us to fill the gaps. This is happening in our dreams too when we are asleep, usually we see the story and when we are part of the dream, usually we don't see ourselves, for example you don't see your arms, you are just your eyes and you see the whole thing but not your body (this is true about me ans some of my friends which I had this discussion with them, so its not general!). I'm not sure how our brain works but I just know its completing the portion we don't pay attention. The other example could be this, imaging you are in a meeting with one of your colleagues, you are sitting around a table in a coffee place and waiting for him/her, he/she comes from the front door and he/she joins you and you start talking about his/her weekend, where his/her went, how was it, how much fun he/she had, does his/her want to try these new things this time and so on; 
after the meeting if I ask you some questions you can answer them more correctly, for example: who was his/her weekend? Where did he/she go? or, what kind of coffee he/she had?
But if I ask you what was the color of his/her shoes, you probably cant answer this question, you saw him/her walking to your table and you definitely saw his/her shoes and you are sure he/she didnt come there with barefoot but this information is not important for your mind and when you are sampling the environment you are sampling thing which are more important and you mind fills the other portions, you may say, OK, so what we are seeing is illusion! I'm not sure about that but as far as I know, YES!  But its very high quality illusion!   ;)
There is a term in psychology called Change Blindness. We don't pay attention to changes which are not important for us and fill the rest by our mind (some people learned to improve the data they are capturing in one shot, if you watch some action movies you see spies have this character). There is a good example video here for change blindness.
Ok, lets back to the MCMC, how does this things even related to MCMC? So far we know that we are sampling our environment and we are sampling thing which are more important for us, the MCMC method has the same spirit, we are trying to sample the probability distribution which are more important for us. We want to construct the target distribution with less number of samples and ignore the other part of the sample space. We would like to have the best desired distribution with the perfect number of samples and use that target distribution for further analyze. Sampling methods are very good when the decision space is very big and we are limited by some constraint to see the whole picture. Ill try to have another post about different types of sampling and also different types of MCMC. 

Happy sampling! :)
       

Thursday

Star Wars and Children

When you want to start to learn something new, its very important to start from a good point. Even in optimization the starting point is very important and in many cases will change the solution you will find and also number of iteration you may have. So in many contexts starting point is crucial. Most of us had read many books and had many courses but sometimes we feel that this book is better than that or this professor describing things better than other professors. However, this feelings may vary from person to person and in some cases for some of us some books feels written in another language! O.K. lets not miss the point! 
One of my friends in EPFL is doing some experiments on children to measure level of learning during different periods of childhood and in one of their studies they tried to teach some children who learned music and notes before incorrectly (actually reteach them) and then compare them to children who have no idea about music before and just learned it for the first time. He told me that seems that its more difficult to reteach children who learned the notes inaccurate before and that reminds me the sentence from the Star Wars: "Sometimes you have to unlearn what you have learned!".
Anyways, I reached a very good book about optimization by Prof. Yang which is published in 2010 by Wiley. And as mentioned before it doesn't mean other books are not good, I just can make more connection with this book! 
You can have a look to this book here. I hope you like it!

Continue on Lagrange multipliers

So in the previous post I was talking about Lagrangian Multipliers and I created an exe file to show the objective function with one constraint. You can reach the file here which includes an exe file and a MCR file in case you don't have Matlab installed on your machine (the exe file compiled and runs just on windows machines). You can visualize the function and the constraint with this file and I will show you an example of that.
The example comes from one of my classes (CE 590 - Modeling and Analysis of Civil Engineering Systems) and called OMAGOSH problem.
Here is the problem description: 

"You are an engineer advising the State Department of Natural Resources as it manages Lake Omagosh. The water quality index, x, can be improved, and better fishing would result. Boat launch facilities could also be added (measure in units, y). The benefits of fishing and pleasure boating are 11x+8y-2xy, in million dollars. The costs are x+2y in million dollars. Only 17 million dollars are available. Use the Kuhn-Tucker approach to evaluate this problem. What is the meaning of the value of the Lagrange multiplier for this problem?"

First of all, lets talk abut Kuhn-Tacker conditions. Because we have inequality in our constraint and we would like to solve this problem with Lagrangian approach and since we can solve this problem with Lagrangian method if we have equality, we can change our inequality in our constraint to equality by adding a slack variable to our constraint. 
So, if we have equality constraint we will have something like: g(x,y)=b
but here we have inequality: g(x,y)<= b and in the Omagosh problem: g(x,y)<= 17.
we can change the inequality to equality as mentioned by adding a slack variable: g(x,y)+s=17.
's', here is our slack variable and its worth to mention that 's' is always positive, we can just put 's^2'. 

You can reach the solution I did by hand here and please let me know if you have any question.
  
In this picture you can see the objective function and constraint.

Wednesday

Lagrange Multipliers

If you are familiar with Lagrange multiplier you are lucky! You can make money by doing this concept and applying that since usually Econ people using this concept. You can get idea of it from the wiki page but since Im in Civil Eng. I will try to explain it in my way.
If you had some structural courses you may heard about virtual work. So in a very simple case lets say what will happen to my structure's forces (for example if we are interested to forces) here to my single bar if I just change the displacement one unit. Economist usually using this concept and they call it shadow price and in many cases it meas if I change the limitation or constraint I have over my resources just one unit, what will happen to my cost or benefit!
Something that we need to be careful about is Lagrangian multiplier's sign. The sign of the multipliers can be positive, negative or in some case we may not have any sign (Lambda = 0) (I used Lambda because usually it used for Lagrange multipliers).
Usually you are trying to maximzie or minimize a objective function, I call it here: f(x,y) 
'f' is my objective function and 'x' and 'y' are variables. I show the constraint function here by: g(x,y). The constraint function as its name shows is limited or constrained by something.
Sometimes for example when we want to go shopping we have limited amount of money and then we are constraint by the amount of money we want to spend, or in many problems we are limited by time, especially when we have exams!!  ;)
g(x,y)<b
('b' here is our constraint.)
other situation is when we are limited at the other end, for example you should be 16 at least to get the driver license, or another example is, you want to do business, you bought something for lets say for 5 box, and you want to make at least 10% benefit, so your price should be greater than $5.5, and if you can sell your thing for more than that you are lucky!
g(x,y)>b
it is worth to mention that here I just have two variables but you may have thousands of them. The concept is the same.
You may have one or more than one constraint function and also you may have more than one objective function that called multi-objective function which we will talk about that later on.

You can get information about your function by the sign of the Lambda.
Lambda is how the objective function changes if I change the limitation (constraint) one unit
Lambda = df/db

In general there are two cases:
1) we are maximizing:
     1-1)  g(x,y)<b
               Lambda is positive
      1-2) g(x,y)>b
               Lambda is negative
2) we are minimizing:
      2-1)  g(x,y)<b
               Lambda is negative
      2-2) g(x,y)>b
               Lambda is positive

so for example when we are maximizing and g(x,y)<b, we are trying to reach the max but the constraint limits us and going uphill our slope will be positive.
If you want to visualize the case 1-2, its like the case 1-1, we want to go uphill again but its like we are going backwards, so if you walk backward and look to your front your slope is negative but the truth is you are going backward toward the summit.
Cases, 2-1 and 2-2 are the same, for example for the case 2-1, we want to minimize and if you walk downhill, when you look at your feet, your slope is negative. The backward walk is again is the same for the case 2-2.

Hello world!

This is my first post here and in this blog I will try to post more about modeling and optimization methods. The name of the blog is scary for me at least, I don't know exactly why I chose that name. Many people like to have interesting and sophisticated names and many times especially if you are in graduate school you will be hear such names which will scares you off, 

 
BUT don't worry everything here is about simplification. I am a visual person or at least I consider myself a visual person and I am trying absorb concepts by visualizing and it works many times because almost everything we have are in 3D world (not really!). However sometimes, usually in optimization world things are in more than three dimensions while visualizing again can help us to understand the problem. 
I have a little question for you, what is your impression from picture above?! Yes, you are right, its a ghost, a baby ghost!!
Actually, its just a picture of a cute baby, nothing more!! which turned to black and white and focused by a red circle! Many times we see things like that, even a simple problem, or a very lovely baby! In my opinion sometimes its good to be like that and I think its because of our imagination. Imagination is infinite (that's impressive for me that we can't count it)  and as Einstein said: Logic will get you from A to Z; imagination will get you everywhere.” 
Anyways, by giving the picture I just tried to show you that sometimes fancy names are scaring us, like the title of this blog with courier font which is very nerdy, while if we just try to see the concept there is nothing scary over there.

Other thing I would like to mention in my first post is mistakes that might happen when I will try to explain something or algorithm. I am in Civil and Environmental Engineering and I will try my best to have perfect posts, meanwhile if you are in OR or CS or anyways see some mistakes and errors, I would be more than happy to correct them and learn more from you, so just leave a comment and we can work on them.