Tuesday

Data visualization


Visualization is important to me. I think of myself as a visual person: I like to imagine what I see and see what I imagine. See one of the reasons here. In presentations, too, visualization can help the audience understand concepts and walk through the data the presenter is trying to explain. This one is a good example:



Should we limit ourselves to 3 dimensions? 




(Imagining Ten Dimensions in Two Minutes)

You can find the longer version here.

JavaScript (JS) has changed many visual effects and made it much easier to deliver data visually. You can use JS in your website (examples are here (1, 2, 3)). But many of us don't use JS, and we would also like to use free software. Here is a list of good ones.

(By the way, as a graduate student I read lots of papers, and many times I wish I could better understand the writer's point of view. I put most of the things I would like to read on my Kindle. First of all, I can have tons of papers with me anytime I want (even in bed in the middle of the night, when everywhere is dark and cold, everything is one click away). Using the Kindle, I can comment on papers and, most importantly as a non-native English speaker, I can look up a word in the dictionary just by holding my finger on it. I see a world in which papers are interactive. When you hit a point on a graph, it shows where that data came from. You can go through the data and see the raw inputs. You can manipulate them, play with them and compare them with the baselines you have. Future scientific papers will be more exciting.)

In addition, visualization in multi-objective optimization will be very helpful. Matlab users may have a look at this. I found Dr. Reed's AeroVis software fascinating. Please let me know if you have more suggestions. 

P.S. Another one.

Thursday

Useful Software

So if you are in grad school, you like to use software for everything you do (at least it's true for me). Sometimes it takes me more time to learn how to use a new piece of software for the first time (and also the second time ...) than to just do the thing manually. You will find the balance someday. I had a professor back at the University of Tehran whose idea was that your work will be more useful if you can make software out of it (and also sell it). Another great lesson came from Prof. Kim at NC State: it would be great if you could make policies out of the results of your ideas and work. It's a long road, I know. But, "A journey of a thousand miles begins with a single step" (Lao-tzu).

As a student I like free software. Being free doesn't mean that the software is not alive (I mean something like the customer service the commercial ones have). A good example is R. It has a huge community of contributors who support and develop the software and its packages. So far, I have never encountered a method or program for which I couldn't find a package (or something very similar) on R's website.

I gathered a list of useful software that I use often. Please let me know if you have any other recommended software you would like added to this list.





MathType: Math text formatting (Office and EPS files); Student or .edu; http://www.dessci.com/en/products/mathtype/
CutePDF: PDF writer; .edu or .gov; http://www.cutepdf.com/
Crimson Editor / Emerald Editor: Text manipulation; Free; http://www.crimsoneditor.com/
TeamViewer: Remote desktop; Free; http://www.teamviewer.com/en/index.aspx
MATLAB: Matrix manipulation, test platform; Student or .edu; http://www.mathworks.com/products/matlab/
Octave: Like Matlab but free (with almost similar syntax); Free; http://www.gnu.org/software/octave/
PuTTY: SSH; MIT (Open Source); http://www.chiark.greenend.org.uk/~sgtatham/putty
X-Win32: X Windows environment for remote access; Student or .edu; http://www.starnet.com/products/xwin32/
Notepad++: Text manipulation; Free; http://notepad-plus-plus.org/
WinSCP: FTP access; Free; http://winscp.net/eng/index.php
Julia: Programming language; Free; http://julialang.org/
Firefox: Web browser; Free; http://www.mozilla.org/en-US/firefox/new/
Foxit Reader: PDF reader; Free; http://www.foxitsoftware.com/Secure_PDF_Reader/
Blender: Creating 3D graphics; Free; http://www.blender.org
ImageJ: Image processing program; Free; http://rsbweb.nih.gov/ij/
AeroVis: Complex multi-dimensional visualization software; .edu; http://reed.cee.cornell.edu/index.php/Resources
FreeMind: Mind mapping application; Free; http://freemind.sourceforge.net/wiki/index.php/Main_Page
Lightworks: Video and audio editing system; Free; http://www.lwks.com
TeXnicCenter: LaTeX editor; Free; http://www.texniccenter.org
CCleaner: Cleans temporary and potentially unwanted files; Free; http://www.filehippo.com/download_ccleaner/

t / i \ m e

If you are an organized person, you have a big advantage in graduate school. Another aspect of being successful is getting the most out of your time. Many of us are doing research, and if you are even a little bit curious, you will usually end up spending two hours searching online for things that have little to do with what you originally set out to look for! I totally agree that this broadens your point of view; however, you are losing time that might be better spent on more important things.
The first thing you could say is: OK, so what are you doing here? You are writing here, which means you are spending time that could have gone into your research!?!
My answer would be: You got me, call it even! OK!
Not really! I think if you document what you searched for and read online, you get something out of all that nonsense. At least you can track your improvement in "online time spending".

If you are in engineering school, the first thing that may come to your mind is: how can I build a tool to help me?
Besides a notepad, Google Calendar and other nice software, you can use Workrave. It's a very simple, effective (and free) tool for keeping your focus on work.

Many grad students use headphones to keep their concentration. If you are listening to old music whose words you already know, it's helpful, but if you are listening and trying to work out the meaning of each word, I think you lose some of your concentration (this is just my experience). You can use binaural beats, especially the Gamma ones, because the Alpha and Beta frequencies make many people feel sleepy.

Find out whether you are a morning or an afternoon person (your prime time). That way you can concentrate on the hard things you need to do at the time your body is most ready. In addition, planning the day just before going to bed is great. Many times I don't even write down what I'm supposed to do. I just think about the things I need to do tomorrow and feel good about them. Easy, huh!

Also, having scheduled meals and sleep will help you get the most out of your time.



Monday

because we are greedy


So in one of the previous posts I was talking about "Performance vs. Convenience" and the need for a new programming language, but WHY?

because we are greedy.

We are power Matlab users. Some of us are Lisp hackers. Some are Pythonistas, others Rubyists, still others Perl hackers. There are those of us who used Mathematica before we could grow facial hair. There are those who still can’t grow facial hair. We’ve generated more R plots than any sane person should. C is our desert island programming language.
We love all of these languages; they are wonderful and powerful. For the work we do — scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing — each one is perfect for some aspects of the work and terrible for others. Each one is a trade-off.
We are greedy: we want more.

We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

(Did we mention it should be as fast as C?)

While we’re being demanding, we want something that provides the distributed power of Hadoop — without the kilobytes of boilerplate Java and XML; without being forced to sift through gigabytes of log files on hundreds of machines to find our bugs. We want the power without the layers of impenetrable complexity. We want to write simple scalar loops that compile down to tight machine code using just the registers on a single CPU. We want to write A*B and launch a thousand computations on a thousand machines, calculating a vast matrix product together.

We never want to mention types when we don’t feel like it. But when we need polymorphic functions, we want to use generic programming to write an algorithm just once and apply it to an infinite lattice of types; we want to use multiple dispatch to efficiently pick the best method for all of a function’s arguments, from dozens of method definitions, providing common functionality across drastically different types. Despite all this power, we want the language to be simple and clean.
All this doesn’t seem like too much to ask for, does it?
Even though we recognize that we are inexcusably greedy, we still want to have it all.

What you just read was the answer to the question "Why was Julia created?".
You can read an interview with one of the developers here and a presentation here.

Julia seems to be a promising language for scientists and researchers.
Time will tell how performance and convenience finally meet.


SVD film (1976)


"Exploration vs. Exploitation" & "Performance vs. Convenience"

If you use optimization in your research, you have probably heard about "Exploration vs. Exploitation". You can't increase both of them at the same time. The goal of exploration is to select samples that explore and stretch the search space as much as possible. In the exploitation phase, however, we try to shrink the search space and focus on selecting samples near the optimum. Often the cost of evaluating the objective function is high, and in many cases we need a trade-off between the two.
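To make the trade-off concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a method from any particular paper: the toy objective, the function name search, and the parameters explore_prob and step are all hypothetical. With a fixed budget of evaluations, each sample is either drawn from the whole search space (exploration) or from a small neighborhood of the best point found so far (exploitation).

```python
import random


def objective(x):
    """Hypothetical toy objective; its minimum is at x = 2."""
    return (x - 2.0) ** 2


def search(explore_prob, budget=200, lower=-10.0, upper=10.0, step=0.1, seed=0):
    """Random search that mixes exploration and exploitation.

    explore_prob is the (assumed) probability of spending an evaluation
    on exploration; the rest of the budget goes to exploitation.
    """
    rng = random.Random(seed)
    best_x = rng.uniform(lower, upper)
    best_f = objective(best_x)
    for _ in range(budget - 1):
        if rng.random() < explore_prob:
            # Exploration: sample anywhere in the search space.
            x = rng.uniform(lower, upper)
        else:
            # Exploitation: sample close to the best point found so far,
            # clipped so that it stays inside the bounds.
            x = min(upper, max(lower, best_x + rng.gauss(0.0, step)))
        f = objective(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f


if __name__ == "__main__":
    for p in (0.0, 0.3, 1.0):
        x, f = search(explore_prob=p)
        print(f"explore_prob={p}: best x = {x:.4f}, f(x) = {f:.6f}")
```

Setting explore_prob to 1.0 gives pure exploration and 0.0 gives pure exploitation around the starting point. Because the budget is fixed, every evaluation spent on one is an evaluation not spent on the other.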

This idea holds in programming too. In much research we are looking for higher performance, so we need to program in a low-level language (like C or Fortran). However, implementing, working in these languages and testing different ideas is not easy, and the research turns out to be more about computer science and the ability to code in low-level languages. This makes many researchers go for convenience and try to find a trade-off between performance and convenience.

Many researchers use Matlab and R. In many cases people use Matlab for linear algebra and R for statistics. You can use Matlab for statistics too, but to be honest it's frustrating (at least in my experience), because R has ready-to-go packages for almost every statistical algorithm, function, distribution, etc. You can find code online for all of them, which makes it very easy to implement and adapt that code into your own custom code.

Matlab and R have some drawbacks. The most important impediment for both of them is speed. Besides that, Matlab is commercial software, and for R you need to install an IDE (like RStudio) for better development and debugging.
So you may say: OK, why not try some tricks in Matlab, like creating MEX files, vectorization and parallelization? The truth is, you can do all of that, and the gain differs from case to case; in some instances, however, it will not help that much.

If you are a Matlab user and you would like to keep programming in a familiar syntax, you can try Armadillo, a C++ linear algebra library with a Matlab-like syntax. It's free and it uses LAPACK (so it should be fast). Another option could be ROOT, which is developed at CERN (Switzerland). CERN is famous for its particle physics laboratory. I don't know how much you can speed up your program by using ROOT; note that ROOT is written in C++ and its syntax is not like Matlab's. What about DylanGOO and NewtonScript?
NumPy, a numerical library for Python, could be a good option that combines functionality of both Matlab and R.
But which one is better?
The bottom line is that at some point you will need a third language (like C or Fortran) for speed and higher performance.

In the next post I will talk about a new language which seems to be promising for scientists and researchers.
See you soon!  :)