We will use scilab to analyze and plot astronomical data after you complete this tutorial. Scilab is a powerful, free data analysis program with many similarities to matlab, with application to any kind of data analysis you may wish to do in the future. For now, we will just do a little experimentation to see how the program works. You must literally type in ALL the example input below in order to see how things work for yourself.
If you don't already have a PDF creation program, download and install a free one (e.g., novapdf or pdf995) before starting this tutorial.
Part I: Installation
Part II: Getting Started and Recording Your Work
Part III: Reading and Plotting Data
Scilab Quick Reference
Part I: Installation
To install the latest version of scilab, ignore the links at the top of the page and find your
operating system icon (penguin for Linux, flag for Windows, etc.),
then click on the correct version for your laptop. To determine
whether you have a 32 or 64 bit machine under Windows, right-click My
Computer and go to properties to find out. Ignore all the stuff about
Source versions and Nightly Builds. You should now have an exe file
(the installer program) wherever you put your downloads. Before
running it, if you are running Windows 7 or Vista, you will probably
need to disable "User Account Control" then reboot. You can turn UAC
back on when you're done using scilab if you like -- OASIS says "For the
professional, UAC is a nuisance, but for the average user, UAC
probably does some good by protecting from Bad Things." See this
URL for the how-to. Now finally you can click or double click the
exe file to run the software installation. NOTE: Contrary to previous
instructions, you do not need to "run as administrator" at this stage
-- turning off UAC takes care of this problem.
If you have a Mac running OS X 10.5 or 10.7 (Leopard or Lion), you will need to use version 5.4.0 (Beta). If you have 10.6 (Snow Leopard), version 5.3.3 is fine.
Part II: Getting Started and Recording Your Work
After installing scilab, fire it up (with UAC **turned off** if you have Windows Vista or Windows 7 -- see part I for how to turn UAC off). You will see a window labeled "Scilab Console" and it will show an arrow "-->". This is where scilab is waiting for you to type commands, which tell it to load data, do math, plot data, etc. After every command, you always hit the enter key (return key).
Type "x=5" -- this is the command to create a variable named x and give it the value 5. Scilab echoes the assignment of the variable x back to you. If you don't want to see the result, type a semicolon afterward. Try it: type "y=4;". Now type "x+y;". You haven't assigned the result to another variable name, so scilab uses the default variable name "ans". If you now type "ans" without a semicolon, you'll see scilab echo back the result. The variable "ans" gets overwritten all the time, so it's better to name any result you want to keep, e.g. "z=x+y". Now z will not be overwritten unless you do it yourself.
Notice that if you use the arrow keys, you can bring a previous command to the current command line, where you can then edit it and hit return to execute it. Go back to the command "x+y;" and change it to "junk=x+y;". You've created the variable junk, but it didn't echo back because of the semicolon. To see it, just type "junk".
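Here is the same sequence collected in one place, as a quick sketch of how the semicolon and "ans" behave. Nothing here is new; it just repeats the commands above with comments.

```scilab
x = 5          // no semicolon: scilab echoes the value of x
y = 4;         // semicolon: the assignment happens silently
x + y;         // no variable name given, so the result is stored in "ans"
ans            // type the name alone to see the stored result
z = x + y      // a named result is kept until you overwrite it yourself
junk = x + y;  // created, but not echoed because of the semicolon
junk           // type the name alone to display it
```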
Scilab can work with arrays of numbers, such as columns of data or tables of data (rows x columns). For example, type "x=[1 3 5; 2 4 6; 10 11 12]" and look at the output. Notice that scilab happily overwrote the old definition of x with no error message -- watch out for overwriting variables by mistake! The semicolons after the 5 and 6 indicated that you were starting new rows. If you want to pick out one or more rows/columns in the array, you must use "indices" (a.k.a. "subscripts") and the colon ":" to identify the portion of the array you want -- rows first, columns second. For example, compare the results of "out=x(1:2,1:2)" with the result of "out=x(3,:)". In the first example the colon acts like a dash specifying a range, i.e., read "1:2" as "1 to 2." A naked colon means "everything," for example in "x(3,:)" it means "all columns." The first command says you want a new variable called "out" to equal rows 1-2, columns 1-2 of "x", while the second says you want "out" to equal row 3, all columns of "x". Each number in an array is called an element -- can you write a command to select the element of "x" in the 2nd row, 3rd column, and assign it to "y"?
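The indexing rules above can be sketched as follows. The variable names "block", "row3", "col1", and "corner" are just illustrative choices, and the element picked out in the last line is deliberately not the one asked for in the question above.

```scilab
x = [1 3 5; 2 4 6; 10 11 12];   // 3 rows, 3 columns
block  = x(1:2, 1:2)   // rows 1 to 2, columns 1 to 2 -> [1 3; 2 4]
row3   = x(3, :)       // row 3, all columns -> [10 11 12]
col1   = x(:, 1)       // all rows, column 1 -> [1; 2; 10]
corner = x(1, 1)       // a single element: row 1, column 1 -> 1
```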
The colon can also be used to generate a series of numbers, either in +1 increments (the default) or in increments you specify. Compare the output of "x=1:10" and "y=10:-1:1" and "z=1:1:10". The middle number is the increment, unless it's missing, in which case it's assumed to be 1.
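A short sketch of the three forms, plus one extra case with a fractional increment (the variable "w" is only an illustration):

```scilab
x = 1:10       // 1 2 3 ... 10 (increment defaults to 1)
y = 10:-1:1    // 10 9 8 ... 1 (counting down needs a negative increment)
z = 1:1:10     // same values as x, with the increment written out
w = 0:0.5:2    // 0 0.5 1 1.5 2
```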
Besides the colon, the "rand", "zeros", and "ones" commands are useful to make custom arrays, respectively filled with random numbers, zeros, and ones. For example, type "x=rand(5,5)" and "y=ones(4,3)". Now type "z=x(1:4,3:5)+y". Examine the result carefully -- why was it necessary to use subscripts on x before adding? Try "z=x+y". It gives an error -- why?
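As a small sketch of these three commands working together, here are three arrays made with the same shape on purpose (the names "a", "b", "r", and "total" are just illustrative):

```scilab
a = zeros(2, 3)     // 2 rows x 3 columns of zeros
b = ones(2, 3)      // same shape, filled with ones
r = rand(2, 3);     // same shape, random values between 0 and 1
total = a + b + r   // all three shapes match, so addition goes element by element
size(total)         // rows and columns of the result: 2 and 3
```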
Simple Math
Although scilab can do advanced math, we won't need that, so you should just remember a few simple operators and functions:
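.*        element-wise multiplication
./        element-wise division
e         times 10-to-the (e.g. 2.e4 = 2 .* 10^4)
d         same as "e" (scilab stores all numbers in double precision, and uses this letter when it prints large or small values)
abs()     absolute value
sqrt()    square root
log()     natural log or ln
log10()   ordinary log (opposite of 10^)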
Notice the periods "." before the "*" and "/" -- these are VERY IMPORTANT. Without them, scilab will try to do matrix multiplication and division, which is way too fancy for our needs. The operators listed above do math "element-wise", meaning if you, e.g., multiply two single column arrays, the two first elements will multiply, the two second elements will multiply, the two third elements will multiply, etc. This behavior is analogous to the way addition worked when you typed "z=x(1:4,3:5)+y" earlier.
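To see the difference the period makes, here is a minimal sketch with two short made-up arrays. The last two lines show what happens without the period: scilab does matrix math and hands back a single number.

```scilab
a = [1 2 3];
b = [4 5 6];
a .* b     // element-wise product: [4 10 18]
a ./ b     // element-wise quotient: [0.25 0.4 0.5]
a * b'     // matrix product of a row and a column: the single number 32
a / b      // matrix division: also a single number, not three ratios
```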
Now using parentheses and simple math, you can create your own functions. For example, suppose you'd like to define a column of data (one-dimensional array) that obeys the equation c=lambda*nu over a range of lambda from 300-700 nm. You can type "lambda=300.+(0:50:400)" first, then "nu=3.e17 ./ lambda" (where the speed of light is 3 x 10^17 in units of nm/sec). The output should be nu in Hertz (1/sec), but it's a little confusing: take a minute to understand it. Scilab doesn't want to repeat the exponent, so it gives a multiplier out front (1.0d14), then the array of numbers to multiply by that. Let's see if it makes sense: the first lambda value was 300, so nu=3.e17/300=1.e15. This is the same as 1.0d14 * 10 (the first element shown by scilab). If you are worried about making a mistake in your head, you can force scilab to calculate the first element for you: just type "nu(1)".
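The calculation above can be sketched as follows. Storing the speed of light in a variable named "c" is an addition for readability; the last line is a sanity check that lambda times nu gives back c.

```scilab
c = 3.e17;                    // speed of light in nm/sec
lambda = 300. + (0:50:400);   // wavelengths 300, 350, ..., 700 nm
nu = c ./ lambda              // frequencies in Hz, printed with a common multiplier
nu(1)                         // the first element on its own: 1.e15 Hz
lambda .* nu                  // every element should come back as c (up to rounding)
```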
Use parentheses liberally! It is very easy to do different math than you intend. Notice that "lambda=300.+0:50:400" does not work properly; you must say "lambda=300.+(0:50:400)".
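The reason is that scilab does the addition before it looks at the colons, so the unparenthesized version is read as (300+0):50:400. A quick sketch (the names "wrong" and "right" are just labels):

```scilab
wrong = 300. + 0:50:400      // read as 300:50:400 -> 300 350 400
right = 300. + (0:50:400)    // 300 350 400 ... 700
length(wrong)                // 3 elements
length(right)                // 9 elements
```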
You might like to compute some overall properties of a data set. We'll save some tricks of this type for later, but try these simple ones: sum, max, min, median, mean. You can see how these functions work by creating an array of random numbers (e.g., "x=rand(10,10)") and then computing each statistic (e.g., "val=mean(x)"). Compare with "val2=sum(x)/100" -- here 100 is just the number of elements in x, i.e., 10x10. The "size" function can tell you the size of a variable if you don't already know it. If you want to see how "size" works or what else scilab can do, check out the complete list of scilab commands here: scilab manual.
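A minimal sketch of these statistics, including one way to use "size" so that the 100 does not have to be typed by hand (the names "nr", "nc", and "val3" are illustrative):

```scilab
x = rand(10, 10);            // 100 random numbers between 0 and 1
val  = mean(x)               // average of all 100 elements
val2 = sum(x) / 100          // the same average computed by hand
[nr, nc] = size(x)           // number of rows and columns: 10 and 10
val3 = sum(x) / (nr * nc)    // the average again, without hard-coding 100
max(x), min(x), median(x)    // largest, smallest, and middle values
```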
Logging Your Work
Now that you've seen the basics, we're going to start recording your work. To do this, go to the
file menu and select "change current directory." This will prompt you
to select a directory in the scilab program directory, where you can
make a new folder called "myfiles" using the little sparkling folder
icon at the top right of the window, then click "Open." All your
scilab input/output will now go in this folder.
Next, go to the Applications menu and select SciNotes (a.k.a. Editor if you have an older version). You can copy and paste successful commands into this file and re-execute and save them all at once using F5, or save without executing using the File menu.
You will submit your SciNotes file for this tutorial.
You should copy and paste main console output from your commands into the SciNotes file and type "//" followed by a space before each line of the output. The "//" tells scilab this line is just a comment that it should not try to execute when you type F5. You can also add any notes or comments of your own on a line prefaced by "//".
Add a comment with your name to the top of the file.
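A SciNotes file that follows these conventions might start out like the sketch below. The name is a placeholder, and the exact spacing of the pasted output depends on your scilab version, so don't worry if yours looks a little different.

```scilab
// Jane Doe
// The line above is a comment with my name; scilab skips it when I press F5.

x = 5
// Output pasted from the console, with "// " typed in front of each line:
//  x  =
//     5.
```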
Finally, it's time to show off your new scilab skills "for the record."
(1) Using the colon operator, create an array called "myarray" that has the same length
as the number of letters in your last name and counts up from 1.
(2) Create a second array that is the square root of the first. Call the second array "rootarray". How many elements are in "rootarray"? If it's not the same as the number of letters in your last name, you have a problem.
(3) Compute myarray divided by rootarray. You can name the result "ratio". Careful! Check that ratio has more than one element. If it doesn't have the same number of elements as the number of letters in your last name, go back and review the section on "Simple Math" above.
(4) Multiply ratio times rootarray. Again, be careful that you use the correct multiplication operator -- if you get one number, that's wrong. Once you've got a full array, examine it. Does the result make sense?
(5) Add a comment to the SciNotes file to answer the question from (4), i.e. explain why the result makes sense.
The final version of your SciNotes file should contain only successful commands and their output -- leave just your most brilliant work for the grader.
Part III: Reading and Plotting Data
First, download testdata.in into the
directory where you keep your scilab files. Reading the data
is now simple: just type "data=read('testdata.in',-1,2)". This
command creates an array with as many rows as the input file and 2 columns (so yes, you do need to know
how many columns are in the input file before reading it -- in this case we constructed the file to have 2 columns). If you prefer to specify the number of rows
to read, replace -1 with a number -- this option could be useful if
you have a very large file, and you just want to play around with the
first few rows of data. Note that read *assumes* your data is in
column form, so if there's a header with column names, you should
remove that before reading.
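If you'd like to try "read" on something small first, here is a sketch that writes a tiny stand-in file and reads it back. The file name "example.in" and the numbers in it are made up for illustration and have nothing to do with testdata.in; TMPDIR is scilab's own temporary directory.

```scilab
// Write a hypothetical 3-row, 2-column file (made-up numbers)
fname = fullfile(TMPDIR, 'example.in');
mputl(['10 95'; '15 89'; '30 91'], fname);

data   = read(fname, -1, 2)   // -1 means "all rows"; 2 is the number of columns
first2 = read(fname, 2, 2)    // only the first 2 rows
size(data)                    // rows and columns actually read: 3 and 2
```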
Now, you have all your data in one array. If you want to work with different columns, it is helpful to name them and extract them from the array. For example:
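A minimal sketch of the idea is below. It ASSUMES that humidity is the first column and temperature is the second; the column order of testdata.in is not stated here, so check the file before relying on it. The small "data" array is a made-up stand-in for what "read" returns.

```scilab
// Hypothetical stand-in for the array returned by read() (made-up numbers)
data = [10 95; 15 89; 30 91];

humidity    = data(:, 1);   // ASSUMPTION: column 1 holds humidity
temperature = data(:, 2);   // ASSUMPTION: column 2 holds temperature
```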
To plot temperature vs. humidity, you can just type "plot(humidity,temperature)" where the desired x-axis is listed first. This should pop up a plotting window with the data points connected by lines -- rather a mess.
To beautify this plot, we can specify the output more: "plot(humidity,temperature,'b.',"MarkerSize",2)" will use blue dots with dot size 2 (most obvious colors work, e.g. r for red, g for green). Type this in, then look back at the plot window. Unfortunately, the mess is still there, we just overplotted points on top of it. Type "clf();" to clear the plot window, then try the same thing again: "plot(humidity,temperature,'b.',"MarkerSize",2)". This should look much better.
Now, to add axis labels and a title, type "xtitle('Fantastic Plot #1','humidity','temperature')" in the Scilab Console. If you didn't want (for example) the overall title you could just put nothing between the quotes: "xtitle('','humidity','temperature')". Note that the first title is two single quotes next to each other. You can also change the axis labels using the options under "Axes properties" in the "Edit" menu on the plot window. Try playing with the little magnifying glasses too, using the magnifying glass on the left to zoom in on the humidity range from 10-40 and the temperature range from 80-100, thus excluding the one outlying data point in the bottom right of the plot. Of course, scientific integrity demands that you should never cut out a "bad" data point from a real data set without explaining why you're doing so, and having a very good reason!
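Put together, the plotting steps so far look like the sketch below. The five data points are made up so that the example runs on its own; with the real data you would skip the first two lines.

```scilab
humidity    = [12 18 25 33 38]';   // made-up values for illustration
temperature = [88 92 85 97 90]';

clf();                                                // start from an empty plot window
plot(humidity, temperature, 'b.', "MarkerSize", 2);   // blue dots, x-axis variable first
xtitle('Fantastic Plot #1', 'humidity', 'temperature');   // title, x label, y label
```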
Suppose you wanted to subselect certain data from your dataset for a legitimate reason, for example, let's say you just want to look at the temperature on days with humidity less than 20%. Rather than looking through the data, you can use the "find" function to select out those particular days. Type in "sel=find(humidity < 20)" without the semicolon, so that you can see what sel is. What are the numbers in sel? These numbers are the indices of the data points that meet our criteria (humidity is less than 20%). To check that sel does indeed find the data points where the humidity is less than 20%, type "humidity(sel)". Now come up with a command to show the temperature values where the humidity level is < 20%. To check your answer, the temperature values should be: 89 and 93.
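Here is how "find" behaves on a small made-up array, so you can see the difference between the indices and the values they point to. These numbers are not from testdata.in.

```scilab
humidity = [12 18 25 33 38]';   // made-up values for illustration
sel = find(humidity < 20)       // indices of the matching points: 1 and 2
humidity(sel)                   // the matching values themselves: 12 and 18
```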
You can join multiple selection criteria together by using the "&" sign. Let's say rather than zooming in on your plot like we did earlier, you decide you just want to plot the data that meet the same criteria: i.e., temperature ranges from 80-100 and humidity from 10-40. To start this selection, write "sel2=find(temperature > 80 & temperature < 100)". Go ahead and overplot this selection: "plot(humidity(sel2),temperature(sel2),'r*',"MarkerSize",2)". You should find that the overplotted symbols range from 80-100 in temperature. Finish the selection to include the humidity range. Overplot using 'g+' (green plus signs). Save this final find command in your SciNotes file.
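A tiny sketch of "&" on a made-up array (the names "v" and "inside" are illustrative): a point is selected only if both conditions are true for it.

```scilab
v = [5 12 18 25 33];              // made-up values
inside = find(v > 10 & v < 30)    // indices where both tests pass: 2 3 4
v(inside)                         // the selected values: 12 18 25
```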
To save your plot so you can submit it with your SciNotes file, first retitle the plot with your name and the assignment, e.g., "Jane Doe Scilab Tutorial", then export it (the zoomed in version, with the bottom right point cut out) to a file. You can do this from the File menu, choosing "Export to" and then changing "Files of type" to "PDF image". Give the file a name, hit save, and you're done.
Close scilab, then open your SciNotes file in a text editor and print it out, and also print the plot you made. You should submit both your SciNotes file and the plot as PDF files.