SPO600 Lab 6 Vectorization Lab

In this lab we were asked to write a very simple program that the compiler can auto-vectorize when the program is compiled with optimization enabled. This is the program:

lab6Program.PNG

This is a simple program in which two arrays filled with random values are added together and the result is stored in a third array. Compiling the program above with the -O3 flag causes the code to be vectorized. Running objdump on the compiled code gives the following result. I will only show the <main> section, with comments on what I think is happening between the instructions.

part1vector

vectorpart2

As for what is important within the vectorization, the following are the key instructions for vector addition:

LD1 {Vt.<T>}, [base]

ADD Vd.<T>, Vn.<T>, Vm.<T>

ST1 {Vt.<T>}, [base]

LD1 loads one-element structures from memory into a vector register, ADD adds two vector registers lane by lane, and ST1 stores a vector register back to memory. To add the elements together we use the following instructions:

SXTL Vd.<Td>, Vn.<Ts>

SXTL2 Vd.<Td>, Vn.<Ts>

ADDP Dd, Vn.2D

SXTL takes the signed integers in the lower half of the source vector, sign-extends them to twice their width, and stores them in the destination vector. SXTL2 does the same for the upper half. ADDP then adds the two 64-bit elements of its source vector together and stores the sum in the scalar destination register.

 

Release 0.2 bug fixes

For the 0.2 release I decided to take an issue about odd behavior of the right-click menu in Firefox on a Mac.

The link to the issue can be found here : https://github.com/mozilla/thimble.mozilla.org/issues/1728

I think this issue will help me get more familiar with the way Thimble and Brackets interact, and will help me explore more ways of tracking down an issue within an open source project.

Since this issue may turn out to be fairly small depending on its fix, I will most likely grab another one once this is figured out. For the next issue, I want something with a more complicated fix: more of a coding challenge than a small condition check.

One small problem I may run into is that the issue might only occur on one operating system. I do own a Mac, and my first task will be trying to reproduce the issue. However, I may need to switch gears depending on whether the issue can be reproduced on the same system or on a different one.

I will post updates on my findings, and on the next issue I pick, in later blog posts.

 

 

Release 0.1 Fixing a bug in Thimble

For the first release in OSD600, I chose to fix a bug found in the Thimble project run by Mozilla. The issue is located here: https://github.com/mozilla/thimble.mozilla.org/issues/1632

It was a minor bug in the code editor which caused the font size to have a minimum of 1px and a maximum of 72px when using the increase/decrease font size buttons. The first real challenge was installing the necessary components to reproduce the bug locally. To do this, I installed Thimble and Brackets locally and, after some debugging, was able to get the server running and reproduce the issue.

thumblerunninglocally

To figure out how to fix the issue with the font sizes, I worked backwards from the button and tried to connect it to the code. I found 4 possible places where either the font size gets set or a function is called to attach a listener that handles the change. After much investigation and searching, I came across this piece of code.

bugIssue.PNG

As you can tell, a minimum and a maximum for the font size were already set, so the solution was simply to change them to appropriate values: in this case 8 for the minimum and 32 for the maximum.

After fixing the bug I committed the changes locally and submitted the pull request at:

https://github.com/mozilla/brackets/pull/572

As of today the request has been merged into the main branch and the issue is now fixed!

If you would like to check out the Thimble project, here are the two links:

Thimble : https://github.com/mozilla/thimble.mozilla.org

Brackets : https://github.com/mozilla/brackets

SPO600 Algorithm Selection Lab

The purpose of today’s lab was to see how much the choice of algorithm matters when solving a problem. The challenge was to create a program capable of adjusting the volume of a sequence of sound samples by a factor in the range 0.000 to 1.000.

Designing the code – Solution A

For the first design, we created a simple table that stored a range of random values (0-36750) to simulate a sound sample. Using that table, we multiplied each element by the specified volume factor. The program looks as follows:

lab5simple

Note: We only want to time the actual processing of the sound adjustment and not the initialization, hence the timer starts after all initialization is complete.

Designing the code – Solution B

The second design involved creating a table of pre-calculated values that we use to adjust the sound samples.

lab5table

Note: As you can see in this variation of the program, most of the calculations are done before our timer starts, and during processing we only need to access the index which holds our result.

Analyzing The Result

  1. The optimization levels gave different results, with the -O3 optimization flag being the fastest, about 6 times faster than with no optimization.
  2. The distribution of the data does not matter, as in both cases the code processes every piece of information in the array.
  3. Samples fed at 44100 samples per second on two channels (88200 samples per second in total) can be handled by both algorithms, as the measured rate is about 125 million samples per second.
  4. The memory footprint of the second approach is slightly larger because of the extra precomputed table that it needs to look values up from.
  5. The performance results are as follows: Lab5Result.PNG
  6. The simple multiplication solution is the faster algorithm in this case, but I believe that is because the processor can calculate the needed data very quickly. On a slower processor I think we could see the table algorithm catching up or even overtaking it in speed, as long as there is enough memory to hold the pre-computed values.

OSD Lab 3 – Setting up Thimble and Working on a local machine

Thimble is an open source project by Mozilla that provides a simple live code editor for HTML pages. To develop and work on Thimble, one must be able to set it up locally. To get it to work I had to download and install the following software:

  1. Node.js
  2. VirtualBox
  3. Vagrant
  4. Brackets

Node.js is required to launch a copy of Brackets locally; it was simply installed from the main website. VirtualBox and Vagrant go hand in hand: Vagrant uses VirtualBox to run a script that sets up the virtual machine, downloads the required software, and provides live updates when changes are made to the files. Finally, the last piece is a copy of Mozilla's Brackets fork, the editor embedded in Thimble that does the actual editing work.

After downloading all of those programs and launching their respective services, I was able to launch Thimble locally.

ThumbleRunningLocally.png

SPO600 Lab 4 Compiling C Code

This post focuses on compiling C code with different options and seeing what the compiler does to the end result, by looking at the objdump output of the compiled program.

The program we will use is:

#include <stdio.h>

int main() {
    printf("Hello World!\n");
}

Initially we will compile it using the following options:

-g               # enable debugging information
-O0              # do not optimize (that's a capital letter and then the digit zero)
-fno-builtin     # do not use builtin function optimizations

The 6 changes we will try, to see how each one modifies the result, are:

(1) Adding the compiler option -static.

(2) Removing the compiler option -fno-builtin.

(3) Removing the compiler option -g.

(4) Adding additional arguments to the printf() function in your program.

(5) Moving the printf() call to a separate function named output(), and call that function from main().

(6) Removing -O0 and adding -O3 to the gcc options.

Option 1.

Adding the -static option to our compiler options made the program at least 900 times larger. Running objdump on the program shows that the C library functions it depends on are copied directly into the executable. This has two effects. First, the program is larger: because it no longer relies on the dynamic linker to load shared libraries at run time, all the required library code must be inside the executable itself. Second, the program may run slightly faster, because calls go directly to functions inside the program instead of being resolved through shared libraries, although it is hard to tell on a program this small. Overall, -static links all required library code into the executable, so it no longer needs shared libraries at run time.

Option 2.

Removing the compiler option -fno-builtin: judging by the objdump output, the compiler tries to find a more direct function for what the code is doing. Instead of calling printf(), it looks for a simpler underlying function that it can call to improve performance.

Option 3.

Removing the compiler option -g removes the debugging information that the compiler would otherwise add to the executable, such as line numbers and other details used by a debugger. The machine instructions themselves are not affected, but leaving this information out reduces the size of the compiled program.

Option 4.

Adding additional arguments to the printf() call uses additional registers to hold the arguments. After a certain number of arguments, however, the remaining ones are placed on the stack, which can reduce performance because the CPU has to read them from memory.

Option 5.

Moving the printf() call to a separate function and calling it from there makes no real difference, except that main now calls the new function first, which then calls printf(). This slightly increases the size of the compiled code.

Option 6.

Removing -O0 and adding -O3 tells the compiler to optimize the program. Here is a short list of what each level does:

0 – No optimization
1 – Do some optimization
2 – Do most optimizations that do not trade size for speed
3 – Do more aggressive optimizations on top of level 2, which can increase code size

To finish up, I just want to point out that the compiler does a lot of work that we may not even know about. Giving the compiler free rein can lead it to make drastic changes to optimize the program for the machine it runs on. There are a lot of options, and in some cases those options can have a big impact on your program's performance, so playing around with them may be worth your time.

SPO600 Lab 3 Assembly Language

Writing in assembly language has been a fun experience. I have always been interested in seeing how assembly is written and how it operates. This lab gave me a lot of insight into how assembly works and the benefits of writing in it.

The task involved creating a simple program that prints text and a number in a loop, like this:
output

The program starts at 0 and loops until 30, printing “Loop: ” followed by the current iteration number. To achieve this, I wrote the following program in x86 assembly.

x86code.PNG

Now I will break the code down into parts and try to explain, to the best of my ability, what the code is doing.

start

This is the initialization part of the program. We set our loop range with start and max, and put the value of start into register r15. We also put 0x30 into register r12. This is the ASCII value of the character 0 in hex, and we will use it later to check whether our quotient is 0 so that we can drop leading zeros when printing.

loop.PNG

Now here is the juicy part, containing almost all of the logic behind the program. Starting off, we have a mov instruction to clear out the rdx register, which is needed for div to work properly with remainders. After that we have our division logic, simply storing the values and calling div. Note that we keep both the quotient and the remainder, since we need both of them to print two-digit values. We then convert the quotient and remainder to ASCII and store the remainder at msg position 7. To suppress a leading 0 we check whether the quotient is 0, which is where we use the ASCII 0 value. If it is 0 we skip adding it to the msg; if it is not 0 we add it right before the remainder, at msg position 6.

print.PNG

Now that we have our msg modified, we are ready to print it to the screen. By placing the stdout parameters into the necessary registers and calling syscall, we print 1 iteration of the msg to the console. After the print is finished we do a loop check; think of it as a do { } while () loop. We increment the counter and compare it to the maximum number of times we want to run. If it does not match, we jump back to the loop label and start the process again.

exit

Finally we have the exit call, and our closing .data section, which holds the string we write the counter into and the length of the msg itself for the stdout parameter.

That’s it for the program's logic. Now let’s quickly go over the small differences in the aarch64 version of the same program. Here is the code used to run the same program in an aarch64 environment:

aarch64.PNG

I will not explain this program in detail, as its logic is almost exactly the same as the one in x86 assembly. I will, however, point out some of the differences between the two assemblers. One big difference is syntax: in the x86 syntax used here, operands are written source first (value -> register), while in aarch64 the destination comes first (register <- value). There are also some differences in the instructions and registers. For example, aarch64 has 31 general-purpose registers compared to 16 on x86_64, which is nicer for bigger programs that want to avoid using memory. Honestly, the difference is mostly in the syntax; the logic behind the assembly language is the same in both architectures. I personally found x86 a bit easier to write, but I think that is because I did most of my examples in it.

Thanks for reading the post; I hope it was useful in one way or another.

Contributing to open source elasticsearch project

Project

elasticsearch is a Node.js client that provides a one-to-one mapping with the REST API and other official clients. It is currently downloaded more than 250,000 times on the npmjs site. When I was going through it I saw that its package.json file was missing the keywords field, so I decided to add the common keywords to that file and submit a pull request with my changes.

The project can be found at : https://www.npmjs.com/package/elasticsearch

My changes

The change I made was a simple addition of a couple of keywords:

capture

I made those changes directly via GitHub and submitted my pull request here:

https://github.com/elastic/elasticsearch-js/pull/494

The pull request currently shows an error because of an unsigned document; however, I have now signed it and am waiting to see whether the merge will happen.

Gimp and Swift Software Contribution Comparison

Gimp

There are plenty of ways the GIMP community allows one to help develop their software, from programming a new feature to adding documentation and writing tutorials. Focusing on programming and modifying the source code, GIMP uses Git for source control, with the GNOME Foundation hosting all the code repositories. The GIMP community also uses an IRC channel and a developer mailing list. Once you are ready with your changes, you announce them in the channel and on the mailing list so that they are known to the developer community. To submit your changes, you create a patch file and send it to the bug/enhancement request on Bugzilla or to the mailing list. Once the changes are approved they are merged into the code base and you have become a GIMP contributor. I looked at some of the commits made on the Git side of GIMP, specifically this commit. From the looks of it, the commit is reviewed using the mailing list and the patch report, and then pushed onto the main branch.

GIMP development overview : https://www.gimp.org/develop/

Swift

The second open source project I looked at was Swift. The Swift community also allows you to contribute in many ways. Again looking at the code development side of contribution, Swift uses a developer mailing list for communication, but mostly GitHub as its main place for code review and approval. Looking at this pull request, the contributor first documents their changes, and a group of verified personnel go over them to see if there are any merge conflicts and/or bugs. They then approve the pull request to be merged into the main branch.

Swift development overview : https://swift.org/contributing/

 

SPO600 Software Install Comparison

There was not much difference between installing software under the GNU license and software under my pick, the MIT license. Both provided a well-made makefile; however, when installing httptunnel I was provided a configuration file, which helped create the makefile for the software. For the other license I chose a piece of software named streama, which is under the MIT license. The process for both was pretty much standard: get the binaries and run the install that comes with the packages. If you want to learn more about the two programs, you can get a better look at them here:

streama : https://github.com/dularion/streama

httptunnel : https://www.gnu.org/software/httptunnel/