SPO600 Lab 4 Compiling C Code

This post focuses on compiling C code with different options and seeing what the compiler does to the end result, by looking at the objdump output of the compiled program.

The program we will use is :

#include <stdio.h>

int main() {
    printf("Hello World!\n");
}

Initially we will compile it using the following options :

-g               # enable debugging information
-O0              # do not optimize (that's a capital letter and then the digit zero)
-fno-builtin     # do not use builtin function optimizations

The six changes we will try, one at a time, are :

(1) Adding the compiler option -static.

(2) Removing the compiler option -fno-builtin.

(3) Removing the compiler option -g.

(4) Adding additional arguments to the printf() function in your program.

(5) Moving the printf() call to a separate function named output(), and calling that function from main().

(6) Removing -O0 and adding -O3 to the gcc options.

Option 1.

Adding the -static option made the program at least 900 times larger. Running objdump on the result shows that the C library code the program depends on has been copied directly into the executable. This has two main effects. First, the program is larger: it no longer relies on the dynamic linker to load shared libraries at run time, so every function it needs must be inside the executable itself. Second, the program may run slightly faster, because calls go straight to code inside the executable instead of being resolved against a shared library, although the difference is hard to measure on a program this small. Overall, -static pulls the required library code into the executable and removes the need for dynamic linking at run time.

Option 2.

Removing the compile option -fno-builtin lets the compiler use its built-in knowledge of standard library functions. Judging from the objdump output, the compiler looks for a more direct function to call for what the code is doing, so instead of calling printf() it calls a simpler underlying function to improve performance. With GCC this typically shows up as a call to puts(), since the format string here contains no conversions and ends in a newline.

Option 3.

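Removing the compile option -g drops the debugging information that the compiler otherwise adds to the executable, most notably the mapping from machine code back to source line numbers. This information is stored in separate debug sections rather than in the program's instructions, so the generated code stays the same, but the file gets smaller. Useful in helping reduce the size of the compiled program.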

Option 4.

Adding additional arguments to the printf() call makes the compiler use additional registers to pass them. However, once the argument registers run out, the remaining arguments are placed on the stack. This can start to reduce the performance of the program, as the CPU has to fetch them from memory.

Option 5.

Moving the printf() call to a separate function makes no real difference, except that main() now calls the new function first, which in turn calls printf(). This slightly increases the size of the compiled code.

Option 6.

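Removing -O0 and adding -O3 tells the compiler to optimize the program aggressively (note the capital letter O; a lowercase -o names the output file instead). Here is a short list of what each level does :

0 – No optimization
1 – Do some optimization
2 – Do nearly all optimizations that do not trade size for speed
3 – Do everything in 2 plus more aggressive optimizations, which can increase code size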

To finish up, I just want to point out that the compiler does a lot of work that we may not even know about. Giving the compiler free rein can lead it to make drastic changes to optimize the program for the machine it runs on. There are a lot of options, and in some cases they can have a big impact on your program's performance, so playing around with them may be worth your time.

SPO600 Lab 3 Assembly Language

Writing in assembly language has been a fun experience. I have always been interested in seeing how assembly is written and how it operates. This lab has given me a lot of insight into how assembly works and the benefits of writing in it.

The task involved creating a simple program that prints text and a number in a loop, like this :
[Image: output]

The program basically starts at 0 and loops until 30, printing “Loop: ” followed by the current iteration number. To achieve this result, I wrote the following program in x86 assembly.

[Image: x86code.PNG]

Now I will break the code down into parts and try to explain, to the best of my ability, what the code is doing.

[Image: start]

This is the initialization part of the program. We set our loop range with start and max, and load the value of start into register r15. We also put 0x30 into register r12. This is the ASCII value of '0' in hex, and we will use it later to check whether our quotient is 0, so that we can drop leading zeros when printing.

[Image: loop.PNG]

Now here is the juicy part, containing almost all of the logic behind the program. Starting off, we have a mov instruction to clear the rdx register. This is needed for div to work properly, because div treats rdx as the upper half of the dividend and also writes the remainder there. After that we have our division logic, simply storing values and calling div. Note that we store both the quotient and the remainder, since we need both of them to print double-digit values. We then convert the quotient and remainder to ASCII and store the remainder at msg position 7. To suppress a leading 0 we check whether the quotient is 0 or not, which is where we use the ASCII 0 value. If it is 0 we skip adding it to the msg; if it is not 0 we add it right before the remainder, at msg position 6.

[Image: print.PNG]

Now that we have modified our msg, we are ready to print it to the screen. By placing the stdout parameters into the necessary registers and calling syscall, we print one iteration of the msg to the console. After the print is finished we have to do a loop check; think of it as a do { } while () loop. We simply increment the counter and compare it to the maximum number of times we want to run. If they do not match, we jump back to the loop label and start the process again.

[Image: exit]

Finally we have the exit code, and our closing .data section, which holds the string we write the counter into and the length of the msg, which is passed as a parameter when writing to stdout.

That’s it for the program’s logic. Now let’s quickly go over the small differences in the aarch64 version of the same program. Here is the code used to run the same program in an aarch64 environment :

[Image: aarch64.PNG]

I will not go into explaining the program, as it uses almost exactly the same logic as the one in x86 assembly. I will, however, point out some of the differences between the two. One big difference is syntax: in the x86 code the value comes first and the register second (value -> register), while in aarch64 the register comes first (register <- value). There are also some differences in the instructions and registers. For example, aarch64 has 31 general-purpose registers compared to 16 in x86_64, which is nicer for bigger programs that want to avoid going to memory. Honestly, the difference is mostly in the way the syntax works; the logic behind the assembly language is the same in both architectures. I personally found it a bit easier to type in x86, but I think that was because I did most of my examples using it.

Thanks for reading the post, hope it was useful in one way or another.

Contributing to open source elasticsearch project

Project

elasticsearch is a Node.js client that provides a one-to-one mapping with the REST API and the other official clients. It has currently been downloaded more than 250,000 times on the npmjs site. When I was going through it, I saw that its package.json file was missing the keywords field, so I decided to add some common keywords to that file and submit a pull request with my changes.

The project can be found at : https://www.npmjs.com/package/elasticsearch

My changes

The changes I made were a simple addition of a couple of keywords :

[Image: capture]

I made those changes directly via GitHub and submitted my pull request here :

https://github.com/elastic/elasticsearch-js/pull/494

The pull request currently shows an error because of an unsigned document; however, I’ve now signed it and am waiting to see if the merge will happen.

Gimp and Swift Software Contribution Comparison

Gimp

There are plenty of ways the GIMP community allows one to help develop their software, everything from programming a new feature to adding documentation and writing tutorials. Focusing on programming and modifying the source code, GIMP uses Git for source control, with the GNOME Foundation hosting all the code repositories. The GIMP community also uses an IRC channel and a developer mailing list. Once you are ready with your changes, you announce them in the channel and on the mailing list so that the developer community knows about them. To submit your changes, you create a patch file and send it to the bug or enhancement request on Bugzilla, or to the mailing list. Once the changes are approved they are merged into the code base, and you have become a GIMP contributor. I looked at some of the commits made on the Git side of GIMP, specifically this commit. From the looks of it, the commit is reviewed using the mailing list and the patch report, and then pushed onto the main branch.

GIMP development overview : https://www.gimp.org/develop/

Swift

The second open source project I looked at was Swift. The Swift community also allows you to contribute in many ways. Again looking at the code development side of contribution, Swift uses a developer mailing list for communication, but mostly GitHub as its main place for code review and approval. Looking at this pull request, the contributor first documents their changes, and a group of verified personnel go over them to see if there are any merge conflicts and/or bugs, and then approve the pull request to be merged into the main branch.

Swift development overview : https://swift.org/contributing/

 

SPO600 Software Install Comparison

The difference between installing software under a GNU license and under my pick, the MIT license, was not much. Both provided a well-made makefile; however, while installing httptunnel I was also provided a configuration file, which helped create the makefile for the software. For the other license I chose a piece of software named streama, which is released under the MIT license. The process for both was pretty much standard: get the binaries and run the install that comes with the package. If you want to learn more about the two programs, you can take a closer look at them here :

streama : https://github.com/dularion/streama

httptunnel : https://www.gnu.org/software/httptunnel/

A quick look at TensorFlow

With the growth of machine learning, many people are looking for libraries and tutorials to learn more about it.

TensorFlow is no exception. Originally developed by engineers on Google’s Brain Team, it is an excellent Python library providing the operations needed for numerical computation using data flow graphs. TensorFlow was open sourced a year ago by Google, which has used it internally for machine learning and deep neural network research. TensorFlow is written in C++ and is highly optimized for deploying computation onto multiple CPUs and GPUs. On GitHub, TensorFlow has a 600-member contributor team and over 12,000 commits, making it one of the more popular projects listed there.

If you are interested in learning more about TensorFlow, you can visit their main website at : https://www.tensorflow.org

If you want to contribute and look at the source code, you can find more information on the official TensorFlow GitHub page at : https://github.com/tensorflow/tensorflow

For more hands-on tutorials and starter material, here are some good starting places : Official TensorFlow Getting Started : https://www.tensorflow.org/get_started/

Website dedicated to Teaching TensorFlow : http://learningtensorflow.com/

Introduction

Good day everyone,

My name is Andrey Bykin, and this blog will be focused on my experience with open source development.

A little bit about me: I am a student attending Seneca College. I have been programming for about four years now, mostly in languages like C++, JavaScript, and Java. To start with, this blog will mostly focus on the open source development course that is currently offered at Seneca.

I will try to write meaningful content about open source projects and the personal encounters I have on this journey. Hope you stay a while and listen.

Andrey Bykin
andreybykin.wordpress.com