[ale] Hello World - in C# - in Mono - in Ubuntu is done

Thu Sep 16 22:45:32 EDT 2010

On Thu, 2010-09-16 at 11:14 -0400, Ron Frazier wrote:
> I have just completed my first Hello World program in C# in Mono in
> Ubuntu.  I even customized it for this group, as indicated by the
> image below.
> 
>                hello world.png
> 
> By this evening, I should have a 1 million object astronomical entity
> tracking system which can accept 2 million hits / second on a website.
> OK, I made that up, maybe I won't finish that today.  8-)
> 
> Seriously, my goal is to be employable as a C# programmer within 6 mo
> to 1 year spending a few hours per day.  I spent several years
> programming Clipper for Delta Air Lines, so I've been through this
> before.  What I haven't done before is the object oriented stuff and
> the GUI stuff and the multi-threaded stuff and the web site stuff.
> So, I know there's a big learning curve.  I'd appreciate any help
> anyone familiar with C# is willing to give.
> 
> I know C# is usually a Microsoft thing, but I understand that quite a
> bit of Linux development is using it too, and that it makes a good
> cross platform language. 

C# (and the CLR platform as a whole) is very expressive and useful.
If anything, it solidifies the belief that Microsoft should have remained
a company devoted to developing software for developers, as it was at
its roots.

I've used it for a few projects, and I rather like it.  However, if I
may, I would like to offer a little bit of advice:  Learn C.  Yes, I'm
serious.  Learn C99 at least---it's useful, and you'll find a very
strong development environment built around GCC and the C programming
language as it is used throughout GNOME and its entire stack.

Why do I say this?  It is because learning to program in C is helpful to
your development as a programmer in nearly every single programming
language that you can imagine.  There are some constructs in Java or C#
or Python or whatever nifty-language-of-the-month you can find that are
insanely easy to use and look very inexpensive if you look no further
than the source code.  But as you get really familiar with the C
programming language you begin to understand that this comes with a
cost---and that cost can sometimes be a million or more lines of
assembly language code.

If you don't believe me, check out Hello World written in C:

#include <stdio.h>

int
main(int argc, char *argv[]) {
  if(argc > 1)
    fprintf(stdout, "Hello, %s!\n", argv[1]);
  else
    fprintf(stdout, "Hello, World!\n");

  return(0);
}

This program looks pretty simple, right?

See what it compiles to:

└─(22:23:%)── gcc -g0 -O0 -o hello hello.c
└─(22:23:%)── ls -l hello
-rwxr-xr-x 1 mbt mbt 8564 2010-09-16 22:23 hello*

Looks small enough, right?  In fact, it's not that large at all; it's
only a little larger than 8 KiB.  This is a 64-bit dynamically linked
ELF file, though, which means that most of the code it needs at run time
lives outside the file, in the shared C library, so the real footprint
is larger than it looks.  This little native-code executable contains
14 functions (easily seen from "objdump -d hello", though the output of
that command is too large to include here).  The function named main
consists of 31 assembly language instructions (6 of which are the
prologue, 2 of which are the epilogue, and 3 of which are NOP after the
epilogue) and of those 4 are branch instructions (CALL or JMP or
variants thereof).

Let's see what happens if we include the entire code for this little
example program in a single executable:

└─(22:24:%)── gcc -g0 -O0 -static -o hello hello.c
└─(22:28:%)── ls -l hello
-rwxr-xr-x 1 mbt mbt 741585 2010-09-16 22:28 hello*

Hrm.  We have just reached the point where Hello, World will no longer
fit on a 720 KiB floppy disk (which provides 737,280 bytes of storage
without being formatted with a filesystem on it).  Ouch.

This file contains 60+ functions (I stopped counting at 60), and the
largest one that I encountered has more assembler instructions than I
care to count.  This is the cost that we pay (at least in part) for
having the very feature-rich, very robust C runtime library provided by
the GNU project.

Now, think about higher-level languages that are used.  Since we're
talking about C#, let's take a look at the Mono runtime.  The Mono
runtime's main executable (at least on my system) is:

└─(22:35:%)── ls -l $(which mono)
-rwxr-xr-x 1 root root 2536688 2010-04-22 12:30 /usr/bin/mono*

That's over 2 MiB!

It's linked with 10 libs:

└─(22:35:%)── ldd $(which mono)
	linux-vdso.so.1 =>  (0x00007fff1a9f1000)
	libgthread-2.0.so.0 => /usr/lib/libgthread-2.0.so.0 (0x00007fda12d65000)
	libglib-2.0.so.0 => /lib/libglib-2.0.so.0 (0x00007fda12a87000)
	librt.so.1 => /lib/librt.so.1 (0x00007fda1287e000)
	libdl.so.2 => /lib/libdl.so.2 (0x00007fda1267a000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x00007fda1245d000)
	libm.so.6 => /lib/libm.so.6 (0x00007fda121d9000)
	libc.so.6 => /lib/libc.so.6 (0x00007fda11e56000)
	libpcre.so.3 => /lib/libpcre.so.3 (0x00007fda11c28000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fda12f8d000)

(And that's to say nothing of any P/Invoke invocations used to pull in
extra libraries such as GLib or GTK or what-have-you, and that doesn't
count the class libraries that are part of the managed-code runtime
library.)  So we can see already that this is very expensive.  Now, as
is the case with shared library systems, a lot of those libraries are
already going to be present in the virtual memory system before Mono is
ever started, so at most we're looking at loading 2.5 MB of code before
your managed code can run, plus the base runtime class libraries,
however large they are.  The JIT will then allocate memory for use when
compiling all those methods into native code (the CLR functions by
reading bytecode, essentially a high-level assembly language, and then
translating it method-by-method into native binary opcodes, possibly
with an intermediate step at assembler source code itself).

If you're talking about running only a single application that uses
Mono, this can sound quite expensive---because it is.  If you are
running many managed code applications, it becomes cheaper, although not
by much unless the class libraries are AOT compiled so that they are
available as native system shared libraries.  If that is not done, then
every instance of the CLR (AppDomain) will consume all the memory
required for JIT compiled versions of the code it is executing.

That means that even a very small Hello World program written in C# will
be loads larger than the ~740 KiB Hello World in C.  There is,
unfortunately, no way around that at present.  If C# were something that
could be compiled by the GNU compilers as a native language and there
were a native-code runtime for the system as there is with Java (well,
as there mostly is with Java), then this might improve things a bit (at
least if the class libraries weren't lumped together as is the case with
the GNU classpath libraries when they're compiled to native code).

In any case, it's something to think about.  If you become very familiar
with programming in C, it can't hurt---and it will likely help you a
great deal.

Perhaps one thing to mention:  I'm not saying that you should think
about this all the time and use it to optimize code before you've even
found a bottleneck!  What I am saying is that if you are aware of what
these things can turn into as they get closer and closer to the bare
metal, you'll be better equipped to ask the right questions when you do
have a performance bottleneck of some sort, and you'll be more likely
to look further than just your own code when you do, because you'll
understand that every layer adds overhead, sometimes more than you can
possibly imagine, even as a very experienced programmer.

Anyway, enough with the going on and on---congratulations on starting
down the road with a new programming language, and may you find other
languages to learn to be fluent in as well.  The more the merrier!

	--- Mike