[ale] I've decided again to learn programming again

Michael B. Trausch mike at trausch.us
Sun Oct 16 00:13:28 EDT 2011


On Sat, Oct 15, 2011 at 11:58:28AM -0400, planas wrote:
> On Sat, 2011-10-15 at 10:47 -0400, Michael B. Trausch wrote:
>>  On Fri, Oct 14, 2011 at 06:42:11AM -0400, Ron Frazier wrote:
>>> Maybe, someday 8-( I'll see a GUI hello world that I created on a
>>> tablet device.  The very large learning curve for this is
>>> intimidating and frustrating.
>>
>> There is nothing "simple" about creating graphical user interfaces.
>> That is one reason that we have created such high-level
>> abstractions for them.  HTML, for example, and XML are often used
>> to describe GUI environments because doing so programmatically can
>> be lengthy and tedious.  The event-driven approach to programming
>> in GUIs is also a bit difficult to get a handle on when what is
>> usually taught first is procedural programming.  (That said, once
>> one "gets" event-driven programming, I think they find that they'd
>> rather not do it any other way!)  However, with event driven
>> systems you have to be aware of the events your program will have
>> to process or your program could "fail catastrophically" (e.g.,
>> crash) and that's of course no good.
>>
>> To do "Hello World" in plain C using the Win32 API, if memory
>> serves, requires somewhere between 100-200 lines of code.  To do
>> the same in plain C on GTK+ is less (maybe 50 lines, if I remember
>> correctly), owed in part to the fact that GTK+ has a completely
>> different design and you don't have to configure every little thing
>> before displaying something.
>>
>> You can, in a high level language such as Java, display a Hello,
>> World on both of them in around 20 lines of code.  Though when you
>> think about it, that is a very heavily abstracted 20 lines of code.
>> Very easy for us programmers.  Turns into millions of lines of
>> assembly code on the computer's side, though.
>>
>> But that's what all abstractions do: make things easier for us, and
>> slightly more difficult for the computer (meaning that it requires
>> more memory, more processing power, or more secondary storage than
>> it would otherwise).  But then again, if those abstractions prove
>> to be too costly, you get rid of them in the late optimization
>> stages of a program's lifecycle.
>>
>>
> As the available computers become more powerful, the overhead costs
> of using abstractions are less meaningful for most applications.  I
> can remember when a computer with 64K of RAM was hot stuff with an
> 8-bit processor.  Now 64-bit processors with multiple GB of RAM and
> massive hard drives are available, and abstractions cause much less
> noticeable penalties.  That is assuming the penalties are even
> noticeable to the user.

My point was more along the lines that it takes time to learn how it
all works even though we have all that abstraction.  It's not really
simple, and it helps to understand that.  The better one understands
how much the abstraction costs, the better code one will write
(although perhaps not at first; it takes quite a long time to truly
understand the costs of various things).

Then again, this is why we have things like profilers.
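
To make that concrete, here is a rough sketch of the sort of thing I
mean, using Python's bundled cProfile module.  The count_words()
function is just a made-up stand-in for whatever your program really
does; the point is that the profiler tells you where the time
actually went instead of making you guess:

    import cProfile
    import pstats

    def count_words(path):
        # Stand-in workload; substitute your real code here.
        with open(path) as f:
            return sum(len(line.split()) for line in f)

    # Profile one call and dump the raw stats to a file...
    cProfile.run("count_words('/etc/passwd')", "profile.out")

    # ...then print the ten most expensive calls by cumulative time.
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)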

Note that I am absolutely not advocating that one go out of their
way to figure out how "expensive" everything is up front.  That'd be
the wrong way to do it.  It *is*, however, helpful to understand what
the abstractions you are using are actually doing under the hood,
because sometimes that knowledge lets you head off a performance
_crisis_ ahead of time.

As a rather contrived and simplified example, imagine that you were
writing a program to extract and delete paragraphs from ODT documents.
There are multiple ways to do it:

 * You can use an ODT library that will handle all the details for
   you.  Let's assume that it uses primary storage to do all its work,
   which is efficient enough for most of the ODT files that are out
   there in the wild.  You'd be working at the level of abstraction of
   the "document", which probably means that all your XML is parsed
   and stored in RAM when you've successfully opened the document.

 * You can use an I/O library with transparent support for ZIP files.
   These provide a very useful abstraction, treating a ZIP file as if
   it were a directory on the filesystem, more or less.  Let's assume
   that it, too, uses primary storage.  In this case you'll be dealing
   with the various XML and binary files within the ZIP file yourself,
   without help.  Saves you some memory and gives you more control,
   at the cost of more work.  (There's a short sketch of this
   approach after the list.)

 * You can extract the ZIP archive to a temporary location in
   secondary storage.  This is more efficient when memory pressure is
   already high, or when the file size (or number of objects, or
   both) is large enough that it would push memory pressure higher.
   It takes a little longer to work with, since you have to worry
   about more of the details yourself.  Only slightly longer, though,
   since the main difference is that you're extracting / recompressing
   on disk instead of in memory.  (This one is also sketched after
   the list.)

 * You can, if you are very crafty (or stupid, a glutton for
   punishment, looking to secure some sort of bragging rights, or
   having to deal with an utterly unreasonable corner case), do all of
   your modifications in-place and make only updates to the ZIP
   container that the files are in.  Doing it this way would cost you
   almost nothing in terms of either RAM or disk space.  However, it
   would be **VERY** costly in terms of development time, because now
   you're talking about twiddling bits mostly in-place on disk, not
   duplicating information and so forth.  You are essentially trading
   the use of RAM and disk for a very efficient (and hopefully
   bug-free) program or library that will take a very long time to
   create and prove correct.
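
To make the middle of that list concrete, here is roughly what the
second option looks like in Python, using only the standard library.
The file name "report.odt" is made up, and the sketch only pulls the
paragraph text out; deleting a paragraph and writing the archive back
out is left as an exercise:

    import zipfile
    import xml.etree.ElementTree as ET

    TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

    # An ODT file is just a ZIP archive; read content.xml straight out
    # of it into memory, without unpacking anything to disk.
    with zipfile.ZipFile("report.odt") as odt:
        content = odt.read("content.xml")

    # From here on the XML is yours to deal with, without help from an
    # ODT library.
    root = ET.fromstring(content)
    for i, para in enumerate(root.iter("{%s}p" % TEXT_NS)):
        print("%d: %s" % (i, "".join(para.itertext())))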
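
And here is roughly the same job done the third way, unpacking to a
temporary directory in secondary storage first.  Again, the file
names are placeholders, and "delete the first paragraph" stands in
for whatever editing you'd really do.  Notice how many more details
become yours to get right; for instance, ODF wants the "mimetype"
entry stored first and uncompressed when you repack the archive:

    import os
    import shutil
    import tempfile
    import zipfile
    import xml.etree.ElementTree as ET

    TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

    def delete_first_paragraph(odt_path, out_path):
        workdir = tempfile.mkdtemp()
        try:
            # Unpack the whole archive into a temporary directory.
            with zipfile.ZipFile(odt_path) as zf:
                zf.extractall(workdir)

            # Edit content.xml on disk: drop the first <text:p> element.
            content = os.path.join(workdir, "content.xml")
            tree = ET.parse(content)
            root = tree.getroot()
            parents = {c: p for p in root.iter() for c in p}
            for para in root.iter("{%s}p" % TEXT_NS):
                parents[para].remove(para)
                break
            tree.write(content, xml_declaration=True, encoding="UTF-8")

            # Repack: the mimetype entry goes first and uncompressed,
            # everything else can be deflated.
            with zipfile.ZipFile(out_path, "w") as zf:
                zf.write(os.path.join(workdir, "mimetype"), "mimetype",
                         zipfile.ZIP_STORED)
                for dirpath, _, files in os.walk(workdir):
                    for name in files:
                        full = os.path.join(dirpath, name)
                        arcname = os.path.relpath(full, workdir)
                        if arcname != "mimetype":
                            zf.write(full, arcname, zipfile.ZIP_DEFLATED)
        finally:
            shutil.rmtree(workdir)

Neither sketch is the "right" way; they just show where the work (and
the memory) goes as you move along that spectrum.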

Of course, these aren't the only possibilities.  You can land
anywhere in between the two ends of that spectrum.  At one end you
have a very RAM-hungry system that doesn't take much time to develop;
at the other, a very efficient program that may take months or even
years to fully develop and test.  (In other words, almost nobody has
a reason to be at that end of the spectrum.)

Often you have no particular want or need to be at either the
fully-abstracted or the never-abstracted end of the spectrum.  It's
usually nice to be somewhere in the middle, and that's one reason
(among many) that you can often find several libraries that do the
same thing but take different approaches.  It's also why some
programs use one library while "competing" programs do things a
different way, and why still others roll their own when trying to
solve the same problem in yet another way.

So really it's not _all_ about the abstraction.  Higher-level
abstractions are often more expensive to use in one way or another
(while saving a great deal of developer time), whereas lower-level
ones are almost always more expensive in terms of developer time.
And sometimes high-level abstractions just aren't well-suited to a
particular instance of a problem because of primary or secondary
storage consumption, CPU consumption, or other performance
characteristics.

So yes, it's absolutely useful to have an understanding of the lower
levels of a system and how "expensive" its abstraction is.  :-)

    --- Mike

