[ale] semi [OT] making learning ruby programming fun?

Alex Carver agcarver+ale at acarver.net
Mon Mar 25 19:39:52 EDT 2013


On 3/25/2013 15:48, Ron Frazier (ALE) wrote:
> Hi Alex,
>
> Thanks for the reply.  I sense that you're a C programmer / fan; and
> that's totally fine.  If you ask 50 programmers how to do something,
> you'll get 85 answers.  So, we may never agree on some of these
> points.  Above all, I want my language to help me solve the
> application problems I'm trying to solve, not entrap me and bog me
> down while watching out for booby traps.  I think Go and Ruby are
> built using that philosophy.

I'm not, really.  I was just arguing the case that garbage collection
should not be relied upon.  Garbage collection already doesn't always
work, because a program doesn't necessarily know *when* a bit of memory
is no longer needed.  That's already the case with higher-level GC
routines (a la Firefox and the like), but it also happens at lower
levels.  Being bogged down is not the way to look at it.  Learning to
write good, clean code is what anyone should do.  It's not as hard if
you start that way instead of trying to catch up later.  Clean code
writing transfers to nearly every language.  Bad code writing
eventually makes your forehead bloody.

Back to GC, there's a specific overhead involved in allocating memory.
Performing the allocation, destruction and reallocation starts to take
cycles.  The basic approach is to allocate what you need and then
deallocate it when you don't need it anymore.  But that's not
necessarily when the scope changes.  In some cases you may want to
allocate some variables persistently for the life of the program and
release them only upon terminating the application.  This can be
especially important if a function is called frequently, where the
allocate/deallocate overhead starts to add up.  At the very least,
knowing how the memory is utilized can help you write a faster program.
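
To make that concrete, here's a rough sketch in C (the function names
and buffer size are made up): allocate a working buffer once, reuse it
on every call, and free it only at shutdown, instead of paying for a
malloc()/free() pair on every invocation of a hot function:

#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 4096

static char *work_buf = NULL;   /* allocated once, reused across calls */

/* Called frequently; reuses the persistent buffer instead of
   allocating and freeing 4 KB on every call. */
void process_record(const char *record)
{
    if (work_buf == NULL)
        work_buf = malloc(BUF_SIZE);
    if (work_buf == NULL)
        return;                       /* allocation failed, bail out */

    strncpy(work_buf, record, BUF_SIZE - 1);
    work_buf[BUF_SIZE - 1] = '\0';
    /* ... crunch the record in work_buf ... */
}

/* Called once when the application terminates. */
void cleanup(void)
{
    free(work_buf);
    work_buf = NULL;
}

A garbage collector can't know that work_buf is meant to live for the
whole run; only the programmer does.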

> I have a couple of more comments below.
>
>
> Alex Carver <agcarver+ale at acarver.net> wrote:
>
>> On 3/25/2013 07:39, Ron Frazier (ALE) wrote:
>>>
>>> According to the link posted above, the top 10 languages, and
>>> some of
>> the reasons I've rejected some of them, are as follows.  No offense
>> is intended to anyone that programs in these languages.
>>>
>>> 01) Java - security problems
>>>
>>> 02) C - not modern garbage collected
>>
>> I would argue that it's YOUR job as the programmer to collect your
>> own garbage.  My landlord doesn't take the trash out for the other
>> tenants, why should they take my trash out?
>>
>> Cleaning up after a program because of something out of the
>> ordinary (unexpected crash, kill -9, whatever) is different than
>> just failing to
>>
>
> Actually, in the 21st century, I believe the programmer should NEVER
> have to think about things like memory allocation ... EVER!  There's
> no reason for it unless you're doing something in assembly.  I never
> had to mess with it in Clipper.  Presumably, I wouldn't have to in
> Java, Go, or Ruby either.  You declare a variable, or perhaps
> instantiate it, memory gets allocated.  It goes out of scope, it
> gets deallocated.  It's needed again, it gets allocated again.  If
> it's a static program wide variable, it stays allocated as long as
> the program runs.  This type of low level resource allocation should
> only be attempted manually when it's impossible for automation to do
> it effectively.  It's more important for the programmer to figure out
> the logic that his program needs to have to work, not to figure out
> how to make the computer work.  We don't mess with that level of
> housekeeping in allocating sectors on the HDD any more.  I see no
> reason to do it with memory either.
>
>> deallocate the RAM that you allocated during program execution.

In this era of big data, memory allocation is very important and should
not be ignored.  To ignore it (or be ignorant of it) is to invite sloppy
coding.  I'm no saint, I've done plenty of sloppy one-offs to achieve
some end result and let the machine handle the cleanup when the program
ends.  But, for anything big that I've written, I do take time to watch
the memory footprint and dump things when I don't need them.

We may have gigs of memory available on the machine, but our data still
outstrips our memory.  It's been that way for decades and will always be
that way.  People generate far more data than our computers can possibly
hold in RAM (ignoring swap/disk space).  In my daily work, my data
logging systems will accumulate about 20 GB of related data per process 
run.  When it comes time to crunch that data I can't possibly read it 
all in at once, I don't have enough RAM to do that.  So I have to 
consider very carefully how memory is used and when I can reuse RAM to 
achieve my goal without crashing the machine.  The death throes of a
machine that has filled its RAM are wild: it swaps to the point that
the disk never stops flitting its heads around because nearly
everything has been paged out.
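
A minimal sketch of that kind of chunked crunching in C (the file name
and chunk size are made up): read a fixed-size block, process it, and
reuse the same buffer for the next block, so the footprint never grows
past the chunk size no matter how big the file is:

#include <stdio.h>
#include <stdlib.h>

#define CHUNK (16 * 1024 * 1024)   /* 16 MB working buffer, reused */

int main(void)
{
    unsigned char *buf = malloc(CHUNK);
    if (buf == NULL)
        return 1;

    FILE *fp = fopen("process_run.dat", "rb");  /* hypothetical 20 GB log */
    if (fp == NULL) {
        free(buf);
        return 1;
    }

    size_t n;
    while ((n = fread(buf, 1, CHUNK, fp)) > 0) {
        /* crunch this chunk, accumulate running totals, etc.;
           the same 16 MB buffer is reused for the next read */
    }

    fclose(fp);
    free(buf);
    return 0;
}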

>>>
>>> 06) PHP - security problems per http://en.wikipedia.org/wiki/Php
>>> "About 30% of all vulnerabilities listed on the National
>> Vulnerability Database are linked to PHP."
>>
>> You skipped the next line:
>>
>> "These vulnerabilities are caused mostly by not following best
>> practice
>>
>> programming rules: technical security flaws of the language itself
>> or of its core libraries are not frequent (23 in 2008, about 1% of
>> the total)."
>>
>> PHP, like many other languages (except Java) isn't inherently
>> vulnerable.  Instead it's how the programmer wrote the code.  I
>> would put good money down that the most widely used "vulnerability"
>> in PHP is
>>
>> an SQL injection attack because someone forgot to sanitize user
>> input. You can't blame the code because there's a bunch of built-in
>> sanitizing
>>
>> routines plus ways to roll your own if you don't like theirs (I've
>> done
>>
>> both).  I could easily argue that C (practically the core of all
>> programming short of assembly, FORTRAN and COBOL) is also
>> vulnerable because I can do something stupid like
>> free(raw_user_input) that the compiler might willingly let me do
>> and blow up the machine (there's also the fork() bomb that will
>> blow the machine up, too.)
>>
>>
>> The few times that a core PHP module was vulnerable to something
>> saw a fix come pretty quickly or at least a way to avoid the
>> issue.
>>
>
> I don't have any personal experience with PHP.  I did, actually, see
> the line you quoted.  However, I have to wonder, with a stat like
> that, if I have to spend copious amounts of time worrying about
> whether I'm following all the conventions, whether or not the
> intrinsic structure of the language leads the programmer to commit
> more errors.  If so, I'd rather have a language that leads me to
> commit fewer errors.

You should spend that time with any program that takes user data.
Failure to sanitize user data is your fault, period.  The user is always
stupid and/or malevolent.  The convention is to clean up your input.
How you want to do that is up to you, but you still have to do it no
matter what language you use.  There's nothing intrinsic about PHP that
leads the programmer to commit an error regarding sanitizing inputs.
It can happen just about anywhere.  I gave an example in C (attempt to
execute a free() command on raw user data, computer go boom).  In PHP
it's dreadfully easy to sanitize, just one function call.  For example,
suppose a user is adding a name to a database.  The SQL might be:

INSERT INTO users SET name='$username';

Problem is that if I failed to sanitize $username, a malicious end user
could submit a username of exactly:

'; DROP TABLE users; --

and poof, my users table is gone.  The single quote closes the string,
the semicolon ends my INSERT, and the -- comments out whatever follows.
There's no fault in the SQL language,
I followed the SQL syntax.  SQL can't know that I'm sending user data
that needs to be sanitized or if I'm sending concatenated commands (the
semicolon).  The fault lies in the fact that I, as the programmer,
failed to strip out or otherwise escape that quotation mark fed in by
the malicious user before sending it on to the database engine.  The
result is a dead database (and fodder for an xkcd comic:
http://xkcd.com/327/).

If you commit this error, that's entirely your fault and you can't hide
behind the language.
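
Escaping is one fix; another is to never let user data touch the SQL
text at all.  Just as a sketch (using SQLite's C API here, which is not
necessarily the database from the example above), a parameterized query
binds the raw input so the engine stores it as plain text instead of
parsing it as SQL:

#include <sqlite3.h>

/* Insert a user-supplied name safely: the '?' placeholder is bound to
   the raw input, so a value like "'; DROP TABLE users; --" is stored
   as an ordinary string instead of being executed. */
int add_user(sqlite3 *db, const char *username)
{
    sqlite3_stmt *stmt;
    int rc = sqlite3_prepare_v2(db,
        "INSERT INTO users (name) VALUES (?)", -1, &stmt, NULL);
    if (rc != SQLITE_OK)
        return rc;

    sqlite3_bind_text(stmt, 1, username, -1, SQLITE_TRANSIENT);
    rc = sqlite3_step(stmt);
    sqlite3_finalize(stmt);
    return (rc == SQLITE_DONE) ? SQLITE_OK : rc;
}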

Same thing could happen (and usually does) with buffer overruns that
result in arbitrary code execution (almost always the latest security
exploit these days).  The programmer failed to check the user's input
and cut it off at the size of the allocated memory (another place where
it pays to understand memory allocation) and *POOF* arbitrary memory
gets overwritten and suddenly the computer starts to execute random
code.
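
The fix is as unglamorous as it sounds: never read more than the buffer
can hold.  A tiny C sketch (the buffer size is arbitrary): fgets()
stops at the size you give it, where gets() or an unbounded strcpy()
would happily run right past the end:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char name[64];

    /* fgets() reads at most sizeof(name) - 1 bytes, so a
       10,000-character "name" can't overrun the buffer. */
    if (fgets(name, sizeof(name), stdin) == NULL)
        return 1;
    name[strcspn(name, "\n")] = '\0';   /* strip the trailing newline */

    printf("Hello, %s\n", name);
    return 0;
}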


Overall it has nothing to do with the programming language making the 
programmer commit errors.  It has everything to do with whether the 
programmer was even paying attention to what they were doing.  If I'm 
inserting SQL, I need to sanitize properly one way.  If I'm taking in a 
regular expression, I need to sanitize another way.  If I'm just writing 
a simple shell script, I still have to sanitize the data so the shell 
script doesn't run off into the weeds and blow the hard drive away.
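
For the shell case, one way to sidestep the problem in C (the
grep/logfile bit is made up): pass the user's data as a separate
argument with execvp() instead of gluing it into a system() command
line, so the shell never gets a chance to interpret it:

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "grep <pattern> logfile" without handing the raw pattern to a
   shell.  system("grep <pattern> logfile") would let a pattern like
   "foo; rm -rf ~" run arbitrary commands; here it reaches grep as a
   single argument. */
int grep_log(const char *pattern)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *argv[] = { "grep", (char *)pattern, "logfile", NULL };
        execvp("grep", argv);
        _exit(127);                    /* exec failed */
    }
    int status;
    waitpid(pid, &status, 0);
    return status;
}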



>
>>> 10) Perl - does not have safe I/O and system calls per
>>> http://en.wikipedia.org/wiki/Comparison_of_programming_languages
>>
>> Not failsafe just like C which means that if you fail to test a
>> return value from a function that failed then it blows up.  Hmm,
>> sounds like PEBKAC/PICNIC to me.  Goes along with garbage
>> collection, it's the programmer's job to make sure that errors are
>> handled.  Critical unforeseen errors (RAM stick chokes and wipes a
>> block of memory) is up to the kernel but anything else is your
>> responsibility.  I/O path didn't open because of a typo?  Your
>> responsibility.  Database connection didn't open because the user's
>> password is wrong?  Your responsibility.
>>
>>
>
> Bottom line, if I have an I/O or system call that fails, and I
> haven't handled the exception, I want it to crash, or produce an
> error.  The last thing I want it to do is to keep running and then
> later produce some mysterious behavior that is nearly untraceable.
> If nothing else, maybe my generic exception handler will say "error
> at line xyz, program terminated" and then the program dies.  If it
> happens during my testing.  I can fix it right there.  If it happens
> in the customer's hands, I'm sure I'll hear about it.  But, at least
> I have a chance to fix it and I know where to look.

I believe that's technically what happens with Perl.  It just dies without
explanation.  I know C will typically experience a segmentation fault 
and just die without a single explanation because I've done it countless 
times (wrote to null instead of a string pointer, poof, unceremonious 
segfault).
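
For what it's worth, that failure mode is easy to reproduce.  A
deliberately broken sketch: the compiler will typically accept this
(maybe with a warning if you ask for one), and on a typical Linux box
it dies with nothing but "Segmentation fault" from the shell:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char *name = NULL;        /* never pointed at any actual storage */
    strcpy(name, "Ron");      /* writes through NULL: instant segfault */
    printf("%s\n", name);     /* never reached */
    return 0;
}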

> It gets back to what I said at the top.  I want the language to be
> helping me solve my application problem, not necessitate that I'm
> continually worrying about falling into booby traps.

The language can only help you so far.  It's up to you to handle a lot 
of things that the language simply cannot know a priori.  It can make
some reasonable guesses, and some languages even let you provide hints,
but it can do nothing more.


