Wednesday, December 21, 2011

Case sensitivity

Eiffel was born as a case-insensitive language, so it would allow you to write this code:
cLaSS nice_AnD_READable
INHERIT foo
FEAture BOO is dO pRiNt("Boo!") end
eND
And to use it elsewhere like this:
local MY_obj: NICE_and_readABLE
do
CREate my_obj
my_OBj.boo
....
Nice, isn't it?
Now such code will quite obviously give a headache to people reading it, and it will make life harder for all tools. The latter "would not count" if case-insensitivity made the code easier to read, but it is quite the opposite: making the language case-insensitive makes it less readable.
The idea was, if I recall OOSC correctly, to turn casual case errors into warnings, allowing the developer to concentrate on the logic instead of the case of the identifiers.
Actually style counts, so the definition of the language rightfully comes with a very precise style guide:

  1. all class names shall be uppercase
  2. all constants shall be lowercase with the first letter uppercase
  3. all other identifiers and language reserved words shall be written lowercase
GNU SmartEiffel "only" makes the style rules mandatory, so that the code above will read:
class NICE_AND_READABLE
inherit FOO
feature boo is do print("Boo!") end
end
...
local my_obj: NICE_AND_READABLE
do
  create my_obj
  my_obj.boo
...
This may look quite strict, and at first it feels so; my experience tells me that this discipline is actually good both for the developer writing the code and for those reading it, as it allows for smoother reading.
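
The three style rules above can be sketched as a small checker. This is only an illustration in Python, not how SmartEiffel implements the check: the patterns, the `follows_style` function and the kind names "class", "constant" and "other" are all assumptions made for the example.

```python
import re

# Hypothetical patterns, one per style rule; not SmartEiffel's actual code.
PATTERNS = {
    "class": re.compile(r"[A-Z][A-Z0-9_]*"),     # rule 1: all uppercase
    "constant": re.compile(r"[A-Z][a-z0-9_]*"),  # rule 2: first letter uppercase, rest lowercase
    "other": re.compile(r"[a-z][a-z0-9_]*"),     # rule 3: all lowercase
}


def follows_style(identifier: str, kind: str) -> bool:
    """Return True if `identifier` is written in the case required for `kind`."""
    return PATTERNS[kind].fullmatch(identifier) is not None


if __name__ == "__main__":
    for name, kind in [
        ("NICE_AND_READABLE", "class"),
        ("nice_AnD_READable", "class"),
        ("my_obj", "other"),
        ("MY_obj", "other"),
    ]:
        print(name, kind, follows_style(name, kind))
```

A compiler that only enforces the style would reject `nice_AnD_READable` as a class name instead of silently treating it as `NICE_AND_READABLE`.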

16 comments:

  1. OK, so I am about to make a subjective remark. GNU Eiffel is doing the right thing with case.

    This version is "Eiffel-esque" in my take on the spirit of the Eiffel language, which is to say, strict in a good way, because the rules are right to the maximum degree on all scales, NOT just the big picture with sloppy mismatched details.

    Architecture is important! Not just "strategic" matters.

  2. There is another rule, one that is not enforced but is used throughout the library: formal generics are terminated by an underscore.

  3. My personal rule is to make the formal generic read like a generic name in English; so instead of LINKED_LIST[E_] I tend to write LINKED_LIST[AN_ITEM] .
    I know my preference is not "canonical" but it makes the rest of the class easier to read, at least IMHO.

  4. I think it's less a matter of personal choice than of consistent rules. It's a lot easier to read code if it's always formatted the same way.

    In that matter, SmartEiffel rules are quite precise. I should write a post about it.

    Replies
    1. Oh Cyril, I am a sinner! I left thy Holy Church of Emacs to embrace the vile cult of Vim!
      Now I cannot format my code in a holy way but I have to use ancient incantations that twist the code and make it look heretical....

      Leaving aside funny jokes about Emacs being a religion, I shall debug the Vim Eiffel formatter, as it constantly formats my code wrong...

    2. I absolve you if at least you use se pretty ;-)

  5. A quick question ...
    if I have a function redefined into a constant ...

    how should I write the code for the caller .. with a first uppercase character or a lowercase?

    Replies
    1. A constant really means "something that is originally defined as constant".
      Here the original definition is for something "that may/will be computed", so my guess is that this case is like:

      class FOO feature bar: INTEGER is do .... end

      class BAR inherit FOO feature bar: INTEGER is 12 end

      You may write somewhere:
      local f: FOO; b: BAR
      do
      create b
      f := b
      print (f.bar.out) -- will print 12
      end

      In that case you may not know it is sometimes constant.
      *BUT* we may also conceive this counterexample:

      class FOO

      feature {ANY}
      g: INTEGER is 42
      f: INTEGER is
      do
      end

      end -- class FOO

      class BAR

      inherit
      FOO
      redefine f, g
      end

      feature {ANY}
      f: INTEGER is 12
      g: INTEGER is
      do
      end

      end

      So things are getting murky..... I don't have a defined opinion now... 8-/

    2. Actually, I would like the compiler to force the user to use the correct case, or to raise a warning.
      But in practice, I am not sure this is really doable.

      The problem with constants is one example: should I do a renaming with Liberty Eiffel, such as the following?

      class FOO
      feature
      bar: INTEGER is
      do
      Result := 123
      end
      end

      class FOOBAR
      inherit
      FOO
      rename
      bar as Bar
      redefine
      Bar
      end
      feature
      Bar: INTEGER is 123
      end

      And then the caller .. should I use
      foo: FOO
      foobar: FOOBAR
      ...
      i := foo.bar
      i := foobar.Bar

      So this means I have to know that FOOBAR implements `Bar' as a constant?

      In fact, I guess this particular case comes from a "not so good" style rule for constants or once features ... I guess we should not recommend an uppercase first letter, especially if the constant feature has a one-letter name ...
      This is just a nightmare when you want to do refactoring, and really .. the caller should not care whether a feature is implemented as a constant or a function...

      Now, if the compiler enforces this, or reports a warning when a class name has lowercase characters or a feature name has uppercase characters .. that's OK for me, this is acceptable.

      Even if sometimes, when wrapping a C library, it might be convenient to have for instance c_FOOBAR to wrap the value of the FOOBAR macro. For me, this can help, but once again I don't have a strong position on that.

      So if a case-sensitive language .. means the compiler checks that the code follows strong style rules, that's OK ...

      BUT if this means the compiler understands
      foobar: FOOBAR
      fooBar: FOOBAR

      as two different entities ...

      or even class FooBar and class FOOBAR as two different classes, then I think this is really bad.

      So for me Liberty Eiffel is not a case-sensitive language, it is a language that requires the code to follow style for feature and class names.

      Do you agree with my analysis?

    3. Yes, I do agree with your analysis: Liberty Eiffel is a language that requires the code to follow style for feature and class names.
      I've been pondering it for a while and I've come to the conclusion that the style guide is a good one but we actually require the language to be case-sensitive. See Frank's comment; he is entirely right.

  6. I forgot to sign my previous "anonymous" comment
    -- Jocelyn

  7. I would like to suggest that user-defined variables should be enforced as being case-sensitive. Otherwise you will exclude a very large number of possible users: scientific and engineering programmers.

    If you look at http://cheminfo.chemi.muni.cz/ianua/epr/tab/Scientific%20Abbreviations%20and%20Symbols.pdf, you will see the reason immediately.

    In much of this work, Roman, Greek and other alphabetic characters are used (indeed required) to make manipulation of expressions manageable.

    I believe that Eiffel provides a very powerful tool for physical and other simulations.

    Requiring programs to be case sensitive will include these users.

    Replies
    1. Every scientific field uses a lot of symbols: civil engineering, for example in the Eurocodes, uses a lot of symbols like this.
      So we must go beyond case-sensitivity and ASCII, a standard which is almost half a century old.
      I think we shall, at least, write source code in Unicode and lay down a style guide for its usage. More on this soon...

    2. I fully concur with your comment but felt that such a request at this stage might be less than helpful.

    3. I agree that lacking the possibility to express symbols in their original and common form can be a pain.

      However, I think the risk of having both myAttribname and myAttribName, with someone (the developer of the class, or its users) using one instead of the other, is greater.

      I am not sure the gain of having all symbols available in the language compensates for the high risk of bugs (as long as humans do the coding...)

      So far, Unicode is supported for operators (in ECMA Eiffel), which is much better than before for math expressions.

      Now I guess we can argue forever. I already sometimes see mistakes due to using foo_bar in place of foobar, so I cannot even imagine what happens if the language distinguishes FooBar from foobar, from fooBar, from FOobaR, from fOobAr or foObAr.
      I can understand "a" need for that; I just hope that no one will actually use all those variants of foobar in the same scope.

      What is more critical for me is reserved words: not being able to name an attribute "class" or "feature" is annoying, but I can still live with that. So I guess scientists can also accept writing Epsilon instead of ε (I could be wrong about that).

      It would be interesting to know about any programming language that allows such advanced use of Unicode in its source text.

      Any reference to such a programming language?

    4. I don't know any fitting programming language (I'm a dusty'n'dirty civil engineer after all), yet all our mumbling about the case sensitivity of Eiffel somehow reminds me of my AmigaOS filesystem, which is case-insensitive with case-preservation: almost the same rules as original Eiffel.
      Case-insensitive with case-preservation (with warnings, as you proposed) would rob us of the possibility to write E for energy and e for the base of natural logarithms, barring us from writing E := K*e^(x-y)
      We may conceive a hypothetical math-saving rule allowing the usage of these symbols when they match the [[:upper:]](_[[:alnum:]]+)* regex (E_foo or E_my1 or G_m but not GM_asd)
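
A minimal sketch of that hypothetical rule, written in Python for illustration. It reads the pattern as one uppercase letter followed by optional underscore-separated alphanumeric suffixes, which is an assumption chosen so that the listed examples behave as described; the names `MATH_SYMBOL` and `is_math_symbol` are invented for the example.

```python
import re

# Assumed reading of the rule: a single uppercase letter, then zero or more
# "_suffix" groups made of ASCII letters and digits.
MATH_SYMBOL = re.compile(r"[A-Z](_[A-Za-z0-9]+)*")


def is_math_symbol(name: str) -> bool:
    """Return True if the whole name matches the hypothetical math-symbol rule."""
    return MATH_SYMBOL.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ("E", "E_foo", "E_my1", "G_m", "GM_asd"):
        print(name, is_math_symbol(name))
```

Under this reading a bare E is accepted next to a lowercase e, while GM_asd is rejected because two uppercase letters precede the underscore.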
