Tuesday, December 20, 2011

diff "Eiffel: the language" "GNU Eiffel"

We have been asked for some information about the differences between the language originally described in "Eiffel: the language" (1992, by Bertrand Meyer) and the one accepted by the current GNU/Smart/Liberty Eiffel compiler.
Such a request is not only reasonable but deserves an answer: there have been several additions and quite a few changes to the language.
The main changes are:
  1. Creation procedures
  2. agents
  3. conformance of agents 
  4. anonymous, in-lined agents, e.g. a_command(12, "Foo", agent (x: INTEGER): REAL is do Result := x.to_real ^ 2 end)
  5. insert, also known as "non-conforming inheritance"
  6. case sensitivity
  7. inspect allows integer intervals...
  8. FLOAT is replaced with REAL
  9. There is no NONE (pun intended :-)
  10. others I don't recall right now.
To keep things tidy I am writing a separate post for each point.

Monday, October 3, 2011

Hello World… again

The first version of the Liberty interpreter was stillborn. It took a year to utter Hello World… and never said anything more.

The second version, based on SmartEiffel, is alive and kicking, after 2 weeks gestating.
The "runner" is screaming: Hello World!

Monday, September 26, 2011

Overflow and correctness

Eiffel is a programming language that strives for correctness.
In C, by contrast, speed is usually regarded as more important than correctness (paraphrasing a comment by Eric Sosman).
Most Eiffel compilers use C as an intermediate language (Liberty/SmartEiffel, ISE), so a programming language striving for correctness relies on one that trades correctness for speed.
I've been tinkering with this issue.
I don't think we can blame C, as this is an issue that has no real "solution": a great deal of effort has been poured into something "as simple" as correctness in integer calculations.
Sadly there is no magic wand for this issue.
One answer may be saturation arithmetic. Digging into my fading knowledge of assembler, mainly from my days on the 6502, I was wondering why no one seems to use the overflow flag present in almost all CPUs: it seems there is no portable way to access it. Oh, it would be so nice to write in INTEGER_GENERAL:
«infix "+" (another: like Current): like Current is external "built_in" ensure has_overflowed=False end»
I naively thought that an arithmetic overflow would trigger a SIGFPE signal on POSIX systems. It seems I was wrong:
class OVERFLOW
    -- Test integer overflow
insert PLATFORM
creation make
feature make is
        local i,j: INTEGER
        do
            i := Maximum_integer
            j := i+1 -- SIGFPE expected
            ("i:=#(1), j=i+1=#(2)%N"# (&i) # (&j) ).print_on(std_output)
        end
    end
When compiled with "compile overflow.e", Liberty Eiffel correctly says:
*** Error at Run Time ***: Require Assertion Violated.
*** Error at Run Time ***: no_overflow
3 frames in current stack.
=====  Bottom of run-time stack  =====
<system root>
Current = OVERFLOW#0x925b038
line 5 column 9 file /media/Liberty/tybor-liberty/work/tybor-sandbox/overflow.e
======================================
make OVERFLOW
Current = OVERFLOW#0x925b038
i = 2147483647
j = 0
line 9 column 4 file /media/Liberty/tybor-liberty/work/tybor-sandbox/overflow.e
======================================
infix + (infix + INTEGER_32)
Current = 2147483647
other = 1
Result = 0
line 21 column 80 file /home/paolo/current-liberty/src/lib/numeric/integral.e
=====   Top of run-time stack    =====
*** Error at Run Time ***: Require Assertion Violated.
*** Error at Run Time ***: no_overflow
But when we compile it with "compile --boost overflow.e" it happily says: "i:=2147483647, j=i+1=-2147483648", which is obviously wrong. You have to compile it with "compile --boost overflow.e --cc gcc -ftrapv" to receive the laconic answer:
Received signal 6.
Eiffel program crash at run time.
No trace when using option "-boost"
This is not what I expected, since signal 6 is SIGABRT on my system, not SIGFPE.
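
Since the overflow flag cannot be read portably, the usual workaround in the generated C would be to test the operands before adding. What follows is only a minimal sketch of that idea, not what Liberty or SmartEiffel actually emit; the names checked_add and saturating_add are made up for illustration. Signed overflow is undefined behaviour in C, so the test has to happen before the addition, never after it.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *result and returns true when the
   sum fits in an int; returns false and leaves *result untouched when the
   addition would overflow. The check never performs the overflowing sum. */
bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *result = a + b;
    return true;
}

/* Hypothetical saturating variant: clamps to INT_MAX or INT_MIN
   instead of wrapping around. */
int saturating_add(int a, int b)
{
    int sum;
    if (checked_add(a, b, &sum)) {
        return sum;
    }
    return (b > 0) ? INT_MAX : INT_MIN;
}
```

With these helpers the failing case above, Maximum_integer + 1, is reported by checked_add instead of silently wrapping, and saturating_add simply stays at INT_MAX. Recent GCC and Clang also provide __builtin_add_overflow, which can use the CPU flag where available, but it is a compiler extension rather than standard C.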


infix "#": Eiffel's printf

It's really nice to be able to write in Liberty/SmartEiffel:

("i:=#(1), j=i+1=#(2)%N"# (&i) # (&j) ).print_on(std_output)
("obviously foo := #(2), bar := #(1)%N" # &42 # &17).print_on(std_output)

which may look cryptic to most Eiffel programmers, but may look familiar when translated into C:
printf("i:=%d, j=i+1=%d\n", i, j);
printf("obviously foo := %d, bar := %d\n", 17, 42);
Although it looks a little more convoluted, it actually has quite a few advantages:
  • like QString::arg of Qt fame it allows positional arguments,
  • it does not rely on variable argument function calls,
  • it does not allocate an unnecessary array to hold the arguments; each call to # actually returns a ROPE, which is an ABSTRACT_STRING holding references to two substrings; the prefix "&" operator returns a LAZY_STRING (praise to Adrian for the good idea) which will not be converted to a string until it gets iterated over, usually when printing or copying it,
  • it allows writing things like:
    local s,t,u: STRING
    do
    s := "Eiffel"; t := "beautiful"
    u := s | " is a " | t | " language"
    assert (u.is_equal("Eiffel is a beautiful language"))
    t.prepend("really ")
    assert (u.is_equal("Eiffel is a really beautiful language"))
    -- which may also be obtained with
    u := "#(1) is a #(2) language" # s # t
    end
  • it does retain type-safety (AFAIK)
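
For readers more at home in C, here is a rough sketch of what the positional "#(n)" placeholders do. It only mimics the substitution rule: unlike the Eiffel operator it takes an array of already converted strings and copies them eagerly, so it has none of the ROPE/LAZY_STRING benefits listed above. The function name expand_positional and its signature are invented for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies tmpl into out, replacing every "#(n)"
   (n is 1-based) with args[n - 1]. Returns 0 on success, -1 when an index
   is out of range or the result does not fit in out_size bytes; on
   failure the contents of out are unspecified. */
int expand_positional(char *out, size_t out_size, const char *tmpl,
                      const char *const args[], size_t nargs)
{
    size_t used = 0;
    if (out_size == 0) {
        return -1;
    }
    while (*tmpl != '\0') {
        const char *piece = tmpl;   /* default: copy one literal character */
        size_t piece_len = 1;
        if (tmpl[0] == '#' && tmpl[1] == '(') {
            char *end;
            unsigned long index = strtoul(tmpl + 2, &end, 10);
            if (end != tmpl + 2 && *end == ')') {
                if (index < 1 || index > nargs) {
                    return -1;
                }
                piece = args[index - 1];
                piece_len = strlen(piece);
                tmpl = end + 1;     /* skip the whole placeholder */
            } else {
                tmpl++;             /* not a placeholder: keep the '#' */
            }
        } else {
            tmpl++;
        }
        if (used + piece_len >= out_size) {
            return -1;              /* no room for the piece plus the NUL */
        }
        memcpy(out + used, piece, piece_len);
        used += piece_len;
    }
    out[used] = '\0';
    return 0;
}
```

Called with the template "foo := #(2), bar := #(1)" and the arguments {"42", "17"}, it fills the buffer with "foo := 17, bar := 42", the same reordering as the second Eiffel line above.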

Sunday, July 17, 2011

SmartEiffel as Liberty core

OK guys, I have permission from Dominique, and I guess it is better anyway.

SmartEiffel is back in the trenches. It is now completely part of LibertyEiffel.

Here is the plan:
  1. really integrate SmartEiffel into Liberty (done)
  2. separate the C backend from the AST (fix and use the acyclic visitor)
  3. remove or rewrite the Java backend (using the acyclic visitor)
  4. rewrite the interpreter as a SmartEiffel backend (maybe using the acyclic visitor)

Extra points to fix:
  • sedb — it lost some of its power in the latest releases (some break points disappeared)
  • Liberty core — will be removed; that code is dead.
  • some bugs in inline agents parsing (SmartEiffel segfault)
  • some bugs in boost mode (invalid C code)

The Liberty libraries (both native and wrappers), on the other hand, are here to stay. They deserve being enhanced.

Spread the news :-)

Sunday, December 26, 2010

Coroutines

The linked commit introduces coroutines in the Eiffel world.

The concept is implemented in the Liberty repository, but that should work in a pristine SmartEiffel too.

Enjoy!

Friday, September 17, 2010

I was told Eiffel wasn't like C++...

Many Eiffel enthusiasts (I count myself among them) often write about the many advantages of our beloved language over C++.
One of those aspects that I was sure Eiffel handled better than C++ was templates, known as generics in Eiffel.
C++ templates have often been accused of leading to code bloat and large executables, since the compiler generates specialized code for each type used with a template.
As an example when you use QList as QList<int>, QList<MyClass*>, QList<AnotherClass*> you will end up having three specialized copies of each and every function defined in QList, a class that has more than seventy functions. 
This explains why C++ executables are often considered "fat".
I don't know why, but I had always been convinced that generics in Eiffel did not show this pattern.
SmartEiffel proved me wrong.
I first suspected it when I compiled my own wrappers-generator and got a 5.2 MB executable from roughly 4450 lines of code, as reported by the "wc" command.
Hey! This is more than 1200 bytes of binary code for each and every Eiffel line, empty lines and comments included!
No, my coding style can't be that awesome, and this tool is nothing special.
It simply should not be this big.
So I dived into the generated source code.
At the beginning of wrappers_generator.id, which contains a description of the compiled classes from the point of view of the C and Eiffel compilers, I can read something like:

478 "HASHED_DICTIONARY[XML_DTD_ATTRIBUTE,UNICODE_STRING]" l
#
555 "HASHED_DICTIONARY[COMPOSED_NODE,UNICODE_STRING]" l
#
404 "HASHED_DICTIONARY[WEAK_REFERENCE[ANY_LINKED_LIST_NODE],STRING]" l
#
467 "HASHED_DICTIONARY[RECYCLING_POOL[PROTOCOL],STRING]" l
#
365 "HASHED_DICTIONARY[POINTER,STRING]" l


So I looked into wrappers_generator_T478.c, wrappers_generator_T555.c, wrappers_generator_T467.c and wrappers_generator_T365.c, which contain the C translations of those classes.
Please note that, with the exception of POINTER, each and every class referred to in those incarnations of HASHED_DICTIONARY is a reference class that gets converted into a type-less C pointer ("void*" for the C-fond).
Then I looked into the function called Txxxcreate_with_capacity... I wasn't too surprised to find a couple of pages of machine-generated C code that is exactly the same in each file, since all those Txxx pointers are actually void pointers.
So in the end SmartEiffel actually implements generics in the very same (bloated?) way as C++ implements templates.
Now I guess that people smarter than me have done a lot of research on the topic, but let me wonder whether Liberty may avoid this.
And I know it is avoidable for reference classes.
Think a little about GLib collections and how they implement generic object-oriented containers in C. They do not have a compiler to generate templates or generics for them, so they end up with exactly one "binary" implementation of "replace" (g_hash_table_replace) for every kind of hash table.
That is how I would like to implement generic classes.
The only tricky part is that when you invoke feature "foo" of the parametric type (ITEM in class LIST[ITEM]) you need to query the runtime for the address of ITEM_foo, so you need a full-fledged object-oriented type system.
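
To make the idea a bit more concrete, here is a minimal sketch in C of a single list implementation shared by all reference item types, with "foo" looked up through a run-time descriptor. Everything in it (item_type, ref_list, the fixed capacity, the two sample item types) is hypothetical and invented for this note; it is neither GLib's API nor anything Liberty generates.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical run-time descriptor of an item type: the table the
   container consults to find the address of "ITEM_foo". */
typedef struct {
    const char *type_name;
    int (*foo)(const void *item);
} item_type;

/* One list implementation for every reference item type:
   items are stored as untyped pointers, GLib style. */
#define REF_LIST_CAPACITY 8
typedef struct {
    const item_type *type;
    const void *items[REF_LIST_CAPACITY];
    size_t count;
} ref_list;

void ref_list_init(ref_list *list, const item_type *type)
{
    list->type = type;
    list->count = 0;
}

/* Returns 0 on success, -1 when the fixed-size list is full. */
int ref_list_add(ref_list *list, const void *item)
{
    if (list->count == REF_LIST_CAPACITY) {
        return -1;
    }
    list->items[list->count++] = item;
    return 0;
}

/* Calls "foo" on every item through the descriptor and sums the results:
   one indirect call per item instead of one copy of the code per type. */
int ref_list_sum_foo(const ref_list *list)
{
    int sum = 0;
    for (size_t i = 0; i < list->count; i++) {
        sum += list->type->foo(list->items[i]);
    }
    return sum;
}

/* Two unrelated item types, each providing its own "foo". */
typedef struct { int x, y; } point;
int point_foo(const void *item) { const point *p = item; return p->x + p->y; }
int text_foo(const void *item) { return (int)strlen((const char *)item); }

const item_type POINT_TYPE = { "POINT", point_foo };
const item_type TEXT_TYPE = { "STRING", text_foo };
```

A list of points and a list of strings both go through the same ref_list_add and ref_list_sum_foo machine code; only the descriptor differs. The price is the indirect call in the loop.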
I'm thinking about a possible implementation of a multiple-inheritance runtime, and I'm investigating a variant of GObject interfaces. In fact, if we turn each and every attribute query into an actual function call, multiple inheritance looks somewhat like using multiple interfaces. This would have a deep impact on performance, since every access to any structure field would be turned into a deferred/virtual/indirect function call; I suspect that the link-time function in-lining of LLVM may be the silver bullet. Otherwise it won't be feasible, unless we want to "degrade" Liberty to an interpreted language.
A little ending note: when dealing with "expanded" classes, or with objects passed by value, "template instantiation" is the only feasible way to go, so C++ was right.
Luckily (Smart)Eiffel prescribes that reference classes are always passed by reference and expanded ones always by value. This greatly simplifies the compiler's work of translating source code but, most importantly, greatly simplifies things for our neurons.
Please feel free to correct any wrong deduction in this little note.