This article focuses on programming in C# with Mono. It is a compilation of extremely useful tips and workarounds, aimed especially at people used to lower-level programming, such as C programmers. Since the Mono documentation is still far from finished, and since I found from my own experience that it is still very hard to find information and help with C# issues when using Mono, I hope these tips will save you some of the trouble they cost me.
1st Tip: Another good place to find information.
For most people out there, this will save a few bucks on buying books about C# or a few trips to the public library. For others, it will just save a few hours of Googling. Many of you will find yourselves in the same situation I did: I only had an IDE (Kate, in my case), Mono and Monodoc. Well, I had Google too. Monodoc is not bad, but it is extremely incomplete. So, I found myself peeking at the MSDN site, where they have all the class docs. Here is a direct link for the class docs.
Note: Before you attempt to use any class, method or property, you should check Monodoc first. Why? Because even though Monodoc does not document everything, it has pointers to every implemented class, method and property. Even when it does not explain one of them, it will tell you whether it is implemented, not implemented or incomplete. If you come to use XML in your Mono program, you will notice in Monodoc that a few methods and properties in the most important XML classes are not implemented. I’m talking about XmlDocument.GetElementById, for example. This method is marked unfinished, and it actually is. I was not paying much attention to Monodoc when I tried to use it, and I wondered for hours why my program kept crashing with a NullReferenceException in the call to that method. Then, when “re-browsing” Monodoc, I noticed the Unfinished tag. I tried to find a workaround for this. Well, I couldn’t. At least, I didn’t try hard. So beware: there are a lot of unfinished methods around.
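If you hit the same wall, one approximation you could try is to fall back on an XPath query. This is a sketch only: it assumes your elements carry a literal id attribute (the real GetElementById only honours attributes declared as type ID in the document’s DTD), and it assumes XPath support is complete enough in your Mono version, so check Monodoc first.
<pre>
// assumes: using System; using System.Xml;
XmlDocument doc = new XmlDocument();
doc.Load("somefile.xml"); // hypothetical input file
// approximate GetElementById with an XPath query on a literal "id" attribute
XmlElement element = (XmlElement) doc.SelectSingleNode("//*[@id='target']");
if(element != null) Console.WriteLine(element.Name);
</pre>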
2nd Tip: Before starting to use Mono.
Before starting to use Mono, there is a special tweak you can prepare if you use Linux and have superuser permissions on the machine. I always found it annoying having to type mono programname.exe
every time I want to run a .Net program. So, I used a trick that takes advantage of a Linux kernel feature (if you’re used to compiling the kernel, it’s the Misc Binary Format support, binfmt_misc). Remember that you MUST have support for it in the kernel (in general, distributions ship with Misc Binary Format support activated, but if not, you will have to recompile the kernel).
This trick allows me to run a program just by typing ./programname.exe
(if you’re using BASH, you won’t have to type much :)). As root, edit your /etc/fstab and add the following line:
none /proc/sys/fs/binfmt_misc binfmt_misc defaults 0 0
With this line, the binfmt_misc virtual filesystem will be mounted in /proc, and you will be able to instruct the kernel to interpret different binary file formats. I believe Java programmers have been using this, and some distributions used to come preconfigured to support this for use with Wine, to allow a user to easily run Win32 binaries.
Now that you’ve edited fstab, you need to mount the filesystem. Just run this in your shell:
mount /proc/sys/fs/binfmt_misc
Don’t worry, you will not need to run this every time your computer boots, because it is already in your fstab. Next time your computer boots it will be mounted “automagically”.
Now, you have to tell the kernel that there is a special kind of binary it has to be aware of, and that there is a proper interpreter for it. You can do that by executing the following line in your shell, as root, of course:
echo ':CLR:M::MZ::/opt/gnome/bin/mono:' > /proc/sys/fs/binfmt_misc/register
Notice that you have an MZ in that line: MZ is the identifier for executables in Microsoft OSs. The colon-separated fields in the register string are name, type, offset, magic, mask, interpreter and flags; here we match on the magic MZ, and /opt/gnome/bin/mono is simply where I installed Mono, so adjust it to wherever your mono binary lives. That MZ stands for Mark Zbikowski, if I’m not mistaken, who was responsible for the creation of the original EXE format. For those of you using this trick with Wine, some of you may have noticed that it can interfere with Wine and cause conflicts. There are ways of getting around this, although I’m not discussing them here.
Once you’ve done this, try running a .Net program in your shell just by calling its name: ./programname.exe. Didn’t it work nicely? If you do not want to have to type the register command every time your computer reboots, you should add it to your /etc/rc.d/rc.local (the actual file and path may vary across distributions).
3rd Tip: Executing external programs (shelling).
One of the things I needed to do in a small program I made was to execute an external command. I searched all the Monodoc info, and even went to MSDN, but had little luck understanding it. I finally found a place where it was explained. Use the following code:
<pre>
System.Diagnostics.Process proc = new System.Diagnostics.Process(); // (1)
proc.EnableRaisingEvents = false; // (2)
proc.StartInfo.FileName = "ls"; // (3)
proc.StartInfo.Arguments = "-lh"; // (4)
proc.Start(); // (5)
proc.WaitForExit(); // (6)
if(proc.ExitCode != 0) Console.WriteLine("Error"); // (7)
</pre>
The code is very easy to understand. You create an object of the Process class from System.Diagnostics (1), you let it know you don’t want any events raised (2), and you give it the command (3) and the arguments (4). Then you tell it to start executing the program (5) and to wait for its completion (6). When it completes, you can test the exit code for errors (7). How different can it be from system("ls -lh");? 😉
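A related trick: if you also need the program’s output, not just its exit code, the Process class can redirect it for you. Here is a minimal sketch using the standard .NET redirection properties; I would expect them to work under Mono, but check Monodoc first, as per the 1st tip.
<pre>
System.Diagnostics.Process proc = new System.Diagnostics.Process();
proc.StartInfo.FileName = "ls";
proc.StartInfo.Arguments = "-lh";
// redirection only works when the shell is bypassed
proc.StartInfo.UseShellExecute = false;
proc.StartInfo.RedirectStandardOutput = true;
proc.Start();
// read everything the program printed to stdout
string output = proc.StandardOutput.ReadToEnd();
proc.WaitForExit();
Console.WriteLine(output);
</pre>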
4th Tip: Using hashing classes
If you need to hash data, for example to digest a password with MD5 before saving it, you will need the HashAlgorithm class. It is very simple to use. First you create an instance of the class for the hash algorithm you need (it can be one of several, including MD5, SHA1 and others):
HashAlgorithm digest = HashAlgorithm.Create("MD5");
The preceding line creates an instance of HashAlgorithm for MD5 “digestion”. If you need to use any other algorithm, just replace “MD5” with the correct name. I’m not sure whether it is imperative for the name to be uppercase.
Then you can digest the data. ComputeHash expects a byte[], so assume that buffer is a byte[].
byte [] byteHash = digest.ComputeHash(buffer);
The byteHash variable will then hold the result of the hashing. It is also very probable that you want to apply this to a String object. In that case you will need to convert from a String to a byte[]. There are three approaches you can use: 1) use the ASCIIEncoding encoder; 2) use the UTF encoders; 3) create your own encoder that literally converts the string to a byte[] without performing any conversion.
The ASCIIEncoding and UTF encoders are already part of the class library, so you do not have to create them. With ASCIIEncoding it is very easy. Assuming stringToHash is the string you want to hash, do it like this:
<pre>
// create a new encoder instance
ASCIIEncoding encoder = new ASCIIEncoding();
// allocate the byte[] that will receive the encoded string
byte [] buffer = new byte[encoder.GetByteCount(stringToHash)];
// getting the byte[] array needed for the hashing process
encoder.GetBytes(stringToHash, 0, stringToHash.Length, buffer, 0);
</pre>
You can then use buffer directly with ComputeHash.
But if you use ASCIIEncoding, you cannot use international characters, because that encoder will replace everything above character code 127 with a question mark. On the other hand, you could do it with the UTF encoders like this (example for UTF-8):
<pre>
// create a new encoder instance
UTF8Encoding encoder = new UTF8Encoding();
// getting the byte[] array needed for the hashing process
byte [] buffer = encoder.GetBytes(stringToHash);
</pre>
Notice that with UTF8Encoding you didn’t have to pass all those parameters you had to when you used ASCIIEncoding.
The UTF encoders will work, but if you plan on using them, for example, for password challenges when authenticating a connection, you cannot be sure of the encoding used on the other side. It could be anything. In my case it would probably be CP860, ISO-8859-1 or ISO-8859-15. Since the UTF encoders would replace every char above 127 with its UTF-8 encoding (in the UTF8 encoder case), or every char of the charset with UTF-16 (in the UTF16 encoder case), there would be problems.
So, you’re left with creating your own encoder. I created one which I called NullEncoding, because it doesn’t encode: it just converts directly between a string and a byte[], and vice-versa:
<pre>
public class NullEncoding
{
    // will give you the bytes from a string
    public static byte[] GetBytes(string s)
    {
        byte [] result = new byte[s.Length];
        for(int i = 0; i < s.Length; ++i) result[i] = Convert.ToByte(s[i]);
        return result;
    }
    // will give you a string from the bytes
    public static string GetString(byte [] b)
    {
        string result = String.Empty;
        for(int i = 0; i < b.Length; ++i) result += Convert.ToChar(b[i]);
        return result;
    }
}
</pre>
It is a little bit different from the other encodings, but it gets the job done. If you paid attention, you will have noticed the static in the declaration of the methods. This way we do not need to create an instance of the class to encode our stuff. So we can do just this:
<pre>
HashAlgorithm digest = HashAlgorithm.Create("MD5");
byte [] byteHash = digest.ComputeHash(NullEncoding.GetBytes(stringToHash));
</pre>
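If you then need the digest as a hexadecimal string, to store the hashed password in a text field for example, one simple way is BitConverter. A small sketch:
<pre>
// format the raw hash bytes as lowercase hex (32 hex chars for MD5)
string hexHash = BitConverter.ToString(byteHash).Replace("-", "").ToLower();
Console.WriteLine(hexHash);
</pre>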
You may also have noticed the Convert.ToByte and Convert.ToChar. If you’re a newbie, check Monodoc, as an exercise.
5th Tip: Gtk# – The missing widgets
I started programming Gtk with C#, using Gtk#. When using a Gtk.Notebook widget, I wanted to place a container widget (an HBox, in this case) holding an icon and a text label as the label for the sheet tabs. But when I ran my program, which showed the window with the ShowAll() method, I noticed that the widgets in the notebook tab would not get shown. I tried using just a text label as the tab label and, strangely enough, the label alone would get shown. As I had no experience with Gtk until then, I cannot tell whether this is a problem in Gtk itself or in the Gtk# classes. Apparently the notebook was only calling Show(), and not ShowAll(), on the widgets in the tab labels. This resulted in only the base widget being shown: the HBox was being shown, but not the widgets inside it. So remember to have your program call ShowAll explicitly on the container widget inside the notebook tabs if you want everything to be visible.
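In code, the workaround looks roughly like this. It is a sketch: notebook and page stand for your existing Gtk.Notebook and page widget, and the icon path is made up.
<pre>
// composite tab label: icon plus text inside an HBox
HBox tabLabel = new HBox();
tabLabel.PackStart(new Gtk.Image("someicon.png"), false, false, 0);
tabLabel.PackStart(new Gtk.Label("My tab"), false, false, 0);
// the notebook only calls Show() on the tab widget itself,
// so make the HBox children visible ourselves
tabLabel.ShowAll();
notebook.AppendPage(page, tabLabel);
</pre>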
6th Tip: Gtk# – One thing you cannot do in C
One thing you cannot do in C is derive a window class to create your window. That is because C… has no classes. But in C# you can. In fact, you are advised to do it.
When creating a window, make a class that inherits from Gtk.Window. Create your own constructors, and make them set up the window’s appearance for you, instead of having that done in Main or somewhere else. Don’t forget to call base() in the constructors to make sure everything is initialized correctly.
This works, and it is extremely useful, because it allows you to keep your code better organized and to isolate components so that you can reuse them (hey, that’s part of why OOP was born :D).
Here’s a short example:
<pre>
public class MyForm: Gtk.Window
{
    // replace "someimage.png" with an image path of your choice
    public Gtk.Image image = new Gtk.Image("someimage.png");
    public Gtk.Label text = new Gtk.Label("Welcome to my form");
    HBox hboxBase = new HBox();

    // constructor
    public MyForm(): base("This is my form")
    {
        // setting up the form's appearance
        this.DefaultWidth = 600;
        this.DefaultHeight = 300;
        this.AllowGrow = false;
        this.AllowShrink = false;
        this.Modal = true;
        this.SkipTaskbarHint = true;
        this.Add(hboxBase);
        hboxBase.PackStart(image, false, false, 0);
        hboxBase.PackStart(text, false, false, 0);
    }
}
</pre>
If you now do this:
<pre>
MyForm form = new MyForm();
form.ShowAll();
</pre>
You will have a window showing with the title “This is my form”, and in the window body you will have the image of your choice on the left and a text label showing “Welcome to my form” on the right. Also notice that nothing shows up on the taskbar for this window, because we disabled that with this.SkipTaskbarHint = true;. Take time to try this and to experiment with various combinations of widgets. Also, Monodoc has some tutorials on Gtk# worth taking a look at. Notice that all the widgets belonging to the form are inside the class, and that it was not necessary to make the base HBox (hboxBase) public.
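To actually get the window on screen, the form has to be hosted in a Gtk main loop. A minimal hosting sketch, assuming the usual using Gtk; and a reference to gtk-sharp:
<pre>
public class Program
{
    public static void Main()
    {
        // initialize Gtk before touching any widget
        Application.Init();
        MyForm form = new MyForm();
        form.ShowAll();
        // hand control over to the Gtk event loop
        Application.Run();
    }
}
</pre>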
That’s all folks. I hope you will enjoy this and find it useful.
About the author:
My name is João Manuel Moura Paredes, and I’m a student at Faculdade de Engenharia da Universidade do Porto (Engineering School of the University of Porto) in Portugal. I’m almost 22 years old and have been a programmer for more than a decade. I have experience in several programming languages (especially low-level programming) and databases. I am experienced in a few OSs, especially Linux. I am the founder and chair of the Chefax I&D student group at my University, and am currently chair of NEACM, our local student chapter of ACM (Association for Computing Machinery).
If you would like to see your thoughts or experiences with technology published, please consider writing an article for OSNews.
Thanks for the great tips! Sure they will come in handy, especially that misc-binary thing.
binfmt_misc is a nice little deal, but I think it can cause problems if you use Wine (I don’t) because the exe headers are the same. Just a caveat.
Yes, that was explained in the article.
“One thing you cannot do in C is to derive a window class to create your window.”
Yes you can. Apps do it all the time. This is the whole point of the GObject system.
No, you cannot derive a CLASS. C doesn’t have classes. The GObject system is about structures as pseudo-classes.
Some article corrections…
3rd tip: System.Diagnostics.Process is in the System assembly, not the System.Diagnostics assembly (which doesn’t exist).
6th tip: You certainly can create your own Window from C. It’s not as easy as in C#, but it is possible. (If it wasn’t, then how was GtkWindow declared — in C — to be a derived class of GObject?) Object derivation requires that you provide two structures, the instance structure and the class structure. It becomes class derivation when the first member of each of these structures is of the appropriate base class type:
struct MyForm {GtkWindow parent; /* other members… */};
struct MyFormClass {GtkWindowClass parent; /* other class members… */};
Other steps will also need to be taken. See: http://www.le-hacker.org/papers/gobject/ch05.html#howto-gobject.
I always found it annoying having to type mono programname.exe every time I want to run a .Net program
Me too. But the article didn’t explain what to do about it. It still ends in .exe.
It’s not as easy as in C#, but it is possible.
It’s not easy to do surgery on yourself in a bath-tub filled with ice while looking up at an overhead ceiling, but it’s possible.
For some people, though, that’s just as good as “You can’t.”
Thank god for C#/Gtk#, or I would never have written GUI-based programs for Linux, ever.
No, that wasn’t an attack on you, that was just me being EXTRA witty.
3rd tip: System.Diagnostics.Process is in the System assembly, not the System.Diagnostics assembly (which doesn’t exist).
According to Monodoc, and according to my own tests, System.Diagnostics exists, and Process is there.
6th tip: You certainly can create your own Window from C. It’s not as easy as in C#, but it is possible. (If it wasn’t, then how was GtkWindow declared — in C — to be a derived class of GObject?) Object derivation requires that you provide two structures, the instance structure and the class structure. It becomes class derivation when the first member of each of these structures is of the appropriate base class type:
I took a look at the GObject documentation, and as far as I could see, GObjects are not classes, but pseudo-classes based on structures, as Jay Developer said.
You usually either create a shell script that wraps mono and the binary and put it in /usr/bin, or you tell Gnome to run all .exe files with mono. However, the second solution is not very good if you also have WINE installed.
Perhaps a bit off topic, but last week I experimented with some elementary ASP.NET files on Mono on Linux (I tried both xsp.exe and mod_mono). Web services especially seemed to run very slowly in comparison to ASP.NET on Windows. Is Mono still missing some important performance code with regard to web services, or do I need to tweak it somehow?
GObject is an object system with classes, instances, properties, methods, inheritance, signals and slots. It supports everything that is expected from an object system. Can anyone explain what’s “pseudo” about it?
Sure, it’s ugly, and proper OO support in the language is much nicer, but the fact remains, you can subclass GtkWindow and it will semantically be exactly the same thing as subclassing in C++/Java/C#/whatever.
I’m not defending GObject, I think it sucks, but that’s no reason to spread lies about it, no?
I meant “pseudo” not as an insult, but as a way of saying that those “classes” are not based on direct language support, but on some kind of workaround. I’m not trying to say they don’t work, or that they do not provide all the features classes are supposed to.
Hi
You should be aware that pseudo actually means false. GObject is very much an existing, useful system, and it’s the only way C could have object-oriented mechanisms for GTK+.
Jesus Christ! You can derive classes in Fortran if it has a proper object library. Has it occurred to the author that GTK+ is an object-oriented library written in C?
I mean to say, before C#/GTK# were ever conceived, GTK+/C/gobject had support for classes, methods, slots, signals, inheritance etc.
And what on earth are pseudo classes? If there is anything pseudo about languages, it is the level of abstraction present in object-oriented languages. Classes in object-oriented languages are a combination of pseudo structures, pseudo function pointers and pseudo pointers to structures.
In fact, fundamentally, a class is a refined data structure. Any C programmer worthy of typing on a keyboard should be able to implement classes in raw C, using data structures and function pointers. Which, essentially, is how the GObject library was designed, if memory serves me right.
Oh boy, I feel bad for the next generation of programmers. I mean, do I really need to tell a programmer that object orientation is just an abstraction model and that you can write a project in an OO pattern in any language?
Sure, C# makes it easier and cleaner to write GUIs, but I think it is important that programmers really understand what OO is, and frankly it is more of an ideology than a characteristic of a language.
Yes, you can derive window classes in C. And GTK+ makes it easy to do so.
That’s the problem: language. In Portuguese, pseudo actually may mean something else, like “almost”. In fact, it can mean anything from “false” to “disguised”, not forgetting “apparent” and “look-alike”.
Ok, fair enough. “Pseudo” is a vague word.
Regarding the speed of Mono: how fast does it run today? Do they expect to reach performance parity with commercial VMs anytime soon? Reading the white papers from Sun on their HotSpot VM makes it clear that creating a high-performance optimizing VM is no small feat. It took them several years and lots of skilled developers to reach where they are today. And it still suffers from poor startup performance and excessive memory use, although performance is pretty impressive.
The CLR is different from Java in a number of ways, but the fact remains: dynamic optimization is tricky, to say the least.
AOT compilation helps with memory consumption and startup performance, but languages as high-level and relatively dynamic as Java/C# need dynamic optimization to be competitive with C/C++, performance-wise.
Web services especially seemed to run very slowly in comparison to ASP.NET on Windows. Is Mono still missing some important performance code with regard to web services, or do I need to tweak it somehow?
Yes. Natively compiled code. Microsoft has hyped runtime code, but they cheat by making pseudo-x86/COM/CLR binaries – and they tend to do this whether they are managed or not. Microsoft cheats, and the Mono people have fallen for the .NET/CLR hype. It is not fast enough for a lot of applications.
The web services bits in Mono are probably very experimental at this stage. They are concentrating on other things.
This irrational fear of high-level programming constructs is ridiculous. C programmers limit themselves completely unnecessarily. Would you want to:
– Implement multimethod dispatch in C? Do you think your one-off algorithm would run anywhere near as fast as an optimized dispatch? Don’t think you need multimethods? Ever use visitor pattern?
– Implement dynamic typing in C? Do you think your performance will be anywhere near as fast as what you can get when you have a good type-inferring compiler? Don’t think you need dynamic typing? Ever used a union of different pointer types?
– Implement lambdas in C? You’re into “generate machine code in C-strs” here. Do you think it’ll perform anywhere near as fast as it does when you have a closure-optimizing compiler? Don’t think you need lambdas? How many one-off callbacks do you have? How many nested loops?
– Hand-expand macros? Back to generating machine code here. Don’t think you need macros? Do you *like* manually writing code to (say) parse an XML DOM tree?
– Implement predicate dispatch? Pattern matching? Do you realize how much your performance will suck without a compiler that optimizes those predicates? Don’t think you need predicate dispatch? How many huge ‘if…else’ trees do you have in your code?
– Manually perform storage-use optimizations? Manually generate copy-down methods? C generic data structures are much slower as a result of having to deal with everything via void*, and having to use function pointers for predicates.
– Implement inheritance? Limited types (constrained types)? Union types? Product types? Type constrained variables? Do you know how much your performance will suck compared to a language that has a proper class-hierarchy optimizing compiler? You’ve already admitted you need inheritance. Oh, how much does your performance suck when you don’t have an optimizing compiler to optimize-out all those indirect function calls?
These days, it’s only moderately useful to understand what’s going on at the machine level. The C abstract machine looks nothing like a modern processor anyway. It’s completely silly, however, to *program* at the machine level. Especially when the end result is likely to be slower than letting a near-omnipotent optimizer beat the code into shape.
Programs must be written for people to read, and only incidentally for machines to execute.
–Abelson and Sussman
If you want to implement OO in C you don’t use function pointers, unless you want per-instance methods, and neither C++, Java nor C# supports that (highly dynamic languages such as Python and Ruby do). I think this misconception is based on the fact that in C++, methods look exactly like function-pointer struct members, but they aren’t.
You use a per-class table where you look up the right method, and invoke it. Imagine you have a class with one instance variable (an integer, 4 bytes), and 20 methods.
With per-object function pointers, each instance of that object would occupy 84 bytes!! In C++, the pointers would be stored in a per-class table (the vtable), and the object would occupy 4 bytes for the integer, plus a couple of bytes overhead (depending on the situation, at least 4 bytes for the pointer to the vtable).
That’s a 10x difference; no wonder that most languages don’t have per-instance methods.
Don’t confuse the assembly and the namespace.
The System.Diagnostics.Process class lives in the System.Diagnostics namespace, and in the System.dll assembly.
From Monodoc:
Namespace: System.Diagnostics
Assembly: System 1.0.3300.0 (in System.dll)
Culture: neutral
Remember: assemblies are the units of packaging (.dll, .exe, etc.), and namespaces are to save us from typing long names all the time. 🙂
Sorry. My confusion.
It sounds like your complaint isn’t about the lack of the executable bit, but about the presence of the .exe extension.
You’re in luck: the file extension isn’t necessary. mono my-program doesn’t require the file extension, and neither does ./my-program (with binfmt_misc set up properly). Such is the joy of Unix file sniffing for file type detection. Yay.
However, removing the extension limits portability. One of the nice properties of CLI assemblies is that they can (in theory) run unchanged between Linux and Windows (and other operating systems Mono supports). If you remove the extension, then other operating systems (Windows) might not recognize the file type, rendering the program useless until it’s renamed. So that we can easily copy programs between platforms, we continue to use the .exe extension.
As Eugenia pointed out, shell scripts can also be used, so Unix users get the extension-less program names, and Windows users can continue with their blissful existence. 🙂
What on earth are you babbling about?
If you really knew what you were talking about you would know that:
* Type inferring has nothing to do with dynamic typing.
* Parse a DOM tree? Do you even know what the DOM is? If you can point me to a parseable DOM tree I will gladly make a parser in assembly for you. If you mean an XML document you should say so. No one refers to an XML document as a DOM tree, especially when one means it as an input to a parser.
What language are you talking about anyway? What language has multimethods, dynamic typing, type inference, lambda expressions, a macro system, pattern matching, inheritance, constrained types, product types etc.? Haskell? I wouldn’t classify that as faster than C, not even close. Maybe faster than Python.
In the latest zsh, at the shell level, you can associate an executable simply like this:
alias -s .exe "mono"
Bah. I’ve built myself a Microsoft-free existence, but the walls are starting to crack.
Now I’m going to build a development platform and design myself a next-generation language with C-based syntax. See you all in 15 years.
This irrational fear of high-level programming constructs is ridiculous. C programmers limit themselves completely unnecessarily. Would you want to:
What irrational fear? I don’t fear using high-level programming constructs, it’s just that 90% of them have no practical uses.
– Implement multimethod dispatch in C? Do you think your one-off algorithm would run anywhere near as fast as an optimized dispatch? Don’t think you need multimethods? Ever use visitor pattern?
I’ve never used a visitor pattern. In what application is it normally used and how?
– Implement dynamic typing in C? Do you think your performance will be anywhere near as fast as what you can get when you have a good type-inferring compiler? Don’t think you need dynamic typing? Ever used a union of different pointer types?
For the love of God, why would I want dynamic typing in C? And no, I have never had to use a union of different pointer types. How and when is this applicable?
– Implement lambdas in C? You’re into “generate machine code in C-strs” here. Do you think it’ll perform anywhere near as fast as it does when you have a closure-optimizing compiler? Don’t think you need lambdas? How many one-off callbacks do you have? How many nested loops?
I really don’t know. I haven’t carried out any benchmarks, so I can’t speculate. Also, compiler optimization isn’t my forte. But on average, C code is usually faster than code generated from many higher-level languages, irrespective of how lambdas might be implemented in C. And C compilers are a lot more mature and generate faster code than compilers for any other language I know that implements lambdas.
– Hand-expand macros? Back to generating machine code here. Don’t think you need macros? Do you *like* manually writing code to (say) parse an XML DOM tree?
Is that even possible?
– Implement predicate dispatch? Pattern matching? Do you realize how much your performance will suck without a compiler that optimizes those predicates? Don’t think you need predicate dispatch? How many huge ‘if…else’ trees do you have in your code?
So, say I write a pattern matching algorithm in Perl, and one in C; are you insinuating that the Perl code will undoubtedly be faster than the C code? You need to be specific with your claims and provide practical instances where C code clearly sucks at pattern matching or predicate dispatch.
– Manually perform storage-use optimizations? Manually generate copy-down methods? C generic data structures are much slower as a result of having to deal with everything via void*, and having to use function pointers for predicates.
Your sources and benchmarks are welcome. Until then, I hold your statements as unproven theories.
– Implement inheritance? Limited types (constrained types)? Union types? Product types? Type constrained variables? Do you know how much your performance will suck compared to a language that has a proper class-hierarchy optimizing compiler? You’ve already admitted you need inheritance. Oh, how much does your performance suck when you don’t have an optimizing compiler to optimize-out all those indirect function calls?
I don’t really see the performance hits you are clamoring about. In my experience, languages that have all these properties built in are much slower than C, which doesn’t. Which performance hit are we talking about? What applications suffer from all these performance hits? Where are your benchmarks?
These days, it’s only moderately useful to understand what’s going on at the machine level. The C abstract machine looks nothing like a modern processor anyway. It’s completely silly, however, to *program* at the machine level. Especially when the end result is likely to be slower than letting a near-omnipotent optimizer beat the code into shape.
I disagree with you. It is very useful to understand what is going on at the machine level. You learn to write optimized and efficient code, and you actually have a thorough knowledge of what your code is doing at every stage. This skill is excellent for optimizing, debugging, maintenance and troubleshooting your code, which is 90% of what you’ll be doing in your coding career anyway. Ultimately, the errors your debuggers, optimizers, profilers and memory tools spit out are more often than not machine-code jargon.
I also disagree that coding at the machine level is slower and less optimized than coding at a ridiculously abstract level. In fact, I don’t know where you get that notion from. If that were so, we’d all be writing our drivers, kernels, compilers, higher-level languages and libraries in Python.
Finally, I don’t consider C to be a machine-level language. In my opinion, it is a high-level language that people can read. I have no fear of using higher-level languages; many of their features are simply useless to me. Just like Microsoft Office: 95% of the features available in it are absolutely useless to me.
Data structures, variables and methods/functions: that’s all programming is about; every other feature is academic verbosity and needless complexity.
What the hell is everybody’s problem with Mono programs ending in .exe? Guess what, it’s part of the .NET spec. In reality this is no different than putting a .py, .pl, .ru, .c, .cpp, or .sh at the end of a file. I like the idea of Mono programs ending in .exe, with the only issue being that it is difficult to distinguish between apps that Wine should execute vs. apps that Mono should execute.
Hello,
Mono performance is fairly good. In general, you get much better performance from CLI-based systems because they fixed a few design issues in the VM that have a big impact on performance (for example having valuetypes, the default virtualization mode for methods, etc).
So, yes, HotSpot is a terrific JITer, but the good news is that Mono can give a reasonable JIT speed (70-80% of it) with 30% of the work. With more full time JIT hackers coming on board that gap should close pretty quickly.
As to Web Services: our support is fairly good, it has gone through extensive testing, no need to spread fear and doubts. Am sure there are still some unresolved issues (like with any other piece of software), but am confident that we can solve those.
Miguel.
* Type inferring has nothing to do with dynamic typing.
Type inferencing is a method, in dynamically typed languages, of increasing the performance of dynamically typed code. The compiler uses a type-inference algorithm to determine what method calls can be bound statically. Unlike type-inferencing in static languages, it is not fully complete — ie. when type inferencing fails, it falls back to dynamic typing, rather than failing altogether. Now, when you fake dynamic typing in C (via void* and tagged unions), you don’t get the advantage of the compiler doing this for you.
* Parse a DOM tree? Do you even know what the DOM is? If you can point me to a parseable DOM tree I will gladly make a parser in assembly for you.
Don’t be childish. Of course I know what a DOM tree is. An XML-DOM parser will spit out a DOM tree after parsing an XML file. Usually, the DOM tree itself is rather unwieldy to use. Thus, it is usually parsed into some internal program representation. This is precisely the step Xen tries to automate. Well, you can do the same thing with macros, too.
What language are you talking about anyway? What language has multimethods, dynamic typing, type inference, lambda expressions, a macro system, pattern matching, inheritance, constrained types, product types
Not a specific one. I wasn’t actually trying to push a specific language, just to juxtapose high-level languages against C. Dylan and Common Lisp have everything on that list except product types and pattern matching (though you can fake pattern matching with macros to a great degree). Haskell has everything except the macro system, dynamic typing, and multimethods. The new Nemerle language for .NET has most of these features too.
I’ve never used a visitor pattern. In what application is it normally used and how?
What kind of code do you write? The visitor pattern is pervasive in many classes of programs. Any time you need to dispatch on the type of more than one variable (eg: a game engine where you need to compute intersections between triangles and circles, circles and squares, etc.), you end up using the visitor pattern, or some logically degenerate case of it. Multimethods make the visitor pattern completely unnecessary.
http://exciton.cs.oberlin.edu/javaresources/DesignPatterns/VisitorP…
For the love of God, why would I want dynamic typing in C? And no, I have never had to use a union of different pointer types. How and when is this applicable?
Again, what kind of programs do you write? This is *very* common in C programs. Look at the definition of XEvent in Xlib.
http://tronche.com/gui/x/xlib/events/structures.html
It’s a type tag plus a union of all the possible sub-structures. This is a shoddy, error-prone way of faking dynamic typing.
Is that even possible?
Remember the articles about Xen? Instead of writing a whole bunch of functions to turn your DOM representation into your own internal tree of objects, you can use syntax extensions to C# and have the compiler automatically generate the parser. Well, you can do the same with macros, and you don’t have to hack support for a specific language into the compiler.
So, say I write a pattern matching algorithm in Perl, and one in C; are you insinuating that the Perl code will undoubtedly be faster than the C code?
Obviously, you don’t know what pattern matching means in the context of programming languages. Look it up.
I don’t really see the performance hits you are clamoring about. In my experience, languages that have all these properties built in are much slower than C, which doesn’t. Which performance hit are we talking about?
What languages are you experienced with? Perl and Python are not in the same target market as C. Lisp, ML, Scheme, and Dylan are. These languages will get you very close to C in performance:
http://www.bagley.org/~doug/shootout/index2.shtml (see the Ocaml results, in particular).
Doug’s benchmarks don’t take into account advanced usage, so the result is skewed towards C, but high-level languages make a very good showing. In practice, good developers in these languages get over 80% of the performance of C.
I disagree with you. It is very useful to understand what is going on at the machine level.
Then read up on processor architecture. C tells you nothing about what the machine is doing under the hood. Modern architectures are nearly as alien to the C machine model as they are to the Lisp machine model. And most experienced Lisp et al. developers have a good understanding of what their code is doing at the C level, because they are familiar with their compilers. They don’t need to actually *program* at that level the whole time to get that understanding.
I also disagree that coding at the machine level is slower and less optimized than coding at a ridiculously abstract level. In fact, I don’t know where you get that notion from. If that were so, we’d all be writing our drivers, kernels, compilers, higher-level languages and libraries in Python.
Python is a terrible example. It’s not even natively compiled! Languages with native compilers (Lisp, Scheme, Dylan, ML, among others) are much faster, and much closer to C.
I’m not saying that you cannot get C code to run faster than the equivalent Lisp et al. code. You can go really low-level with C. However, the time investment required for that level of optimization is ridiculous.
Consider: when you’ve got a dynamic type (eg: the XEvent structure), do you generate hand-optimized functions for each possible case? Do you use generic list functions (based on void*), or do you have hand-written list-functions optimized for each type? I highly doubt it. Hell, even the Linux kernel doesn’t do that!
The point is, a high-level language compiler has a lot more freedom to optimize, because it has more information (by definition, “high level”), and because it doesn’t have to deal with the weaknesses of the C memory model (basically, the memory representation of objects must be determinate, which rules out huge classes of optimizations). The type of tricks GTK+ uses to fake object orientation (weakly typed pointers, lots of casting, indirect calls via function pointers) are precisely the cases where a C compiler will *always* pay the cost of dynamism, whereas a more advanced compiler will be able to optimize most of it away.
The end result is that while in the optimal case, C code will be faster, in the general case, the code from a high-level language compiler will usually be faster. See:
http://www.algo.be/cl/TEE-lisp/31837187622993390/index.htm
The basic result was that the median C++ program was much slower than the median Lisp program, even though the fastest C++ program was faster than the fastest Lisp program. This is because developers are not all superhuman, and when you have a mix of programmers of different capabilities, you can’t do the sorts of manual optimization C programmers have to do.
Data structures, variables and methods/functions: that’s all programming is about; every other feature is academic verbosity and needless complexity.
If you think that, no amount of argumentation will persuade you. We cannot see eye-to-eye. I know both C/C++ and Lisp, while you apparently have no real experience with the latter. I suggest you go learn Lisp. Until then, consider this: coding in Lisp (or Dylan or Scheme, or ML if you like static typing) is a lot like coding in Python. It’s really fast and really productive. At the same time, advanced compiler technology means that it also results in very fast code. You can see, then, why I consider these languages to be much better than C for application development.
Congratulations to all of you who are involved in Mono development. What a nice piece of development. I hope Mono will succeed in achieving its goals.
It needs to be taken one step further into a true RAD environment similar to Delphi.
Kylix was a good try, but the deployment issues were insane.
If MonoDevelop adds a nice GUI forms builder similar to CSharpBuilder on Windows, we could see an explosion of apps for Linux.
The supposed skinny on a GUI builder for MonoDevelop is that Glade isn’t the best solution because of issues with custom controls or something. Supposedly, a new GUI builder is going to be done, but it hasn’t been started and obviously it will be a while before it’s done. It won’t be specific to Gtk#.
Not being an expert concerning the legal issues surrounding Mono, could someone help me understand why they think Microsoft will allow Mono to coexist alongside their proprietary implementation? I understand the main argument is that the CLI and the C# language are standardised through ECMA, but one of Microsoft’s executives recently stated that:
‘We’d have been dead a long time ago without Windows APIs’ [ref: zdnet.co.uk]
So why does anyone believe Microsoft would permit their lock-in to be diluted with a free, open equivalent? Microsoft can’t be daft enough to hand that much power to the open source community without having something else up its sleeve.
Just a question from someone not very knowledgeable about Mono.
Steve
What kind of code do you write?
I write all sorts of embedded software.
You confuse me by freely interchanging object-oriented jargon from higher-level languages with that of procedural languages. For example, when you say “union of different pointers,” it would have made a lot more sense if you had just said type definitions, or typedefs.
So also is your definition of high-level languages. Some languages have been relegated to the academic sphere. Who uses Scheme, Dylan, ML, Lisp, etc. for general-purpose stuff? Only hobbyists and academic scholars care for those languages. In practice and reality, C, C++, Java, Python, Perl and C# are the languages used by the majority of professional software vendors. Not Lisp, Dylan, OCaml, ML or your niche programming languages.
I’ve had no practical experience with Lisp. I learnt Lisp on my own as an academic pursuit. It’s a great language, but I have never been compelled to write a single application in it. This is not a feature-fest contest, or a best-language contest. It is about what is practical and relevant.
You might consider Lisp great for application development; I don’t. Compared to other languages like Java, C#, Python, C or C++, how many specialized APIs does Lisp have? Can I write GTK+ applications in Lisp? How about low-level code? Do software vendors provide API bindings in Lisp? How many programmers actually know Lisp as compared to C/C++/Java/C#? Are there OpenGL bindings for Lisp? How many hardware platforms can Lisp run on?
Lisp is a great language, but great languages don’t necessarily make for good applications or practical use. Nothing has convinced me to write a single application in Lisp, and I know the same is true for many other programmers who know Lisp. That’s because I don’t care about the language or compiler advancement itself; I care about the development platform and practical usage. Lisp is not practical for me to use.
Your software vendor would have to be drunk to tell you to code their next big project in Lisp. Who will maintain it 5 years from then, when you’ve left? What is the demand for Lisp programmers? I’m sorry, as much as I like Lisp, it doesn’t qualify as a practical general-purpose language, and the same goes for OCaml, Scheme and Dylan.
As for the languages I know: I know C, Fortran, Forth, C++, Perl, Python, Bash, Java, C# and Lisp. For most of the work I do I use C and occasionally C++. If I ever get interested in application development, I wouldn’t be using Lisp, Scheme, Dylan, or OCaml either. With outstanding development frameworks like .NET/Mono, why should I? And since I’m quite versed in C and C++, I’d be better off using GTK+ to write my higher-level applications anyway, if I have the need to.
And once again, I see no performance hits with many GTK+ applications despite the fact that it is object-oriented C. According to your postulations, GCC isn’t smart enough to optimize GTK+ applications because of inherent weaknesses in C’s memory model and also because of its fake object system. My experience has been the opposite. I see no performance penalties in the GTK+ object system. In fact, up until recently, GTK+ applications were small, fast and efficient. I remember clearly that GTK1 apps in particular were faster and smaller than apps written in object-oriented languages.
The same holds true today. GTK+2 apps, albeit slower than GTK+1 apps, are still smaller and faster than comparable applications written in object-oriented languages, and I have yet to see visible performance hits as a result of GTK+ doing object orientation in C. I don’t buy theories anymore; give me practical instances where object-oriented C applications suck terribly compared to ones written in Lisp, Dylan, OCaml or Scheme, and then I would shut up.
There are a couple of issues involved in the legalities of Mono.
Most likely ECMA 335 and 336 are clear, if someone can confirm that they are royalty-free.
The other issue concerns the APIs that are not covered by the ECMA specs. These APIs mostly cover ASP.NET, ADO.NET, Windows Forms, and some Microsoft-specific APIs. These aren’t a necessity for open source systems. Most of these will be implemented, but can be backed out if necessary.
Another thing you can look at is: why hasn’t Microsoft gone after Wine, Crossover, etc. yet? I guess one answer is that they don’t pose a threat to Microsoft… yet. The other answer could be that there still seems to be a question regarding the patentability of APIs. Remember, Mono is a clean-room implementation of the APIs. They’re not ripping off Rotor code.
If Gnome were to choose Mono as a standard runtime and didn’t use any of the non-ECMA APIs, then they would probably be in the clear. In reality, though, people like Havoc Pennington don’t care if ECMA 335 and 336 are royalty-free. People like him hate Mono because the tech was invented by Microsoft. So Gnome will probably continue along the path of cobbled-together, home-grown crap that will take years to implement, will be buggy as hell, and most developers just won’t touch it.
Don’t get me wrong, I like Gnome – I’m using it as my primary desktop – but Miguel de Icaza choosing C over C++ as the Gnome language was just a bone-headed move, plain and simple. It’s no secret that he does not like C++, but because of his personal preferences, and not technical merit, he chose a language that isn’t suited for desktop development.
Like most Lisp people, Raynier just can’t help doing a little Lisp advocacy when the subject of computer languages comes up. Lisp people just can’t get it through their heads that Lisp is fine as a research language, but it has had 40 years to prove itself as a mainstream language and never did. If you read some of the Lisp mailing lists you would think that all programming problems would have been solved years ago if everybody had been using Lisp. It’s just a way for certain people to see themselves as elite.
Why “/opt/gnome/bin/mono”??
One more time: Mono is NOT GNOME!!!!
I know the Mono developers want it, but please… Mono is NOT GNOME!!
Sometimes I think Mono is a parasite project.
That’s where I DECIDED I would install it… It’s where I install all Gnome and Ximian stuff.
Not in any way the default install path.
I write all sorts of embedded software.
I suppose it depends on the precise sort of embedded software you’re talking about. There are places where I would agree that C isn’t a huge hindrance (writing an FFT, for example). However, I also write embedded code (probably completely different — we’ve got radios with hundreds of megs of RAM) where I would have loved to have more than the spartan feature set of C++ available to me.
So also is your definition of high-level languages. Some languages have been relegated to the academic sphere. Who uses Scheme, Dylan, ML, Lisp, etc. for general-purpose stuff?
This discussion was not about what is a practical language for a particular project. This discussion was in response to:
“Oh boy, I feel bad for the next generation of programmers. I mean, do I really need to tell a programmer that object orientation is just an abstraction model and that you can write a project in an OO pattern in any language?”
You essentially said: “language doesn’t matter, concepts do.” My point was that language *does* matter, because it can make it orders of magnitude easier to implement certain concepts. This discussion is about an entirely abstract subject — we’re not talking about practical implementation here.
Can I write GTK+ applications in Lisp? How about low-level code?
Yes.
Do software vendors provide API bindings in Lisp?
Some do, some don’t.
How many programmers actually know Lisp as compared to C/C++/Java/C#?
Not enough. But that’s not as huge a hurdle as one might think. There are a lot of situations where having a small team of highly-skilled developers is better than having a large team of commodity programmers. In these situations, it is easy to add the constraint that the programmers know some suitable high-level language. Several corporations understand that using a better language can be great for productivity, even if it limits your available developer pool. Ericsson, for example, uses Erlang throughout its enterprise. How many Erlang programmers are there compared to C++ programmers?
Are there OpenGL bindings for Lisp?
Yes.
How many hardware platforms can Lisp run on?
The best free Lisp (SBCL) runs on Alpha, PowerPC, x86, SPARC, and MIPS. It supports FreeBSD, Solaris, NetBSD, OpenBSD, and MacOS X. A Windows port is in the works. Commercial Lisps run on a wide range of architectures (including Windows). A good Windows commercial Lisp can be had for $250 (Corman).
And once again, I see no performance hits with many GTK+ applications despite the fact that it is object-oriented C. According to your postulations, GCC isn’t smart enough to optimize GTK+ applications because of inherent weaknesses in C’s memory model and also because of its fake object system.
You wouldn’t. Language speed has almost no impact on GUI performance. That’s why Python makes a perfectly good GUI language for most tasks. If you wanted, however, to use high-level OO constructs in speed-critical code, you’d see these performance hits. Lisp has been used in popular PS2 games (Jak and Daxter) to make development easier without affecting performance.
See, C programmers have this attitude about C: they have very fast code most of the time, and if they want to use high-level (read: productivity-enhancing) features, they can always roll their own. Except for certain problem domains (yours would probably qualify, though), this is precisely backwards. You want those high-level features to improve productivity, and want to be able to code in a low-level way for the 1% of each program that needs it. Thus, it’s better for overall performance when you can let the optimizer deal with the other 99%, and give your full attention to the 1% that needs it.
Like most Lisp people, Raynier just can’t help doing a little Lisp advocacy when the subject of computer languages comes up.
I’d love to see how you came to that conclusion, given that I didn’t even mention Lisp in my original post! When it did come out, it was in the context of talking about compiler optimizations. Lisp is where a lot of the research on high-level optimizations was focused.
I also find it interesting that I mentioned ML and Dylan a great deal too, because they were relevant in talking about compiler optimizations. The fact that you latched onto the “Lisp” in that list of languages says more about you than it does about me.
Look: I make it a point not to insert random Lisp plugs into discussions. When I do mention it, it is because it is relevant to the topic at hand. For example, in this discussion, I mentioned it because I wanted to show Root the downsides of having to roll your own high-level features in C. I mentioned it before in reference to Xen, because Lisp macros serve as a much cleaner way to implement domain-specific languages. And I feel obliged to mention it whenever somebody claims Java and C# are innovative, because that is just simple misinformation.
Lisp people just can’t get it through their heads that Lisp is fine as a research language, but it has had 40 years to prove itself as a mainstream language and never did.
Why do you think that is? I have yet to see a convincing technical argument on this point. Nearly always, the argument descends into “Lisp is a great language, but…”
This is clearly flamebait, but if Lumbergh gets to read it before it gets modded down, I’m happy:
Lumbergh: I find it highly interesting that in the Novell Mono thread, the *only* person to mention Lisp was *you*.
Lumbergh: I find it highly interesting that in the Novell Mono thread, the *only* person to mention Lisp was *you*.
There are at least 3 mentions of Lisp in your second @Root post, and at least one mention of Lisp in your @Fredrick post before I even showed up on this thread. Try again. It seems that your Lisp advocacy is so engrained that you don’t even remember when you mention it.
Try reading my post again. Does this *look* like the Novell Mono thread to you? It seems that your anti-Lisp reactionism is so engrained that you forget how to read when you see “Lisp” in a sentence.
Also, note that in pretty much every instance in this thread, Lisp is mentioned only as one of several languages that have advanced compilers. Why aren’t you calling me an ML fanboy?
Don’t you find it interesting that Root’s reply to you concentrated mainly on the comparison of Lisp to other mainstream languages when you supposedly never mentioned Lisp? Remember, I was responding to Root. It just happens that I noticed your mentioning Lisp several times in previous posts, and I’m well aware of your Lisp advocacy in other threads.
Obviously, you still don’t get it. I never said I didn’t mention Lisp in this thread. I said I never mentioned it in the thread about Novell and Mono a few days back. *You’re* the one who brought up Lisp. And I wasn’t comparing Lisp to mainstream languages. I was saying that high-level languages give the optimizer much more room to work, and using Lisp, ML, and Dylan as real-world examples. I would have used C#, but no C# compilers have those sorts of optimizations!
Ok, when you referred to the Novell/Mono thread I didn’t realize you were talking about a previous thread, because that thread wasn’t really about Mono and I was replying to the idiot who claimed “why doesn’t open source come up with something better”, as if it’s just a matter of a couple of weekends and a SourceForge page to come up with some kind of managed environment that will compete in the mainstream with .NET and Java.
I referred to your comment about open source already having come up with something better over 10 years ago, and then cited examples that I wanted you to reply to, since you didn’t give any of your own. I threw Lisp into that mix because you invariably bring up Lisp in language discussions.
I stand by my comments in the previous thread: if open source had come up with something so much better 10 years ago, then it would be in widespread use today. There’s more to development than just the language itself. There are the APIs, developer tools, docs, the ability to integrate with the native OS easily, GUI bindings… The bandwagon effect is even important, because as a developer you want to be assured that the supporting cast for the runtime/language will continue to be developed, debugged and improved upon. And usually that takes some sort of organization with a lot of resources to guarantee it… probably one of the reasons why Qt (developed by a commercial company with full-time developers) is so much better than GTK+, even when you factor out that C++ is just a better language for GUI development.
Listen, I have nothing against the Lisp language. I’m sure it has interesting features that more mainstream languages can adopt if they choose. Most of my experience with a Lisp-derived language is in Scheme, while studying “Structure and Interpretation of Computer Programs”. But as my previous reasons stated, there’s a lot more to using a language effectively than just the language proper. And you have to admit that _historically_ Lisp has had speed issues.
Anyway, I might go emerge some variant of Lisp to play around with tonight and check out some tutorials, to see what the elitists are always clamoring about. What version of Lisp would you recommend that has decent libraries and is an easy introduction to some of the more interesting features of Lisp?
I stand by my comments in the previous thread that if open source had come up with something so much better 10 years ago, then it would be in widespread use today.
“Better” does not necessarily lead to popularity. There are lots of “better” technologies that died for market reasons.
There’s more to development than just the language itself. There are the APIs, developer tools, docs, the ability to integrate with the native OS easily, GUI bindings…
I’d say that’s more a matter of “more practical” than “better.” “Better” to me means technically better, better in the abstract. I would never call C# better than Lisp, but I’d be the first to admit that there are many cases where it is more suitable, because of its market support and huge library.
In any case, neither of these conversations has been about “better.” The last one was about innovation (Lisp had the features first), and this one was about availability and optimization of high-level features.
And you have to admit that _historically_ Lisp has had speed issues.
Maybe. Good native Lisp compilers came out with MacLisp in the ’80s. Their primary user was the Macsyma project, which used them for numerical code. They managed to get Lisp code to run nearly as fast as FORTRAN.
What version of Lisp would you recommend that has decent libraries and is an easy introduction to some of the more interesting features of Lisp?
CLISP for playing around, SBCL for industrial-strength stuff. DrScheme is a good intro environment, but it’s Scheme rather than Common Lisp. As for libraries, check dev-lisp; there are lots of cool things in there.
http://www.cliki.net/Library
The tip about encodings could do with a lot of work. (It also has little to do with Mono specifically – it’s just a normal .NET thing.)
A few things (which apply to .NET in general – I would certainly imagine they’d all work in Mono):
o Rather than use new ASCIIEncoding() etc., just use Encoding.ASCII – it’s a singleton, so it won’t end up creating objects for you needlessly. (A short sketch follows below.)
o Your article implies that the interface for ASCIIEncoding is different to the one for UTF8Encoding. It’s not. They both inherit the same methods from Encoding, and you can use the same code for both.
o Your NullEncoding is lossy, as you’re using one byte per character rather than two. Why not just use UnicodeEncoding instead, which is basically the “right” way of doing what you were doing in the first place?
See http://www.pobox.com/~skeet/csharp/unicode.html for more information.
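For instance, here’s a minimal sketch of the first two points (the class name EncodingDemo is mine, purely for illustration):

using System;
using System.Text;

class EncodingDemo
{
    static void Main()
    {
        // Encoding.ASCII and Encoding.UTF8 are cached instances - no need to
        // construct new ASCIIEncoding()/new UTF8Encoding() objects yourself.
        Encoding[] encodings = { Encoding.ASCII, Encoding.UTF8 };

        // Exactly the same code works for both, because GetBytes and
        // GetString are members of the common Encoding base class.
        foreach (Encoding enc in encodings)
        {
            byte[] bytes = enc.GetBytes("hello");
            Console.WriteLine("{0}: {1} bytes", enc.EncodingName, bytes.Length);
        }
    }
}

For a pure-ASCII string like “hello”, both report 5 bytes.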
A few things (which apply to .NET in general – I would certainly imagine they’d all work in Mono):
Yes, that’s true, in general.
Your article implies that the interface for ASCIIEncoding is different to the one for UTF8Encoding. It’s not. They both inherit the same methods from Encoding, and you can use the same code for both.
According to Monodoc, it is different. I was basing myself on Monodoc, because the article is about Mono.
Your NullEncoding is lossy, as you’re using one byte per character rather than two. Why not just use UnicodeEncoding instead, which is basically the “right” way of doing what you were doing in the first place?
Lossy? Are you sure? Because I tried it, and it works. If you paid a little more attention to the text, you would see what I explained about UTF encoders. Sometimes you just CAN’T use Unicode. And would you explain why UnicodeEncoding is the “right” way? Is it the right way if, for example, I send data through a socket to a server that does not support Unicode, and it receives Unicode data? Is it still the “right” way? And is it the “right” way if you wish to use the exact codepage you are using, like many other people around the world?
According to Monodoc, it is different
Did you check the inherited members of ASCIIEncoding? I suspect not; if you had, you’d have seen that it inherits byte[] GetBytes(string).
Lossy? Are you sure?
Yes, your NullEncoding is lossy. Or rather, it would be if it didn’t fall over. It would have to be – it’s encoding length*2 bytes of information in length bytes. Did you try it with all characters, including those > 0x7f? I suspect not.
I’ve just tried it, and (as hinted above), it falls over with an OverflowException with characters > 0x7f. Here’s the test program:
using System;

class Test
{
    static void Main()
    {
        // "\u0080" is the first character above 0x7F - outside the ASCII range.
        string test = "\u0080";

        // NullEncoding is the helper class from the article.
        byte[] bytes = NullEncoding.GetBytes(test);
        string test2 = NullEncoding.GetString(bytes);

        Console.WriteLine(test == test2);
    }
}
Your NullEncoding wouldn’t solve anything though – you still don’t know what the other side is going to use. If you can guarantee that the two sides use the same thing, I’d recommend using Encoding.Unicode or Encoding.UTF8. If you can’t guarantee what each side will use, and you can’t specify it, you’re stuffed. How exactly do you believe NullEncoding helps?
One more comment on your NullEncoding implementation: creating a string using concatenation like that becomes painfully slow as the size of the string increases. In this case, you could actually just allocate a char array to start with, fill it, and then create the string directly. In general, it’s worth using a StringBuilder (and initialising it with an appropriate size if you know one).
See http://www.pobox.com/~skeet/java/stringbuffer.html for a fuller example of why you shouldn’t create the string in that way – it’s Java-based, but the principles are the same.
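As a rough sketch of both alternatives (the method names are mine, just for illustration):

using System.Text;

class StringBuildingDemo
{
    // Build the result in a char array and create the string once.
    static string FromBytesViaArray(byte[] bytes)
    {
        char[] chars = new char[bytes.Length];
        for (int i = 0; i < bytes.Length; i++)
            chars[i] = (char) bytes[i];
        return new string(chars);
    }

    // The general-purpose alternative: a StringBuilder sized up front.
    static string FromBytesViaBuilder(byte[] bytes)
    {
        StringBuilder builder = new StringBuilder(bytes.Length);
        foreach (byte b in bytes)
            builder.Append((char) b);
        return builder.ToString();
    }
}

Either way the result is built in a single pass, instead of copying the whole string on every concatenation.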
You’re wrong. Yes, I’ve tried it with characters > 0x7F, and for a simple reason: I’m Portuguese. About 1/50th of the characters we use are > 0x7F. I’ve tried it with my own name, which is written João, with an a with a tilde. And it worked fine. And when I tried it with the ASCII and Unicode encodings, it just came out as a “?”.
Your NullEncoding wouldn’t solve anything though – you still don’t know what the other side is going to use. If you can guarantee that the two sides use the same thing, I’d recommend using Encoding.Unicode or Encoding.UTF8. If you can’t guarantee what each side will use, and you can’t specify it, you’re stuffed. How exactly do you believe NullEncoding helps?
Well, but that is if YOU CAN guarantee. But I was NOT talking about when YOU can guarantee. I was talking about when you CANNOT guarantee.
One more comment on your NullEncoding implementation: creating a string using concatenation like that becomes painfully slow as the size of the string increases. In this case, you could actually just allocate a char array to start with, fill it, and then create the string directly. In general, it’s worth using a StringBuilder (and initialising it with an appropriate size if you know one).
Yes, I’m aware of that. But have you ever heard of KISS (Keep It Simple, Stupid)? If I’m trying to make the reader understand what he has to do, why should I complicate things further?
As far as I’m concerned, you’re just trolling.
So did you actually try my test program?
If your tests with UnicodeEncoding didn’t give back the original string, you clearly weren’t testing properly, as UnicodeEncoding *isn’t* lossy.
You still haven’t explained how you expect NullEncoding to preserve all the data with a compression ratio of 2:1. Are you aware that chars in .NET are 16 bit? If you are, please explain how you expect to compress 16 bits into 8 bits without any loss. If you’re not, please read some books or documentation before writing any more “tips” on the subject.
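For reference, a tiny sketch of the size mismatch (nothing here beyond the standard Encoding.Unicode; the wrapper class is mine):

using System;
using System.Text;

class SixteenBitDemo
{
    static void Main()
    {
        // Each .NET char is a 16-bit UTF-16 code unit, so a lossless
        // byte representation needs two bytes per character.
        string s = "hello";
        byte[] bytes = Encoding.Unicode.GetBytes(s);
        Console.WriteLine("{0} chars -> {1} bytes", s.Length, bytes.Length); // 5 -> 10
    }
}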
You still haven’t explained how you expect NullEncoding to help if you *cannot* guarantee what encoding is used on each side – if you don’t know what the other side expects, you just can’t get the data to it accurately. NullEncoding just *doesn’t help*. You don’t seem to have even read the text you quoted.
As for not using StringBuilder – you could actually have made the code simpler *and* more efficient by just using a char array. I really don’t like presenting horribly inefficient code (which you’re recommending others use!) without at least giving warnings about its inefficiency.
As for whether or not I’m trolling – I’m trying to correct some of the horribly misguided impressions you’ve given. The fact that you think NullEncoding isn’t lossy shows you don’t know *nearly* enough to be writing about the subject. To my mind, if you’re going to write garbage, you should expect to be called on it.
Oh, and does your lack of comment about the ASCIIEncoding interface mean you’ve got a clue on that topic now, and accept that you can just use GetBytes(string)?
One note on the test program above: the string “\u0080” is simply the first Unicode character above 0x7F – any character outside the ASCII range would demonstrate the same problem. Hopefully you saw what I was getting at when testing it yourself.
I still don’t understand why you didn’t see an OverflowException when testing with your own name though, as the docs for Convert.ToByte(char) specifically say it will throw an OverflowException if “value is greater than Byte.MaxValue”. Byte.MaxValue is 127, and the character for an a with a tilde is (according to http://www.unicode.org) 0xe3.
There’s no 2:1 compression. Since I use the ISO-8859-15 codepage, and since in that codepage all characters have a value <= 255, what happens is that I’m converting from Unicode to ISO-8859-15 directly. So, if any other program converts a Unicode-encoded file or string to ISO-8859-15, will you say that program is compressing 2:1?!?
Second, for byte, MaxValue is 255, not 127. For char, MaxValue is 127.
According to Monodoc:
From the Byte class documentation:
The Byte data type represents integer values ranging from 0 to positive 255 (hexadecimal 0xFF).
From the Convert.ToByte(char) documentation
OverflowException: The numeric value of value is greater than MaxValue.
When I clicked on the MaxValue link, I was brought to another page of the Monodoc documentation:
System.Byte.MaxValue Field
Contains the maximum value for the Byte type.
Value: 255
public const byte MaxValue
Remarks
The value of this constant is 255 (hexadecimal 0xFF).
About the ASCIIEncoding stuff: why did I get a compile error when I tried the method directly, and why is the method explicitly documented in Monodoc for the Unicode encodings, but not for ASCIIEncoding?
And besides, the distinction is there for another reason: to show readers that the general Encoding classes offer two alternative methods, one for each type of use.
It seems to me that you don’t know anything about converting data of different sizes, how Unicode works, or how things are converted to and from Unicode.
The characters you happen to be using may all be Unicode < 256, but that’s not true of Unicode strings in general. So yes, your encoder is lossy for general strings. It may not be lossy in your particular case, but that’s not the same thing. (Also, I don’t believe all characters in ISO-8859-15 *are* less than 256 in Unicode. In particular, I believe it includes the Euro sign, which is 0x20ac in Unicode.)
Apologies for the braino on Byte.MaxValue – I don’t know what I was thinking there, or what happened when I got an OverflowException (a typo, no doubt). Try it with string test = “\u0100” and you’ll get the exception for certain, though. So yes, your encoding will fall over on perfectly valid strings. That’s hardly encouraging, is it?
Char.MaxValue *isn’t* 127, btw – it’s 65535. SByte.MaxValue is 127.
As for why you got a compilation error with ASCIIEncoding: I don’t know, I’d have to see your code. The reason you can explicitly see the overridden methods in UnicodeEncoding is that they are overridden there rather than using the base implementation. That doesn’t mean the base implementation isn’t there though. Here’s an example program which compiles fine for me. Please try it and see if it compiles for you too:
using System;
using System.Text;

class Test
{
    static void Main()
    {
        // GetBytes(string) is inherited from the Encoding base class,
        // so it can be called on an ASCIIEncoding with no cast at all.
        byte[] b = Encoding.ASCII.GetBytes("hello");
        Console.WriteLine(b.Length);
    }
}
The above compiles for me even with a relatively old version of Mono. It should compile for you too. This goes against what you claim in the article. Care to explain?
It seems to me that you don’t know anything about converting data of different sizes, how Unicode works, or how things are converted to and from Unicode.
LOL. I’m not the one claiming that an encoding which takes a string of UTF-16 characters and maps it to the same number of bytes will work without being lossy…
It’s fine to show that there are various different methods you can call on Encodings – but when you start claiming that you can only use a certain method on one kind of encoding when it’s actually a member of the base class, you’re just being inaccurate.
You also still haven’t explained how you expect your NullEncoding to help when you don’t know what encoding the other side is going to use. (Basically your encoding is an ISO-8859-1 encoding, which can be got using Encoding.GetEncoding(28591). Of course, you’d know that if you read the link I posted earlier, which also pretty much disproves your claim that I don’t know anything about how Unicode works.)
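For instance, a minimal sketch (reusing the João example from upthread; the wrapper class is mine):

using System;
using System.Text;

class Latin1Demo
{
    static void Main()
    {
        // ISO-8859-1 ("Latin-1") by code page - no hand-rolled encoder needed.
        Encoding latin1 = Encoding.GetEncoding(28591);

        byte[] bytes = latin1.GetBytes("João");        // one byte per character
        Console.WriteLine(latin1.GetString(bytes));    // João - round-trips fine

        // Characters outside Latin-1, such as the Euro sign (U+20AC),
        // can't survive the round trip: by default they come back as '?'.
        Console.WriteLine(latin1.GetString(latin1.GetBytes("\u20ac")));
    }
}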
(Also, I don’t believe all characters in ISO-8859-15 *are* less than 256 in Unicode. In particular, I believe it includes the Euro sign, which is 0x20ac in Unicode.)
I didn’t say that all characters in ISO-8859-15 are less than 256 in Unicode. I said exactly the opposite: that all characters in ISO-8859-15 are below 256 in ISO-8859-15, because ISO-8859-15 is a 256-character encoding.
Char.MaxValue *isn’t* 127, btw – it’s 65535. SByte.MaxValue is 127.
Yes, I mixed those up.
You also still haven’t explained how you expect your NullEncoding to help when you don’t know what encoding the other side is going to use. (Basically your encoding is an ISO-8859-1 encoding, which can be got using Encoding.GetEncoding(28591)
Because when you don’t know what the other side is using, you just take the data as it comes and let the classes do the rest for you.
And the point of the article is not to teach everything about encodings, but to let people understand a bit of the theory behind it. You, who have also written articles, should know that it is difficult to explain theory and practice at the same time. It is easy to be misinterpreted. In this case some of the confusion may have been caused by my poor English, but there are always some people who simply do not pay attention.
And other times you just have to twist reality a bit, so that people can understand the first part. Later, when they have learned a bit more, they can “undo” that small deviation from reality and fill in the missing pieces. Didn’t your math, physics and chemistry teachers do that when you were in school? Didn’t they ever make you assume things that weren’t real, so that you could understand others that were, and which they couldn’t explain any more simply because you didn’t yet have the knowledge for it?
For example, didn’t they start by telling you that gravitational acceleration is always 9.8 m/s², only to tell you later that it in fact varies with the distance from the Earth’s center? Didn’t that work to let you understand the rest?
There’s a difference between the approximations made in physics and what you did here. You basically said, “Here’s a good class to use”, when the class you gave was horrendously inefficient and breaks on any string containing a character with a Unicode value > 255. Also, when I write an “approximation to the truth” I always try to make sure I state that it’s an approximation, so that people don’t hold it to be the truth. Where I can, I direct them to more detailed resources if they wish to learn more.
Your previous post about values in ISO-8859-15 not being > 255 was confusing. I see what you meant now, but that’s irrelevant. What’s important is whether or not any characters in the string have a Unicode value > 255. If they have, your class breaks with an exception. If it didn’t throw an exception (if you used a cast instead of Convert.ToByte, for instance), it would have to lose data instead, because even if *you* don’t happen to use Unicode characters > 255, other people do. Pretending they don’t is dangerous.
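To make the two failure modes concrete, here’s a minimal sketch (the wrapper class is mine):

using System;

class TruncationDemo
{
    static void Main()
    {
        char euro = '\u20ac'; // the Euro sign: Unicode value 0x20AC, well above 255

        // A plain cast silently keeps only the low 8 bits...
        byte truncated = (byte) euro;
        Console.WriteLine(truncated);  // 172 (0xAC) - the data is simply lost

        // ...whereas Convert.ToByte refuses and throws an OverflowException.
        byte b = Convert.ToByte(euro);
    }
}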
Because when you don’t know what the other side is using, you just take the data as it comes and let the classes do the rest for you.
What you get is bytes. You have to convert those into characters. If you don’t know the encoding on the server, you simply cannot do that accurately. Your class doesn’t help with this like you claim it does. It does nothing useful.
And the point of the article is not to teach everything about encodings, but to let people understand a bit of the theory behind it.
Except you didn’t. If anything, you confused things more, as far as I can see. You provided code which attempted to solve a problem to which there is no solution, and the code you gave is inefficient and breaks on any character data not in the first 256 Unicode values – something you either weren’t aware of, or just failed to mention.
You claimed that people couldn’t use code that they could use (Encoding.GetBytes(string), where the encoding in question is an ASCIIEncoding), you implied there are no existing encodings other than the UTF ones and ASCII (there are plenty), and generally painted a pretty sorry picture. Why not just point people to an article which explains Unicode in a reasonable way? http://www.joelonsoftware.com/articles/Unicode.html is a good article to point newbies at, for instance.