More On Java Generics
I came across this interesting problem when I was dealing with Java's generics. Below is the code snippet that will break:
package sandbox;

import java.util.ArrayList;
import java.util.List;

public class Test {
    private static abstract class A<T> {
        protected T a;
        public T get(){ return a; }
        public List<String> getList(){ return new ArrayList<String>(); }
    }

    private static class B extends A<Integer> {
        public B( int k ){ a = k; }
    }

    private static class C extends A<Double> {
        public C( double k ){ a = k; }
    }

    public static void main( String... args ){
        List<A> list = new ArrayList<A>();
        list.add( new B(1) );
        list.add( new B(2) );
        list.add( new B(3) );
        list.add( new B(4) );
        A a = list.get( 2 );
        for( String k : a.getList() ){}
    }
}
Well, the above code will not compile. The culprit is the last line of main: there, getList() returns a raw List instead of a List<String>, so the loop's elements are typed as Object and cannot be assigned to String k. This happens because the variable a is declared with the raw type A, and using a raw type discards the generic information of all its members, including the unrelated List<String> returned by getList(). The issue was reported here, but it was concluded that this is not a bug. The workaround is to declare the variable as A<?> instead of A on the second-to-last line of main. This somehow triggers javac's generic nerve and all will work. If you ask me, this is nonsense.
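For comparison, here is a minimal sketch of the wildcard workaround applied throughout. It is a variation on the snippet above rather than the exact original: the names RawTypeDemo and collect are made up for illustration, and getList() is given one element so the loop has something to visit.

```java
import java.util.ArrayList;
import java.util.List;

class RawTypeDemo {
    static abstract class A<T> {
        protected T a;
        public T get() { return a; }
        public List<String> getList() {
            List<String> names = new ArrayList<String>();
            names.add("value=" + a);
            return names;
        }
    }

    static class B extends A<Integer> {
        B(int k) { a = k; }
    }

    // Hypothetical helper: iterates through A<?> references, which keeps
    // the List<String> return type of getList() intact.
    static List<String> collect(List<A<?>> items) {
        List<String> out = new ArrayList<String>();
        for (A<?> item : items) {
            for (String s : item.getList()) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String... args) {
        List<A<?>> list = new ArrayList<A<?>>();
        list.add(new B(1));
        list.add(new B(2));
        System.out.println(collect(list));
    }
}
```

Because no raw A appears anywhere, the compiler never drops the String in List<String>, and the inner for loop type-checks.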
Posted Time: 2009-04-14 16:05:03
Blah Blahs On Software Engineering
Almost all engineering disciplines involve architects and producers to design and produce, respectively. Such separation is not always clear when it comes to software engineering, and I believe the reason for this is the loose definition of the term "software production". What exactly is software production? How does one know when software production begins? Or when it ends? For other engineering disciplines, perhaps it's the digging of the foundation that marks the beginning of construction; perhaps it's the demolition of existing structures; perhaps it's the moment when the assembly lines are turned on. If we look at all the previous examples, production seems to mean the process of taking an initial design or a prototype of the product and producing one or more copies of it.
Is it sensible to call the replication of CDs software production? Or the duplication of bits across a network? Obviously not. So software production does not involve mass reproduction, for that is done trivially. This implies that if there is such a thing as software production, it happens before shipment. I claim that the implementation process is the closest and most reasonable thing to be called software production. With that definition, we see a problem immediately. All other engineering disciplines conduct tests before production, whereas most software tests happen after (or during) production. In fact, it is not trivial to test a piece of software without the software. But by the time the software is written and can be tested fully, the production stage is over.
One might wonder what the problem is with testing after production: as long as the product is tested before shipment, what difference does it make? I am certain that no one would raise such an argument about vehicle production or building construction, because it is an economic disaster to build the thing before testing it, given that these products are tangible and very hard to change. The idea many people have about software is that it is written, it is intangible, and it can be changed whenever we want without much effort. This idea is dead wrong. Even with the endless effort to make code more generic, reusable and flexible, large software systems are still very hard to change, and sometimes impossible to change without virtually rewriting the whole thing. This makes modification during production very expensive. In addition, because of the incredible freedom we have during software design, we tend to be less cautious and more idealistic. This optimism, plus the lack of reinforcement from tests, translates into a large number of logical faults during production when reality hits, which in turn becomes fatal to the project's success.
So why do we not test the design fully before heading into implementation? I believe there are several reasons. Firstly, it might appear that writing design documentation takes almost as long as writing actual code, so it seems stupid to write documents instead of doing actual work. This is true. In fact, sometimes writing documentation can take longer than writing the corresponding code. The difference kicks in when there is a change to be made. Changing documentation is far easier than changing an implementation, so catching as many logical faults as possible during design is much cheaper than catching them later. This is a known fact published in many books.

Secondly, frequent changes to requirements may cause endless changes to the design, to the point where it seems the project will hit its deadline without a single line of code written. This is a reasonable fear, and it can actually happen if the necessary actions are not taken. With today's pace in the IT industry, if one does not freeze the requirements at some point in time, nothing will ever get done. By freezing requirements, I do not mean telling the user that this is it and no more requirements can be added or modified. What I mean is that the requirements gathered so far are frozen and sent off for the design work to be based on. Any further modifications to the requirements will happen in the next version, and it is important to make users aware of that. This allows requirement engineers to gather information on version 3, while architects design version 2 and implementers build version 1.

Thirdly, doing a full software design and testing the design properly is hard. It requires the architects to visualize the whole system interface and play out all use cases of the product. Implementation designers need to lay out the structure and workings of every module and its interfaces, while making sure that no obvious performance bottlenecks exist, that the system does what the architecture requires, and that the system is extensible, scalable, secure, and stable. Doing all of this right without actual code is very difficult. It is time consuming and error prone.
With the fortunate collapse of the waterfall model quite a while ago, almost all new development models try to push the test phase as early into the cycle as possible. This is obviously a good trend. Things like agile development and test driven development embed testing into 'production' and try to find problems as soon as possible. The issue is that these tests are still testing actual code, meaning problems will not be found until they are implemented, which means they are still hard to fix. But given the ambiguous nature of natural language, and that most design work is done in natural language, it is difficult, if not impossible, to test the logic in a design. Let us assume for the moment that we possess a formal language that is precise and can be used to do design work; then we can perform logical evaluations on the design. Furthermore, we can translate the formal language into an existing programming language automatically, also known as compiling it. At this point, it is probably clear that the 'formal language' is nothing but another programming language.
Maybe the only way to improve the software development process is to invent yet another new programming language?
Posted Time: 2009-01-13 16:29:17
Java Generics
It is not my intent to insult or make fun of generics in Java. But the nature of its stupidity and annoyance begs me to give it some harsh words. And I can justify all insults with a few simple examples.
The first one is that you have to provide an array of type T every time you turn a list of type T into an array of type T, which is very stupid. The exact code snippet being:
List<T> oldList = new ArrayList<T>();
T[] newArray = oldList.toArray( magic );
Ok so what's that magic? The magic is a T[]. Except, you can't always have one. This is because if you only have the generic type T, you cannot instantiate an array out of it, hence no magic for you, hence no newArray for you.
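One common way around the missing magic is to pass a Class<T> token along and let reflection build the array. The sketch below assumes such a token is available and that T is a reference type; the names ToArrayDemo and elementType are made up for illustration.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

class ToArrayDemo {
    // Builds a T[] from a list, using a Class<T> token because T itself
    // cannot be instantiated as an array.
    static <T> T[] toArray(List<T> list, Class<T> elementType) {
        @SuppressWarnings("unchecked")
        T[] target = (T[]) Array.newInstance(elementType, list.size());
        return list.toArray(target);
    }

    public static void main(String... args) {
        List<String> words = new ArrayList<String>();
        words.add("a");
        words.add("b");
        String[] array = toArray(words, String.class);
        System.out.println(array.length);
    }
}
```

This does not remove the annoyance, it only moves it: the caller still has to hand over String.class, which is the same kind of repetition the factory example runs into.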
The second example is the infamous factory. Consider the following factory class that can create us an instance of almost everything:
public class Factory<T> {
    private Class<T> classVar;

    public Factory( Class<T> c ){
        classVar = c;
    }

    public T create(){
        try{
            return classVar.newInstance();
        }catch(...){...}
    }
}
If anyone were to use this brilliant Factory<T> to make something out of it, he would have to type the class name of the object three times.
Factory<StringBuffer> f = new Factory<StringBuffer>(StringBuffer.class);
// Maybe StringBuffer doesn't annoy you enough, what about this?
Factory<BasicPopupMenuSeparatorUI> f2 = new Factory<BasicPopupMenuSeparatorUI>(BasicPopupMenuSeparatorUI.class);
Stupid? Yes. Efficient? Not exactly.
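The repetition can be trimmed a little with a static factory method, since the compiler infers T from the argument. This is only a sketch: the of(...) method is a hypothetical addition, and the exception handling is one possible choice filled in so the example runs, not necessarily what the elided catch block did.

```java
class Factory<T> {
    private final Class<T> type;

    private Factory(Class<T> type) {
        this.type = type;
    }

    // Hypothetical helper: T is inferred from the argument, so the class
    // name is not repeated in a constructor call.
    static <T> Factory<T> of(Class<T> type) {
        return new Factory<T>(type);
    }

    T create() {
        try {
            // Requires an accessible no-argument constructor.
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String... args) {
        Factory<StringBuffer> f = Factory.of(StringBuffer.class);
        StringBuffer buffer = f.create();
        buffer.append("made by the factory");
        System.out.println(buffer);
    }
}
```

That brings the count down from three mentions of the class name to two. The class literal itself cannot go away, because the factory needs something that still exists at run time.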
The root cause of these maddening constraints is that Java generics are implemented by type erasure: the compiler checks the type arguments and then erases them, inserting casts where needed, so T no longer exists as a type at run time. In that sense they behave almost like macros in C, which are expanded away before the real compilation happens. Harshly put, it's syntactic sugar, not a language feature. Correct me if I am wrong.
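Erasure is easy to observe directly. The sketch below (ErasureDemo and sameRuntimeClass are illustrative names) compares the runtime classes of two lists with different type arguments.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Returns true when two differently parameterized lists share one
    // runtime class, which is what type erasure implies.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String... args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

Both objects are plain ArrayList instances at run time, which is why neither new T[n] nor T.class can work.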
Posted Time: 2008-08-27 18:55:24
A Deeper Look into Injection and Relationship
The principle of least knowledge is a fairly well known and ancient rule in the field of software design and modularization. Suppose we have objects A, B, C and D, each of which knows only about the object named next in the alphabet, so essentially A->B->C->D. If A is to use a feature in D, it has to go through B and C, which forces it to be aware of C and breaks the principle of least knowledge. Actually, the fact that A needs to use a feature in D already breaks the principle. The point of enforcing this principle is that objects that are not logically related should be easily separable, creating a simple network of relations between classes instead of a tangled jungle of webs. It does sound like a useful idea. Hence many design patterns emerged to tackle this issue, such as Factory, Facade and Bridge, just to name a few, and of course, dependency injection.
This entry should really have started here, but I figured a little background paragraph would not hurt. Dependency injection essentially allows a class to passively take in its dependencies as parameters instead of aggressively going out to obtain them during its own execution. A very simple analogy is the difference between a baby that crawls around eating whatever he finds and a baby that sits there and eats whatever you provide him. Certainly the one that sits there and gets fed is much more manageable and predictable than the one that crawls around, and that is exactly the property we need a module to have in a piece of software. Sounds like injection is the way to go at all times then? No, not really. I shall continue using the analogy to explain the reason for that.
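In code, the well-fed baby looks roughly like the sketch below. Clock and Stamper are made-up names for illustration; the point is only that Stamper never goes looking for its dependency.

```java
class InjectionDemo {
    interface Clock {
        long now();
    }

    // Passive style: the dependency arrives through the constructor
    // instead of being looked up or created inside the class.
    static class Stamper {
        private final Clock clock;

        Stamper(Clock clock) {
            this.clock = clock;
        }

        String stamp(String message) {
            return clock.now() + ": " + message;
        }
    }

    public static void main(String... args) {
        // The caller decides which Clock to feed in, here a fixed one.
        Clock fixed = new Clock() {
            public long now() { return 42L; }
        };
        System.out.println(new Stamper(fixed).stamp("hello")); // prints "42: hello"
    }
}
```

Since the caller controls what gets injected, swapping in a fixed clock for testing requires no change to Stamper.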
The problem becomes fairly obvious when you have not one baby, but thousands to tens of thousands. If you are a single parent, you have two approaches: prepare all the food the babies need and hand it out at the same time to be fair; or hand out food to each baby as soon as the necessary amount is prepared, which causes unfair starvation. In both cases, it is a lot of work for the parent, and chances are a lot of the babies will die from starvation. This is exactly the problem dependency injection can run into. The instantiation of all the dependencies is done by the administrator, which then provides them to its workers. This kills any possibility of lazy instantiation and can potentially cause memory issues, since the dependencies are now in the scope of the administrator instead of the workers.
The fix is fairly simple: inject a factory that will build the workers' dependencies for them, instead of providing built instances. It is true that this approach adds extra dependencies to the system, since the workers now depend on both their original dependency AND the factory to function, but the advantage is that the administrator class only needs one instance of the factory, and lazy instantiation becomes possible. But I suppose not all solutions are feasible at all times, so when the situation calls for it, sometimes one has no choice.
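A minimal sketch of the injected factory, assuming java.util.function.Supplier plays the role of the factory. Connection, Worker and the built counter are made up for illustration.

```java
import java.util.function.Supplier;

class LazyInjectionDemo {
    static int built = 0; // counts how many dependencies were created

    static class Connection {
        Connection() { built++; }
        String query() { return "result"; }
    }

    // The worker receives a factory and builds its dependency on first use.
    static class Worker {
        private final Supplier<Connection> factory;
        private Connection connection;

        Worker(Supplier<Connection> factory) {
            this.factory = factory;
        }

        String work() {
            if (connection == null) {
                connection = factory.get();
            }
            return connection.query();
        }
    }

    public static void main(String... args) {
        // The administrator holds one factory and shares it with every worker.
        Supplier<Connection> factory = Connection::new;
        Worker idle = new Worker(factory);
        Worker busy = new Worker(factory);
        busy.work();
        System.out.println(built); // prints "1": the idle worker built nothing
    }
}
```

Only the worker that actually does something pays for a Connection, and each Connection lives in its worker's scope rather than the administrator's.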
Posted Time: 2008-08-25 12:42:23
Defense of the Gentoos
The Gentoo, being the fastest penguin species, is indubitably my favourite animal. Gentoos have been around for some 14 million years, they can swim at speeds of 35 km per hour, and they flirt by offering nice stones that can be used to build nests. But you may wonder why a software engineer is blogging about penguins. Well, because Gentoo is not only my favourite animal, it is also my favourite operating system.
At this point, Ubuntu fans could be rolling their eyes and hitting that back button to avoid reading the non-binary nonsense I am about to spit out; Mac fans could be childishly laughing at my utter ignorance; and Windows fans probably don't have a clue what I am talking about. It is indeed true that Gentoo is not the most popular distro: it has far fewer users than Ubuntu and probably even fewer than Fedora. It is also true that the Gentoo community went through some fairly dramatic problems in the past couple of years, with the departure of the founder and various key developers retiring from the project. Even with these heavy blows to the distro, Gentoo was never behind other distros, let alone broken, as some claimed. As Ms. Linton wrote on linux.com [ref], with the release of 2008.0, Gentoo may be stepping back into the spotlight again.
Like many of the Digg users here, I have been using the distro for 3 years. It is not the easiest thing to configure and set up correctly, and it is not a distro that will just work; sometimes it does take forever to compile, and sometimes things will stop and give you a compilation error that reads like machine code. But such instances are not frequent; in fact, they are almost always caused by user error. The claim that Gentoo is slow because everything is compiled from source is utter nonsense. With today's CPU speeds, a complete Gentoo system can be brought online from scratch in about 6 to 7 hours. I personally only truly installed Gentoo once, in 2005. Ever since then, I have never had the need to "reinstall" or "upgrade" the system, since everything was continuously up to date. All that was needed was monthly package updates that I can leave running overnight while I sleep. In fact, I have cron jobs set up to do these updates in the background automatically without me needing to know about it. Yes, very much like Windows in that sense.

Gentoo also really highlights the benefit of a source based distro: it allows very nice customization when building your system. You can tweak the compilation dependencies so you only install the things you need and avoid garbage piling up on your system. This can give someone a very strong sense of ownership, since no 2 Gentoo setups are identical. You can go anywhere from an X11-less, text based, barebones server to a full blown Gnome powered desktop environment with fancy eye candy. For instance, the instant messenger Skype is written on the GUI toolkit Qt. I compiled it with the 'static qt library' option, which fully encloses the Qt library within the Skype application, so that I don't have Qt floating around in my system, for tidiness, since none of my other GUI apps need it. Also, I run the very lightweight fluxbox window manager, but I also use tools that come with the XFCE and Gnome desktop environments, except I don't need to fully install those 2 environments to make use of the tools I need. All this flexibility and freedom is a consequence of being a source based distro. I believe this is something no binary distro can match.
Posted Time: 2008-08-19 17:34:43