Feature

Oh PHP empty…

Salah satu kelemahan terbesar PHP menurut saya adalah di null / value checking. Melihat bagaimana PHP adalah interpret language berarti validasi undefined value tidak dapat dilakukan di level compiler. Lalu bagaimana PHP adalah dynamic typing juga membuat null checking lebih sulit.

Null / Undefined

Ada 2 jenis variable kosong, yaitu null dan undefined. Variable null berarti variable tersebut pernah di-declare, dan di-initialize nilainya sebagai NULL. Undefined variable, adalah variable yang tidak pernah di-declare atau sudah di-declare namun tidak pernah di-initialize. Contoh:

$null = NULL; //null variable
$null2; //masih undefined
$null = $null + 1; //nilai $null jadi 1
$null = NULL;
echo $null["key"] === NULL ? "true" : "false"; //hasilnya true
$undefined = $undefined + 1; //error undefined variable

Lucunya di PHP, NULL dianggap sebagai sebuah value. Hal ini membuat operasi matematik / string yang berinteraksi dengan NULL tidak error, seperti contoh penambahan di atas. Untuk operasi matematik, NULL dianggap 0 dan untuk operasi string maka null dianggap string kosong. Bahkan untuk array, NULL hanya akan menghasilkan nilai NULL lagi dan tidak error.

empty / isset

Tadi kita sudah melihat bagaimana null dianggap sebagai value. Tetapi tidak pada pengecekan empty / isset. Anehnya lagi, kedua operasi tersebut tidak 100% berlainan dengan arti kata !empty == isset. Anggap kita menggunakan satu set variable sebagai berikut untuk mengecek validasi empty / isset:

$one = 1;
$two = 0;
$three = "";
$four = " ";
$five = "0";
$six = "false";
$seven = false;
$eight = [];
$null = NULL;
$null2;

Snippet untuk mengecek empty

echo 'empty($one) ';
echo empty($one) === true ? "true" : "false";
echo "\n";
echo 'empty($two) ';
echo empty($two) === true ? "true" : "false";
echo "\n";
echo 'empty($three) ';
echo empty($three) === true ? "true" : "false";
echo "\n";
echo 'empty($four) ';
echo empty($four) === true ? "true" : "false";
echo "\n";
echo 'empty($five) ';
echo empty($five) === true ? "true" : "false";
echo "\n";
echo 'empty($six) ';
echo empty($six) === true ? "true" : "false";
echo "\n";
echo 'empty($seven) ';
echo empty($seven) === true ? "true" : "false";
echo "\n";
echo 'empty($eight) ';
echo empty($eight) === true ? "true" : "false";
echo "\n";
echo 'empty($null) ';
echo empty($null) === true ? "true" : "false";
echo "\n";
echo 'empty($null2) ';
echo empty($null2) === true ? "true" : "false";
echo "\n";
echo 'empty($undefined) ';
echo empty($undefined) === true ? "true" : "false";
echo "\n";

Snippet untuk mengecek isset

echo 'isset($one) ';
echo isset($one) === true ? "true" : "false";
echo "\n";
echo 'isset($two) ';
echo isset($two) === true ? "true" : "false";
echo "\n";
echo 'isset($three) ';
echo isset($three) === true ? "true" : "false";
echo "\n";
echo 'isset($four) ';
echo isset($four) === true ? "true" : "false";
echo "\n";
echo 'isset($five) ';
echo isset($five) === true ? "true" : "false";
echo "\n";
echo 'isset($six) ';
echo isset($six) === true ? "true" : "false";
echo "\n";
echo 'isset($seven) ';
echo isset($seven) === true ? "true" : "false";
echo "\n";
echo 'isset($eight) ';
echo isset($eight) === true ? "true" : "false";
echo "\n";
echo 'isset($null) ';
echo isset($null) === true ? "true" : "false";
echo "\n";
echo 'isset($null2) ';
echo isset($null2) === true ? "true" : "false";
echo "\n";
echo 'isset($undefined) ';
echo isset($undefined) === true ? "true" : "false";
echo "\n";

Hasilnya

empty($one) false
empty($two) true
empty($three) true
empty($four) false
empty($five) true
empty($six) false
empty($seven) true
empty($eight) true
empty($null) true
empty($null2) true
empty($undefined) true

isset($one) true
isset($two) true
isset($three) true
isset($four) true
isset($five) true
isset($six) true
isset($seven) true
isset($eight) true
isset($null) false
isset($null2) false
isset($undefined) false

Untuk isset logikanya cukup mudah, semua variable yang sudah di set dengan nilai apapun terkecuali NULL menghasilkan true. Untuk empty cukup beragam, sebuah variable dianggap empty bila bernilai string kosong($three), string 0 ($five), boolean false ($seven), int 0 ($two) dan array kosong ($eight).

Encapsulate in function

Khusus untuk validasi undefined, saya belum menemukan cara lain kecuali dengan empty atau isset. Contoh dalam snippet kode berikut:

function isNullOrEmpty($str){
	return !(isset($str) && $str != '');
}
echo 'isNullOrEmpty($six) ';
echo isNullOrEmpty($six) ? "true" : "false";
echo "\n";
echo 'isNullOrEmpty($null) ';
echo isNullOrEmpty($null) ? "true" : "false";
echo "\n";
echo 'isNullOrEmpty($undefined) ';
echo isNullOrEmpty($undefined) ? "true" : "false";
echo "\n";

Hanya kode yang menggunakan variable $undefined saja yang menghasilkan PHP notice / error, lainnya tidak. Hal ini berarti kita tidak dapat meng-encapsulate function empty ke beberapa function validasi lain, seperti stringIsNullOrEmpty dan stringIsNullOrWhitespace.

Kesimpulan dan saran

Variable null dan undefined di-handle berbeda oleh PHP. Function empty dan isset juga menghasilkan validasi yang berbeda. Khusus untuk validasi variable undefined, saya belum menemukan function selain empty dan isset yang dapat meng-handle nya.

Sebagai programmer yang sudah lama menggunakan static type dan compiled language (C#), saya sudah terbiasa untuk melakukan variable declaration sehingga kemungkinan terjadinya undefined variable lebih kecil. Cukup baik untuk terbiasa melakukan variable declaration di awal function terlebih dulu untuk meminimalisir validasi empty / isset.

Pergunakan isset bila ingin me-validasi false dan empty value. Pergunakan empty bila sudah tahu kriteria-kriteria validasi yang dilakukan empty.

Advertisements

Why I avoid ORM in my enterprise architecture

Yesterday I have a nice, interesting chat with someone I just met. In those discussion about software development and technology, he asks whether I use Entity Framework for my current architecture or not. I was saying it clearly, I’m not using any kind of ORM nor develop one myself. Before I continue, I want to clarify that I’m not hating the ORM or anything. With my current skill and experience, I’m just not ready for the edge cases (or to be precise, corner cases) that can occur when using ORM.

When asked with such question, I had hard time to give reasons about it. One of the reason is that there will be hard to find developer with expert knowledge of particular ORM, rather than expert knowledge in stored procedure-based command execution. The other is, ORM has leaky abstraction. I’m not saying that other than ORM or any tools that I used until now has none leaky abstraction, but I don’t think it’s worth the effort to do the workaround for ORM’s corner cases, compared to matured, traditional stored procedure and query.

Not all Linq expression is supported to be translated into sql

This statement is primarily based on article from Mark Seeman: IQueryable is a leaky abstraction. In that article, he state that there are some unsupported expression exposed from IQueryable. He states that the unsupported expression itself violates LSP. Moreover, quick search at google shows some stackoverflow questions asking for the NotSupportedException. One of them shows inconsistency with linq2sql, where when someone use string.Contains the ORM throws exception meanwhile using join the expression executed. Another post says the same about EF.

So as we can see, the IQueryable interface used by ORM give us false guarantee or false positive, just because the code compiles but produce runtime exception later. The same case can also happen with SqlCommand. However the error is clear, that it’s either: 1) different sql data type provided to SqlParameter, 2) incorrect parameter provided, 3) wrong sql syntax. However it is the opposite with in-memory IEnumerable lambda/linq, where the expression is fully supported.

Now we have additional step to measure sql performance

ORM translate query expression into sql queries. If you do not master the specific ORM, you won’t know what kind of sql query produced. Moreover, different ORM produced different query. Now if we concern about the sql performance, we have 2 steps to do. First step is to determine which exactly sql query resulted from ORM, while the other is real measurement with indexes, plan, etc.

Common case is N+1 selects issue with ORM in which handled differently by different ORM, and impact the performance.

Data annotation breaks POCO and SRP

This is specific to EF with data annotation. If you prefer to use fluent api, great on you.

Data annotation break POCO, nuff said. Based on wikipedia, POCO or Plain Old CLR Object is simple object which does not have any dependency to other framework/plugin/tools. Even Serializable and XmlIgnore data annotation breaks POCO, and I’ve already stated that cluttering the class to make it xml serializable is usually bad.

It also breaks SRP. Now that your POCO class has another responsible coupled with it. The additional responsible is handling the way ORM mapping from table to the object. It isn’t wrong, but it’s not clean. The worst part is, you need to modify your POCO class to meet mapping requirement.

The original underlying structure of database is relational

Original post by Martin Fowler states that the root of main problem is the difference in underlying structure between application (OOP) and database (RDBMS). The way they handle relations is different and the way they are processing data is also different. Even Jeff Atwood also said that the ORM problem will be clear by either you remove the “object” aspect or “relational” aspect, either by following table structure in application or using ODBMS. ODBMS is great, but it also has cons compared to RDBMS.

When and how many times the query being executed?

I don’t know how good you are with IQueryable abstraction. However even with in memory IEnumerable, there are many times i’ve caught with bug where the iterator is executed multiple times, resulting in inconsistent data properties and increasing cost. With IEnumerable / IQueryable ORM, I don’t know when and how many times exactly the sql query being executed, and it can impact performance considerably.

Additional sources where they have documented the issue

These sources are more like general statements or even the detailed technical one, so I don’t follow it one by one. But it shows you the lists of current problem faced by ORM.

http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch

http://programmers.stackexchange.com/questions/120321/is-orm-an-anti-pattern

Conclusion

As a developer / architect that concern with clean code, consistency, coding standard and convention, I don’t think that ORM will suit me. There is simply too many corner cases that can’t be handled by current ORM, and I don’t like (and don’t have time) to document every cases that cannot be handled, and the solution or workaround. It’ll be a pain to teach to the new developer your ORM case handling standard. Moreover it can cause you more time to fix the problem caused by the ORM rather than it’s benefit.

However if you find yourself confident that you can handle the corner cases, or if you exactly know that the application won’t face those corner case with ORM, then it’s a great tool to be used.

DataBinding is Neat

Data binding is a nice feature and powerful. I am quoting what Martin Fowler has said in his article about state:

One copy of data lies in the database itself. This copy is the lasting record of the data, so I call it the record state. A further copy lies inside in-memory Record Sets within the application. Most client-server environments provided tools which made this easy to do. This data was only relevant for one particular session between the application and the database, so I call it session state. The final copy lies inside the GUI components themselves. This, strictly, is the data they see on the screen, hence I call it the screen state.

Data binding primarily used to solve the synchronization problem between state in GUI (screen state) with application memory state. Any change happen at GUI reflected to memory and vice versa. It is the ideal solution for developer who do not like to do anything twice or multiple times.

Why is databinding is an ideal solution for us developers?

  1. Less code for developers
    It is the absolute benefit for developers since less code means less work, less review and less debugging. As most of developer is often quoted with: It seems that perfection is attained not when there is nothing more to add, but when there is nothing more to remove. In rough understanding, it means faster development.
  2. Consistent
    The code used for handling data binding is same for entire system/framework. It means one solution for a case is also applicable for the same case in other place. Also in some other words, the consistency itself is DRY, since it follows “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”.
  3. Flexible
    A good databinding feature can give you more flexibility. You can do modification with the display easier and nice, with less code to do.
  4. Encourage Single Responsibility Principle
    It helps to decouple your viewmodel with your UI. Databinding can help to remove the responsibility to handle UI update from your viewmodel, thus encourage SRP.

Conclusion

Databinding can help you to solve synchronization problem between UI state and memory state, and a nice feature to have. If you are doing intensive data entry with updated UI here and there, it can provide you with flexibility and ease of use to freely modify the display.

But does it have any drawback aside from it’s powerful functionality?