Author: Fendy H

Mutability: Array

Mutability in programming is the ability of an object to have its state or value changed. An immutable object, on the other hand, cannot have its state changed. In PHP, JavaScript, C# and Java, most commonly used variables and objects are mutable. Arrays are among them.

Array mutability

Let’s see the following snippet of code:

let exec1 = function() {
    console.log("START: EXEC 1");
    let a = [3, 5, 7];
    let wrongOperation = function(arr){
        arr.push(8);
        arr[1] = 4;

        return arr;
    };

    let b = wrongOperation(a);
    console.log("b:");
    b[2] = 6; // mutation
    console.log(b); // results [ 3, 4, 6, 8]

    console.log("a:");
    console.log(a); // results [ 3, 4, 6, 8]
    console.log("DONE: EXEC 1");
    console.log();
};

We can see that any modification to b also changes a. Sometimes that is expected, and other times it becomes a bug. Suppose the result b is returned and then modified by another caller: some time later you may wonder why the content of a changed, and it will be hard to track down where the change happened.
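The reason is that a function like wrongOperation returns the very same array object it received, so a and b are two names for one array. A minimal sketch to confirm this (the name returnsSameArray is made up for illustration):

```javascript
// Hypothetical helper: mutates its argument and returns it, like wrongOperation.
let returnsSameArray = function(arr) {
    arr.push(8);
    return arr;
};

let original = [3, 5, 7];
let returned = returnsSameArray(original);

// Strict equality on arrays compares references, not contents.
console.log(returned === original); // true: both names point to one array
console.log(original.length);       // 4: the caller's array grew as well
```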

A better design

A better design is to prevent changes made to b from being reflected back to its original owner, a. This can be achieved by copying the array argument into a new, empty array, using concat in JavaScript or array_merge in PHP. See the example in the following snippet:


let exec2 = function() {
    console.log("START: EXEC 2");
    let a = [3, 5, 7];
    let correctOperation = function(arr){
        let result = [].concat(arr);
        result.push(8);
        result[1] = 4;

        return result;
    };

    let b = correctOperation(a);
    console.log("b:");
    console.log(b); // results [ 3, 4, 7, 8]

    console.log("a:");
    console.log(a); // results [ 3, 5, 7]
    console.log("DONE: EXEC 2");
    console.log();
};

The example above shows the operation copying the argument array first, by concatenating it onto an empty array, before doing anything else. As a result, any modification made inside or after the function is not reflected back to the original variable a.

Another example can be like this:


let exec3 = function() {
    console.log("START: EXEC 3");
    let a = [3, 5, 7];
    let anotherCorrectOperation = function(arr){
        let newData = [];
        for(let i = 2; i < 4; i++){
            newData.push(i);
        }
        return arr.concat(newData);
    };

    let b = anotherCorrectOperation(a);
    console.log("b:");
    b[1] = 4; // test mutability
    console.log(b); // results [ 3, 4, 7, 2, 3]

    console.log("a:");
    console.log(a); // results [ 3, 5, 7]
    console.log("DONE: EXEC 3");
    console.log();
};

The example above does its work first, then returns the result combined with the existing array. This is preferred over pushing directly to the argument array in a for loop.
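concat is not the only way to get such a copy. As a small sketch, the standard slice, spread syntax and Array.from produce the same kind of shallow copy (the variable names here are only for illustration):

```javascript
let source = [3, 5, 7];

let viaConcat = [].concat(source);
let viaSlice = source.slice();
let viaSpread = [...source];
let viaFrom = Array.from(source);

// Each copy is a different array object holding the same elements.
let copies = [viaConcat, viaSlice, viaSpread, viaFrom];
copies.forEach(function(copy) {
    copy.push(8); // only the copy grows
});

console.log(source);    // [ 3, 5, 7 ]
console.log(viaSpread); // [ 3, 5, 7, 8 ]
```

Which one to use is mostly a matter of taste; concat has the advantage of copying and appending in one call.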

It’s still only the array that is copied

However, both preferred examples above only copy the array itself, detaching it from the original reference. The contents are still the same objects and can be modified. For example, if the array contains JavaScript objects, any modification to an array member is reflected back to the original variable a:

let exec4 = function() {
    console.log("START: EXEC 4");
    let a = [{v: 1}, {v: 3}];
    let anotherCorrectOperation = function(arr){
        let newData = [];
        for(let i = 2; i < 4; i++){
            newData.push({v: i});
        }
        return arr.concat(newData);
    };

    let b = anotherCorrectOperation(a);
    console.log("b:");
    b[1].v = 4; // test mutability
    console.log(b); // results [ { v: 1 }, { v: 4 }, { v: 2 }, { v: 3 } ]

    console.log("a:");
    console.log(a); // results [ { v: 1 }, { v: 4 } ]
    console.log("DONE: EXEC 4");
    console.log();
};

So you still need to be careful when modifying variables inside functions. At least you no longer need to worry when modifying the array itself.
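If the members have to be protected as well, they need to be copied too. A minimal sketch, assuming the members are flat objects like {v: 1} (the helper name copyWithMembers is hypothetical):

```javascript
let a = [{v: 1}, {v: 3}];

// Hypothetical helper: copies the array and each object in it, one level deep.
let copyWithMembers = function(arr) {
    return arr.map(function(item) {
        return Object.assign({}, item);
    });
};

let b = copyWithMembers(a);
b[1].v = 4; // test mutability

console.log(b); // [ { v: 1 }, { v: 4 } ]
console.log(a); // [ { v: 1 }, { v: 3 } ]
```

This still copies only one level: objects nested inside the members would be shared again. Newer runtimes also offer structuredClone for a full deep copy.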

Conclusion

Design your operations to be immutable and to return a copy by default, unless the other behavior is actually desired. This helps make code easier to trace and more modular, and prevents unnecessary bugs in the future. All code shown in this article can be retrieved from my GitHub repository.

Why you should give buffer to project estimation

“Why does this enhancement need 1 month? Can’t it be done in just 3 weeks?”

Project managers often say this to developers. They like to squeeze estimates as short as possible, then wonder in the end why the project is running late. Overtime happens, code quality drops, and bugs show up everywhere from development until launch. As a developer, I see this as one sign of an “incapable” project manager. Furthermore, clients often share the same goal: to cut the estimate as short as possible. Little do they know that an estimate with buffer brings more benefits than a shorter one.

Less overtime

Overtime is unproductive. This has been researched many times, and you can easily find those studies on the internet or in forums. Obviously, more buffer in the estimate means less overtime is needed, which in turn prevents productivity from dropping in the long run.

More often than not, I find that unjustified overtime doesn’t bring better results over a week or a month compared to working normally. That’s because the work done in overtime is usually wrong, of bad quality, or doesn’t match the requirements, so it ends up being fixed in the following week or month.

On the other hand, justified overtime is a necessity, like when there is a critical bug in production, or when a bug is found at the last minute before launch.

No spare time for changes

Clients are very bad at giving requirements. They don’t know what solution they want. They may not even know what problem they are facing. In my experience, at least one small requirement change is practically certain, and there is at least a 50% chance of a medium or big requirement change during the project timeline.

To this day, prototyping is very useful, especially in the programming world. Usually a prototype is made to give the client a clear picture of how the program will work and how it will help solve their problem. Requirement changes are highly likely at this point.

A tight estimate is a disaster at this point. With a tight estimate, no change can be fitted into the timeline, since everything has already been accounted for. This can only lead to bad things: a missed deadline, requirements that can’t be implemented, or bad quality.

No room for mistake

Mistakes happen; no software in this world is bug-free. Even the business flow, a requirement given by the client, may be incorrect. Setting a tight estimate without accounting for mistakes and fixes is planning to fail. More buffer in your estimate means the application can be tested more, and more problems can be fixed in the meantime.

You will never deliver “earlier”

There are two possible situations when delivering: deliver “early” or deliver “late”. Delivering exactly “on time” is almost impossible. Usually the project is already done and ready to be delivered before the agreed date, so it counts as “early”.

Now which one is better: “longer estimate but deliver ‘early’” or “shorter estimate but deliver ‘late’”? It’s up to personal preference, but I usually find that delivering “early” leaves a better impression than “late”. An “early” delivery is usually perceived as faster than a late one, which is a psychological thing.

With a tight estimate, you are setting yourself up to deliver “late” and never “early”.

It is indeed hard on the “client” side

Clients are hard to deal with. More often than not they aren’t satisfied with the estimate given, no matter how tight it is. Though they want a tighter estimate, they cannot give clear requirements. “Just make it so that we input this, that and that and it’s saved! How hard is it?” is what they usually say, without considering what must happen if an entry was made by mistake and needs editing, what happens when there are changes, which systems are affected by the change, and so on.

That’s why a good client brings good software as well. If you want to teach a client why a tight estimate is bad, give them a tight estimate. Then every time there is a change, and every time a discussion about requirements takes place, keep a record of it. After delivery, hold an evaluation session with the client and show them how all of those things delayed the schedule and made the project late.

The delivery time will be the same after all

Many times I find that a tight estimate ends up late, and the actual delivery date is usually about the same as if I had given a buffered estimate in the first place. Inexperienced developers and managers usually prefer to give tighter estimates, underestimating how much time changes and fixes will take. The question is, how much buffer do they need?

In my previous job, I liked to add a 30-50% buffer to my estimates. My PM would then try to bargain by cutting 20-30%, then give some buffer back for the QA and fixing phase. In the end I assume it was around a 25-40% buffer. With that, I usually delivered 10-20% early, which suggests 20-40% is a sweet spot, depending on how complex and big the project is. This is just my preference and personal experience; do not take it as guidance, since everyone estimates differently.

Now if the delivery date is the same after all, why not give a longer estimate and allow more flexibility in requirements and development? It will provide a better foundation and better code quality in the end.

Summary

Give buffer to your estimates. The benefits far outweigh the false perception that “a shorter estimate is faster”. You won’t be flexible, and you won’t get good quality, if the estimate is tight.

PHP Nested Ternary Operator Order

Now let’s say we have the following pseudo-code that we want to implement in several programming languages:

bool condition = true;
string text = condition == true ? "A" : condition == false ? "B" : "C";
print text;

Here is the implementation in Javascript:

let st = true;
let text = st === true ? "A": st === false ? "B" : "C";
document.write(text);

Here is the implementation in PHP:

$st = true; 
$text = $st === true ? "A": $st === false ? "B" : "C"; 
echo $text;

As a bonus, I also tried the same pseudocode in C#:

bool st = true;
string text = (st == true) ? "A" : (st == false) ? "B" : "C";
Console.WriteLine(text);

Now let’s guess the value of the variable text. The expected value is A, which is what the other languages produce, but PHP produces B, because PHP groups a chained ternary from the left instead of from the right. This is not a great discovery, but many PHP developers may be missing it, so I think it’s worth archiving. Parentheses fix it but make things ugly, so it looks like it’s better to stick with if-else statements.
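The difference comes down to how a chained ternary is grouped. The sketch below uses JavaScript with explicit parentheses to imitate both groupings; the second expression only imitates how older PHP reads the line, it is not PHP code, and the variable names are made up for illustration:

```javascript
let st = true;

// Grouped from the right: what JavaScript and C# do by default.
let rightGrouped = st === true ? "A" : (st === false ? "B" : "C");

// Grouped from the left: how older PHP versions read the same line.
// The inner ternary yields "A", which is truthy, so the outer one picks "B".
let leftGrouped = (st === true ? "A" : st === false) ? "B" : "C";

console.log(rightGrouped); // A
console.log(leftGrouped);  // B
```

Note that PHP 7.4 deprecated the unparenthesized form and PHP 8 rejects it with an error, so the surprise applies to older versions.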

Security is hard

The recent Meltdown and Spectre attacks show how hard implementing security is. In short, those two attacks take advantage of the CPU’s speculative execution to read memory that the attacker is not authorized to access. Patches were published by the respective vendors and OS makers soon after. However, the real issue is that the applied patches can reduce performance by up to 30%! This is what I want to raise in this article.

Trade-off

Ignoring programmer effort or development cost, a security implementation may or may not have a trade-off, but it is more likely to have one than not.

Take a security token for online banking as an example. It’s a security implementation that hurts UX (user experience) by adding one more verification step. In this case the trade-off is worth it, since it helps the user verify the input and prevents wrong transactions that would otherwise be too easy to make.

Asking the user for a username and password at every login is also a UX trade-off, which is why options such as “login with facebook”, “login with twitter” and so on have appeared lately. In the majority of cases, as in the recent Meltdown case, the trade-off is a performance drop caused by an additional verification step.

Trade-off vs Risks

Security flaws, after all, are just risks. Only when an attack is actually carried out does the flaw become a loss. Usually fixing a security flaw only brings a negligible trade-off (performance drop), so it’s better to implement the fix than not. Some examples: preventing SQL injection and XSS, storing passwords as salted one-way hashes, and using HTTPS are all common practice. They should be enforced because otherwise it would be too easy for an attacker to exploit the flaw and take advantage of it.
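To make one of those practices concrete, here is a minimal sketch of salted one-way password hashing, assuming a Node.js environment and its built-in crypto module. The helper names hashPassword and verifyPassword and the "salt:hash" storage format are made up for illustration, and the scrypt parameters are left at Node's defaults:

```javascript
const nodeCrypto = require("crypto");

// Hypothetical helper: returns "salt:hash", both hex encoded.
let hashPassword = function(password) {
    let salt = nodeCrypto.randomBytes(16); // fresh random salt per password
    let hash = nodeCrypto.scryptSync(password, salt, 64);
    return salt.toString("hex") + ":" + hash.toString("hex");
};

// Hypothetical helper: re-hashes the attempt with the stored salt and compares.
let verifyPassword = function(password, stored) {
    let parts = stored.split(":");
    let salt = Buffer.from(parts[0], "hex");
    let expected = Buffer.from(parts[1], "hex");
    let actual = nodeCrypto.scryptSync(password, salt, expected.length);
    return nodeCrypto.timingSafeEqual(actual, expected); // constant-time comparison
};

let stored = hashPassword("s3cret");
console.log(verifyPassword("s3cret", stored)); // true
console.log(verifyPassword("wrong", stored));  // false
```

The cost here is a few extra milliseconds per login, which is the kind of negligible trade-off meant above.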

However, given a performance drop of up to 30% in the latest case, and considering how complex a successful Meltdown attack is and how many preconditions it needs, the ratio of performance drop to risk can be considered high. In this case, there is an “advantage” in not fixing the security flaw, and simply hoping that the attacker either doesn’t target you, doesn’t attempt that specific attack method, or simply isn’t interested enough to spend their time on it.

However, the risk will always be there and attackers may get better and better tools to exploit the flaw, while at the same time we can hope for better and better fixes with lower trade-offs. In the end, it’s top-level management and developers who decide whether it’s better to patch right away or leave it as is.

After all, security is hard.

Debugging / learning is a scientific cycle

This is a little shower thought I had a while ago: when debugging, you are more or less going through a scientific cycle, though a simpler one than the real thing. A simple but complete scientific cycle consists of: observe, form a theory or hypothesis, perform an experiment, observe the results of that experiment, repeat. I know, both the actual scientific process and debugging are more complicated than that, but the general flow of each follows that cycle.

Do observation

When you encounter a bug or abnormality in a process, the very first thing you need to do is observe it. You will need to observe some (or all) of the following:

  • what happened during the specific process
  • what was the process doing
  • what is affected
  • what is the current result
  • what is the desired result
  • where is the difference
  • what is written in the log
  • is there any more information that you can gather?

The observation process leads to the next step: making hypotheses.

Making hypotheses

You will craft some hypotheses from the information gathered during observation. Some possible hypotheses:

  • it only occurs on requests with a specific value in a specific field
  • only when the user performs certain specific steps
  • when the machine is under high load
  • when there is an issue with the internet connection
  • when the machine doesn’t meet the recommended specification
  • when some of the apps are outdated
  • and so on…

If the observation did not yield enough information, the weakest hypothesis available is: the bug will happen if you perform the same steps, with the same data, on a system with the same configuration, perhaps at a specific time. No matter what your hypotheses are, the best experiment to perform next is to reproduce the issue.

Perform an experiment: reproduce the issue

This is one of the hardest steps of debugging: creating an environment that can reproduce the issue consistently. Many issues are specific to certain hardware, to concurrency or race conditions, to network problems or to hardware failures, and many other complex situations can produce them. But this hard work brings big rewards: you will understand the process better, it will be easier to determine the cause, and you can be sure that the fix really solves the issue.

Once you can reproduce the issue consistently, you can move on to the next step: add more logging, set up debugger tools and then continue with observation.

Do observation, make hypotheses and experiment again

With more information and the ability to reproduce the issue, you can repeat the cycle. Observation produces information, which is used to make hypotheses; you make a fix based on a hypothesis, observe whether the fix really solves the problem, make another hypothesis if the problem is still there, try another fix, and repeat.

In the last iterations, you may observe that a change to the application has fixed the problem. You will then start forming theories (hypotheses that are supported by facts from tests), and do more experiments to prove them. For example, you can revert the change to reproduce the same error under different conditions, or repeat the same steps with different data to ensure the fix is correct. If your theories are confirmed by more tests, the debugging process is complete.

Unit test

We can see that the debugging process above is complex and time-consuming, especially when you need to re-create the environment to reproduce the error every time. Unit tests are an amazing tool in this case.

With unit tests, you can experiment in a more isolated environment, easily tweak the data with mock objects, and replicate a situation or configuration (setting the time to a specific value, for example) to reproduce the issue. Once the issue has been reproduced, the test will fail or raise an error.

Then the fix you make is tested again until it produces the expected result, and the other existing unit tests help ensure it doesn’t break something elsewhere. Amazing, right?
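As a rough sketch of the “set the time to a specific value” idea, assume a bug report about weekend handling. The function isWeekend and the tiny check helper below are hypothetical; a real project would use a proper test framework instead of the hand-rolled helper:

```javascript
// Hypothetical function under test. The current time is passed in as an
// argument so that a test can pin it to a specific value.
let isWeekend = function(now) {
    let day = now.getDay(); // 0 = Sunday ... 6 = Saturday
    return day === 0 || day === 6;
};

// Tiny hand-rolled check helper, standing in for a real test framework.
let check = function(name, actual, expected) {
    if (actual !== expected) {
        throw new Error("FAIL: " + name);
    }
    console.log("PASS: " + name);
};

// Reproduce the situation with fixed dates instead of waiting for a weekend.
check("Sunday counts as weekend", isWeekend(new Date(2018, 0, 7)), true);
check("Monday does not", isWeekend(new Date(2018, 0, 8)), false);
```

Because the date is an argument, the experiment can be repeated at any time and gives the same result on every run.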

Conclusion

Debugging is more or less similar to how you perform a scientific experiment or research. It’s a repeating cycle of observation, hypothesis and experiment. Unit testing can help the process greatly since it creates an enclosed environment in which you can perform experiments under artificial conditions.

Why PHP is bad and why it isn’t

Nowadays many programmers consider PHP a very bad programming language. One example is this comic strip, saving the princess with programming languages. But why is PHP considered bad, and why is it so popular anyway?

The good

In general, PHP is a good language to start learning programming with.

It’s easy to set up and start

PHP is very easy to set up, especially for beginners. Just use XAMPP (for Windows) or LAMP (for Linux), drop the code in htdocs, and everything will go well. Just search Google for “hello world php xampp” or “hello world php lamp” and you’re good to go.

Furthermore, it’s one of the easiest languages to run on shared hosting, which makes it very easy to set up your own website.

It’s very forgiving

PHP is dynamically typed, meaning you don’t need to specify whether a variable is a string, an int, a specific class, etc. Its string concatenation operator is different from numeric addition, which makes it less ambiguous than JavaScript’s dynamic typing and avoids explicit type conversion. That makes it very easy for beginners to start with.
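For comparison, PHP uses . for concatenation and + for addition, so the operator states the intent. JavaScript uses + for both and lets the operand types decide. A small sketch of the JavaScript side (the variable names are only for illustration):

```javascript
let quantity = "1"; // e.g. a value read from a form field, so it is a string
let extra = 2;

console.log(quantity + extra);         // "12": + concatenates when one side is a string
console.log(Number(quantity) + extra); // 3: explicit conversion is needed to add
```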

PHP variables also work very well with HTML. Almost all scalar values can be printed to the screen with echo, while arrays and objects need special treatment.

Furthermore, using an undefined variable only results in a notice, which can easily be suppressed. But beware: both are considered “bad habits” in programming, so treat this as a learning aid. There are also other conditions that would result in an error in other languages but can easily be suppressed in PHP.

It’s both procedural and OOP

PHP supports both procedural and OOP code. It’s very common to start learning programming with procedural code and learn OOP next, and that’s easier when it’s in the same language.

Furthermore, PHP has C-like syntax, and there are many good languages with C-like syntax, like Java, C# and JavaScript. Its C-like syntax is an advantage over Python (which is also a good starting language) “if” you aim to move to those languages later.

Frameworks and tutorials are abundant

With so many frameworks and tutorials out there, you can search for any problem or topic you are currently working on and find pages of Google results. It’s very easy to find answers to PHP problems nowadays.

Furthermore, many PHP frameworks use the MVC (Model View Controller) pattern, which is one of the most common patterns in web programming. Learning them can ease the transition to other good languages with MVC frameworks, such as Java Spring MVC, C# ASP.NET MVC, Node.js MVC frameworks and many more.

Furthermore, PHP nowadays has Composer, which handles libraries as packages, something almost all newer languages have. And PHP has many CMSs, such as WordPress, which make creating web pages easy.

The bad

So why is PHP considered bad? Well, you need to be at least decent at programming to know its limitations and bad sides.

It is not strongly, statically typed

PHP started ages ago as a dynamically, weakly typed language for customizing HTML pages. To this day it still supports dynamic typing, while supporting some type hinting at the argument and property level. While dynamic typing is good for learning programming, it’s not good for complex business processes.

However, being an interpreted language means type hints are only checked when the code is executed. So we won’t get any type error until that portion of code actually runs, as opposed to Java/C# where it can be caught at compile time.

Moreover, PHP 7, even after getting scalar type hints for string and int, still has no generics for arrays. Without any means of type checking an array’s contents, it’s harder to enforce reliability, especially in business processes (accounting).

It doesn’t have native multithreading options

Without the additional “PThreads” component, PHP doesn’t have any option for multithreading. It isn’t that PHP cannot do multithreading; the problem lies in how “PThreads” works. It copies the current “process” state (loaded classes, etc.) into another process and executes them concurrently.

In my experience with PThreads for PHP 5.6 (maybe I just lacked some configuration, correct me if so), PThreads uses more memory than threads in other programming languages, notably C#, Java and NodeJs. Moreover, it’s harder to catch exceptions in and to debug the threads it spawns.

So it doesn’t support multi-core processing

For heavy background processes or batch processing, multi-core support is a requirement most of the time.

It doesn’t have a memory-persistent cache

PHP is a run-and-forget scripting language: it loads all the classes it needs at the beginning of a request (or during execution, for lazily loaded ones) and discards them afterwards. The process takes time, and while PHP 7 can cache some of its compiled code (through OPcache, not a JIT), it’s still not efficient because the classes need to be loaded again for every request.

In contrast to PHP’s scripting model, NodeJs and C# Asp.Net MVC (I haven’t used Java, but it should be similar) run as a server and keep the loaded classes (scripts) in memory, which makes them more efficient.
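The idea can be sketched without a real server. In the snippet below, loadConfig is a hypothetical stand-in for expensive start-up work and handleRequest for a request handler; in a long-running process the loading happens once and every later request reuses the result:

```javascript
// Hypothetical stand-in for expensive start-up work (loading classes, config, ...).
let loadCount = 0;
let loadConfig = function() {
    loadCount++;
    return { greeting: "hello" };
};

// Done once when the server process starts, then kept in memory.
let config = loadConfig();

// Hypothetical request handler: reuses the already loaded config.
let handleRequest = function(name) {
    return config.greeting + " " + name;
};

console.log(handleRequest("first"));  // hello first
console.log(handleRequest("second")); // hello second
console.log(loadCount);               // 1: loaded once, not once per request
```

In the run-and-forget model, the equivalent of loadConfig would run again at the start of every request.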

Its dynamic typing takes too much memory

This looks to be mitigated in PHP 7, but in PHP 5.6 and below, dynamic variables take too much memory. It soon becomes a hassle when working with big variables, big files or many records of data.

And even if PHP 7 is more efficient, it still can’t beat C/C++ in memory usage per variable. Arguably the same holds in comparison with statically typed languages such as Java, C# and the currently rising Go.

Its data access doesn’t support multiple result sets

This applies to MySQL at least (it looks like it is supported for PostgreSQL). PHP cannot return multiple tables from one query. Let’s say you have one procedure that runs 3 select queries: the PHP MySQL driver can only return one. (This holds for the old mysql extension at least; mysqli and PDO expose further result sets through next_result and nextRowset.)

Much of its library support is configured at installation level

Some of PHP’s native libraries are configured during installation (gcc, make and phpize). Examples are zip (--enable-zip) and thread safety (--enable-zts) for pthreads. This makes it harder to keep that configuration at the app repository level and reduces portability.

In conclusion

PHP is a good language to start programming with: it’s easy to set up and has many libraries, frameworks and CMSs. However, for advanced use by expert programmers, PHP doesn’t really meet the requirements.

Conference experience: #tiajkt2017

First and foremost, thank you to Tech in Asia for the free #tiajkt2017 conference invitations sent to my workplace. What an amazing conference. Now let me share some of my experience attending it.

It’s so full of visitors

The conference is big and the hall is big too, yet it was full of visitors everywhere. Not only people from startups and the bootstrap alley, but also academics, investors and general guests. I did feel bad about the long registration queue at the bootstrap alley in the morning; I hope it’ll be better next time.

Notes from developer stage

Today I came mainly to attend the developer stage. As a programmer myself, I found the seminar material very useful. For those who did not attend the conference or the stage, here are some notes that I think are important to know:

Artificial Intelligence is rising in popularity

Artificial Intelligence, and more specifically its subset Machine Learning, is gaining popularity in the IT industry. Its uses are vast, and it will be very useful for (but not limited to):

  • Targeted advertising and marketing
  • Personalization
  • Cutting repetitive workload such as managerial approval and checking of filled-in documents
  • Recognition (face / image, sound, text)

My speculation is that the popularity of big data in recent years enabled companies to do further data mining, and finally to develop artificial intelligence. It’ll be popular and mainstream in the next 5-10 years, so make sure to invest your skills there (me too, I need to learn it ASAP).

Mobile is the king

Unless the startup’s field is SaaS for administrative B2B systems, its applications will be mobile oriented. So it’s no wonder that demand for ionic, android, angular and react programmers is rising. It’s beneficial for you to learn any of those. And startup founders, please be aware that this is the era of mobile devices, and consider hiring people (at least one) who excel at mobile programming.

Metrics are important

For companies, and especially for startups, it’s very important to have metrics. Without them, you won’t know how good your progress is or how to measure your recent performance. For startups specifically, getting good investors without good metrics is just a dream.

And make sure your metrics are correct and aligned with your company’s values. Measuring the number of instagram posts with your tag won’t be useful for an e-commerce startup.

Clouds are rising higher

With the rise of big data and artificial intelligence, the need for higher-spec hardware and for handling fluctuating workloads is appearing. Cloud is one of the solutions, since providers offer performance scaling for a specific period (a 100% increase in RAM for the next 24 hours, for example). No wonder they are more popular now.

Startup types

I saw more startups working in the recruitment area. Maybe that’s inspired by how hard it is to find good programmers. I wonder whether they will provide a solution (since in reality, good programmers are really scarce). There weren’t many e-commerce startups, and most of them were B2B. That’s good. Startups offering on-demand services such as plumbers were common too. And uniquely, there were some trying to work in the agricultural area, and one in AI for chat and social media marketing. Keep up the good work!

So…

It was an amazing experience. It’s unfortunate that there won’t be any programming topics on the 2nd day though. Maybe TIA can also add an IT recruitment segment to their next conference? Lastly I want to say thank you, and congratulations to Tech in Asia for hosting such an amazing conference!