Performance

Why you should give buffer to project estimates

“Why does this enhancement need 1 month? Can’t it be done in just 3 weeks?”

Project managers often say this to developers. They like to squeeze estimates as short as possible, then wonder in the end why the project is running late. Overtime happens, code quality drops, and bugs show up everywhere from development until launch. To me, as a developer, this is one sign of an “incapable” project manager. Clients often share the same view: cut the estimate as short as possible. Little do they know that an estimate with buffer brings more benefits than a shorter one.

Less overtime

Overtime is unproductive. It has been researched many times, and you can easily find the studies on the internet or in forums. Obviously, more buffer in the estimate means less overtime is needed, which in turn prevents productivity from dropping in the long run.

More often than not, I find that unjustified overtime doesn’t bring a better result over a week or a month compared to working normal hours. That’s because the work done during overtime is usually wrong, of poor quality, or doesn’t match the requirements, so it ends up being fixed the following week or month.

On the other hand, justified overtime is a necessity, for example when there is a critical bug in production, or when a bug is found at the last minute before launch.

No spare time for changes

Clients are very bad at giving requirements. They don’t know what solution they want. They may not even know what problem they are facing. In my experience, at least one small requirement change is practically guaranteed, and there is at least a 50% chance of one medium or big requirement change during the project timeline.

To this day, prototyping is very useful, especially in the programming world. Prototypes are usually made to give the client a clear picture of how the program will work and how it will help solve their problem. Requirement changes are very likely to happen at this point.

A tight estimate is a disaster at this stage. With a tight estimate, no change can be fitted into the timeline, since every day is already accounted for. This can only lead to bad things: a missed deadline, unmet requirements or bad quality.

No room for mistakes

Mistakes happen; no software in this world is bug-free. Even the business flow, a requirement given by the client, may be incorrect. Setting a tight estimate without allowing time for mistakes and fixes is planning to fail. More buffer in your estimate means the application can be tested more, and more problems can be fixed along the way.

You will never deliver “earlier”

There are two possible outcomes when delivering: “early” or “late”. Delivering exactly “on time” is almost impossible. Usually a project that makes its deadline is already done and ready to be delivered before the agreed date, so it counts as “early”.

Now which one is better: “longer estimate but deliver ‘early’” or “shorter estimate but deliver ‘late’”? It’s a matter of personal preference, but I usually find that delivering “early” leaves a better impression than delivering “late”. An “early” delivery is usually perceived as faster than a “late” one, a psychological thing.

By setting a tight estimate, you are setting yourself up to deliver “late” and never “early”.

It is indeed hard on the “client” side

Clients are hard to deal with. More often than not they aren’t satisfied with the estimate given, no matter how tight it is. They want a tighter estimate, yet they cannot give clear requirements. “Just make it so that we input this, that and that and it’s saved! How hard is it?” is what they usually say, without thinking about what must happen when a wrong entry needs to be edited, what to do when something changes, which systems are affected by the change, and so on.

That’s why a good client leads to good software as well. If you want to teach a client why a tight estimate is bad, give them a tight estimate. Then every time there is a change, and every time a requirement discussion with the client takes place, keep a record of it. After delivery, hold an evaluation session with the client and show them how all of those things pushed back the schedule and made the project late.

The delivery time will be the same after all

Many times I find that a project with a tight estimate ends up late, and its actual delivery time is about the same as the estimate I would have given with buffer in the first place. Inexperienced developers and managers usually prefer to give tighter estimates and underestimate how much time changes and fixes will take. The question is, how much buffer do they need?

In my previous job, I liked to add a 30-50% buffer to my estimates. My PM would then bargain it down by 20-30%, and give some buffer back to the QA and fixing phase. In the end I assume it came to around a 25-40% buffer. With that, I usually delivered 10-20% early, so a 20-40% buffer is the sweet spot, depending on how big and complex the project is. This is just my preference and personal experience; do not take it as guidance, since everyone estimates differently.
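
To make the percentages concrete, here is a small Python sketch. The 20-day task and the function name are made up for illustration; only the 25% and 40% figures come from the range mentioned above.

```python
def buffered_estimate(base_days: float, buffer_pct: float) -> float:
    """Add a percentage buffer on top of the raw estimate (in working days)."""
    return base_days * (100 + buffer_pct) / 100


raw_estimate = 20  # hypothetical task: 20 working days without any buffer

for buffer_pct in (25, 40):
    print(buffer_pct, buffered_estimate(raw_estimate, buffer_pct))
# prints:
# 25 25.0
# 40 28.0
```

So for this made-up task, the buffer is the difference between promising 20 days and promising 25 to 28 days.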

Now if the delivery time is the same after all, why not give a longer estimate and allow more flexibility in requirements and development? It will lead to a better foundation and better code quality in the software.

Summary

Give buffer to your estimates. The benefit far outweighs the false perception that “a shorter estimate is faster”. You won’t have flexibility or good quality if the estimate is tight.


Why PHP is bad and why it isn’t

Nowadays programmers consider PHP a very bad programming language. One example is this comic strip, saving the princess with programming languages. But why is PHP considered bad, and why is it still so popular out there?

The good

In general, PHP is a good language to start learning programming with.

It’s easy to set up and start

PHP is very easy to set up, especially for beginners. Just use XAMPP (for Windows) or a LAMP stack (for Linux), drop the code in the web root (htdocs in XAMPP) and everything will work. Search Google for “hello world php xampp” or “hello world php lamp” and you’re good to go.

Furthermore, it’s one of the easiest languages to run on shared hosting, which makes it very easy to build your own website.

It’s very forgiving

PHP is dynamically typed, meaning you don’t need to specify whether a variable is a string, an int, a specific class, etc. Its string concatenation operator (.) is different from numeric addition (+), which makes it less ambiguous than JavaScript’s overloaded + and avoids explicit type conversion. It’s very easy for beginners to start with.

PHP variables also work very well with HTML. Almost all scalar values can be printed to the screen with echo, while arrays and objects need special treatment.

Furthermore, using an undefined variable only results in a notice (a warning since PHP 8), which can easily be suppressed. But beware, both are considered “bad habits” in programming, so treat them as learning aids. There are also other conditions that usually result in an error in other languages but can easily be suppressed in PHP.

It’s both procedural and OOP

PHP supports both procedural and object-oriented code. It’s very common to start learning programming with procedural code and move on to OOP next, and that is easier within the same language.

Furthermore, PHP has a C-like syntax, and there are many good languages with C-like syntax, such as Java, C# and JavaScript. Its C-like syntax is an advantage over Python (which is also a good starting language) “if” you aim to move to those languages later.

Frameworks and tutorials are abundant

With so many frameworks and tutorials out there, you can search for any problem or topic you are working on and find pages of Google results. It’s very easy to find answers to PHP problems nowadays.

Furthermore, many PHP frameworks use the MVC (Model View Controller) pattern, one of the most common patterns in web programming. Learning them can ease the transition to other good languages with MVC frameworks, such as Java with Spring MVC, C# with ASP.NET MVC, Node.js MVC frameworks and many more.

Nowadays PHP also has Composer, which handles libraries as packages, as almost all newer languages do. And PHP has many CMSs, such as WordPress, which make creating web pages easy.

The bad

So why is PHP considered bad? Well, you need to be at least decent at programming to see its limitations and bad sides.

It is not strongly, statically typed

PHP started as a dynamically, weakly typed language for customizing HTML pages ages ago. To this day it is still dynamically typed, while supporting some type hinting at the argument and property level. While dynamic typing is good for learning to program, it’s not good for complex business processes.

However, being an interpreted language means the type hints are only checked when the code is executed. So we won’t get any type error until that portion of code actually runs, as opposed to Java/C#, where it can be caught at compile time.

Moreover, PHP 7, even after getting scalar type hints for string and int, still has no generics for arrays. Without any means of type-checking array contents, it’s harder to enforce reliability, especially in business processes (accounting).

It doesn’t have native multithreading

Without the additional “PThreads” extension, PHP doesn’t have any option for multithreading. It isn’t that PHP cannot do multithreading; the problem lies in how “PThreads” works. It copies the current “process” state (loaded classes, etc.) into another one and executes them concurrently.

In my experience with PThreads on PHP 5.6 (maybe my configuration was lacking; correct me if so), PThreads uses more memory than threading in other languages, notably C#, Java and Node.js. Moreover, it’s harder to catch exceptions and to debug the work spawned by threads.

So it doesn’t support multi-core processing

For heavy background jobs or batch processing, multi-core support is a requirement most of the time.

It doesn’t have memory-persistence cache

PHP is a run-and-forget scripting language: it loads all the classes it needs at the beginning of a request (and during execution for lazy-loaded ones), and flushes them afterwards. This takes time, and while OPcache (bundled since PHP 5.5; a JIT compiler only arrived in PHP 8) caches the compiled bytecode, it’s still not efficient because the application has to be bootstrapped again for every request.

In contrast to PHP’s scripting model, Node.js and C# ASP.NET MVC (I haven’t used Java, but it should be similar) run as a server and keep the loaded classes (scripts) in memory, making them more efficient.

Its dynamic typing takes too much memory

It looks like this is mitigated in PHP 7; however, in PHP 5.6 and below, dynamic variables take too much memory. It soon becomes a hassle when working with big variables, big files or many records of data.

And even if PHP 7 is more efficient, it still can’t match C/C++ in memory usage per variable. Arguably, the same goes in comparison with statically typed languages such as Java, C# and the currently rising Go.

Its data access makes multiple result sets awkward

This applies to MySQL at least (it looks like PostgreSQL handles it differently). Say you have one stored procedure that runs 3 select queries: a plain query call in PHP hands you only the first result. The remaining result sets have to be fetched explicitly, with next_result() in mysqli or nextRowset() in PDO.

Much of its library support is configured at installation level

Some of PHP’s native libraries are configured when PHP is built (gcc, make and phpize). Examples are zip (--enable-zip) and thread safety (--enable-zts) for pthreads. This makes it harder to keep that configuration in the app repository and reduces portability.

In conclusion

PHP is a good language to start programming with: it is easy to set up and has many libraries, frameworks and CMSs. However, for advanced use by expert programmers, PHP doesn’t really meet the requirements.

Choosing the right tool for the job

This morning I read the following article: The myth of the “right tool for the job”. The short summary of that article is: do not choose a programming language based on the task or project, but based on popularity, documentation and ease of learning. While that statement is not completely wrong, it’s also not entirely right.

I dare you to use PHP for highly reliable, complex business process

Dynamic typing makes developing reliable business processes hard, because many times you don’t know what kind of variable is being processed. It seems possible with HHVM and PHP 7 thanks to type hinting; however, the lack of generics, the runtime-only validation, and the ability to reassign a different type to the same variable make it harder. Consider the following code:

$var1 = new \ComplexObject(); 
$var1 = "Hello World";
$service->process($var1);

I agree that it is a very bad code snippet. However, it is valid PHP and does not produce any error when the variable is reassigned. Meanwhile, in a statically typed language like C# or Java you will get a compile error. Yes, a compile error: variable types are validated at compile time, and an error is produced if a type is not valid (type casting aside).

Why does compile-time type validation matter compared to runtime validation in PHP? Runtime validation in PHP won’t produce an error if the code, function or module isn’t triggered during execution. That means that to validate the type hints of a specific function, you need to run every process that uses that part of the code. Meanwhile, compile-time validation produces the error even if that part of the code is never called.
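
The same effect can be sketched in Python, which is dynamically typed like PHP. This is only an analogue of the PHP snippet, not PHP itself; ComplexObject, process and rarely_used_report are hypothetical names made up for the illustration.

```python
class ComplexObject:
    """Hypothetical stand-in for the ComplexObject in the PHP snippet."""


def process(value: ComplexObject) -> str:
    # A runtime check: it can only fail when this line is actually executed.
    if not isinstance(value, ComplexObject):
        raise TypeError(f"expected ComplexObject, got {type(value).__name__}")
    return "processed"


def rarely_used_report() -> str:
    var1 = ComplexObject()
    var1 = "Hello World"   # reassigning a different type is allowed at runtime
    return process(var1)   # the bug lives here, but nothing has run it yet


# Starting the program does not reveal the bug ...
print("program started without any type error")

# ... it only surfaces once the faulty code path is finally executed.
try:
    rarely_used_report()
except TypeError as error:
    print(f"found only at runtime: {error}")
```

If rarely_used_report is only called by a month-end job, the mistake stays hidden until the month ends, which is exactly the situation a compiler would have prevented.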

So in short, if you need a highly reliable, validated business process, a statically typed, compiled language like Java or C# is better than a dynamically typed one like Node.js or PHP.

Developing template-based process in Java or C#?

C# and Java are statically typed languages, so any template-based or string-pattern process is hard to develop. For example, the following is part of a Swagger JSON specification:

{
    "properties":{
        "type": {
            "type": "object",
            "properties":{
                "id":{
                    "type": "number",
                    "example": 1
                },
                "name":{
                    "type": "string",
                    "example": "Information"
                }
            }
        },
        "description": {
            "type": "string",
            "description": "Description",
            "example": "This is the description of Programming Language"
        }
    }
}

The snippet is the Swagger definition for some fields of an object. If you try to parse and process this JSON object in Java or C#, you’ll get a headache because of static typing. Meanwhile, you get native support when parsing it in Node.js, and in PHP you can easily decode the JSON string into PHP objects.
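
As an illustration of how little ceremony a dynamic language needs here, the sketch below uses Python (standing in for Node.js or PHP) to parse a trimmed copy of the fragment and walk it. The build_example helper is hypothetical and only handles the shapes that appear in this fragment.

```python
import json

# A trimmed copy of the Swagger fragment shown above.
SCHEMA_JSON = """
{
    "properties": {
        "type": {
            "type": "object",
            "properties": {
                "id": {"type": "number", "example": 1},
                "name": {"type": "string", "example": "Information"}
            }
        },
        "description": {
            "type": "string",
            "example": "This is the description of Programming Language"
        }
    }
}
"""


def build_example(schema: dict):
    """Recursively build an example value from a schema node (hypothetical helper)."""
    if schema.get("type") == "object" or "properties" in schema:
        return {name: build_example(child)
                for name, child in schema.get("properties", {}).items()}
    return schema.get("example")


schema = json.loads(SCHEMA_JSON)   # no classes need to be declared up front
example = build_example(schema)
print(example)
```

No class had to be declared to hold the schema; in Java or C# you would typically model the nodes first or fall back to a generic JSON tree type.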

Real time messaging service

I haven’t used Erlang, so I don’t know how superior it is for a messaging (chat) service. WhatsApp uses Erlang for its messaging service, so it must be good at the job. For this case I’ll promote Node.js over Java, PHP or C#.

Node.js is a non-blocking, single-process server, while Java and C# servers traditionally dedicate a thread to each request and block it on I/O. PHP is the worst here: it spawns another thread or process for each request. So every time data is sent to a PHP service, it spawns another thread, loads all the classes, and only then begins processing the data. That is too much work for a single simple message.

Java or C# is good; however, non-blocking Node.js is the superior one here. Node.js will be able to handle more requests at a lower performance cost.
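
The non-blocking idea can be sketched with Python’s asyncio event loop, used here as a stand-in for the Node.js event loop. The handler, the message count and the 0.1 second delay are all made up for the illustration.

```python
import asyncio
import time


async def handle_message(message_id: int) -> str:
    # asyncio.sleep stands in for waiting on the network or a database;
    # while one handler waits, the event loop runs the others.
    await asyncio.sleep(0.1)
    return f"message {message_id} delivered"


async def main() -> list:
    # 100 hypothetical chat messages handled concurrently on a single thread.
    return await asyncio.gather(*(handle_message(i) for i in range(100)))


started = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - started
print(f"{len(results)} messages in {elapsed:.2f}s")
```

Handled one after another with blocking waits, the same 100 messages would need about 10 seconds; on the event loop the waits overlap, so the total stays close to a single wait.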

Conclusion

Some languages are good for some tasks, while others are good for other kinds of task. Finding the best language for a specific task is not optimal. However, deciding not to use a programming language that is bad at the task is many times better than sticking with the language already used in your environment.

Which attitude to hire for?

In the management world, you will often hear the following quote:

hire for attitude, train for skills

The quote is good and hits the point. However, even if you agree that attitude matters more than skills when hiring, it is important to know which “attitude” the quote is talking about. I would say that, without knowing which attitude the quote means, you had better hire for skills instead.

In this article, I’ll use many references from Pawel Brodzinski. His writing is insightful and he is good at management. Moreover, he has also written an article titled “Why I Genuinely Want To Work With You”, which is very similar to this one.

The essential attitudes: honest, open-handed and cheerful

Essential attitudes are useful in any kind of job.

For me, the most essential attitude, required and non-negotiable, is honesty. A person I want to work with, or hire, must be an honest person. Brodzinski also says that in business, trust isn’t measurable. For me, honesty is the most important part of building trust. A dishonest person won’t make a great partner; they will even make me work under suspicion and take extra steps to prevent them from stabbing me in the back.

The second essential attitude is being open-handed and willing to help. If they are willing to help me, I will do the same for them. No matter how skilled someone is, if they are not willing to help, their ability won’t be of much use.

Last is cheerfulness. Why cheerful? A cheerful person brings positive energy to the workplace. They brighten up other employees, who become more motivated and less stressed. In front-desk jobs, customers like a cheerful person better than a gloomy one. You wouldn’t want to work with a gloomy person, right? It’s worse if that gloomy person is your supervisor.

Note: There may be some disagreement about whether those three attitudes are useful in any job, e.g. a cheerful factory worker or an honest salesperson. It is conditional, though, and can be interpreted differently: the salesperson is honest with the company and only tells small lies to the customer, or the cheerful factory worker is serious while working and fun at break time.

The job-supporting attitude

Different jobs need different supporting attitudes. That’s why you cannot assume that an attitude will be useful in every line of work, even if it is a positive one. For example, a manager or CEO needs an innovative and creative way of thinking, while a factory worker does not. A non-innovative or non-creative factory worker is better than an innovative one, because they follow orders and won’t complain about the current system. It’s cruel, I know, but it’s what the business needs.

The same applies to the so-called good attitudes: “hard worker” and “can work under pressure”. Both have very specific applications, and not every line of work benefits from them. Managers who are “hard workers” usually cannot manage well. They often do the work themselves, which leads to micromanagement. If not, they usually prefer hard-working staff, resulting in overtime, reduced employee happiness, and reduced creativity and problem-solving skills.

In the programming world, the “can work under pressure” attitude isn’t needed, yet most companies state that they need programmers with that attitude. I don’t really understand the reasoning behind it. Unless the software being developed is used in the military, nuclear plants, airplanes or anything else where lives are at stake, there won’t be any meaningful pressure.

Michelangelo, talent is cheap, dedication is costly!
– Bertoldo di Giovanni

For every job that needs skill refinement, such as smithing, music, crafting or even programming, dedication is a must-have trait. People in these jobs need to be dedicated, doing the same thing countless times and constantly refining their skills until they are able to produce masterpieces.

Conclusion

There are essential attitudes, which are useful in any line of work, and there are other attitudes that are job-specific. Knowing which attitudes to look for is as important as knowing that hiring for attitude is better than hiring for skills. When you look for the wrong attitude, you hire the wrong employee.

Microsoft SQL Server: what do you need to know as a beginner developer?

I realized that Microsoft SQL Server is easy to set up and use. Its UI (Management Studio) is nice and user friendly. However, in the hands of a beginner developer, SQL Server will usually perform badly. As a beginner in the MS SQL Server world, what do you need to know?

In this series, I will provide hints and short descriptions of the topics you need to learn beforehand. If you need a detailed explanation, you will need to do the detailed research yourself.

Database transaction isolation level and lock hints

This is the very first thing you need to know when developing a system with SQL Server as the database. Why? Because not knowing it makes blocking and deadlocks very likely once the system handles a high volume of transactions.

By default, MS SQL Server uses the Read Committed isolation level with locking: read queries (select) take shared locks, so readers and writers block each other. Serializable is the heaviest-locking isolation level you can set in SQL Server, the most secure but also the most problematic, as it holds range locks until the transaction ends. In heavy-read, light-write applications such as Stack Overflow, lock-based reads usually cause problems.

Read Committed Snapshot Isolation (RCSI), which is the default in Azure SQL Database, is commonly recommended for typical systems. Still, you need to find out for yourself which isolation level best suits your application.

Auto-commit transaction

By default, MS SQL Server runs in autocommit mode, meaning every insert/update/delete that is not inside an explicit transaction is wrapped in its own transaction and committed. This hurts performance, because every commit has to be written to the database’s transaction log. It shows up especially with statements generated in a loop (to insert/update/delete many rows from an application).

To avoid this, wrap your statements in an explicit transaction, or turn autocommit off (SET IMPLICIT_TRANSACTIONS ON) and commit yourself.
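
The difference between the two patterns can be sketched with Python’s built-in sqlite3 module. Note the assumptions: SQLite is only a stand-in for SQL Server so the example runs without a server, the audit_log table is made up, and the actual cost per commit will differ between the two databases.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so each
# statement commits on its own unless we open a transaction explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT)")

rows = [(i, f"event {i}") for i in range(1000)]

# Slow pattern: 500 statements, 500 implicit commits.
for row in rows[:500]:
    conn.execute("INSERT INTO audit_log (id, message) VALUES (?, ?)", row)

# Faster pattern: the same kind of inserts wrapped in one explicit
# transaction, so there is a single commit at the end.
conn.execute("BEGIN")
try:
    for row in rows[500:]:
        conn.execute("INSERT INTO audit_log (id, message) VALUES (?, ?)", row)
    conn.execute("COMMIT")
except sqlite3.Error:
    conn.execute("ROLLBACK")
    raise

total = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
print(total)  # 1000
```

Both halves store the same kind of data; the second half simply asks the database to make its changes durable once instead of 500 times.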

Backup Recovery Model

By default, MS SQL Server databases use the FULL recovery model. In contrast to the SIMPLE recovery model, the full recovery model allows point-in-time recovery, down to an individual committed transaction, as long as you take transaction log backups. The simple model does not support that. In short, switch to the simple model if you don’t think point-in-time recovery is required, especially for logging databases.

Indexing

Indexing is complex, but if you need to maintain database performance against large volumes of data, you need to learn it. First, you need to know how to get the query execution plan. Next, you need to understand index seek vs. index scan, key lookups, and sargable queries.

In short, what I recommend is:

* Always define a primary key on every table,

* If you need to join, then in order of recommendation it’s better if you can:
1) join on primary key/foreign key
2) make sure the joining fields have the same data type and length
3) avoid computations in the where clause or on joining fields, e.g. isnull, where fieldA < fieldB + 1
4) not use a leading wildcard (percent) in string searches
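
Point 3 can be observed on a small scale with Python’s built-in sqlite3 module. SQLite is only a stand-in here: its EXPLAIN QUERY PLAN output is not a SQL Server execution plan, and the orders table and index are made up, but the seek-versus-scan distinction is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER, note TEXT)")
conn.execute("CREATE INDEX idx_orders_amount ON orders (amount)")


def plan(sql: str) -> str:
    """Return the planner's description of how it will run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)   # column 3 holds the detail text


# Sargable: the indexed column stands alone, so the index can be used to seek.
sargable_plan = plan("SELECT * FROM orders WHERE amount = 100")

# Non-sargable: the column is wrapped in a computation, forcing a scan.
non_sargable_plan = plan("SELECT * FROM orders WHERE amount + 1 = 101")

print(sargable_plan)
print(non_sargable_plan)
```

The first plan reports a search using idx_orders_amount, the second a scan of the whole table, even though both queries ask for the same rows. The exact wording of the plan text varies between SQLite versions.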

Connection overhead

Opening a connection and authenticating the user has some performance cost. It is not much, but it exists, especially when the application server is far away from the database server. Please note that the existence of connection overhead does not mean that keeping the connection open is the solution. What you need to avoid is opening and closing connections inside an application loop. It’s better to refactor the loop into something more like a set-based operation.

Scalar vs table-valued function

In MS SQL Server there are 3 types of user-defined functions: scalar functions, inline table-valued functions and multi-statement table-valued functions. Execution-plan wise, scalar and multi-statement table-valued functions behave alike, while an inline table-valued function is handled more like a view. So let’s treat multi-statement table-valued functions the same as scalar functions in this topic.

A scalar function used in a select column, join or where clause is executed as a looping, row-by-row operation. An inline table-valued function, on the other hand, is treated like a view and included in the query plan as a set operation. So, in short, if the function you define accesses any table, avoid defining it as a scalar function.
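
The row-by-row behaviour can be made visible with a small Python sqlite3 sketch. The assumptions are significant: SQLite stands in for SQL Server, the scalar function is a Python callback rather than a T-SQL function, and its table access is simulated with a dictionary. All table and function names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Budi');
    INSERT INTO orders VALUES (1, 1), (2, 2), (3, 1), (4, 2);
""")

# Stand-in for the table lookup a real scalar function would perform.
CUSTOMER_NAMES = {1: "Ann", 2: "Budi"}
lookups = 0


def customer_name(customer_id: int) -> str:
    """Scalar function: the engine calls it once for every row it processes."""
    global lookups
    lookups += 1
    return CUSTOMER_NAMES[customer_id]


conn.create_function("customer_name", 1, customer_name)

# Row-by-row: one function call (one hidden lookup) per order.
per_row = conn.execute(
    "SELECT id, customer_name(customer_id) FROM orders ORDER BY id").fetchall()

# Set-based: the same result as a join the planner can optimize as a whole.
set_based = conn.execute(
    "SELECT o.id, c.name FROM orders o JOIN customers c ON c.id = o.customer_id "
    "ORDER BY o.id").fetchall()

print(lookups, per_row == set_based)  # 4 True
```

Four orders cause four separate calls. With a million rows and a real query inside the function, that becomes a million small lookups the optimizer cannot merge into one join.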

Conclusion

SQL Server is easy to use once installed. But building a complex system on it without knowing its features can cause problems. Before using it in a real system, I suggest that you at least know the points described above.

However, even though I know those points, that does not mean I know everything about SQL Server or the best configuration for each scenario. I am not a database administrator after all.