Process

Debugging / learning is a scientific cycle

This is a little shower-thought idea I had a while ago: when you debug, you are more or less performing a scientific cycle, albeit a simpler one than the real thing. A simple, complete scientific cycle consists of: make an observation, form a theory / hypotheses, perform an experiment, observe the results of that experiment, repeat. I know both the actual scientific process and debugging are more complicated than that, but that cycle captures the general flow of both.

Make observations

When you encounter a bug or an abnormality in a process, the very first thing you need to do is observe it. You will need to check some (or all) of the following:

  • what happened during the specific process
  • what was the process doing
  • what is affected
  • what is the current result
  • what is the desired result
  • where is the difference
  • what is written in the log
  • is there any more information that you can gather?

The observation process leads to the next step: making hypotheses.

Making hypotheses

You will craft hypotheses from all of the information gathered during the observation step. Some of the hypotheses could be:

  • it only occurs on requests with a specific value in a specific field
  • only when the user performs specific steps
  • when the machine is under high load
  • when there is an issue with the internet connection
  • when the machine does not meet the recommended specification
  • when some of the applications are outdated
  • and so on…

If the previous step yields insufficient information, the weakest hypothesis available is: the bug will happen again if you perform the same steps, with the same data, on a system with the same configuration, possibly at a specific time. No matter what your hypotheses are, the best experiment to perform next is to reproduce the issue.

Perform an experiment: reproduce the issue

This is one of the hardest steps of debugging: creating an environment that reproduces the issue consistently. Many issues are specific to hardware, to concurrency / race conditions, to the network, to hardware failure, or to other complex situations. But the hard effort brings big rewards: you will understand the process better, it will be easier to pin down the cause, and you can be confident that the fix really solves the issue.

Once you can reproduce the issue consistently, you can take the next step: add more logging, set up debugger tools, and then continue with observation.
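
For illustration, a minimal sketch of adding targeted logging around a suspect call; the PaymentService and its input here are hypothetical stand-ins for whatever code is under investigation:

<?php
class PaymentService
{
    public function process(array $input)
    {
        return array_merge($input, array('status' => 'ok'));
    }
}

$service = new PaymentService();
$input = array('amount' => 100);

// Log the state immediately before and after the suspect call, so the
// next observation round has concrete data to work with.
error_log('[debug] before process, input: ' . json_encode($input));
$result = $service->process($input);
error_log('[debug] after process, result: ' . json_encode($result));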

Observe, make hypotheses and experiment again

With more information and the ability to reproduce the issue, you can run the cycle repeatedly. Observation produces information; that information is used to form hypotheses; you make a fix based on the hypotheses; you observe whether the fix really solves the problem; if the problem is still there, you form another hypothesis, apply another fix, and repeat.

In the final iterations, you may observe that a change to the application has fixed the problem. Then you start to form theories (hypotheses supported by facts from tests) and run more experiments to prove them. For example, you can revert the change to reproduce the same error under a different condition, or perform the same steps with different data to make sure the fix is correct. If your theories are borne out by further tests, the debugging process is complete.

Unit test

As we can see, the debugging process above is complex and time-consuming, especially when you need to re-create the environment to reproduce the error every time. Unit tests are an amazing tool here.

With a unit test, you can experiment in a more isolated environment: easily tinker with the data using mock objects, replicate the situation or configuration (set the time to a specific value, perhaps), and much more in order to reproduce the issue. Once the issue has been reproduced, the test fails or errors.

Then the fix you make is tested again until it meets the expectation, and the other existing unit tests help ensure it won't cause errors elsewhere. Amazing, right?
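
For illustration, here is a minimal sketch of reproducing a bug with a unit test, assuming PHPUnit; the InvoiceCalculator and its bug are hypothetical:

<?php
use PHPUnit\Framework\TestCase;

class InvoiceCalculator
{
    public function total(array $lines)
    {
        if (count($lines) === 0) {
            return null; // the bug being reproduced: this should be 0
        }
        return array_sum($lines);
    }
}

class InvoiceCalculatorTest extends TestCase
{
    public function testTotalIsZeroForEmptyInvoice()
    {
        $calculator = new InvoiceCalculator();

        // Fails while the bug exists, passes once total() returns 0,
        // and guards against the same regression afterwards.
        $this->assertSame(0, $calculator->total([]));
    }
}

The failing test is the reproduction of the issue; once the fix is in, the same test documents and protects the expected behavior.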

Conclusion

Debugging is more or less similar to performing a scientific experiment / research: a repeated cycle of observation, hypotheses and experiment. Unit testing can help the process greatly, since it creates an enclosed environment in which you can perform experiments under artificial conditions.


Choosing the right tool for the job

This morning I read the following article: The myth of the “right tool for the job”. Its short summary: do not choose a programming language based on the task / project; choose based on popularity, documentation and ease of learning. While that statement is not completely wrong, it is not perfectly right either.

I dare you to use PHP for a highly reliable, complex business process

Dynamic typing makes developing a reliable business process hard, because much of the time you don't know what kind of value is being processed. HHVM and PHP 7 seem to make it feasible thanks to type hinting; however, the lack of generics, the runtime-only validation, and the ability to re-assign a different type to the same variable still make it harder. Consider the following code:

$var1 = new \ComplexObject();
$var1 = "Hello World";      // re-assigning a string over the object: PHP allows this
$service->process($var1);   // process() now silently receives a string

I agree that it is a very bad code snippet. However, it is possible in PHP and does not produce any error. Meanwhile, in a statically typed language like C# or Java you would get a compile error. Yes, a compile error: variable types are validated at compile time, and an error is produced whenever a type is invalid (type casting aside).

Why does compile-time type validation matter compared to PHP's runtime validation? Runtime validation in PHP won't produce an error if the code / function / module is never triggered during the process. That means that to validate the type hints of a specific function, you need to run every process that uses that piece of code to check whether the types are valid. Compile-time validation, by contrast, produces an error even if the piece of code is not used anywhere.
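
A minimal sketch of the difference (the function name is hypothetical):

<?php
declare(strict_types=1);

function calculateTotal(int $amount): int
{
    return $amount * 2;
}

// This file loads and runs without any error as long as the mistyped
// call below stays commented out. Uncommented, it fails only at the
// moment it executes, throwing a TypeError; a compiler would have
// rejected it before the program ever started.
// calculateTotal("oops");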

So in short, if you need a highly reliable, validated business process, a statically typed, compiled language like Java or C# is better than a dynamically typed one like Node.js or PHP.

Developing a template-based process in Java or C#?

C# and Java are statically typed languages, so any template-based or string-pattern process is hard to develop in them. For example, the following code is part of a Swagger JSON specification:

{
    "properties":{
        "type": {
            "type": "object",
            "properties":{
                "id":{
                    "type": "number",
                    "example": 1
                },
                "name":{
                    "type": "string",
                    "example": "Information"
                }
            }
        },
        "description": {
            "type": "string",
            "description": "Description",
            "example": "This is the description of Programming Language"
        }
    }
}

The snippet is the Swagger definition for part of an object's fields. If you try to parse and process this JSON in Java or C#, the static typing will give you a headache. Meanwhile, you get native support when parsing it in Node.js, and in PHP you can easily decode the JSON string into PHP objects.
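
For illustration, a minimal PHP sketch decoding a trimmed piece of the snippet above:

<?php
// json_decode builds the object tree dynamically; no classes or
// schema are needed up front.
$json = '{"properties":{"description":{"type":"string","example":"This is the description of Programming Language"}}}';
$spec = json_decode($json);

echo $spec->properties->description->example;
// prints: This is the description of Programming Language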

Real-time messaging service

I haven't used Erlang, so I don't know how superior it is for a messaging (chat) service. WhatsApp uses Erlang for their messaging service, so it must be rather good at the job. For this case I'll promote Node.js over Java, PHP or C#.

Node.js is a non-blocking, single-process server, while typical Java or C# servers use blocking I/O with a thread per request. PHP is the worst here: in the traditional setup it spawns another thread or process for each request, so every time data is sent to a PHP service, a worker is spawned and all the classes are loaded before it even begins processing the data. That is a lot of overhead for one simple operation.

Java or C# is good, but the non-blocking Node.js is the superior one here: it can handle more concurrent requests at a lower performance cost.

Conclusion

Some languages are good at some tasks, while others are good at other kinds of tasks. Hunting for the single best language for a specific task is not worth the effort. But deciding not to use a language that is bad at the task is often much better than sticking with whatever language your environment already uses.

Which methodology / design pattern / development approach is the best?

While browsing programming groups on Facebook, I have seen several job openings with slogans like: “Our company applies agile and scrum!”. There is also plenty of discussion claiming that “agile” is better than waterfall, that it is the best methodology. And more than a few junior programmers, analysts and new project managers insist that their team must start applying agile and scrum. The same goes for TDD (Test Driven Development). Is scrum + TDD really the best methodology?

Before I continue with a more detailed discussion, let me emphasize the thing that I personally consider most important in developing an application / system:

Use whatever tools / approaches you feel are best for producing an application that works well and is easy to change. “Make it work, and changeable!”

Individuals and interactions over processes and tools

One of the manifesto values that is quite important and often forgotten in agile is “individuals and interactions over processes and tools”, which can be shortened to “people over process”. As a programmer / developer / analyst, unless you are ConcernedApe, who once developed the indie game “Stardew Valley” single-handedly, you will work in a team, interacting and making decisions together.

Everyone's preferences differ, and a given methodology may work for one group of people yet fail for another. How well you and the development team work together is what matters most; processes / methodologies are merely “aids” you can use to get there. A “Scrum user story” or a “Kanban board” is nothing more than a tool for working in a team.

If in some situation those tools don't work for the team, for example because the majority doesn't understand how to use them, or sees no benefit in them, or the project is small enough that they aren't needed, then look for alternative tools that are more useful to the team: post-it notes in each person's work cubicle, say, or minutes of meeting shared by email, or a bug tracker.

I once discussed with a fairly experienced project manager why he did not apply the daily standup meeting, generally regarded as the ultimate weapon in scrum. He explained that his team worked remotely across different regions, so a daily standup meeting was not feasible. Besides, there was not always something worth sharing daily, so he applied a weekly report rule instead, with direct contact whenever something was needed (a broken laptop or mouse, say, or an approval). That is an example of “people over process”.

Management will also not abruptly change a methodology that is already running just to adopt scrum overnight. Such a change is too risky: if the team is not used to it and many don't understand it, the change will only bring disaster rather than benefit. Unless there is a genuine need to improve the development methodology, don't push management to change for the sake of your own ego or “just because scrum is better”.

No methodology, however good, is of any use if the team cannot produce an application that works well

Scale it!

One interesting thing I have noticed is that few people realize that different methodologies can be applied at different scales within one organization. For example, the project manager and the customer may use an iterative waterfall to break feature development into projects, while the team itself develops in an agile / scrum fashion.

Or, down at the personal scale as a programmer, you can break the task list you are given into iterations that produce feedback within two weeks (the standard scrum iteration length), and report back to the project manager / team lead as early as possible.

Don’t forget to make it changeable!

I dare bet there is no development plan / requirement that does not change during development. Requirement changes are perfectly normal and very likely to happen. Financial negotiation aside (that is not the developer's domain), it is important to develop the application so that it can easily be adjusted as requirements change. An application that is easy to change also matters for adding features easily later on. This is implied in the agile manifesto too: “Responding to change over following a plan”.

So how do you develop an application that is easy to extend / change? One way is to build “low coupling, high cohesion” modules (classes / functions). TDD can help you develop such low-coupling code. However, with the side effects of development time growing to as much as 2x the usual and of added complexity in the unit tests, not every team can use TDD (but do use TDD whenever possible).

Two other principles can also help you develop “low coupling, high cohesion” code: the Single Responsibility Principle and Dependency Injection (the usual way to apply the Dependency Inversion Principle). Both belong to the SOLID principles, and in my opinion they are more useful than the other three SOLID principles.
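
For illustration, a minimal PHP sketch of constructor-based dependency injection; the class and interface names are hypothetical:

<?php
interface Transport
{
    public function send($to, $body);
}

class SmtpTransport implements Transport
{
    public function send($to, $body)
    {
        // ... talk to a real SMTP server here ...
    }
}

class Mailer
{
    private $transport;

    // The dependency is injected, not constructed inside Mailer,
    // so Mailer stays decoupled from any concrete transport.
    public function __construct(Transport $transport)
    {
        $this->transport = $transport;
    }

    public function sendWelcome($to)
    {
        $this->transport->send($to, 'Welcome!');
    }
}

$mailer = new Mailer(new SmtpTransport());
$mailer->sendWelcome('user@example.com');

Because Mailer depends only on the Transport interface, a unit test can inject a fake transport instead of a real SMTP connection.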

Conclusion

Use the development method that fits and benefits the team most. Build an application that works, is usable, and is easy to change. Write “low coupling, high cohesion” code so the application can be changed more easily.

Can Artificial Intelligence become dangerous?

This article is purely my opinion about the dangers present in Artificial Intelligence, based on the argument about it between two amazing people in the computer field, Mark Zuckerberg and Elon Musk. In short, Elon says there is danger in unregulated A.I. development, while Mark says there is not (one example article).

I don't want to debate or guess who is right or wrong, and considering how wide the applications and fields of Artificial Intelligence are, we cannot decide anything based on a few fields. Furthermore, my understanding of A.I. is far more limited than theirs, so declaring one right and the other wrong would be nonsense. But let's say I agree with Elon that in some cases AI can be devastating, because of our lack of control and of security concerns.

One A.I. design goal is to mimic us

The video above shows MarI/O, machine learning (a part of Artificial Intelligence) playing the classic game Super Mario World on the SNES (oh, how nostalgic). It shows how the system learns to beat a stage using the experience gained from each loss. It looks like a very simple application of AI, yet with current technology it is kind of amazing.

Now, what if we could somehow develop the same kind of system with the purpose of bypassing captchas? Much like virus vs antivirus, captcha vs bot is a game of cat and mouse: captchas must evolve to keep detecting bots, and bots must evolve to keep fooling captchas. It's a constant battle.

Assuming the bot can evolve using a supercomputer and machine learning, and becomes so advanced that it can mimic us humans flawlessly, then captcha is fighting a losing battle. At worst, captchas would even lock humans out.

Now, what would happen if those resources were focused on developing hacking tools powered by a supercomputer? Tools armed with the skills of clever hackers and improved over time through experience and trial and error. One existing application of programmed hacking, the brute-force attack, is an example. Fortunately, brute-force attacks are greatly mitigated by account locking and captchas.

Then, once the tools are ready, they are mounted on hundreds of supercomputers to exploit the vulnerabilities present on the internet. How much havoc would break out? Many sites would be hacked at the same time; many others would go down. Considering that our credit card and banking information lives there, it would also be at risk of being blown away. Combine that with ransomware like WannaCry and matters get even worse. What would happen if GitHub fell over because of that?

Lack of Control

Soon we will have self-driving cars on the streets; they will become common. Now, what happens if one day there is a system failure? Maybe due to a short circuit, deteriorating hardware or even a cosmic ray. Usually this is handled by switching control to manual mode, but what if, at the same moment, the A.I. has become so advanced that it decides not to give control back? Or, more realistically, what if the passenger / driver is too unaware or distracted to act in time? That would be bad.

A current, simpler case is a stuck gas pedal in an automatic car. In a manual car, pressing the clutch definitely cancels the acceleration; put the gear in neutral for further safety and no danger remains. In an automatic car, however, the gear needs to shift to neutral, still with a chance of error. It's not much, but the more control we give up over something, the more dangerous it becomes.

Even in today's lifestyle, companies try to use A.I. to sell you more goods. Sometimes you don't realize it, and those who don't notice can be powerless before those companies, only burning more bucks for them.

Smart encryption

So let's say terrorists have already developed an evolving, tightly encrypted, anti-spy messaging service on the internet. They can use that platform to communicate with each other freely, without being trackable by governments or officials. Worse, if they can actually hack into official (police or army) communication lines, they gain a security hole through which to act.

No robots?

An AI-robot scenario like Terminator or I, Robot may be possible in the distant future, but not the near one. Limited energy supply and processing power are among the big reasons such robots cannot be realized soon. Furthermore, the lack of human-shaped robots confines them to specific places / areas. As xkcd has explained, it is very unlikely to happen.

Conclusion

The way A.I. can be dangerous is not the physical form popular in movies, like a robot apocalypse. It is more related to our daily life, our interaction with computers, and security concerns.


Agile is dead! Really?

Dave Thomas, one of the authors of the agile manifesto, says so in his blog post “Agile is Dead (Long Live Agility)”. Lately I have also seen a blog post titled “Agile is Dead, Long Live Continuous Delivery”. Is agile really dead and in need of replacement? Is Continuous Delivery a replacement for agile?

Background

In his blog post, Dave Thomas does indeed say that the word “agile” needs to be killed off. But why? Because the word “agile” and the “agile” concept have been misappropriated, turned into a tool, a process, an industry. As Dave says in his post:

Once the Manifesto became popular, the word agile became a magnet for anyone with points to espouse, hours to bill, or products to sell. It became a marketing term, coopted to improve sales in the same way that words such as eco and natural are. A word that is abused in this way becomes useless—it stops having meaning as it transitions into a brand.

Get it? Today's “agile” has strayed from its original meaning and objective. It has become a “marketing term”.

Agile is guidance, not a methodology

Let's see what we have in the agile manifesto:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

The first point alone states that agile is not a methodology: you should value individuals and interactions over processes and tools. If the agile you know focuses on methodology and process, it already violates the first value. Agile is just guidance, to help you (and the team / company) find the best process and tools for your development activity.

Even scrum, one of the most popular agile adaptations, is still only guidance, a development framework. As stated:

“The first step you need to take to understand what Scrum is, is to not associate it with any methodology whatsoever, because Scrum is merely a framework for developing complex products such as software.” – Partogi, Joshua (2015), Manajemen Modern dengan Scrum, Yogyakarta: Penerbit ANDI (translated from Indonesian).

The marketing term agile / scrum

Today the meaning of agile / scrum varies and is often misleading. The most popular misconception is that scrum means “morning meetings” and “no documentation”: “if we have a morning meeting, we already do agile”. That is very misleading. An exclusive focus on process has already strayed from the agile manifesto.

Now let's say that in your company one PM manages 3 teams, each doing a project in a different area. How can you hold a morning meeting with that setup? A daily evening report by e-mail is enough for project tracking there.

Continuous delivery is the next agile!

First, continuous delivery does not directly contradict agile. Agile nowhere says that you must ship a program before it is ready. Moreover, agile says you should be “responding to change over following a plan”, which is more or less aligned with continuous delivery. And, as stated above, agile is guidance, not a methodology. We can say that continuous delivery is a more advanced implementation of agile.

So it is the next agile, then! Hold on. As the agile manifesto states, individuals and interactions over processes and tools: you cannot simply adopt continuous delivery (CD) and hope everything works out. You still need to consider the capability of the team, the servers, the workflows, the organizational divisions, the business decisions, and most importantly the kind of software you make. CD relies heavily on good QA, unit tests and continuous integration infrastructure. If those are not mature, implementing it is risky.

Conclusion

Agile is not dead. It's just that the term “agile” itself has strayed from its original meaning. It is guidance, not a methodology. Its implementation may evolve and improve. Continuous delivery is a good methodology and can align well with agile.

I’ve found a bug! Who is to blame?

Although blame culture serves no benefit to a company, many companies still have it. And it is even more foolish to assign blame for bugs in software. Simply said, bugs happen, and the number of possible causes of a bug is close to infinite.

Many times, many parties are responsible for bugs in production

Let's say we use the following SDLC: requirements, analysis, design, development, testing, implementation, maintenance. Now let's see what each party can be held responsible for in each phase.

Requirement

The requirement given by the user is ambiguous, has multiple meanings or, worst of all, is simply wrong. Next, the meeting PIC (person in charge) makes a mistake when passing the requirement on to the analyst and programmer.

Analysis

The analyst makes a mistake when interpreting the user's requirement, or miscommunicates the analysis and requirement to the programmer.

Development

Many, many things can happen during development: a wrong logic condition, a wrong parameter assignment, etc. In short, the code is wrong.

Testing

The test cases are too sparse for the complexity of the project and the deployment environment. The cases test the wrong things. A test case produces a false positive.

Implementation

Incorrect publish parameters. Incorrect deployment-tool configuration. A production environment that was not tested beforehand. A production environment (OS, etc.) that differs from the testing environment while the code has environment dependencies. Worse still if publishing is done manually.

Maintenance

The internal support team inserts data in a format that was never tested (whitespace characters, say). A script is executed without being tested. There is no change management for software corrections, so developers can push files directly to the server. A maintenance activity (backup, index rebuilding) causes anomalies and incorrect results. A user tries to break the system. A user inputs data in an unsupported format.

Other causes

Requirements change mid-development. Some requirements break / are incompatible with others. Development time is cut to meet an earlier shipping date. A natural disaster costs development time, but the schedule is not adjusted.

Bugs are caused by many factors

Besides code / logic errors, many other things can cause bugs in an application. For example:

  • compiler / interpreter errors – very rare cases, though
  • flawed server configuration / installation
  • OS-level bugs
  • bugs caused by different configurations in different environments
  • errors caused by cosmic rays
  • the Y2K error – the year 2000 problem
  • network-related errors
  • incorrect data / formats
  • non-printable character input

And there are many, many more.

Conclusion – now, whom can I blame for that?

It's best not to blame anyone for bugs / errors in software. The entire team / company should take responsibility for them (the user included). What you need to do is find the root cause, fix the software, and find a way to keep the same mistake / error from happening again.