Process ID is not unique

14/02/2020 Bug's nest

Process IDs lose their uniqueness over time

When developing for a multi-process (multitasking) environment such as Windows or Linux, there are occasions when you use the process ID. The process ID is a number, managed by the OS, that "uniquely identifies a process at a certain point in time", but its uniqueness is not guaranteed over time. If you forget this, unexpected bugs will settle in, so be careful.

Creating a unique ID number is a hassle

When developing software in a multitasking environment, if there are multiple identical processes or resources, you may need a unique identification number to tell each one apart. An identification number is usually called an ID number for short. Any unique number will do, so if the number of IDs you need is fixed in advance, you can simply assign suitable integers.

However, in some cases the number of IDs required is not known in advance. If you use an integer that increases by 1 each time, the same value reappears when the integer rolls over, so uniqueness is lost and it is unsuitable as an ID number. If you use random numbers, the same value may be generated twice, so uniqueness can again be lost, which also makes them unsuitable. Creating a unique identification number in a program is surprisingly troublesome.
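Both failure modes can be shown in a few lines. The following is only a minimal sketch: the 8-bit counter width, the helper name next_id, and the tiny random range are assumptions chosen to make the repeats easy to see, not values from any real system.

```python
import random


def next_id(current: int, bits: int = 8) -> int:
    """Increment an ID held in a fixed-width unsigned integer (hypothetical helper)."""
    return (current + 1) % (1 << bits)


# Incrementing counter: with 8 bits there are only 256 distinct values,
# so the 257th ID handed out repeats the first one.
ids = []
current = 0
for _ in range(257):
    ids.append(current)
    current = next_id(current)

print(ids[0] == ids[256])

# Random numbers: drawing 11 values from only 10 possibilities must
# produce at least one duplicate, whatever the seed is.
rng = random.Random()
draws = [rng.randrange(10) for _ in range(11)]
print(len(set(draws)) < len(draws))
```

A wider counter or a larger random range only makes the repeat rarer; it does not remove it.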

It seems like I can use the system's process ID…

Generating ID numbers yourself is a hassle, so you look around for something the system already provides, and if your OS is Windows or Linux you notice the process ID. Since it is an ID provided by the OS, there is also a system call for obtaining its value. It even has "ID" in its name, so surely there is no problem with its uniqueness as an identification number. After all, it is an ID number provided by the OS, so it is tempting to rest assured that the OS guarantees its uniqueness.

So I think using the process ID as an ID number is a fairly common solution.
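As an illustration of how easy the value is to obtain, here is a minimal Python sketch. os.getpid() is the standard-library wrapper; on Linux the underlying call is getpid(), and on Windows it is GetCurrentProcessId().

```python
import os

# Ask the OS for the ID of the process running this script.
pid = os.getpid()
print(f"This process has PID {pid}")
```

The number printed is only meaningful while this process is alive.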

The process ID is unique only at a given point in time

However, you need to be a little careful when using the process ID as an ID number in your program. If you read the description of the process ID carefully, it says it is "used to uniquely identify a process that exists at a certain point in time". If you don't take in this "at a certain point in time" proviso, bugs will come to you.

The OS guarantees the uniqueness of the process ID only at a given point in time; in other words, it does not guarantee uniqueness over time. Since the process ID is the number the OS uses to manage processes, it only needs to be "a number that can uniquely identify a process that exists at this moment". As a result, it is quite normal for a process ID used by a terminated process to be assigned to a process created later.

Let's look at this a little more concretely. In a multi-process OS, processes are created and terminated dynamically, and a process ID is assigned when a process is created. For example, when process-A is created, the OS assigns it "ID number-1" as a process ID that is unique at that time. Then, when process-A ends, the process ID "ID number-1" is released.

If another process, process-B, is created after "ID number-1" has been released from process-A, the OS may assign the already released "ID number-1" to process-B as its process ID. Process-A and process-B have then both been assigned the same process ID, "ID number-1". However, when process-B is created, process-A has already terminated and no longer exists, so no process ID is duplicated at any single point in time, and the "uniqueness of the process ID at that point in time" is maintained. From the OS's point of view, therefore, there is no problem.
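The process-A / process-B sequence above can be sketched with a toy model. ToyPidAllocator and its "lowest free number" policy are purely hypothetical; real operating systems choose the next process ID in their own ways, and the only property modelled here is that a released ID may be handed out again.

```python
class ToyPidAllocator:
    """Toy model of an OS handing out process IDs (not a real OS policy)."""

    def __init__(self) -> None:
        self.in_use: set[int] = set()

    def spawn(self) -> int:
        # Hand out the lowest number that no live process is using.
        pid = 1
        while pid in self.in_use:
            pid += 1
        self.in_use.add(pid)
        return pid

    def exit(self, pid: int) -> None:
        # The process ended, so its ID is released for reuse.
        self.in_use.remove(pid)


os_model = ToyPidAllocator()

pid_a = os_model.spawn()   # process-A is created
os_model.exit(pid_a)       # process-A ends, its ID is released
pid_b = os_model.spawn()   # process-B is created afterwards

print(pid_a, pid_b)        # both processes were given the same ID
print(os_model.in_use)     # yet only one live process holds it
```

At every moment the set of IDs in use contains no duplicates, which is all the model (like the OS) promises.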

It's a problem for programs that run over time

From the OS's point of view the uniqueness of the process ID is maintained and there is no problem, but if a program uses the process ID as some kind of ID number, problems can occur depending on how it is used. Suppose a program obtains the process ID of a process at some point in time, stores it as the ID number of something, and keeps using it. If the program is written on the assumption that process-A and process-B are different processes and will therefore naturally have different process IDs, then depending on the timing of the creation and termination of process-A and process-B, both may end up with the same value, "ID number-1".

If an ID number that was supposed to be a unique identifier is duplicated, the program will behave incorrectly. What I hate about this kind of bug is that the probability of a problem is very low and in most cases the program works fine. Because it works normally, it is unlikely to be caught in internal testing, and only after the product has shipped does something strange occasionally show up. And even when you try to investigate the cause, reproducibility is poor and the investigation makes little progress.

If you forget that the process ID is unique only at a given moment, you may invite this kind of bug and suffer for it, so please be careful.
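Here is a minimal sketch of how such a program goes wrong. The owner_by_id table and the register function are hypothetical, and the PID value 1 is hard-coded to stand in for "ID number-1".

```python
# Hypothetical bookkeeping table: the program remembers which process
# owns something, using the process ID as the key.
owner_by_id: dict[int, str] = {}


def register(pid: int, name: str) -> None:
    """Store a record under the process ID, assuming IDs never repeat."""
    owner_by_id[pid] = name


register(1, "process-A")   # process-A runs with ID 1, then terminates
register(1, "process-B")   # later, process-B is given the released ID 1

# The record for process-A has been silently overwritten.
print(owner_by_id)
```

No error is raised anywhere; the program simply ends up treating two different processes as the same one.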
