Overview・Useful tests for quality assurance of embedded software

08/07/2021 Test quality

Various tests that are effective for the quality of embedded software

Many books and web pages offer information about software testing. However, much of it focuses on the most common kinds of software, such as web-based software and corporate core systems, and there is little information about tests specific to the quality of embedded software.

Embedded software covers a wide range, but using the control and communication systems that Father Gutara has worked on so far as examples, I will introduce several specific tests that may be useful for ensuring quality. I hope they help when you think about testing.

Test classification is covered in another article, so please see that if you are interested. This article and the ones that follow introduce some more specific tests. This article gives an overview of each test, and the individual articles go into a little more detail.

Memory leak test

This test confirms that dynamic memory is not leaked. It is essential if you use dynamic memory, which is acquired from the OS when needed and returned to the OS after use. A memory leak causes a time bomb bug: the device operates normally for a while after power-on and then suddenly malfunctions at some point, so it is important to detect leaks by testing.
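One way to make a leak visible in a test is to count outstanding allocations before and after repeating an operation many times. The following is only a minimal sketch in C. The wrapper names (tracked_malloc, tracked_free), the function under test (handle_request), and the helper leaked_after are all hypothetical, and it assumes the application can be built to call the wrappers instead of malloc and free directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of blocks currently allocated through the wrappers. */
static long live_blocks = 0;

/* Hypothetical wrapper used by the application instead of malloc(). */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        live_blocks++;
    }
    return p;
}

/* Hypothetical wrapper used by the application instead of free(). */
void tracked_free(void *p)
{
    if (p != NULL) {
        assert(live_blocks > 0);   /* more frees than allocations is a bug */
        live_blocks--;
        free(p);
    }
}

/* Hypothetical function under test: its error path forgets to free. */
int handle_request(int fail)
{
    char *buf = tracked_malloc(64);
    if (buf == NULL) {
        return -1;
    }
    if (fail) {
        return -1;                 /* BUG: buf is leaked on this path */
    }
    tracked_free(buf);
    return 0;
}

/* Repeats the operation and returns how many blocks were leaked. */
long leaked_after(int iterations, int fail)
{
    long before = live_blocks;
    for (int i = 0; i < iterations; i++) {
        (void)handle_request(fail);
    }
    return live_blocks - before;
}
```

With this sketch, leaked_after(1000, 0) returns 0, while leaked_after(1000, 1) returns 1000 because every pass through the error path leaves one block behind. On a real device the same comparison would be made against whatever free-memory figure the OS exposes.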

Non-memory dynamic resource leak test

Like dynamic memory, other dynamic resources are acquired when needed and returned after use. Examples include table entries that the application manages through registration and deletion functions, and communication sockets acquired from the OS. These also become time bomb bugs if leaked, so they need to be tested as well.
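As an illustration of why such a leak behaves like a time bomb, here is a small sketch in C of a fixed-size table with registration and deletion. The table size, the function names, and the leak_every parameter that simulates a forgotten deletion are all assumptions made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define TABLE_SIZE 8   /* assumed capacity of the hypothetical table */

static bool in_use[TABLE_SIZE];

/* Registers an entry; returns its index, or -1 when the table is full. */
int entry_register(void)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Deletes an entry so that its slot can be reused. */
void entry_delete(int index)
{
    assert(index >= 0 && index < TABLE_SIZE);
    in_use[index] = false;
}

/* Repeats register/delete cycles on an empty table. The deletion is
 * skipped on every leak_every-th cycle to simulate a leak (0 = never).
 * Returns the first cycle at which registration fails, or 0 if every
 * cycle succeeds. */
int first_failing_cycle(int cycles, int leak_every)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        in_use[i] = false;
    }
    for (int c = 1; c <= cycles; c++) {
        int idx = entry_register();
        if (idx < 0) {
            return c;
        }
        bool leak = (leak_every > 0) && (c % leak_every == 0);
        if (!leak) {
            entry_delete(idx);
        }
    }
    return 0;
}
```

With a leak on every 100th cycle, the first 800 cycles succeed and registration fails at cycle 801, once all 8 slots have been leaked. A short functional test would never reach that point, which is why the test has to repeat the cycle far more times than the table has entries.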

Rollover test of timers and counters

Alongside dynamic resource leaks, the other major cause of time bomb bugs is a bug hidden in timer rollover handling. A timer holds its value in an unsigned or signed integer variable. When a count-up passes the maximum value of the integer, or a countdown passes the minimum value, problems occur unless the code handles the wraparound correctly.
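The following C sketch shows one typical form of this bug and a test that starts the timer just before the rollover point. It assumes a free-running 32-bit tick counter, and the function names are hypothetical. The wrap-safe version relies on the fact that unsigned subtraction in C is performed modulo 2^32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Naive check: the deadline start + timeout wraps past UINT32_MAX,
 * so the comparison gives the wrong answer near the rollover point. */
bool expired_naive(uint32_t now, uint32_t start, uint32_t timeout)
{
    uint32_t deadline = (uint32_t)(start + timeout);
    return now >= deadline;
}

/* Wrap-safe check: the elapsed time is computed with unsigned
 * subtraction, which stays correct across one rollover. */
bool expired_wrap_safe(uint32_t now, uint32_t start, uint32_t timeout)
{
    uint32_t elapsed = (uint32_t)(now - start);
    return elapsed >= timeout;
}

/* Rollover test: the timer starts 16 ticks before the counter wraps. */
void rollover_checks(void)
{
    const uint32_t start = UINT32_C(0xFFFFFFF0);
    const uint32_t timeout = 32;

    /* 8 ticks elapsed: must not have expired yet. */
    assert(!expired_wrap_safe(UINT32_C(0xFFFFFFF8), start, timeout));
    /* 31 ticks elapsed, counter already wrapped: still not expired. */
    assert(!expired_wrap_safe(UINT32_C(0x0000000F), start, timeout));
    /* 32 ticks elapsed: expired. */
    assert(expired_wrap_safe(UINT32_C(0x00000010), start, timeout));
    /* The naive version reports expiry after only 8 ticks. */
    assert(expired_naive(UINT32_C(0xFFFFFFF8), start, timeout));
}
```

In a real test, waiting for a 32-bit counter to wrap may take days or weeks, so it helps if the counter's initial value can be set close to the maximum for testing.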

Simultaneous execution test of multiple functions

Even when each function works on its own, it is quite common for a problem to occur when function B is used while function A is operating. In the software, this often happens when function A and function B call the same internal processing C and there is a bug in the exclusive control of processing C. Such bugs cannot be found by tests that check a single function, so simultaneous execution of multiple functions must be tested.

Add quasi-normal and abnormal cases to the startup test

At startup, many processes run that do not run during normal operation, such as checking various settings, initializing internal control variables, and connecting to servers. Because these processes run only at startup, they are rarely exercised by normal functional tests and stability verification tests. In addition, if an error occurs at startup and startup fails, the device cannot be used at all, so the impact is large. Therefore, the startup test needs to cover quasi-normal and abnormal cases thoroughly in addition to normal cases.

Deterioration test of secondary storage devices such as flash memory and HDD

Flash memory and hard disks are common secondary storage devices for embedded devices. Both have built-in error recovery functions that assume data will be lost internally, so from the outside they appear to read and write data without errors. In reality, however, as errors increase the access speed may drop or written information may be lost. To confirm that the software remains stable in such a situation, it is important to prepare an error-prone secondary storage device and run a deterioration test with it.

High load aging test

Separately from the long-term aging test, an aging test under a load so high that it would not normally occur is also effective in finding latent defects. What changes between normal load and such high load is how hard the load-leveling mechanisms built into the processing have to work, such as processing wait queues (which spread processing over time to withstand peak load) and cache functions (which defer processing that can be postponed). Under high load, these mechanisms run heavily. High-load aging tests are indispensable for finding latent bugs in parts that are exercised heavily only under high load.

Aging test on poor quality channels

If the device communicates with other devices, the functions that use that communication will naturally be tested. For the communication channel used in testing, in addition to a normal channel, it is a good idea to also run an aging test in a test environment with poor channel quality, where errors occur frequently or transmission is delayed. Unless special care is taken, the communication path in the test room will be very clean. In the actual production environment, on the other hand, the communication path is long and picks up noise from the surroundings, so its quality may be quite poor. An aging test on a poor quality communication path confirms that operation is stable in such an environment.
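When a physically noisy line is hard to arrange, a software stub that injects losses can serve as a rough substitute. The C sketch below is only an illustration under that assumption: the lossy_channel_t type, the rule of dropping every N-th frame, and the retry function are all hypothetical and are not taken from any real protocol stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fault-injecting channel that stands in for a noisy
 * line: it drops every drop_every-th frame (0 = never drop). */
typedef struct {
    unsigned drop_every;
    unsigned sent;        /* frames handed to the channel */
    unsigned delivered;   /* frames that reached the peer */
} lossy_channel_t;

/* Sends one frame; returns false when the frame is lost. */
bool channel_send(lossy_channel_t *ch)
{
    assert(ch != NULL);
    ch->sent++;
    if (ch->drop_every != 0 && ch->sent % ch->drop_every == 0) {
        return false;
    }
    ch->delivered++;
    return true;
}

/* Sends one message, retrying up to max_retries times after a loss. */
bool send_with_retry(lossy_channel_t *ch, unsigned max_retries)
{
    for (unsigned attempt = 0; attempt <= max_retries; attempt++) {
        if (channel_send(ch)) {
            return true;
        }
    }
    return false;
}

/* Sends a number of messages over a fresh channel and returns how
 * many of them failed even after the retries. */
unsigned failed_messages(unsigned messages, unsigned drop_every,
                         unsigned max_retries)
{
    lossy_channel_t ch = { drop_every, 0, 0 };
    unsigned failed = 0;
    for (unsigned i = 0; i < messages; i++) {
        if (!send_with_retry(&ch, max_retries)) {
            failed++;
        }
    }
    return failed;
}
```

On a clean channel nothing fails, so the retry path is never executed. With every third frame dropped and no retries, 333 of 1000 messages fail, and with one retry none fail. Only the poor quality channel exercises the retry code, which is where the bugs this test looks for tend to hide.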

Resource halving test

Embedded software is designed, built, and tested on the premise of predetermined hardware resources (CPU performance, main memory capacity, communication speed, etc.). Things are fine when hardware resources are plentiful, but in product development that pursues cost performance, the software may have to run on fairly limited hardware resources. In that case, slight fluctuations in the operating environment or in the amount of input can leave hardware resources insufficient. One way to check whether the software keeps operating stably in such a situation is to test it in advance in an environment where the hardware resources have been halved.

Hard watchdog test

Embedded devices often have a built-in hard watchdog circuit that detects CPU hangs and attempts recovery by resetting. The important design items of a hard watchdog circuit are what the heartbeat signal is (the signal that confirms the CPU is not hung) and what means of recovery is used (soft reset, hard reset, or power off/on). If you do not confirm that these design items operate as designed, the recovery you were counting on will not work when the hard watchdog actually fires, and the damage will be considerable.

Soft watchdog test

In multitasking or multithreaded software of a certain size, a soft watchdog function may be implemented for each software function. The soft watchdog monitors a heartbeat process (a process that confirms the function is not hung) defined for each function of the software, and resets a function when it observes that the function has hung or stopped. This recovers the problematic function while minimizing the effect on other functions. If such a soft watchdog is implemented, testing is required to confirm that it works as designed.
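To make the mechanism and its test concrete, here is a minimal C sketch of a heartbeat-counter style soft watchdog. It is an assumption about how such a monitor might look, not a description of any particular product: the number of monitored functions, the threshold MAX_MISSED, and all function names are hypothetical, and the "reset" is represented only by a counter.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_FUNCS  3   /* assumed number of monitored functions */
#define MAX_MISSED 2   /* assumed: reset after this many silent checks */

typedef struct {
    uint32_t heartbeat;   /* incremented by the monitored function */
    uint32_t last_seen;   /* heartbeat value at the previous check */
    uint32_t missed;      /* consecutive checks without progress */
    uint32_t resets;      /* number of times the function was reset */
} monitored_t;

static monitored_t funcs[NUM_FUNCS];

/* Called by a monitored function to show that it is still running. */
void swd_kick(int id)
{
    assert(id >= 0 && id < NUM_FUNCS);
    funcs[id].heartbeat++;
}

/* Called periodically by the monitor. Only a function whose heartbeat
 * has stopped is reset; the other functions are left alone. */
void swd_check(void)
{
    for (int i = 0; i < NUM_FUNCS; i++) {
        monitored_t *f = &funcs[i];
        if (f->heartbeat != f->last_seen) {
            f->last_seen = f->heartbeat;
            f->missed = 0;
        } else if (++f->missed >= MAX_MISSED) {
            f->resets++;      /* stand-in for restarting the function */
            f->missed = 0;
        }
    }
}

/* Test scenario: function `hung` stops kicking (-1 = nobody hangs)
 * while the others keep running for the given number of monitor
 * periods. Returns the reset count of function `observed`. */
uint32_t resets_after_hang(int hung, int observed, int periods)
{
    assert(observed >= 0 && observed < NUM_FUNCS);
    for (int i = 0; i < NUM_FUNCS; i++) {
        funcs[i] = (monitored_t){0, 0, 0, 0};
    }
    for (int p = 0; p < periods; p++) {
        for (int i = 0; i < NUM_FUNCS; i++) {
            if (i != hung) {
                swd_kick(i);
            }
        }
        swd_check();
    }
    return funcs[observed].resets;
}
```

A test then checks two things: that the hung function is reset (resets_after_hang(1, 1, 4) returns 2), and that the healthy functions are not touched (resets_after_hang(1, 0, 4) returns 0). The second check matters as much as the first, because the purpose of the soft watchdog is to limit the effect on other functions.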

Timeout test

Communication processing in software falls into two kinds: internal communication with other tasks and threads on the same CPU, and external communication with other devices or equipment. In both, a timeout period is set so that the software does not hang when the communication partner fails to return a response within a certain time. The timeout test checks that timeout processing works as designed when the timeout period is exceeded. The timeout period can be affected by modifications to your own software or by changes to the software of the partner device, so a timeout test should be performed whenever a change affects the timeout period.
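A timeout test usually needs a communication partner whose response time can be controlled. The C sketch below assumes a stub peer and a simulated tick count instead of a real clock, so the boundary around the timeout period can be tested exactly. The types and function names are hypothetical, and the rule that a reply arriving exactly at the timeout counts as a timeout is a design choice of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { RESULT_OK, RESULT_TIMEOUT } result_t;

/* Hypothetical stub peer: it answers `delay` ticks after the request,
 * or never when delay is negative. */
typedef struct {
    int32_t delay;
} stub_peer_t;

static int peer_has_replied(const stub_peer_t *peer, uint32_t elapsed)
{
    return peer->delay >= 0 && elapsed >= (uint32_t)peer->delay;
}

/* Polls the peer once per simulated tick and gives up once
 * timeout_ticks have passed without a reply. */
result_t request_with_timeout(const stub_peer_t *peer,
                              uint32_t timeout_ticks,
                              uint32_t *elapsed_out)
{
    assert(peer != NULL && elapsed_out != NULL);
    for (uint32_t elapsed = 0; elapsed < timeout_ticks; elapsed++) {
        if (peer_has_replied(peer, elapsed)) {
            *elapsed_out = elapsed;
            return RESULT_OK;
        }
    }
    *elapsed_out = timeout_ticks;
    return RESULT_TIMEOUT;
}

/* Test helper: one request to a stub peer with the given delay. */
result_t try_peer(int32_t delay, uint32_t timeout_ticks)
{
    stub_peer_t peer = { delay };
    uint32_t elapsed = 0;
    return request_with_timeout(&peer, timeout_ticks, &elapsed);
}
```

With a timeout of 100 ticks, a reply after 99 ticks is accepted, a reply after 100 ticks is treated as a timeout, and a peer that never replies also ends in a timeout instead of a hang. Testing just below and just at the boundary is what shows whether the timeout works according to the designed value.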

Various tests are required to guarantee the quality of embedded software

To improve or guarantee the quality of embedded software, it is necessary to test from various points of view. The tests in this article are only examples, but I hope they help a little when you think about testing.

In the following articles, I will introduce each test a little more concretely based on the experience of Father Gutara, so please have a look if you are interested.
