What is Design for Testability (DFT) in VLSI?

What is Design for Testability, and why do we need it?

Problems with manufacturing ICs

Today, semiconductors lie at the heart of ongoing advances across the electronics industry. The introduction of new technologies, especially nanometre technologies with 14 nm or smaller geometry, has allowed the semiconductor industry to keep pace with increased performance-capacity demands from consumers. This has brightened the prospects for future industry growth.

However, new technologies come with new challenges. Smaller feature sizes increase the probability of certain defects, and defects in ICs are highly undesirable. Here’s a list of some of the issues that arise while manufacturing chips.

Density Issue: Fabrication processes have become quite complicated with the advent of deep-submicron design technologies. Design elements are coming closer and closer; they are becoming smaller and thinner. Billions of transistors are involved in present-day VLSI chips. So, the chances of two wires touching each other or a very thin wire breaking in between are high. These are a few sources of errors or faults. The point is, there can be many such errors that can creep in during the design and fabrication processes. So, with an increase in density, the probability of failure also becomes high.
Software Issue: Moreover, apart from fabrication, there can even be errors in the translation process due to the bugs in CAD software tools used to design the chip.
Application Issue: There are several critical applications in which we can’t afford to have faults in the chip at any cost. For example, in medical or healthcare applications, a single fault in the equipment controllers may put a person’s life at risk. Rockets or space shuttles that run on cryogenic fuel may need their microcontroller or microprocessor to operate over a broader temperature range. Hence, the test conditions for these chips should be application-specific and extreme enough to prevent future failures.

Maintenance Issue: If a failure occurs later, repair or maintenance requires us to identify the precise location of the fault. Since PCB sizes are also decreasing, multimeter testing isn’t a viable option anymore. Moreover, with the move towards SoC (System on Chip) design, modular design is losing its relevance, which makes maintenance more expensive.
Business Issue: If the designed chips are found to be faulty, the result is a substantial loss and penalty for the company. Later, we will discuss how detecting a fault earlier significantly decreases the cost of doing business.

Faults may also arise after fabrication, during the packaging process.

With all these issues in mind, it becomes vital to test every chip before it can be shipped and in fact, test it after every level of manufacturing.

Solutions to these problems: DFT

Testing does not come for free. Modern microprocessors contain more than 1000 pins and pack a myriad of functionalities inside them. If any single transistor inside a chip is faulty, the whole chip may need to be discarded. We, as consumers, do not expect faulty chips from manufacturers. But identifying that one defective transistor out of billions is a headache. We might need to test every functionality with every possible input combination. If testing were done that way, the time-to-market would be so long that the chips might never reach consumers. So, how do we tackle this? We use a methodology that adds a feature to these chips. The methodology is called DFT, short for Design for Testability, and the feature it adds to a chip is ‘testability.’
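To get a feel for why testing every combination is impractical, here is a rough back-of-the-envelope sketch. It assumes a purely combinational block with n inputs, so there are 2^n input combinations, and a hypothetical tester that applies one billion patterns per second. The function names and the tester rate are illustrative assumptions, not figures from any real tester.

```python
# Rough estimate of exhaustive test time for an n-input combinational block.
# The tester rate used below is an assumed, illustrative figure.

SECONDS_PER_YEAR = 365 * 24 * 3600


def exhaustive_patterns(num_inputs: int) -> int:
    """Number of input combinations for a purely combinational block."""
    return 2 ** num_inputs


def exhaustive_test_years(num_inputs: int, patterns_per_second: float) -> float:
    """Years needed to apply every input combination exactly once."""
    return exhaustive_patterns(num_inputs) / patterns_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = 1e9  # assumed: one billion patterns per second
    for n in (16, 32, 64):
        years = exhaustive_test_years(n, rate)
        print(f"{n} inputs: {exhaustive_patterns(n)} patterns, {years:.3g} years")
```

Even at this optimistic rate, a 64-input block would take several hundred years to test exhaustively, and sequential logic only makes the count worse because the internal state multiplies the number of cases.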

Design for Testability is a technique that adds testability features to a hardware product design. The added features make it easier to develop and apply manufacturing tests to the designed hardware.

In simple words, Design for testability is a design technique that makes testing a chip possible and cost-effective by adding additional circuitry to the chip.

Alternatively, Design-for-testability techniques improve the controllability and observability of internal nodes, so that embedded functions can be tested.

Role of DFT

Testing of Sequential Circuits

DFT offers a solution to the issue of testing sequential circuits. Sequential circuits are hard to test because clocks and flip-flops are involved.

Unlike combinational circuits, we can’t determine the output of a sequential circuit merely by looking at its inputs. A sequential circuit has a finite number of internal states, held in its flip-flops, and the output also depends on the current state of the machine. It is difficult to control and observe the internal flip-flops from outside the chip.

Hence, the state machines cannot be tested unless they are initialized to a known value. And to initialize them, we need a specific set of features in addition to the typical circuitry. DFT enables us to add this functionality to a sequential circuit and thus allows us to test it.
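As a minimal sketch of this idea, the toy model below stitches the flip-flops of a 3-bit counter into a serial scan chain, one common way of making state controllable and observable (scan appears later in this lesson under structured techniques). The class and function names (`ScanCounter`, `scan_load`, `scan_unload`, `scan_test`) are hypothetical, and the model is a behavioural illustration in Python, not real hardware or a tool flow.

```python
class ScanCounter:
    """Toy 3-bit counter whose flip-flops are stitched into a scan chain."""

    def __init__(self):
        # None models an unknown (X) power-up value in each flip-flop.
        self.q = [None, None, None]

    def clock(self, enable=0, scan_enable=0, scan_in=0):
        """Apply one clock edge and return the bit that was at scan_out."""
        scan_out = self.q[-1]
        if scan_enable:
            # Test mode: the flip-flops behave as one shift register.
            self.q = [scan_in] + self.q[:-1]
        elif None not in self.q:
            # Functional mode: increment modulo 8 when enabled.
            value = (self.q[0] * 4 + self.q[1] * 2 + self.q[2] + enable) % 8
            self.q = [(value >> 2) & 1, (value >> 1) & 1, value & 1]
        # In functional mode with unknown bits, the state stays unknown.
        return scan_out

    def scan_load(self, bits):
        """Shift a chosen state into the flip-flops (controllability)."""
        for bit in reversed(bits):
            self.clock(scan_enable=1, scan_in=bit)

    def scan_unload(self):
        """Shift the captured state out for comparison (observability)."""
        return [self.clock(scan_enable=1) for _ in range(3)][::-1]


def scan_test(start, enable):
    """Load a state, apply one functional clock, and read back the result."""
    dut = ScanCounter()
    dut.scan_load(start)
    dut.clock(enable=enable)
    return dut.scan_unload()


if __name__ == "__main__":
    untested = ScanCounter()
    untested.clock(enable=1)
    print("without initialization:", untested.q)
    print("loaded 110, one count:", scan_test([1, 1, 0], 1))
```

Without the scan path, the counter powers up in an unknown state and a functional clock tells us nothing. With it, any state can be loaded in three shifts, exercised for one functional clock, and read back out.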

Improving the chip manufacturing process

DFT accomplishes two significant goals in the chip manufacturing process:

Reject Defective Modules (Product Quality)

Testing exposes the errors in the manufacturing process that create faults in the chips being produced. If faults are detected early, the underlying process problem can be corrected at that point. This saves time and money, because faulty chips can be discarded before further manufacturing steps, such as packaging, are spent on them.

Monitor and Improve Manufacturing Process

Testing is applied at every phase or level of abstraction from RTL to ASIC flow. This identifies the stage when the process variables move outside acceptable values. This simplifies failure analysis by identifying the probable defect location. Meticulous monitoring improves process-line accuracy and decreases the fault occurrence probability.

Can DFT permanently eliminate faults?

So, does testing guarantee that the chip will never be faulty again?

No, faults can arise even after the chip is in the consumer’s hands. A chip may misbehave at any time if it is exposed to very high temperatures or a humid environment, or simply due to aging.

Want a live explanation? If you have an unlocked processor, you could try overclocking your CPU. But would you do it? Please don’t!

Overclocking means raising the system frequency and voltage above their rated values. An improperly configured overclock can violate timing margins and cause instability. Prolonged overclocking overheats and stresses the system, shortening the lifespan of your computer. It may cause intermittent faults in the chip and random crashes in the future, and it may void your warranty too. This example is just one high-level illustration of how a fault may occur in real life.

The point is, you can even generate a fault on your own. A chip can’t be made completely immune to faults; some are always bound to occur. So, what are we trying to achieve? Testing a device increases our confidence in it. By testing a chip, vendors try to minimize the possibility of future errors and failures.

To ensure the highest quality of chips, there is also an auxiliary process involved in the chip-design process called Verification.

Verification vs Testing

What is the difference between Verification and Testing?

Verification establishes the correctness and logical functionality of the design before fabrication. It is carried out after the RTL (Register Transfer Level) design has been coded in a hardware description language such as VHDL or Verilog, using a testbench written in a high-level language. It is performed once, on the design, before the chip is actually manufactured. In industry, this is commonly done with simulation-based methodologies such as UVM (Universal Verification Methodology) in SystemVerilog, alongside formal verification. Verification is a vast topic on its own, and we will cover it in this VLSI track and link it here soon.

In contrast, testing checks the correctness of the manufactured chips after each stage of manufacturing. Testing needs to be performed on every manufactured chip, because any one of them may have picked up a defect during fabrication or packaging. By testing, we improve the quality of the devices that are sold in the market.

Verification	Testing
Verifies correctness of the design.	Verifies correctness of the manufactured hardware.
Performed by simulation, hardware emulation, or formal methods.	Two-part process: (1) Test generation: a software process executed once during design. (2) Test application: electrical tests applied to the hardware, performed many times.
Performed once prior to manufacturing.	Test application is performed on every manufactured device.
Responsible for the quality of design.	Responsible for the quality of devices.

Career choice: Verification vs DFT

Let’s segue into the career aspect of these two stages for a moment.

Which is the better career option: Verification or DFT?

Both Verification and DFT have their importance in the VLSI industry. Both of them have an excellent scope, as you see from the product development perspective. You can choose any one of them, depending upon your subject of interest.

About two-thirds of VLSI design time is invested in the verification process, making it the most time-consuming step in the VLSI design flow. Hence, there are also far more verification engineers than DFT engineers. If you work as a DFT engineer, your team will be much smaller than the verification team.

DFT

For DFT, you need to be good at CMOS VLSI, Digital Electronics, Testing of Digital Circuits, and Verilog, and have a little scripting knowledge. These subjects will play a significant role in your day-to-day work. You will work with DFT EDA and ATPG tools, scripting them in languages like Perl, shell, or Tcl. You will work closely with physical design engineers and RTL design engineers. The career path tends to be more aligned with backend/physical design, and you will have to deal with the complexities and challenges of newer technologies.

Verification

For the Verification domain, you will work on design development and on advanced constrained-random testbenches. This demands analytical and software programming skills, along with hardware skills. You need to have expertise in Verilog, SystemVerilog, and C++.

Verification is performed at two stages: Functional Verification and Physical Verification. Most verification engineers don’t get involved in circuits, transistors, or the backend design part; they deal only with the frontend domain. This is the highest level of abstraction in the VLSI industry, and you have a lot of freedom in how you verify the design. To become a Verification expert, you have to gain practical experience rather than mostly theoretical knowledge.

Some Terminologies

Here are a few terms that we will often use in this free Design for Testability course. Don’t fret if you can’t completely understand them yet; we will be covering them in depth in this course.

Testing: An experiment in which the system is put to work and its resulting response is analyzed to ascertain whether it behaved correctly.

Diagnosis: The process of locating the cause of misbehavior in the circuit once it has been observed.

Defect: A flaw in the actual hardware or electronic system.

Fault: A model or representation of a defect, used for analysis in a computer program.

Error: A wrong value on a line or gate output, caused by a defect in the hardware.

Failure: This occurs when a defect causes misbehavior in the circuit or in the functionality of a system that cannot be reversed or recovered from.

Fault Coverage: The fraction (often quoted as a percentage) of the total number of logical faults that can be detected using a given test set T.

$FC = \frac{\text{No. of detected faults}}{\text{Total no. of faults}}$
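To make the fault coverage ratio concrete, here is a small sketch that simulates faults in a tiny circuit, y = (a AND b) OR c. It assumes the single stuck-at fault model (each line is stuck at 0 or stuck at 1, one fault at a time), which is a common choice but an assumption here, as are the circuit and the names `simulate` and `fault_coverage`.

```python
# Toy fault simulation for y = (a AND b) OR c under an assumed
# single stuck-at fault model. Circuit and names are illustrative.

LINES = ("a", "b", "c", "n1", "y")
ALL_FAULTS = [(line, value) for line in LINES for value in (0, 1)]


def simulate(a, b, c, fault=None):
    """Evaluate the circuit, optionally with one (line, stuck_value) fault."""

    def apply(line, value):
        # Override the line's value if this is the faulty line.
        return fault[1] if fault and fault[0] == line else value

    a, b, c = apply("a", a), apply("b", b), apply("c", c)
    n1 = apply("n1", a & b)
    return apply("y", n1 | c)


def fault_coverage(test_set):
    """Fraction of modelled faults detected by at least one test vector."""
    detected = [
        fault
        for fault in ALL_FAULTS
        if any(simulate(*t) != simulate(*t, fault=fault) for t in test_set)
    ]
    return len(detected) / len(ALL_FAULTS)


if __name__ == "__main__":
    print(fault_coverage([(1, 1, 0)]))
    print(fault_coverage([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]))
```

A fault counts as detected when at least one test vector makes the faulty circuit’s output differ from the fault-free output. In this toy circuit, a single vector covers only some of the ten modelled faults, while the four vectors in the second call cover all of them.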

Defect Level: The fraction of shipped parts that are defective. In other words, the proportion of chips classified as good in which a fault is present but was not detected.

$DL = 1 - Y^{(1 - FC)}$

where Y is the yield, that is, the fraction of the fabricated chips that are good.
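As a quick worked example of the defect level formula, the sketch below evaluates it for a few fault coverage values. Both Y and FC are passed as fractions between 0 and 1 (not percentages). The function name `defect_level` and the sample yield of 0.5 are illustrative assumptions.

```python
def defect_level(yield_fraction: float, fault_coverage: float) -> float:
    """Return DL = 1 - Y^(1 - FC), with Y and FC given as fractions."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    if not 0.0 <= fault_coverage <= 1.0:
        raise ValueError("fault_coverage must be in [0, 1]")
    return 1.0 - yield_fraction ** (1.0 - fault_coverage)


if __name__ == "__main__":
    y = 0.5  # assumed yield, for illustration only
    for fc in (0.0, 0.90, 0.99, 1.0):
        dl = defect_level(y, fc)
        print(f"FC = {fc:.2f}: DL = {dl:.6f} ({dl * 1e6:.0f} defective parts per million)")
```

Two sanity checks follow directly from the formula: with no testing at all (FC = 0), the defect level equals 1 - Y, and with perfect coverage (FC = 1), it drops to zero.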

Levels of Testing

Testing is carried out at various levels:

Chip-level, when chips are manufactured.
Board-level, when chips are integrated on the boards.
System-level, when several boards are assembled together.

There is an empirical rule of thumb that finding a fault becomes about ten times more expensive each time we move to the next higher level (chip → board → system). As we move to higher levels, more components are integrated, which makes fault detection and localization much more difficult and expensive.
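The rule of thumb above can be written as simple arithmetic. This sketch normalizes the chip-level cost to 1 and applies the factor of ten per level; the function name `relative_cost` is hypothetical and the numbers are relative units, not real costs.

```python
# Rule-of-ten sketch: relative cost of finding a fault at each level,
# normalized so that the chip level costs 1 unit.

LEVELS = ("chip", "board", "system")


def relative_cost(level: str, factor: int = 10) -> int:
    """Relative cost of catching a fault at the given level."""
    return factor ** LEVELS.index(level)


if __name__ == "__main__":
    for level in LEVELS:
        print(f"{level}: {relative_cost(level)}x")
```

Under this rule, a fault that escapes chip-level test and is caught only at system level costs roughly a hundred times more to find.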

Sources of Faults in ICs

Here are a few possible sources of faults:

The fabrication process, such as missing contact windows, parasitic transistors, etc.
Defects in the materials, such as cracks or imperfections in the substrate, surface impurities, etc.
Aging, caused by dielectric breakdown, electromigration, etc.
Packaging, such as contact degradation, disconnection, etc.

Classification of Faults

Faults can be classified into various subcategories.

DFT Techniques

DFT techniques are broadly classified into two types:

Ad-hoc techniques

These are a collection of techniques or rules (do’s and don’ts) for the chip design process, learned from design experience, that make a design easier to test. Basically, these are rules that have been gathered over time after encountering various errors.

Advantages

Test vector generation is simplified
Very easy to implement, with no design rules or constraints and very low area overhead.

Drawbacks

Not always reusable, since each design has its own specific requirements and testability problems.
No guarantee of high testability levels for every circuit.
Not systematic enough to enable a uniform approach to testable circuit design.

Examples

The following are a few ad-hoc rules that designers generally follow:

Large circuits should be partitioned into smaller sub-circuits to reduce test cost.
Test access points must be inserted to enhance the controllability and observability of the circuit. This is done either by increasing the number of nodes or by multiplexing existing primary outputs so that internal nodes can be observed.
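To illustrate the second rule, the sketch below counts how many input patterns expose a fault on an internal node, with and without an observation point on that node. The circuit, y = (a AND b) AND c AND d AND e with internal node n1 = a AND b, is a made-up example, and it assumes a stuck-at fault on n1; the function names are hypothetical.

```python
from itertools import product


def circuit(a, b, c, d, e, n1_stuck=None):
    """Return (y, n1) for y = (a AND b) AND c AND d AND e."""
    # n1 is the internal node; force its value if a stuck-at fault is injected.
    n1 = a & b if n1_stuck is None else n1_stuck
    y = n1 & c & d & e
    return y, n1


def detecting_patterns(stuck_value, observe_n1):
    """Count input patterns on which the n1 stuck-at fault is visible."""
    count = 0
    for bits in product((0, 1), repeat=5):
        good = circuit(*bits)
        bad = circuit(*bits, n1_stuck=stuck_value)
        if observe_n1:
            # Observation point: both y and n1 are visible to the tester.
            count += good != bad
        else:
            # Only the primary output y is visible.
            count += good[0] != bad[0]
    return count


if __name__ == "__main__":
    for stuck in (0, 1):
        print(
            f"n1 stuck-at-{stuck}:",
            detecting_patterns(stuck, observe_n1=False), "patterns via y only,",
            detecting_patterns(stuck, observe_n1=True), "with an observation point",
        )
```

Because the fault effect on n1 must pass through three more gating inputs to reach y, very few of the 32 patterns reveal it at the primary output. Observing n1 directly removes that dependence, which is the gain in observability that a test access point provides.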

Structured techniques

In these techniques, extra logic and signals are added to the circuit so that it can be tested according to a predefined procedure.

Advantage

In contrast to ad-hoc techniques, structured DFT means that the same design approach can always be used, and it assures good testability levels regardless of the circuit function. For modern, large designs, structured techniques are the practical choice.

Disadvantage

There is, however, a price to pay, which usually consists of accepting that some design rules (rather a design style) are enforced and that additional silicon area and propagation delays are tolerated.

Examples

The following are a few examples of structured DFT, which we will cover extensively in future lessons:

Scan Path
Partial Scan
Level Sensitive Scan
BIST
Boundary Scan

Summary (TL;DR)

This was a short introduction to the concept of Design for Testability in VLSI. We also saw an overview of what it entails and what’s to come in this course. The key takeaway is just that there is a lot of room for error in the manufacturing of ICs. These errors can be costly in more ways than just financially. To reduce these errors significantly, a methodology known as DFT exists. This methodology adds a bunch of features to test the chips. To learn how that’s done, and everything it entails, keep up with the course!
