Unit - 5
Metrics and cost estimation
1. The phrase “lines of code” (LOC) is a metric generally used to evaluate a software program or codebase according to its size.
2. It is a general identifier taken by adding up the number of lines of code used to write a program. LOC is used in various ways to assess a project, and there is a debate on how effective this measurement is.
3. Source lines of code (SLOC), also known as lines of code (LOC), is a software metric used to measure the size of a computer program by counting the number of lines in the text of the program's source code.
4. SLOC is typically used to predict the amount of effort that will be required to develop a program, as well as to estimate programming productivity or maintainability once the software is produced.
5.1.2 Measurement methods
2. Many useful comparisons involve only the order of magnitude of lines of code in a project. Using lines of code to compare a 10,000-line project with a 100,000-line project is far more useful than comparing a 20,000-line project with a 21,000-line project.
2. While it is debatable exactly how to measure lines of code, discrepancies of an order of magnitude can be clear indicators of software complexity or man-hours.
3. There are two major types of SLOC measures: physical SLOC (LOC) and logical SLOC (LLOC). Specific definitions of these two measures vary, but the most common definition of physical SLOC is a count of lines in the text of the program's source code excluding comment lines.
4. Logical SLOC attempts to measure the number of executable "statements", but their specific definitions are tied to specific computer languages (one simple logical SLOC measure for C-like programming languages is the number of statement-terminating semicolons).
5. It is much easier to create tools that measure physical SLOC, and physical SLOC definitions are easier to explain. However, physical SLOC measures are sensitive to logically irrelevant formatting and style conventions, while logical SLOC is less sensitive to formatting and style conventions.
6. However, SLOC measures are often stated without giving their definition, and logical SLOC can often be significantly different from physical SLOC.
7. Consider this snippet of C code as an example of the ambiguity encountered when determining SLOC:
for (i = 0; i < 100; i++) printf("hello"); /* How many lines of code is this? */
8. In this example we have:
8.1 1 physical line of code (LOC),
8.2 2 logical lines of code (LLOC) (for statement and printf statement),
8.3 1 comment line.
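To make the physical/logical distinction above concrete, the following is a minimal C++ sketch of a line counter for C-like code. The counting rules are simplifying assumptions for illustration only: a physical SLOC is any non-blank line that is not a pure // comment line, and logical SLOC is approximated by counting semicolons. Block comments and string literals are deliberately ignored.

    // Minimal SLOC counter sketch (illustrative assumptions only).
    #include <iostream>
    #include <string>

    int main() {
        std::string line;
        int physical = 0, logical = 0;
        while (std::getline(std::cin, line)) {
            // Find the first non-whitespace character to classify the line.
            auto start = line.find_first_not_of(" \t");
            if (start == std::string::npos)
                continue;                              // blank line: not counted
            if (line.compare(start, 2, "//") == 0)
                continue;                              // pure comment line: not counted
            ++physical;                                // physical SLOC
            for (char c : line)
                if (c == ';')
                    ++logical;                         // crude logical SLOC
        }
        std::cout << "Physical SLOC: " << physical << "\n"
                  << "Logical SLOC (approx.): " << logical << "\n";
        return 0;
    }

Note that this crude heuristic would report 3 logical lines for the for-loop example above, because it cannot tell the semicolons inside the for header apart from statement terminators; a real measurement tool needs a proper parser, which is exactly why published SLOC figures should always state their counting rules.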
Key takeaways:
1. The phrase “lines of code” (LOC) is a metric generally used to evaluate a software program or codebase according to its size.
2. SLOC is typically used to predict the amount of effort that will be required to develop a program, as well as to estimate programming productivity or maintainability once the software is produced.
3. While it is debatable exactly how to measure lines of code, discrepancies of an order of magnitude can be clear indicators of software complexity or man-hours.
4. Logical SLOC attempts to measure the number of executable "statements", but their specific definitions are tied to specific computer languages (one simple logical SLOC measure for C-like programming languages is the number of statement-terminating semicolons)
5.2.1 Introduction
1. Cyclomatic complexity is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program's source code. It was developed by Thomas J. McCabe, Sr. in 1976.
2. Cyclomatic complexity is computed using the control flow graph of the program: the nodes of the graph correspond to indivisible groups of commands of a program, and a directed edge connects two nodes if the second command might be executed immediately after the first command.
3. Cyclomatic complexity may also be applied to individual functions, modules, methods or classes within a program.
4. One testing strategy, called basis path testing by McCabe who first proposed it, is to test each linearly independent path through the program; in this case, the number of test cases will equal the cyclomatic complexity of the program.
5.2.2 Definition and explanation
1. Consider the control flow graph of a simple program. The program begins executing at an entry node, then enters a loop (a group of three nodes immediately below the entry node). On exiting the loop, there is a conditional statement (the group below the loop), and finally the program exits at an exit node. This graph has 9 edges, 8 nodes, and 1 connected component, so the cyclomatic complexity of the program is 9 − 8 + 2 × 1 = 3.
2. The cyclomatic complexity of a section of source code is the number of linearly independent paths within it—where "linearly independent" means that each path has at least one edge that is not in one of the other paths.
3. For instance, if the source code contained no control flow statements (conditionals or decision points), the complexity would be 1, since there would be only a single path through the code.
4. If the code had one single-condition IF statement, there would be two paths through the code: one where the IF statement evaluates to TRUE and another one where it evaluates to FALSE, so the complexity would be 2. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3.
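As an illustration of the path counts described above, here are a few small, hypothetical functions (valid C++); the cyclomatic complexity noted in each comment follows directly from the informal counting of decision points just described.

    /* Illustrative examples of cyclomatic complexity (CC). */

    int no_branch(int x)        /* no decision points: CC = 1 */
    {
        return x + 1;
    }

    int one_if(int x)           /* one single-condition IF: two paths, CC = 2 */
    {
        if (x > 0)
            x = x * 2;
        return x;
    }

    int nested_ifs(int x)       /* two nested single-condition IFs: CC = 3 */
    {
        if (x > 0) {
            if (x < 100)
                x = x + 1;
        }
        return x;
    }

    int compound_if(int x)      /* one IF with two conditions: CC = 3 */
    {
        if (x > 0 && x < 100)
            x = x + 1;
        return x;
    }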
5. Mathematically, the cyclomatic complexity of a structured program[a] is defined with reference to the control flow graph of the program, a directed graph containing the basic blocks of the program, with an edge between two basic blocks if control may pass from the first to the second. The complexity M is then defined as:
M = E − N + 2P,
where
E = the number of edges of the graph.
N = the number of nodes of the graph.
P = the number of connected components.
6. The same function can also be represented using the alternative formulation described next, in which each exit point is connected back to the entry point. That graph has 10 edges, 8 nodes, and 1 connected component, which also results in a cyclomatic complexity of 3 (10 − 8 + 1 = 3).
7. An alternative formulation is to use a graph in which each exit point is connected back to the entry point. In this case, the graph is strongly connected, and the cyclomatic complexity of the program is equal to the cyclomatic number of its graph (also known as the first Betti number), which is defined as
M = E − N + P.
8. This may be seen as calculating the number of linearly independent cycles that exist in the graph, i.e. those cycles that do not contain other cycles within themselves. Note that because each exit point loops back to the entry point, there is at least one such cycle for each exit point.
9. For a single program (or subroutine or method), P is always equal to 1, so a simpler formula for a single subroutine is
M = E − N + 2.
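As a worked check of the formula, take the hypothetical one_if function sketched earlier (a single-condition IF with no ELSE). Its control flow graph can be drawn with three nodes (the decision, the IF body, and the exit) and three edges (decision to body, body to exit, and decision to exit), so:

E = 3, N = 3, P = 1
M = E − N + 2P = 3 − 3 + 2 = 2

which matches the two paths (condition true, condition false) counted informally above.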
10. Applications
10.1 Limiting complexity during development
10.2 Measuring the "structuredness" of a program
10.3 Implications for software testing
10.4 Correlation to number of defects
Key takeaways:
1. Cyclomatic complexity is a software metric used to indicate the complexity of a program
2. Cyclomatic complexity may also be applied to individual functions, modules, methods or classes within a program.
3. In the control flow graph of a simple program, execution begins at an entry node, then enters a loop (a group of three nodes immediately below the entry node); on exiting the loop there is a conditional statement (the group below the loop), and finally the program exits at an exit node.
4. If the code had one single-condition IF statement, there would be two paths through the code: one where the IF statement evaluates to TRUE and another one where it evaluates to FALSE, so the complexity would be 2. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3.
5. For a single program (or subroutine or method), P is always equal to 1. So a simpler formula for a single subroutine is
M = E − N + 2.
5.3 Halstead's metrics
1. Halstead complexity measures are software metrics introduced by Maurice Howard Halstead in 1977 as part of his treatise on establishing an empirical science of software development.
2. Halstead made the observation that metrics of the software should reflect the implementation or expression of algorithms in different languages, but be independent of their execution on a specific platform. These metrics are therefore computed statically from the code.
3. Halstead's goal was to identify measurable properties of software, and the relations between them. This is similar to the identification of measurable properties of matter (like the volume, mass, and pressure of a gas) and the relationships between them (analogous to the gas equation). Thus his metrics are actually not just complexity metrics.
4. Halstead metrics are:
4.1 Program Volume (V): The unit of measurement of volume is the standard unit for size, "bits". It is the actual size of a program if a uniform binary encoding for the vocabulary is used.
V = N * log2(n)
4.2 Program Level (L): The value of L ranges between zero and one, with L = 1 representing a program written at the highest possible level (i.e., with minimum size).
L = V* / V
4.3 Program Difficulty (D): The difficulty level or error-proneness of the program is proportional to the number of unique operators in the program.
D = (n1 / 2) * (N2 / n2)
4.4 Programming Effort (E): The unit of measurement of E is elementary mental discriminations.
E = V / L = D * V
4.5 Estimated Program Length: According to Halstead, the first hypothesis of software science is that the length of a well-structured program is a function only of the number of unique operators and operands.
N = N1 + N2
The estimated program length is denoted by N^:
N^ = n1 * log2(n1) + n2 * log2(n2)
The following alternate expressions have been published to estimate program length:
NJ = log2(n1!) + log2(n2!)
NB = n1 * log2(n2) + n2 * log2(n1)
NC = n1 * sqrt(n1) + n2 * sqrt(n2)
NS = (n * log2(n)) / 2
4.6 Potential Minimum Volume (V*): The potential minimum volume is defined as the volume of the shortest program in which a problem can be coded.
V* = (2 + n2*) * log2(2 + n2*)
Here, n2* is the count of unique input and output parameters.
4.7 Size of Vocabulary (n): The size of the vocabulary of a program, which consists of the number of unique tokens used to build the program, is defined as:
n = n1 + n2
where
n = vocabulary of a program
n1 = number of unique operators
n2 = number of unique operands
4.8 Language Level (λ): Indicates the level of the programming language used to implement the algorithm. The same algorithm demands additional effort if it is written in a low-level programming language; for example, it is easier to program in Pascal than in Assembler.
λ = L * V* = L^2 * V (equivalently V / D^2, since D = 1/L)
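A short worked example shows how these quantities fit together. Assume (purely for illustration) a small program with n1 = 10 unique operators, n2 = 7 unique operands, N1 = 20 total operator occurrences and N2 = 15 total operand occurrences. Then:

n = n1 + n2 = 17
N = N1 + N2 = 35
V = N * log2(n) = 35 * log2(17) ≈ 143 bits
N^ = 10 * log2(10) + 7 * log2(7) ≈ 33.2 + 19.7 ≈ 52.9
D = (n1 / 2) * (N2 / n2) = 5 * (15 / 7) ≈ 10.7
E = D * V ≈ 10.7 * 143 ≈ 1530 elementary mental discriminations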
Key takeaways:
1. Program Volume (V): the actual size of a program if a uniform binary encoding for the vocabulary is used; V = N * log2(n).
2. Program Level (L): ranges between zero and one, with L = 1 representing a program written at the highest possible level (i.e., with minimum size); L = V* / V.
3. Program Difficulty (D): the difficulty level or error-proneness of the program, proportional to the number of unique operators; D = (n1 / 2) * (N2 / n2).
4. Programming Effort (E): measured in elementary mental discriminations; E = V / L = D * V.
5. Estimated Program Length: N = N1 + N2, and the estimated length is N^ = n1 * log2(n1) + n2 * log2(n2); the alternate estimates NJ, NB, NC and NS have also been published.
6. Potential Minimum Volume (V*): the volume of the shortest program in which a problem can be coded; V* = (2 + n2*) * log2(2 + n2*), where n2* is the count of unique input and output parameters.
7. Size of Vocabulary (n): n = n1 + n2, where n1 is the number of unique operators and n2 is the number of unique operands.
1. The function point is a "unit of measurement" to express the amount of business functionality an information system (as a product) provides to a user. Function points are used to compute a functional size measurement (FSM) of software. The cost (in dollars or hours) of a single unit is calculated from past projects.
2. Function points were defined in 1979 in Measuring Application Development Productivity by Allan Albrecht at IBM.
3. The functional user requirements of the software are identified and each one is categorized into one of five types: outputs, inquiries, inputs, internal files, and external interfaces.
4. Once the function is identified and categorized into a type, it is then assessed for complexity and assigned a number of function points. Each of these functional user requirements maps to an end-user business function, such as a data entry for an Input or a user query for an Inquiry.
5. This distinction is important because it tends to make the functions measured in function points map easily into user-oriented requirements, but it also tends to hide internal functions (e.g. algorithms), which also require resources to implement.
6. There is currently no ISO recognized FSM Method that includes algorithmic complexity in the sizing result. Recently there have been different approaches proposed to deal with this perceived weakness, implemented in several commercial software products.
7. The variations of the Albrecht-based IFPUG method designed to make up for this (and other weaknesses) include:
7.1 Early and easy function points – Adjusts for problem and data complexity with two questions that yield a somewhat subjective complexity measurement; simplifies measurement by eliminating the need to count data elements.
7.2 Engineering function points – Elements (variable names) and operators (e.g., arithmetic, equality/inequality, Boolean) are counted. This variation highlights computational function. The intent is similar to that of the operator/operand-based Halstead complexity measures.
7.3 Bang measure – Defines a function metric based on twelve primitive (simple) counts that affect or show Bang, defined as "the measure of true function to be delivered as perceived by the user." Bang measure may be helpful in evaluating a software unit's value in terms of how much useful function it provides, although there is little evidence in the literature of such application. The use of Bang measure could apply when re-engineering (either complete or piecewise) is being considered, as discussed in Maintenance of Operational Systems—An Overview.
7.4 Feature points – Adds changes to improve applicability to systems with significant internal processing (e.g., operating systems, communications systems). This allows accounting for functions not readily perceivable by the user, but essential for proper operation.
7.5 Weighted Micro Function Points – One of the newer models (2009) which adjusts function points using weights derived from program flow complexity, operand and operator vocabulary, object usage, and algorithm.
7.6 Fuzzy Function Points – Proposes a fuzzy, gradual transition between the low and medium and the medium and high complexity classifications.
Key takeaways:
1. The function point is a "unit of measurement" to express the amount of business functionality an information system (as a product) provides to a user.
2. Function points were defined in 1979 in Measuring Application Development Productivity by Allan Albrecht at IBM.
3. This distinction is important because it tends to make the functions measured in function points map easily into user-oriented requirements, but it also tends to hide internal functions (e.g. algorithms), which also require resources to implement.
4. Early and easy function points – Adjusts for problem and data complexity with two questions that yield a somewhat subjective complexity measurement; simplifies measurement by eliminating the need to count data elements.
5. Fuzzy Function Points – Proposes a fuzzy, gradual transition between the low and medium and the medium and high complexity classifications.
1. Feature point is a superset of the function point measure that can be applied to systems and engineering software applications.
2. Feature points are used in applications in which the algorithmic complexity is high, such as real-time systems with timing constraints, embedded systems, etc.
3. Feature points are computed by counting the information domain values, and each parameter is weighted by a single weight.
4. Feature point includes another measurement parameter: ALGORITHMS.
The table for the computation of feature points is as follows:

Feature Point Calculations

Measurement Parameter                        Count    Weighting Factor
1. Number of external inputs (EI)              -         * 4
2. Number of external outputs (EO)             -         * 5
3. Number of external inquiries (EQ)           -         * 4
4. Number of internal files (ILF)              -         * 7
5. Number of external interfaces (EIF)         -         * 7
6. Algorithms used                             -         * 3
Count total →                                  -
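As a worked illustration of the table, assume hypothetical counts for a small real-time module (all counts here are invented for the example) and apply the weights above:

EI = 10 → 10 * 4 = 40
EO = 6 → 6 * 5 = 30
EQ = 4 → 4 * 4 = 16
ILF = 5 → 5 * 7 = 35
EIF = 3 → 3 * 7 = 21
Algorithms used = 8 → 8 * 3 = 24
Count total = 40 + 30 + 16 + 35 + 21 + 24 = 166

Because each parameter carries only a single weight, as noted above, this count total is taken as the raw feature point measure of the module.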
Key takeaways:
1. Feature point is a superset of the function point measure that can be applied to systems and engineering software applications.
2. Feature points are computed by counting the information domain values, and each parameter is weighted by a single weight.
1 McCabe Cyclomatic Complexity (CC) – This is probably the most popular complexity metric; it captures essential information about the stability and maintainability of a software system directly from its source code and gives insight into the complexity of a method.
2 Weighted Method per Class (WMC) – This metric indicates the complexity of a class. One way of calculating the complexity of a class is by using cyclomatic complexities of its methods. One should aim for a class with lower value of WMC as a higher value indicates that the class is more complex.
3 Depth of Inheritance Tree (DIT) – DIT measures the maximum path length from a node to the root of the inheritance tree. This metric indicates how far down a class is declared in the inheritance hierarchy.
4 Number of Children (NOC) – This metric indicates how many subclasses inherit the methods of the parent class. For example, if a class C2 has two direct subclasses, C21 and C22, its NOC is 2. The value of NOC indicates the level of reuse in an application: if NOC increases, reuse increases.
5 Coupling between Objects (CBO) – The rationale behind this metric is that an object is coupled to another object if the two objects act upon each other. If a class uses the methods of other classes, the two are coupled. An increase in CBO indicates an increase in the responsibilities of a class; hence, the CBO value for classes should be kept as low as possible.
6 Lack of Cohesion in Methods (LCOM) – LCOM can be used to measure the degree of cohesiveness present. It reflects how well a system is designed and how complex a class is. LCOM is calculated as the number of method pairs whose similarity is zero minus the count of method pairs whose similarity is not zero.
7 Method Hiding Factor (MHF) – MHF is defined as the ratio of sum of the invisibilities of all methods defined in all classes to the total number of methods defined in the system. The invisibility of a method is the percentage of the total classes from which this method is not visible.
8 Attribute Hiding Factor (AHF) – AHF is calculated as the ratio of the sum of the invisibilities of all attributes defined in all classes to the total number of attributes defined in the system.
9 Method Inheritance Factor (MIF) – MIF measures the ratio of the sum of the inherited methods in all classes of the system to the total number of available methods for all classes.
10 Attribute Inheritance Factor (AIF) – AIF measures the ratio of the sum of inherited attributes in all classes of the system under consideration to the total number of available attributes.
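The inheritance-related metrics above can be read directly off a small class hierarchy. The C++ sketch below is hypothetical and exists only to illustrate DIT, NOC and a simple WMC (computed here as the sum of the cyclomatic complexities of each class's methods); DIT counts the root class as depth 0, although some definitions count it as 1.

    // Hypothetical hierarchy used only to illustrate DIT, NOC and WMC.
    class Shape {                       // root of the tree: DIT = 0, NOC = 2 (Circle, Polygon)
    public:
        virtual ~Shape() = default;
        virtual double area() const = 0;
    };

    class Circle : public Shape {       // DIT = 1, NOC = 0
        double r;
    public:
        explicit Circle(double radius) : r(radius) {}             // CC = 1
        double area() const override { return 3.14159 * r * r; }  // CC = 1
    };                                  // WMC(Circle) = 1 + 1 = 2

    class Polygon : public Shape {      // DIT = 1, NOC = 1 (Square)
    public:
        double area() const override { return 0.0; }              // CC = 1
    };                                  // WMC(Polygon) = 1

    class Square : public Polygon {     // DIT = 2 (Square -> Polygon -> Shape), NOC = 0
        double side;
    public:
        explicit Square(double s) : side(s) {}                     // CC = 1
        double area() const override {                             // one decision: CC = 2
            if (side <= 0)
                return 0.0;
            return side * side;
        }
    };                                  // WMC(Square) = 1 + 2 = 3

    int main() {
        Square sq(2.0);
        Circle c(1.0);
        return sq.area() > c.area();    // trivial use so the sketch compiles and links
    }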
Key takeaways:
1. McCabe Cyclomatic Complexity (CC) – This is probably the most popular complexity metric; it captures essential information about the stability and maintainability of a software system directly from its source code.
2. Depth of Inheritance Tree (DIT) – DIT measures the maximum path length from a node to the root of the inheritance tree.
3. Coupling between Objects (CBO) – The rationale behind this metric is that an object is coupled to another object if the two objects act upon each other.
4. Method Hiding Factor (MHF) – MHF is defined as the ratio of sum of the invisibilities of all methods defined in all classes to the total number of methods defined in the system.
1. Software fault tolerance is the ability for software to detect and recover from a fault that is happening or has already happened in either the software or hardware in the system in which the software is running in order to provide service in accordance with the specification.
2. Software fault tolerance is a necessary component in order to construct the next generation of highly available and reliable computing systems from embedded systems to data warehouse systems.
3. Software fault tolerance is not a solution unto itself however, and it is important to realize that software fault tolerance is just one piece necessary to create the next generation of systems.
4. In order to adequately understand software fault tolerance it is important to understand the nature of the problem that software fault tolerance is supposed to solve. Software faults are all design faults.
5. Software manufacturing, the reproduction of software, is considered to be perfect. The fact that the problem stems solely from design faults makes software very different from almost any other system in which fault tolerance is a desired property.
6. This inherent issue, that software faults are the result of human error in interpreting a specification or correctly implementing an algorithm, creates issues which must be dealt with in the fundamental approach to software fault tolerance.
7. Fault tolerance is defined as how to provide, by redundancy, service complying with the specification in spite of faults having occurred or occurring. (Laprie 1996).
8. There are some important concepts buried within the text of this definition that should be examined.
9. Primarily, Laprie argues that fault tolerance is accomplished using redundancy. This argument holds for errors which are not caused by design faults; however, replicating a design fault in multiple places will not aid in complying with a specification.
10. It is also important to note the emphasis placed on the specification as the final arbiter of what is an error and what is not.
11. Design diversity increases pressure on the specification creators to make multiple variants of the same specification which are equivalent, in order to aid the programmer in creating variations in algorithms for the necessary redundancy.
12. The definition itself may no longer be appropriate for the type of problems that current fault tolerance is trying to solve, both hardware and software.
Key takeaways:
1. Software fault tolerance is the ability for software to detect and recover from a fault that is happening or has already happened in either the software or hardware in the system in which the software is running in order to provide service in accordance with the specification.
2. Software fault tolerance is not a solution unto itself however, and it is important to realize that software fault tolerance is just one piece necessary to create the next generation of systems.
3. There are some important concepts buried within the text of this definition that should be examined.
4. It is also important to note the emphasis placed on the specification as the final arbiter of what is an error and what is not.
1. Boehm proposed COCOMO (Constructive Cost Estimation Model) in 1981. COCOMO is one of the most widely used software estimation models in the world. COCOMO predicts the effort and schedule of a software product based on the size of the software.
2. The necessary steps in this model are:
2.1 Get an initial estimate of the development effort from the estimated size of the product in thousands of delivered lines of source code (KDLOC).
2.2 Determine a set of 15 multiplying factors from various attributes of the project.
2.3 Calculate the effort estimate by multiplying the initial estimate by all the multiplying factors, i.e., multiply the values obtained in steps 1 and 2.
3. The initial estimate (also called the nominal estimate) is determined by an equation of the form used in the static single-variable models, using KDLOC as the measure of size. To determine the initial effort Ei in person-months, the equation used is of the type shown below:
Ei = a * (KDLOC)^b
4. The values of the constants a and b depend on the project type.
In COCOMO, projects are categorized into three types:
4.1 Organic
4.2 Semidetached
4.3 Embedded
4.1 Organic: A development project can be considered to be of the organic type if the project deals with developing a well-understood application program, the size of the development team is reasonably small, and the team members are experienced in developing similar types of projects. Examples of this type of project are simple business systems, simple inventory management systems, and data processing systems.
4.2 Semidetached: A development project can be considered to be of the semidetached type if the development consists of a mixture of experienced and inexperienced staff. Team members may have limited experience with related systems and may be unfamiliar with some aspects of the system being developed. Examples of semidetached systems include a new operating system (OS), a Database Management System (DBMS), and a complex inventory management system.
4.3 Embedded: A development project is considered to be of the embedded type if the software being developed is strongly coupled to complex hardware, or if stringent constraints on the operational procedures exist. Examples: ATM, air traffic control.
5. For the three product categories, Boehm provides different sets of expressions to predict effort (in units of person-months) and development time from the size estimate in KLOC (kilo lines of code). The effort estimation takes into account the productivity loss due to holidays, weekly offs, coffee breaks, etc.
6. According to Boehm, software cost estimation should be done through three stages:
6.1 Basic Model
6.2 Intermediate Model
6.3 Detailed Model
6.1. Basic COCOMO Model: The basic COCOMO model gives an approximate estimate of the project parameters. The basic COCOMO estimation model is given by the following expressions:
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 Months
Where
KLOC is the estimated size of the software product expressed in Kilo Lines of Code,
a1,a2,b1,b2 are constants for each group of software products,
Tdev is the estimated time to develop the software, expressed in months,
Effort is the total effort required to develop the software product, expressed in person months (PMs).
6.2 Estimation of development effort
For the three classes of software products, the formulas for estimating the effort based on the code size are shown below:
Organic: Effort = 2.4 * (KLOC)^1.05 PM
Semi-detached: Effort = 3.0 * (KLOC)^1.12 PM
Embedded: Effort = 3.6 * (KLOC)^1.20 PM
6.3 Estimation of development time
For the three classes of software products, the formulas for estimating the development time based on the effort are given below:
Organic: Tdev = 2.5 * (Effort)^0.38 Months
Semi-detached: Tdev = 2.5 * (Effort)^0.35 Months
Embedded: Tdev = 2.5 * (Effort)^0.32 Months
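As a worked example (assuming a purely illustrative organic-mode project of 32 KLOC), the formulas above give:

Effort = 2.4 * (32)^1.05 ≈ 2.4 * 38.0 ≈ 91 PM
Tdev = 2.5 * (91)^0.38 ≈ 2.5 * 5.6 ≈ 14 Months
Average staff ≈ Effort / Tdev ≈ 91 / 14 ≈ 6.5 persons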
Some insight into the basic COCOMO model can be obtained by plotting the estimated characteristics for different software sizes. A plot of estimated effort versus product size shows that the effort is somewhat superlinear in the size of the software product; thus, the effort required to develop a product increases very rapidly with project size.
Key takeaways:
1. Get an initial estimate of the development effort from the estimated size of the product in thousands of delivered lines of source code (KDLOC).
2. The initial estimate (also called nominal estimate) is determined by an equation of the form used in the static single variable models, using KDLOC as the measure of the size.
3. A development project can be considered to be of the organic type if the project deals with developing a well-understood application program, the size of the development team is reasonably small, and the team members are experienced in developing similar types of projects.
4. For the three product categories, Boehm provides different sets of expressions to predict effort (in units of person-months) and development time from the size estimate in KLOC (kilo lines of code). The effort estimation takes into account the productivity loss due to holidays, weekly offs, coffee breaks, etc.
5. Basic COCOMO Model: The basic COCOMO model gives an approximate estimate of the project parameters. The basic COCOMO estimation model is given by the following expressions:
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 Months
Basic Model –
E = a * (KLOC)^b
Tdev = c * (Effort)^d
Persons required = Effort / Tdev
The above formulas are used for the cost estimation of the basic COCOMO model, and they are also used in the subsequent models. The constant values a, b, c and d of the Basic Model for the different categories of system are:

Software Projects    a      b      c      d
Organic              2.4    1.05   2.5    0.38
Semi-Detached        3.0    1.12   2.5    0.35
Embedded             3.6    1.20   2.5    0.32

The effort is measured in Person-Months and, as evident from the formula, depends on Kilo Lines of Code. The development time is measured in months. These formulas are used as such in the Basic Model calculations; since factors such as reliability and expertise are not taken into account, the estimate is rough.
Below is the C++ program for Basic COCOMO
    // C++ program to implement the basic COCOMO model
    #include <bits/stdc++.h>
    using namespace std;

    // Function for rounding off a float to an int
    int fround(float x)
    {
        int a;
        x = x + 0.5;
        a = x;
        return a;
    }

    // Function to calculate the parameters of basic COCOMO
    void calculate(float table[][4], int n, char mode[][15], int size)
    {
        float effort, time, staff;
        int model;

        // Check the mode according to size (in KLOC)
        if (size >= 2 && size <= 50)
            model = 0;        // organic
        else if (size > 50 && size <= 300)
            model = 1;        // semi-detached
        else if (size > 300)
            model = 2;        // embedded

        cout << "The mode is " << mode[model];

        // Calculate effort: E = a * (KLOC)^b
        effort = table[model][0] * pow(size, table[model][1]);

        // Calculate development time: Tdev = c * (Effort)^d
        time = table[model][2] * pow(effort, table[model][3]);

        // Calculate average staff required
        staff = effort / time;

        // Output the calculated values
        cout << "\nEffort = " << effort << " Person-Month";
        cout << "\nDevelopment Time = " << time << " Months";
        cout << "\nAverage Staff Required = " << fround(staff) << " Persons";
    }

    int main()
    {
        float table[3][4] = { { 2.4, 1.05, 2.5, 0.38 },
                              { 3.0, 1.12, 2.5, 0.35 },
                              { 3.6, 1.20, 2.5, 0.32 } };
        char mode[][15] = { "Organic", "Semi-Detached", "Embedded" };
        int size = 4;

        calculate(table, 3, mode, size);
        return 0;
    }

Output:
The mode is Organic
Effort = 10.289 Person-Month
Development Time = 6.06237 Months
Average Staff Required = 2 Persons
Key takeaways:
1. Persons required = Effort / Tdev
The above formula is used for the cost estimation of the basic COCOMO model, and it is also used in the subsequent models.
2. The effort is measured in Person-Months and, as evident from the formula, depends on Kilo Lines of Code.
3. The development time is measured in Months.
Detailed Model –
1. Detailed COCOMO incorporates all characteristics of the intermediate version with an assessment of the cost driver’s impact on each step of the software engineering process.
2. The detailed model uses different effort multipliers for each cost driver attribute. In detailed COCOMO, the whole software is divided into different modules; COCOMO is then applied to each module to estimate effort, and the module efforts are summed.
3. The six phases of detailed COCOMO are:
3.1 Planning and requirements
3.2 System design
3.3 Detailed design
3.4 Module code and test
3.5 Integration and test
3.6 Cost constructive model
4. In each phase, the effort is calculated as a function of program size, and a set of cost drivers is given according to each phase of the software life cycle.
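As a simplified, hypothetical illustration of the module-wise approach (ignoring, for brevity, the phase-wise cost-driver multipliers the detailed model would also apply), suppose a product is split into two organic-mode modules of 5 KLOC and 10 KLOC:

Effort(Module A) = 2.4 * (5)^1.05 ≈ 13 PM
Effort(Module B) = 2.4 * (10)^1.05 ≈ 27 PM
Total estimated effort ≈ 13 + 27 = 40 PM

In the full detailed model, each module estimate would further be scaled by the effort multipliers of the relevant cost drivers in each life-cycle phase.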
Key takeaways:
1. Detailed COCOMO incorporates all characteristics of the intermediate version with an assessment of the cost driver’s impact on each step of the software engineering process.
2. In detailed COCOMO, the whole software is divided into different modules; COCOMO is then applied to each module to estimate effort, and the module efforts are summed.
3. The effort is calculated as a function of program size, and a set of cost drivers is given according to each phase of the software life cycle.
COCOMO-II is the revised version of the original COCOMO (Constructive Cost Model) and was developed at the University of Southern California. It is a model that allows one to estimate the cost, effort and schedule when planning a new software development activity.
It consists of three sub-models:
1. End User Programming:
Application generators are used in this sub-model. End users write the code by using these application generators.
Example – Spreadsheets, report generator, etc.
2. Intermediate Sector:
(a) Application Generators and Composition Aids –
This category will create largely prepackaged capabilities for user programming. Their product will have many reusable components. Typical firms operating in this sector are Microsoft, Lotus,
Oracle, IBM, Borland, Novell.
(b) Application Composition Sector –
This category is too diversified to be handled by prepackaged solutions. It includes GUIs, databases, and domain-specific components such as financial, medical or industrial process control packages.
(c) System Integration –
This category deals with large scale and highly embedded systems.
3. Infrastructure Sector:
This category provides infrastructure for the software development like Operating System, Database Management System, User Interface Management System, Networking System, etc.
Stages of COCOMO II:
Stage-I:
It supports estimation of prototyping. For this it uses Application Composition Estimation Model. This model is used for the prototyping stage of application generator and system integration.
Stage-II:
It supports estimation in the early design stage of the project, when little is known about it. For this it uses the Early Design Estimation Model. This model is used in the early design stage of application generators, infrastructure and system integration.
Stage-III:
It supports estimation in the post architecture stage of a project. For this it uses Post Architecture Estimation Model. This model is used after the completion of the detailed architecture of application generator, infrastructure, system integration.
Key takeaways:
1. Application generators are used in the end-user programming sub-model. End users write the code by using these application generators.
2. The application generators and composition aids category will create largely prepackaged capabilities for user programming. Their products will have many reusable components. Typical firms operating in this sector are Microsoft, Lotus, Oracle, IBM, Borland and Novell.
3. Infrastructure Sector: This category provides infrastructure for software development, such as the Operating System, Database Management System, User Interface Management System, Networking System, etc.
Unit - 5
Metrics and cost estimation
1. The phrase “lines of code” (LOC) is a metric generally used to evaluate a software program or codebase according to its size.
2. It is a general identifier taken by adding up the number of lines of code used to write a program. LOC is used in various ways to assess a project, and there is a debate on how effective this measurement is.
3. Source lines of code (SLOC), also known as lines of code (LOC), is a software metric used to measure the size of a computer program by counting the number of lines in the text of the program's source code.
4. SLOC is typically used to predict the amount of effort that will be required to develop a program, as well as to estimate programming productivity or maintainability once the software is produced.
5.1.2 Measurement methods
1. Many useful comparisons involve only the order of magnitude of lines of code in a project. Using lines of code to compare a 10,000-line project to a 100,000-line project is far more useful than when comparing a 20,000-line project with a 21,000-line project.
2. While it is debatable exactly how to measure lines of code, discrepancies of an order of magnitude can be clear indicators of software complexity or man-hours.
3. There are two major types of SLOC measures: physical SLOC (LOC) and logical SLOC (LLOC). Specific definitions of these two measures vary, but the most common definition of physical SLOC is a count of lines in the text of the program's source code excluding comment lines.
4. Logical SLOC attempts to measure the number of executable "statements", but their specific definitions are tied to specific computer languages (one simple logical SLOC measure for C-like programming languages is the number of statement-terminating semicolons).
5. It is much easier to create tools that measure physical SLOC, and physical SLOC definitions are easier to explain. However, physical SLOC measures are sensitive to logically irrelevant formatting and style conventions, while logical SLOC is less sensitive to formatting and style conventions.
6. However, SLOC measures are often stated without giving their definition, and logical SLOC can often be significantly different from physical SLOC.
7. Consider this snippet of C code as an example of the ambiguity encountered when determining SLOC:
for (i = 0; i < 100; i++) printf("hello"); /* How many lines of code is this? */
8. In this example we have:
8.1 1 physical line of code (LOC),
8.2 2 logical lines of code (LLOC) (for statement and printf statement),
8.3 1 comment line.
Key takeaways:
1. The phrase “lines of code” (LOC) is a metric generally used to evaluate a software program or codebase according to its size.
2. SLOC is typically used to predict the amount of effort that will be required to develop a program, as well as to estimate programming productivity or maintainability once the software is produced.
3. While it is debatable exactly how to measure lines of code, discrepancies of an order of magnitude can be clear indicators of software complexity or man-hours.
4. Logical SLOC attempts to measure the number of executable "statements", but their specific definitions are tied to specific computer languages (one simple logical SLOC measure for C-like programming languages is the number of statement-terminating semicolons)
5.2.1 introduction
1. Cyclomatic complexity is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program's source code. It was developed by Thomas J. McCabe, Sr. in 1976.
2. Cyclomatic complexity is computed using the control flow graph of the program: the nodes of the graph correspond to indivisible groups of commands of a program, and a directed edge connects two nodes if the second command might be executed immediately after the first command.
3. Cyclomatic complexity may also be applied to individual functions, modules, methods or classes within a program.
4. One testing strategy, called basis path testing by McCabe who first proposed it, is to test each linearly independent path through the program; in this case, the number of test cases will equal the cyclomatic complexity of the program.
5.2.2 Definition and explanation
1. A control flow graph of a simple program. The program begins executing at the red node, then enters a loop (group of three nodes immediately below the red node). On exiting the loop, there is a conditional statement (group below the loop), and finally the program exits at the blue node. This graph has 9 edges, 8 nodes, and 1 connected component, so the cyclomatic complexity of the program is 9 - 8 + 2*1 = 3.
2. The cyclomatic complexity of a section of source code is the number of linearly independent paths within it—where "linearly independent" means that each path has at least one edge that is not in one of the other paths.
3. For instance, if the source code contained no control flow statements (conditionals or decision points), the complexity would be 1, since there would be only a single path through the code.
4. If the code had one single-condition IF statement, there would be two paths through the code: one where the IF statement evaluates to TRUE and another one where it evaluates to FALSE, so the complexity would be 2. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3.
5. Mathematically, the cyclomatic complexity of a structured program[a] is defined with reference to the control flow graph of the program, a directed graph containing the basic blocks of the program, with an edge between two basic blocks if control may pass from the first to the second. The complexity M is then defined as:
M = E − N + 2P,
where
E = the number of edges of the graph.
N = the number of nodes of the graph.
P = the number of connected components.
6. The same function as above, represented using the alternative formulation, where each exit point is connected back to the entry point. This graph has 10 edges, 8 nodes, and 1 connected component, which also results in a cyclomatic complexity of 3 using the alternative formulation (10 - 8 + 1 = 3).
7. An alternative formulation is to use a graph in which each exit point is connected back to the entry point. In this case, the graph is strongly connected, and the cyclomatic complexity of the program is equal to the cyclomatic number of its graph (also known as the first Betti number), which is defined as
8. M = E − N + P.
9. This may be seen as calculating the number of linearly independent cycles that exist in the graph, i.e. those cycles that do not contain other cycles within themselves. Note that because each exit point loops back to the entry point, there is at least one such cycle for each exit point.
10 For a single program (or subroutine or method), P is always equal to 1. So a simpler formula for a single subroutine is
M = E − N + 2.
10. Applications
10.1 Limiting complexity during development
10.2 Measuring the "structuredness" of a program
10.3 Implications for software testing
10.4 Correlation to number of defects
Key takeaways:
1. Cyclomatic complexity is a software metric used to indicate the complexity of a program
2. Cyclomatic complexity may also be applied to individual functions, modules, methods or classes within a program.
3. A control flow graph of a simple program. The program begins executing at the red node, then enters a loop (group of three nodes immediately below the red node). On exiting the loop, there is a conditional statement (group below the loop), and finally the program exits at the blue node.
4. If the code had one single-condition IF statement, there would be two paths through the code: one where the IF statement evaluates to TRUE and another one where it evaluates to FALSE, so the complexity would be 2. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3.
5. For a single program (or subroutine or method), P is always equal to 1. So a simpler formula for a single subroutine is
M = E − N + 2.
5.3 Halsted’s metric
1. Halstead complexity measures are software metrics introduced by Maurice Howard Halstead in 1977as part of his treatise on establishing an empirical science of software development.
2. Halstead made the observation that metrics of the software should reflect the implementation or expression of algorithms in different languages, but be independent of their execution on a specific platform. These metrics are therefore computed statically from the code.
3. Halstead's goal was to identify measurable properties of software, and the relations between them. This is similar to the identification of measurable properties of matter (like the volume, mass, and pressure of a gas) and the relationships between them (analogous to the gas equation). Thus his metrics are actually not just complexity metrics.
4. Halstead metrics are:
4.1 Program Volume (V) The unit of measurement of volume is the standard unit for size "bits." It is the actual size of a program if a uniform binary encoding for the vocabulary is used. V=N*log2n
4.2 Program Level (L) The value of L ranges between zero and one, with L=1 representing a program written at the highest possible level (i.e., with minimum size). L=V*/V 4.3 Program Difficulty
The difficulty level or error-proneness (D) of the program is proportional to the number of the unique operator in the program. D= (n1/2) * (N2/n2)
4.4 Programming Effort (E) The unit of measurement of E is elementary mental discriminations. E=V/L=D*V
4.5 Estimated Program Length
According to Halstead, The first Hypothesis of software science is that the length of a well-structured program is a function only of the number of unique operators and operands. N=N1+N2 And estimated program length is denoted by N^ N^ = n1log2n1 + n2log2n2 The following alternate expressions have been published to estimate program length:
NJ = log2 (n1!) + log2 (n2!) NB = n1 * log2n2 + n2 * log2n1 NC = n1 * sqrt(n1) + n2 * sqrt(n2) NS = (n * log2n) / 2
4.6 Potential Minimum Volume The potential minimum volume V* is defined as the volume of the most short program in which a problem can be coded. V* = (2 + n2*) * log2 (2 + n2*)
Here, n2* is the count of unique input and output parameters
4.7 Size of Vocabulary (n) The size of the vocabulary of a program, which consists of the number of unique tokens used to build a program, is defined as: n=n1+n2 where
n=vocabulary of a program n1=number of unique operators n2=number of unique operands
4.8 Language Level - Shows the algorithm implementation program language level. The same algorithm demands additional effort if it is written in a low-level program language. For example, it is easier to program in Pascal than in Assembler. L' = V / D / D lambda = L * V* = L2 * V
Key takeaways: 1. Halstead metrics are: Program Volume (V) The unit of measurement of volume is the standard unit for size "bits." It is the actual size of a program if a uniform binary encoding for the vocabulary is used. V=N*log2n
2. Program Level (L) The value of L ranges between zero and one, with L=1 representing a program written at the highest possible level (i.e., with minimum size). L=V*/V
3. Program Difficulty The difficulty level or error-proneness (D) of the program is proportional to the number of the unique operator in the program.
D= (n1/2) * (N2/n2) 4. Programming Effort (E) The unit of measurement of E is elementary mental discriminations. E=V/L=D*V
5. Estimated Program Length According to Halstead, The first Hypothesis of software science is that the length of a well-structured program is a function only of the number of unique operators and operands. N=N1+N2 And estimated program length is denoted by N^
N^ = n1log2n1 + n2log2n2
The following alternate expressions have been published to estimate program length:
NJ = log2 (n1!) + log2 (n2!) NB = n1 * log2n2 + n2 * log2n1 NC = n1 * sqrt(n1) + n2 * sqrt(n2) NS = (n * log2n) / 2
6. Potential Minimum Volume The potential minimum volume V* is defined as the volume of the most short program in which a problem can be coded.
V* = (2 + n2*) * log2 (2 + n2*)
Here, n2* is the count of unique input and output parameters Size of Vocabulary (n) The size of the vocabulary of a program, which consists of the number of unique tokens used to build a program, is defined as: n=n1+n2 where n=vocabulary of a program n1=number of unique operators n2=number of unique operands |
1. The function point is a "unit of measurement" to express the amount of business functionality an information system (as a product) provides to a user. Function points are used to compute a functional size measurement (FSM) of software. The cost (in dollars or hours) of a single unit is calculated from past projects
2. Function points were defined in 1979 in Measuring Application Development Productivity by Allan Albrecht at IBM.
3. The functional user requirements of the software are identified and each one is categorized into one of five types: outputs, inquiries, inputs, internal files, and external interfaces.
4. Once the function is identified and categorized into a type, it is then assessed for complexity and assigned a number of function points. Each of these functional user requirements maps to an end-user business function, such as a data entry for an Input or a user query for an Inquiry.
5. This distinction is important because it tends to make the functions measured in function points map easily into user-oriented requirements, but it also tends to hide internal functions (e.g. algorithms), which also require resources to implement.
6. There is currently no ISO recognized FSM Method that includes algorithmic complexity in the sizing result. Recently there have been different approaches proposed to deal with this perceived weakness, implemented in several commercial software products.
7. The variations of the Albrecht-based IFPUG method designed to make up for this (and other weaknesses) include:
7.1 Early and easy function points – Adjusts for problem and data complexity with two questions that yield a somewhat subjective complexity measurement; simplifies measurement by eliminating the need to count data elements.
7.2 Engineering function points – Elements (variable names) and operators (e.g., arithmetic, equality/inequality, Boolean) are counted. This variation highlights computational function. The intent is similar to that of the operator/operand-based Halstead complexity measures.
7.3 Bang measure – Defines a function metric based on twelve primitive (simple) counts that affect or show Bang, defined as "the measure of true function to be delivered as perceived by the user." Bang measure may be helpful in evaluating a software unit's value in terms of how much useful function it provides, although there is little evidence in the literature of such application. The use of Bang measure could apply when re-engineering (either complete or piecewise) is being considered, as discussed in Maintenance of Operational Systems—An Overview.
7.4 Feature points – Adds changes to improve applicability to systems with significant internal processing (e.g., operating systems, communications systems). This allows accounting for functions not readily perceivable by the user, but essential for proper operation.
7.5 Weighted Micro Function Points – One of the newer models (2009) which adjusts function points using weights derived from program flow complexity, operand and operator vocabulary, object usage, and algorithm.
7.6 Fuzzy Function Points - Proposes a fuzzy and gradative transition between low x medium and medium x high complexities
Key takeaways:
1. The function point is a "unit of measurement" to express the amount of business functionality an information system (as a product) provides to a user.
2. Function points were defined in 1979 in Measuring Application Development Productivity by Allan Albrecht at IBM.
3. This distinction is important because it tends to make the functions measured in function points map easily into user-oriented requirements, but it also tends to hide internal functions (e.g. algorithms), which also require resources to implement.
4. Early and easy function points – Adjusts for problem and data complexity with two questions that yield a somewhat subjective complexity measurement; simplifies measurement by eliminating the need to count data elements.
5. Fuzzy Function Points - Proposes a fuzzy and gradative transition between low x medium and medium x high complexities
1. Feature point is the superset of function point measure that can be applied to systems and engineering software applications.
2. The feature points are used in those applications in which the algorithmic complexity is high like real-time systems where time constraints are there, embedded systems, etc.
3. Feature points are computed by counting the information domain values and are weighed by only single weight.
4. Feature point includes another measurement parameter-ALGORITHM.
The table for the computation of feature point is as follows:
Feature Point Calculations
Measurement Parameter Count Weighing factor 1. Number of external inputs (EI) - * 4 - 2. Number of external outputs (EO) - * 5 - 3. Number of external inquiries (EQ) - * 4 - 4. Number of internal files (ILF) - * 7 - 5. Number of external interfaces (EIF) - * 7 - 6.Algorithms used Count total → - * 3 - |
Key takeaways:
1. Feature point is the superset of function point measure that can be applied to systems and engineering software applications.
2. Feature points are computed by counting the information domain values and are weighed by only single weight.
1 McCabe Cyclomatic Complexity (CC) – This complexity metric is probably the most popular one and calculates essential information about constancy and maintainability of software system from source code. It gives insight into the complexity of a method
2 Weighted Method per Class (WMC) – This metric indicates the complexity of a class. One way of calculating the complexity of a class is by using cyclomatic complexities of its methods. One should aim for a class with lower value of WMC as a higher value indicates that the class is more complex.
3 Depth of Inheritance Tree (DIT) – DIT measures the maximum path from node to the root of tree. This metric indicates how far down a class is declared in the inheritance hierarchy. The following figure shows the DIT value for a simple class hierarchy.
4 Number of Children (NOC) – This metric indicates how many sub-classes are going to inherit the methods of the parent class. As shown in above figure, class C2 has 2 children, subclasses C21, C22. The value of NOC indicates the level of reuse in an application. If NOC increases it means reuse increases.
5 Coupling between Objects (CBO) – The rationale behind this metric is that an object is coupled to another object if two object acts upon each other. If a class uses the methods of other classes, then they both are coupled. An increase in CBO indicates an increase in responsibilities of a class. Hence, the CBO value for classes should be kept as low as possible.
6 Lack of Cohesion in Methods (LCOM) – LCOM can be used to measure the degree of cohesiveness present. It reflects on how well a system is designed and how complex a class is. LCOM is calculated by ascertaining the number of method pairs whose similarity is zero, minus the count of method pairs whose similarity is not zero.
7 Method Hiding Factor (MHF) – MHF is defined as the ratio of sum of the invisibilities of all methods defined in all classes to the total number of methods defined in the system. The invisibility of a method is the percentage of the total classes from which this method is not visible.
8 Attribute Hiding Factor (AHF) – AHF is calculated as the ratio of the sum of the invisibilities of all attributes defined in all classes to the total number of attributes defined in the system.
9 Method Inheritance Factor (MIF) – MIF measures the ratio of the sum of the inherited methods in all classes of the system to the total number of available method for all classes.
10 Attribute Inheritance Factor (AIF) – AIF measures the ratio of sum of inherited attributes in all classes of the system under consideration to the total number of available attributes.
Key takeaways:
1. McCabe Cyclomatic Complexity (CC) – This complexity metric is probably the most popular one and calculates essential information about constancy and maintainability of software system from source code.
2. Depth of Inheritance Tree (DIT) – DIT measures the maximum path from node to the root of tree.
3. Coupling between Objects (CBO) – The rationale behind this metric is that an object is coupled to another object if two object acts upon each other.
4. Method Hiding Factor (MHF) – MHF is defined as the ratio of sum of the invisibilities of all methods defined in all classes to the total number of methods defined in the system.
1. Software fault tolerance is the ability of software to detect and recover from a fault that is happening or has already happened, in either the software or the hardware of the system in which the software is running, in order to provide service in accordance with the specification.
2. Software fault tolerance is a necessary component in order to construct the next generation of highly available and reliable computing systems from embedded systems to data warehouse systems.
3. Software fault tolerance is not a solution unto itself, however, and it is important to realize that it is just one piece necessary to create the next generation of systems.
4. In order to adequately understand software fault tolerance it is important to understand the nature of the problem that software fault tolerance is supposed to solve. Software faults are all design faults.
5. Software manufacturing, the reproduction of software, is considered to be perfect. The fact that the problem arises solely from design faults makes software very different from almost any other system in which fault tolerance is a desired property.
6. This inherent issue, that software faults are the result of human error in interpreting a specification or correctly implementing an algorithm, creates issues which must be dealt with in the fundamental approach to software fault tolerance.
7. Fault tolerance is defined as how to provide, by redundancy, service complying with the specification in spite of faults having occurred or occurring (Laprie 1996).
8. There are some important concepts buried within the text of this definition that should be examined.
9. Primarily, Laprie argues that fault tolerance is accomplished using redundancy. This argument holds for errors which are not caused by design faults; however, replicating a design fault in multiple places will not aid in complying with a specification (a simplified redundancy sketch follows this list).
10. It is also important to note the emphasis placed on the specification as the final arbiter of what is an error and what is not.
11. Design diversity increases pressure on the specification creators to produce multiple variants of the same specification which are equivalent, in order to aid the programmer in creating the variations in algorithms needed for the required redundancy.
12. The definition itself may no longer be appropriate for the type of problems that current fault tolerance, both hardware and software, is trying to solve.
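To make the role of redundancy and design diversity concrete, the following minimal C++ sketch (a hypothetical example, not taken from Laprie's text) runs three independently written variants of the same computation and accepts a result only when a majority of the variants agree; in a real system the variants would be produced by separate teams working from equivalent specifications.

// Minimal N-version-style sketch: three diverse variants plus a majority vote.
#include <cmath>
#include <iostream>
using namespace std;

// Three hypothetical, independently developed variants of "square of x"
int variantA(int x) { return x * x; }
int variantB(int x) { int s = 0; for (int i = 0; i < x; i++) s += x; return s; }
int variantC(int x) { return (int)lround(pow(x, 2)); }

// Adjudicator: accept a value returned by at least two of the three variants
bool majorityVote(int a, int b, int c, int& result)
{
    if (a == b || a == c) { result = a; return true; }
    if (b == c)           { result = b; return true; }
    return false;   // no majority: report failure so the caller can recover
}

int main()
{
    int x = 7, result;
    if (majorityVote(variantA(x), variantB(x), variantC(x), result))
        cout << "Accepted result: " << result << endl;   // prints 49
    else
        cout << "No majority - raise an error or switch to a fallback" << endl;
    return 0;
}

Note that if the same design fault were replicated in two of the three variants, the vote would accept the wrong answer, which is exactly the limitation pointed out in point 9.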
Key takeaways:
1. Software fault tolerance is the ability of software to detect and recover from a fault that is happening or has already happened, in either the software or the hardware of the system in which the software is running, in order to provide service in accordance with the specification.
2. Software fault tolerance is not a solution unto itself, however, and it is important to realize that it is just one piece necessary to create the next generation of systems.
3. There are some important concepts buried within the text of this definition that should be examined.
4. It is also important to note the emphasis placed on the specification as the final arbiter of what is an error and what is not.
1. Boehm proposed COCOMO (Constructive Cost Model) in 1981. COCOMO is one of the most widely used software estimation models in the world. It predicts the effort and schedule of a software product based on the size of the software.
2. The necessary steps in this model are:
2.1 Get an initial estimate of the development effort from the estimated size of the product in thousands of delivered lines of source code (KDLOC).
2.2 Determine a set of 15 multiplying factors from various attributes of the project.
2.3 Calculate the effort estimate by multiplying the initial estimate by all the multiplying factors, i.e., multiply the value obtained in step 1 by the factors determined in step 2.
3. The initial estimate (also called the nominal estimate) is determined by an equation of the form used in the static single-variable models, using KDLOC as the measure of size. To determine the initial effort Ei in person-months, the equation used is of the type shown below:
Ei = a * (KDLOC)^b
4. The values of the constants a and b depend on the project type.
In COCOMO, projects are categorized into three types:
4.1 Organic
4.2 Semidetached
4.3 Embedded
4.1 Organic: A development project can be considered to be of the organic type if it deals with developing a well-understood application program, the size of the development team is reasonably small, and the team members are experienced in developing similar types of projects. Examples of this type of project are simple business systems, simple inventory management systems, and data processing systems.
4.2 Semidetached: A development project can be considered to be of the semidetached type if the development team consists of a mixture of experienced and inexperienced staff. Team members may have limited experience with related systems and may be unfamiliar with some aspects of the system being developed. Examples of semidetached systems include a new operating system (OS), a database management system (DBMS), and a complex inventory management system.
4.3 Embedded: A development project is considered to be of the embedded type if the software being developed is strongly coupled to complex hardware, or if stringent constraints on the operational procedures exist. Examples: ATM software, air traffic control software.
5. For the three product categories, Boehm provides different sets of expressions to predict effort (in units of person-months) and development time from the size estimate in KLOC (kilo lines of code). The effort estimate takes into account the productivity loss due to holidays, weekly offs, coffee breaks, etc.
6. According to Boehm, software cost estimation should be done through three stages:
6.1 Basic Model
6.2 Intermediate Model
6.3 Detailed Model
6.1 Basic COCOMO Model: The basic COCOMO model gives an approximate estimate of the project parameters. The following expressions give the basic COCOMO estimation model:
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 Months
Where
KLOC is the estimated size of the software product expressed in kilo lines of code,
a1,a2,b1,b2 are constants for each group of software products,
Tdev is the estimated time to develop the software, expressed in months,
Effort is the total effort required to develop the software product, expressed in person months (PMs).
6.2 Estimation of development effort
For the three classes of software products, the formulas for estimating the effort based on the code size are shown below:
Organic: Effort = 2.4 * (KLOC)^1.05 PM
Semi-detached: Effort = 3.0 * (KLOC)^1.12 PM
Embedded: Effort = 3.6 * (KLOC)^1.20 PM
6.3 Estimation of development time
For the three classes of software products, the formulas for estimating the development time based on the effort are given below:
Organic: Tdev = 2.5 * (Effort)^0.38 Months
Semi-detached: Tdev = 2.5 * (Effort)^0.35 Months
Embedded: Tdev = 2.5 * (Effort)^0.32 Months
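As an illustration (the 32 KLOC figure is an assumed value, not taken from the text), consider an organic product estimated at 32 KLOC: Effort = 2.4 * (32)^1.05 ≈ 91 PM, Tdev = 2.5 * (91)^0.38 ≈ 14 months, and the average staffing level is therefore roughly 91 / 14 ≈ 6.5 persons.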
Some insight into the basic COCOMO model can be obtained by plotting the estimated characteristics for different product sizes. A plot of estimated effort versus product size shows that the effort is somewhat superlinear in the size of the software product; thus, the effort required to develop a product increases very rapidly with project size.
Key takeaways:
1. Get an initial estimate of the development effort from the estimated size of the product in thousands of delivered lines of source code (KDLOC).
2. The initial estimate (also called nominal estimate) is determined by an equation of the form used in the static single variable models, using KDLOC as the measure of the size.
3. A development project can be considered to be of the organic type if it deals with developing a well-understood application program, the size of the development team is reasonably small, and the team members are experienced in developing similar types of projects.
4. For the three product categories, Boehm provides different sets of expressions to predict effort (in units of person-months) and development time from the size estimate in KLOC (kilo lines of code). The effort estimate takes into account the productivity loss due to holidays, weekly offs, coffee breaks, etc.
5. Basic COCOMO Model: The basic COCOMO model gives an approximate estimate of the project parameters. The following expressions give the basic COCOMO estimation model:
Effort = a1 * (KLOC)^a2 PM
Tdev = b1 * (Effort)^b2 Months
Basic Model –
E = a * (KLOC)^b
Time = c * (Effort)^d
Persons required = Effort / Time
The above formulas are used for cost estimation in the basic COCOMO model, and they are also used in the subsequent models. The constant values a, b, c and d of the basic model for the different categories of system are:

Software Projects    a      b      c      d
Organic              2.4    1.05   2.5    0.38
Semi-detached        3.0    1.12   2.5    0.35
Embedded             3.6    1.20   2.5    0.32

The effort is measured in person-months and, as evident from the formula, depends on the kilo lines of code. The development time is measured in months. These formulas are used as such in the basic model calculations; because factors such as reliability and expertise are not taken into account, the estimate is rough.
Below is the C++ program for Basic COCOMO
// C++ program to implement the basic COCOMO model
#include <bits/stdc++.h>
using namespace std;

// Round off a float to the nearest integer
int fround(float x) { return (int)(x + 0.5); }

// Calculate the parameters of basic COCOMO for a given size (in KLOC)
void calculate(float table[][4], int n, char mode[][15], int size)
{
    float effort, time, staff;
    int model = 0;

    // Select the mode according to the size
    if (size >= 2 && size <= 50) model = 0;        // organic
    else if (size > 50 && size <= 300) model = 1;  // semi-detached
    else if (size > 300) model = 2;                // embedded
    cout << "The mode is " << mode[model];

    effort = table[model][0] * pow(size, table[model][1]);   // Effort = a*(KLOC)^b
    time   = table[model][2] * pow(effort, table[model][3]); // Tdev = c*(Effort)^d
    staff  = effort / time;                                   // average staff

    // Output the calculated values
    cout << "\nEffort = " << effort << " Person-Month";
    cout << "\nDevelopment Time = " << time << " Months";
    cout << "\nAverage Staff Required = " << fround(staff) << " Persons";
}

int main()
{
    // Constants a, b, c, d for organic, semi-detached and embedded projects
    float table[3][4] = {2.4, 1.05, 2.5, 0.38,
                         3.0, 1.12, 2.5, 0.35,
                         3.6, 1.20, 2.5, 0.32};
    char mode[][15] = {"Organic", "Semi-Detached", "Embedded"};
    int size = 4;  // estimated size in KLOC
    calculate(table, 3, mode, size);
    return 0;
}

Output:
The mode is Organic
Effort = 10.289 Person-Month
Development Time = 6.06237 Months
Average Staff Required = 2 Persons
Key takeaways:
1. Persons required = Effort / Time
The above formulas are used for cost estimation in the basic COCOMO model, and they are also used in the subsequent models.
2. The effort is measured in Person-Months and as evident from the formula is dependent on Kilo-Lines of code.
3. The development time is measured in Months.
Detailed Model –
1. Detailed COCOMO incorporates all characteristics of the intermediate version with an assessment of the cost driver’s impact on each step of the software engineering process.
2. The detailed model uses different effort multipliers for each cost driver attribute. In detailed COCOMO, the whole software is divided into different modules, COCOMO is applied to each module to estimate its effort, and the module efforts are then summed (a small illustrative sketch follows point 4 below).
3. The six phases of detailed COCOMO are:
3.1 Planning and requirements
3.2 System design
3.3 Detailed design
3.4 Module code and test
3.5 Integration and test
3.6 Cost constructive model
4. The effort is calculated as a function of program size, and a set of cost drivers is given for each phase of the software lifecycle.
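As a very small sketch of the module-wise idea in point 2, the following C++ fragment (illustrative only: it reuses the basic organic constants in place of the full set of detailed-COCOMO phase-wise cost drivers, and the module sizes are assumed values) estimates the effort for each module separately and sums the results.

// Illustrative only: per-module effort estimation summed over all modules.
// Basic organic constants stand in for the detailed-COCOMO cost drivers.
#include <cmath>
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<double> moduleKLOC = {6.0, 10.0, 4.5};   // assumed module sizes
    double totalEffort = 0.0;
    for (double k : moduleKLOC)
        totalEffort += 2.4 * pow(k, 1.05);          // Effort = a * (KLOC)^b per module
    cout << "Total estimated effort = " << totalEffort << " PM" << endl;
    return 0;
}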
Key takeaways:
1. Detailed COCOMO incorporates all characteristics of the intermediate version with an assessment of the cost driver’s impact on each step of the software engineering process.
2. In detailed COCOMO, the whole software is divided into different modules; COCOMO is applied to each module to estimate its effort, and the module efforts are then summed.
3. The effort is calculated as a function of program size, and a set of cost drivers is given for each phase of the software lifecycle.
COCOMO-II is the revised version of the original COCOMO (Constructive Cost Model) and was developed at the University of Southern California. It is a model that allows one to estimate the cost, effort and schedule when planning a new software development activity.
It consists of three sub-models:
1. End User Programming:
Application generators are used in this sub-model; end users write code using these application generators.
Examples – spreadsheets, report generators, etc.
2. Intermediate Sector:
(a) Application Generators and Composition Aids –
This category creates largely prepackaged capabilities for user programming. Its products have many reusable components. Typical firms operating in this sector are Microsoft, Lotus, Oracle, IBM, Borland and Novell.
(b) Application Composition Sector –
This category is too diversified to be handled by prepackaged solutions. It includes GUIs, databases, and domain-specific components such as financial, medical or industrial process control packages.
(c) System Integration –
This category deals with large scale and highly embedded systems.
3. Infrastructure Sector:
This category provides the infrastructure for software development, such as the Operating System, Database Management System, User Interface Management System, Networking System, etc.
Stages of COCOMO II:
Stage-I:
It supports estimation for prototyping. For this it uses the Application Composition Estimation Model, which is used for the prototyping stage of application generators and system integration.
Stage-II:
It supports estimation in the early design stage of the project, when little is known about it. For this it uses the Early Design Estimation Model, which is used in the early design stage of application generators, infrastructure and system integration.
Stage-III:
It supports estimation in the post-architecture stage of a project. For this it uses the Post-Architecture Estimation Model, which is used after the completion of the detailed architecture of application generators, infrastructure and system integration.
Key takeaways:
1. In the end-user programming sub-model, application generators are used; end users write code using these application generators.
2. The application generators and composition aids category creates largely prepackaged capabilities for user programming. Its products have many reusable components. Typical firms operating in this sector are Microsoft, Lotus, Oracle, IBM, Borland and Novell.
3. Infrastructure Sector: This category provides the infrastructure for software development, such as the Operating System, Database Management System, User Interface Management System, Networking System, etc.