
Software Engineering Metrics

Prabhaker Mateti

Does programming in the Small differ qualitatively from programming in the Large (Giga, etc.), or is it just a matter of additive scale?

Questions with No Answers?

What is the largest program/software ever written? (No matter how long it took.)

  1. by one person?
  2. by a team? what is the size?

Do you know of a program > 1000 lines that is bug-free?

Sizes of Some "Programs"

(S)LOC: (source) lines of code, a count of source code lines that ignores blank lines and comments. Debian/Ubuntu tools that compute it: bogosec, cccc, SLOCCount. The numbers below are in MLOC (millions of lines of code).

Year  Product               MLOC
1993  Windows NT 3.1        6
1994  Windows NT 3.5        10
1996  Windows NT 4.0        16
2000  Windows 2000          29
2002  Windows XP            40
2003  Windows Server 2003   50
2008  Windows 7             40
200x  Gimp-2.3.8            0.65
200x  SABRE                 200
200x  SAP                   160
200x  IRS                   62

200x  Red Hat Linux 6.2     17
200x  Red Hat Linux 7.1     30
200x  Debian 2.2            56
200x  Debian 3.0            104
200x  Debian 3.1            213
200x  Linux kernel 2.6.0    6
2010  Linux kernel 3.x.x    13.5
2011  Linux kernel 3.6.x    15.9
2013  Linux kernel 3.9.x    11 *
200x  Sun Solaris 7         0.5

[No authoritative references can be given for the figures above; most are taken from wikipedia.org Source_lines_of_code, in millions of LOC. A question to ask of each figure: what is included in the count?]
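To make the question "what is included in the count?" concrete, here is a minimal sketch of a SLOC counter for C-like source. It is not how SLOCCount or any other tool named above works; the function name count_sloc and the sample text are illustrative assumptions, and comment markers inside string literals are deliberately not handled.

```python
def count_sloc(source: str) -> int:
    """Count lines that contain code, skipping blank and comment-only lines.

    Assumes C-style comments (/* ... */ and //). Simplification: comment
    markers that appear inside string literals are treated as comments.
    """
    sloc = 0
    in_block = False              # True while inside a /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        code = ""                 # the non-comment part of this line
        i = 0
        while i < len(line):
            if in_block:
                end = line.find("*/", i)
                if end == -1:     # block comment continues on the next line
                    i = len(line)
                else:
                    in_block = False
                    i = end + 2
            elif line.startswith("/*", i):
                in_block = True
                i += 2
            elif line.startswith("//", i):
                break             # rest of the line is a comment
            else:
                code += line[i]
                i += 1
        if code.strip():
            sloc += 1
    return sloc


example = """\
/* add two ints */
int add(int a, int b)
{
    // trivial
    return a + b;   /* sum */
}

"""
print(count_sloc(example))
```

Even this small sketch has to take positions on mixed code/comment lines, brace-only lines, and blank lines, which is one reason published SLOC figures for the same system can differ.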

sloccount /usr/local/src/linux-3.10.2 # Aug 2013

SLOC    Directory       SLOC-by-Language (Sorted)
6453062 drivers         ansic=6448301,yacc=1688,asm=1476,perl=792,lex=779,
                        sh=26
2019776 arch            ansic=1750979,asm=267172,sh=810,awk=476,pascal=231,
                        python=45,perl=33,sed=30
770378  fs              ansic=770378
580043  sound           ansic=579860,asm=183
556120  net             ansic=556024,awk=96
354424  include         ansic=351241,cpp=3141,asm=42
135562  kernel          ansic=135553,asm=9
88104   tools           ansic=80499,perl=3775,python=1810,sh=1317,yacc=432,
                        lex=257,asm=14
61376   crypto          ansic=61376
60419   mm              ansic=60419
56568   Documentation   xml=46640,ansic=5117,perl=2453,sh=915,python=907,
                        lisp=218,asm=189,awk=129
46430   security        ansic=46430
42702   scripts         ansic=26260,perl=9229,sh=2499,cpp=1821,yacc=1440,
                        lex=1006,python=447
35769   lib             ansic=35636,perl=120,awk=13
18449   block           ansic=18449
6195    ipc             ansic=6195
5407    virt            ansic=5407
2507    init            ansic=2507
1991    samples         ansic=1991
1876    firmware        asm=1660,ansic=216
567     usr             ansic=553,asm=14
0       top_dir         (none)
Totals grouped by language
ansi-c:     10943391 (96.86%)
asm:         270759 (2.40%)
xml:          46640 (0.41%)
perl:         16402 (0.15%)
sh:            5567 (0.05%)
cpp:           4962 (0.04%)
yacc:          3560 (0.03%)
python:        3209 (0.03%)
lex:           2042 (0.02%)
awk:            714 (0.01%)
pascal:         231 (0.00%)
lisp:           218 (0.00%)
sed:             30 (0.00%)

sloccount /usr/local/src/linux-4.19, 2018-Oct-22

  1. Before kernel build, du -sh linux-4.19 is 908M
  2. After kernel build, du -sh linux-4.19 is TBD.
  3. Source Lines of Code (SLOC) Totals grouped by language
          ansic:     16756046 (97.89%)
          asm:         271828 (1.59%)
          sh:           29564 (0.17%)
          perl:         27344 (0.16%)
          python:       17875 (0.10%)
          cpp:           5063 (0.03%)
          yacc:          4648 (0.03%)
          lex:           2583 (0.02%)
          awk:           1385 (0.01%)
          ruby:            25 (0.00%)
          sed:              5 (0.00%)
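The two sloccount summaries above can be compared with a little arithmetic. The sketch below copies the per-language totals from the two listings and computes overall totals, language shares, and the growth factor between the two kernels. The dictionary and function names are illustrative assumptions, not part of sloccount.

```python
# Per-language SLOC totals copied from the two sloccount runs above.
linux_3_10_2 = {
    "ansic": 10943391, "asm": 270759, "xml": 46640, "perl": 16402,
    "sh": 5567, "cpp": 4962, "yacc": 3560, "python": 3209,
    "lex": 2042, "awk": 714, "pascal": 231, "lisp": 218, "sed": 30,
}
linux_4_19 = {
    "ansic": 16756046, "asm": 271828, "sh": 29564, "perl": 27344,
    "python": 17875, "cpp": 5063, "yacc": 4648, "lex": 2583,
    "awk": 1385, "ruby": 25, "sed": 5,
}


def total(counts: dict) -> int:
    """Total SLOC over all languages."""
    return sum(counts.values())


def share(counts: dict, lang: str) -> float:
    """Percentage of the total SLOC written in the given language."""
    return 100.0 * counts.get(lang, 0) / total(counts)


growth = total(linux_4_19) / total(linux_3_10_2)
print(f"linux-3.10.2: {total(linux_3_10_2) / 1e6:.1f} MLOC, "
      f"C share {share(linux_3_10_2, 'ansic'):.2f}%")
print(f"linux-4.19:   {total(linux_4_19) / 1e6:.1f} MLOC, "
      f"C share {share(linux_4_19, 'ansic'):.2f}%")
print(f"growth factor: {growth:.2f}")
```

On these figures the totals come to about 11.3 MLOC for 3.10.2 and 17.1 MLOC for 4.19, a growth factor of roughly 1.5 in five years, with C staying near 97 to 98 percent of the code.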
    

Programming in the Small .. Giga

  1. Can we define: program, software? Small, Medium, Large, ..., Giga?
  2. Our working/arbitrary definitions:
    Tiny: up to 1 KLOC;
    Small: up to 10 KLOC;
    Medium: up to 1 MLOC;
    Large: up to 10 MLOC;
    Giga: 10+ MLOC
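The working definitions above amount to a threshold lookup. A minimal sketch follows; the function name size_class is hypothetical, and the boundary handling is an assumption (each upper bound is treated as exclusive, so exactly 10 MLOC counts as Giga, matching "10+").

```python
from bisect import bisect_right

# Upper bounds in lines of code for Tiny, Small, Medium, Large.
# Assumption: bounds are exclusive, so 10 MLOC itself is already "Giga".
_BOUNDS = [1_000, 10_000, 1_000_000, 10_000_000]
_LABELS = ["Tiny", "Small", "Medium", "Large", "Giga"]


def size_class(sloc: int) -> str:
    """Map a SLOC count to one of the working size classes."""
    if sloc < 0:
        raise ValueError("SLOC cannot be negative")
    return _LABELS[bisect_right(_BOUNDS, sloc)]


# Sizes taken from the table and sloccount output earlier in these notes.
for name, sloc in [("Gimp-2.3.8", 650_000),
                   ("Windows NT 3.1", 6_000_000),
                   ("linux-4.19", 17_116_366)]:
    print(f"{name}: {size_class(sloc)}")
```

With these thresholds Gimp-2.3.8 falls in Medium, Windows NT 3.1 in Large, and the 4.19 kernel in Giga.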

The Field of Software Metrics

There are entire books on this topic; in this course, we discuss it only in passing. Main reason for interest: cost estimation. Main reason for disinterest: polemics.

  1. We do not have any metrics that cannot be sabotaged.
  2. Other than SLOC: number of functions/procedures/methods; man-years.
  3. Complexity: McCabe cyclomatic number (wikipedia.org Cyclomatic_complexity); Halstead software science; see a book on metrics.
  4. Efferent coupling (wikipedia.org Efferent_Coupling): "It measures the number of data types a class knows about. This includes inheritance, interface implementation, parameter types, variable types, and exceptions. A large efferent coupling can indicate that a class is unfocused. It may also indicate brittleness, since it depends on the stability of all the types to which it is coupled."
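As an illustration of the McCabe number mentioned in the list above: for a control-flow graph with E edges, N nodes, and P connected components, M = E - N + 2P, and for a single structured routine this equals the number of decision points plus one. The sketch below approximates that count for Python source using the standard ast module. The function name and the choice of which constructs count as decisions are assumptions made for this example; production tools apply more detailed rules and report per function rather than per file.

```python
import ast

# Constructs treated as one decision point each (an assumption of this sketch).
_DECISIONS = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe number: decision points + 1, over the whole source."""
    count = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISIONS):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra branches
            count += len(node.values) - 1
    return count


sample = '''
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
'''
print(cyclomatic_complexity(sample))
```

The sample has two if statements, one for loop, and one "and", so the sketch reports 5. Rewriting the same logic with lookup tables or early exits changes the number without changing behavior, which illustrates point 1: such metrics can be gamed.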

References

  1. Google Talk on how they build their software: (i) Video, (ii) PDF. Recommended Watching/Reading.
  2. Building Windows 8. Recommended Reading.
  3. 2013 ICSE conference keynote: "Does Scale Really Matter? -- Ultra-Large-Scale Systems Seven Years after the Study" by Linda Northrop, director of the Research, Technology, and Systems Solution Program at the Software Engineering Institute (SEI). The 2006 book Ultra-Large-Scale Systems: The Software Challenge of the Future (ISBN 0-9786956-0-7) documented the results of a year-long study; the book is available as a free PDF. Recommended Reading.

Copyright © 2013 Prabhaker Mateti