Term Paper on: Numerical Precision Case Study Essay
The Floating Point Format
Floating point is a system for representing real numbers that supports a wide range of values. The term floating point is used because the radix point (the decimal point in base 10, the binary point in base 2) can be placed anywhere relative to the significant digits of the number. The point is thus seen to ‘float’. To store a number in floating point representation, a computer word is divided into 3 fields, representing the sign, the exponent E, and the significand m respectively (Forouzan & Mosharraf, 2008). A 32-bit word can be broken down into the following fields: sign 1 bit, exponent 8 bits and significand 23 bits. Since the exponent field is 8 bits, it can hold 256 distinct values, nominally exponents between -128 and 127. In the IEEE 754 single-precision format the exponent is stored with a bias of 127, and normalized numbers use exponents between -126 and 127.
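The field layout described above can be inspected directly. The following is a minimal sketch in Python, assuming the 32-bit word follows the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits stored with a bias of 127, 23 significand bits). The function name float32_fields is a hypothetical helper introduced only for this illustration.

```python
import struct

def float32_fields(x):
    """Split x, stored as a 32-bit float, into (sign, exponent, significand).

    Hypothetical helper for illustration; assumes IEEE 754 single precision.
    """
    # Pack x as a big-endian 32-bit float, then reread the same 4 bytes
    # as an unsigned 32-bit integer so the bit fields can be masked out.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    significand = bits & 0x7FFFFF    # 23 bits (fraction; leading 1 is implicit)
    return sign, exponent, significand

sign, exponent, significand = float32_fields(-6.5)
# -6.5 is -1.101 (binary) times 2**2, so the unbiased exponent is 2.
print(sign, exponent - 127, format(significand, "023b"))
```

Subtracting 127 from the stored exponent recovers the true exponent, which is why the raw 8-bit field does not map one-to-one onto the nominal range of an 8-bit signed integer.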
Manipulating and Using Floating Point Numbers in Arithmetic Calculations
The adjustability of the radix point allows calculations over a wide range of magnitudes using a fixed number of digits, while keeping roughly the same relative precision throughout. The difficulty in manipulating floating point numbers is evidenced by the fact that, to add or subtract two numbers, one must first adjust the two values so that their exponents are the same. When the result does not fit into the available number of significant figures, it must be truncated or rounded to that number of figures (three, for example), which produces an inexact result. The loss of accuracy in a single computation usually seems negligible unless one is greatly concerned about the accuracy of one's computations.
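The effect of rounding after an addition can be illustrated with a small sketch. It uses Python's decimal module with a working precision of three significant decimal digits, which is an assumption chosen to match the three-figure case mentioned above and not the precision of any real hardware format. The operand values are likewise made up for illustration.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN

# Assumed toy format: 3 significant decimal digits, round half to even.
ctx = Context(prec=3, rounding=ROUND_HALF_EVEN)

a = Decimal("1.23E+2")   # 123
b = Decimal("4.56E-1")   # 0.456

# Aligning the exponents gives 123.000 + 0.456 = 123.456 exactly
# (the default context keeps 28 digits, so nothing is lost here).
exact = a + b

# The same sum rounded to 3 significant digits loses b entirely.
rounded = ctx.add(a, b)

# Repeating the rounded addition shows how small errors accumulate.
total = a
for _ in range(1000):
    total = ctx.add(total, b)
exact_total = a + 1000 * b

print(exact, rounded)        # exact sum vs. 3-digit sum
print(exact_total, total)    # exact running total vs. 3-digit running total
```

In this toy format each individual error is only 0.456, but after 1000 additions the rounded total has not moved from its starting value, while the exact total has grown by 456.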