Document #: US-14577,EN ------------------------------------------------------------------------------ Major subject: data_mgmt Minor subjects: datatypes_be Keywords: aggregates, tech_notes Abstract: Floating Point Numbers Expert note: Floating Point Numbers ====================== Overview -------- Floating point numbers can present problems unless you have a solid understanding of how they work and what type of precision to expect under different circumstances. This document will explain floating point numbers and precision in terms of the INGRES floating point datatype and formatting templates in the following sections: o Precision o Display Templates - Terminal Monitor o Display Templates - Forms o Retrieving Floating Point Numbers o When to Expect Precision o Causes of Imprecision o If Imprecision Isn't Good Enough o Summary Precision --------- An f4/float4 datatype can contain numbers ranging from -10**38 to 10**38, but can maintain only 7 decimal places precisely. An f8/float8 can also store numbers in the same range, but instead will guarantee accuracy to 16 decimal places. Floating point numbers allow you to store very large and small numbers, but the more significant figures the number has, the more imprecise the number may be. An f4/float4 number and an i4/integer4 number both use 4 bytes, but an f4/float4 can store up to -/+ 10**38, whereas an i4/integer4 number can only store values in the range -2,147,483,648 through 2,147,483,647 (or approx. 10 decimal digits). The difference is that floating point numbers with many significant figures are only approximations, they are not "precise", while integers are always precise. Floating point numbers are stored differently than integer numbers. A floating point number consists of a sign bit, an 8 bit exponent field, and a fractional number. An integer consists only of a sign bit and the number. Thus, an i4/integer4, with a sign bit and 31 other bits, can precisely hold numbers in the range -/+ 2**31. ---- Pictorial Diagram of an f4/float4 on the VAX --- Bits: 15 14 -- 7 6 0 + - + - - - - - - - - + - - - - - - - + |S | exp | fraction | + - + - - - - - - - - + - - - - - - - + | fraction | + - - - - - - - - - - - - - - - - + Bits: 16 -- 31 On a VAX, the exponent portion of a floating point can indicate binary exponents of -127 through +127. The fraction part of the floating point number, with 23 bits, can hold numbers up to 2**23; that is, typically 7 decimal digits. Architecturally, however, there is an additional "implied" bit for VAX floating point numbers, effectively implying significance to 24 bits. That is why if your number goes beyond approximately 7 decimal digits in an f4/float4, the fraction portion of the f4 can no longer store your number precisely. An f8/float8, with a fraction portion of 55 bits, can precisely store numbers to 2**56 (with implied bit), or typically 16 decimal digits. Consider as an example the SQL or QUEL command line flag "-f4n24.10". "24.10" is called the display template. The number 24 represents the total number of digits available in the display. Not that many digits will necessarily appear. The number 10 is the number of digits to appear after the decimal point in the display. Exactly that many digits will always appear after the decimal point, whether or not there is that much data available. What "precise to 7 digits" means can be illustrated using the same command line flag as an example. This flag specifies the f4/float4 format, using a display of 24 digits, 10 of them after the decimal point. This is more accuracy than a f4/float4 number can maintain, but illustrates how precision is lost after between 7 or 8 decimal digits. sql -f4n24.10or quel -f4n24.10 |Ascii version of number appended |f4/float4column output | |-----------------------------------------------------------------| |1.0000000 | 1.0000000000| |1.00000005 | 1.0000000000| |1.123456789 | 1.1234568357| |1234567.7654321 | 1234567.7500000000| |12345678.12345678 | 12345678.0000000000| |123456789123.12345678 | 123456790528.0000000000| |-----------------------------------------------------------------| In the f4/float4 format, INGRES can store 1.0000000 accurately, since only 7 decimal digits are entered. INGRES lost precision on 1.00000005 because 9 decimal digits were entered, causing the least significant portion of the number to be lost. The third row shows that within an f4/float4, INGRES can maintain the accuracy of the number 1.123456, but adding additional digits makes the number imprecise. Display Templates - Terminal Monitor ------------------------------------ By default, Terminal Monitor uses a display template for both an f4/float4 and an f8/float8 of 10 decimal places of precision, 7 digits in front of the decimal point and three decimal places precision after the decimal point. If the number is too big or too small to fit into this display format, the number is printed in exponential notation. The default template would be specified on the INGRES command line by "sql -f4n10.3 -f8n10.3 " or "quel -f4n10.3 -f8n10.3 ". You can optionally specify your own floating point display format using the "-f" flag on the INGRES command line. The syntax is: -fkxM.N k 4 or 8, controlling template for either f4/float4's or f8/float8's x E,F,G, or N E - Exponential format F - Floating point format. If number is too large for format; stars (****) are printed to represent overflow of display. G - Floating point format. Decimal points are aligned. If number too large for format, will use E instead. N - Floating point format. Left justified (leftmost digits line up). If number too large for format, will use E instead. M is the total column width of the field N is the portion of the total column width displayed after the decimal point Example Showing Output With Different Display Templates: |Number | No Flags | -f4F10.3 | -f4E10.3 | -f4G10.3 | -f4N10.3 | |----------------------|----------|----------|----------|----------|----------| |1.0000000 | 1.000| 1.000| 1.000e+00| 1.000 | 1.000| |1.00000005 | 1.000| 1.000| 1.000e+00| 1.000 | 1.000| |1.123456789 | 1.123| 1.123| 1.123e+00| 1.123 | 1.123| |1234567.7654321 | 1.234e+06|**********| 1.234e+06| 1.234e+06| 1.234e+06| |12345678.12345678 | 1.234e+07|**********| 1.234e+07| 1.234e+07| 1.234e+07| |123456789123.12345678 | 1.234e+11|**********| 1.234e+11| 1.234e+11| 1.234e+11| |-----------------------------------------------------------------------------| Display Templates - Forms ------------------------- Display templates for forms can be changed by editing the data portion of the field on the form. The default template for a floating point field (both f4/float4 and f8/float8) is -f10.3, that is, left justified, a full 10 places displayed (not including the decimal point) with 3 digits after the decimal point. The format for different display templates in VIFRED is very similar to the format for the Terminal Monitor. The main difference is that you need not specify "-fk", and right or left justification is controlled by a "+" (right,default) or "-" (left) sign before the template. Template Resultant Field Data Value Displayed Value f10.3 f_____.___ 1 x: 1.000 123.456 x: 123.456 12345678.999 x: ********** -n10.3 f_____.___ 123.456 x: 123.456 12345678.999 x: 1.234e+07 123.4569 x: 123.457 +e10.3 f_.___E___ 1 x: 1.000e+00 123.456 x: 1.234e+02 +g10.3 f_____.___ 1.000 x: 1.000 123.456 x: 1.234e+02 Some rounding is done automatically by the forms system. For instance, if your floating point template has been specified as "-n10.3" and you enter the number 1.9999, when you tab past that field, the forms system will round to three decimal places and your number will appear on your form as 2.000. Retrieving Floating Point Numbers --------------------------------- Because floating point numbers can be imprecise, and because the Terminal Monitor by default displays only 3 decimal digits after the decimal point, retrieving floating point numbers can sometimes be quite mystifying. To illustrate some of the problems you may run into, let us use as an example the f4/float4 datatype (the same issues apply to an f8/float8). SQL: CREATE TABLE test (float4column=float4); INSERT INTO test (float4column) VALUES (123.456); INSERT into test (float4column) VALUES (123.4567); INSERT into test (float4column) VALUES (123.45678); QUEL: CREATE test (f4column=f4) APPEND TO test (f4column=123.456) APPEND TO test (f4column=123.4567) APPEND TO test (f4column=123.45678) The following retrieval results in inaccuracies because the default setting of the Terminal Monitor shows only 3 decimal places of precision after the decimal point. The additional decimal places have been rounded off. SQL: SELECT * FROM test; |float4column | |-------------| | 123.456| | 123.457| | 123.457| |-------------| (3 rows) QUEL: RETRIEVE (test.all) |f4column | |----------| | 123.456| | 123.457| | 123.457| |----------| (3 rows) If we try to retrieve one of the last two rows as displayed, no rows are returned: SQL: SELECT * FROM test WHERE float4column=123.457; |float4column | |--------------| |--------------| (0 rows) QUEL: RETRIEVE (test.all) WHERE test.f4column=123.457 |f4column | |----------| |----------| (0 rows) Instead of trying to retrieve using an exact-match query, we can try using a range of values instead. The output is once again rounded, so it looks as if there really is a value of 123.457: SQL: SELECT * FROM test WHERE float4column > 123.456 AND float4column < 123.458; |f4loatcolumn | |-------------| | 123.457| | 123.457| |-------------| (2 rows) QUEL: RETRIEVE (test.all) WHERE test.f4column > 123.456 AND test.f4column < 123.458 |f4column | |----------| | 123.457| | 123.457| |----------| (2 rows) This technique of retrieving a range of values around the floating point value you are looking for is good to remember. When to Expect Precision ------------------------ If, in an f4/float4 datatype, you store a number which is 7 digits or less (or 16 digits in the case of f8/float8) and you have not operated on it in any way (see Causes of Imprecision) you should be able to retrieve your row, even though the results of the retrieval may not look accurate if INGRES has rounded them to 3 decimal digits. example: SQL: SELECT * FROM test WHERE float4column=123.45678; |float4column | |-------------| | 123.457| |-------------| (1 row) QUEL: RETRIEVE (test.all) WHERE test.f4column=123.45678 |f4column | |----------| | 123.457| |----------| (1 row) If we try to use the INGRES command line "-f" flag to display more digits, our results still don't look like what we entered. After 7 digits of precision, garbage values may be displayed. This is because the display template on the command line requires 7 digits to appear after the decimal point, whether there is any data there or not. SQL: sql -f4n17.7 SELECT * FROM test; |float4column | |-----------------| | 123.4560013| | 123.4567032| | 123.4567795| |-----------------| (3 rows) QUEL: quel -f4n17.7 RETRIEVE (test.all) |f4column | |-----------------| | 123.4560013| | 123.4567032| | 123.4567795| |-----------------| (3 rows) If you run this query yourself, you may get different numbers to the right of the third place after the decimal point. If you are having trouble retrieving a floating point number, choose a floating point display template larger than 7 (f4/float4) or 16 (f8/float8) decimal places, and try a retrieval using a range of values. Because floating point numbers are inherently imprecise, you should not use floating point numbers as keys for your table if you plan to do "exact match" key lookups (although for range lookups, floating point numbers are satisfactory keys). Causes of Imprecision --------------------- The main cause of imprecision is entering numbers which are greater than 7 digits in an f4/float4 or 16 digits in an f8/float8. Floating point numbers greater than these limits are inherently imprecise. There are other causes of imprecision which are less obvious, however. Imprecision can be caused when arithmetic operations are performed in INGRES on floating point numbers. If you stored the number 1 in a table, and then divided by 3, with these queries: example: SQL: INSERT INTO test (float4column) VALUES (1); UPDATE test SET float4column=float4column/3; QUEL: APPEND TO test (f4column=1) REPLACE test (f4column= test.f4column/3 ) the value would lose its precision. You would instead store 1/3 in base two using 24 bits, which in decimal representation is .33333333, and a retrieval of the sort: SQL: SELECT * FROM test WHERE float4column=.333; QUEL: RETRIEVE (test.all) WHERE test.f4column=.333 would not return any values. Use the range of values technique (for example, the statements: WHERE test.f4column>.3 AND WHERE test.f4column<.4) to pull up the record you are looking for. (NOTE: if you exactly specified the digits the fractional portion of the floating point had stored, you could still use an exact match retrieval to retrieve your record. In this case the retrieval would have worked had we said "where col=.33333333 (8 digits.)". If you MODIFY your table, you will still be guaranteed that the number is the same binary format as before. If you COPY out your data in ascii format, the number may lose precision when coerced to ascii, unless all your data fits into the precision range specified by the display template for the type of column in question. (Default for f4/float4's and f8/float8's is 10.3, 7 digits before the decimal point, 3 digits after; rounding will be automatic. The default template for character fields is exactly the same as the default retrieval output in TM). You can specify another display template to COPY out numbers more precisely, but bear in mind you will encounter the display problems outlined above. If you COPY out your data in binary (i.e. COPY tablename () INTO "file"), your floating point numbers should not lose precision when copied back in. However, if you are using one of the floating point formats on the INGRES command line, you may lose precision (i.e. INGRES -f8e12 ). If the number you enter has sufficient precision defined (7 digits for an f4/float4, 16 for an f8/float8) you will store that number and be able to retrieve that number. If Imprecision Isn't Good Enough -------------------------------- You will have to chose another datatype. Floating point numbers are inherently subject to imprecision. If you understand when and why imprecision occurs, you can decide when floating point is an appropriate datatype. The MONEY datatype may be an alternative in some cases. MONEY is rounded to two decimal places after the decimal point on input and out- put, but is internally stored as an f8/float8. MONEY is limited to 16 significant digits (because an f8/float8 can precisely store 16 signifi- cant digits). INGRES will always round MONEY values to two decimal places. To manage penny fractions, you can retrieve your MONEY data, do your own arithmetic in an embedded query language and then store your results back in the database. Another option is to store your floating point numbers as integers, with an implied decimal point. If you do your own math and use INGRES only to store your results, you should not lose pennies or encounter any roundoff problems. The MONEY datatype provides for larger numbers than does the i4/integer4, however. Alternatively, if neither MONEY nor the i4/integer4 is large enough for you, you might choose CHARACTER or TEXT (but be aware there is a performance cost to coercing data from one datatype to another). Summary ------- Users are often confused by floating point numbers. This document has attempted to explain some of the common problems you can run into, and demystify floating point imprecision. Date Written : 27-may-1986 Date Modified : 21-jun-1987 Releases affected: all(all.all) - Releases not affected: Errors: Bugs/SIRS: ------------------------------------------------------------------------------
© William Yuan 2000
Email William