CS 473
Homework 1: Floating Point
Due Friday, September 17, 2004
Exercises
In all of the following, be sure to show all your work (including the
tables for the multiplication and division methods when performing
radix conversions).
- (10 points each)
Convert each of the following numbers from decimal into 32-bit
IEEE floating point format. Give your final answer as an
eight-digit hexadecimal number.
  - 13.3125
  - 0
  - -4.5
  - 0.7 (yes, it has a repeating fraction)
- (10 points each)
Convert each of the following numbers from 32-bit floating point
format into ordinary human-readable decimal.
  - 3f800000
  - 42148000
  - c22d4000
  - bd800000
- (20 points each)
Perform the following floating point operations using the
algorithms we described in class.
  - 3f800000 + 3f800000
  - 41420000 + c15a0000
  - 40000000 * 40000000
  - 422e0000 * be000000
  - 42c80000 / 40a00000
  - 41700000 / c1100000
Programming
100 points
Write a C program to perform floating point multiplication - by hand.
Your program will need to:
- Read two numbers from standard input, as strings. The numbers
will have the following format:
  - an optional minus sign
  - 1 or more digits between 0 and 9. There will be no
    leading 0s, unless it is the only digit before the decimal
    point.
  - a decimal point.
  - 1 or more digits between 0 and 9. There will be no
    trailing 0s, unless it is the only digit after the decimal
    point.
The two numbers will be on a single line, with at least one
space (it'll be a space character -- no tabs) between them.
There won't be any extraneous characters (in particular, no
multiplication symbol) between them.
- Convert the numbers into IEEE floating point format. Since the
computer is working in binary, you will use the multiplication
algorithm to convert the integer part to binary, and the
division algorithm to convert the fractional part to binary.
Your result is to be a 32-bit number in IEEE floating point
format, stored in a uint32_t (a 32-bit unsigned value, as
defined in /usr/include/stdint.h).
- Use the floating point multiplication algorithm to multiply the
numbers together.
- Convert the result into a string in the same syntax as defined
above, and print it out.
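
As a rough illustration of what "stored in a uint32_t" means in practice, here is a minimal sketch of packing and unpacking the three fields of a 32-bit IEEE number. The helper names (pack_ieee, ieee_sign, ieee_exponent, ieee_mantissa) are made up for this example and are not part of the assignment; the sketch also assumes normalized numbers only, so zero would need separate handling.

```c
#include <stdint.h>

/* Hypothetical helper: assemble a 32-bit IEEE single from its fields.
   sign is 0 or 1, exponent is the biased 8-bit exponent, and
   fraction is the 23 bits that follow the hidden leading 1. */
static uint32_t pack_ieee(uint32_t sign, uint32_t exponent, uint32_t fraction)
{
    return ((sign & 1u) << 31)
         | ((exponent & 0xFFu) << 23)
         | (fraction & 0x7FFFFFu);
}

/* Hypothetical helpers: pull the fields back out. */
static uint32_t ieee_sign(uint32_t bits)     { return bits >> 31; }
static uint32_t ieee_exponent(uint32_t bits) { return (bits >> 23) & 0xFFu; }

/* 24-bit mantissa with the hidden 1 restored (normalized numbers only). */
static uint32_t ieee_mantissa(uint32_t bits)
{
    return (bits & 0x7FFFFFu) | 0x800000u;
}
```

For instance, 2.5 is 1.01 (binary) times 2^1, so pack_ieee(0, 128, 0x200000) gives 40200000, and ieee_mantissa(0x40200000) gives back the 24-bit value a00000.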
You'll need a couple of tricks to get this assignment to work.
The first is to use bit shifting to put a number's binary point
where you want it. The assignment says to use the division method to
calculate fractions, but if you just take a single digit and divide it
by 10 in integer arithmetic, you'll get 0, which isn't terribly
helpful. Notice what happens, though, if you take a digit, left-shift
it by 28 bits, and then divide by ten, as in
(5 << 28) / 10
You get hexadecimal 08000000 -- exactly the right value for
.5, left-shifted 28 places.
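
One way the shift-then-divide trick might extend to several fractional digits is sketched below. The function name fraction_to_fixed28 is hypothetical, and processing the digits from right to left is an assumption about how the division method is applied; each step adds the current digit (shifted left 28 places) to the running fraction and divides by ten. Integer division truncates at every step, so the low bits of the result can be slightly off.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert the digits after the decimal point
   (for example "25" for .25) into a fixed-point fraction scaled by 2^28.
   Assumes the string contains only the characters '0' through '9'. */
static uint32_t fraction_to_fixed28(const char *digits)
{
    size_t n = strlen(digits);
    uint32_t frac = 0;

    /* Work from the last digit back to the first. The sum never
       overflows: (9 << 28) plus a fraction below 2^28 fits in 32 bits. */
    while (n > 0) {
        uint32_t d = (uint32_t)(digits[--n] - '0');
        frac = ((d << 28) + frac) / 10;
    }
    return frac;
}
```

With this sketch, "5" gives 08000000 as in the text, and "25" gives 04000000, which is .25 left-shifted 28 places.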
A second trick you'll need is to perform your multiplications 12 bits
at a time, since the full product of two 24-bit mantissas is 48 bits
wide and won't fit in a 32-bit integer. I'll try to explain this
better in class than here, but:
- Consider your 24-bit mantissas as two 12-bit fields each. Name
the fields in the first number A (the high 12 bits) and B (the low
12 bits), and the fields in the second one C (high) and D (low).
- Multiply B*D. This will give you a 24 bit result. Now
right-shift 12 bits. Call this result E.
- Now calculate and add A*D + B*C + E. Right-shift this sum by 12
bits (there are some rounding nasties that can crop up here; ignore
them) and call it F.
- Now calculate and add A*C + F. That's the most significant 24
bits of your result, and is what you need for the multiplication
in this assignment.
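
The steps above can be sketched in C roughly as follows. The function name mul24_hi is made up for this example, and it assumes A and C are the high 12 bits of each mantissa and B and D the low 12 bits. Every intermediate value stays below 2^32, so plain uint32_t arithmetic is enough.

```c
#include <stdint.h>

/* Hypothetical helper: return the most significant 24 bits of the
   48-bit product of two 24-bit mantissas, using only 32-bit math. */
static uint32_t mul24_hi(uint32_t x, uint32_t y)
{
    uint32_t a = (x >> 12) & 0xFFFu;  /* high 12 bits of the first number  */
    uint32_t b = x & 0xFFFu;          /* low 12 bits of the first number   */
    uint32_t c = (y >> 12) & 0xFFFu;  /* high 12 bits of the second number */
    uint32_t d = y & 0xFFFu;          /* low 12 bits of the second number  */

    uint32_t e = (b * d) >> 12;               /* B*D, shifted right 12  */
    uint32_t f = (a * d + b * c + e) >> 12;   /* middle terms plus E    */

    return a * c + f;                 /* top 24 bits, truncated not rounded */
}
```

For example, multiplying the mantissa 800000 (1.0 with the hidden bit restored) by itself gives 400000. Since each mantissa represents a value from 1 up to (but not including) 2, the product lies between 1 and 4, so the result may still need a one-bit normalization shift, with a matching exponent adjustment, before it is packed back into IEEE format.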
Last modified: Tue Sep 14 15:50:05 MDT 2004