IOSR Journal of Electronics and Communication Engineering (IOSR-JECE)
e-ISSN: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 1, Ver. 1 (Jan - Feb. 2015), PP 49-54
www.iosrjournals.org
DOI: 10.9790/2834-10114954 www.iosrjournals.org 49 | Page
Design and Implementation of High Speed Area Efficient Double
Precision Floating Point Arithmetic Unit
Onkar Singh1, Kanika Sharma2
1 Arni University, Kathgarh, Indora, HP, India.
2 NITTTR, Chandigarh, India.
Abstract: A floating-point arithmetic unit is designed to carry out operations on floating-point numbers.
Floating-point numbers can represent a much wider range of values than fixed-point representation.
Floating-point units are mainly used in high-speed object recognition systems, high-performance computer
systems, embedded systems and mobile applications. The proposed work uses a latch-based design, so that
longer combinational paths can be compensated by shorter path delays in the subsequent logic gates, which
improves the performance of the design. All four individual units (addition, subtraction, multiplication and
division) are designed in Verilog, verified using Questa Sim and implemented on a Virtex-5 FPGA.
Keywords: Floating Point, IEEE, FPGA, Virtex-5, Double Precision, Verilog, Arithmetic Unit
I. Introduction
A floating-point arithmetic unit is designed to carry out operations on floating-point numbers. Such
units are widely used in high-speed object recognition systems, high-performance computer systems,
embedded systems, mobile applications and signal processing applications. Floating-point representation
can support a much wider range of values than fixed-point representation. Representing very large or very
small values requires a wide range, for which integer representation is no longer appropriate. These values
can be represented using the IEEE-754 floating-point standard. The design has been implemented with
latches, so that longer combinational paths can be compensated by shorter path delays in the subsequent
logic gates, which improves the performance of the design.
IEEE Single Precision Format: The IEEE single precision format uses 32 bits to represent a floating point
number, divided into three subfields, as illustrated in figure 1. The first field is the sign bit for the fraction
part. The next field consists of 8 bits used for the exponent, and the third field consists of the remaining
23 bits used for the fractional part.
Sign Exponent Fraction
1 bit 8 bits 23 bits
Fig 1: IEEE format for single precision
IEEE Double Precision Format: The IEEE double precision format uses 64 bits for representing a floating
point number, as illustrated in figure 2. The first bit is the sign bit for the fraction part. The next 11 bits are used
for the exponent, and the remaining 52 bits are used for the fractional part.
Sign Exponent Fraction
1 bit 11 bits 52 bits
Fig 2: IEEE format for double precision
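The field layout of figure 2 can be checked in software. The following is a minimal Python sketch, not part of the proposed hardware; the function name unpack_double is hypothetical, and it assumes that a Python float is stored as an IEEE-754 double, which holds on common platforms.

```python
import struct

def unpack_double(x: float):
    """Split an IEEE-754 double into its (sign, exponent, fraction) fields."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)  # 52 bits, the leading 1 is implicit
    return sign, exponent, fraction

print(unpack_double(1.0))   # (0, 1023, 0)
print(unpack_double(-2.5))  # (1, 1024, 1125899906842624)
```

For -2.5 = -1.25 x 2^1, the stored exponent is 1 + 1023 = 1024 and the fraction field holds 0.25 x 2^52.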
II. Implementation Of Double Precision Floating Point Arithmetic Unit
The block diagram of the proposed floating point arithmetic unit is given in figure 3. The unit supports
four arithmetic operations: add, subtract, multiply and divide. The operations are carried out in four
separate modules, one each for addition, subtraction, multiplication and division, as shown in figure 3.
A 3-bit op-code selects the operation to be performed on the 64-bit operands, and the same op-code selects
the output of the corresponding module and connects it to the final output of the unit. Each exception
signal goes high whenever an exception of that type occurs.
Fig 3: Block diagram of double precision floating point arithmetic unit
Fig 4: RTL view of double precision floating point arithmetic unit
The floating point arithmetic unit consists of the following blocks:
2.1) Fpu_Add - Floating Point Adder
2.2) Fpu_Sub - Floating Point Subtractor
2.3) Fpu_Mul - Floating Point Multiplier
2.4) Fpu_Div - Floating Point Divider
2.5) Fpu_Round - Floating Point Rounding Unit
2.6) Fpu_Exception - Floating Point Exception Unit
2.1) Fpu_Add- Floating Point adder- Two floating point numbers are added as shown below.
$(f_1 \times 2^{e_1}) + (f_2 \times 2^{e_2}) = F \times 2^{E}$
In order to add two fractions, the associated exponents must be equal. Thus, if the two exponents
differ, one of the fractions must be unnormalized (shifted right) and its exponent adjusted accordingly. The
smaller number is the one that should be adjusted, so that any digits lost in the shift are the least
significant ones. In this module two 64-bit numbers are added, and after passing through the rounding and
exception units the final sum appears at the output.
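The alignment step can be illustrated with a small Python sketch. It is an illustration of the idea only, not the paper's Verilog: the name align_and_add is hypothetical, fractions are taken as non-negative integers of equal width, and guard bits, rounding and renormalization are left out.

```python
def align_and_add(f1: int, e1: int, f2: int, e2: int):
    """Add f1 * 2**e1 and f2 * 2**e2, returning (F, E).

    The operand with the smaller exponent is shifted right until both
    exponents match. Bits shifted out are simply lost in this sketch.
    """
    if e1 < e2:
        # Swap so that operand 1 always carries the larger exponent.
        f1, e1, f2, e2 = f2, e2, f1, e1
    f2_aligned = f2 >> (e1 - e2)  # unnormalize the smaller operand
    return f1 + f2_aligned, e1

# 12 * 2**3 + 10 * 2**1: the second fraction is shifted right by 2 bits.
print(align_and_add(12, 3, 10, 1))  # (14, 3)
```

The exact sum here is 116, while 14 x 2^3 = 112. The difference comes from the two low-order bits dropped during alignment, which is why the operand with the smaller exponent is the one shifted.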
2.2) Fpu_Sub- Floating Point Subtractor- Two floating point numbers are subtracted as shown below.
$(f_1 \times 2^{e_1}) - (f_2 \times 2^{e_2}) = F \times 2^{E}$
In order to subtract two fractions, the associated exponents must be equal. Thus, if the two exponents
differ, one of the fractions must be unnormalized (shifted right) and its exponent adjusted accordingly. The
smaller number is the one that should be adjusted, so that any digits lost in the shift are the least
significant ones. In this module two 64-bit numbers are subtracted, and after passing through the rounding
and exception units the final difference appears at the output.
2.3) Fpu_Mul- Floating Point Multiplier-Two floating point numbers are multiplied as shown below.
$(f_1 \times 2^{e_1}) \times (f_2 \times 2^{e_2}) = (f_1 \times f_2) \times 2^{(e_1+e_2)} = F \times 2^{E}$
In this module two 64-bit numbers are multiplied using sub-multipliers, and after passing through the
rounding and exception units the final product appears at the output. The mantissa of operand A is
stored in the 53-bit register (mul_a), and the mantissa of operand B is stored in the 53-bit register (mul_b).
Multiplying all 53 bits of mul_a by all 53 bits of mul_b gives a 106-bit product. A 53 by 53 bit
multiplier is not available in the most popular Xilinx FPGAs, so the multiply is broken down into
smaller multiplies whose results are added together to give the final 106-bit product. The module
(fpu_mul) breaks up the multiply into smaller 24-bit by 17-bit multiplies. The Xilinx Virtex5 device contains
DSP48E slices with 25 by 18 bit two's complement multipliers, which can perform a 24 by 17 bit unsigned
multiply. The multiply is broken up as follows:
Multiplier_1 = mul_a[23:0] x mul_b[16:0]
Multiplier_2 = mul_a[23:0] x mul_b[33:17]
Multiplier_3 = mul_a[23:0] x mul_b[50:34]
Multiplier_4 = mul_a[23:0] x mul_b[52:51]
Multiplier_5 = mul_a[40:24] x mul_b[16:0]
Multiplier_6 = mul_a[40:24] x mul_b[33:17]
Multiplier_7 = mul_a[40:24] x mul_b[52:34]
Multiplier_8 = mul_a[52:41] x mul_b[16:0]
Multiplier_9 = mul_a[52:41] x mul_b[33:17]
Multiplier_10 = mul_a[52:41] x mul_b[52:34]
The ten partial products (Multiplier_1 to Multiplier_10) are added together, each with an offset
determined by which parts of mul_a and mul_b it multiplies. The summation is accomplished by adding one
product to the running result at a time, instead of adding all ten products in a single summation. The
final 106-bit product is stored in the register (product). The exponent fields of operands A and B are added
together and the value 1022 is subtracted from the sum. If the resulting exponent is less than 0, the
(product) register must be right shifted by that amount. The final exponent of the output operand will be 0
in this case, and the result will be a denormalized number.
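The decomposition can be checked numerically. The Python sketch below is a software model, not the fpu_mul Verilog; the helper names bits and multiply_53x53 are hypothetical. It takes the ten bit ranges from the list of Multiplier_1 to Multiplier_10 and assumes that each partial product is offset by the sum of the low bit indices of its two slices.

```python
import random

# (a_hi, a_lo, b_hi, b_lo) bit ranges of Multiplier_1 .. Multiplier_10.
SLICES = [
    (23, 0, 16, 0), (23, 0, 33, 17), (23, 0, 50, 34), (23, 0, 52, 51),
    (40, 24, 16, 0), (40, 24, 33, 17), (40, 24, 52, 34),
    (52, 41, 16, 0), (52, 41, 33, 17), (52, 41, 52, 34),
]

def bits(x: int, hi: int, lo: int) -> int:
    """Return bits x[hi:lo] as an unsigned integer."""
    return (x >> lo) & ((1 << (hi - lo + 1)) - 1)

def multiply_53x53(mul_a: int, mul_b: int) -> int:
    """Accumulate the ten partial products one at a time."""
    product = 0
    for a_hi, a_lo, b_hi, b_lo in SLICES:
        partial = bits(mul_a, a_hi, a_lo) * bits(mul_b, b_hi, b_lo)
        product += partial << (a_lo + b_lo)  # offset of this partial product
    return product

# Compare against a direct multiply for random 53-bit mantissas.
random.seed(0)
for _ in range(1000):
    a = random.getrandbits(53) | (1 << 52)  # leading 1 set
    b = random.getrandbits(53) | (1 << 52)
    assert multiply_53x53(a, b) == a * b
```

The check passes because the slices of mul_a cover bits 52 down to 0 without overlap, and for each of them the paired slices of mul_b do the same.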
2.4) Fpu_Div- Floating Point Division - Two floating point numbers are divided as shown below.
$(f_1 \times 2^{e_1}) / (f_2 \times 2^{e_2}) = (f_1 / f_2) \times 2^{(e_1-e_2)} = F \times 2^{E}$
In this module two 64-bit numbers are divided, and after passing through the rounding and exception
units the final quotient appears at the output. The leading '1' (if normalized) and mantissa of operand A
form the dividend, and the leading '1' (if normalized) and mantissa of operand B form the divisor. One bit
of the quotient is calculated each clock cycle, based on a comparison between the dividend and divisor
registers. If the dividend is greater than the divisor, the quotient bit is '1'; the divisor is subtracted from
the dividend, the difference is shifted one bit to the left, and the shifted value becomes the dividend for
the next clock cycle. If the dividend is less than the divisor, the quotient bit is '0'; the dividend is shifted
one bit to the left, and the shifted value becomes the dividend for the next clock cycle. These steps are
repeated once per quotient bit. At the end, the dividend register holds the remainder and the quotient
register holds the quotient.
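The compare, subtract and shift loop can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than the fpu_div implementation: the name shift_subtract_divide is hypothetical, each loop iteration stands for one clock cycle, the comparison uses "greater than or equal" so that equal operands give a quotient bit of 1, and the dividend is assumed to be less than twice the divisor, which holds for two normalized mantissas.

```python
def shift_subtract_divide(dividend: int, divisor: int, n_bits: int):
    """Produce n_bits quotient bits, one per iteration (one per clock cycle).

    Assumes 0 < divisor and dividend < 2 * divisor.
    Returns (quotient, remainder) with
    quotient == (dividend << (n_bits - 1)) // divisor.
    """
    quotient = 0
    for _ in range(n_bits):
        if dividend >= divisor:
            quotient = (quotient << 1) | 1        # quotient bit is 1
            dividend = (dividend - divisor) << 1  # subtract, then shift left
        else:
            quotient <<= 1                        # quotient bit is 0
            dividend <<= 1                        # shift left only
    # The register was shifted once more after the last bit; undo that shift.
    return quotient, dividend >> 1

# 3 / 2 with four quotient bits gives binary 1.100, i.e. 1.5.
print(shift_subtract_divide(3, 2, 4))  # (12, 0)
```

With n_bits quotient bits, the first bit is the integer part and the remaining n_bits - 1 bits are fraction bits.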
2.5) Fpu_Round- Floating Point Rounding Unit - The rounding module modifies a number so that it fits
the destination's format. The rounding modes are listed below.
2.5.1) Round to nearest even: This is the standard default rounding. The value is rounded up or down to the
nearest representable value; when it lies exactly halfway between two representable values, the one with an
even least significant digit is chosen.
2.5.2) Round-to-Zero: In this mode the excess bits are simply truncated, e.g. 5.48 is truncated to 5.4.
2.5.3) Round-Up: In this mode the number is rounded up towards +∞, e.g. 6.4 is rounded to 7, while
-5.4 is rounded to -5.
2.5.4) Round-Down: The opposite of round-up; the number is rounded down towards -∞, e.g. 6.4 is
rounded to 6, while -5.4 is rounded to -6.
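The four modes can be sketched on a sign-magnitude value in Python. This is an illustration, not the fpu_round logic: the name round_magnitude is hypothetical, the 2-bit mode encoding follows the r_mode values given in section 3.3, and the number is modelled as an integer magnitude from which the lowest `drop` bits must be removed.

```python
def round_magnitude(sign: int, value: int, drop: int, mode: int) -> int:
    """Drop the low `drop` bits of a magnitude and round the rest.

    sign: 0 for positive, 1 for negative.
    mode: 0b00 nearest even, 0b01 toward zero, 0b10 toward +inf, 0b11 toward -inf.
    Returns the rounded magnitude; the sign is unchanged.
    """
    kept = value >> drop
    rem = value & ((1 << drop) - 1)   # the bits being discarded
    if rem == 0:
        return kept                   # exact, nothing to round
    half = 1 << (drop - 1)
    if mode == 0b00:                  # round to nearest, ties to even
        if rem > half or (rem == half and kept & 1):
            kept += 1
    elif mode == 0b10:                # toward +inf: grow positive magnitudes
        if sign == 0:
            kept += 1
    elif mode == 0b11:                # toward -inf: grow negative magnitudes
        if sign == 1:
            kept += 1
    # mode 0b01 (toward zero) just truncates
    return kept

# 6.5 is the magnitude 13 with one fractional bit to drop.
print(round_magnitude(0, 13, 1, 0b00))  # 6 (tie, 6 is even)
print(round_magnitude(0, 13, 1, 0b10))  # 7
print(round_magnitude(1, 11, 1, 0b10))  # 5, i.e. -5.5 rounds up to -5
print(round_magnitude(1, 11, 1, 0b11))  # 6, i.e. -5.5 rounds down to -6
```

Working on the magnitude means that round-up and round-down each increment for only one sign, which is why the sign bit is an input here.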
2.6) Fpu_Exception- Floating Point Exception Unit- An exception occurs when an operation on some
particular operands has no outcome suitable for every reasonable application.
The five possible exceptions are:
2.6.1) Invalid: An operation whose result does not exist, such as the square root of a negative number; NaN
is returned by default.
2.6.2) Division by zero: An operation on finite operands that gives an exact infinite result, e.g. 1/0 or
log(0); positive or negative infinity is returned by default.
2.6.3) Overflow: Occurs when an operation produces a result too large to be represented correctly;
±infinity is returned by default (for round-to-nearest mode).
2.6.4) Underflow: Occurs when an operation produces a very small result, i.e. one that is outside the
normal range and inexact by default.
2.6.5) Inexact: Occurs whenever the result of an arithmetic operation is not exact due to the restricted
exponent or precision range.
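As an illustration of these definitions, the Python sketch below reports which flags a double precision division would raise. It is a software model under stated assumptions, not the fpu_exceptions logic: the name division_flags and the flag strings are hypothetical, NaN operands are treated as quiet (no flag), rounding is round to nearest, and underflow is reported when the exact quotient is nonzero, smaller than the smallest normal number, and inexact.

```python
import math
import sys
from fractions import Fraction

def division_flags(a: float, b: float) -> set:
    """Return the set of exception flags raised by a / b (sketch)."""
    if math.isnan(a) or math.isnan(b):
        return set()                  # quiet NaN operands: assumed to raise nothing
    if (a == 0 and b == 0) or (math.isinf(a) and math.isinf(b)):
        return {"invalid"}            # 0/0 and inf/inf have no defined result
    if b == 0:
        # A finite nonzero dividend gives an exact infinity: division by zero.
        return set() if math.isinf(a) else {"divide_by_zero"}
    if math.isinf(a) or math.isinf(b):
        return set()                  # inf/x and x/inf are exact
    exact = Fraction(a) / Fraction(b) # exact rational quotient
    result = a / b                    # quotient rounded to double precision
    if math.isinf(result):
        return {"overflow", "inexact"}
    flags = set()
    if Fraction(result) != exact:
        flags.add("inexact")
        if 0 < abs(exact) < Fraction(sys.float_info.min):
            flags.add("underflow")    # tiny and inexact
    return flags

print(division_flags(1.0, 0.0))  # {'divide_by_zero'}
print(division_flags(0.0, 0.0))  # {'invalid'}
print(division_flags(1.0, 3.0))  # {'inexact'}
```

Comparing the rounded result against an exact rational quotient is a convenience of the software model; hardware would typically derive the inexact flag from the bits discarded during rounding.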
III. Synthesis, Timing And Simulation Result
3.1) Synthesis Result
DEVICE UTILIZATION SUMMARY
Logic Utilization                     Used    Available    Utilization
Number of Slice Registers             4762    28800        16%
Number of Slice LUTs                  6525    28800        22%
Number of fully used LUT-FF pairs     3013    7600         36%
Number of bonded IOBs                 206     220          93%
Number of BUFG/BUFGCTRLs              6       32           18%
Number of DSP48Es                     9       48           18%
3.2) Timing Result
Minimum Period 3.817ns (Maximum Frequency: 262.006MHz)
Minimum Input arrival time before clocks 3.900ns
Maximum output required time after clocks 2.775ns
3.3) Simulation Result- The simulation results of the double precision floating point arithmetic unit
(addition, subtraction, multiplication and division) are shown in fig 5, fig 6, fig 7 and fig 8 respectively.
Each is calculated for two input operands of 64 bits each. The reset signal is kept low throughout the
simulation, so that the operands are initialised all at once; then, when the enable signal goes high, the
operation on the two operands is calculated. The result then passes into fpu_round and from there into
fpu_exceptions, whose out signal gives the output. In the waveforms, clock shows the applied frequency
(262.006MHz). Fpu_op defines the operation to be performed: 0=addition, 1=subtraction, 2=multiplication
and 3=division. Opa1 and Opa1 define input operand one and input operand two respectively. The r_mode
signal defines the rounding mode (00=Round to nearest even, 01=Round-to-Zero, 10=Round-Up,
11=Round-Down). Fpu_out gives the final output.
3.3.1) Simulation Result of floating point addition- It is calculated for two input operands of 64 bits each.
The floating point unit requires 15 clock cycles to complete an addition. At a frequency of 262.006MHz one
clock cycle takes about 3.82ns, so 15 clock cycles take 3.82ns x 15 = 57.3ns. Therefore the addition
completes in about 57.3ns.
Fig 5: Simulation Result of Floating Point Addition
Operation         Time taken by module
Addition          57.255ns (15 cycles)
Subtraction       57.255ns (15 cycles)
Multiplication    57.255ns (15 cycles)
Division          259.556ns (68 cycles)
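The times in this table are the cycle counts multiplied by the minimum period of 3.817ns from the timing result. A short Python sketch of that arithmetic follows; the names MIN_PERIOD_NS, CYCLES and latency_ns are hypothetical.

```python
MIN_PERIOD_NS = 3.817  # minimum clock period from the timing result

# Clock cycles per operation, as reported in the simulation results.
CYCLES = {"Addition": 15, "Subtraction": 15, "Multiplication": 15, "Division": 68}

def latency_ns(cycles: int, period_ns: float = MIN_PERIOD_NS) -> float:
    """Latency of an operation: number of cycles times the clock period."""
    return cycles * period_ns

for op, cycles in CYCLES.items():
    print(f"{op}: {latency_ns(cycles):.3f} ns ({cycles} cycles)")
```

Using the period rounded to 3.82ns instead gives 57.3ns and 259.76ns, which explains the slightly different figures quoted in the text of sections 3.3.1 to 3.3.4.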
3.3.2) Simulation result of floating point subtraction- It is calculated for two input operands of 64 bits
each. The floating point unit requires 15 clock cycles to complete a subtraction. At a frequency of
262.006MHz one clock cycle takes about 3.82ns, so 15 clock cycles take 3.82ns x 15 = 57.3ns.
Therefore the subtraction completes in about 57.3ns.
Fig 6: Simulation Result of Floating Point Subtraction
3.3.3) Simulation result of floating point multiplication- It is calculated for two input operands of 64 bits
each. The floating point unit requires 15 clock cycles to complete a multiplication. At a frequency of
262.006MHz one clock cycle takes about 3.82ns, so 15 clock cycles take 3.82ns x 15 = 57.3ns.
Therefore the multiplication completes in about 57.3ns.
Fig 7: Simulation Result of Floating Point Multiplication
3.3.4) Simulation result of floating point division- It is calculated for two input operands of 64 bits each.
The floating point unit requires 68 clock cycles to complete a division. At a frequency of 262.006MHz one
clock cycle takes about 3.82ns, so 68 clock cycles take 3.82ns x 68 = 259.76ns. Therefore the
division completes in about 259.76ns.
Fig 8: Simulation Result of Floating Point Division
IV. Conclusion
This paper presents a high performance implementation of a double precision floating point arithmetic
unit. The complete design is captured in the Verilog hardware description language (HDL), tested in
simulation using Questa Sim, and placed and routed on a Virtex-5 FPGA from Xilinx. It operates at a
maximum frequency of 262.006MHz. When synthesized, the module used 16% of the slice registers, 22% of
the slice LUTs and 36% of the fully used LUT-FF pairs. The latch-based design increases the overall
performance. The work can be extended by adding further modules, such as square root and logarithmic
units, to the floating point unit, and the complete design can also be implemented on a high performance
Virtex-6 FPGA.

 
A floating-point adder (IEEE 754 floating-point.pptx
A floating-point adder (IEEE 754 floating-point.pptxA floating-point adder (IEEE 754 floating-point.pptx
A floating-point adder (IEEE 754 floating-point.pptxNiveditaAcharyya2035
 

High Speed Floating Point Design

IOSR Journal of Electronics and Communication Engineering (IOSR-JECE)
e-ISSN: 2278-2834, p-ISSN: 2278-8735. Volume 10, Issue 1, Ver. 1 (Jan - Feb. 2015), PP 49-54
www.iosrjournals.org
DOI: 10.9790/2834-10114954

Design and Implementation of High Speed Area Efficient Double Precision Floating Point Arithmetic Unit

Onkar Singh (1), Kanika Sharma (2)
(1) Arni University, Kathgarh, Indora, HP, India.
(2) NITTTR, Chandigarh, India.

Abstract: A floating-point arithmetic unit is designed to carry out operations on floating point numbers. Floating point numbers can support a much wider range of values than fixed point representation. Floating point units are mainly used in high speed object recognition systems, high performance computer systems, embedded systems and mobile applications. A latch based design is implemented in the proposed work so that longer combinational paths can be compensated by shorter path delays in the subsequent logic gates, which is why the performance of the design has increased. All four individual units (addition, subtraction, multiplication and division) are designed using Verilog, verified using Questa Sim and implemented on a Virtex-5 FPGA.

Keywords: Floating Point, IEEE, FPGA, Virtex-5, Double Precision, Verilog, Arithmetic Unit

I. Introduction

A floating point arithmetic unit is designed to carry out operations on floating point numbers. Floating point arithmetic units are widely used in high speed object recognition systems, high performance computer systems, embedded systems, mobile applications and signal processing applications. Floating point representation can support a much wider range of values than fixed point representation. A wide range is required to represent very large or very small values, for which an integer representation is no longer appropriate. These values can be represented using the IEEE-754 standard floating point representation.
The design has been implemented with latches so that longer combinational paths can be compensated by shorter path delays in the subsequent logic gates, which is why the performance of the design has increased.

IEEE Single Precision Format: The IEEE single precision format uses 32 bits to represent a floating point number, divided into three subfields, as illustrated in figure 1. The first field is the sign bit of the fraction part. The next field consists of 8 bits, which are used for the exponent. The third field consists of the remaining 23 bits and is used for the fractional part.

Sign     Exponent   Fraction
1 bit    8 bits     23 bits
Fig 1: IEEE format for single precision

IEEE Double Precision Format: The IEEE double precision format uses 64 bits to represent a floating point number, as illustrated in figure 2. The first bit is the sign bit of the fraction part. The next 11 bits are used for the exponent, and the remaining 52 bits are used for the fractional part.

Sign     Exponent   Fraction
1 bit    11 bits    52 bits
Fig 2: IEEE format for double precision

II. Implementation Of Double Precision Floating Point Arithmetic Unit

The block diagram of the proposed floating point arithmetic unit is given in figure 3. The unit supports four arithmetic operations: add, subtract, multiply and divide. Each operation is carried out in its own module, as shown in figure 3. The operation to be performed on the 64-bit operands is selected by a 3-bit op-code, and the same op-code selects the output of the corresponding module and connects it to the final output of the unit. The corresponding exception signal goes high whenever an exception of that type occurs.
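The 64-bit operands handled by the unit follow the double precision layout of figure 2. As a minimal software sketch (written in Python rather than the paper's Verilog, with `split_double` and `join_double` being hypothetical helper names that do not come from the paper), the three fields can be extracted from a host double and reassembled like this:

```python
import struct

def split_double(x: float):
    """Return (sign, exponent, fraction) of an IEEE-754 double.

    sign is 1 bit, exponent is the 11-bit biased field and
    fraction is the 52-bit field, as in figure 2.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]  # raw 64-bit pattern
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction

def join_double(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a double from its three fields (inverse of split_double)."""
    bits = (sign << 63) | (exponent << 52) | fraction
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

print(split_double(1.0))    # (0, 1023, 0)
print(split_double(-2.5))   # (1, 1024, 1125899906842624), i.e. fraction = 1 << 50
```

A model of this kind can be useful for producing reference operands and expected results when checking a hardware unit in simulation.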
Fig 3: Block diagram of double precision floating point arithmetic unit
Fig 4: RTL view of double precision floating point arithmetic unit

The floating point arithmetic unit consists of the following blocks:
2.1) Fpu_Add - Floating Point Adder
2.2) Fpu_Sub - Floating Point Subtractor
2.3) Fpu_Mul - Floating Point Multiplier
2.4) Fpu_Div - Floating Point Division
2.5) Fpu_Round - Floating Point Rounding Unit
2.6) Fpu_Exception - Floating Point Exception Unit

2.1) Fpu_Add - Floating Point Adder: Two floating point numbers are added as shown below.

$(f_1 \times 2^{e_1}) + (f_2 \times 2^{e_2}) = F \times 2^{E}$

In order to add two fractions, the associated exponents must be equal. Thus, if the two exponents are different, one of the fractions must be unnormalized and its exponent adjusted accordingly. The smaller number is the one that should be adjusted, so that if significant digits are lost the effect on the result is small. In this module two 64-bit numbers are added, and after passing through the rounding and exception units the final sum appears at the output.

2.2) Fpu_Sub - Floating Point Subtractor: Two floating point numbers are subtracted as shown below.

$(f_1 \times 2^{e_1}) - (f_2 \times 2^{e_2}) = F \times 2^{E}$

In order to subtract two fractions, the associated exponents must be equal. Thus, if the two exponents are different, one of the fractions must be unnormalized and its exponent adjusted accordingly. The smaller number is the one that should be adjusted, so that if significant digits are lost the effect on the result is small. In this module two 64-bit numbers are subtracted, and after passing through the rounding and exception units the final difference appears at the output.

2.3) Fpu_Mul - Floating Point Multiplier: Two floating point numbers are multiplied as shown below.

$(f_1 \times 2^{e_1}) \times (f_2 \times 2^{e_2}) = (f_1 \times f_2) \times 2^{(e_1 + e_2)} = F \times 2^{E}$

In this module two 64-bit numbers are multiplied using sub-multipliers, and after passing through the rounding and exception units the final product appears at the output. The mantissa of operand A is stored in the 53-bit register (mul_a), and the mantissa of operand B is stored in the 53-bit register (mul_b). Multiplying all 53 bits of mul_a by all 53 bits of mul_b gives a 106-bit wide product. A 53 by 53 bit multiplier is not available in the most popular Xilinx FPGAs, so the multiplication is broken down into smaller multiplications whose results are added together to give the final 106-bit product. The module (fpu_mul) breaks the multiplication into smaller 24-bit by 17-bit multiplications. The Xilinx Virtex-5 device contains DSP48E slices with 25 by 18 two's complement multipliers, which can perform a 24 by 17-bit multiplication. The multiplication is broken up as follows:

Multiplier_1 = mul_a[23:0] x mul_b[16:0]
Multiplier_2 = mul_a[23:0] x mul_b[33:17]
Multiplier_3 = mul_a[23:0] x mul_b[50:34]
Multiplier_4 = mul_a[23:0] x mul_b[52:51]
Multiplier_5 = mul_a[40:24] x mul_b[16:0]
Multiplier_6 = mul_a[40:24] x mul_b[33:17]
Multiplier_7 = mul_a[40:24] x mul_b[52:34]
Multiplier_8 = mul_a[52:41] x mul_b[16:0]
Multiplier_9 = mul_a[52:41] x mul_b[33:17]
Multiplier_10 = mul_a[52:41] x mul_b[52:34]

The outputs of multipliers 1 to 10 are added together with the appropriate offsets, based on which parts of mul_a and mul_b they multiply. The summation is carried out by adding one product to the running sum of the previous products, rather than adding all 10 products in a single summation. The final 106-bit product is stored in the register (product). The exponent fields of operands A and B are added together and the value 1022 is then subtracted from their sum. If the resulting exponent is less than 0, the (product) register is right shifted by that amount. In this case the final exponent of the output operand is 0 and the result is a denormalized number.

2.4) Fpu_Div - Floating Point Division: Two floating point numbers are divided as shown below.

$(f_1 \times 2^{e_1}) / (f_2 \times 2^{e_2}) = (f_1 / f_2) \times 2^{(e_1 - e_2)} = F \times 2^{E}$

In this module two 64-bit numbers are divided, and after passing through the rounding and exception units the final quotient appears at the output. The leading '1' (if normalized) and the mantissa of operand A form the dividend, and the leading '1' (if normalized) and the mantissa of operand B form the divisor.
In division, one bit of the quotient is calculated each clock cycle, based on a comparison between the dividend and divisor registers. If the dividend is greater than or equal to the divisor, the quotient bit is '1'; the divisor is subtracted from the dividend, the difference is shifted one bit to the left, and the shifted value becomes the dividend for the next clock cycle. Otherwise the quotient bit is '0', the dividend is shifted one bit to the left, and the shifted value becomes the dividend for the next clock cycle. These steps are repeated once per quotient bit. At the end, the dividend register holds the remainder and the quotient register holds the quotient.

2.5) Fpu_Round - Floating Point Rounding Unit: The rounding module modifies a number so that it fits the destination's format. The rounding modes are as follows.
2.5.1) Round to nearest even: This is the standard default rounding. The infinitely precise result is rounded to the nearest representable value; in case of a tie, the value with an even least significant bit is chosen.
2.5.2) Round-to-Zero: In this mode the excess bits are simply truncated, e.g. 5.48 is truncated to 5.4.
2.5.3) Round-Up: In this mode the number is rounded up towards +∞, e.g. 6.4 is rounded to 7, while -5.4 is rounded to -5.
2.5.4) Round-Down: The opposite of round-up; the number is rounded down towards -∞, e.g. 6.4 is rounded to 6, while -5.4 is rounded to -6.

2.6) Fpu_Exception - Floating Point Exception Unit: An exception occurs when an operation on particular operands has no outcome suitable for a reasonable application. The five possible exceptions are:
2.6.1) Invalid: Raised for operations whose result does not exist, such as the square root of a negative number; NaN is returned by default.
2.6.2) Division by zero: Raised for an operation on finite operands that gives an exact infinite result, e.g. 1/0 or log(0); positive or negative infinity is returned by default.
2.6.3) Overflow: Occurs when an operation produces a number too large to be represented correctly; ±infinity is returned by default (for round-to-nearest mode).
2.6.4) Underflow: Occurs when the result of an operation is very small, i.e. outside the normal range, and inexact.
2.6.5) Inexact: Occurs whenever the result of an arithmetic operation is not exact because of the restricted exponent or precision range.
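The partial-product scheme of Section 2.3 can be checked in software. The sketch below is a Python model, not the paper's Verilog: it takes the ten slice pairs exactly as listed for Multiplier_1 to Multiplier_10 and assumes that the "appropriate offset" of each product is the sum of the low bit indices of its two slices, which the paper does not spell out. The names `SLICES`, `bits` and `multiply_53x53` are hypothetical.

```python
# (a_hi, a_lo, b_hi, b_lo) for Multiplier_1 .. Multiplier_10, as listed in Section 2.3.
SLICES = [
    (23, 0, 16, 0),
    (23, 0, 33, 17),
    (23, 0, 50, 34),
    (23, 0, 52, 51),
    (40, 24, 16, 0),
    (40, 24, 33, 17),
    (40, 24, 52, 34),
    (52, 41, 16, 0),
    (52, 41, 33, 17),
    (52, 41, 52, 34),
]

def bits(value: int, hi: int, lo: int) -> int:
    """Return value[hi:lo] as an unsigned integer (Verilog-style part select)."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

def multiply_53x53(mul_a: int, mul_b: int) -> int:
    """Multiply two 53-bit mantissas by accumulating the ten partial products."""
    product = 0
    for a_hi, a_lo, b_hi, b_lo in SLICES:
        partial = bits(mul_a, a_hi, a_lo) * bits(mul_b, b_hi, b_lo)
        # Assumed offset: low index of the mul_a slice plus low index of the mul_b slice.
        product += partial << (a_lo + b_lo)  # running sum, one product at a time
    return product

a = (1 << 52) | 0x123456789ABCD   # example mantissas with the leading 1 set
b = (1 << 52) | 0xFEDCBA9876543
assert multiply_53x53(a, b) == a * b
```

The sketch takes the listing at face value, including the 19-bit `mul_b[52:34]` slices used by Multiplier_7 and Multiplier_10. Under the offset assumption above, the ten slices cover every bit pair exactly once, so the accumulated sum equals the full 106-bit product.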
III. Synthesis, Timing And Simulation Result

3.1) Synthesis Result

DEVICE UTILIZATION SUMMARY
Logic Utilization                    Used    Available   Utilization
Number of Slice Registers            4762    28800       16%
Number of Slice LUTs                 6525    28800       22%
Number of fully used LUT-FF pairs    3013    7600        36%
Number of bonded IOBs                206     220         93%
Number of BUFG/BUFGCTRLs             6       32          18%
Number of DSP48Es                    9       48          18%

3.2) Timing Result

Minimum period                              3.817 ns (maximum frequency: 262.006 MHz)
Minimum input arrival time before clock     3.900 ns
Maximum output required time after clock    2.775 ns

3.3) Simulation Result: The simulation results of the double precision floating point arithmetic unit for addition, subtraction, multiplication and division are shown in fig 5, fig 6, fig 7 and fig 8 respectively. Each result is calculated for two input operands of 64 bits each. The reset signal is kept low throughout the simulation so that the operands are initialised all at once; when the enable signal goes high, the selected operation is computed on the two operands. The result then passes through fpu_round and fpu_exceptions, and the out signal of fpu_exceptions gives the output. In the waveforms, clock defines the frequency (262.006 MHz) applied to the signals. Fpu_op defines the operation to be performed: 0 = addition, 1 = subtraction, 2 = multiplication and 3 = division. Opa1 and Opa1 define input operand one and input operand two respectively. The r_mode signal defines the rounding mode (00 = Round to nearest even, 01 = Round-to-Zero, 10 = Round-Up, 11 = Round-Down). Fpu_out defines the final output.

3.3.1) Simulation result of floating point addition: The result is calculated for two input operands of 64 bits each. The floating point unit requires 15 clock cycles to complete an addition. At 262.006 MHz one clock cycle takes 3.817 ns, so 15 clock cycles take 3.817 ns × 15 = 57.255 ns. The addition therefore completes in about 57.3 ns.

Fig 5: Simulation Result of Floating Point Addition

Operation         Time taken by modules in ns
Addition          57.255 ns (15 cycles)
Subtraction       57.255 ns (15 cycles)
Multiplication    57.255 ns (15 cycles)
Division          259.556 ns (68 cycles)
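The entries of the timing table follow directly from the cycle counts and the minimum period of Section 3.2. A small Python sketch of that arithmetic is given below; it works in integer picoseconds to avoid rounding surprises, and the names `MIN_PERIOD_PS`, `CYCLES` and `latency_ns` are hypothetical.

```python
MIN_PERIOD_PS = 3817  # minimum period from Section 3.2 (3.817 ns), in picoseconds

# Cycle counts reported in the paper for each operation.
CYCLES = {
    "Addition": 15,
    "Subtraction": 15,
    "Multiplication": 15,
    "Division": 68,
}

def latency_ns(cycles: int, period_ps: int = MIN_PERIOD_PS) -> float:
    """Latency in nanoseconds of an operation that takes `cycles` clock cycles."""
    return cycles * period_ps / 1000

for name, cycles in CYCLES.items():
    print(f"{name}: {latency_ns(cycles):.3f} ns ({cycles} cycles)")
# Addition: 57.255 ns (15 cycles)
# Subtraction: 57.255 ns (15 cycles)
# Multiplication: 57.255 ns (15 cycles)
# Division: 259.556 ns (68 cycles)
```

Using the rounded period of 3.82 ns instead gives slightly larger values (57.3 ns and 259.76 ns), which explains small differences between figures quoted with different roundings.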
3.3.2) Simulation result of floating point subtraction: The result is calculated for two input operands of 64 bits each. The floating point unit requires 15 clock cycles to complete a subtraction. At 262.006 MHz one clock cycle takes 3.817 ns, so 15 clock cycles take 3.817 ns × 15 = 57.255 ns. The subtraction therefore completes in about 57.3 ns.

Fig 6: Simulation Result of Floating Point Subtraction

3.3.3) Simulation result of floating point multiplication: The result is calculated for two input operands of 64 bits each. The floating point unit requires 15 clock cycles to complete a multiplication. At 262.006 MHz one clock cycle takes 3.817 ns, so 15 clock cycles take 3.817 ns × 15 = 57.255 ns. The multiplication therefore completes in about 57.3 ns.

Fig 7: Simulation Result of Floating Point Multiplication

3.3.4) Simulation result of floating point division: The result is calculated for two input operands of 64 bits each. The floating point unit requires 68 clock cycles to complete a division. At 262.006 MHz one clock cycle takes 3.817 ns, so 68 clock cycles take 3.817 ns × 68 = 259.556 ns. The division therefore completes in about 259.6 ns.
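Division needs many more cycles than the other operations, which is consistent with the one-quotient-bit-per-cycle scheme described in Section 2.4. The Python sketch below models that shift-and-subtract loop on integer mantissas. It is an illustration under stated assumptions rather than the paper's implementation: the dividend is assumed to be smaller than twice the divisor (true for two normalized 53-bit mantissas), the comparison uses "greater than or equal", and the names `serial_divide` and `steps` are hypothetical.

```python
def serial_divide(dividend: int, divisor: int, steps: int):
    """Shift-and-subtract division producing one quotient bit per step.

    Assumes 0 <= dividend < 2 * divisor. After `steps` iterations the
    quotient equals floor(dividend * 2**(steps - 1) / divisor), i.e. the
    first bit is the integer bit and the rest are fraction bits.
    """
    quotient = 0
    for _ in range(steps):           # one iteration models one clock cycle
        quotient <<= 1
        if dividend >= divisor:      # quotient bit is 1: subtract the divisor
            quotient |= 1
            dividend -= divisor
        dividend <<= 1               # shifted value is the next dividend
    remainder = dividend >> 1        # undo the final shift
    return quotient, remainder

# 1.5 / 1.0 with 53-bit mantissas (leading 1 at bit 52) and 54 quotient bits.
a = 3 << 51                          # 1.5 * 2**52
b = 1 << 52                          # 1.0 * 2**52
q, r = serial_divide(a, b, 54)
assert q == 3 << 52 and r == 0       # q / 2**53 == 1.5
```

The number of iterations used here (54) is only an example; the paper does not state how its 68 cycles are divided between quotient bits, rounding and exception handling.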
Fig 8: Simulation Result of Floating Point Division

IV. Conclusion

This paper presents a high performance implementation of a double precision floating point arithmetic unit. The complete design is captured in the Verilog hardware description language (HDL), tested in simulation using Questa Sim, and placed and routed on a Virtex-5 FPGA from Xilinx. It operates at a maximum frequency of 262.006 MHz. When synthesized, the module used 16% of the slice registers, 22% of the slice LUTs and 36% of the fully used LUT-FF pairs. The work can be extended by adding further modules, such as square root and logarithmic units, to the floating point unit, and the complete design can also be implemented on a higher performance Virtex-6 FPGA.