The Fast Fourier Transform (FFT) is a crucial part of modern digital communication systems such as Digital Video Broadcasting (DVB), Ultra Wideband (UWB) systems, and Orthogonal Frequency Division Multiplexing (OFDM) based systems. Several applications need a large-point FFT processor for multicarrier modulation, for example the 4096-point FFT in Very-high-bit-rate Digital Subscriber Line (VDSL), Digital Video Broadcasting-Terrestrial (DVB-T), and Digital Audio Broadcasting (DAB). The FFT and IFFT are central functions in such multicarrier-modulation-based transmission systems. Further, the FFT is used for source tracking in sensor signal processing, for frequency-domain beamforming, and for analyzing biomedical signals such as the electroencephalogram (EEG) and electrocardiogram (ECG) in the frequency domain. To provide high performance and meet the real-time requirements of recent applications, hardware designers have constantly tried to implement efficient architectures for computing the FFT.

Over the past decade, different FFT algorithms have been proposed. Previous algorithms focused mainly on reducing the hardware complexity of multipliers and adders; researchers gave less attention to reducing the number of phase vectors, or twiddle factors, required in the processor. Generating and storing the phase vectors and multiplying them with the input signal require substantial hardware resources. The most complex operation in an FFT processor is phase vector multiplication. Generally, ROM tables are used to store the phase vectors; when a large-point FFT processor is implemented, the table becomes large and occupies more area in the design. The complexity of the algorithm depends on the number of phase vector multiplications.
The multiplication can be implemented using various methods, such as a LUT that stores the phase vectors combined with complex multipliers, complex constant multiplication, and the CORDIC algorithm [9], [10], [11].

In the proposed system, an eight-parallel 4096-point radix-2^4 MDC FFT/IFFT processor is designed to improve the performance of the OFDM system. In this paper, various approaches for twiddle factor multiplication are discussed, and a radix-2^4 MDC feedforward pipeline structure is adopted in our design. It improves the performance of the FFT/IFFT processor by reducing the multiplier complexity, and it also reduces the normalized area of the processor. The organization of this brief is as follows: Section 2 describes the balanced binary tree structure. The proposed eight-parallel 4096-point radix-2^4 MDC FFT/IFFT processor is explained in Section 3. Implementation of the proposed model is shown in Section 4, results and discussion are presented in Section 5, and conclusions are provided in Section 6.

In addition, different binary trees were developed for the 512-point FFT in radix-2^4, radix-2^6, and radix-2^3. From Fig. 3, we see that the binary trees are not properly balanced. From this, we can understand that all the possible decomposition methods introduce the same number of butterfly operations but with different twiddle factors.

Different hardware architectures have been proposed to design a high-speed FFT/IFFT processor. High performance can be achieved by using pipelined processing at a reasonable hardware cost. A pipelined architecture requires completely scheduled operation sequences.
Such architectures can be classified as Single-path Delay Feedback (SDF), Multipath Delay Feedback (MDF), Single-path Delay Commutator (SDC), and Multipath Delay Commutator (MDC), based on the quantity of data processed at a time [11], [12], [13].

Generally, a pipelined FFT processor is designed using one of two popular methods: the Single-path Delay Feedback (SDF) pipeline architecture or the Multipath Delay Commutator (MDC) pipeline architecture. The SDF pipeline FFT requires less memory space, its multiplier utilization is less than 50%, and its control unit is easy to design. These features are advantageous in low-power designs, especially in applications for portable DSP devices. A commonly used architecture for a transform of length N = b^r is the pipelined FFT. The pipelined architecture is characterized by continuous processing of input data. In addition, it is highly regular, making it straightforward to automatically generate FFTs of various lengths.

For applications that require a throughput of more than 1 gigasample per second, the MDF architecture is normally used. MDF architectures need fewer delay elements because of the delay feedback design, while the number of data paths equals the level of parallelism. Such throughput can be obtained by increasing the number of data paths to 8 or 16, at the cost of increased hardware.

Fig. 6 shows the block diagram of the 512-point radix-2^k Single-path Delay Feedback architecture, with the Butterfly Units (BF), twiddle factors (W), and First-In First-Out (FIFO) buffers for each stage. This architecture is generic, and the required range of each complex twiddle factor multiplier is outlined in the table for varying values of i. For twiddle factor multipliers with small ranges, special methods have been proposed.
In particular, for a W_4 multiplier the possible coefficients are {±1, ±j}, so the multiplication reduces to optionally interchanging the real and imaginary parts and possibly negating them (or replacing the addition with a subtraction in the subsequent stage). For larger ranges (W_8, W_16, and W_32), approaches have been proposed in [4], [6], and [8].
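
The multiplier-free W_4 case described above can be illustrated with a minimal Python sketch. It assumes the usual convention W_N = exp(-j2π/N), which the text does not state explicitly, and the function name `mul_w4` is hypothetical. Integer inputs stand in for fixed-point samples; this is only an illustration of the swap-and-negate idea, not the hardware design of the proposed processor.

```python
def mul_w4(re, im, k):
    """Multiply (re + j*im) by W_4^k without a multiplier.

    Assumes W_4 = exp(-j*2*pi/4) = -j, so W_4^k is one of {1, -j, -1, +j}.
    Only swaps and sign changes are used, as a hardware unit would do.
    """
    k %= 4
    if k == 0:
        return re, im        # times  1: pass through
    if k == 1:
        return im, -re       # times -j: swap parts, negate new imaginary part
    if k == 2:
        return -re, -im      # times -1: negate both parts
    return -im, re           # times +j: swap parts, negate new real part


if __name__ == "__main__":
    # Show the four possible rotations of the sample 3 + 5j.
    for k in range(4):
        print(k, mul_w4(3, 5, k))
```

In a pipelined stage, the negations can often be absorbed by turning an addition in the following butterfly into a subtraction, which is the alternative mentioned in the parenthetical remark above.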