PFAFFT - Functions to perform Prime Factor (PFA) FFT's, in place
npfa return valid n for complex-to-complex PFA
npfar return valid n for real-to-complex/complex-to-real PFA
npfao return optimal n for complex-to-complex PFA
npfaro return optimal n for real-to-complex/complex-to-real PFA
pfacc 1D PFA complex to complex
pfacr 1D PFA complex to real
pfarc 1D PFA real to complex
pfamcc multiple PFA complex to real
pfa2cc 2D PFA complex to complex
pfa2cr 2D PFA complex to real
pfa2rc 2D PFA real to complex
Function Prototypes:
int npfa (int nmin);
int npfao (int nmin, int nmax);
int npfar (int nmin);
int npfaro (int nmin, int nmax);
void pfacc (int isign, int n, complex z[]);
void pfacr (int isign, int n, complex cz[], float rz[]);
void pfarc (int isign, int n, float rz[], complex cz[]);
void pfamcc (int isign, int n, int nt, int k, int kt, complex z[]);
void pfa2cc (int isign, int idim, int n1, int n2, complex z[]);
void pfa2cr (int isign, int idim, int n1, int n2, complex cz[], float rz[]);
void pfa2rc (int isign, int idim, int n1, int n2, float rz[], complex cz[]);
npfa:
Input:
nmin lower bound on returned value (see notes below)
Returned: valid n for prime factor fft
npfao
Input:
nmin lower bound on returned value (see notes below)
nmax desired (but not guaranteed) upper bound on returned value
Returned: valid n for prime factor fft
npfar
Input:
nmin lower bound on returned value
Returned: valid n for real-to-complex/complex-to-real prime factor fft
npfaro:
Input:
nmin lower bound on returned value
nmax desired (but not guaranteed) upper bound on returned value
Returned: valid n for real-to-complex/complex-to-real prime factor fft
pfacc:
Input:
isign sign of isign is the sign of exponent in fourier kernel
n length of transform (see notes below)
z array[n] of complex numbers to be transformed in place
Output:
z array[n] of complex numbers transformed
pfacr:
Input:
isign sign of isign is the sign of exponent in fourier kernel
n length of transform (see notes below)
cz array[n/2+1] of complex values (may be equivalenced to rz)
Output:
rz array[n] of real values (may be equivalenced to cz)
pfarc:
Input:
isign sign of isign is the sign of exponent in fourier kernel
n length of transform; must be even (see notes below)
rz array[n] of real values (may be equivalenced to cz)
Output:
cz array[n/2+1] of complex values (may be equivalenced to rz)
pfamcc:
Input:
isign sign of isign is the sign of exponent in fourier kernel
n number of complex elements per transform (see notes below)
nt number of transforms
k stride in complex elements within transforms
kt stride in complex elements between transforms
z array of complex elements to be transformed in place
Output:
z array of complex elements transformed
pfa2cc:
Input:
isign sign of isign is the sign of exponent in fourier kernel
idim dimension to transform, either 1 or 2 (see notes)
n1 1st (fast) dimension of array to be transformed (see notes)
n2 2nd (slow) dimension of array to be transformed (see notes)
z array[n2][n1] of complex elements to be transformed in place
Output:
z array[n2][n1] of complex elements transformed
pfa2cr:
Input:
isign sign of isign is the sign of exponent in fourier kernel
idim dimension to transform, which must be either 1 or 2 (see notes)
n1 1st (fast) dimension of array to be transformed (see notes)
n2 2nd (slow) dimension of array to be transformed (see notes)
cz array of complex values (may be equivalenced to rz)
Output:
rz array of real values (may be equivalenced to cz)
pfa2rc:
Input:
isign sign of isign is the sign of exponent in fourier kernel
idim dimension to transform, which must be either 1 or 2 (see notes)
n1 1st (fast) dimension of array to be transformed (see notes)
n2 2nd (slow) dimension of array to be transformed (see notes)
rz array of real values (may be equivalenced to cz)
Output:
cz array of complex values (may be equivalenced to rz)
Notes:
Table of valid n and cost for prime factor fft. For each n, cost
was estimated to be the inverse of the number of ffts done in 1 sec
on an IBM RISC System/6000 Model 320H, by Dave Hale, 08/04/91.
(Redone by Jack Cohen for 15 sec to rebuild NTAB table on advice of
David and Gregory Chudnovsky, 05/03/94).
Cost estimates are least accurate for very small n. An alternative method
for estimating cost would be to count multiplies and adds, but this
method fails to account for the overlapping of multiplies and adds
that is possible on some computers, such as the IBM RS/6000 family.
npfa:
The returned n will be composed of mutually prime factors from
the set {2,3,4,5,7,8,9,11,13,16}. Because n cannot exceed
720720 = 5*7*9*11*13*16, 720720 is returned if nmin exceeds 720720.
npfao:
The returned n will be composed of mutually prime factors from
the set {2,3,4,5,7,8,9,11,13,16}. Because n cannot exceed
720720 = 5*7*9*11*13*16, 720720 is returned if nmin exceeds 720720.
If nmin does not exceed 720720, then the returned n will not be
less than nmin. The optimal n is chosen to minimize the estimated
cost of performing the fft, while satisfying the constraint, if
possible, that n not exceed nmax.
npfar and npfaro:
Current implemenations of real-to-complex and complex-to-real prime
factor ffts require that the transform length n be even and that n/2
be a valid length for a complex-to-complex prime factor fft. The
value returned by npfar satisfies these conditions. Also, see notes
for npfa.
pfacc:
n must be factorable into mutually prime factors taken
from the set {2,3,4,5,7,8,9,11,13,16}. in other words,
n = 2**p * 3**q * 5**r * 7**s * 11**t * 13**u
where
0 <= p <= 4, 0 <= q <= 2, 0 <= r,s,t,u <= 1
is required for pfa to yield meaningful results. this
restriction implies that n is restricted to the range
1 <= n <= 720720 (= 5*7*9*11*13*16)
pfacr:
Because pfacr uses pfacc to do most of the work, n must be even
and n/2 must be a valid length for pfacc. The simplest way to
obtain a valid n is via n = npfar(nmin). A more optimal n can be
obtained with npfaro.
pfarc:
Because pfarc uses pfacc to do most of the work, n must be even
and n/2 must be a valid length for pfacc. The simplest way to
obtain a valid n is via n = npfar(nmin). A more optimal n can be
obtained with npfaro.
pfamcc:
To perform a two-dimensional transform of an n1 by n2 complex array
(assuming that both n1 and n2 are valid "n"), stored with n1 fast
and n2 slow:
pfamcc(isign,n1,n2,1,n1,z); (to transform 1st dimension)
pfamcc(isign,n2,n1,n1,1,z); (to transform 2nd dimension)
pfa2cc:
Only one (either the 1st or 2nd) dimension of the 2-D array is transformed.
If idim equals 1, then n2 transforms of n1 complex elements are performed;
else, if idim equals 2, then n1 transforms of n2 complex elements are
performed.
Although z appears in the argument list as a one-dimensional array,
z may be viewed as an n1 by n2 two-dimensional array: z[n2][n1].
Valid n is computed via the "np" subroutines.
To perform a two-dimensional transform of an n1 by n2 complex array
(assuming that both n1 and n2 are valid "n"), stored with n1 fast
and n2 slow: pfa2cc(isign,1,n1,n2,z); pfa2cc(isign,2,n1,n2,z);
pfa2cr:
If idim equals 1, then n2 transforms of n1/2+1 complex elements to n1 real
elements are performed; else, if idim equals 2, then n1 transforms of n2/2+1
complex elements to n2 real elements are performed.
Although rz appears in the argument list as a one-dimensional array,
rz may be viewed as an n1 by n2 two-dimensional array: rz[n2][n1].
Likewise, depending on idim, cz may be viewed as either an n1/2+1 by
n2 or an n1 by n2/2+1 two-dimensional array of complex elements.
Let n denote the transform length, either n1 or n2, depending on idim.
Because pfa2rc uses pfa2cc to do most of the work, n must be even
and n/2 must be a valid length for pfa2cc. The simplest way to
obtain a valid n is via n = npfar(nmin). A more optimal n can be
obtained with npfaro.
pfa2rc:
If idim equals 1, then n2 transforms of n1 real elements to n1/2+1 complex
elements are performed; else, if idim equals 2, then n1 transforms of n2
real elements to n2/2+1 complex elements are performed.
Although rz appears in the argument list as a one-dimensional array,
rz may be viewed as an n1 by n2 two-dimensional array: rz[n2][n1].
Likewise, depending on idim, cz may be viewed as either an n1/2+1 by
n2 or an n1 by n2/2+1 two-dimensional array of complex elements.
Let n denote the transform length, either n1 or n2, depending on idim.
Because pfa2rc uses pfa2cc to do most of the work, n must be even
and n/2 must be a valid length for pfa2cc. The simplest way to
obtain a valid n is via n = npfar(nmin). A more optimal n can be
obtained with npfaro.
References:
Temperton, C., 1985, Implementation of a self-sorting
in-place prime factor fft algorithm: Journal of
Computational Physics, v. 58, p. 283-299.
Temperton, C., 1988, A new set of minimum-add rotated
rotated dft modules: Journal of Computational Physics,
v. 75, p. 190-198.
Differ significantly from Press et al, 1988, Numerical Recipes in C, p. 417.
Author: Dave Hale, Colorado School of Mines, 04/27/89