# ########################################################################
# Copyright 2013 Advanced Micro Devices, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ########################################################################
clFFT Readme
Version: 1.10
Release Date: April 2013
ChangeLog:
____________
Current Version:
* This release tested using the 9.012 runtime driver and the 2.8 APP SDK
____________
Version 1.8.291:
Fixed:
* Memory leaks affecting use cases where 'clfftEnqueueTransform' is used in a loop
____________
Version 1.8.269 (beta):
New:
* clFFT now supports real-to-complex and complex-to-real transforms;
refer to documentation for details
* This release tested using the 12.4 Catalyst software suite
Known Issues:
* Some degradation in performance of real transforms due to known
runtime/driver issues
* Failures in real transforms have been seen on 7xxx series GPUs with certain
problem sizes involving powers of 3 and 5
____________
Version 1.6.244:
Fixed:
* Failures observed in v1.6.236 in backward transforms of certain power-of-2
problem sizes (those involving radix 4 and radix 8).
____________
Version 1.6.236:
New:
* Performance of the FFT library has been improved for Radix-2 1D and 2D transforms
* Support for R4XXX GPUs is deprecated and no longer tested
* Preview: Support for AMD Radeon™ HD7000 series GPUs
* This release tested using the 8.92 runtime driver and the 2.6 APP SDK
____________
Version 1.4:
New:
* clFFT now supports transform lengths whose factors consist exclusively
of powers of 2, 3, and 5
* clFFT supports double precision data types
* clFFT executes on OpenCL 1.0 compliant devices
* This release tested using the 8.872 runtime driver and the 2.5 APP SDK
* A helper bash script appmlEnv.sh has been added to the root installation
directory to assist in properly setting up a terminal environment to
execute clFFT samples
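To illustrate the transform-length rule in the list above, here is a minimal
bash sketch that checks whether a length has no prime factors other than 2, 3,
and 5. The function name is_supported_length is hypothetical and is not part of
clFFT; it only restates the rule as arithmetic.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of clFFT): succeeds when the given length
# has no prime factors other than 2, 3, and 5.
is_supported_length() {
    local n=$1 p
    # Reject anything that is not a string of decimal digits.
    [[ $n =~ ^[0-9]+$ ]] || return 1
    # Force base 10 so a leading zero is not read as octal.
    n=$((10#$n))
    (( n > 0 )) || return 1
    for p in 2 3 5; do
        # Divide out every factor of p.
        while (( n % p == 0 )); do
            (( n /= p ))
        done
    done
    # Only 1 remains if 2, 3, and 5 were the only prime factors.
    (( n == 1 ))
}
```

For example, `is_supported_length 360` succeeds because 360 = 2^3 * 3^2 * 5,
while `is_supported_length 7` fails.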
Fixed:
* If the library needs a temporary buffer and the user does not supply one on
the Enqueue call, the library allocates the buffer internally. The lifetime
of that buffer is tied to the FFT plan; deleting the plan releases the
buffer.
* Test failures on CPU device for 32-bit systems (Windows/Linux)
Known Issues:
* Failures have been seen on graphics cards using R4550 (RV710) GPUs.
____________
Version 1.2:
New:
* Reduced the number of internal LDS bank conflicts for our 1D FFT transforms,
increasing performance.
* Padded reads/writes to global memory, decreasing bank conflicts and
increasing performance on 2D transforms.
* This release tested using the 8.841 runtime driver and the 2.4 APP SDK
Fixed:
* Failures have been seen attempting to queue work on the second GPU device on
a multi GPU 5970 card on Linux.
Known Issues:
* It is recommended that users query for and explicitly create an
intermediate buffer if clFFT requires one. If the library creates the
intermediate buffer internally, a race condition may occur on freeing the
buffer on lower-end hardware.
* Failures have been seen on graphics cards using R4550 (RV710) GPUs.
* Test failures on CPU device for 32-bit systems (Windows/Linux)
* It is recommended that Windows users uninstall previous versions of clFFT
before installing newer versions. Otherwise, Add/Remove Programs only
removes the latest version. Linux users can delete the install directory.
____________
Version 1.0:
* Initial release, available on all platforms
Known Issues:
* Failures have been seen attempting to queue work on the second GPU device on
a multi GPU 5970 card on Linux.
_____________________
Building the Samples:
To install the Linux versions of clFFT, uncompress the initial download and
then execute the install script.
For example:
tar -xf clFFT-${version}.tar.gz
- This installs three files into the local directory, one being an
executable bash script.
sudo mkdir /opt/clFFT-${version}
- This pre-creates the install directory with proper permissions in /opt
if it is to be installed there (This is the default).
./install-clFFT-${version}.sh
- This prints an EULA and uncompresses files into the chosen install
directory.
cd ${installDir}/bin64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${OpenCLLibDir}:${clfftLibDir}
- Export library dependencies to resolve all external linkages to the
client program. The user can create a bash script to help automate this
procedure.
./Client -h
- This prints the command line options that are available to the user
through the sample client.
./Client -iv
- Watch for the version strings to print out; watch for
'Client Test *****PASS*****' to print out.
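The library path export in the steps above could be wrapped in a small script,
as the text suggests. The following is only a sketch: the function name
add_clfft_lib_paths is hypothetical, and OpenCLLibDir and clfftLibDir are the
same placeholders used above, which the caller is assumed to set. It is not the
appmlEnv.sh helper mentioned in the version 1.4 notes.

```shell
#!/usr/bin/env bash
# Sketch only: append the OpenCL and clFFT library directories to
# LD_LIBRARY_PATH. Source this file so the export affects the current shell.
# OpenCLLibDir and clfftLibDir are assumed to be set by the caller.
add_clfft_lib_paths() {
    local dir
    for dir in "$OpenCLLibDir" "$clfftLibDir"; do
        # Skip entries that are unset or empty.
        [[ -n $dir ]] || continue
        case ":${LD_LIBRARY_PATH:-}:" in
            # Already on the path: leave it alone.
            *":$dir:"*) ;;
            # Otherwise append, adding a ':' only when the path is non-empty.
            *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
        esac
    done
    export LD_LIBRARY_PATH
}
```

After sourcing the file and setting the two variables, calling
add_clfft_lib_paths once per terminal session should be enough; calling it
again does not add duplicate entries.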
The sample program does not ship with native build files. Instead, a CMake
file is shipped, and users generate a native build file for their system.
For example:
cd ${installDir}
mkdir samplesBin/
- This creates a sister directory to the samples directory that will house
the native makefiles and the generated files from the build.
cd samplesBin/
ccmake ../samples/
- ccmake is a curses-based cmake program. It takes a parameter that
specifies the location of the source code to compile.
- Hit 'c' to configure for the platform; ensure that the dependencies on
external libraries are satisfied, including paths to 'ATI Stream SDK'
and 'Boost'.
- After dependencies are satisfied, hit 'c' again to finalize the configure
step, then hit 'g' to generate the makefile and exit ccmake.
make help
- Look at the available options for make.
make
- Build the sample client program.
./clfft.Sample -iv
- Watch for the version strings to print out; watch for
'Client Test *****PASS*****' to print out.
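The build steps above can also be sketched as a script that uses plain cmake
in place of the interactive ccmake session. This is an assumption-laden
sketch: the function name build_clfft_samples is hypothetical, and it assumes
cmake and make are on the PATH and that the external dependencies can be found
without manual edits. If they cannot, the ccmake flow described above is still
needed to set the paths.

```shell
#!/usr/bin/env bash
# Sketch only: out-of-source build of the samples without the ccmake UI.
# Takes the clFFT install directory as its single argument.
build_clfft_samples() {
    local install_dir=$1
    # The samples directory must exist under the install directory.
    if [[ ! -d $install_dir/samples ]]; then
        echo "no samples directory under '$install_dir'" >&2
        return 1
    fi
    # Sister directory that holds the makefiles and build output.
    mkdir -p "$install_dir/samplesBin" || return 1
    (
        # Run in a subshell so the caller's working directory is unchanged.
        cd "$install_dir/samplesBin" || exit 1
        cmake ../samples/ && make
    )
}
```

A call such as `build_clfft_samples "${installDir}"` returns a non-zero status
if the samples directory is missing or if either cmake or make fails.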