-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathCUDALucas.ini
195 lines (164 loc) · 9.03 KB
/
CUDALucas.ini
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
# You can use this file to customize CUDALucas without having to create a long
# and complex command. I got tired of having to hit the up arrow a bunch of
# times whenever I rebooted, so I created this. You can set most of the command
# line options here. However, if you do use command line options, they will
# override their corresponding value in this file.
# This sets the name of the work file used by CUDALucas.
WorkFile=worktodo.txt
# This sets the name of the results file used by CUDALucas.
ResultsFile=results.txt
# SaveAllCheckpoints is the same as the -s option. When active, CUDALucas will
# save each checkpoint separately in the folder specified in the "SaveFolder"
# option above. This is a binary option; set to 1 to activate, 0 to de-activate.
SaveAllCheckpoints=0
# This option is the name of the folder where the separate checkpoint files are
# saved. This option is only checked if SaveAllCheckpoints is activated.
SaveFolder=savefiles
# DeviceNumber is the same as the -d option. Use this to run CUDALucas on a GPU
# other than "the first one". Only useful if you have more than one GPU.
DeviceNumber=0
# PrintDeviceInfo is the same as the -info option. This sets whether or not
# CUDALucas prints information about your GPU. This is a binary option; set to
# 1 to activate, 0 to de-activate.
PrintDeviceInfo=1
# ErrorIterations tells how often the roundoff error is checked. Larger values
# give shorter iteration times, but introduce some uncertainty as to the actual
# maximum roundoff error that occurs during the test. Default is 100.
# ReportIterations is the same as the -x option; it determines how often
# screen output is written. Default is 10000.
# CheckpointIterations is the same as the -c option; it determines how often
# checkpoints are written. Default is 100000.
# Each of these values should be of the form k * 10^n with k = 1, 2, or 5.
ErrorIterations=100
ReportIterations=10000
CheckpointIterations=100000
# Polite is related to the -polite option. If Polite=1 (enabled), and
# PoliteValue=50 then at every 50th iteration all work queued up on the gpu
# is allowed to finish before any new work is scheduled. This gives the gpu
# a (very short) rest. Set Polite=0 to turn off completely. Polite!=0 will
# incur a slight performance drop, but the screen should be more responsive.
# Trade responsiveness for performance. (Note: Polite=0 is known to cause
# CUDALucas to use some extra CPU time; Polite=50 or higher is a good compromise.)
# (decreasing PoliteValue increases the gpu wait time)
Polite=0
PoliteValue=50
# Some tests done by version 2.03 gave consistently incorrect results. I suspect
# these errors were a result of integer overflow in the carry arithmetic. The
# BigCarry option when set to 1 uses 64 bit instead of 32 bit arithmetic when
# computing the carries. This will slow the iterations down by a few percent,
# but is worth a try if a particular exponent gives an otherwise unexplained
# incorrect residue.
BigCarry=0
# ErrorReset is a percent by which the error threshold is reduced at each checkpoint.
# Small values ensure that the actual maximum error since the last checkpoint is reported.
# With larger values, the reported error might be larger than the largest actual
# error since the last checkpoint. With a value of 100, only the maximum error since
# the program was last started will be reported. The advantage of large values is
# that fewer atomic operations are done. These are very slow. The slow down becomes
# increasingly apparent for values of ErrorRest <= 80. Default is 85.
ErrorReset=85
# Interactive is the same as the -k option. When active, you can press a key to
# change the program's behavior. The active keys are:
# p -- toggles polite mode.
# s -- toggles the save all checkpoints option.
# b -- toggles 32/64 bit carry arithmetic.
# n -- resets the timer. Useful if a test is resumed with a slower/faster card,
# or slower/faster settings.
# m -- re-reads the ini file output format strings.
# z -- prints information about current settings.
# F/f -- increase/decrease fft length.
# W/w -- increase/decrease mult threads.
# E/e -- increase/decrease splice threads.
# R/r -- increase/decrease error reset percentage.
# T/t -- increase/decrease checkpoint interval.
# Y/y -- increase/decrease report interval.
# U/u -- increase/decrease error check interval.
# I/i -- increase/decrease error threshold.
# O/o -- increase/decrease polite interval.
#
# This is a binary option; set to 1 to activate, 0 to de-activate.
Interactive=1
# RoundoffTest is a binary option. If set to 1, roundoff errors are checked at the
# beginning of each test.
# Default 0
RoundoffTest=0
# Output format options allow formatting the screen output to suit your own aesthetics.
# OutputHInterval tells how frequently to write the column header which is the line specified
# by the OutputHeader option. The OutputString value is written to the screen every report
# exactly as given, with any character following a % symbol be replaced with the values
# according to this list:
# %M -- Month
# %D -- Date
# %h -- Hours
# %m -- Minutes
# %s -- Seconds
# %i -- Iteration Number
# %p -- Exponent
# %r -- Residue.
# %f -- fft in multiples of K
# %F -- fft
# %C -- Elapsed time since last checkpoint hh:mm:ss format
# %T -- ETA in d:hh:mm:ss format
# %N -- Program name
# %dk -- Percent done truncated after k decimal places, k = 0 -- 5
# %xk -- Round off error truncated after k decimal places, k = 0 -- 5
# %qk -- Iteration timing truncated after k decimal places, k = 0 -- 5
# %ck -- Elapsed time since last checkpoint in format xxx.xxxxx seconds truncated after k decimal places, k = 0 -- 5
# Example output of the format given below:
#
# | Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
# | Dec 15 12:46:01 | M57885161 18900000 0x219381f578a2abf9 | 3136K 0.28516 5.3992 539.92s | 2:11:05:11 32.65% |
# | Dec 15 12:55:01 | M57885161 19000000 0x41e63fccc43ad9a5 | 3136K 0.29297 5.3993 539.93s | 2:10:50:19 32.82% |
# | Dec 15 13:04:01 | M57885161 19100000 0x6e8b11826f5f2620 | 3136K 0.29688 5.3998 539.98s | 2:10:37:04 32.99% |
#
# To use the more traditional formatting, set OutputHInterval=0, for no header with the new format, set OutputHInterval=-1
OutputHInterval=19
OutputHeader=| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
OutputString=| %M %D %h%m%s | %P %i %r | %f %x5 %q4 %c2s | %T%d2%% |
# Threads is the same as the -threads option. This sets the number of threads
# used in the multiplication and splice kernels. Each of the two values must
# be 32, 64, 128, 256, 512, or 1024. (Some FFT lengths have a higher minimum than 32.)
# These vaules will be used only if no <gpu> threads.txt file is present or no entry
# for the current exponent is in that file. The file is generated by running
#
# ./CUDALucas -threadbench s e i m
#
# This will time i repetitions of a 50 ll iteration loop, for certain fft
# lengths between s * 1024 and e * 1024. The parameter m gives some control
# over which fft lengths are tested, which thread values are tested,
# and screen output:
# bit 0: if set, only fft values from <gpu> fft.txt will be tested,
# otherwise, all reasonable fft lengths will be tested.
# bit 1: if set, skips thread value 32.
# bit 2: if set, skips thread value 1024.
# bit 3: if set, supresses intermediate output: only the optimal
# thread values for each fft will be printed to the screen.
# E.g.
#
# ./CUDALucas -threadbench 1 8192 5 10
#
# tests all reasonable (7-smooth multiples of 1024) fft lengths from 1k to 8192k
# using thread values 64, 128, 256, 512, and 1024, supressing intermediate output.
#
# The <gpu> threads.txt file can be manually edited. Each row must be of the form:
#
# fft mult splice
#
# where fft is the fft length as a multiple of 1K, mult, and splice are the
# threads settings and must again be 32, 64, 128, 256, 512, or 1024. Note that some
# of these values might cause an error, depending on the fft length.
# Default: 256 128
Threads=256 128
# FFTLength is the same as the -f option. If this is 0, CUDALucas will
# autoselect a length for each exponent. Otherwise, you can set this with an
# override length; this length will be used for all exponents in worktodo.txt,
# which may not be optimal (or even possible).
#
# Users should be aware however that you can now specify the FFT length on a
# per-exponent basis via the work file; to use (e.g.) a 1440K length for a test,
# the line should look like "Test=<assignment key>,<exponent>,1440K". Note
# that there can be no space between the number (1440) and the K. You must have
# a K or M (e.g. ",<exponent>,3M" for a 3M length) for the program to recognize
# the field as an FFT length. This newer work file feature should render this
# ini option obsolete (which should thus be kept at 0).
FFTLength=0