Back in the 80′s Creative Computing published a BASIC benchmark written by David H. Ahl and at least one article that compared a lot of the machines of the day. I’ve found the main article and source for the program on the web but can’t find the results anymore. If anyone has the magazine I’d appreciate it if you would post the results.
I know the CoCo and MC10 were evaluated and I’m pretty sure the CoCo times were about average but I’m pretty sure the author was not a fan of the CoCo. I never saw an evaluation of the CoCo running with the high speed poke or the CoCo3 (it didn’t exist yet).
Here is the article minus the benchmark results and program (VOL. 9, NO. 11 / NOVEMBER 1983):
[url]http://www.atarimagazines.com/creative/v9n11/259_Benchmark_comparison_test.php[/url]
And here is the program:
[url]http://setiathome.berkeley.edu/~korpela/ahl/basic.html[/url]
Do you have a good random number generator for the 6809?
I wrote one for use in a C library but I haven’t tested how well it works yet. It was pretty much a conversion of a simple one for the 6502. It was designed more for speed than perfect distribution of results.
Now if I can just remember where I put it. The machine it’s on crashes if I try to do much with it. I hate windows.
There is one by Mark McDougall in Mary’s Oct. 2006 newsletter. There is one in my Symmetry program which you can download from my web pages. The source is included.
[url]http://home.att.net/~robert.gault/Coco/Downloads/Downloads.htm[/url]
I think there is one on Lomonster’s site.[/url]
For a decently fast one, try one of these http://en.wikipedia.org/wiki/Linear_feedback_shift_register
However, they do not have great properties, nor are they secure. Making a cryptographically secure one on the Coco is a very hard task, so you should go for speed.
Many modern C rand() functions use a linear congruential method, which would be slow on a CoCo due to a large multiply (and possibly an expensive modulus operation).
Another one that would work well on a Coco, and provide tremendously good quality are the WELL512 type algorithms by LEcuyer ( http://www.iro.umontreal.ca/%7Epanneton/WELLRNG.html ), which are even better than the popular Mersenne twister based PRNGs.
I’ve created a few PRNGs for embedded projects where bits and memory are constrained as in the CoCo. The original HypnoCube ( http://www.hypnocube.com ) used a home grown one based on Galois relations. The newer HypnoCubes (and soon to be release HypnoSquare) use the WELL512 algorithm, which provides state of the art high quality pseudo-random numbers, and is fast and easy to implement on the PICs.
Hope this sets you in a good direction
.
Chris, do you know if any LSFR sequences can pass a spectral test, say plotting N vs N1? I just get two parallel lines which is pretty bad.
No – none of the LFSRs pass linear relations tests, since they are *linear* feedback. The output bit is a linear function of the state, hence it only takes a small amount of output to deduce the internal state.
Same thing goes for linear congruential methods.
The only benefit for LFSRs is they are fast and easy to implement and plenty good for many things such as games. They are not good when quality RNGs are needed, like weather simulation
I’ve just tried some test on an LFSR, 32 bits (to match the Coco floating point notation) with taps at 1, 2, 22, and 32. The length for this LFSR is 4,294,967,295 so there is lots of room for skipping output. If I take every seventh output (i.e. skip 6) and plot the N vs N+1 values, a 192×192 screen is almost filled.
That’s a massive improvement (I think) in the spectral test for pairs. I’ve not tried to plot triplets. However, there does not seem to be any degradation in the sums of sets of 10,000 values. They are within about +-40 for 5000-sum. If I plot the hits for 0-191 vs count, the bar graph is fairly smooth.
What would you say about that type of prng, as a working mathematician.
Again – it all depends on what you are trying to do. If you’re getting your values 0-191 by taking 8 bits (0-255 value) of output from the PRNG, and then wrapping, you’ll get twice as many occurrences of small numbers as you should expect. If you discard, you’ll get more uniform values, at the cost of losing some of the period.
If you just want to pass spectral tests for a *fixed* dimension, just pick a value n relatively prime to the size of the space, and repeatedly add this value to itself, converting to pixel coords. This will hit each point exactly one before repeats. For example, 256×256 = 65536, so taking a variable p = 0, adding, say, 29501, wrapping around at 65536, converting p to an x,y (x = p mod 256, y = floor(p/256)) and plotting, you’ll get perfect 2D coverage (and actually, perfect coverage for 256x256x256…. to any dimension). However, this is weak for other purposes.
For general use, your 32 LFSR will show bias in some experiments and situations. Otherwise it is a decent PRNG for many uses.
Some more benchmarking over on a C128 forum:
[url]http://landover.no-ip.com/128/viewtopic.php?pid=6532#p6532[/url]
Chris,
Actually all 32 bits are being used since they fit exactly into the Basic mantissa for floating point numbers. So, the output of the LFSR becomes a fraction between 0 and 1, and times 191 gives an answer between 0 and 191.
I was just curious to see what it would take to improve the output for pairs. A better way to use this LFSR to plot random points on a 192×192 grid would be to use single results not pairs. R=LFSR*191^2, Y=INT(R/191+.5), and X=INT(Z-Y*191+.5). That does not produce any spectral test problems, no lines are seen, and there is total coverage of the grid.
It would seem that non-perfect prngs can be of value if used appropriately. The trick for those of us that are not mathematicians is finding the right methods or having a good “cookbook”.
If you’re taking the entire 32-bit state then there will be a lot of bias – for example 1/2 of the numbers will be rougly twice as large as the previous one (or half as large, depending on how you shift), since the state is shifted one bit per pass. If you’re taking every 7th, then there will again be too many sequences where numbers tend to double, and fewer sequences where they halve, leaving a significant bias. The only output you should take from a LFSR is the single bit shifted off the end. A LFSR state is nowhere near random, since only one bit changes per pass, but the bits coming off the end are pretty random.
If you’re just trying the spectral test a simple counter passes perfectly
BTW, the CoCo3 with a 6309 in Fast mode should run the longer BASIC benchmark off the C128 forum in almost half the time of the C128 in Fast mode.
Summary of results so far
* all machines marked with an asterisk were benchmarked on an emulator and results may or may not be accurate
Atari 8 bit—————————-
Atari Advan Basic 456.2 sec
Accuracy .036606
Random 18.92, 11.51
Atari 8K Basic 405 sec.
Basic XL Time 395.88
Basic XE Time 49.7
Turbo Basic XL 41.6
Turbo Basic Compiler 33.1
Accuracy = .013959
Random = 16.77, 8.80 (actually varies with each run on all versions of BASIC)
Atari Microsoft Basic Time=101.4
Accuracy .150879
Random=2.06506, 2.06506 (5.60 when RANDOMIZEd)
Tandy Color Computer 3 in high speed mode: ———————
Time: 71 sec.
With the fast keyboard routine enabled: 69 sec
With 6309 running in native mode, no 6309 BASIC patches 56 seconds
accuracy 5.96284867E-04
random 7.3876276 test1, 24.5601945 test2, 9.40813446 test3
BASIC O9 under OS-9 (Nitros-09?) 6 seconds
accuracy 9.45091248E-04
Apple IIc Plus ————————
~ 31 seconds (timed by hand)
Apple IIgs 2.8MHz ——————-
Time 42.5 (timed by hand)
Accuracy 1.04141235E-03
Random 8.42399192
The Commodore Plus/4 (PAL) —————
Time: 109 sec
Accuracy 1.04141235-03
Random 11.1208959
Oric Atmos—————-
Time 144 sec
Accuracy : 1.04141235E-03
Randow : 3.47206211
Dick Smith VZ300 ——————————-
Time : 108
Accuracy : .0338745
Random : 13.5997
C64 timed by hand —————————————–
A^2 method Time =123 sec.
Accuracy=1.041 e-3
Random 10.09
A*A method Time=76 sec.
Accuracy 3.227 e-4
Random 4.19, 13.83
Sinclair Spectrum* ————————————–
Time 290 sec (278 appears to have been a math error)
Time 175 (using “let a=a*a” instead of let a=a^2 and iterative square root instead of SQR)
Accuracy 0.0006685257
Random 4.322937
Amstrad CPC 6128 Plus Locomotive BASIC 1.1 * —————————
Time 38.9666667 sec
Accuracy 3.50475E-05
Random 3.07867885
Well, after a lot of benchmarking we discovered that if you replace:
a=a^2
with
a=a*a
it will speed up the program.
On VCC it was around 30% but VCC’s times don’t match the real thing.
If someone could rerun benchmarks using that change I’ll add the new numbers to the results.
It does not make any sense to change the benchmarking program to get a faster result as it is not relevant. The whole point of a benchmark is to determine the effects of a single variable, in this case an OS and Basic.
It is a reasonable question to ask what the difference in speed is between exponentiation and multiplication. While this would be better tested as a separate benchmark, using the code for which I originally reported, the time changes from 1min12sec to 44sec by changing A=A^2 to A=A*A.
So what impact does this have on Coco programming? It just means you should only use exponentiation when multiplication won’t work or space is at a premium, ex. A=A^3.141… or A=A^500. Neither one of these examples is appropriate for multiplication.
Don’t be so quick to assume A=A^500 is faster than multiplying it cleverly. If the exponentiation algorithm is general purpose (like the CoCo) it might be much slower than computing A^500 with eight multiplies and five adds (and there might be faster ways – eight and five is just the easy way to do it). Exponentiating to a positive integral power is almost always faster to do with multiplies than with a general purpose exponentiation routine.
Later if I get inspired I’ll time it.
I think the real lesson here is there are many ways to do the same thing and doing a little timing and experimenting will help you write faster programs.
Indeed, there are many ways to program the same task.
Here is a short test program to try.
[code:1:aab2bbde16]
10 A=1.1:B=A:C=900
20 PRINT DATE$ a custom function with a Smartwatch
30 PRINT A^(C+1)
40 PRINT DATE$
50 FOR I=1 TO C
60 B=B*A
70 NEXT
80 PRINT DATE$
90 PRINT B[/code:1:aab2bbde16]
The first thing you will find as C is changed is exponentiation and multiplication drift apart at as C increases in size on a Coco. The above takes 1 sec for line 30 and 7 seconds for the FOR/NEXT loop. For values of C less than about 100, it is not possible to measure the time accurately as both methods are about 1 sec or less.
One point I was trying to make was that a long string of B=A*A*A*A... is not practical and eventually a loop will have too much overhead compared with exponentiation.
The main point was that you don't care how fast a benchmark runs when the object is to compare systems. It is the system to system difference that matters not the absolute time. Ahl's benchmark times are meaningless if A^2 is used on one system and A*A used on another.
That’s why I’m posting *both* times.
You're still doing exponentially more work than needed for the multiplication case, so it is not a very good way to test. You can compute the 901th power of something with 13 multiplies rather than 900. You are doing exponentially more work than necessary naively computing powers with a loop.
Expand 901 in binary {1, 1, 1, 0, 0, 0, 0, 1, 0, 1}, and keep squaring B, multiplying B into a temp for each 1 in the binary expansion, like so
T=B:B=B*B:B=B*B:T=T*B:B=B*B:B=B*B:B=B*B:B=B*B:B=B*B:T=T*B:B=B*B:T=T*B:B=B*B:T=T*B
and T is your answer. No loop. Saves you 887 multiplies. I'd guess this cuts the multiplication method by a factor of 50-80, making it pretty fast, and perhaps faster than the exponentiation.
Besides, the direct multiplication method is also more accurate. Using a general purpose exponentiation routine for integral powers is almost always the wrong way to do computation.
Your method would be like using 900 adds to compute 901 times a number instead of doing a simple multiply. Of course all the adds would be slower since you're doing exponentially more work than needed.
Since the CoCo has software multiplication, it might still be slow
The CoCo (as far as I can tell) does exponentiation using a^x = e^(x ln a), so it does a natural log and a natural exponentiation, both of which lose precision compared to simple multiplies.