Pseudo-Percentiles, Calculated for All Stat Arrays



Reltzik
2007-03-22, 05:31 PM
I have just calculated the pseudo-percentile of every possible DnD stat array. Not the percentiles for highest stat, next-highest, next-highest, et cetera... but for every possible array.

*watches every jaw drop out of sheer and undiluted apathy*

Nonono, this is useful, it lets you say to your players "We're playing a 65th percentile game", and all your players are then free to choose any stat array that's 65th percentile or lower from this list. This avoids some of the problems of standard point buy (everyone having an 8 in one thing and an 18 in another, the WotC mandated adjustment downward "to counteract the lack of uncertainty", et cetera). I'm sure it'll have some problems -- I've already thought of a few -- but it's nice to have alternatives.

The output was about fifty-seven THOUSAND lines long (that's how many possible stat arrays there are), so I'm not going to post it all here for fear of the mods banning me for flooding. Instead, I hope to recruit someone to make a nice stat-generating GUI out of this.

The next post is a bunch of dry math stuff about methodology (very interesting math to me, very boring to most everyone sane), and the one after that is dry stuff about me trying to get someone else to program a GUI for stat generation (ditto, except it's programming stuff). If there aren't at least three posts when you first read this thread, wait a minute and hit refresh.

After that, we'll play around a bit. Give me the stat arrays from some of your favorite characters and I'll tell you what pseudo-percentile they fall in. (Note: No more than three per poster per each of my responses, no more than ten posters per day guaranteed.) We'll get a feel for this system, see how we like it, and how it's broken. Hopefully that'll last us until someone makes a GUI and we can REALLY experiment.

... or everyone can ignore this thread just like they ignore most every other undertaking of mine. I'm not picky.

Reltzik
2007-03-22, 05:31 PM
Math Stuff! (Math-phobes, you may want to skip this post.)

Calculating the "average" stat array from the standard 4d6-drop-lowest method is fairly easy, and unless I'm mistaken has been done several times.

But there's another, less-often-used measure of central tendency, called the median. Despite some disadvantages compared to averages, medians have their advantages too... specifically, they tend to be more representative of the data set. The "average" roll in DnD would strike most people as a bit low, because a few REALLY low rolls counterbalance a lot of higher rolls. Medians reflect the large number of higher rolls and ignore how extreme the lower rolls are. Their extensions (percentiles) are also far less of a headache to deal with than those of averages (standard deviations). I remember that someone on this board calculated the "median" stat array, maybe about six months ago or so.

One of the problems associated with both, though, is that you need some way to put stat arrays into a proper order. That is to say, you need a way of saying that having all 15s is better or worse (okay, fine, worse) than having three 14s and three 16s. If there isn't some way of ordering them, you don't really have a median; and if there isn't some way of adding and dividing, you don't really have an average. (That's why I put them in quotes.) The two methods mentioned above did this by calculating the average/median highest stat, the average/median second-highest stat, and so on, and then combining them into a stat array. While somewhat enlightening, that approach loses a lot of data.

All this is why the math geeks reading this thread (the ones screaming "I KNOW THIS ALREADY!" through the last three paragraphs) probably smelled a rat when I started talking about percentiles for stat arrays. That's why I called it a pseudo-percentile. It's the same rat I smelled around the time I (a vegetarian) learned that the "natural flavoring" in McDonald's fries was, in fact, beef.

In other words, I invented a measure which I am claiming is SORT OF like a percentile and called it a pseudo-percentile, and everyone should be looking at me askew and asking, "Okay, but what is it REALLY?"

So where's the beef? What do I mean when I say pseudo-percentile?

First, I treat the set of all stat arrays as a partially ordered system. For the laymen out there, that means that you have four, not three, possibilities when you compare items. You can have some that are clearly superior (such as all 15s to all 14s), some that are inferior (the same arrays compared in the opposite direction), some that are equal (3,3,3,18,18,18 versus 3,18,3,18,3,18 -- effectively the same thing since you assign at will), and finally some that just don't lend themselves to clear-cut comparisons (all 15s versus three 14s and three 16s). Most of us should be familiar with greater-than, less-than, and equal; the last category is called "non-comparable" and it's what makes partially ordered systems partially ordered.

To be explicit, I arrange the stats in ascending order. (Or descending; both work.) Then array A = array B if and only if A[n] = B[n] for n from one to six. A is less than or equal to B if and only if A[n] is less than or equal to B[n] for all n from one to six; likewise, A is greater than or equal to B exactly when B is less than or equal to A. They are non-comparable if A[n] is less than B[n] for some value of n from 1 to 6, and greater than it for another. Finally, less-than, greater-than, equal, and non-comparable are mutually exclusive.
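For anyone who'd rather see that comparator as code, here's a minimal sketch (my own throwaway names, not what's actually in my program):

    // The four possible outcomes of comparing two stat arrays, both sorted
    // ascending. The names are mine.
    enum cmp_result { CMP_LESS, CMP_GREATER, CMP_EQUAL, CMP_INCOMPARABLE };

    enum cmp_result compare_arrays(const int a[6], const int b[6]) {
        int a_wins = 0, b_wins = 0;            // strict component-wise wins
        for (int n = 0; n < 6; ++n) {
            if (a[n] < b[n]) b_wins++;
            else if (a[n] > b[n]) a_wins++;
        }
        if (a_wins && b_wins) return CMP_INCOMPARABLE;   // each beats the other somewhere
        if (a_wins)           return CMP_GREATER;
        if (b_wins)           return CMP_LESS;
        return CMP_EQUAL;
    }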

Okay, so that's our comparator. Now, each stat array has a weighting: The odds of rolling THAT PARTICULAR stat array. For example, the odds of rolling all threes is 1 out of 4,738,381,338,321,616,896. (That's worse than 1 in 4.7 million trillion. If you've ever rolled a character with all threes, you've got bad dice by any sane scientific standard of certainty. Dispose of them in an appropriately gruesome manner.)

To calculate the pseudo-percentile of whatever stat array you are looking at, calculate the sum weight of EVERY stat array greater than it (call this value q), of EVERY stat array less than it (call this value p), and the stat array's own weight (call this value x). Ignore the non-comparable stat arrays. The pseudo-percentile is p / (p + x + q). I also calculated an alternative pseudo-percentile of (p + x) / (p + x + q), and the average is also attractive in an ugly sort of way: (2p + x) / 2(p + x + q). My output file contains the first two.
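And here's a brute-force sketch of the pseudo-percentile itself, using the comparator above. This is the slow, obvious way -- one full pass per query -- and not the method my program actually uses:

    // Brute force: classify every array against the target, sum the weights,
    // and divide. O(n) per query, so O(n^2) for the whole file.
    double pseudo_percentile(const int target[6], const int arrays[][6],
                             const double weights[], int n_arrays) {
        double p = 0.0, q = 0.0, x = 0.0;
        for (int k = 0; k < n_arrays; ++k) {
            switch (compare_arrays(arrays[k], target)) {
                case CMP_LESS:    p += weights[k]; break;
                case CMP_GREATER: q += weights[k]; break;
                case CMP_EQUAL:   x += weights[k]; break;
                default:          break;   // non-comparable arrays are ignored
            }
        }
        return p / (p + x + q);            // the first of the two measures
    }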

(My apologies for the lack of credit if anyone else has used this method before me. I'm not that good at research and didn't come across this method while trying to find one that'd work.)

Anyone interested in the raw data can PM me with an email address to send it to. It's 2.3 megs of text file. Feel free to redistribute it at will, but also to pore over it and try to extract any good approximating formulae out of it.

(By the way, my hand is REALLY cramped after doing the arithmetic for all those stat arrays.)

Reltzik
2007-03-22, 05:32 PM
Computer Stuff! (Technophobes may wish to skip this post and, for that matter, leave off the internet entirely.)

I'm hoping that someone, or perhaps several someones, will turn this into a stat-generation GUI similar to the point-buy one at Invisible Castle. The output file is about 2.3 megs of text and organized by stats; too much to post here, but very nice for building a program around. At the very least, you should be able to put in six stats and get a percentile out. (Actually, my word processor's "find" op does that nicely.) Better yet, something where you can easily increment or decrement individual stats one at a time and see a color-coded warning if a certain operation takes you over your limit... well, be creative. I'm asking for volunteers, I can't be picky.

The format of the output file is one stat array per line. The format is: <statArray> <space> <p/(p + x + q) percentile, in scientific notation> <space> <(p + x)/(p + x + q) percentile, in scientific notation> <new line>. <statArrays> are six numbers, each separated by a lone space, between 3 and 18, arranged in ascending order. (So 18 3 3 3 3 3 isn't there, but 3 3 3 3 3 18 is.) PM me an email to send the file to if you're interested.
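If you'd rather pull an array out of the file with a program than with a word processor, a scan loop along these lines should do it (just a sketch; the format string mirrors the layout described above, and the file path is whatever you saved it as):

    #include <stdio.h>

    // Scan the output file for one array (given sorted ascending) and print its
    // two pseudo-percentiles. Returns 1 if the array was found.
    int find_array(const char *path, const int want[6]) {
        FILE *f = fopen(path, "r");
        if (!f) return 0;
        int s[6];
        double p1, p2;
        int found = 0;
        while (fscanf(f, "%d %d %d %d %d %d %le %le",
                      &s[0], &s[1], &s[2], &s[3], &s[4], &s[5], &p1, &p2) == 8) {
            int match = 1;
            for (int n = 0; n < 6; ++n)
                if (s[n] != want[n]) match = 0;
            if (match) { printf("%e %e\n", p1, p2); found = 1; break; }
        }
        fclose(f);
        return found;
    }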

Also, I'd like to have someone independent of me write their own program to double-check my numbers (description of problem in the math post above). I THINK I got them right, but I may well have messed up. Goodness knows there were a dozen bugs when I first compiled the program. (Then I added a curly brace I'd left out, and it went up to a few hundred.) Kudos if you can figure out how I did it in O(n) time (n being number of possible stat arrays). Extra kudos if you figure out a DIFFERENT O(n) algorithm.


I can also send you the source file of my own program if you want, but be warned of the following:

1) It's in C. Not C++, not Objective-C. C.

2) The very first line of main() is the declaration:
STAT_NODE****** raggedArray;
If you don't like pointers, you won't like my programming.

3) I did not mean for my code to be reusable, or understandable. Much of it is slapdash. There are exactly three comments in the entire thing. One of them is simply me gloating over the unholy existence of a sixth level pointer.

So if you're going to ask for my code, consider yourself warned. Oh, and no kudos to anyone who figures out my O(n) algorithm AFTER seeing my code.

Krellen
2007-03-22, 05:59 PM
A six level pointer?

You, sir, are going to programmer hell.

PMDM
2007-03-22, 06:02 PM
Why don't you post the % results here?

Deus Mortus
2007-03-22, 06:06 PM
Send me the text file to DeusMortus[at]Gmail.com and I'll slap a nice gui around it.

(I won't have my PC back until the end of next week, so if someone can do it now, be my guest)

Reltzik
2007-03-22, 06:13 PM
Why don't you post the % results here?

Because there's (if I remember right) something like 57 THOUSAND lines of data. (I'm lazy and don't want to count again.) My verbose manner notwithstanding, I don't wish to flood the boards.

Deus Mortus
2007-03-22, 06:27 PM
At least post the % for things like all 3's, all 4's, all 5's, etc, give us something to work with ;)

Reltzik
2007-03-22, 06:48 PM
At least post the % for things like all 3's, all 4's, all 5's, etc, give us something to work with ;)

Okie-dokie. First six numbers are the stats. After that, first is the p measure, and the second is the p + x measure.

18 18 18 18 18 18 9.904326e-001 1.000000e+000
17 17 17 17 17 17 9.312169e-001 9.422059e-001
16 16 16 16 16 16 8.528005e-001 8.716578e-001
15 15 15 15 15 15 7.367228e-001 7.569749e-001
14 14 14 14 14 14 6.055284e-001 6.294270e-001
13 13 13 13 13 13 4.689773e-001 4.908002e-001
12 12 12 12 12 12 3.417128e-001 3.625905e-001
11 11 11 11 11 11 2.342769e-001 2.511505e-001
10 10 10 10 10 10 1.491553e-001 1.627524e-001
9 9 9 9 9 9 8.743759e-002 9.728782e-002
8 8 8 8 8 8 4.700571e-002 5.351215e-002
7 7 7 7 7 7 2.281668e-002 2.675059e-002
6 6 6 6 6 6 9.361591e-003 1.157197e-002
5 5 5 5 5 5 3.102378e-003 4.136505e-003
4 4 4 4 4 4 7.730963e-004 1.159644e-003
3 3 3 3 3 3 0.000000e+000 1.286836e-004

Also, the Elite Array:

8 10 12 13 14 15 2.049263e-001 2.085460e-001

(That's about 20th percentile. Pathetic! EDIT: On the other hand, this system ranks it inferior to all 11s, when we know that just isn't true.)

EDIT: (I'm getting the urge to recalculate it all, taking into account the "reroll for bad stats" rule.)

Jacob Orlove
2007-03-22, 06:49 PM
Those are super rare, though. It'd be much more interesting to see what a couple of "typical" stat arrays look like.

Say,
11, 11, 11, 10, 10, 10;
13, 12, 11, 10, 9, 8; (default NPC array, 15 point buy, 'average' 3d6 roll)
15, 14, 13, 12, 10, 8; (elite array, 25 point buy, 'average' 4d6 roll)

edit: whoops, simu-posted with the above. This was supposed to be a reply to Deus Mortus.

Reltzik
2007-03-22, 07:03 PM
10 10 10 11 11 11 1.712570e-001 1.874345e-001
8 9 10 11 12 13 1.154418e-001 1.215573e-001


17-18th and 11-12th percentile, respectively. Elite Array was covered at the end of my last post: 20th percentile.

CharPixie
2007-03-22, 07:36 PM
p is basically all rolls that are LESS THAN, q GREATER THAN, and x undetermined, right? If so, you might have a problem.

If you look at:

http://d20.jonnydigital.com/2006/10/what-are-the-odds

There's a pretty good rundown on the odds of 4d6. You can get the percentiles from it with some easy addition. Now, for your stat arrays of X X X X X X, the only way for an array to be less than it is for ALL of the stats in that array to be less than it. Put another way,

p(less than) = p(each less than)^6 (which is much less)
however
q(greater than) = q(each greater than)^6 (which is again much less)

therefore, any fuzzy probability should enclose the percentile of X on 4d6.

Here's the percentiles:

Score:       3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18
Or lower:  0.1%  0.4%  1.2%  2.8%  5.7% 10.5% 17.5% 26.9% 38.4% 51.2% 64.5% 76.9% 87.0% 94.2% 98.4% 100.0%
You see the closest roll to the median is 12. So, 12 12 12 12 12 12 should include the percentile 51 whatever. Instead it's 34-36%.

Even if your method excludes all fuzzy percentiles (which I think it does, if I understand it correctly), p(x)^6 should still be a median value if you are normalizing it by p(x)^6 + q(x)^6 (they should be roughly equal).

On the whole, I think you are on the right track; it sounds like your six-headed pointer is some implementation of Dynamic Programming, and if you set up a DP grid going both ways, you can likely calculate the solution soundly in O(n) time (it's the charm of the demon, right). And your results look sane; they just don't seem RIGHT, you know?

Reltzik
2007-03-22, 08:09 PM
If I understand what you're saying, CP, I was using a somewhat different method. The percentiles of the individual stats were never calculated, just those of the entire stat ARRAY.

x refers to the weight (odds) of the stat array in question, not the non-comparables. Those get ignored.


The fact that 12 is a median on a SINGLE stat suggests, with some thought, that the median on 6 identical stats will be higher. Look at the frequency table that you linked. Now take each of those entries and raise it to the 6th power. Those are the odds of getting, say, all 15s.

(EDIT: Reworded next paragraph for clarity purposes)

The values below 12 are less-frequent numbers; their strength is that there are more of them. The values above 12 are more-frequent numbers. When you raise their frequency to an exponent, you magnify that advantage; meanwhile, the frequencies of the lower values don't increase by nearly as much, and THEIR advantage of, well, numbers remains unchanged.

Erom
2007-03-23, 09:27 AM
Holy crap, awesome.

If you send me the file (downward_spiral(at)mit(dot)edu) I'll write a gui into an executable .jar... you know, when I get around to it. It won't be too fast, but it should run on anybody's comp that has a recent java install.

Yakk
2007-03-23, 10:36 AM
You should print out your chances in percentage-with-3-decimal places form. Sure, you'll get some 0.000% entries, but they will be far more readable.

Cute metric. As you noted, it has issues, with the elite array being considered less than an all-11 build.

You might want to remove your 6-level pointer. You can deal with that kind of thing pretty easily using a 1-level pointer and a simple function that does the array lookups for you.

Just write:


struct index {
    int dim[6];
};

double* get_entry( double* table, index i );


Now your index struct contains 6 index values, and the arrangement of the table is coded (manually) into get_entry.

Far more readable, and all of your pointer math is put into one function (where you can write checks-for-bugs and boundary tests).
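A possible body for that get_entry, assuming the table is one flat block of 16^6 doubles with each score offset by 3 (the layout here is picked arbitrarily, just to show the idea):

    // One possible body: treat the six scores as digits of a base-16 number.
    double* get_entry( double* table, index i ) {
        int offset = 0;
        for (int k = 0; k < 6; ++k)
            offset = offset * 16 + (i.dim[k] - 3);   // scores 3..18 -> 0..15
        return table + offset;
    }

Swap in a ragged layout later and only this one function changes, which is the whole point of funneling the pointer math through it.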

Reltzik
2007-03-23, 02:33 PM
That's sort of what I did, except I didn't write a function for access. The ragged array was "keyed" by an array of 6 ints. Once I built the array, I was accessing it with that key array's members all the time. (Actually, I'd access it once, then assign the location to a first-level pointer called cursor.)

But since I didn't know what the ragged array was going to look like (and I wouldn't have hard-coded it without 6 nested for loops even if I HAD), I had to allocate it dynamically. And in C, a dynamically-allocated 6-dimensional array starts off life as a 6th level pointer. I suppose I could have wrapped it in a pretty-looking typedef, but C's not strongly typed and I prefer the reminder of what's ACTUALLY going on.

As for a lookup function, I considered it, but I'd have only USED it about six places in my code. And I could cut-and-paste those lines.

Oh, and Erom, you should have the data now. And Deus Mortus, you should have gotten it yesterday. However, hotmail's acting a bit quirky when I send this, so tell me if you didn't get it.

Erom
2007-03-23, 02:51 PM
Got it and working on it.

EDIT: OK, here's what the first version looks like. I'll try and get a download link up soon. It was compiled under Java 5.0, so as long as you have that it should run. Ignore the code in the background - pay no attention to the man behind the curtain :smallsmile:.

http://img267.imageshack.us/img267/9791/spifrh4.th.jpg (http://img267.imageshack.us/my.php?image=spifrh4.jpg)

CharPixie
2007-03-23, 08:13 PM
The values below 12 are less-frequent numbers; their strength is that there are more of them. The values above 12 are more-frequent numbers. When you raise their frequency to an exponent, you magnify that advantage; meanwhile, the frequencies of the lower values don't increase by nearly as much, and THEIR advantage of, well, numbers remains unchanged.

My point was that if you excluded everything except p(x) and q(x), and took both of them to the sixth power, then the ratio p(x)/q(x) would move closer to 0 or 1, since what you are examining is p(x)^6 / (p(x)^6 + q(x)^6).

Anyway, I got bored and wrote a version of it: http://www.sfu.ca/~dbruvall/. It's in the percentiles files. .NET Framework is required, since it's in C#.

By the way, I was apparently high when I was rambling on about dynamic programming. That's over engineering for you.

I still have some concerns; by eliminating the fuzzy probabilities, you are reducing the data set by around 90%. That 90% could be very significant.

Erom
2007-03-23, 08:36 PM
Alright! It works, at least on my machine. Use at your own risk, of course, although I didn't do anything deliberately bad in the code. I've hosted it at rapidshare. I know, annoying, but free.

http://rapidshare.com/files/22485729/Roller.jar.html

Note also, that so long as you keep the .jar file's file structure intact, you should be able to swap out the text file for a new/custom version without breaking the code.

Reltzik
2007-03-24, 11:38 AM
SWEET. That's beautiful work, Erom.

Okay I've got it running, and I decided to test it out on a barbarian at equal to or below 50th percentile.

I got 16, 14, 15, 10, 14, 10 as being 47.7 percentile. I actually had wisdom at 12 and wanted to increase con to 16, but I couldn't; that put me over. But bumping the wisdom up 2 points, that was workable. Hmm. Something doesn't feel quite right about that.

But wait! Most of us horribly abuse this concept called DUMP STATS. Let me try this here with oh, I dunno, Charisma.

16 16 16 16 16 9 is 49.6th percentile.
17 16 17 15 16 8 is 49.9th percentile.
18 16 16 14 16 7 is 50th percentile.
18 17 18 17 17 6 is 49th percentile.
18 18 18 18 18 5 is 45.6th percentile. o.0

Each time I dropped Charisma a point, the percentile fell from somewhere around 50 to somewhere around 35, and I could then pump the other stats back up to 50. A single low stat frees up a HUGE amount of percentilage for elsewhere. It's like the opposite of point buy, where changes in low stats are nearly meaningless but increasing high stats is increasingly expensive.

So, I liked the idea, but it looks like this isn't a good system to use.

Thanks for the time and effort of everyone here (Especially Erom) in critiquing it.

EDIT: This data might STILL be of interest to probability junkies, so I'll still email it to anyone who PMs me. Oh, and I just got an idea for a different system but, in honor of the recently departed, I'll wait a day before putting it up, in good taste.

EDIT to EDIT: Oh, and no offense intended in not mentioning yours, CP. I don't have .NET and so I couldn't look at it. Still, I'm sure it's good work.

Reltzik
2007-03-24, 01:15 PM
DAMMIT!

Okay, I was thinking, which is never advisable, and I was trying to figure out how the hell 18 18 18 18 18 5 was below 50th percentile. And I was thinking back to the code and... yup, I realized I had a bug. I'd implemented my algorithm wrong.

So this data? It's all bad.

Here's what went wrong:

My algorithm constructed the ragged array holding the stat arrays upwards -- first all 3s, then five 3s and a 4, and so on. I guaranteed that by the time I got to any given stat array, everything below it was already constructed. At the same time, I used basic probability theory to calculate the weighting of that particular stat array. So far so good. And I also gathered up the data I needed to calculate the weight BELOW the stat array.

This was the tricky part of the algorithm, and my solution to it is what I was (and still am) proud of. Actually adding up everything below with a tree search is, itself, an O(n) operation. Repeated n times, once for every stat array, that becomes O(n-squared). On such a huge data set, that was just unacceptable to me. (In retrospect, the machine probably could have handled it.) And I can't just add up the weights-below of the arrays immediately beneath me; that creates an overlap. So what I came up with instead was this:


Imagine that you've got one of those thousand-cubes from kindergarten/elementary school.

(If you didn't have these things in your school, they're devices to teach counting. Take a bunch of block cubes just small enough to be a choking hazard. You count these out one at a time. When you get to ten, you replace them with a group glued together in a line of ten. At 15 you've got a line of 10 and 5 singletons. When you get to twenty, you replace your line and 9 singletons with two lines. When you get to a hundred, you've got a square of ten lines glued together. And, when you get to a thousand, you've got a cube of ten hundred-squares stacked atop each other. That's a thousand cube.)

Now imagine that we select one corner of this thousand-cube; say, the near, upper-right one, and declare it to be the highest. Any move to the left, to the rear, or down is considered a decrement; any move right, to the front, or up is an increment. How can we add up the number of cubes below a given one, excluding it? One way is to just take the volume of that part-cube by multiplying how far you are into it on each dimension, and subtract 1. So if you're on the third rank forward, the fifth file to the right, and the seventh level up, then there are 3 * 5 * 7 - 1 = 104 below you. But if all the cubes are weighted funny, this becomes a bit tricky.

So here's another method. You take that part-cube that you're interested in, and then you look at that cube EXCLUDING the level that you're currently on. Take THAT thing's weight. Then, you take the weight of the level you're on, EXCLUDING the row you're on. Add that to the stuff below your level. And then, you take the weight of the row you're in, excluding yourself, and add that to your sum. There, you're done.

Do we HAVE those values handy to add up? Yes, if we go at it in the right order. The weight of all those values below your present level? That's the "weight below" value of the guy one step down from you, plus that guy's weight. How 'bout the current level, excluding your row? The guy one step back from you had to calculate that in order to figure out HIS weight, and you can save that data for later access by the guy in front of you. Similarly, the guy to your left will save the data for your present row.
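In code, the two-dimensional version of that bookkeeping looks something like this (a sketch of the idea only -- the real thing is six-dimensional and has the extra preserve-the-highest-k wrinkle described next):

    // weight[i][j] is the weight of cell (i,j); below[i][j] ends up holding the
    // total weight of every cell that is <= on BOTH coordinates, excluding the
    // cell itself. (Scores 3..18 map to indices 0..15.)
    void weight_below_2d(const double weight[16][16], double below[16][16]) {
        for (int j = 0; j < 16; ++j) {            // j = the "level"
            double row_part = 0.0;                // weight of this level, strictly before you
            for (int i = 0; i < 16; ++i) {
                // everything on lower levels that doesn't beat you on the other
                // coordinate: the weight-below of the cell one level down, plus
                // that cell's own weight.
                double level_part = (j > 0) ? below[i][j-1] + weight[i][j-1] : 0.0;
                below[i][j] = level_part + row_part;
                row_part += weight[i][j];         // now this cell is "before" the next one
            }
        }
    }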

Now expand this concept into six dimensions. For every stat array, you've got thirteen values you're interested in. You've got your own weight. Then you've got the weight of everything below you, PRESERVING THE FIVE HIGHEST STATS. That is, you only calculate the stuff below you by dropping your lowest stat; those 5 18s must remain 18s. Then you have a value for everything below you preserving the 4 highest, three highest, et cetera, right down to preserving nothing at all; that's the total weight below you. A similar set of six values exists for the weights above.

All right, now there are a couple of ways to trip up here. First, the initial value for the recursion: If you want to know what's below you preserving the n highest values, look at your n + 1th highest stat. If that's 3, then there's nothing below you. Set that weight below that stat to 0. In other words, there's no way to make 14 12 10 3 3 3 worse while still preserving the 3 highest stats. That part I was okay on.

But suppose you have 16 15 15 12 10 9. That repetition there is a problem, because I was organizing these things in descending order. I couldn't look at what happened if you decrement the second stat, because then you'd have 16 14 15 12 10 9, which is not part of my array. However, since it was equivalent to 16 15 14 12 10 9, and I'd just calculated the weight below-and-including for THAT, it was the same value and I could just copy it over, right?

WRONG. Because what I'd just calculated was the weight below-and-including that stat array PRESERVING THE TWO HIGHEST VALUES. What I needed now was the below-and-including weight of that stat array PRESERVING ONLY THE HIGHEST VALUE. What I needed to do was go back to 16 15 14 12 10 9 and grab its preserve-highest-1 value, and instead I was using its preserve-highest-2 value that I'd grabbed earlier.

Oops.

So that's why 18 18 18 18 18 5 is recorded as 45.6th percentile. I grab the weight below-and-including 18 18 18 18 18 4, preserving the five highest. (It's not much.) Then I grab the weight below-and-including 18 18 18 18 17 5, preserving the four highest. So far so good. But then, I incorrectly evaluate the next value as the same thing, when it should have been preserving the three highest (a much, much heavier weight). This problem multiplies through the repetitions until I'm left with this absurd outcome. This is why 18 18 18 18 18 5 ranks as a mere 45.6th percentile, and 18 18 18 18 18 3 is evaluated as 6.51st percentile. EDIT: And 16 16 16 16 16 13 is evaluated as 0.54th percentile!

Like I said, oops.

Anyhow, when break's over (in a bit over a week) I'll go back into school, fix the bug, and run it again. Hopefully we'll have something a bit saner this time around.

Erom
2007-03-24, 01:40 PM
Oh man, I remember the thousand-cubes! Wow... anyway, have fun over break, if you manage to fix the problem, send me a copy of the updated data file, and I'll upload an updated gui.

CharPixie
2007-03-24, 06:59 PM
Yup, that sounds a lot like dynamic programming. You should look it up; it's a good general solution to any problem where you calculate the same value over and over.

However, there's a way of calculating the percentile on demand. Let's say p(x) is the chance that on 4d6 you roll LESS THAN OR EQUAL to x and q(x) is the chance you roll equal or more. Then, the percentile for any array is given by six p(x) terms and six q(x) terms and the probability of that particular array, corrected for the number of permutations that the array can have. Why? Because if you were to mix p(x) and q(x) terms, it would be part of the data set that you've thrown out. So, if you precalc p and q for 3 to 18, you can generate any particular percentile by the following:

(p(STR) * p(DEX) * ... p(CHA)) /
((p(STR) * ... p(CHA)) + (q(STR) * ... q(CHA)))

In my code I was multiplying both P* and Q* by the number of permutations, but looking at the math I realize it cancels straight up.

EDIT: By the way, if you don't order your arrays while calculating and then deal with the amalgamation later, I think you'd get a simpler algorithm, if one that used a BIT more memory.

nows7
2007-03-25, 09:18 AM
WOW! I have no understanding of what you have done, beyond that it is quite impressive - not string-theory earth-shattering, but still quite impressive.

If you are ever up for another challenge, you should try fitting poker hand possibilities onto a table for dice rolling.

JoeFredBob
2007-03-25, 05:16 PM
It may not be string theory earth shattering, but it's way more than string theory relevant. =P

Anyway, I'd be interested in possibly looking at different metrics for deciding comparisons. For example, I think we can say that 10 10 10 10 11 11 is worse than 10 10 10 10 10 12. Why? Well, your total modifier is higher. It might be worth considering total modifier, rather than just individual numbers. Additionally, 10 10 10 10 12 12 should be worse than 10 10 10 10 10 14, if we want to line up roughly with how point buy works.

I haven't put a huge amount of thought into this yet, but it's probably worth thinking about. All the algorithms in the world aren't worth anything if your metric isn't good.

Reltzik
2007-03-27, 04:16 PM
Char Pixie, I had to mull that one over for a few days to remember why it wouldn't work. Then I had to drive 600 miles in a single day, for what we may assume were unrelated reasons.

The problem with your system is that it includes ordering when ordering doesn't matter. In other words, if they're all 15s, you're including odds for 14 15 15 15 15 15 as below, when no such stat array exists (assuming that we're "naming" these things by descending order). If you instead try to say that this is a valid array, you're saying that ordering does matter... in which case, you can't add up the weights for 14 15 15 15 15 15 and 15 15 15 15 15 14 and say that it's the sum of that equivalence class, because you have overlap at all 14s (and one or two other places, as well).

Yakk
2007-03-29, 10:00 AM
Two stat case:
V(A,B) = V(A-1, B) + V(A, B-1) - V(A-1, B-1) + X(A,B)

Ie, the number of rolls under (A,B) is the number of rolls under (A-1, B) plus the number of rolls under (A, B-1), minus their overlap (the rolls under (A-1, B-1)), plus the number of rolls that hit (A,B) exactly.

Three stat case:
V(A,B,C) = V(A-1, B, C) + V(A, B-1, C) - V(A-1, B-1, C) + V(A, B, C-1) - V(A-1, B, C-1) - V(A, B-1, C-1) + V(A-1, B-1, C-1) + X(A,B,C)

... I think.

Rolls under 2,2,2 are rolls under 1,2,2 and 2,1,2 and 2,2,1.

The rolls under 1,1,2 and 1,2,1 and 2,1,1 are counted twice, and the rolls under 1,1,1 are counted three times.

Subtracting out the rolls under the (1,1,2) etc, we remove the rolls under (1,1,1) three times -- so we have to add it back in.

This gets more complicated, but there does appear to be a pattern.

Odds are one could make such a recursive definition for 6 stats.

Then a dynamic programming solution falls out: so long as the recursive definition doesn't get too crazy, our problem is solvable in O(number of different rolls) space and O(number of different rolls * complexity of the recursive definition) time.
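A quick sketch of that two-stat recurrence with memoization (illustrative only -- X here is just a uniform stand-in weight; swap in the real 4d6 numbers):

    // Stand-in weight for "exactly (a,b)": uniform over the 16*16 grid.
    double X(int a, int b) {
        return (a >= 3 && a <= 18 && b >= 3 && b <= 18) ? 1.0 / 256.0 : 0.0;
    }

    // Memoized V: total weight of rolls that are <= (a,b) on both coordinates.
    double V(int a, int b) {
        static double cache[19][19];      // zero-initialized
        static int    filled[19][19];
        if (a < 3 || b < 3) return 0.0;   // fell off the bottom of the table
        if (!filled[a][b]) {
            cache[a][b]  = V(a-1, b) + V(a, b-1) - V(a-1, b-1) + X(a, b);
            filled[a][b] = 1;
        }
        return cache[a][b];
    }

With the uniform stand-in, V(18,18) comes out to exactly 1, which makes for a quick sanity check.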

CharPixie
2007-03-29, 02:44 PM
It doesn't matter if it's ordered or unordered. Those terms actually cancelled when computing P/P+Q. I was including them up until the point where I realized (A) I had made a mistake in the program that counted the number of orderings and (B) once I fixed it, it yielded the same results.

Ah. And Yakk demonstrates why I should handwave less while doing problem formulation. I was giving it some thought, and I think you could use:

V(A..F) = V(A-1,B..F) + X(A..F) where a > 3

and if you switch to B-1 once A-1 is depleted, you could reduce the complexity of the recursive definition to something that's always two terms (and six if statements).

Seatbelt
2007-03-29, 03:01 PM
How would this be practical and useful for players?

KIDS
2007-03-29, 03:15 PM
Though my knowledge of statistical analysis is limited and I don't know much about programming either, I do see the potential in your work. It was an interesting read nevertheless; keep it up!

Fax Celestis
2007-03-29, 03:20 PM
How would this be practical and useful for players?

During game setup. The DM can say, "We'll play a 60th percentile game, ECL 7," which would let the players choose any stat array of 60th percentile or lower for their character.

It'd be about as useful as point-buy.

Yakk
2007-03-29, 04:42 PM
It doesn't matter if it's ordered or unordered. Those terms actually cancelled when computing P/P+Q. I was including them up until the point where I realized (A) I had made a mistake in the program that counted the number of orderings and (B) once I fixed it, it yielded the same results.

Ah. And Yakk demonstrates why I should handwave less while doing problem formulation. I was giving it some thought, and I think you could use:

V(A..F) = V(A-1,B..F) + X(A..F) where a > 3

and if you switch to B-1 once A-1 is depleted, you could reduce the complexity of the recursive definition to something that's always two terms (and six if statements).

No. X(A, B-1, C...F) isn't counted in your algorithm, and neither is X(A, B-7, C...F) counted, etc.

I suspect the answer is:
V(A..F) = sum of V(A..F) with 1 entry lower, minus sum of V(A..F) with 2 entries lower, plus sum with 3 entries lower, minus sum with 4 entries lower, plus sum with 5 entries lower, minus sum with 6 entries lower, plus X(A..F)

Total number of lookups:
6c0 + 6c1 + 6c2 + 6c3 + 6c4 + 6c5 + 6c6
1+6+15+20+15+6+1 = 64 = 2^6

So 64 lookups per roll (naturally you sort the lookup). On each lookup, if the value has been calculated, you short-circuit. If it hasn't, you calculate it, store it, and return it.

To populate the entire table, do a lookup on all 18s.

The function X has to be written, and the code that loops over the 63 sub-V-calls.

Plus, we have to check if my theory about the pattern of sub-Vs holds in 6 arguments. :)

Reltzik
2007-04-02, 06:57 PM
Okay, I'm back from break and have fixed the bug, hopefully without creating new ones, and re-generated the data. Let's see what we've got with the previously given examples:


6 identical values:

18 18 18 18 18 18 1.000000e+000 1.000000e+000
17 17 17 17 17 17 9.999964e-001 9.999964e-001
16 16 16 16 16 16 9.997559e-001 9.997585e-001
15 15 15 15 15 15 9.941618e-001 9.941698e-001
14 14 14 14 14 14 9.054083e-001 9.054529e-001
13 13 13 13 13 13 4.038428e-001 4.038865e-001
12 12 12 12 12 12 5.003686e-002 5.005551e-002
11 11 11 11 11 11 3.763039e-003 3.765881e-003
10 10 10 10 10 10 2.502309e-004 2.507208e-004
9 9 9 9 9 9 1.245358e-005 1.250741e-005
8 8 8 8 8 8 4.390555e-007 4.429188e-007
7 7 7 7 7 7 1.002030e-008 1.019653e-008
6 6 6 6 6 6 1.273523e-010 1.327388e-010
5 5 5 5 5 5 5.856073e-013 6.419665e-013
4 4 4 4 4 4 7.138788e-016 8.684430e-016
3 3 3 3 3 3 0.000000e+000 2.110425e-019

This looks a bit more like what we'd want. The real jump is around all-13s, which are around the median roll. (Average is between 12 and 13, but most of us know that the extreme-low rolls skew this average, right?)


Elite Array:

8 10 12 13 14 15 1.120787e-001 1.127845e-001

... huh. This time it's ELEVENTH percentile; before it was 20th percentile. That's fishy. However, it ranks better than all 12s, and worse than all 13s. That sounds about right. Even if the number seems weird, it's all about which stat array is better than which others. Besides, most of us expect to roll better than the Elite Array.


NPC Array:

8 9 10 11 12 13 7.375165e-003 7.575864e-003

Well, we KNEW this would be bad. Now we know just how bad. 0th percentile. Ick.


Final array suggested by Jacob Orlove:

10 10 10 11 11 11 1.988051e-003 2.005921e-003

Now, this ranks as WORSE than the NPC array, and I think most of us would agree with that. Having a pair of +1 modifiers to work with more than offsets the pair of -1 modifiers. All 0s just doesn't cut it.


The Test Barbarian:
10 10 14 14 15 16 5.525621e-001 5.531459e-001
9 16 16 16 16 16 9.925060e-001 9.925190e-001
8 15 16 16 17 17 9.806353e-001 9.807371e-001
7 14 16 16 16 18 9.588755e-001 9.589711e-001
6 17 17 17 18 18 9.992777e-001 9.992823e-001
5 18 18 18 18 18 9.999481e-001 9.999485e-001

This seems a bit more sane. The first one is a decentish barbarian build that still looks a bit forlorn without an 18, and is slightly above 50th percentile. Not a bad fit. The rest are freakishly good barbarian builds, and are justly found above 90th percentile. I'll play around with a test barbarian build some more when Erom plugs this new data into his GUI. (Nice plug-and-play design there, Erom.)


JoeFredBob's data:
10 10 10 10 11 11 1.590065e-003 1.601026e-003
10 10 10 10 10 12 1.683892e-003 1.688250e-003
10 10 10 10 12 12 5.102540e-003 5.118685e-003
10 10 10 10 10 14 3.423939e-003 3.428902e-003

Hmm. Okay, we passed the first test here; 5 10s and a 12 has moderately less suck than 4 10s and 2 11s. The second test it fails; the lone 14 ranks worse than the pair of 12s. Point buy actually pegs the two at the same point value. However, the percentiles are fairly close (in absolute terms, at least), so I'm not going to sweat it. For that matter, I'd say whether the 12s are worse than the 14 depends on the class you choose; a sorcerer would want the 14, but a bard or barbarian would want the 12s. To the extent that anyone wanted these stats, at least. Let's try a comparable situation -- compare all 12s to 3 14s and 3 10s, and all 14s to 3 18s and 3 10s.

12 12 12 12 12 12 5.003686e-002 5.005551e-002
10 10 10 14 14 14 8.441363e-002 8.447262e-002
14 14 14 14 14 14 9.054083e-001 9.054529e-001
10 10 10 18 18 18 9.850961e-001 9.851082e-001

In both cases, the few truly high stats were better than the many somewhat high stats, but not by an insane degree. That sounds about right, though I'm still mulling over the two 12s versus the 14s.

Oh, and Yakk, while I did consider that system of yours, I discarded it for two reasons. First, navigating what you call the sub-V elements requires some fairly complex logic if you want to ignore the order of the stats. That is, 14 14 14 16 16 16 is the same as 16 16 16 14 14 14. Since we're talking about assign as rolled, we want that. If you DON'T want to ignore the order of the stats, it alters the weightings from what I was looking for. The simplest approach would be to have an isLegal() test for each stat array, and fail for the majority which aren't legal. Second, as you mentioned, it might have as many as 36 sub-V calls -- or, rather, 72, since you could be accessing both the weight AND weight-below of each sub-V. (The test-and-fail approach would have 36 + n calls, where n is the ACTUAL number of sub-Vs; more complicated logic would only have 2n, which would always be less. N is a variable value, and I got a headache trying to derive a formula for it months ago.) My algorithm requires exactly 12 sub-V calls per stat array (ignoring cases where an individual stat is 3 or 18 -- I couldn't and didn't have to look below and above those, respectively), with correspondingly fewer math operations. Your approach would work; it's just a bit of a pain.

I've emailed out the updated percentile data. Hopefully we'll have a GUI for it shortly.

Rykaj
2007-04-02, 07:40 PM
Wow, this is pretty cool stuff. I am pretty much clueless when it comes to maths like this, and programming leaves me completely oblivious, but I was wondering if this is completely bug-free now. I might want to introduce it to my group. We usually do the 4d6 and generally aren't fond of point buy, but this might entice them. So keep up the good work and tell us how far along it is :)

Yakk
2007-04-02, 08:03 PM
How so?

I don't see anything hard about taking an unordered set of indexes, and generating the sorted order:


#include <algorithm>

struct index {
    int i[6];
};

// takes an index i and makes it sorted, least to greatest:
index order_index(index i) {
    std::sort( i.i, i.i+sizeof(i.i)/sizeof(i.i[0]) );
    return i;
}

// takes an index i, and reduces the element elem by 1:
index reduce(index i, int elem) {
    i.i[elem]--;
    return i;
}


So to do the totaling of the sub-Vs, you just:


// does the actual work:
void V_total_help( index i, int start, int sign, int* accum );

// forward declaration of the cached V function, sketched further down:
int V( index i );

// nice clean interface:
int V_total( index i ) {
    int accum = 0;
    V_total_help( i, 0, 1, &accum );
    return accum;
}

// can only recurse 6 deep, because of the a<6 restriction on the for loop
// each depth has the sign flipped
// accum stores the accumulated total.
// the "V" function takes a sorted index, does a lookup in the cache
// and if it fails to find it, generates the value.
// V_total_help will be called 2^6 times in the recursive tree
// (maybe -1), each one will have a different subset of the index decremented
// the sorting of the index is only done before the V lookup, and is then discarded. This is important.
void V_total_help( index i, int start, int sign, int* accum )
{
    for( int a = start; a < 6; ++a ) {
        index a_ = reduce( i, a );
        *accum += sign * V( order_index( a_ ) );
        V_total_help( a_, a+1, sign*-1, accum );
    }
}


What is tricky about it?

I guess if you treated the index like 6 different variables, it would get headachy. :)

The V function would be broken down into two parts:


int V( index i ) {
    // check the cache for i; if the lookup fails:
    lookup_location = V_real( i );
    // ...and return whatever the cache location now holds.
}

int X( index i ) {
    // calculates the number of ways you can roll exactly i
}

int V_real( index i ) {
    if (index_is_out_of_bounds(i)) return 0;
    int result = V_total( i ) + X( i );
    return result;
}


The only thing missing is "how many ways are there to roll exactly 1 18, 1 16, 2 12s and 2 10s", and the dynamic cache.
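For that missing piece, one option is to brute-force the single-score counts once and then multiply them together, times the number of distinct orderings of the six scores. A sketch (mine, not part of the code above; long longs because the counts get big):

    // ways4d6[v] = how many of the 6^4 ordered rolls of 4d6 give a score of v
    // after dropping the lowest die. Call fill_ways4d6() once before using X.
    long long ways4d6[19];

    void fill_ways4d6() {
        for (int a = 1; a <= 6; ++a)
        for (int b = 1; b <= 6; ++b)
        for (int c = 1; c <= 6; ++c)
        for (int d = 1; d <= 6; ++d) {
            int lo = a;
            if (b < lo) lo = b;
            if (c < lo) lo = c;
            if (d < lo) lo = d;
            ways4d6[a + b + c + d - lo]++;
        }
    }

    // X(i): the number of ordered 24-die outcomes whose six scores, sorted,
    // come out to exactly i (i being the sorted index struct from above).
    long long X(index i) {
        // distinct orderings of the six scores = 6! / (product of each run's factorial)
        long long perms = 720;
        for (int k = 0; k < 6; ) {
            int run = 1;
            while (k + run < 6 && i.i[k + run] == i.i[k]) ++run;
            for (int f = 2; f <= run; ++f) perms /= f;
            k += run;
        }
        long long ways = perms;
        for (int k = 0; k < 6; ++k) ways *= ways4d6[i.i[k]];
        return ways;
    }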

Erom
2007-04-02, 09:16 PM
Home from break, but I caught a nasty cold from all the air travel, so I'm heading to sleep tonight. Tomorrow morning I'll update the GUI. I'm thinking of making it sort of multi-purpose, so that you can select what system you want to use (pseudo-percentile, traditional point buy, or alternate point buy like Reltzik proposed in the other thread (call it weighted point buy? Adjusted point buy?)). Anyway, sleep now. Glaaaaaaaaaa.

Yakk
2007-04-02, 09:49 PM
#include <cassert>

struct data {
    int ****** ptr;
    data():ptr(0){}
};

// this function builds a level of the data table
// it allocates the minimum number of elements that are required for this
// particular entry:
template<typename T>
void build_table_level( T*& t, index i, int depth ) {
    assert(!t); // only call this function if you have a null to fill.
    if(t) return;
    // if we are on the top level, the table needs to be size 18-3+1:
    if (depth==1) {
        t = new T[18-3+1]();
        return;
    }
    // otherwise, it needs enough room to fit the 1-shallower number
    // of data points:
    t = new T[i.i[6-depth+1]-3+1]();
    return;
}

// lookup_helper does the grunt work of the lookup function
// it relies on build_table_level to ensure that the table doesn't have
// any nulls in the way.
// It then recursively uses itself to find the right value in the table.
// (the non-template overload is the bottom of the recursion: it exists so the
// T=int instantiation of the recursive call compiles; it is never actually run)
int& lookup_helper( int& t, index, int ) {
    return t;
}

template<typename T>
int& lookup_helper( T*& t, index i, int depth ) {
    if (!t) {
        build_table_level( t, i, depth );
    }
    int offset = i.i[6-depth]-3; // roll of 3 is an offset of 0.
    T* t2 = t+offset;
    if (depth==6) return *t2;
    return lookup_helper( *t2, i, depth+1 );
}

int& lookup( data d, index i ) {
    i = order_index(i); // paranoid, can be removed to make it run faster
    return lookup_helper( d.ptr, i, 1 );
}


There -- that creates a table to store the probabilities in, using minimal space.

It returns an int& to the value. This value will be zero to start.

So the V function can get the int&: if it is zero, you know it hasn't been filled in. You calculate the right value, then place it in the int&.

Next time you call the V function, it quickly finds the int&, and returns without having to do any calculation.

This means that the V function only does any calculation at most (size of table) number of times. :)

_int64s may have to be used to fit the size of the numbers we are calculating.
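Putting those pieces together, the memoized V might end up looking something like this (my sketch, wiring the earlier lookup, order_index, V_total, and X declarations together; the cache is a global only because the earlier sketches don't pass it around, and as noted the ints would want widening to 64 bits):

    data d;   // the six-level ragged cache from above; a zero entry means "not computed yet"

    int V( index i ) {
        i = order_index( i );                 // so i.i[0] is the smallest score
        if ( i.i[0] < 3 ) return 0;           // fell off the bottom of the table
        int& cached = lookup( d, i );
        if ( cached == 0 )
            cached = V_total( i ) + X( i );   // V_real's job from the earlier sketch
        return cached;
    }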

Reltzik
2007-04-03, 12:43 AM
Grrk. C++. That was the first object-oriented language I ever learned, and I never forgave it that. I just can't read C++ code without wanting to fly into a homicidal rage.

Anyhow, what I'm trying to make out here is when the weights-below and weights-above are added up.

Eighth_Seraph
2007-04-03, 02:07 AM
Honestly, most of this is Greek to me, as programming generally leaves me gibbering in a corner and Statistics is the current limit of my mathematical abilities. In any case, I'd like to give my newest character's stats a whirl in this percentile system. My friend Silas has
13 Strength
15 Dexterity
15 Constitution
13 Intelligence
15 Wisdom
12 Charisma.

Generated by the 4d6 minus lowest die system.

Reltzik
2007-04-03, 10:01 AM
12 13 13 15 15 15 7.846889e-001 7.851734e-001

Silas is 78th percentile.

CharPixie
2007-04-03, 11:30 AM
Yes, that wasn't a solution. I still hold that you can do any particular array in O(1) time, though, just using the probabilities for each stat being that number or less and then compensating for ordering.

Yakk
2007-04-03, 02:59 PM
CharPixie, solve it for 3 stats, each 2d6.

CharPixie
2007-04-03, 05:19 PM
You arguing against the a(x-1,x,x,x,x,x) idea I had? I already agreed with you that it was wrong.

Or are you arguing with the assertion that the problem can be solved with dynamic programming?

Douglas
2007-04-03, 06:03 PM
It strikes me that your percentiles depend a great deal on either how you order the arrays (if you calculate % by counting) or some possibly unwarranted assumptions about how to compute it directly. I'm not entirely sure what method you're using, but some of the percentiles you've posted just don't match what I'd expect at all.

I suggest trying this:
1) Calculate the probability table for a single ability score. The probability in the table is the chance of getting that value or lower with 4d6 drop lowest.
2) For each stat array, look up the probabilities for each of the six ability scores in the table generated in step 1 and multiply them all together.
3) Take the 6th root of the product from step 2. This number is the percentile for that array.

I'm not sure how well I can explain why, but I suspect this will yield results closer to the numbers I expected and may fit the actual definition of percentiles as well.
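To make that concrete, a small sketch of the three steps (my own code, brute-forcing the single-score table the slow way):

    #include <math.h>

    // Douglas's three steps: build the single-score cumulative table, multiply
    // the six cumulative probabilities, take the sixth root. The table is brute
    // forced over all 6^4 = 1296 rolls of 4d6.
    double douglas_percentile(const int stats[6]) {
        static double cumulative[19];     // cumulative[v] = P(one score <= v)
        static int built = 0;
        if (!built) {
            int count[19] = {0};
            for (int a = 1; a <= 6; ++a)
            for (int b = 1; b <= 6; ++b)
            for (int c = 1; c <= 6; ++c)
            for (int d = 1; d <= 6; ++d) {
                int lo = a;
                if (b < lo) lo = b;
                if (c < lo) lo = c;
                if (d < lo) lo = d;
                count[a + b + c + d - lo]++;
            }
            double running = 0.0;
            for (int v = 3; v <= 18; ++v) {
                running += count[v] / 1296.0;
                cumulative[v] = running;
            }
            built = 1;
        }
        double product = 1.0;
        for (int n = 0; n < 6; ++n)
            product *= cumulative[stats[n]];
        return pow(product, 1.0 / 6.0);
    }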

Fax Celestis
2007-04-03, 06:18 PM
Have you considered the probability of such stats according to the die-roll method?

For instance, stats are going to be much higher on average (7-18) if you use the 1d12+6 method I have seen used.

There are other methods, each of which has its own probabilities. "Weighting" your results according to the method used to determine scores should also alter the percentile.

Methods I've seen used:
3d6, six times (Range: 3-18)
3d6, eight times, keep highest six (Range: 3-18)
4d6, six times, drop low die (Range: 3-18; higher averages than 3d6)
4d6, eight times, drop low die, keep highest six (Range: 3-18; higher averages than 3d6)
1d12+6, six times (Range: 7-18; much higher average than normal)
1d12+1d8, six times (Range: 2-20; higher potential maximum, but lower potential bottom. Also, less bell curve)
6d3 (Range: 6-18; much higher average than normal, though a better bell curve than 1d12+6)
5d3+3 (Range: 8-18; much higher average than normal, and a strong bell curve).
4d3+6 (Range: 10-18; most characters are superhuman in this system; stats of below 13 are rare)

henebry
2007-04-03, 08:09 PM
I'm sorry to come in here with an objection after so much good work has been done, but I'm not clear why you based your system on "4d6 drop the lowest" rather than simply 3d6. Remember that "4d6 drop" is simply a method for generating "above average" characters (WotC regards it as equivalent to the elite array), so "4d6 drop" shouldn't be conceptualized as fundamental (it may be the system used to generate PCs, but PCs are just a select group in a larger world).

So you should think of 3d6 as the fundamental stat-generator. "4d6 drop the lowest" is just a shortcut for isolating the elite individuals who heed the call of adventure. As is using the elite array.

So my preference would be to see your system implemented for 3d6. Then, if I understand your system, the DM could set the p = 50% for a very low powered campaign (the equivalent of the standard array), p=60% or so for the elite array, and p even higher for something akin to a high point buy system.

Erom
2007-04-04, 08:12 AM
How many games have you played with 3d6, and how many with 4d6 drop? Myself, I think suggesting 3d6 would be a good way to be thrown out of my gaming group. I'm mostly kidding, but my group actually plays 4d6 drop the lowest, and 1's count as 6's. My point is that there are a lot of gamers out there that use 4d6drop, so this is plenty useful in that regard. Of course, if you or someone else wanted to recalculate the information based on 3d6 plain, that would be useful to a lot of people as well.

I guess my point is that to start with 4d6 drop or 3d6 is a matter of preference. Once an algorithm to generate the pseudo-percentiles is in place, it's easy to calculate the missing table.

Also, sorry about the delay on the GUI, it'll be up in an hour or two, I think.

SpiderBrigade
2007-04-04, 08:23 AM
Yes, while it would be somewhat interesting to know the percentiles of the myriad alternate rolling methods Fax mentions, that would be icing on the basic cake, which is knowing the results for 4d6 drop lowest. Henebry, I'm sorry but you are wrong in saying that 3d6 is the fundamental stat generator.
To create an ability score for your character, roll four six-sided dice (4d6). Disregard the lowest die roll and total the three highest ones.

The DMG mentions a series of 8 other methods, which includes point buy, 3d6, and even "5d6 drop 2." However these are explicitly mentioned as "In addition to the standard method for generating ability scores presented in the Player's Handbook."

Now, your point that PCs using 4d6 are just a smaller subset of the world doesn't really hold either, because the default for all other entities is to use either the elite array, or the NPC array of all 10s and 11s. Monsters, NPCs, etc are not rolled-up the way PCs are, especially as they're intended to A) be fair and B) fulfil a certain role in the campaign. The DM is not going to randomly roll a BBEG and use him even if all of his stats are 5s, unless he has a reason for doing that. He'll almost certainly use the Elite Array. So it doesn't really matter what percentile of 3d6 the standard 10s-and-11s array is, if no one who would be using that ever has randomly rolled stats.

Now, I'd certainly be interested if some of the statistics/programming gurus in this thread wanted to include the percentiles for those alternate methods, especially as that would let us see whether a certain stat array is really easier to get with a certain method. But as I said, that's icing, and the "default" method is the priority here.

Vik
2007-04-04, 08:33 AM
So you should think of 3d6 as the fundamental stat-generator. "4d6 drop the lowest" is just a shortcut for isolating the elite individuals who heed the call of adventure. As is using the elite array.

QFT. I'd like to see a comparison between the two methods' percentiles, btw.

SpiderBrigade
2007-04-04, 08:54 AM
See, I think it really depends on what you want these percentiles for. For me, and I think this is the original intent, the purpose is as an alternate method for making characters within a certain power range. Thus the DM says "Okay, characters 60th-80th percentile" and the players use that. For such a purpose, it's only really significant to know the percentiles of the default PC rolling method, and possibly one or two popular alternate methods as a bonus.

If you're looking to find out "what is the percentile of a character with these scores in the D&D world," I'm of the opinion that this method won't help you. As mentioned, NPCs generally do not roll their stats. Your average commoner has 10s and 11s. Furthermore, knowing the percentile of, say, the elite array within all possible rolls of 3d6 does not tell you the "in-world" percentile of those stats. This is because to determine that you would have to know the population of the world and the ability scores of all characters. The pseudo-percentiles being generated here only tell you the relative merit of a score array relative to all other rolled arrays. It's a tool for estimating the power of a set of scores compared to other scores, not the power of those scores in terms of the game world.

Erom
2007-04-04, 09:02 AM
Well, if you have Java installed you can now compare and contrast the results using Reltzik's Pseudo-Percentile schema, Invisible Castle-style traditional Point Buy, or the Adjusted Point Buy from this (http://www.giantitp.com/forums/showthread.php?t=38754) thread.

Download V2 (http://rapidshare.com/files/24296896/Roller.jar.html)

Vik
2007-04-04, 09:13 AM
If you're looking to find out "what is the percentile of a character with these scores in the D&D world," I'm of the opinion that this method won't help you. As mentioned, NPCs generally do not roll their stats. Your average commoner has 10s and 11s.

I know a lot of DMs that will roll 3d6 to get a random NPC's ability score - mostly for Charisma.
Yet I agree that the main point is to give a scaling for character power level, so the 4d6 method percentile is more useful to compare with other characters and standard values.

Reltzik
2007-04-04, 11:14 AM
Okay, I'm going to be looking at Erom's work in a minute (once it finishes downloading).

I will agree that 3d6 would also be interesting, and the other options as well, to a lesser degree. However, 4d6 drop lowest IS the standard method, and so I placed a priority on that. (Technically, I should have used 4d6 drop lowest with "unplayable" characters ignored, but that would have given REALLY screwy data.)

All right, let's try another test barbarian at 50th percentile.

16 15 17 10 10 10 is 44th percentile.

However, if we were to take those odd values and use them to pump up wisdom (good for various survival skills of the barbarian), we'd get:

16 14 16 10 12 10 at 50th percentile.

Not good.

16 14 18 10 10 9 is 47th percentile. However, if we drop dex and then raise wis, we get:

16 12 18 10 10 9 at 37th percentile, and then

16 12 18 10 11 9 at 49th percentile. That's right, the one point of wisdom was worth more than the two points of dex.

This makes sense under the algorithm -- you turn far more of the possibilities above you into non-comparable possibilities by raising a lower score (Wisdom) than by raising a higher score (Dex). Some more experimenting shows that this effect is lessened when you're raising a 17 to an 18, and heightened when you're raising a 12 to a 13. Again, this makes sense; the weight of the 12s is greater than the weight of the 17s, and you're shifting that greater weight from above to below. (BTW, this problem would be WORSE with straight 3d6, methinks.) So I'm not inclined to think that this is an implementation bug this time around.

It is, however, a poor point buy system. Still, it's useful data for people who like to crunch the odds on char gen.

Yakk
2007-04-08, 10:34 PM
You arguing against the a(x-1,x,x,x,x,x) idea I had? I already agreed with you that it was wrong.

Or are you arguing with the assertion that the problem can be solved with dynamic programming?

Of course I think it could be solved with dynamic programming. I implemented, in C++ code, about 90% of what I think is a dynamic programming solution in this thread.

I hope I'll have some time in a week or two to finish it off, or someone else could take my work and finish it off...

Dan_Hemmens
2007-04-09, 05:33 AM
I really, really hate to burst your bubble here, but this is a completely useless exercise for character creation.

The probability of rolling a particular set of ability scores has absolutely no relationship to how powerful it is. 3, 3, 3, 3, 3, 3 is the least probable stat array of all.

It doesn't "counter" min-maxing either. Sure, you're unlikely to wind up with "18, 18, 18, 7, 7, 7" but only because you can instead pick the more probable "18, 18, 18, 12, 12, 12".

Douglas
2007-04-09, 08:13 AM
That would be true if he were going with the probability of rolling a particular exact stat array. Instead, he's calculating the probability of rolling a particular stat array or worse. The "18, 18, 18, 12, 12, 12" array is clearly better than "18, 18, 18, 7, 7, 7" and would therefore have a higher probability of rolling that well or worse, requiring a higher limit on that probability to be available. I don't think the "or worse" part of the problem is being handled very well, but that's just a detail of implementation. The concept is just fine.

Dan_Hemmens
2007-04-09, 08:19 AM
That would be true if he were going with the probability of rolling a particular exact stat array. Instead, he's calculating the probability of rolling a particular stat array or worse. The "18, 18, 18, 12, 12, 12" array is clearly better than "18, 18, 18, 7, 7, 7" and would therefore have a higher probability of rolling that well or worse, requiring a higher limit on that probability to be available. I don't think the "or worse" part of the problem is being handled very well, but that's just a detail of implementation. The concept is just fine.

Since this entire *thread* is about details of implementation, I'd suggest that it's actually rather important.

What it boils down to is this: all that's actually going on here is that you're reproducing the Arrays by an absurdly convoluted method. If you say "67th percentile characters" or whatever, all you're really saying is "characters whose best possible set of stats is this". Why bother with all the programming?

clarkvalentine
2007-04-09, 08:44 AM
Why bother with all the programming?


An excuse to use a six-dimensional array?

*shudders*

Valairn
2007-04-09, 09:29 AM
This is quite possibly the most geeky delicious thread ever. Keep up the good work guys. You are all legends.

Yakk
2007-04-09, 11:11 AM
Since this entire *thread* is about details of implementation, I'd suggest that it's actually rather important.

*nod*, because it is a tricky problem to solve and to solve correctly.


What it boils down to is this: all that's actually going on here is that you're reproducing the Arrays by an absurdly convoluted method. If you say "67th percentile characters" or whatever, all you're really saying is "characters whose best possible set of stats is this". Why bother with all the programming?

No, you don't understand.

Of the stats strictly comparable to ARRAY_X, what percent are better than ARRAY_X and what percent are worse?

I.e., two noncomparable sets of stats:
18 16 16 12 8 6
and
14 14 14 14 14 14
could both have the same percentile value. (I'm not saying they do, but they could).

Neither of them is strictly better than the other, but this thread contains some attempt to map them both to a percentile value that means something.

The "67th percentile" isn't a point, but rather a surface in a 6 dimensional non-euclidean space.

The goal of this is to create a system that emulates 4d6 (or 3d6) stat rolling, but does it in such a way that all players have the same amount of luck.

It may turn out that the algorithms talked about in this thread won't be useful: but they are beautiful, and quite often mathematically beautiful solutions are useful. ;)
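
To make "strictly comparable" concrete, here is a small C++ sketch of one natural reading of the ordering - sort both arrays and compare them position by position - applied to the two arrays above. This is my reading of "strictly better", not a definition lifted from Reltzik's code:

// One plausible reading of "strictly better": after sorting both arrays,
// every position of one is >= the corresponding position of the other,
// with at least one strict inequality. If some positions go each way,
// the arrays are noncomparable.
#include <algorithm>
#include <array>
#include <cstdio>

using Stats = std::array<int, 6>;

const char* compare(Stats x, Stats y) {
    std::sort(x.begin(), x.end());
    std::sort(y.begin(), y.end());
    bool someLess = false, someMore = false;
    for (int i = 0; i < 6; ++i) {
        if (x[i] < y[i]) someLess = true;
        if (x[i] > y[i]) someMore = true;
    }
    if (someLess && someMore) return "noncomparable";
    if (someMore) return "strictly better";
    if (someLess) return "strictly worse";
    return "equal";
}

int main() {
    // The two example arrays above: neither dominates the other.
    printf("%s\n", compare({18, 16, 16, 12, 8, 6}, {14, 14, 14, 14, 14, 14}));
    // For contrast, a single 18 against straight 10s does dominate.
    printf("%s\n", compare({18, 10, 10, 10, 10, 10}, {10, 10, 10, 10, 10, 10}));
}

The first pair comes out noncomparable, as described; the second comes out strictly better.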

Dan_Hemmens
2007-04-09, 11:18 AM
No, you don't understand.

Of the stats strictly comparable to ARRAY_X, what percent are better than ARRAY_X and what percent are worse?

But the only way to determine that is to pick an arbitrary standard. Like, say, points-buy.


I.e., two noncomparable sets of stats:
18 16 16 12 8 6
and
14 14 14 14 14 14
could both have the same percentile value. (I'm not saying they do, but they could).

Neither of them is strictly better than the other, but this thread contains some attempt to map them both to a percentile value that means something.

But that's exactly my point: the percentile value will never mean anything. It will always be arbitrary.


The "67th percentile" isn't a point, but rather a surface in a 6 dimensional non-euclidean space.

The goal of this is to create a system that emulates 4d6 (or 3d6) stat rolling, but does it in such a way that all players have the same amount of luck.

Which is - to put it bluntly - not a useful endeavour. You can't emulate a random roll, then take out the randomness and say that it's still random. All you're doing is reinventing points-buy by a needlessly complicated method.

Erom
2007-04-09, 11:35 AM
All you're doing is reinventing points-buy by a needlessly complicated method.

Yes. That's exactly what we're doing, and many of us feel that this new system will have value as an alternative to point buy. Any system is going to be arbitrary - point buy is. I fail to understand why an arbitrary system doesn't mean anything. I don't think anyone ever claimed that this system was somehow "better" than point buy - it's an alternative. If it's crafted well, it will avoid some of the quirks of point buy (everyone has one 8 and one 18), but of course no system is perfect, and this system will have its quirks too (currently, MAD characters are penalized under it).

As for needlessly complicated: you are free to propose a better algorithm. That is sort of the point of this thread - feedback, analysis, and discussion of Reltzik's algorithm as a launching point for other algorithms to rank stat arrays.

Also - are people getting the GUI working? Any bugs?

Dan_Hemmens
2007-04-09, 11:40 AM
I've got nothing against arbitrary systems, but this is pure legacy thinking. There is no intrinsic merit to the "probability" method of character creation. Just because two stat arrays are equally "probable" that does not mean that they are both equally powerful.

Fax Celestis
2007-04-09, 12:00 PM
I've got nothing against arbitrary systems, but this is pure legacy thinking. There is no intrinsic merit to the "probability" method of character creation. Just because two stat arrays are equally "probable" that does not mean that they are both equally powerful.

True, but it also means that the DM can say "We're playing a 60th Percentile game," and allow each of his players to choose the stat array of 60th or less that best fits their character. Further, the DM can use percentiles to his own advantage when creating NPCs. Need an on-the-fly commoner? Grab a 35th Percentile array and go. Need a warlord? Try a 75th.

Frankly, I see this more as a DM tool than as a player tool.

Dan_Hemmens
2007-04-09, 04:52 PM
True, but it also means that the DM can say "We're playing a 60th Percentile game," and allow each of his players to choose the stat array of 60th or less that best fits their character. Further, the DM can use percentiles to his own advantage when creating NPCs. Need an on-the-fly commoner? Grab a 35th Percentile array and go. Need a warlord? Try a 75th.

Frankly, I see this more as a DM tool than as a player tool.

As a DM tool it is no better than points buy or the Arrays. And neither of those require reams of coding.

kamikasei
2007-04-11, 01:26 PM
As a DM tool it is no better than points buy or the Arrays. And neither of those require reams of coding.

But... you don't have to do the reams of coding. Other people, who enjoy the intellectual exercise, will do the coding for you. So long as you trust that their method of generation is sound, you can simply reap the fruits of their labour through a handy-dandy GUI.

Obviously you don't view this as having enough utility to justify spending the time and effort on it yourself, but that calculation will be different for others who view the costs as lower, so what harm if they get on with it?

The real question is whether you, as a player or DM, would use what they produce once they're finished, since it wouldn't require any programming on your part. That's the test of utility.

Fax Celestis
2007-04-11, 01:29 PM
I'd use it.

CharPixie
2007-04-11, 02:23 PM
Of course I think it could be solved with dynamic programming. I implemented, in C++ code, about 90% of what I think is a dynamic programming solution in this thread.

I hope I'll have some time in a week or two to finish it off, or someone else could take my work and finish it off...


Oops. I meant "without dynamic programming", not "with". My bad.

SpiderBrigade
2007-04-11, 03:00 PM
As a DM tool it is no better than points buy or the Arrays. And neither of those require reams of coding.

Oh hay guys, looks like he's right and you're wasting your time working on this. What a relief, huh?
:smallyuk:

edit: or, yeah, what kamikasei said, too.

Yakk
2007-04-11, 03:50 PM
But the only way to determine that is to pick an arbitrary standard. Like, say, points-buy.

No. Read my statement more carefully.

Can you say that 18 10 10 10 10 10 is strictly better than 10 10 10 10 10 10 without an arbitrary standard?

I mean, other than "high stats good". ;)


But that's exactly my point: the percentile value will never mean anything. It will always be arbitrary.

The percentile value is based on what percentage of rolls are strictly better than this roll, and what percentage are strictly worse. Rolls that are neither strictly better nor strictly worse are not factored in.

That is what makes this percentile system mathematically beautiful.

It may not turn out to be useful, game-balance wise.


Which is - to put it bluntly - not a useful endeavour. You can't emulate a random roll, then take out the randomness and say that it's still random.

Probability theory can be done without rolling dice. The math still works. And you end up with things that can quite usefully be called random.


All you're doing is reinventing points-buy by a needlessly complicated method.

Except that instead of a hand-tuned points-buy built around fudge factors, this attempt tries to build a points-buy system from first principles.

It may fail, and not generate a useful points-buy system. It also might generate one that is more interesting than the standard one. :)

...

Back to implementation:
In order to calculate the pseudo-percentile, one needs both the chance of rolling strictly HIGHER and the chance of rolling strictly LOWER. This requires running the dynamic programming twice - once for each direction - since the two subproblems are essentially independent of each other.
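
In the meantime, here is a brute-force cross-check one could test a finished dynamic-programming implementation against. It is NOT the double dynamic program described above - it just enumerates every sorted array outright (a few tens of thousands of them), weights each by its 4d6-drop-lowest probability, and takes pseudo-percentile = (mass strictly worse) / (mass strictly worse + mass strictly better). That formula, the componentwise-after-sorting reading of "strictly better", and the choice to leave ties out entirely are all my assumptions, so don't expect it to reproduce Reltzik's numbers exactly:

// Exhaustive cross-check for the pseudo-percentile idea: enumerate every
// sorted 6-stat array, weight it by its probability under six independent
// 4d6-drop-lowest rolls, and compare it to a target array.
#include <algorithm>
#include <array>
#include <cstdio>

using Stats = std::array<int, 6>;

// Single-stat distribution for 4d6-drop-lowest, by exhausting all 6^4 rolls.
std::array<double, 19> singleStatProb() {
    std::array<double, 19> p{};
    for (int a = 1; a <= 6; ++a)
        for (int b = 1; b <= 6; ++b)
            for (int c = 1; c <= 6; ++c)
                for (int d = 1; d <= 6; ++d) {
                    int low = std::min(std::min(a, b), std::min(c, d));
                    p[a + b + c + d - low] += 1.0 / 1296.0;
                }
    return p;
}

// -1 if x is strictly worse than y, +1 if strictly better, 0 if equal,
// 2 if noncomparable. Both arrays must already be sorted.
int compareSorted(const Stats& x, const Stats& y) {
    bool someLess = false, someMore = false;
    for (int i = 0; i < 6; ++i) {
        if (x[i] < y[i]) someLess = true;
        if (x[i] > y[i]) someMore = true;
    }
    if (someLess && someMore) return 2;
    if (someMore) return 1;
    if (someLess) return -1;
    return 0;
}

// Probability that six independent 4d6-drop-lowest stats come out as this
// multiset: (number of distinct orderings) * (product of per-value probabilities).
double arrayProb(const Stats& sorted, const std::array<double, 19>& p) {
    double prob = 1.0;
    for (int v : sorted) prob *= p[v];
    double orderings = 720.0;          // 6!
    for (int i = 0; i < 6;) {          // divide by (run length)! for each repeated value
        int j = i;
        while (j < 6 && sorted[j] == sorted[i]) ++j;
        for (int k = 2; k <= j - i; ++k) orderings /= k;
        i = j;
    }
    return prob * orderings;
}

double pseudoPercentile(Stats target) {
    std::sort(target.begin(), target.end());
    const auto p = singleStatProb();
    double better = 0.0, worse = 0.0;
    Stats a{};
    // Enumerate every nondecreasing array of six values in 3..18.
    for (a[0] = 3; a[0] <= 18; ++a[0])
      for (a[1] = a[0]; a[1] <= 18; ++a[1])
        for (a[2] = a[1]; a[2] <= 18; ++a[2])
          for (a[3] = a[2]; a[3] <= 18; ++a[3])
            for (a[4] = a[3]; a[4] <= 18; ++a[4])
              for (a[5] = a[4]; a[5] <= 18; ++a[5]) {
                  int cmp = compareSorted(a, target);
                  if (cmp == 1) better += arrayProb(a, p);
                  else if (cmp == -1) worse += arrayProb(a, p);
                  // Ties and noncomparable arrays are deliberately left out,
                  // per the "not factored in" description above.
              }
    return 100.0 * worse / (worse + better);
}

int main() {
    // One of Reltzik's test arrays from earlier in the thread. This sketch
    // need not reproduce his exact figures, since his normalization may differ.
    printf("%.1f\n", pseudoPercentile({16, 14, 16, 10, 12, 10}));
}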