Jamie Elsey writes:
I've been really interested to see you talking more about data visualisation in your blog, as it's a topic I really enjoy and think is underappreciated. I've recently been working on some ways of legibly presenting uncertainty as part of my work, and devised what is, to me, a slightly novel way of showing distributions of data that I find quite useful. I wondered if you have seen this kind of thing before, and what you think? Basically, it is like a histogram or density plot in that it shows the overall shape of the distribution, but what I find nice is that each bar is made to have the same area and to represent a particular percentage. One could call it a "percentogram." Hence, it is very easy to assess how much of the distribution falls in particular ranges. You can also color code the bars according to, e.g., particular quantiles, deciles, etc.
I think this could be useful for plotting things like posterior distributions, or the results of things like cost effectiveness analyses where some of the inputs include uncertainty or are simulated with variability. This is not a proper geom yet and the code is probably a bit janky, but if you'd like to see the code I can also share what I have so far (it's a function that takes a vector of data and returns a dataframe from which this kind of plot can be easily made).
I thought you might find it interesting, especially if it is something you haven't seen before, or maybe there is some good reason why this kind of plot is not used!
The above graphs show percentograms for random draws from the normal and exponential distributions.
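Here's a minimal sketch of the kind of function Elsey describes, one that takes a vector and returns a data frame of bars. This is not Elsey's code: the name percentogram_df, the column names, the 5% default, and the choice to shade the outer deciles are all made up for illustration.

```r
# Hypothetical helper (not Elsey's code): one row per bar of the percentogram.
percentogram_df <- function(x, probs = seq(0, 1, 0.05)) {
  edges <- unname(quantile(x, probs))   # bar edges sit at the chosen quantiles
  prop  <- diff(probs)                  # share of the data inside each bar
  data.frame(
    xmin    = head(edges, -1),
    xmax    = tail(edges, -1),
    p_lo    = head(probs, -1),          # cumulative probability at the left edge
    p_hi    = tail(probs, -1),          # cumulative probability at the right edge
    prop    = prop,
    density = prop / diff(edges)        # height chosen so that width * height = prop
  )
}

set.seed(1)                             # arbitrary seed, for reproducibility
d <- percentogram_df(rnorm(1000))

# Shade the lowest and highest deciles darker than the middle 80%
tol  <- 1e-8                            # guards against floating-point fuzz in probs
cols <- ifelse(d$p_hi <= 0.1 + tol | d$p_lo >= 0.9 - tol, "grey40", "grey80")

# Empty canvas, then one rectangle per bar
plot(0, 0, type = "n", xlim = range(d$xmin, d$xmax), ylim = c(0, max(d$density)),
     xlab = "", ylab = "", yaxt = "n")
rect(d$xmin, 0, d$xmax, d$density, col = cols)
```

Each row's width times its density equals prop, which is what gives every 5% bar the same area.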
In response to Elsey's question, my quick answer is that I've seen histograms with varying bin widths but not with equal probability.
Elsey did some searching and found this on varying binwidth histograms, with references going back to the 1970s. It makes sense that people were writing about the topic back then, because that was a time when statisticians thought a lot about unidimensional data display. Nowadays we think more about time series and scatterplots, but histograms still get used, which is why I'm sharing the idea here.
I googled *equal probability histograms in r* and found this amusing bit from 2004, classic R-list stuff, no messing around:
Q: I would like to use R to generate a histogram which has bars of variable bin width with each bar having an equal number of counts. For example, if the bin limits are the quartiles, each bar would represent 1/4 of the total probability in the distribution. An example of such an equal-probability histogram is presented by Nicholas Cox at http://www.stata.com/support/faqs/graphics/histvary.html.
A: So you can calculate the quartiles using the quantile() function and set those quartiles as breaks in hist().
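Indeed:
percentogram <- function(a, q=seq(0, 1, 0.05), ...) {
  # bin edges at the quantiles q (the default of 5% steps is an assumption)
  b <- quantile(a, q)
  # with unequal breaks, hist() plots densities, so each bar's area is its share of the data
  hist(a, breaks=b, ...)
}
I'll try it on my favorite example, a random sample from the Cauchy distribution:
> y <- rcauchy(1e5)  # sample size is an assumption
> percentogram(y)
And this is what comes up: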
That's kinda useless: the data cover a huge range, and then you see no detail in the middle. You can get similar problems with a classical equal-width histogram (try it!).
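For anyone taking up the "try it," here's a minimal sketch of the equal-width version. The seed, the sample size, and the request for roughly 20 bins are arbitrary choices for illustration.

```r
set.seed(123)                # arbitrary seed, for reproducibility
y <- rcauchy(1e4)            # assumed sample size
h <- hist(y, breaks = 20)    # classical histogram; R picks about 20 equal-width bins

# Share of the sample that lands in the two fullest bins
top2 <- sum(sort(h$counts, decreasing = TRUE)[1:2]) / length(y)
top2
```

The few extreme draws stretch the axis, so nearly all the data end up stacked in the bins around zero and the shape of the center is lost.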
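There's no way out of this one . . . except that if we're going with percentiles anyway, we could just trim the extremes:
percentogram <- function(a, q=seq(0, 1, 0.05), ...) {
  # bin edges at the quantiles q (the default of 5% steps is an assumption)
  b <- quantile(a, q)
  # plot only the data between the lowest and highest requested quantiles
  hist(a[(a >= b[1]) & (a <= b[length(b)])], breaks=b, ...)
}
OK, now let's try it, chopping off the lower and upper 1%: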
percentogram(y, q=c(0.01, seq(0.05,0.95,0.05), 0.99))
Not bad! I kinda like this percentogram as a default.