Histogram of Member Post-Counts
Welcome once again to "Fun with Graphs"!
Today we'll see a different kind of graph, a histogram. It will show how many MacRumors members have the same 6-month post-count.
For practical reasons (illustrated later), we use ranges of post-counts, which are called bins or buckets. The range of post-counts covered by a bin is called its size or width. For example, for a bin width of 100, all members with post-counts in the range 0 to 99 are put in the first bin, then those in the range 100-199 are put in the second bin, and so on. After all the members have been assigned to a bin, the bin counts are graphed. The bins represent the distribution of post-counts across the entire range. Some bins may be empty, meaning no member had a post-count within its range.
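Here's a minimal Python sketch of the binning step just described. It isn't the script used to make the tables in this post; the function name `bin_counts` and the sample post-counts are made up for illustration.

```python
from collections import Counter


def bin_counts(post_counts, width):
    """Map each bin's lower bound to the number of values that fall in it."""
    # Floor division finds the bin index; multiplying back gives the lower bound.
    counts = Counter((n // width) * width for n in post_counts)
    return dict(sorted(counts.items()))


# Hypothetical post-counts, invented for illustration only.
sample = [0, 42, 99, 100, 150, 199, 350]
width = 100
for low, members in bin_counts(sample, width).items():
    print(f"{low}-{low + width - 1}: {members}")
```

With this sample, nobody lands in the 200-299 range, so that bin is simply absent from the result. That's the "empty bin" case.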
Previous graphs have looked at top posters, either by their total post-count, or by their prolificness in the prior 6-month interval. A histogram covers all the members, even those with few or no posts in the past 6 months. Technically, it covers all members with total post-counts of 4000 or more, because those are the only ones I collect data for.
The histogram data is presented first in table form, then as graphs. The tables make it easier to see the exact counts of how many members are in each bin. The graphs look prettier.
I made two histograms with different bin-widths: 100 and 50. There's a separate bin for members with 0 posts. There's another separate bin for members with very high post-counts, a "top bin".
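As a rough sketch of how the special bins could be handled, here's a small Python function that picks a bin label for one post-count. The name `bin_label` and its parameters are hypothetical, and this is only one possible way to do it, not the actual script. The thresholds in the usage lines (2400 and 1000) are the top-bin starts that appear in the tables below.

```python
def bin_label(posts, width, top_start=None):
    """Return the label of the bin that a 6-month post-count falls into."""
    if posts < 0:
        return "<0"  # separate bin for post-counts that went down
    if posts == 0:
        return "0"  # separate bin for members with no posts
    if top_start is not None and posts >= top_start:
        return f"{top_start}+"  # the "top bin" for very high post-counts
    low = (posts // width) * width
    high = low + width - 1
    # The lowest regular bin starts at 1, because 0 has its own bin.
    return f"{max(low, 1)}-{high}"


print(bin_label(57, 100, top_start=2400))
print(bin_label(57, 50, top_start=1000))
print(bin_label(2500, 100, top_start=2400))
```

Passing `top_start=None` disables the top bin, so every post-count gets a regular bin no matter how high it is.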
The Tables
Measurements taken 03 Jan 2019 by chown33.
Code:
Range of
6-Mo Posts   Members
-----------  -------
<0                 4
0                133
1-99             121
100-199           29
200-299           18
300-399           17
400-499           10
500-599            7
600-699            8
700-799            6
800-899            4
900-999            5
1000-1099          2
1100-1199          5
1200-1299          1
1500-1599          1
1600-1699          1
1900-1999          2
2000-2099          2
2300-2399          1
2400+              5

Range of
6-Mo Posts   Members
-----------  -------
<0                 4
0                133
1-49              96
50-99             25
100-149           19
150-199           10
200-249           11
250-299            7
300-349            9
350-399            8
400-449            8
450-499            2
500-549            4
550-599            3
600-649            5
650-699            3
700-749            1
750-799            5
800-849            3
850-899            1
900-949            2
950-999            3
1000+             20
Table Columns:
- Range of 6-Mo Posts = the range of post-counts covered by this bin
- Members = the number of members whose 6-month post-count falls in this range
We can see that the most common range in both tables is 0 posts. The next most common is the lowest non-zero bin: 1-99 in the table with a bin width of 100, or 1-49 in the table with a bin width of 50.
The "<0" bin has a few members whose post-count actually decreased over the 6-month interval. This can happen when a member has posted in a thread that's later moved to an uncounted area, such as Wasteland. It doesn't mean the member isn't active, only that some of their previously counted posts were moved to an area where posts aren't counted.
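To make the arithmetic concrete, here's a tiny Python sketch. It assumes the 6-month post-count is simply the difference between two snapshots of a member's total post-count, which is my reading of the description rather than a confirmed detail of the scripts. The member names and totals are invented.

```python
def six_month_posts(total_before, total_after):
    """6-month post-count as the change between two total post-count snapshots."""
    return total_after - total_before


# Hypothetical (before, after) totals, invented for illustration only.
snapshots = {
    "member_a": (5200, 5350),  # posted normally
    "member_b": (4100, 4100),  # no change: lands in the "0" bin
    "member_c": (8000, 7997),  # net loss after posts were moved: lands in "<0"
}

deltas = {name: six_month_posts(before, after)
          for name, (before, after) in snapshots.items()}
print(deltas)
```

Under that assumption, a negative result only needs the number of moved posts to exceed the number of new posts in the interval.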
The Graphs
View attachment 821079
View attachment 821080
First, let me acknowledge that these aren't Proper Histograms, which should be drawn with a vertical bar representing each bin count. Instead, I used the same automated line-graphing script I produced the other graphs with. I may make a bar-drawing version of the script, for future graphing fun.
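For a rough idea of what a bar version could look like, here's a text-only Python sketch. It's not the graphing script mentioned above, and it draws horizontal bars rather than vertical ones because that's simpler in plain text. The function name `text_bars` and the `scale` parameter are made up; the counts are the first five rows of the width-100 table.

```python
def text_bars(rows, scale=4):
    """Render (label, count) rows as text bars, one '#' per `scale` members."""
    lines = []
    for label, count in rows:
        # Ceiling division, so any non-zero count gets at least one '#'.
        bar = "#" * -(-count // scale)
        lines.append(f"{label:>9} | {bar} {count}")
    return lines


# First five rows of the width-100 table above.
rows = [("<0", 4), ("0", 133), ("1-99", 121), ("100-199", 29), ("200-299", 18)]
lines = text_bars(rows)
print("\n".join(lines))
```

An empty bin would get a bar of length zero, which is exactly the drop to 0 that a line graph tends to hide.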
As with the tables, we can easily see that "zero posts" dominates. We also see a rapid roll-off as the post-count ranges increase.
The areas of the graph with no labels or vertical lines are where the bins were completely empty, i.e. no members had post-counts in that area. The graphed line doesn't drop to 0 in these areas, although it should, to be completely accurate.
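One way to get the line to drop to 0 would be to fill in the empty bins before graphing. Here's a hedged Python sketch of that idea; the function name `fill_empty_bins` is hypothetical and this isn't part of the actual scripts. The sample counts are the 1000-1699 rows of the width-100 table, keyed by each bin's lower bound.

```python
def fill_empty_bins(counts, width):
    """Return (lower bound, count) pairs for every bin, with 0 for missing ones."""
    lows = range(min(counts), max(counts) + width, width)
    return [(low, counts.get(low, 0)) for low in lows]


# Rows 1000-1099 through 1600-1699 of the width-100 table, keyed by lower bound.
sparse = {1000: 2, 1100: 5, 1200: 1, 1500: 1, 1600: 1}
for low, members in fill_empty_bins(sparse, 100):
    print(f"{low}-{low + 99}: {members}")
```

The 1300-1399 and 1400-1499 bins, which are missing from the table, show up here with a count of 0.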
The up-tick at the right end is where the most prolific posters are found. You can see exactly who they are in the Active Posters Highlight in this thread.
Going Crazy
The automated scripts I use to produce the tables and graphs let me easily change the bin-width, and disable the top bin. I first made some of these for testing the scripts, to see how well they worked at the extremes. Some of the results were interesting, so here they are.
First we have two graphs with no top bin, and bin widths of 100 and 500. The "long tail" is plain to see, as are the gaps where no post-counts lie.
View attachment 821081
View attachment 821082
Next, here's the table and graph for a bin width of 10 and no top bin. This graph is extremely compressed toward the left, and the labels and lines drawn for each bin resemble a density map. It's this illegibility that illustrates the reason for using bins. Even with bins of width 10, this is definitely a candidate for an Ugly Graphs post.
Code:
Range of
6-Mo Posts   Members
-----------  -------
<0                 4
0                133
1-9               45
10-19             19
20-29             11
30-39              8
40-49             13
50-59              9
60-69              7
70-79              6
80-89              3
100-109            6
110-119            3
120-129            3
130-139            1
140-149            6
150-159            1
160-169            2
170-179            1
180-189            2
190-199            4
200-209            4
210-219            3
220-229            2
230-239            1
240-249            1
250-259            1
260-269            1
270-279            3
280-289            1
290-299            1
310-319            1
320-329            3
330-339            3
340-349            2
350-359            2
360-369            3
370-379            2
380-389            1
400-409            2
410-419            4
440-449            2
460-469            2
500-509            1
520-529            2
540-549            1
550-559            1
560-569            1
580-589            1
600-609            1
610-619            1
620-629            1
630-639            1
640-649            1
650-659            1
670-679            1
680-689            1
730-739            1
750-759            1
760-769            2
780-789            1
790-799            1
820-829            1
830-839            1
840-849            1
880-889            1
920-929            1
940-949            1
960-969            1
990-999            2
1000-1009          1
1070-1079          1
1110-1119          2
1140-1149          1
1180-1189          1
1190-1199          1
1290-1299          1
1570-1579          1
1620-1629          1
1920-1929          2
2040-2049          1
2090-2099          1
2340-2349          1
2960-2969          1
3010-3019          1
3820-3829          1
4170-4179          1
5900-5909          1
View attachment 821083
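As a cross-check, narrow bins can be merged into wider ones as long as the wide width is a multiple of the narrow one. Here's a small Python sketch of that; the function name `rebin` is made up and this isn't how the actual scripts work, as far as I can say. The input is the 1-9 through 80-89 rows of the width-10 table, keyed by lower bound.

```python
from collections import Counter


def rebin(fine_counts, coarse_width):
    """Merge bins keyed by lower bound into wider bins.

    Only valid when coarse_width is a multiple of the original bin width.
    """
    coarse = Counter()
    for low, members in fine_counts.items():
        coarse[(low // coarse_width) * coarse_width] += members
    return dict(sorted(coarse.items()))


# Width-10 rows from the table above. Key 0 stands for the 1-9 bin,
# since members with exactly 0 posts are counted separately.
fine = {0: 45, 10: 19, 20: 11, 30: 8, 40: 13, 50: 9, 60: 7, 70: 6, 80: 3}
print(rebin(fine, 50))
```

The merged counts come out as 96 and 25, which match the 1-49 and 50-99 rows of the width-50 table.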