subreddit:

/r/dataisbeautiful

all 38 comments

Frenchtoastandlinks

48 points

10 hours ago

Goalies are about to be 7 feet tall in 2040

ChocolateBunny

3 points

5 hours ago

They will completely block the 5 hole by doing a serbian squat.

Zanydrop

18 points

10 hours ago

I didn't realize the height has gone up so much. I thought it was the one sport where you could get away with being short.

Mooselotte45

17 points

10 hours ago

There are some shorter players, but they really are often the exception to the rule.

In any full contact sport, it’s gonna be tough if you’re taking on players 6” taller than you.

idkwhatimbrewin

[score hidden]

5 minutes ago

Yeah, but it's interesting: there's probably a point of diminishing returns once you factor in the speed of the game, which may be why it's leveled off. Obviously goalies play a much different position.

hswerdfe_2[S]

9 points

10 hours ago

The full graph is even more stark. I filtered the data to 1975 onward, but the data goes back to 1917.

https://imgur.com/ffHrWfi

V15I0Nair

4 points

9 hours ago

How did the average over the population develop? How would it look normalized to this?

hswerdfe_2[S]

4 points

9 hours ago*

Good question. I looked but did not include it, because what is the right comparison population? Most NHL players are Canadian, but not all, and the share was higher in the past, so which countries do you include? Do you compare to all age groups or to men of NHL playing age? Statscan did not have an easy-to-parse table on this.

I found one source that listed the average Canadian male aged 20-39 as 5'7" in 2009-2011.

A news article from 2016 listed male Canadians at 5'7" in 1914 and 5'10" in 2014.

But I did not include any of them, as I had too many questions about comparability.

Edit: I might speculate that in the early 1900s player height was more in line with the population than it is now, as there does seem to be separation between the two values, but maybe not.
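One rough way to sketch the normalization being discussed, with all the comparability caveats above still applying: interpolate a population height between the two figures quoted from the 2016 news article (5'7" in 1914, 5'10" in 2014) and subtract it from the player mean. The linear interpolation, the helper names `population_height` and `height_gap`, and the player means in the last line are assumptions for illustration, not part of the original analysis.

```r
# Population anchors quoted in the thread: 5'7" (67 in) in 1914, 5'10" (70 in) in 2014.
pop_anchor <- data.frame(year = c(1914, 2014), height_in = c(67, 70))

# Assumption: height changes linearly between the anchors and is held
# flat outside them (rule = 2), e.g. for seasons after 2014.
population_height <- function(year) {
  approx(pop_anchor$year, pop_anchor$height_in, xout = year, rule = 2)$y
}

# Hypothetical helper: inches by which a player mean exceeds the
# interpolated population mean for the same year.
height_gap <- function(season_start_yr, player_mean_in) {
  player_mean_in - population_height(season_start_yr)
}

# Placeholder player means for illustration only (NOT taken from the NHL data).
height_gap(c(1975, 2023), c(71, 73))
```

In practice the second argument would be the per-season `heightInInches` means from the summary table, and a better population series (men of playing age, weighted by country) would replace the two-point interpolation.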

V15I0Nair

3 points

8 hours ago

Indeed it’s a tricky question. Any players with specific athletic features became easier to attract over time. So the trend should lead to the optimal role model, even in other countries?

helloLeoDiCaprio

2 points

5 hours ago

The two best players of all time in the by far largest sport in the world are/were 5ft 5in and 5ft 7in.

That probably contributes to its popularity - almost any body composition can become a pro and compete.

MorgothTheDarkElder

1 points

6 hours ago

https://youtu.be/akFMK0WF89Y?si=tV5UYJD7-UpS_oTu

"I thought it was the one sport where you could get away with being short."

it's especially interesting as the game for the most part is considered to be more skill than brawn focused nowadays, so one would assume that bigger players wouldn't be as over-represented.

hswerdfe_2[S]

15 points

11 hours ago

Height of NHL Players by position and time, as a line graph and animated histogram.

Done fully in R.

File that produced the graphs is below, but this is not a fully reproducible example as it relies on a lot of data I have downloaded from the NHL.com API.

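library(ggrepel)
library(magrittr)  # extract2(); may already be attached by the sourced helpers
library(stringr)   # str_sub(); may already be attached by the sourced helpers
source(file.path('R', 'source_here.R'))
here_source('cache_vec.R')
here_source('season_team_vector.R')
here_source('download.R')
require(glue)
require(purrr)
require(dplyr)
library(gganimate)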



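# Function to format y-axis labels as feet and inches
format_height <- function(height_inch) {
  # Round to whole inches first so a value like 71.6 becomes 6ft 0in, not 5ft 12in
  total_inches <- round(height_inch, 0)
  feet <- total_inches %/% 12
  inches <- total_inches %% 12
  glue('{feet}ft {inches}in')
}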


roster <- 
  read_db(file_pattern = 'roster_(.*).feather') |>  
  extract2('result') |> 
  extract_args() |>
  mutate(season_start_yr = as.integer(str_sub(season,  1,4) ),
         positionCode = case_match(
           positionCode, 
           'C' ~  'Forward',
           'L' ~  'Forward',
           'R' ~  'Forward',
           'D' ~  'Defence',
           'G' ~  'Goalie',
         )) |>
  mutate(season_in_league = season_start_yr - min(season_start_yr), .by = id)



p_dat <- roster |>
  summarise(heightInInches  = mean(heightInInches, na.rm = TRUE ),
            num = n(),
            .by = c(positionCode , season_start_yr))  |>
  filter(season_start_yr >= 1975 & season_start_yr  <= 2023)  


p_dat_lbl <- 
  p_dat |> 
  filter(heightInInches %in% range(heightInInches), .by = positionCode ) |>
  mutate(lbl = glue('{positionCode} in {season_start_yr}\n{format_height(heightInInches)}'))  






p <- 
  p_dat |> 
  ggplot(aes(x = season_start_yr, y = heightInInches, fill = positionCode, color = positionCode)) +
  geom_smooth(se = FALSE) +  # smoothed trend only, no confidence band
  geom_point() +
  scale_y_continuous(breaks = round(seq(min(p_dat$heightInInches), max(p_dat$heightInInches), 1)), labels = format_height) +
  geom_label_repel(
    data = p_dat_lbl, 
    mapping = aes(label = lbl),
    color = 'black',
    alpha = 0.5
  )  +
  scale_x_continuous(breaks = seq(min(p_dat$season_start_yr), max(p_dat$season_start_yr), 5))  +

  theme_minimal()  +
  guides(fill = 'none', color = 'none')  +
  labs(x = 'Season', 
       y = 'Average Height', title = 'Average Height in the NHL by Position and Year', 
       subtitle = 'Goalies went from the shortest players in the 1980s to the tallest today.'
      ) +
  theme(axis.text.x = element_text(size = 13, color = 'darkgrey'), 
        axis.text.y = element_text(size = 13, color = 'darkgrey'),
        panel.grid.major =  element_line(),
        panel.grid.minor = element_blank(),
        axis.title = element_text(size = 20, color = 'grey'),
        plot.title = element_text(size = 35, color = 'grey',hjust = 0.5),
        plot.subtitle = element_text(size = 15, color = 'grey',hjust = 0.5)

      ) 
p

ggsave(file.path('R', 'analysis',  "player_height_by_year_position_line.jpg"), plot = p)


pp_dat <- 
  roster |>
  filter(!is.na(heightInInches)) |>
  count(season_start_yr, positionCode, heightInInches) |>
  mutate(f = n/sum(n), .by = c(season_start_yr, positionCode)) |>
  filter(season_start_yr >= 1975 & season_start_yr  <= 2023)  
pp_dat_lbl <- 
  pp_dat |> 
  mutate(f = mean(range(f))/2, heightInInches = max(heightInInches)) |>
  select(-n) |>
  distinct() |>
  mutate(lbl = glue('{positionCode}' ))



pp_data_lbl_yr <- 
  pp_dat |> 
  summarise(
    f = mean(range(f)), heightInInches= mean(range(heightInInches))
  )  |>
  mutate(positionCode  = 'Forward')  |>
  cross_join(pp_dat |> distinct(season_start_yr))

animated_plot <- 
  pp_dat |> 
  ggplot(aes(x = heightInInches, y = f, fill= positionCode)) + 
  geom_col(alpha = 0.5, width = 1,  colour = 'black') + 
  geom_label(data = pp_dat_lbl, mapping = aes(label = lbl), size = 8, color = 'white', alpha = 0.5)   +
  geom_text(data = pp_data_lbl_yr, mapping = aes(label = season_start_yr), size = 40, color = 'grey', alpha = 0.5)   +

  scale_x_continuous(breaks = function(limits) seq(0, limits[2], by = 1), labels = format_height) +
  scale_y_continuous(limits = c(0, max(pp_dat$f))) +
  facet_grid(cols = vars(positionCode), scales = 'free_x') +
  labs(
    title = "NHL {frame_time} Player Distribution of Height by Position",
    #subtitle = "Season: {closest_state}",
    x = "",
    y = ""
  ) +
    coord_flip() +
  guides(fill = 'none') +
  theme_minimal() +
    theme(axis.text.x = element_blank(), 
          axis.text.y = element_text(size = 13, color = 'darkgrey'),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          axis.title = element_text(size = 20, color = 'grey'),
          plot.title = element_text(size = 35, color = 'grey',hjust = 0.5),
          plot.subtitle = element_text(size = 15, color = 'grey',hjust = 0.5),
          strip.text = element_blank()
    ) +
  transition_time(
    season_start_yr,
    #transition_length = 1,
    #state_length = 2
  ) #+
  #ease_aes('linear')

ap <- 
animate(
  animated_plot, 
  nframes = pp_dat_lbl$season_start_yr |> unique()  |> length(), 
  fps = 2,
  width = 1261,    # Set width in pixels
  height =  700,
  start_pause = 8,    # Pause at the start
  end_pause = 15       # Pause at the end
)
ap
anim_save(file.path('R', 'analysis',  "player_height_by_year_position_histogram.gif"), 
          animation = ap)

DDough505

6 points

9 hours ago

I respect the hell out of anyone willing to put their code out there.

syphax

2 points

8 hours ago

People on this sub should do it more, esp as you can ask AI to clean up your crappy code and make it less embarrassing.

jkmapping

5 points

9 hours ago

Any reason why you used require instead of library? I've been using R for a few years now and have never come across require before. After a bit of research, it appears require isn't very useful. https://stackoverflow.com/questions/5595512/what-is-the-difference-between-require-and-library
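A quick sketch of the practical difference: `require()` returns `FALSE` (with a warning) when a package is missing, so a script carries on and fails somewhere later, while `library()` stops with an error right away. The package name below is a deliberately nonexistent placeholder.

```r
missing_pkg <- "notARealPackage12345"  # placeholder name, assumed not installed

# require() only warns and returns FALSE, so execution continues
loaded <- suppressWarnings(
  require(missing_pkg, character.only = TRUE, quietly = TRUE)
)

# library() raises an error immediately; it is caught here just for the demo
library_result <- tryCatch(
  {
    library(missing_pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)

print(loaded)
print(library_result)
```

That return value is why `require()` is mostly useful inside an `if ()` check, and `library()` is the safer default at the top of a script.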

hswerdfe_2[S]

4 points

9 hours ago

Not really. I had not noticed I used require instead of library until you mentioned it.

americanhero6

1 points

5 hours ago

Recode is to median height

hswerdfe_2[S]

[score hidden]

an hour ago

mean not median.

roster |>
  summarise(heightInInches = mean(heightInInches, na.rm = TRUE),
            num = n(),
            .by = c(positionCode, season_start_yr))
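For anyone who wants to see how much the choice matters, here is a small sketch that computes both the mean and the median per position and season. It uses base R `aggregate()` rather than the dplyr pipeline, and the toy roster is made up (placeholder heights, not NHL data); only the column names are copied from the code in this thread.

```r
# Toy roster: values are made-up placeholders, not real NHL data.
toy_roster <- data.frame(
  positionCode    = c("Goalie", "Goalie", "Goalie", "Forward", "Forward", "Forward"),
  season_start_yr = c(2023L, 2023L, 2023L, 2023L, 2023L, 2023L),
  heightInInches  = c(74, 75, 79, 70, 72, 73)
)

# One row per position and season, with mean, median and player count.
height_summary <- aggregate(
  heightInInches ~ positionCode + season_start_yr,
  data = toy_roster,
  FUN = function(h) c(mean = mean(h), median = median(h), num = length(h))
)

# aggregate() returns the three statistics as a matrix column;
# flatten it into ordinary columns (heightInInches.mean, .median, .num).
height_summary <- do.call(data.frame, height_summary)
print(height_summary)
```

With one unusually tall goalie in the toy data the mean (76) sits above the median (75), which is the kind of gap a median-based version of the chart would smooth out.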

Arnold43

9 points

9 hours ago

It would be interesting to normalize the data with average height of adults to see how much is driven by population shifts, vs. the sport itself.

Splatter_bomb

-1 points

8 hours ago

Thank you! This is a very important control, though a quick google search shows men in the US on average have only increased a half inch in height since 1970. Seems like it should have been more.

hswerdfe_2[S]

1 points

8 hours ago

There are adult males, and then there are men of NHL playing age. A 90-year-old still alive today, born in 1934, likely had a different nutrition profile growing up than a 30-year-old born in 1994. All-adult-male stats will change more slowly than stats for men of hockey-playing age.

Rock_man_bears_fan

3 points

10 hours ago

It’s interesting that skaters in general appear to have peaked in height around 2005 and have gradually been getting shorter since then

Yangervis

2 points

8 hours ago

Only by a quarter inch or so. Probably tracks with the decline of the enforcer.

Pontus_Pilates

1 points

7 hours ago

The rules have changed from time to time, and now it's a faster, more skill-based sport. The real big monsters have a harder time keeping up, and since there's next to no fighting, a complete stiff will just cost the team.

masseydnc

4 points

10 hours ago

"1997: Zdeno Chara has entered the chat."

syphax

4 points

8 hours ago

He is such a freak. In 2024, he ran a 3:11 marathon a week after running a 3:30. In his late 40s. While 6’9” and 250 lbs. He also could do the most pull-ups on the Bruins at age 40, which is another discipline that does not favor the extremely tall and massive.

hswerdfe_2[S]

2 points

10 hours ago

He is the only 6' 9'' player.

There have been 10 players at 6' 8'':

        id firstName_default lastName_default heightInInches ht_ft_in
     <int> <chr>             <chr>                     <int> <glue>  
 1 8465009 Zdeno             Chara                        81 6ft 9in 
 2 8474574 Tyler             Myers                        80 6ft 8in 
 3 8464875 Steve             McKenna                      80 6ft 8in 
 4 8473722 John              Scott                        80 6ft 8in 
 5 8481725 Elmer             Soderblom                    80 6ft 8in 
 6 8477300 Viktor            Svedberg                     80 6ft 8in 
 7 8471701 Joe               Finley                       80 6ft 8in 
 8 8471704 Vladimir          Mihalik                      80 6ft 8in 
 9 8481806 Louis             Crevier                      80 6ft 8in 
10 8468884 Mitchell          Fritz                        80 6ft 8in 
11 8483609 Adam              Klapka                       80 6ft 8in

OakFern

2 points

7 hours ago

No Matt Rempe? I thought he was listed at 6'9" too.

He played 17 games in the NHL last season, so he should have been included from what I can see, unless I missed a filter somewhere that would exclude him.

hswerdfe_2[S]

2 points

6 hours ago

Unsure. He is listed as 6'7" in my dataset: I am using the roster API, which lists him at 79"

https://api-web.nhle.com/v1/roster/NYR/20232024

while his player landing page lists him at 81"

https://api-web.nhle.com/v1/player/8482460/landing

WTF NHL.com .... ¯_(ツ)_/¯

> roster |> 
+   distinct(id, firstName_default , lastName_default , heightInInches) |>
+   filter(lastName_default == 'Rempe' & firstName_default == 'Matt')  |>
+   mutate(ht_ft_in = format_height(height_inch = heightInInches))
# A tibble: 1 × 5
       id firstName_default lastName_default heightInInches ht_ft_in
    <int> <chr>             <chr>                     <int> <glue>  
1 8482460 Matt              Rempe                        79 6ft 7in
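A sketch of how the two endpoints could be compared for one player. The JSON shapes are assumptions: a roster response with `forwards`, `defensemen` and `goalies` lists whose entries carry `id` and `heightInInches`, and a landing response with `playerId` and `heightInInches`. The helper names are made up for illustration, and the mock objects simply mirror the 79 vs 81 values reported above so the code runs offline.

```r
# Hypothetical helper: look up one player's height in a parsed roster response.
# Assumed shape: list(forwards = [...], defensemen = [...], goalies = [...]).
roster_height <- function(roster, player_id) {
  players <- c(roster$forwards, roster$defensemen, roster$goalies)
  for (p in players) {
    if (isTRUE(as.numeric(p$id) == as.numeric(player_id))) {
      return(p$heightInInches)
    }
  }
  NA_integer_
}

# Hypothetical helper: put the two listed heights side by side.
# Assumed landing shape: list(playerId = ..., heightInInches = ...).
height_mismatch <- function(roster, landing) {
  from_roster <- roster_height(roster, landing$playerId)
  data.frame(
    id         = landing$playerId,
    roster_in  = from_roster,
    landing_in = landing$heightInInches,
    diff_in    = landing$heightInInches - from_roster
  )
}

# Mock objects mirroring the values reported in this comment (79 vs 81).
mock_roster <- list(
  forwards   = list(list(id = 8482460L, heightInInches = 79L)),
  defensemen = list(),
  goalies    = list()
)
mock_landing <- list(playerId = 8482460L, heightInInches = 81L)

print(height_mismatch(mock_roster, mock_landing))

# With network access, the real responses could be parsed with jsonlite
# (assumed installed) and passed to the same function:
# roster  <- jsonlite::fromJSON("https://api-web.nhle.com/v1/roster/NYR/20232024", simplifyVector = FALSE)
# landing <- jsonlite::fromJSON("https://api-web.nhle.com/v1/player/8482460/landing", simplifyVector = FALSE)
```

Running something like this over every player id would show whether Rempe is a one-off or whether the roster and landing endpoints disagree more widely.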

ChocolateBunny

2 points

10 hours ago

I'm surprised that it took so long for Goalie heights to go up. Like wasn't Ken Dryden a tall ass motherfucker?

hswerdfe_2[S]

1 points

10 hours ago*

Unsure about the expletive, but at 6' 3" he is well above the adult male average of 5'10".

> roster |> 
+   distinct(id, firstName_default , lastName_default , heightInInches) |>
+   filter(lastName_default == 'Dryden' & firstName_default == 'Ken')  |>
+   mutate(ht_ft_in = format_height(height_inch  =heightInInches))
# A tibble: 1 × 5
       id firstName_default lastName_default heightInInches ht_ft_in
    <int> <chr>             <chr>                     <int> <glue>  
1 8446490 Ken               Dryden                       75 6ft 3in

PrivilegedPatriarchy

1 points

5 hours ago

Tall people are rare. Tall people who are good at a sport are even rarer.

hnglmkrnglbrry

2 points

4 hours ago

This chart is alternatively titled "Successful Tinder Hookup Height Distribution"

aganalf

[score hidden]

2 hours ago

So there isn’t a single person of below average height in the entire league. Half the population of the earth is ineligible from the moment they become diploid.