[Test] How do you determine how many names are displayed for top names of the country?
So for the USA top names, the most common 1000 are displayed. Cool, that makes sense because the SSA posts an easily accessible version of the top 1000 names. I know more are available if you download the actual baby name files for each year, but I'm assuming it's a waste of space to include more than the top 1000. But for Canada it only displays the top 100. Why? From 1991 to 2023 StatsCan has every name and its ranking available. Why not use the top 1000 for Canada as well? I'm a data nerd (quite literally a stats major), so the more data easily accessible on this site, the happier I am. Also, I'm sure Canada isn't the only country that has more available; I'm Canadian, so it's just what I know about.
Also, if it's a resources/staff thing, I would literally happily add Canada's or any other country's additional rankings in my spare time.
Replies
First of all, I was not aware of the StatsCan release -- it looks like it was recently put together.
In most cases the number of names is limited because that is the extent of the information available. It is common for countries to provide only a top 100 list, though more are releasing exhaustive lists these days. Sometimes a country provides more names for recent years than it does for older years. This is a problem for this site's software because it expects all years in a series to have the same number of names. If I want to include the older years, then I have to limit every year to the size of the shortest list.
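To illustrate the "limit to the shortest list" constraint, here is a minimal sketch. It is not the site's actual code; the function name `trim_to_common_depth`, the dict layout, and the placeholder names are all assumptions made for the example.

```python
def trim_to_common_depth(series):
    """Cut every year's ranked list down to the shortest list in the series.

    `series` is a hypothetical structure: a dict mapping a year to a list
    of names ordered by rank (index 0 is rank 1).
    """
    if not series:
        return {}
    # The year with the fewest published names sets the depth for all years.
    depth = min(len(names) for names in series.values())
    return {year: names[:depth] for year, names in series.items()}


# Placeholder data only: an older year with 2 names, a recent year with 4.
example = {
    1920: ["name_a", "name_b"],
    1991: ["name_c", "name_d", "name_e", "name_f"],
}
trimmed = trim_to_common_depth(example)
print({year: len(names) for year, names in trimmed.items()})
```

Under these assumptions, keeping the older year in the series means the recent year loses everything below the older year's depth.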
In other cases I trim a list to the top 100 because the country has a small population and the lowest names on the list do not actually have many bearers.
There are also technical considerations. Increasing a list from, say, a top 100 to a top 1000 over 100 years adds 90,000 rows of data to the lookup table, which could affect search times.
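The 90,000 figure can be checked with a quick calculation. This sketch assumes one row per name per year in a single list; the function name `added_rows` is made up for illustration, and the total would scale up if separate lists were stored (for example, one per gender).

```python
def added_rows(old_depth, new_depth, years):
    """Extra rows needed when each yearly list grows from old_depth to new_depth."""
    return (new_depth - old_depth) * years


# Going from a top 100 to a top 1000 over 100 years:
extra = added_rows(100, 1000, 100)
print(extra)  # 90000
```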
Looking at the Canadian data in particular, a choice would have to be made... Would we want a top 1000 (actually, more likely a top 500) from 1991, or a top 100 from 1920?
Thank you so much for responding! In my opinion, it would be better to keep the name lists the way they are. The top 100 from 1920 onwards is more informative than a top 500 for only the last 30 years.
It depends on how many names the source used for each country provides. If you'd like to find more, you can look for sources with more data and send them to Mike. If the source allows this site to use the data, then he'll add it to the site.