Using Python pandas in a Jupyter notebook, I have the following CSV files. I am trying to produce the result described below, but I can't work out how to get the correct output:
sets.csv with the following columns:
set_num name year theme_id num_parts
and themes.csv with the following columns:
id name parent_id
Create a function called theme_by_year that takes as input a year (as an integer) and shows the theme ids and theme names (listed in order by theme id) that were in sets that year.
The column names must be id and name_themes (to differentiate between the name of a theme and the name of a set) in that order.
The index should be reset and go from 0 to n-1.
Each theme should only be listed once even if it appeared in more than one set from that year -- duplicate themes should be based on theme id and not name since there are some themes with the same name but with a different id.
Hint: It will help if you were to think about merging appropriate DataFrames to help you get this answer.
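The hint above points to a merge. Here is a minimal sketch of that approach, assuming `sets` and `themes` are DataFrames loaded from sets.csv and themes.csv. The tiny frames below are made-up sample rows for illustration only, not the real data, and the suffix names are just one way to end up with a `name_themes` column:

```python
import pandas as pd

# Hypothetical sample rows standing in for sets.csv and themes.csv.
sets = pd.DataFrame({
    "set_num": ["a-1", "b-1", "c-1", "d-1"],
    "name": ["Set A", "Set B", "Set C", "Set D"],
    "year": [1960, 1960, 1960, 1961],
    "theme_id": [513, 371, 513, 497],
    "num_parts": [10, 20, 30, 40],
})
themes = pd.DataFrame({
    "id": [371, 497, 513],
    "name": ["Supplemental", "Books", "Classic"],
    "parent_id": [None, None, None],
})


def theme_by_year(yr):
    # Keep only the sets released in the requested year.
    in_year = sets[sets["year"] == yr]
    # Join each set to its theme; the suffixes separate the two 'name' columns.
    merged = in_year.merge(
        themes, left_on="theme_id", right_on="id", suffixes=("_sets", "_themes")
    )
    # One row per theme id, sorted by id, with a fresh 0..n-1 index.
    return (
        merged[["id", "name_themes"]]
        .drop_duplicates(subset="id")
        .sort_values("id")
        .reset_index(drop=True)
    )


result = theme_by_year(1960)
print(result)
```

With the sample rows, theme 513 appears in two 1960 sets but is listed once, because duplicates are dropped on `id` rather than on the name.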
In [153]:
### The code ###
def theme_by_year(yr):
    new = sets[sets['year']==yr]
    new = list(set(list(sets['theme_id'])))
    name = []
    new.sort()
    for i in new:
        st = str(themes[themes['id']==i]['name'])
        st = st[1:]
        st= st.strip("\nName: name, dtype: object")
        name.append(st)
    df = pd.DataFrame(list(zip(new,name)),index=list(range(len(new))),columns=['id','name_themes'])
    return df
Code Check: Call the theme_by_year() function on the year with the lowest number of total sets, which was found to be 1960. Your output should look as follows:
    id   name_themes
0  371  Supplemental
1  497         Books
2  513       Classic
### Here is the result I get ###
Q14 = theme_by_year(1960)
Q14
Out[154]:
id name_themes
0 1 Techni
1 3 Competiti
2 4 Expert Builder
3 16 RoboRiders
4 17 Speed Slammers
... ... ...
431 715 39 Marvel
432 716 40 Modulex
433 717 41 Speed Racer
434 718 42 Series 22 Minifigures
435 719 43 BrickLink Designer Progr
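Two things in the function seem to explain this output, offered as a likely diagnosis rather than a confirmed one. First, the second line of the function rebuilds the id list from the full `sets` frame instead of the filtered `new`, so every theme comes back regardless of year. Second, `str.strip` with a string argument removes any of those characters from both ends rather than a literal suffix, which would account for names such as Technic losing trailing letters. A small sketch of the second point, using a one-row Series as a stand-in for the lookup result:

```python
import pandas as pd

# Stand-in for themes[themes['id'] == i]['name'] when exactly one row matches.
# dtype=object is forced so the printed footer reads "dtype: object".
lookup = pd.Series(["Technic"], name="name", dtype=object)

# Approach from the question: stringify the whole Series, then strip characters.
text = str(lookup)[1:]
stripped = text.strip("\nName: name, dtype: object")
# The trailing 'c' is removed because 'c' is one of the characters to strip.
print(repr(stripped))

# Pulling the scalar out directly avoids parsing the printed form altogether.
value = lookup.iloc[0]
print(repr(value))
```

The same slicing also leaves part of the row label in front of the name once the label has more than one digit, which would match the stray numbers in the last rows of the output.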