Batch-fetching IMDb IDs for movies with IMDbPY

The operating system is Windows 10, with a conda virtual environment already set up.

1. Install the IMDbPY package

pip install imdbpy

2. Create the movie list file

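Each line has the format "English title release year". Keep entries short and clear, and strip unnecessary characters and blank lines so that queries do not fail. An example list: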

American Psycho 2000
Brokeback Mountain 2005
Bruce Almighty 2003
Chicken Run 2000
Constantine 2005
Downfall aka Der Untergang 2004
eight and a half 1963
Face Off 1997
La Femme Nikita 1990
Liar Liar 1997
Little Nicholas 2009
Making Waves The Art of Cinematic Sound 2019
One Flew Over the Cuckoos Nest 1975
paris je t aime 2006
Pearl Harbor
Shrek 2 2004
Shrek Forever After 2010
Shrek the Third 2007
The China Syndrome 1979
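
The cleanup described above can also be scripted. The following is a minimal sketch, not part of the original workflow: the helper name clean_movie_lines is hypothetical, and it assumes that "unnecessary characters" means punctuation and extra whitespace.

```python
import re

def clean_movie_lines(lines):
    """Return tidy 'Title Year' strings, dropping blank lines."""
    cleaned = []
    for line in lines:
        # Replace anything that is not a letter, digit, underscore,
        # or whitespace with a space.
        text = re.sub(r"[^\w\s]", " ", line)
        # Underscores count as word characters, so handle them separately.
        text = text.replace("_", " ")
        # Collapse runs of whitespace and trim both ends.
        text = " ".join(text.split())
        if text:
            cleaned.append(text)
    return cleaned

# Made-up raw entries, only to show the effect of the cleanup.
raw = ["  Shrek 2 (2004)\n", "\n", "Face/Off [1997]\n"]
print(clean_movie_lines(raw))  # ['Shrek 2 2004', 'Face Off 1997']
```

Punctuation is replaced by a space rather than deleted, so "Face/Off" becomes "Face Off", which matches the style of the list above.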

3. Run the Python code

Save the list as a text file, for example movielist.txt, then run the code in any Python environment. I usually prefer to run it in Jupyter Notebook.

# Import the imdb module
from imdb import IMDb

# Create the client once and reuse it for every lookup
ia = IMDb()

# Return the IMDb ID of the first search hit, or None if nothing is found
def get_movie_imdb_id(movie_title):
    movies = ia.search_movie(movie_title)
    if movies:
        return movies[0].movieID
    return None

# Read the movie list line by line; the file path is D:\movielist.txt
filmlist = r"D:\movielist.txt"
with open(filmlist, 'r', encoding='utf-8') as f:
    movie_names = f.readlines()
for movie_name in movie_names:
    movie_name = movie_name.strip()
    # Skip blank lines so that no empty query is sent
    if not movie_name:
        continue
    imdb_id = get_movie_imdb_id(movie_name)
    # If no ID was found, print "movie name ,id_not_found"
    if imdb_id is None:
        print(movie_name, ",id_not_found")
    else:
        # If the movie was found, print "movie name,ttXXXX"
        print(movie_name + ",tt" + imdb_id)
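
The code above always takes the first search hit. Since most lines end with a release year, one possible refinement is to prefer the hit whose year matches. The sketch below is only an illustration. The helper names split_title_year and pick_by_year are hypothetical, and it assumes each search result exposes its year through .get("year") as an integer, which should be checked against the installed IMDbPY version. Plain dictionaries with made-up values stand in for search results so the block runs without network access.

```python
import re

def split_title_year(line):
    """Split 'Title 2004' into ('Title', 2004); year is None if absent."""
    line = line.strip()
    match = re.match(r"^(.*?)\s+((?:19|20)\d{2})$", line)
    if match:
        return match.group(1), int(match.group(2))
    return line, None

def pick_by_year(candidates, year):
    """Return the first candidate whose 'year' matches, else the first one."""
    if not candidates:
        return None
    if year is not None:
        for candidate in candidates:
            # Assumption: results behave like mappings with a 'year' key.
            if candidate.get("year") == year:
                return candidate
    return candidates[0]

# Hypothetical stand-ins for search results (not real IMDb data).
hits = [{"title": "Example", "year": 1990}, {"title": "Example", "year": 2005}]
title, year = split_title_year("Example 2005")
print(title, year, pick_by_year(hits, year))
```

In the real loop, the candidates would come from ia.search_movie(movie_name), and the chosen result's movieID would be printed as before.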

Running the code on the list above produces:

American Psycho 2000,tt0144084
Brokeback Mountain 2005,tt0388795
Bruce Almighty 2003,tt0315327
Chicken Run 2000,tt0120630
Constantine 2005,tt0360486
Downfall aka Der Untergang 2004 ,id_not_found
eight and a half 1963,tt0369179
Face Off 1997,tt0119094
La Femme Nikita 1990,tt0118379
Liar Liar 1997,tt0119528
Little Nicholas 2009,tt1264904
Making Waves The Art of Cinematic Sound 2019,tt3856408
One Flew Over the Cuckoos Nest 1975,tt0073486
paris je t aime 2006,tt0401711
Pearl Harbor,tt0213149
Shrek 2 2004,tt0298148
Shrek Forever After 2010,tt0892791
Shrek the Third 2007,tt0413267
The China Syndrome 1979,tt0078966

4. Handling a small problem

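Downfall aka Der Untergang 2004 (Chinese title: 帝国的毁灭) returned no result. Der Untergang is the film's German title, and mixing the English and German titles in one query prevents a match. Simplifying the entry to Downfall 2004 and running the code again returns the ID as expected: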

Downfall  2004,tt0363163
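
The manual simplification above could also be automated as a fallback. This is a rough sketch under stated assumptions: the helpers fallback_queries and first_hit are hypothetical names, and the only pattern handled is two titles joined by "aka", optionally followed by a four-digit year.

```python
import re

def fallback_queries(line):
    """Return the original query, then simpler variants without the 'aka' part."""
    line = " ".join(line.split())
    queries = [line]
    match = re.match(r"^(.*?)\s+aka\s+(.*?)(?:\s+(\d{4}))?$", line, flags=re.IGNORECASE)
    if match:
        first, second, year = match.groups()
        suffix = f" {year}" if year else ""
        queries.append(first + suffix)
        queries.append(second + suffix)
    return queries

def first_hit(queries, lookup):
    """Return (query, result) for the first query whose lookup is not None."""
    for query in queries:
        result = lookup(query)
        if result is not None:
            return query, result
    return None, None

print(fallback_queries("Downfall aka Der Untergang 2004"))
# ['Downfall aka Der Untergang 2004', 'Downfall 2004', 'Der Untergang 2004']
```

Passing get_movie_imdb_id from step 3 as the lookup argument of first_hit would try each variant in turn and stop at the first one that returns an ID.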

Title: Batch-fetching IMDb IDs for movies with IMDbPY
Author: MeGusta
URL: https://www.oakdb.cn/articles/2023/11/03/1699016896091.html