GPT2 を動かしてみたかったけど TensorFlow で躓いて断念した
gpt2-japanese という GitHub リポジトリを見つけたので、GPT2 による日本語文章生成を試せるかと思いやってみたところ、Python ちからと TensorFlow ちからがなくて断念した話。
- tanreinama/gpt2-japanese
- 参考 : gpt2-japaneseのmediumモデル による日本語テキスト生成|npaka|note
- 参考 : OpenAIのGPT2で人工知能が記事を生成するメディアを作ってみた - Qiita
- 参考 : GPT-3でプログラマ向け箴言・名言出力サイトを作りました - Qiita
まっさらな WSL Ubuntu でやってみることにした。
# プロジェクトを取得する
$ git clone https://github.com/tanreinama/gpt2-japanese
$ cd ./gpt2-japanese/
# モデルファイルは 1.2GB くらいある
$ wget https://www.nama.ne.jp/models/gpt2ja-medium.tar.bz2
$ tar xvfj gpt2ja-medium.tar.bz2
# 試しにコマンドを叩いてみる
$ python3 gpt2-generate.py
Traceback (most recent call last):
File "gpt2-generate.py", line 3, in <module>
import numpy as np
ModuleNotFoundError: No module named 'numpy'
numpy
がなくてエラーだって。pip
とかでインストールせなあかんな。
てか WSL Ubuntu って何の Python が入ってんだ?
$ python -V
-bash: /usr/bin/python3.7: そのようなファイルやディレクトリはありません
$ type python
python は `/usr/bin/python3.7' のエイリアスです
$ python3.7 -V
コマンド 'python3.7' が見つかりません。もしかして:
command 'python3.8' from deb python3.8 (3.8.5-1~20.04)
command 'python2.7' from deb python2.7 (2.7.18-1~20.04)
command 'python3.9' from deb python3.9 (3.9.0-5~20.04)
次を試してみてください: sudo apt install <deb name>
$ type python3.7
-bash: type: python3.7: 見つかりません
何やコレ?
$ python3 -V
Python 3.8.5
$ python3.8 -V
Python 3.8.5
$ type python3.8
python3.8 はハッシュされています (/usr/bin/python3.8)
$ ls -l /usr/bin/ | grep python
lrwxrwxrwx 1 root root 23 2020-07-28 21:59 pdb3.8 -> ../lib/python3.8/pdb.py*
lrwxrwxrwx 1 root root 31 2020-03-13 21:20 py3versions -> ../share/python3/py3versions.py*
lrwxrwxrwx 1 root root 9 2020-03-13 21:20 python3 -> python3.8*
lrwxrwxrwx 1 root root 16 2020-03-13 21:20 python3-config -> python3.8-config*
-rwxr-xr-x 1 root root 5486352 2020-07-28 21:59 python3.8*
lrwxrwxrwx 1 root root 33 2020-07-28 21:59 python3.8-config -> x86_64-linux-gnu-python3.8-config*
lrwxrwxrwx 1 root root 33 2020-03-13 21:20 x86_64-linux-gnu-python3-config -> x86_64-linux-gnu-python3.8-config*
-rwxr-xr-x 1 root root 3240 2020-07-28 21:59 x86_64-linux-gnu-python3.8-config*
python
コマンドは存在しないのに、python3.7
コマンドを探しに行こうとしていて、python3.7
コマンドはなくて python3.8
が入ってる。
ところでプロジェクト内の requirements.txt
を見ると
tensorflow-gpu==1.15.4
と書いてある。コレよくよく調べてみたら Python 3.5 ~ 3.7 にしか対応してないやんけ。
ほんなら pipenv
入れて Python3.7 環境を作るか。
$ sudo apt install -y python3-pip
$ pip install pipenv
$ pipenv --python 3.7
Warning: Python 3.7 was not found on your system...
Neither 'pyenv' nor 'asdf' could be found to install Python.
You can specify specific versions of Python with:
$ pipenv --python path/to/python
pyenv か asdf 入れろって。めんどくせーなー。テキトーに pyenv 入れるか
$ git clone https://github.com/pyenv/pyenv.git ~/.pyenv
$ vi ~/.bashrc
# Python : pyenv (For pipenv)
export PYENV_ROOT="${HOME}/.pyenv"
export PATH="${PYENV_ROOT}/bin:${PATH}"
eval "$(pyenv init -)"
# Python : pipenv
export PIPENV_VENV_IN_PROJECT='true'
$ pyenv -v
pyenv 1.2.21-1-g943015eb
pyenv 準備した。Python3.7 環境に切り替えてみよう。
$ pipenv --python 3.7
Warning: Python 3.7 was not found on your system...
Would you like us to install CPython 3.7.9 with Pyenv? [Y/n]: Y
Installing CPython 3.7.9 with /home/neo/.pyenv/bin/pyenv (this may take a few minutes)...
✘ Failed...
Something went wrong...
Downloading Python-3.7.9.tar.xz...
-> https://www.python.org/ftp/python/3.7.9/Python-3.7.9.tar.xz
Installing Python-3.7.9...
WARNING: The Python bz2 extension was not compiled. Missing the bzip2 lib?
WARNING: The Python readline extension was not compiled. Missing the GNU readline lib?
ERROR: The Python ssl extension was not compiled. Missing the OpenSSL lib?
Please consult to the Wiki page to fix the problem.
https://github.com/pyenv/pyenv/wiki/Common-build-problems
BUILD FAILED (Ubuntu 20.04 using python-build 1.2.21-1-g943015eb)
Inspect or clean up the working tree at /tmp/python-build.20201126201713.2142
Results logged to /tmp/python-build.20201126201713.2142.log
Last 10 log lines:
fi
Looking in links: /tmp/tmp4cg8hzxq
Processing /tmp/tmp4cg8hzxq/setuptools-47.1.0-py3-none-any.whl
Processing /tmp/tmp4cg8hzxq/pip-20.1.1-py2.py3-none-any.whl
Installing collected packages: setuptools, pip
WARNING: The script easy_install-3.7 is installed in '/home/neo/.pyenv/versions/3.7.9/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The scripts pip3 and pip3.7 are installed in '/home/neo/.pyenv/versions/3.7.9/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-20.1.1 setuptools-47.1.0
Warning: The Python you just installed is not available on your PATH, apparently.
5分くらい待たされてエラーで終わった。Wiki を見てみよう。
以下を入れろと。
# sudo apt-get install -y build-essential libssl-dev zlib1g-dev libbz2-dev \
libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \
xz-utils tk-dev libffi-dev liblzma-dev python-openssl git
# 再トライ
$ pipenv --python 3.7
Warning: Python 3.7 was not found on your system...
Would you like us to install CPython 3.7.9 with Pyenv? [Y/n]: Y
Installing CPython 3.7.9 with /home/neo/.pyenv/bin/pyenv (this may take a few minutes)...
✔ Success!
Downloading Python-3.7.9.tar.xz...
-> https://www.python.org/ftp/python/3.7.9/Python-3.7.9.tar.xz
Installing Python-3.7.9...
Installed Python-3.7.9 to /home/neo/.pyenv/versions/3.7.9
Virtualenv already exists!
Removing existing virtualenv...
Creating a virtualenv for this project...
Pipfile: /home/neo/Documents/Dev/Sandboxes/gpt2-japanese/Pipfile
Using /home/neo/.pyenv/versions/3.7.9/bin/python3.7m (3.7.9) to create virtualenv...
⠏ Creating virtual environment...created virtual environment CPython3.7.9.final.0-64 in 614ms
creator CPython3Posix(dest=/home/neo/Documents/Dev/Sandboxes/gpt2-japanese/.venv, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/neo/.local/share/virtualenv)
added seed packages: pip==20.2.4, setuptools==50.3.2, wheel==0.35.1
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
✔ Successfully created virtual environment!
Virtualenv location: /home/neo/Documents/Dev/Sandboxes/gpt2-japanese/.venv
requirements.txt found, instead of Pipfile! Converting...
✔ Success!
Warning: Your Pipfile now contains pinned versions, if your requirements.txt did.
We recommend updating your Pipfile to specify the "*" version, instead.
# シェルを切り替える
$ pipenv shell
$ python -V
bash: /usr/bin/python3.7: そのようなファイルやディレクトリはありません
$ python3 -V
Python 3.7.9
python
コマンドのエラーは意味分からんけど python3
コマンドは v3.7 になったな。
# pipenv を使って requirements.txt` をインストールする
$ pipenv install -r requirements.txt
# 動かしてみる
$ python3 gpt2-generate.py
WARNING:tensorflow:From /home/neo/Documents/Dev/Sandboxes/gpt2-japanese/model.py:147: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.
WARNING:tensorflow:From gpt2-generate.py:88: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From gpt2-generate.py:92: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2020-11-27 13:23:47.385197: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-27 13:23:47.393555: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2903995000 Hz
2020-11-27 13:23:47.394397: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x557a2dacd890 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-27 13:23:47.394489: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-27 13:23:47.396619: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-11-27 13:23:47.396679: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2020-11-27 13:23:47.396734: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (Neos-ZenBook): /proc/driver/nvidia/version does not exist
2020-11-27 13:23:47.396784: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Not found: no CUDA devices found
中止
tensorflow-gpu
がダメっぽいな。
なんか調べてたら TensorFlow は v1 系はもう古いらしく、v2 系は全く勝手が違うらしい。Python のこういうところ嫌いだわームカつくし諦めよ。
あと python
コマンドが変なので v3.8 にシンボリックリンク貼って黙らせた。なんやねんマジでクソ Python が。
$ sudo ln -s /usr/bin/python3.8 /usr/bin/python
$ sudo ln -s /usr/bin/python3.8 /usr/bin/python3.7
こういうところ Node.js みたいにちゃんとできるようになったら出直してね