Trying Vector Search with Oracle Database 23ai Free: Preparing the ONNX Model
Getting Started with Oracle Database 23ai AI Vector Search
https://blogs.oracle.com/database/post/getting-started-with-oracle-database-23ai-ai-vector-search
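After reading the blog post above, I figured this could also become material for my "an execution plan is an X-ray of a SQL statement" series, so I jumped straight into the preparation, lol. Today's post covers that preparation (with a model that does not support Japanese).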
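The environment is as follows.
I am using the prebuilt VM for VirtualBox (on an Intel Mac, not Arm64).
I do plan to write about Arm64 eventually, lol, but there I could not do the comparison posts along the lines of "here is what happens on a reaaally old Oracle release", so for now I am on x86_64. (My native Arm64 environment is still half-built anyway, lol.)
I am building an environment that can be enjoyed offline rather than in the cloud (it is 23ai Free, so it is a playground that stays within the resource limits).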
*** mac info. ***
ProductName: macOS
ProductVersion: 12.7.6
BuildVersion: 21H1320
*** macOS ver. ***
Model Name: MacBook
Processor Name: Dual-Core Intel Core m5
Processor Speed: 1.2 GHz
Number of Processors: 1
Total Number of Cores: 2
Memory: 8 GB
*** VirtualBox ver. ***
7.0.10r158379
[oracle@localhost ~]$ cat /etc/oracle-release /etc/redhat-release
Oracle Linux Server release 8.9
Red Hat Enterprise Linux release 8.9 (Ootpa)
[oracle@localhost ~]$ uname -srpo
Linux 5.15.0-3.60.5.1.el8uek.x86_64 x86_64 GNU/Linux
SCOTT@localhost:1521/freepdb1> select banner_full from v$version;
BANNER_FULL
--------------------------------------------------------------------------------
Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free
Version 23.4.0.24.05
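I prepare the ONNX model by following the steps in this document:
12.2 Convert Pretrained Models to ONNX Model: End-to-End Instructions
https://docs.oracle.com/cd/G28130_01/2-23ai/mlpug/convert-pretrained-models-onnx-model-end-end-instructions.html
I built the environment while only skimming the documentation, so at one point I installed a different OML4Py release and had to redo a step. Those logs are included here as well, which may make this hard to read. My apologies m(_ _)m
Checking the required versions before you start should make things go more smoothly... (look who's talking, lol)
An error right out of the gate! What is this? (Is this going to go smoothly at all? lol)
Installing Python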
[oracle@localhost ~]$ sudo yum install libffi-devel openssl openssl-devel tk-devel xz-devel zlib-devel bzip2-devel readline-devel libuuid-devel ncurses-devel libaio
[sudo] oracle のパスワード:
Oracle Linux 8 BaseOS Latest (x86_64) 0.0 B/s | 0 B 08:00
Errors during downloading metadata for repository 'ol8_baseos_latest':
- Curl error (28): Timeout was reached for https://yum-us-phoenix-1.oracle.com/repo/OracleLinux/OL8/baseos/latest/x86_64/repodata/repomd.xml
[Connection timed out after 120000 milliseconds]
エラー: repo 'ol8_baseos_latest' のメタデータのダウンロードに失敗しました : Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
I found this. It is probably the cause.
yum update not working using Oracle Database 23ai Free Container Image Lite #2900
https://github.com/oracle/docker-images/issues/2900
No doubt about it! This is the cause.
I applied the workaround exactly as described there...
[oracle@localhost ~]$ cd /etc/yum/vars/
[oracle@localhost vars]$ ll
合計 8
-rw-r--r--. 1 root root 11 5月 1 2024 ocidomain
-rw-r--r--. 1 root root 14 5月 1 2024 ociregion
[oracle@localhost vars]$ cat ociregion
-us-phoenix-1
[oracle@localhost vars]$ cat ocidomain
oracle.com
[oracle@localhost vars]$ sudo mv ociregion ociregion.org
[sudo] oracle のパスワード:
[oracle@localhost vars]$ sudo su -
[root@localhost ~]# sudo echo -n "" > /etc/yum/vars/ociregion
[root@localhost ~]# cat /etc/yum/vars/ociregion
[root@localhost ~]# exit
logout
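To see why emptying ociregion helps, here is a minimal Python sketch of the yum variable substitution. The baseurl template is an assumption inferred from the failing URL in the log and the two files under /etc/yum/vars; I did not inspect the actual .repo file, and the function name expand is just for illustration.

```python
from string import Template

# Assumed baseurl pattern, inferred from the failing URL in the log above.
# The real .repo file on the VM was not inspected here.
BASEURL = Template(
    "https://yum$ociregion.$ocidomain/repo/OracleLinux/OL8/baseos/latest/x86_64/"
)


def expand(ociregion: str, ocidomain: str) -> str:
    """Substitute the yum variables the way the /etc/yum/vars files would."""
    return BASEURL.substitute(ociregion=ociregion, ocidomain=ocidomain)


before = expand("-us-phoenix-1", "oracle.com")  # original ociregion content
after = expand("", "oracle.com")                # ociregion emptied
print(before)
print(after)
```

Under this assumed template, the region-specific host that timed out is replaced by the plain yum host once ociregion is empty.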
Pulling myself together, one more try.
[oracle@localhost ~]$ sudo yum install libffi-devel openssl openssl-devel tk-devel xz-devel zlib-devel bzip2-devel readline-devel libuuid-devel ncurses-devel libaio
Oracle Linux 8 BaseOS Latest (x86_64) 5.0 MB/s | 97 MB 00:19
Oracle Linux 8 Application Stream (x86_64) 5.1 MB/s | 70 MB 00:13
Latest Unbreakable Enterprise Kernel Release 7 for Oracle Linux 8 (x86_64) 5.1 MB/s | 61 MB 00:12
...略...
パッケージ openssl-1:1.1.1k-12.el8_9.x86_64 は既にインストールされています。
パッケージ libaio-0.3.112-1.el8.x86_64 は既にインストールされています。
依存関係が解決しました。
===============================================================================================================
パッケージ アーキテクチャー バージョン リポジトリー サイズ
===============================================================================================================
インストール:
bzip2-devel x86_64 1.0.6-28.el8_10 ol8_baseos_latest 224 k
...略...
xorg-x11-proto-devel noarch 2020.1-3.el8 ol8_appstream 280 k
トランザクションの概要
===============================================================================================================
インストール 32 パッケージ
アップグレード 10 パッケージ
ダウンロードサイズの合計: 17 M
これでよろしいですか? [y/N]: y
パッケージのダウンロード:
(1/42): bzip2-devel-1.0.6-28.el8_10.x86_64.rpm 1.5 MB/s | 224 kB 00:00
...略...
(42/42): util-linux-2.32.1-46.0.1.el8.x86_64.rpm 3.6 MB/s | 2.5 MB 00:00
-----------------------------------------------------------------------------------------------------------
合計 4.3 MB/s | 17 MB 00:03
...略...
トランザクションのテストに成功しました。
トランザクションを実行中
準備 : 1/1
scriptletの実行中: libuuid-2.32.1-46.0.1.el8.x86_64 1/1
アップグレード中 : libuuid-2.32.1-46.0.1.el8.x86_64 1/52
...略...
xz-devel-5.2.4-4.el8_6.x86_64 zlib-devel-1.2.11-25.el8.x86_64
[oracle@localhost ~]$ mkdir -p $HOME/python
[oracle@localhost ~]$ wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz
--2025-05-13 02:48:50-- https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz
...略...
HTTP による接続要求を送信しました、応答を待っています... 200 OK
長さ: 27195214 (26M) [application/octet-stream]
`Python-3.12.0.tgz' に保存中
Python-3.12.0.tgz 100%[===================================>] 25.93M 4.79MB/s 時間 5.4s
2025-05-13 02:48:56 (4.80 MB/s) - `Python-3.12.0.tgz' へ保存完了 [27195214/27195214]
[oracle@localhost ~]$ tar -xvzf Python-3.12.0.tgz --strip-components=1 -C /home/$USER/python
Python-3.12.0/Grammar/
Python-3.12.0/Grammar/python.gram
...略...
Python-3.12.0/Objects/tupleobject.c
Python-3.12.0/install-sh
[oracle@localhost ~]$ cd $HOME/python
[oracle@localhost python]$ ./configure --prefix=$HOME/python
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
...略...
configure: creating Modules/Setup.local
configure: creating Makefile
configure:
If you want a release build with all stable optimizations active (PGO, etc),
please run ./configure --enable-optimizations
[oracle@localhost python]$ make clean; make
find . -depth -name '__pycache__' -exec rm -rf {} ';'
find . -name '*.py[co]' -exec rm -f {} ';'
find . -name '*.[oa]' -exec rm -f {} ';'
...略...
# Pristine binaries before BOLT optimization.
rm -f *.prebolt
# BOLT instrumented binaries.
rm -f *.bolt_inst
gcc -pthread -c -fno-strict-overflow -Wsign-compare -DNDEBUG -g -O3 -Wall
-std=c11 -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wstrict-prototypes
-Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal
-I. -I./Include -DPy_BUILD_CORE -o Programs/python.o ./Programs/python.c
...略...
LC_ALL=C sed -e 's,\$(\([A-Za-z0-9_]*\)),\$\{\1\},g' < Misc/python-config.sh >python-config
The following modules are *disabled* in configure script:
_sqlite3
The necessary bits to build these optional modules were not found:
_dbm _gdbm nis
To find the necessary bits, look in configure.ac and config.log.
Checked 111 modules (31 built-in, 75 shared, 1 n/a on linux-x86_64, 1 disabled, 3 missing, 0 failed on import)
[oracle@localhost python]$ make altinstall
Creating directory /home/oracle/python/bin
Creating directory /home/oracle/python/lib
...略...
The necessary bits to build these optional modules were not found:
_dbm _gdbm nis
To find the necessary bits, look in configure.ac and config.log.
Checked 111 modules (31 built-in, 75 shared, 1 n/a on linux-x86_64, 1 disabled, 3 missing, 0 failed on import)
Creating directory /home/oracle/python/lib/python3.12
Creating directory /home/oracle/python/lib/python3.12/asyncio
...略...
Looking in links: /tmp/tmp9k7fgbr3
Processing /tmp/tmp9k7fgbr3/pip-23.2.1-py3-none-any.whl
Installing collected packages: pip
WARNING: The script pip3.12 is installed in '/home/oracle/python/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-23.2.1
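Everything so far matches the manual. No unexpected errors, and it is going smoothly.
Setting the PYTHONHOME, PATH, and LD_LIBRARY_PATH variables, creating the python3 and pip3 symbolic links, and so on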
[oracle@localhost python]$ export PYTHONHOME=$HOME/python
[oracle@localhost python]$ export PATH=$PYTHONHOME/bin:$PATH
[oracle@localhost python]$ export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
[oracle@localhost python]$ cd $HOME/python/bin
[oracle@localhost bin]$ ln -s python3.12 python3
[oracle@localhost bin]$ ln -s pip3.12 pip3
[oracle@localhost bin]$ cd $HOME
[oracle@localhost ~]$ wget https://download.oracle.com/otn_software/linux/instantclient/2340000/instantclient-basic-linux.x64-23.4.0.24.05.zip
--2025-05-13 03:03:13-- https://download.oracle.com/otn_software/linux/instantclient/2340000/instantclient-basic-linux.x64-23.4.0.24.05.zip
...略...
HTTP による接続要求を送信しました、応答を待っています... 200 OK
長さ: 118377607 (113M) [application/zip]
`instantclient-basic-linux.x64-23.4.0.24.05.zip' に保存中
instantclient-basic-linux.x64-23.4.0.24.05.zip 100%[===========================>] 112.89M 5.83MB/s 時間 18s
2025-05-13 03:03:31 (6.38 MB/s) - `instantclient-basic-linux.x64-23.4.0.24.05.zip' へ保存完了 [118377607/118377607]
[oracle@localhost ~]$
[oracle@localhost ~]$ unzip instantclient-basic-linux.x64-23.4.0.24.05.zip
Archive: instantclient-basic-linux.x64-23.4.0.24.05.zip
replace META-INF/MANIFEST.MF? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
inflating: META-INF/MANIFEST.MF
...略...
instantclient_23_4/libocci.so.21.1 -> libocci.so.23.1
instantclient_23_4/libocci.so.22.1 -> libocci.so.23.1
[oracle@localhost ~]$
[oracle@localhost ~]$ export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
I also add the following environment variables to .bashrc.
export PYTHONHOME=$HOME/python
export PATH=$PYTHONHOME/bin:$PATH
export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
[oracle@localhost ~]$ vi .bashrc
[oracle@localhost ~]$ cat .bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
...略...
export TWO_TASK=FREEPDB1
fi
# Environment variables for Python
export PYTHONHOME=$HOME/python
export PATH=$PYTHONHOME/bin:$PATH
export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
# Note: If Python is used to load models to the database, set this environment variable for the Oracle Instant Client.
export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
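A quick way to confirm that a new shell picked up these settings is a small check like the one below. This is only a sketch: the helper name missing_path_entries is made up for illustration, and the directories are the ones used in this post ($HOME/python and $HOME/instantclient_23_4).

```python
import os


def missing_path_entries(env_value: str, required: list[str]) -> list[str]:
    """Return the required directories absent from a colon-separated variable."""
    present = [p for p in env_value.split(":") if p]
    return [r for r in required if r not in present]


home = os.path.expanduser("~")
# Directories this post adds in .bashrc.
checks = {
    "PATH": [f"{home}/python/bin"],
    "LD_LIBRARY_PATH": [f"{home}/python/lib", f"{home}/instantclient_23_4"],
}
for var, required in checks.items():
    missing = missing_path_entries(os.environ.get(var, ""), required)
    print(var, "OK" if not missing else f"missing: {missing}")
```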
Creating requirements.txt
[oracle@localhost ~]$ vi requirements.txt
[oracle@localhost ~]$ cat requirements.txt
--extra-index-url https://download.pytorch.org/whl/cpu
pandas==2.1.1
setuptools==68.0.0
scipy==1.12.0
matplotlib==3.8.4
oracledb==2.2.0
scikit-learn==1.4.1post1
numpy==1.26.4
onnxruntime==1.17.0
onnxruntime-extensions==0.10.1
onnx==1.16.0
torch==2.2.0+cpu
transformers==4.38.1
sentencepiece==0.2.0
Upgrading pip3
[oracle@localhost ~]$ pip3 install --upgrade pip
Requirement already satisfied: pip in ./python/lib/python3.12/site-packages (23.2.1)
Collecting pip
Obtaining dependency information for pip
from https://files.pythonhosted.org/packages/29/a2/d40fb2460e883eca5199c62cfc2463fd261f760556ae6290f88488c362c0/pip-25.1.1-py3-none-any.whl.metadata
Downloading pip-25.1.1-py3-none-any.whl.metadata (3.6 kB)
Downloading pip-25.1.1-py3-none-any.whl (1.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 6.3 MB/s eta 0:00:00
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 23.2.1
Uninstalling pip-23.2.1:
Successfully uninstalled pip-23.2.1
Successfully installed pip-25.1.1
[oracle@localhost ~]$ pip3 install -r requirements.txt
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cpu
Collecting pandas==2.1.1 (from -r requirements.txt (line 2))
Downloading pandas-2.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (18 kB)
Collecting setuptools==68.0.0 (from -r requirements.txt (line 3))
...略...
Successfully installed MarkupSafe-3.0.2 certifi-2025.4.26 cffi-1.17.1 charset-normalizer-3.4.2 coloredlogs-15.0.1 contourpy-1.3.2
cryptography-44.0.3 cycler-0.12.1 filelock-3.18.0 flatbuffers-25.2.10 fonttools-4.58.0 fsspec-2025.3.2 hf-xet-1.1.1 huggingface-hub-0.31.1
humanfriendly-10.0 idna-3.10 jinja2-3.1.6 joblib-1.5.0 kiwisolver-1.4.8 matplotlib-3.8.4 mpmath-1.3.0 networkx-3.4.2 numpy-1.26.4 onnx-1.16.0
onnxruntime-1.17.0 onnxruntime-extensions-0.10.1 oracledb-2.2.0 packaging-25.0 pandas-2.1.1 pillow-11.2.1 protobuf-6.30.2 pycparser-2.22
pyparsing-3.2.3 python-dateutil-2.9.0.post0 pytz-2025.2 pyyaml-6.0.2 regex-2024.11.6 requests-2.32.3 safetensors-0.5.3 scikit-learn-1.4.1.post1
scipy-1.12.0 sentencepiece-0.2.0 setuptools-68.0.0 six-1.17.0 sympy-1.14.0 threadpoolctl-3.6.0 tokenizers-0.15.2 torch-2.2.0+cpu
tqdm-4.67.1 transformers-4.38.1 typing-extensions-4.13.2 tzdata-2025.2 urllib3-2.4.0
Argh!
I had installed oml-2.1 by mistake, so I reinstalled oml-2.0. (Doing it with oml-2.1 would have been fine too, lol)
Oracle Machine Learning for Python Downloads
https://www.oracle.com/database/technologies/oml4py-downloads.html
From there, download 2.0 (oml4py-client-linux-x86_64-2.0.zip), and then...
So, time to reinstall.
[oracle@localhost ~]$ unzip oml4py-client-linux-x86_64-2.0.zip
Archive: oml4py-client-linux-x86_64-2.0.zip
inflating: client/client.pl
inflating: client/OML4PInstallShared.pm
inflating: client/oml-2.0-cp312-cp312-linux_x86_64.whl
extracting: client/oml4py.ver
[oracle@localhost ~]$ pip3 install client/oml-2.0-cp312-cp312-linux_x86_64.whl
Processing ./client/oml-2.0-cp312-cp312-linux_x86_64.whl
Requirement already satisfied: numpy>=1.26.4 in ./python/lib/python3.12/site-packages (from oml==2.0) (2.2.5)
...略...
Requirement already satisfied: six>=1.5 in ./python/lib/python3.12/site-packages (from python-dateutil>=2.7->matplotlib>=3.7.2->oml==2.0) (1.17.0)
Requirement already satisfied: joblib>=1.2.0 in ./python/lib/python3.12/site-packages (from scikit-learn>=1.2.1->oml==2.0) (1.5.0)
Requirement already satisfied: threadpoolctl>=3.1.0 in ./python/lib/python3.12/site-packages (from scikit-learn>=1.2.1->oml==2.0) (3.6.0)
Installing collected packages: oml
Attempting uninstall: oml
Found existing installation: oml 2.1
Uninstalling oml-2.1:
Successfully uninstalled oml-2.1
Successfully installed oml-2.0
Importing oml gets me yelled at. Why...? It looks like NumPy 1.x is required.
[oracle@localhost ~]$ python3
Python 3.12.0 (main, May 13 2025, 02:54:53) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20.0.3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import oml
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.2.5 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last): File "", line 1, in
File "/home/oracle/python/lib/python3.12/site-packages/oml/__init__.py", line 77, in
from oml.utils import *
File "/home/oracle/python/lib/python3.12/site-packages/oml/utils/__init__.py", line 23, in
from .embeddings import EmbeddingModelConfig,EmbeddingModel
File "/home/oracle/python/lib/python3.12/site-packages/oml/utils/_pipeline/__init__.py", line 22, in
from .PipelineBuilder import PipelineBuilder
...略...
/home/oracle/python/lib/python3.12/site-packages/torch/nn/modules/transformer.py:20: UserWarning:
Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.2.5 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last): File "", line 1, in
File "/home/oracle/python/lib/python3.12/site-packages/oml/__init__.py", line 77, in
from oml.utils import *
...略...
from onnxruntime.capi._pybind_state import ExecutionMode # noqa: F401
File "/home/oracle/python/lib/python3.12/site-packages/onnxruntime/capi/_pybind_state.py", line 32, in
from .onnxruntime_pybind11_state import * # noqa
AttributeError: _ARRAY_API not found
/home/oracle/python/lib/python3.12/site-packages/oml/__init__.py:80: UserWarning: oml.utils import failed
warn('oml.utils import failed')
>>>
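The pip log above shows numpy 2.2.5 installed even though requirements.txt pins 1.26.4. A rough sketch for spotting that kind of drift is below. The function names are made up for illustration, and the installed versions in the demo are just the two values visible in the logs; on a real system, importlib.metadata.version(name) from the standard library could supply them instead.

```python
def parse_pins(lines):
    """Collect name==version pins, skipping option lines, comments, and blanks."""
    pins = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("-", "#")) or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_drift(pins, installed_version):
    """Return {name: (pinned, installed)} for every pin that does not match."""
    drift = {}
    for name, pinned in pins.items():
        installed = installed_version(name)
        if installed != pinned:
            drift[name] = (pinned, installed)
    return drift


# A few lines from the requirements.txt shown earlier in this post.
sample = [
    "--extra-index-url https://download.pytorch.org/whl/cpu",
    "pandas==2.1.1",
    "numpy==1.26.4",
]
# Versions as they appear in the pip logs above.
installed = {"pandas": "2.1.1", "numpy": "2.2.5"}
print(find_drift(parse_pins(sample), installed.get))
```

The comparison is a plain string match, so the pin scikit-learn==1.4.1post1 would be reported as well, because pip shows the installed version as 1.4.1.post1.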
So I switch to numpy 1.x... and it worked.
[oracle@localhost ~]$ pip3 install --upgrade numpy==1.26.4
Collecting numpy==1.26.4
Using cached numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Using cached numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.0 MB)
Installing collected packages: numpy
Attempting uninstall: numpy
Found existing installation: numpy 2.2.5
Uninstalling numpy-2.2.5:
Successfully uninstalled numpy-2.2.5
Successfully installed numpy-1.26.4
Woohoo!
[oracle@localhost ~]$ python3
Python 3.12.0 (main, May 13 2025, 02:54:53) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20.0.3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>>
>>> import oml
>>>
>>> from oml.utils import EmbeddingModel, EmbeddingModelConfig
>>> EmbeddingModelConfig.show_preconfigured()
['sentence-transformers/all-mpnet-base-v2', 'sentence-transformers/all-MiniLM-L6-v2'
, 'sentence-transformers/multi-qa-MiniLM-L6-cos-v1', 'ProsusAI/finbert'
,'medicalai/ClinicalBERT', 'sentence-transformers/distiluse-base-multilingual-cased-v2'
, 'sentence-transformers/all-MiniLM-L12-v2', 'BAAI/bge-small-en-v1.5'
, 'BAAI/bge-base-en-v1.5', 'taylorAI/bge-micro-v2', 'intfloat/e5-small-v2', 'intfloat/e5-base-v2'
, 'prajjwal1/bert-tiny', 'thenlper/gte-base'
, 'thenlper/gte-small', 'TaylorAI/gte-tiny', 'infgrad/stella-base-en-v2'
, 'sentence-transformers/paraphrase-multilingual-mpnet-base-v2'
, 'intfloat/multilingual-e5-base', 'intfloat/multilingual-e5-small'
, 'sentence-transformers/stsb-xlm-r-multilingual']
>>>
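Since the model I use below does not handle Japanese, it is worth noting which preconfigured names at least claim to be multilingual. The sketch below is a name-based heuristic only: the list is copied from the output above, the helper name is made up, and I have not verified how well any of these models handle Japanese.

```python
# Names copied from the show_preconfigured() output above.
PRECONFIGURED = [
    "sentence-transformers/all-mpnet-base-v2",
    "sentence-transformers/all-MiniLM-L6-v2",
    "sentence-transformers/multi-qa-MiniLM-L6-cos-v1",
    "ProsusAI/finbert",
    "medicalai/ClinicalBERT",
    "sentence-transformers/distiluse-base-multilingual-cased-v2",
    "sentence-transformers/all-MiniLM-L12-v2",
    "BAAI/bge-small-en-v1.5",
    "BAAI/bge-base-en-v1.5",
    "taylorAI/bge-micro-v2",
    "intfloat/e5-small-v2",
    "intfloat/e5-base-v2",
    "prajjwal1/bert-tiny",
    "thenlper/gte-base",
    "thenlper/gte-small",
    "TaylorAI/gte-tiny",
    "infgrad/stella-base-en-v2",
    "sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
    "intfloat/multilingual-e5-base",
    "intfloat/multilingual-e5-small",
    "sentence-transformers/stsb-xlm-r-multilingual",
]


def multilingual_candidates(names):
    """Heuristic: keep models whose name mentions 'multilingual'."""
    return [n for n in names if "multilingual" in n.lower()]


for name in multilingual_candidates(PRECONFIGURED):
    print(name)
```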
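Generating an ONNX file that can be manually uploaded to the database with DBMS_VECTOR.LOAD_ONNX_MODEL
"Export a preconfigured embedding model to a local file. Import EmbeddingModel from oml.utils. This exports the ONNX format model to the local file system."
It is convenient to first create the OS directory that the directory object will point to, and run the export from there.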
[oracle@localhost ~]$ cd work4vector/
[oracle@localhost work4vector]$ pwd
/home/oracle/work4vector
[oracle@localhost work4vector]$ python3
Python 3.12.0 (main, May 13 2025, 02:54:53) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20.0.3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from oml.utils import EmbeddingModel
>>> em = EmbeddingModel(model_name="sentence-transformers/all-MiniLM-L6-v2")
>>> em.export2file("all-MiniLM-L6-v2", output_dir=".")
/home/oracle/python/lib/python3.12/site-packages/huggingface_hub/file_download.py:943: FutureWarning: `resume_download` is deprecated
and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
tokenizer_config.json: 100%|████████████████████████████████████████████████| 350/350 [00:00<00:00, 821kB/s]
vocab.txt: 100%|████████████████████████████████████████████████████████████| 232k/232k [00:00<00:00, 565kB/s]
special_tokens_map.json: 100%|██████████████████████████████████████████████| 112/112 [00:00<00:00, 463kB/s]
tokenizer.json: 100%|███████████████████████████████████████████████████████| 466k/466k [00:00<00:00, 1.47MB/s]
config.json: 100%|██████████████████████████████████████████████████████████| 612/612 [00:00<00:00, 3.88MB/s]
model.safetensors: 100%|████████████████████████████████████████████████████| 90.9M/90.9M [00:13<00:00, 6.72MB/s]
>>>
[oracle@localhost work4vector]$ ll all-MiniLM-L6-v2.onnx
-rw-rw-r--. 1 oracle oracle 90621438 5月 13 05:47 all-MiniLM-L6-v2.onnx
[oracle@localhost work4vector]$ sqlplus sys@localhost:1521/freepdb1 as sysdba
...略...
Oracle Database 23ai Free Release 23.0.0.0.0 - Develop, Learn, and Run for Free
Version 23.4.0.24.05
に接続されました。
SYS@localhost:1521/freepdb1> create directory onnx_dir as '/home/oracle/work4vector';
ディレクトリが作成されました。
経過: 00:00:00.08
SYS@localhost:1521/freepdb1> !pwd
/home/oracle/work4vector
SYS@localhost:1521/freepdb1> grant read,write on directory onnx_dir to scott;
権限付与が成功しました。
経過: 00:00:00.05
SYS@localhost:1521/freepdb1> grant create mining model to scott;
権限付与が成功しました。
経過: 00:00:00.02
Loading the model into the OML user schema (SCOTT here) with the DBMS_VECTOR.LOAD_ONNX_MODEL procedure
The code is below.
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL(
directory => 'ONNX_DIR',
file_name => 'all-MiniLM-L6-v2.onnx',
model_name => 'ALL_MINILM_L6');
END;
/
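The same anonymous block could also be run from Python instead of SQL*Plus. The sketch below is an untested assumption on my part: it takes any DB-API style cursor and passes the three values as bind variables (the bind names are arbitrary). With python-oracledb, which requirements.txt already pins, the cursor would come from oracledb.connect(...).cursor(); a stand-in cursor is used here so the sketch runs without a database.

```python
# Same call as the PL/SQL block above, with bind variables (names are arbitrary).
LOAD_SQL = """
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    directory  => :p_directory,
    file_name  => :p_file_name,
    model_name => :p_model_name);
END;"""


def load_onnx_model(cursor, directory, file_name, model_name):
    """Run the anonymous block through a DB-API style cursor."""
    cursor.execute(
        LOAD_SQL,
        {
            "p_directory": directory,
            "p_file_name": file_name,
            "p_model_name": model_name,
        },
    )


class RecordingCursor:
    """Stand-in cursor that only records calls; not a real database cursor."""

    def __init__(self):
        self.calls = []

    def execute(self, statement, parameters=None):
        self.calls.append((statement, parameters))


cursor = RecordingCursor()
load_onnx_model(cursor, "ONNX_DIR", "all-MiniLM-L6-v2.onnx", "ALL_MINILM_L6")
print(cursor.calls[0][1])
```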
SCOTT@localhost:1521/freepdb1> l
1 BEGIN
2 DBMS_VECTOR.LOAD_ONNX_MODEL(
3 directory => 'ONNX_DIR',
4 file_name => 'all-MiniLM-L6-v2.onnx',
5 model_name => 'ALL_MINILM_L6');
6* END;
SCOTT@localhost:1521/freepdb1> /
PL/SQLプロシージャが正常に完了しました。
経過: 00:00:08.93
Checking the model with SQL
The SQL statement is below.
SELECT
model_name
, algorithm
, mining_function
FROM
user_mining_models
WHERE
model_name = 'ALL_MINILM_L6'
/
SCOTT@localhost:1521/freepdb1> l
1 SELECT
2 model_name
3 , algorithm
4 , mining_function
5 FROM
6 user_mining_models
7 WHERE
8* model_name = 'ALL_MINILM_L6'
SCOTT@localhost:1521/freepdb1> /
MODEL_NAME ALGORITHM MINING_FUNCTION
------------------------------ ------------------------------ ------------------------------
ALL_MINILM_L6 ONNX EMBEDDING
経過: 00:00:00.01
I also check it through the view that lists the model views the user can access.
The SQL looks like this.
SELECT
view_name
, view_type
FROM
user_mining_model_views
WHERE
model_name = 'ALL_MINILM_L6'
ORDER BY
view_name
/
SCOTT@localhost:1521/freepdb1> l
1 SELECT
2 view_name
3 , view_type
4 FROM
5 user_mining_model_views
6 WHERE
7 model_name = 'ALL_MINILM_L6'
8 ORDER BY
9* view_name
SCOTT@localhost:1521/freepdb1> /
VIEW_NAME VIEW_TYPE
------------------------------ ------------------------------
DM$VJALL_MINILM_L6 ONNX Metadata Information
DM$VMALL_MINILM_L6 ONNX Model Information
DM$VPALL_MINILM_L6 ONNX Parsing Information
経過: 00:00:00.01
Querying one of the views above to check the model information :)
SCOTT@localhost:1521/freepdb1> SELECT * FROM dm$vmall_minilm_l6;
NAME VALUE
------------------------------ -----------------------------------------------
Producer Name onnx.compose.merge_models
Graph Name tokenizer_main_graph
Graph Description Graph combining tokenizer and main_graph
tokenizer
main_graph
Version 1
Input[0] input:string[?]
Output[0] embedding:float32[?,384]
6行が選択されました。
経過: 00:00:00.00
A quick try of the VECTOR_EMBEDDING SQL scoring function!
The SQL statement looks like this.
SELECT VECTOR_EMBEDDING(ALL_MINILM_L6 USING 'RES' as DATA) AS embedding;
SCOTT@localhost:1521/freepdb1> r
1* SELECT VECTOR_EMBEDDING(ALL_MINILM_L6 USING 'RES' as DATA) AS embedding
EMBEDDING
----------------------------------------------------------------------------------------------------
[-1.16423056E-001,1.54331746E-002,-4.69262414E-002,7.16730766E-003,3.50234732E-002,-4.02988419E-002,
.08232533E-002,4.99225073E-002,-1.86311249E-002,-2.62796488E-002,-3.2601878E-002,5.22731952E-002,-9.
...略...
-003,6.00763485E-002,1.91014066E-001,7.64457136E-002,1.46513591E-002,3.13854888E-002]
経過: 00:00:00.36
Only a function is being executed, so no interesting execution plan shows up, but I check it anyway, lol.
SCOTT@localhost:1521/freepdb1> set autot trace exp stat
SCOTT@localhost:1521/freepdb1> r
1* SELECT VECTOR_EMBEDDING(ALL_MINILM_L6 USING 'RES' as DATA) AS embedding
経過: 00:00:00.33
実行計画
----------------------------------------------------------
Plan hash value: 1388734953
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------
統計
----------------------------------------------------------
0 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
2261 bytes sent via SQL*Net to client
473 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
There are also views such as user_mining_models and user_mining_model_attributes. I see, I see.
SCOTT@localhost:1521/freepdb1> SELECT model_name, mining_function, algorithm, algorithm_type, model_size FROM user_mining_models;
MODEL_NAME MINING_FUNCTION ALGORITHM ALGORITHM_ MODEL_SIZE
------------------------------ ------------------------------ ------------------------------ ---------- ----------
ALL_MINILM_L6 EMBEDDING ONNX NATIVE 90621438
経過: 00:00:00.02
SCOTT@localhost:1521/freepdb1> l
1 SELECT model_name, attribute_name, attribute_type, data_type, vector_info
2* FROM user_mining_model_attributes
SCOTT@localhost:1521/freepdb1> /
MODEL_NAME ATTRIBUTE_NAME ATTRIBUTE_TY DATA_TYPE VECTOR_INFO
------------------------------ ------------------------------ ------------ ------------------------------ ------------------------------
ALL_MINILM_L6 ORA$ONNXTARGET VECTOR VECTOR VECTOR(384,FLOAT32)
ALL_MINILM_L6 DATA TEXT VARCHAR2
経過: 00:00:00.01
Using the ONNX-format embedding model, I try converting the user input text string "hello" into a vector. Staring at the vector tells me nothing, but it does seem to work, lol.
SCOTT@localhost:1521/freepdb1> col EMBEDDING for a200
SCOTT@localhost:1521/freepdb1> SELECT TO_VECTOR(VECTOR_EMBEDDING(all_minilm_l6 USING 'hello' as data)) AS EMBEDDING;
EMBEDDING
-----------------------------------------------------------------------------------------------
[-6.27717897E-002,5.49588911E-002,5.21648414E-002,8.57899487E-002,-8.27489197E-002,-7.45729804E-
002,6.85546845E-002,1.83963589E-002,-8.20114315E-002,-3.73847559E-002,1.21248914E-002,3.51829384
E-003,-4.13423125E-003,-4.37844135E-002,2.18073577E-002,-5.10276016E-003,1.95467062E-002,-4.2348
...略...
-4.67414372E-002,-1.34112127E-002,6.51347339E-002,5.09059429E-002,5.1483497E-002,7.09215924E-003]
経過: 00:00:00.67
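So, preparation complete. (This model does not seem to support Japanese, but for now I have a small Oracle Database 23ai Free on VirtualBox environment ready for some light experimentation!)
At this point I get the feeling that I will lose my way unless I understand what each SQL statement is doing as I go, lol.
To be continued next time.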