Statistics | First Wave

I might be able to add an AI recommendation feature, but it's on a VPS, so it's iffy #TFIDFVectorizer

2025.06.03

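Good morning. I may or may not be able to add an AI recommendation feature to this blog. I haven't tried it yet, but if I can put a trained machine learning model on the VPS and get it running, an AI recommendation feature looks doable.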

If I fit a TF-IDF vectorizer on the tags of the existing articles, training should be surprisingly easy, so I had a generative AI write the code.
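
To make the idea concrete, here is a minimal sketch of what TF-IDF weighting and cosine similarity do with tag lists, written in plain Python rather than scikit-learn. It assumes scikit-learn's documented default weighting (raw term counts, smoothed idf `ln((1 + n) / (1 + df)) + 1`, L2 normalization). The function names `tfidf_vectors` and `cosine` and the sample tags are made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(tag_lists):
    """Return one {tag: weight} dict per article, L2-normalized."""
    n_docs = len(tag_lists)
    # Document frequency: in how many articles each tag appears.
    df = Counter(tag for tags in tag_lists for tag in set(tags))
    vectors = []
    for tags in tag_lists:
        tf = Counter(tags)
        # Smoothed idf: ln((1 + n) / (1 + df)) + 1
        weights = {
            tag: count * (math.log((1 + n_docs) / (1 + df[tag])) + 1)
            for tag, count in tf.items()
        }
        # Guard against an article with no tags (norm would be 0).
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({tag: w / norm for tag, w in weights.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two L2-normalized sparse vectors."""
    return sum(w * v.get(tag, 0.0) for tag, w in u.items())


# Hypothetical sample data for illustration only.
articles = {
    "article1": ["python", "machine_learning", "data_science"],
    "article2": ["python", "web", "django"],
    "article3": ["machine_learning", "nlp"],
}
vecs = dict(zip(articles, tfidf_vectors(list(articles.values()))))
sim_12 = cosine(vecs["article1"], vecs["article2"])
sim_13 = cosine(vecs["article1"], vecs["article3"])
print(f"article1 vs article2: {sim_12:.4f}")
print(f"article1 vs article3: {sim_13:.4f}")
```

With this weighting, a tag that appears in fewer articles gets a larger weight, so sharing a rare tag counts for more than sharing a common one.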

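Incidentally, the same approach works for product recommendations on an e-commerce site: feed it product data in much the same way and it will recommend similar products.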
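
As a rough illustration of the e-commerce case, the sketch below applies the same TF-IDF and cosine similarity idea to product tags. The product IDs, the tags, and the `recommend_products` helper are all hypothetical. Passing a callable as `analyzer` makes scikit-learn treat each tag as a single token, which is one way to keep multi-word tags such as "running shoes" intact; that is an assumption of this sketch, not something the code in this post does.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical product catalog: product ID -> tags.
product_tags = {
    "sku-001": ["running shoes", "men", "sports"],
    "sku-002": ["running shoes", "women", "sports"],
    "sku-003": ["rain jacket", "men", "outdoor"],
    "sku-004": ["yoga mat", "sports"],
}


def keep_tags_whole(tags):
    """Analyzer that uses each tag as one token, without splitting on spaces."""
    return tags


def recommend_products(product_id, catalog, top_n=2):
    """Return the top_n (product_id, score) pairs most similar to product_id."""
    ids = list(catalog)
    vectorizer = TfidfVectorizer(analyzer=keep_tags_whole)
    matrix = vectorizer.fit_transform(list(catalog.values()))
    target = ids.index(product_id)
    # Similarity of the target product against every product in the catalog.
    scores = cosine_similarity(matrix[target], matrix)[0]
    ranked = sorted(
        ((pid, float(s)) for pid, s in zip(ids, scores) if pid != product_id),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_n]


recs = recommend_products("sku-001", product_tags)
for pid, score in recs:
    print(f"- {pid} (similarity: {score:.4f})")
```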

Finally, here is the Python code. It would be nice if I could retrain it on the VPS, but that may be difficult...

import pickle
import os

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

# File names for the saved model and data
MODEL_PATH = "tfidf_vectorizer.pkl"  # fitted TfidfVectorizer
DATA_PATH = "article_tags.pkl"       # article IDs and their tags

# Build the similarity model (supports retraining)
def build_similarity_model(article_tags_input, save_model=True, retrain=False):
    # When retraining, load the existing data and merge the new data into it
    if retrain and os.path.exists(DATA_PATH):
        with open(DATA_PATH, "rb") as f:
            existing_data = pickle.load(f)
        existing_data.update(article_tags_input)  # add the new data
        article_tags_input = existing_data

    article_ids = list(article_tags_input.keys())
    processed_corpus = [" ".join(tags) for tags in article_tags_input.values()]

    # Fit and save a new model when retraining or when no model has been saved yet
    if retrain or not os.path.exists(MODEL_PATH):
        vectorizer = TfidfVectorizer()
        tfidf_matrix = vectorizer.fit_transform(processed_corpus)

        if save_model:
            with open(MODEL_PATH, "wb") as f:
                pickle.dump(vectorizer, f)  # save the fitted model here
            with open(DATA_PATH, "wb") as f:
                pickle.dump(article_tags_input, f)  # save the source data here

    else:
        # Reuse the saved model; tags it did not see during fitting are ignored
        with open(MODEL_PATH, "rb") as f:
            vectorizer = pickle.load(f)
        tfidf_matrix = vectorizer.transform(processed_corpus)

    cosine_sim_matrix = cosine_similarity(tfidf_matrix)
    cosine_sim_df = pd.DataFrame(cosine_sim_matrix, index=article_ids, columns=article_ids)

    return cosine_sim_df, article_ids

# Return the articles most similar to the given one
def get_recommendations(article_title, similarity_matrix, articles_map, top_n=3):
    if article_title not in articles_map:
        print(f"Error: article '{article_title}' not found.")
        return []

    sim_scores = list(enumerate(similarity_matrix[article_title]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

    recommended_articles = []
    for i, score in sim_scores:
        if articles_map[i] != article_title and len(recommended_articles) < top_n:
            recommended_articles.append((articles_map[i], score))
        if len(recommended_articles) >= top_n:
            break

    return recommended_articles

# Sample input data (example); 記事N means "Article N"
article_tags_input = {
    "記事1": ["Python", "機械学習", "データサイエンス"],
    "記事2": ["Python", "Web開発", "Django"],
    "記事3": ["機械学習", "自然言語処理"],
    "記事4": ["データサイエンス", "統計学"],
    "記事5": ["Python", "データサイエンス", "可視化"]
}

# Build the similarity model and save it (initial training)
cosine_sim_df, article_ids = build_similarity_model(article_tags_input)

# Usage example
target_article = "記事1"
recommendations = get_recommendations(target_article, cosine_sim_df, article_ids, top_n=2)
print(f"\nRecommended articles for '{target_article}' (top 2):")
for article, score in recommendations:
    print(f"- {article} (similarity: {score:.4f})")

# Add a new article and retrain
new_article_id = "記事6"
new_article_tags = ["Python", "統計学"]
article_tags_input = {new_article_id: new_article_tags}

# Rebuild and retrain
cosine_sim_df, article_ids = build_similarity_model(article_tags_input, retrain=True)
target_article = new_article_id
recommendations = get_recommendations(target_article, cosine_sim_df, article_ids, top_n=2)
print(f"\nRecommended articles for '{target_article}' (top 2):")
for article, score in recommendations:
    print(f"- {article} (similarity: {score:.4f})")
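
If retraining on the VPS turns out to be too heavy, one possible workaround (not tried here) is to compute the similarities on another machine and upload only a small JSON file of precomputed recommendations, so the server just looks results up. The sketch below assumes the similarities are available as a nested dict of the form `{article_id: {other_id: score}}`, which is the shape pandas' `DataFrame.to_dict()` returns by default. The helper names `export_top_n` and `load_recommendations`, the file name, and the sample scores are hypothetical.

```python
import json
import os
import tempfile


def export_top_n(similarity, path, top_n=3):
    """Write {article_id: [[other_id, score], ...]} with the top_n neighbours."""
    table = {}
    for article_id, row in similarity.items():
        ranked = sorted(
            ((other, float(score)) for other, score in row.items() if other != article_id),
            key=lambda pair: pair[1],
            reverse=True,
        )
        table[article_id] = ranked[:top_n]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(table, f, ensure_ascii=False, indent=2)
    return table


def load_recommendations(article_id, path):
    """Server-side lookup: needs only the JSON file, not scikit-learn."""
    with open(path, encoding="utf-8") as f:
        return json.load(f).get(article_id, [])


# Hypothetical similarity scores for illustration only.
similarity = {
    "article1": {"article1": 1.0, "article2": 0.2, "article3": 0.4},
    "article2": {"article1": 0.2, "article2": 1.0, "article3": 0.0},
    "article3": {"article1": 0.4, "article2": 0.0, "article3": 1.0},
}
out_path = os.path.join(tempfile.gettempdir(), "recommendations.json")
table = export_top_n(similarity, out_path, top_n=2)
print(load_recommendations("article1", out_path))
```

The trade-off is that recommendations only change when the file is regenerated and uploaded again.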

To be continued tomorrow.

Author: @taoka_toshiaki

Note: The author wrote this article in their early 40s.

Profile
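I'm @taoka_toshiaki, based in Kochi Prefecture. Thank you for reading.
I have been writing posts for decades, as naturally as breathing. I take a day off now and then, but I try to post almost every day 😅.
I'm also on social media. Follows, likes, and shares are much appreciated 🙇.
SNS: @taoka_toshiaki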


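Posted: 2025-06-03 07:00

Tags

error, code, server, tag, data science, vectorizer, recommendation feature, top results, initial training, training, pre-trained, existing data, last, machine learning, statistics, natural language processing, good enough, function, similarity, similarity model construction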

A recent question: do personality tests and palm reading actually get it right?

2018.07.14

Logging

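Do name-based fortune-telling and palm reading actually get things right?
They somehow feel accurate, and since there is a statistical element to them,
I think they are right fairly often. But when I ask why they turn out right
in the first place, it seems to me that fortune-telling moves people's minds to a fair degree.
For example, you run a reading on your own name or an acquaintance's name,
see the result, and decide that this person has this kind of personality.
Without realizing it, you have had a fixed idea planted in your head.
That creates the assumption that "that's just the kind of person they are."
I get the feeling that such assumptions move society without anyone noticing.
In reality we cannot see the depths of another person's mind,
but people tend to fear what is unseen or unknown,
and perhaps people long ago eased that fear
with name readings and palm reading...
Maybe that has been handed down unbroken to the present day.
It just occurred to me during an early-morning walk.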

Author: @taoka_toshiaki

Note: The author wrote this article in their early 30s.


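Posted: 2018-07-14 05:10

Tags

that person, name fortune-telling, fixed ideas, basics, complete self-study, psychology, frightening parts, assumptions, personality tests, palm reading, early morning, the unknown, depths, present day, acquaintances, statistics, elements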

Divination or AI

2018.01.05

Logging

Hand divination
Fortune telling is statistical, so it can be trusted quite a bit.
So I believe in fortune telling. Psychology is pretty reliable, too.
I think artificial intelligence is, fundamentally, statistics.
In that sense it is the same as fortune telling, so both are highly reliable.
It seems that statistics has become popular with the general public
as artificial intelligence has become popular.
I believe it is now possible to infer people's lives using Google's large-scale data.
Suppose that in the future artificial intelligence gives you instructions, and that following them lets people live happy lives.
Would you follow the instructions of artificial intelligence?
I think I would obey. But I think most people would not.

Author: @taoka_toshiaki

Note: The author wrote this article in their early 30s.


Posted: 2018-01-05 05:00

Tags

also in the general public as artificial, artificial intelligence will give you instructions, both have high reliability, But I think most people will not, Divination or AI, Fortune telling is statistical so it can, Hand divination, I believe in fortune telling.Also, I believe that it is now possible, I think I will obey, I think that artificial intelligence is, In the future, It seems that statistics has become popular, psychology is pretty reliable, the same as fortunetelling, Therefore, Will you follow the instructions of artificial, statistics
