A Hands-On Guide to Stock Trading Prediction with AI and Python (Complete Code)


Published: 2025-01-03 08:42


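A note up front: this article walks you through using Python and AI for stock trading prediction. It first introduces several forecasting methods, in particular the ability of LSTM to handle sequence prediction. It then lays out the steps for a proof of concept, including installation and project setup, shows how the code is written (importing libraries, then training and testing the model with helper functions), and finally evaluates the model's performance.

We explored several ways of predicting stock prices: forecasting tools such as Facebook's Prophet, statistical techniques such as the SARIMA model, machine learning strategies such as polynomial regression, and AI-based recurrent neural networks (RNN). Among the many AI models and techniques, we found that the long short-term memory (LSTM) model gave the best results.

The LSTM model is a variant of the recurrent neural network architecture that is good at sequence prediction problems. Unlike a traditional feed-forward neural network, it has a memory-like structure that retains context across long sequences. This makes it well suited to time series forecasting, natural language processing, and other tasks that depend on sequential data. It addresses a basic weakness of the standard RNN by mitigating the vanishing and exploding gradient problems, which improves the model's ability to pick up long-term dependencies in a dataset. As a result, LSTM has become a preferred choice for complex tasks that require understanding data over long time spans.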
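As a rough illustration of the "memory-like structure" described above, here is a minimal NumPy sketch of a single LSTM cell stepping through one 20-step sequence of 5 features. The weights are random and untrained, and the gate ordering and the names lstm_step, W, U, and b are choices made for this sketch; Keras's LSTM layer organizes its parameters differently. The point is only to show how the forget, input, and output gates update the cell state.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, squashes values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """Run one LSTM time step.

    W: (4 * units, n_features), U: (4 * units, units), b: (4 * units,)
    The four stacked blocks are: input gate, forget gate, candidate, output gate.
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])   # input gate: how much new information to write
    f = sigmoid(z[1 * units:2 * units])   # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * units:3 * units])   # candidate values for the cell state
    o = sigmoid(z[3 * units:4 * units])   # output gate: how much of the cell to expose
    c_t = f * c_prev + i * g              # updated long-term memory (cell state)
    h_t = o * np.tanh(c_t)                # updated hidden state (the step's output)
    return h_t, c_t


rng = np.random.default_rng(0)
n_features, units, seq_len = 5, 4, 20

# Random, untrained parameters (illustrative only)
W = rng.normal(scale=0.1, size=(4 * units, n_features))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)

h = np.zeros(units)
c = np.zeros(units)
sequence = rng.random((seq_len, n_features))  # stand-in for 20 scaled OHLCV rows

for x_t in sequence:
    h, c = lstm_step(x_t, h, c, W, U, b)

print(h.shape, c.shape)
```

Because the cell state is updated additively (f * c_prev + i * g) rather than being pushed through a squashing function at every step, gradients have an easier path backwards through time than in a plain RNN.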

To validate its effectiveness, we developed a proof of concept.

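1. Preparation

You need the latest versions of Python and PIP installed on your computer (optionally with VSCode, which makes things more convenient: https://code.visualstudio.com/).

Create a Python project with a "main.py" file.

Add a "data" directory to the project.

Create and activate a virtual environment.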

trading-ai-lstm $ python3 -m venv venv
trading-ai-lstm $ source venv/bin/activate
(venv) trading-ai-lstm $

Create a "requirements.txt" file.

pandas
numpy
scikit-learn
scipy
matplotlib
tensorflow
eodhd
python-dotenv

Make sure PIP is upgraded and the dependencies are installed inside the virtual environment.

(venv) trading-ai-lstm $ pip install --upgrade pip
(venv) trading-ai-lstm $ python3 -m pip install -r requirements.txt

Add your EODHD API key to a ".env" file.

API_TOKEN=<YOUR_API_KEY_GOES_HERE>

Everything is ready. If you are using VSCode and want the same ".vscode/settings.json" file as ours, fork this project's GitHub repository (https://github.com/alexyu2013/trading-ai-lstm) so you have it on hand.

{
  "python.formatting.provider": "none",
  "python.formatting.blackArgs": ["--line-length", "160"],
  "python.linting.flake8Args": [
    "--max-line-length=160",
    "--ignore=E203,E266,E501,W503,F403,F401,C901"
  ],
  "python.analysis.diagnosticSeverityOverrides": {
    "reportUnusedImport": "information",
    "reportMissingImports": "none"
  },
  "[python]": {
    "editor.defaultFormatter": "ms-python.black-formatter"
  }
}

2. Building the Code

The first step is to import the necessary libraries.

import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'

import pickle
import pandas as pd
import numpy as np
from dotenv import load_dotenv
from sklearn.metrics import mean_squared_error, mean_absolute_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.models import load_model
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
from eodhd import APIClient

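TensorFlow usually emits a lot of warnings and debug messages. We prefer cleaner output, so we suppress these notices through os.environ right after importing the "os" module. Setting TF_CPP_MIN_LOG_LEVEL to '1' filters out INFO-level messages (higher values also hide warnings and errors), and it has to be set before TensorFlow is imported.

Training machine learning and AI models takes a lot of fine-tuning, mainly managed through so-called hyperparameters. This is a nuanced topic: mastering it takes continued study and patience, and the best choice of hyperparameters depends on many factors. Based on the daily S&P 500 data we obtained through the EODHD API (https://eodhd.com/), we started with some widely used settings. We encourage you to modify them to improve the results. For now, we recommend keeping the sequence length at 20.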

# Configurable hyperparameters
seq_length = 20
batch_size = 64
lstm_units = 50
epochs = 100
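To get a feel for what these settings imply, the sketch below works out the number of training windows and batches per epoch. It assumes the dataset has 3369 daily rows (the figure given in a code comment later in the article) and the 80/20 train/test split the article uses; your row count may differ.

```python
import math

seq_length = 20
batch_size = 64
n_rows = 3369  # assumption: row count taken from the article's code comment

# Each window needs seq_length rows of input plus the following row as its target
n_windows = n_rows - seq_length
train_size = int(0.8 * n_windows)
test_size = n_windows - train_size

# The last batch of an epoch may be smaller than batch_size, hence the ceiling
batches_per_epoch = math.ceil(train_size / batch_size)

print(n_windows, train_size, test_size, batches_per_epoch)
```

Under these assumptions a longer seq_length gives each sample more context but slightly fewer samples, while a larger batch_size means fewer weight updates per epoch.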

The next step is to read the EODHD API's API_TOKEN from our ".env" file.

# Load environment variables from the .env file
load_dotenv()

# Retrieve the API key
API_TOKEN = os.getenv('API_TOKEN')

if API_TOKEN is not None:
    print(f'API key loaded: {API_TOKEN[:4]}********')
else:
    raise LookupError('Failed to load API key.')

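You need a valid EODHD API_TOKEN to access the data successfully.

We have written several reusable functions and describe what they do in detail below. I have added code comments to these functions to explain how they work.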

def get_ohlc_data(use_cache: bool = False) -> pd.DataFrame:
    ohlcv_file = 'data/ohlcv.csv'

    if use_cache:
        if os.path.exists(ohlcv_file):
            return pd.read_csv(ohlcv_file, index_col=None)
        else:
            api = APIClient(API_TOKEN)
            df = api.get_historical_data(
                symbol='HSPX.LSE',
                interval='d',
                iso8601_start='2010-05-17',
                iso8601_end='2023-10-04',
            )
            df.to_csv(ohlcv_file, index=False)
            return df
    else:
        api = APIClient(API_TOKEN)
        return api.get_historical_data(
            symbol='HSPX.LSE',
            interval='d',
            iso8601_start='2010-05-17',
            iso8601_end='2023-10-04',
        )


def create_sequences(data, seq_length):
    x, y = [], []
    for i in range(len(data) - seq_length):
        x.append(data[i : i + seq_length])
        y.append(data[i + seq_length, 3])  # The prediction target 'close' is the 4th column (index 3)
    return np.array(x), np.array(y)


def get_features(df: pd.DataFrame = None, feature_columns: list = ['open', 'high', 'low', 'close', 'volume']) -> list:
    return df[feature_columns].values


def get_target(df: pd.DataFrame = None, target_column: str = 'close') -> list:
    return df[target_column].values


def get_scaler(use_cache: bool = True) -> MinMaxScaler:
    scaler_file = 'data/scaler.pkl'

    if use_cache:
        if os.path.exists(scaler_file):
            # Load the scaler
            with open(scaler_file, 'rb') as f:
                return pickle.load(f)
        else:
            scaler = MinMaxScaler(feature_range=(0, 1))
            with open(scaler_file, 'wb') as f:
                pickle.dump(scaler, f)
            return scaler
    else:
        return MinMaxScaler(feature_range=(0, 1))


def scale_features(scaler: MinMaxScaler = None, features: list = []):
    return scaler.fit_transform(features)


def get_lstm_model(use_cache: bool = False) -> Sequential:
    model_file = 'data/lstm_model.h5'

    if use_cache:
        if os.path.exists(model_file):
            # Load the model
            return load_model(model_file)
        else:
            # Train the LSTM model and save it
            model = Sequential()
            model.add(LSTM(units=lstm_units, activation='tanh', input_shape=(seq_length, 5)))
            model.add(Dropout(0.2))
            model.add(Dense(units=1))

            model.compile(optimizer='adam', loss='mean_squared_error')
            model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=(x_test, y_test))

            # Save the entire model to a HDF5 file
            model.save(model_file)

            return model
    else:
        # Train the LSTM model
        model = Sequential()
        model.add(LSTM(units=lstm_units, activation='tanh', input_shape=(seq_length, 5)))
        model.add(Dropout(0.2))
        model.add(Dense(units=1))

        model.compile(optimizer='adam', loss='mean_squared_error')
        model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=(x_test, y_test))

        return model


def get_predicted_x_test_prices(x_test: np.ndarray = None):
    predicted = model.predict(x_test)

    # Create a zero-filled matrix to aid in inverse transformation
    zero_filled_matrix = np.zeros((predicted.shape[0], 5))

    # Replace the 'close' column of zero_filled_matrix with the predicted values
    zero_filled_matrix[:, 3] = np.squeeze(predicted)

    # Perform inverse transformation
    return scaler.inverse_transform(zero_filled_matrix)[:, 3]


def plot_x_test_actual_vs_predicted(actual_close_prices: list = [], predicted_x_test_close_prices = []) -> None:
    # Plotting the actual and predicted close prices
    plt.figure(figsize=(14, 7))
    plt.plot(actual_close_prices, label='Actual Close Prices', color='blue')
    plt.plot(predicted_x_test_close_prices, label='Predicted Close Prices', color='red')
    plt.title('Actual vs Predicted Close Prices')
    plt.xlabel('Time')
    plt.ylabel('Price')
    plt.legend()
    plt.show()


def predict_next_close(df: pd.DataFrame = None, scaler: MinMaxScaler = None) -> float:
    # Take the last X days of data and scale it
    last_x_days = df.iloc[-seq_length:][['open', 'high', 'low', 'close', 'volume']].values
    last_x_days_scaled = scaler.transform(last_x_days)

    # Reshape this data to be a single sequence and make the prediction
    last_x_days_scaled = np.reshape(last_x_days_scaled, (1, seq_length, 5))

    # Predict the future close price
    future_close_price = model.predict(last_x_days_scaled)

    # Create a zero-filled matrix for the inverse transformation
    zero_filled_matrix = np.zeros((1, 5))

    # Put the predicted value in the 'close' column (index 3)
    zero_filled_matrix[0, 3] = np.squeeze(future_close_price)

    # Perform the inverse transformation to get the future price on the original scale
    return scaler.inverse_transform(zero_filled_matrix)[0, 3]


def evaluate_model(x_test: list = []) -> None:
    # Evaluate the model
    y_pred = model.predict(x_test)
    mse = mean_squared_error(y_test, y_pred)
    mae = mean_absolute_error(y_test, y_pred)
    rmse = np.sqrt(mse)

    print(f'Mean Squared Error: {mse}')
    print(f'Mean Absolute Error: {mae}')
    print(f'Root Mean Squared Error: {rmse}')

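Note that we added a "use_cache" variable to several of these functions. The aim is to cut redundant calls to the EODHD API and to avoid retraining the model repeatedly on the same daily data. Enabling "use_cache" stores data in files under the "data/" directory: if the data does not exist it is created, and if it already exists it is loaded. This noticeably improves efficiency when the script is run multiple times. To fetch fresh data on every run, either disable the "use_cache" option when calling the function or clear the files in the "data/" directory; both give the same result.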
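The load-if-present, otherwise-create-and-save pattern described above can be sketched in isolation. This is a generic illustration using only the standard library; the helper name cached, the JSON file format, and the fake expensive_fetch function are assumptions for the sketch and are not part of the article's code, which caches CSV, pickle, and HDF5 files instead.

```python
import json
import os
import tempfile


def cached(path, compute):
    """Return the JSON content of `path` if it exists; otherwise compute, save, and return it."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    value = compute()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(value, f)
    return value


calls = []


def expensive_fetch():
    """Hypothetical stand-in for a slow API call; records each invocation."""
    calls.append(1)
    return {"close": [784.0, 3538.0]}


with tempfile.TemporaryDirectory() as data_dir:
    cache_file = os.path.join(data_dir, "ohlcv.json")
    first = cached(cache_file, expensive_fetch)   # cache miss: calls expensive_fetch
    second = cached(cache_file, expensive_fetch)  # cache hit: reads the file instead

print(len(calls), first == second)
```

Deleting the cache file has the same effect as bypassing the cache: the next call recomputes the value and writes it again.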

Now we come to the core of the code…

if __name__ == '__main__':
    # Retrieve 3369 days of S&P 500 data
    df = get_ohlc_data(use_cache=True)
    print(df)

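First, we fetch OHLCV data from the EODHD API and store it in a Pandas DataFrame named "df". OHLCV stands for open, high, low, close, and volume, the standard attributes of trading candlestick data. As mentioned, we enable caching to streamline the process. We can also optionally print the data to the screen.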

[Image]

We will cover the following code block in one go…

    features = get_features(df)
    target = get_target(df)

    scaler = get_scaler(use_cache=True)
    scaled_features = scale_features(scaler, features)

    x, y = create_sequences(scaled_features, seq_length)

    train_size = int(0.8 * len(x))  # Create a train/test split of 80/20%
    x_train, x_test = x[:train_size], x[train_size:]
    y_train, y_test = y[:train_size], y[train_size:]

    # Re-shape input to fit lstm layer
    x_train = np.reshape(x_train, (x_train.shape[0], seq_length, 5))  # 5 features
    x_test = np.reshape(x_test, (x_test.shape[0], seq_length, 5))  # 5 features

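"features" contains the set of inputs we will use to predict the target (i.e. "close").

"target" contains a list of target values, such as "close".

"scaler" represents a method for normalizing numbers so that they are comparable. For example, our dataset may start with a close value of 784 and end at 3538. A higher number in the last row does not mean it should carry more weight in the prediction. Normalization ensures comparability.

"scaled_features" is the result of the scaling process, which we will use to train the AI model.

"x_train" and "x_test" are the datasets we will use to train and test the AI model, respectively. The common practice is an 80/20 split: 80% of the trading data is used for training and 20% for testing. The "x" indicates that these are the features, or inputs.

"y_train" and "y_test" work the same way but contain only the target values, such as "close".

Finally, the data must be reshaped to meet the requirements of the LSTM layer.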
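The preprocessing steps above can be sketched end to end on synthetic data. This version uses plain NumPy rather than the article's helper functions and MinMaxScaler, so the array names and the random data are illustrative only. The windowing and the chronological 80/20 split follow the same logic as create_sequences and the split in the main block.

```python
import numpy as np

seq_length = 20
n_rows, n_features = 120, 5  # small synthetic dataset for illustration

rng = np.random.default_rng(42)
data = rng.random((n_rows, n_features)) * 1000.0  # stand-in for OHLCV rows

# Min-max scaling, column by column, into the range [0, 1]
col_min = data.min(axis=0)
col_max = data.max(axis=0)
scaled = (data - col_min) / (col_max - col_min)

# Sliding windows: 20 consecutive rows as input, the next row's 'close' (index 3) as target
n_windows = n_rows - seq_length
x = np.array([scaled[i:i + seq_length] for i in range(n_windows)])
y = np.array([scaled[i + seq_length, 3] for i in range(n_windows)])

# Chronological 80/20 split: no shuffling, the most recent windows are held out
train_size = int(0.8 * len(x))
x_train, x_test = x[:train_size], x[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Shapes are already (samples, time steps, features), which is what an LSTM layer expects
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)
```

One design point worth considering: the article fits the scaler on the full dataset before splitting. Fitting the minimum and maximum on the training rows only would keep information about the test period out of the scaling step.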

We wrote a function that can either retrain the model or load a previously trained one.

    model = get_lstm_model(use_cache=True)

[Image]

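The image gives a glimpse of the training sequence. You will notice that at first the "loss" and "val_loss" metrics may not agree. As training proceeds, however, they should converge, which indicates that training is making progress.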


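Loss: the mean squared error (MSE) computed on the training dataset. It reflects the "cost" or "error" between the predicted and true labels in each training epoch. Our goal is to reduce this number over successive epochs.

Val_loss: this mean squared error is computed on the validation dataset and measures how the model performs on data it did not see during training. It is an indicator of the model's ability to generalize to new, unseen data.

To view the list of predicted close prices for the test set, you can use this code.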

    predicted_x_test_close_prices = get_predicted_x_test_prices(x_test)
    print('Predicted close prices:', predicted_x_test_close_prices)

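On their own, these numbers may not be particularly enlightening or intuitive. However, by plotting actual close prices against predicted close prices (note that this covers only 20% of the entire dataset), we get a clearer picture, as shown below.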

    # Plot the actual and predicted close prices for the test data
    plot_x_test_actual_vs_predicted(df['close'].tail(len(predicted_x_test_close_prices)).values, predicted_x_test_close_prices)

[Image]

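The results show that the model performed well at predicting close prices during the test phase.

Now for the most anticipated part: can we determine tomorrow's predicted close price?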

    # Predict the next close price
    predicted_next_close = predict_next_close(df, scaler)
    print('Predicted next close price:', predicted_next_close)

Predicted next close price: 3536.906685638428
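Both prediction helpers pad the model output into a zero-filled five-column matrix before calling inverse_transform. The reason is that the scaler was fitted on five columns, so it expects five columns back. Because min-max scaling works column by column, the zeros in the other columns do not affect the close value. The sketch below reproduces that idea with plain NumPy on three made-up OHLCV rows; the inverse_transform helper here is a hand-written stand-in, not scikit-learn's method.

```python
import numpy as np

# Three made-up OHLCV rows: open, high, low, close, volume
ohlcv = np.array([
    [780.0, 790.0, 775.0, 784.0, 1000.0],
    [3500.0, 3550.0, 3490.0, 3538.0, 5000.0],
    [2000.0, 2100.0, 1950.0, 2050.0, 3000.0],
])

# Per-column minimum and maximum, as a min-max scaler would learn them
col_min = ohlcv.min(axis=0)
col_max = ohlcv.max(axis=0)


def inverse_transform(scaled):
    """Undo min-max scaling for arrays with one column per fitted feature."""
    return scaled * (col_max - col_min) + col_min


predicted_scaled_close = 0.5  # pretend model output in the scaled [0, 1] range

# Pad the single prediction into a 5-column row so the shapes line up
padded = np.zeros((1, 5))
padded[0, 3] = predicted_scaled_close
via_padding = inverse_transform(padded)[0, 3]

# Equivalent shortcut: apply only the 'close' column's min and max
direct = predicted_scaled_close * (col_max[3] - col_min[3]) + col_min[3]

print(via_padding, direct)
```

Both routes give the same price, so the padding is purely a way to satisfy the scaler's expected input shape.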

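This is a basic example for illustration and only a starting point. From here you can consider adding more training data, adjusting the hyperparameters, or applying the model to different markets and time intervals. If you want to evaluate the model, you can include the following.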

    # Evaluate the model
    evaluate_model(x_test)

In our case the output was:

Mean Squared Error: 0.00021641664334765608
Mean Absolute Error: 0.01157513692221611
Root Mean Squared Error: 0.014711106122506767

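The "mean_squared_error" and "mean_absolute_error" functions come from scikit-learn's metrics module and compute the mean squared error (MSE) and mean absolute error (MAE), respectively. The root mean squared error (RMSE) is obtained by taking the square root of the MSE. Note that evaluate_model compares the scaled values, so these errors are in normalized (0 to 1) units rather than index points.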
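For reference, the three metrics are simple enough to compute by hand. The sketch below does so with NumPy on four made-up values (not the article's data), and then checks that the RMSE reported above is indeed the square root of the reported MSE.

```python
import math

import numpy as np

# Made-up example values, not taken from the article's model
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 5.0])

errors = y_true - y_pred
mse = np.mean(errors ** 2)      # mean squared error
mae = np.mean(np.abs(errors))   # mean absolute error
rmse = np.sqrt(mse)             # root mean squared error

print(mse, mae, rmse)

# Consistency check on the figures reported in the article's output
reported_mse = 0.00021641664334765608
reported_rmse = 0.014711106122506767
print(math.isclose(math.sqrt(reported_mse), reported_rmse, rel_tol=1e-9))
```

MSE penalizes large errors more heavily than MAE because the errors are squared before averaging, while RMSE brings the result back to the same units as the target.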

These metrics give a quantitative assessment of the model's accuracy, while the plot makes it easier to compare predicted and actual values visually.

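3. Summary

In this article I walked through the process of trading prediction with Python and AI. It started with various forecasting approaches, such as Facebook's Prophet, the SARIMA model, polynomial regression, and AI-based recurrent neural networks (RNN); of these, I consider the LSTM model the strongest. LSTM is a special kind of recurrent neural network that handles sequence prediction problems and mitigates the vanishing and exploding gradient problems of the standard RNN, which makes it suitable for tasks such as time series forecasting and natural language processing.

Next, I gave the preparation steps for a proof of concept: installing Python and PIP, creating the project and files, setting up a virtual environment, and creating the requirements.txt file. This also included a sample VSCode settings file and the project's GitHub repository.

In the coding section, I explained how to import the necessary libraries and call the EODHD API, and introduced a set of reusable functions for fetching data, creating sequences, getting features and target values, scaling features, getting the LSTM model, making predictions, and evaluating the model. We also discussed using a cache to cut unnecessary API calls and repeated data loading.

Finally, the article showed how to use these functions to train and test the LSTM model and how to predict the next trading day's close price. The model's performance is assessed by charting actual against predicted close prices and by computing metrics such as mean squared error (MSE), root mean squared error (RMSE), and mean absolute error (MAE). In short, it comes down to the following six points: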

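In our comparison, the LSTM model predicted better than the other methods, because it handles long-term dependencies better.

A caching mechanism improves data-processing efficiency by avoiding repeated API calls and model training.

Visualizing actual and predicted close prices, and computing the related error metrics, gives a direct view of the model's prediction accuracy.

The model should be trained and tested on different datasets to make sure it generalizes.

Adjusting the hyperparameters and using additional training data can further improve the model's performance.

The model's predictions can serve as a reference for trading decisions but should be used with caution, because predictions are not always accurate.

This article is for technical discussion and learning only and does not constitute investment advice.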
