The right way to load and use "pickle" files

an623267 · July 2019

I trained a model with TensorFlow and Keras and want to try it in backtesting, but I can't upload it to the environment.

I tried to upload a "pickle" file via the UI in the User Data section, but I got a message that says: "Binary/Executible Files Not Allowed".
I tried creating a new file via the UI with a "pickle" extension and copy-pasting the content of my file into it. This let me create the file, but I can't find a way to load it into the environment.
The built-in Python function "open" can't find this file. The function "service.read_file" returns the content of the file as a string, which I can't load with the "pickle" module.
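
A small local sketch of why the copy-and-paste route fails, using only the standard library. The pasted text file is simulated here by decoding the pickle bytes as text; that simulation is an assumption, not a description of what the CloudQuant UI actually does with the content.

```python
import pickle

# A pickle is a binary byte stream, not text.
payload = pickle.dumps({"weights": [0.1, 0.2]}, protocol=2)

# Loading the original bytes works.
restored = pickle.loads(payload)

# Simulate storing the pickle as text: bytes that are not valid UTF-8
# get replaced, so the original byte stream cannot be recovered.
as_text = payload.decode("utf-8", errors="replace")
corrupted = as_text.encode("utf-8") != payload

# pickle.loads accepts bytes only; passing a str raises TypeError.
try:
    pickle.loads(as_text)
    str_error = None
except TypeError as exc:
    str_error = exc

print(corrupted, type(str_error).__name__)
```

So even if the string returned by "service.read_file" were converted back to bytes, the content would most likely no longer be a valid pickle.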

I watched a few CloudQuant videos from Trevor Trinkino, in which he says he uses the "pickle" module to load a pretrained model into the environment.
Please help.

ptunney · July 2019

Due to security restrictions, there is no way to upload a binary file to the site.
Internal users like Trevor do not have those restrictions, so I can appreciate the frustration.
But if you are using our system for machine learning and are stuck at the point of uploading a pickle, get in touch with us at customer_success@cloudquant.com. We are very interested in what people are doing with our system and we want to help you achieve your goals.

an623267 · July 2019

Thank you. For now I managed to use the model by extracting its weights, saving them as CSV, and then making predictions with NumPy matrix multiplication in your environment. That's fine for the simple model I'm trying to use right now.

an623267 · July 2019

Maybe someone will find this useful. It works only for a simple dense model.

I save the weights in .h5 format with the Keras Model method:

model.save_weights('model_weights.h5')

Then extract the parameters from the file:

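import h5py

parameters = {}

# Open read-only; each top-level group corresponds to one layer.
# Note: h5py iterates groups in name order, so check that this matches the layer order.
with h5py.File('model_weights.h5', 'r') as f:
    count = 0
    for layer, group in f.items():
        # Skip layers that store no weights (empty groups).
        if len(group.keys()) == 0:
            continue
        count += 1
        for params_name in group.keys():
            params = group[params_name]
            for p_name in params.keys():
                if "kernel" in p_name:
                    # Keras stores the kernel as (n_in, n_out); transpose to (n_out, n_in).
                    parameters["W" + str(count)] = params[p_name][:].T
                if "bias" in p_name:
                    # Store the bias as a column vector of shape (n_out, 1).
                    parameters["b" + str(count)] = params[p_name][:].reshape(-1, 1)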
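
To see why the kernel is transposed and the bias reshaped, here is a minimal NumPy-only sketch. The random arrays stand in for a real Dense layer's weights (an assumption for illustration); the only fact relied on is that a Keras Dense kernel has shape (n_in, n_out) and the bias has shape (n_out,).

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 3, 2

# Stand-ins for one Dense layer's weights in Keras layout.
kernel = rng.normal(size=(n_in, n_out))
bias = rng.normal(size=(n_out,))

# Column-vector layout used by the extraction code.
W = kernel.T             # (n_out, n_in)
b = bias.reshape(-1, 1)  # (n_out, 1)

x_row = rng.normal(size=(1, n_in))  # Keras-style batch with one sample
x_col = x_row.T                     # same sample as a column vector

keras_style = x_row @ kernel + bias  # shape (1, n_out)
column_style = W @ x_col + b         # shape (n_out, 1)

# Both layouts give the same numbers, just transposed.
same = np.allclose(keras_style.T, column_style)
print(W.shape, b.shape, same)
```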

Save it to CSV:

import pandas as pd
for key, item in parameters.items():
    df = pd.DataFrame(item)
    df.to_csv(path_or_buf="name_" + key + ".csv", header=True, index=False)
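
With header=True and index=False, a DataFrame built from a plain array gets the column positions 0, 1, 2, ... as its header row. The sketch below reproduces that layout with the standard csv module (swapped in for pandas to keep it dependency-light) and reads it back with csv.DictReader, which is only an assumed stand-in for rows keyed by header name.

```python
import csv
import io

import numpy as np

W = np.array([[0.5, -1.0, 2.0],
              [1.5, 0.25, -0.75]])

# Write a header of column positions followed by one line per matrix row.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(range(W.shape[1]))
writer.writerows(W.tolist())

# Read it back: each row is a mapping from "0", "1", ... to a string cell.
buf.seek(0)
rows = list(csv.DictReader(buf))
restored = np.array([[float(r[str(i)]) for i in range(len(r))] for r in rows])

print(buf.getvalue())
print(restored.shape)
```

Every cell comes back as a string, so each value has to be converted with float() when the matrix is rebuilt.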

In the CloudQuant environment:

        import numpy as np  # belongs at the top of the script, outside the strategy class

        # The methods below are defined inside the strategy class.
        @classmethod
        def on_strategy_start(cls, md, service, account):
            cls.parameters = {}

            cls.parameters["W1"] = cls.get_parameters(service.read_file("name_W1.csv", format='csv'))
            cls.parameters["b1"] = cls.get_parameters(service.read_file("name_b1.csv", format='csv'))
            cls.parameters["W2"] = cls.get_parameters(service.read_file("name_W2.csv", format='csv'))
            cls.parameters["b2"] = cls.get_parameters(service.read_file("name_b2.csv", format='csv'))
            cls.parameters["W3"] = cls.get_parameters(service.read_file("name_W3.csv", format='csv'))
            cls.parameters["b3"] = cls.get_parameters(service.read_file("name_b3.csv", format='csv'))
            cls.parameters["W4"] = cls.get_parameters(service.read_file("name_W4.csv", format='csv'))
            cls.parameters["b4"] = cls.get_parameters(service.read_file("name_b4.csv", format='csv'))
            cls.parameters["W5"] = cls.get_parameters(service.read_file("name_W5.csv", format='csv'))
            cls.parameters["b5"] = cls.get_parameters(service.read_file("name_b5.csv", format='csv'))

        @staticmethod
        def get_parameters(file):
            # Each row maps the CSV header ("0", "1", ...) to a cell value.
            # Building the array in one step always yields a 2-D result, even for a single row.
            rows = []
            for row in file:
                rows.append([float(row[str(i)]) for i in range(len(row))])
            return np.array(rows)

        @classmethod
        def nn_forward(cls, X, parameters):
            # Each layer contributes one W and one b entry.
            L = len(parameters) // 2
            A_prev = X
            # Hidden layers 1..L-1 use ReLU.
            for i in range(1, L):
                Z = np.dot(parameters["W" + str(i)], A_prev) + parameters["b" + str(i)]
                A_prev = cls.relu(Z)

            # The output layer L uses softmax.
            ZL = np.dot(parameters["W" + str(L)], A_prev) + parameters["b" + str(L)]
            AL = cls.softmax(ZL)

            return AL

        @staticmethod
        def relu(X):
            return X*(X > 0)

        @staticmethod
        def softmax(X):
            # Subtract the column-wise max before exponentiating to avoid overflow.
            e = np.exp(X - X.max(axis=0))
            return e / e.sum(axis=0)

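        def on_minute_bar(self, event, md, order, service, account, bar):
            cls = self.__class__
            # X is not defined in this snippet; build it from your own features.
            prediction = cls.nn_forward(X, cls.parameters)

Here X is the input feature vector, with shape (n_features, 1).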
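
For checking the forward pass locally before uploading anything, a standalone sketch of the same computation may help. The tiny hand-written parameters and the input vector below are made up for illustration, and the functions are plain module-level versions rather than strategy methods.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Shift by the column-wise max for numerical stability.
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def nn_forward(x, parameters):
    n_layers = len(parameters) // 2
    a = x
    # Hidden layers use ReLU, the last layer uses softmax.
    for i in range(1, n_layers):
        a = relu(parameters["W%d" % i] @ a + parameters["b%d" % i])
    z = parameters["W%d" % n_layers] @ a + parameters["b%d" % n_layers]
    return softmax(z)

# Hypothetical two-layer network with 2 features and 2 classes.
parameters = {
    "W1": np.array([[1.0, -1.0], [0.5, 0.5]]),
    "b1": np.zeros((2, 1)),
    "W2": np.eye(2),
    "b2": np.zeros((2, 1)),
}
x = np.array([[2.0], [1.0]])  # shape (n_features, 1)

probs = nn_forward(x, parameters)
print(probs.shape, float(probs.sum()))
```

Comparing this output with model.predict on the same input is a quick way to confirm that the exported weights were loaded in the right order.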
