Limitations on the amount of sentiment that can be retrieved
kc975943
Posts: 55
I am trying to get a week's worth of sentiment for the S&P 500 universe. Following the original example, I have this code in on_strategy_start:
cls.sources = {'SM_TW':'!sentiment/sma/tw/15min',
               'SM_ST':'!sentiment/sma/st/15min'}
cls.sent_scores={symbol:np.zeros(3) for symbol in cls.universe}
cls.buzz={symbol:0 for symbol in cls.universe}
source='SM_TW'
hours=168
for symbol in cls.universe:
    sent_data=[]
    buzz_data=[]
    data = service.query_data(cls.sources[source],
                              symbol,
                              start_timestamp=(service.time() - service.time_interval(hours = hours )))
    for item in data:
        if item['s-score'] != 0 :
            sent_data.append(item['s-score'])
            buzz_data.append(item['s-buzz'])
    # some extra code that reduces the above arrays down to 3-4 numbers for given symbol
    # and populates the cls.sent_scores and cls.buzz dictionaries
Running it for one day, after quite a long time, I get a console message that simply says: "Killed" which presumably means it ran out of memory.
Is there a better way to get weekly sentiment?
Comments
As an initial, immediate suggestion, without testing anything at all, I would say the easiest thing you could do is separate the data sources.
Why would you combine StockTwits and Twitter?
Just pull one, or at the most one at a time.
I would regard them as quite different beasts and both will generate quite different signals.
I am only using one source in the above :) (line 5).
Unfortunately, at the moment, there is no consistent way to get large amounts of data.
The issue is pulling random data from a large source in many steps.
With Liberator we give users the ability to hand a list of symbols to a single query and get a block of results back.
service.query_data is meant for smaller calls, normally just before a trading decision, "should I get in or not?"
Most of the sentiment data is a rolling value anyway, so there is no need to pull historical data unless you want to detect the trend.
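If a trend is what you are after, a handful of samples is usually enough to estimate it. Here is a minimal sketch in plain Python that fits a least-squares slope to a few (hours ago, score) pairs. The function name sentiment_trend and the input layout are hypothetical and not part of the platform API; you would fill the inputs from whatever small queries you already make.

```python
def sentiment_trend(hours_ago, scores):
    """Least-squares slope of score versus time, in score units per hour.

    hours_ago: how long ago each sample was taken (0 = now).
    scores:    the sentiment value observed at each of those times.
    A positive result means sentiment has been rising toward the present.
    """
    if len(hours_ago) != len(scores):
        raise ValueError("hours_ago and scores must have the same length")
    n = len(scores)
    if n < 2:
        return 0.0  # not enough points to define a slope

    # Flip the sign so that x increases as we move toward "now".
    xs = [-float(h) for h in hours_ago]
    ys = [float(s) for s in scores]

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0  # all samples share one timestamp
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov_xy / var_x


# Four samples taken 3, 2, 1 and 0 hours ago, rising by one unit per hour.
rising = sentiment_trend([3, 2, 1, 0], [0.0, 1.0, 2.0, 3.0])
```

Four to eight samples per symbol keeps the number of stored values tiny compared with a full week of 15-minute bars.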
We are looking at adding the SMA Twitter and StockTwits data to the Liberator query, but yours was the first use case identified, so there is no option in place at the moment.
If you pull just one sentiment source (StockTwits or Twitter), for fewer symbols and a shorter period of time, you will probably have success with service.query_data.
So something like this should probably work as a starting point. Here the query is run in on_strategy_start (ideally it would be a single Liberator query run once in on_strategy_start)...
or if you prefer to do it in on_start for each individual symbol
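A minimal sketch of the per-symbol idea follows; it is not necessarily the code this reply had in mind. It assumes a hypothetical query_fn(symbol, hours) callable that wraps the same service.query_data call shown in the question, and it reduces each symbol's rows to a few numbers as they arrive instead of collecting them in lists. The names summarize_sentiment and fake_query are illustrative only.

```python
def summarize_sentiment(symbols, query_fn, hours=24):
    """Reduce each symbol's sentiment rows to a few numbers, one row at a time.

    query_fn(symbol, hours) is a hypothetical wrapper that should return an
    iterable of dicts with 's-score' and 's-buzz' keys. On the platform it
    could wrap the call from the question, for example:
        service.query_data(source, symbol,
            start_timestamp=service.time() - service.time_interval(hours=hours))
    """
    summary = {}
    for symbol in symbols:
        count = 0
        score_sum = 0.0
        buzz_sum = 0.0
        for item in query_fn(symbol, hours):
            if item['s-score'] == 0:
                continue  # same filter as the original loop
            count += 1
            score_sum += item['s-score']
            buzz_sum += item['s-buzz']
        summary[symbol] = {
            'n': count,
            'mean_score': score_sum / count if count else 0.0,
            'mean_buzz': buzz_sum / count if count else 0.0,
        }
    return summary


def fake_query(symbol, hours):
    """Stand-in for the real query so the sketch runs outside the platform."""
    return iter([
        {'s-score': 1.0, 's-buzz': 2.0},
        {'s-score': 0, 's-buzz': 5.0},   # skipped: zero score
        {'s-score': 3.0, 's-buzz': 4.0},
    ])


demo = summarize_sentiment(['AAA', 'BBB'], fake_query, hours=24)
```

Only three running totals are held per symbol, so memory use on the strategy side stays flat however many rows come back. Whether the query itself copes with a 168-hour window is a separate question, so starting with hours=24 and a short symbol list seems the safer test.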