WTI Oil Price against EIA Crude Inventories: why my machine learning algorithm failed to find the ultimate formula
The crude oil price, which fluctuated dramatically
last week, finally got the ultimate market stabilizer – a fall in
US crude oil inventories.
It gave the volatile markets a much-needed
push towards some sort of stability.
The draw of over 4 million barrels for the week
ending July 23 did the trick; the oil price started going green again on
investors’ screens.
That, however, has not always been the case, at least
recently; falls in US oil inventories did not bring about the desired effect
on the crude oil markets, especially while the hopes of reviving the JCPOA, the 2015
Iranian nuclear deal, see-sawed wildly over the past few months.
Against this backdrop, I was tempted this week to
check the feasibility of coming up with a model to predict the weekly WTI crude
price from the US crude inventories.
I wrote just 160 lines of JavaScript
in order to build a machine learning algorithm.
You can get real data – the weekly oil price and US crude inventories – from this link to practise with, or just to have some fun.
Get the weekly oil price and the corresponding US crude inventories in million barrels, and then click on the appropriate point on the grid. As you add more points, a regression line will be drawn in response to the input.
Although the machine learns from the past data,
there is nothing for me to learn from the output I get!
The machine learns – hence the name machine learning
– because it redraws the line of
regression every time a new data point is input; in other words, it draws what
statisticians call the line of best fit.
My over-simplified model, however, is
flawed, because I assumed a linear relationship between the two variables in
question and took them in isolation, as if other factors did not exist!
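The redraw-on-every-point behaviour described above can be sketched roughly as follows. This is not the original 160-line program, only a minimal illustration of an ordinary least-squares fit, assuming inventories on the x-axis and the WTI price on the y-axis. The names fitLine and addPoint are hypothetical, and the sample numbers are made up purely to show the mechanics; they are not real market data.

```javascript
// Hypothetical sketch: an ordinary least-squares line, refitted after every new point.
// x = US crude inventories (million barrels), y = weekly WTI price (USD per barrel).
const points = [];

// Returns { slope, intercept } for the line of best fit, or null if no line can be fitted.
function fitLine(data) {
  const n = data.length;
  if (n < 2) return null; // a line needs at least two points

  let sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
  for (const { x, y } of data) {
    sumX += x;
    sumY += y;
    sumXY += x * y;
    sumXX += x * x;
  }

  const denom = n * sumXX - sumX * sumX;
  if (denom === 0) return null; // all x values identical: slope is undefined

  const slope = (n * sumXY - sumX * sumY) / denom;
  const intercept = (sumY - slope * sumX) / n;
  return { slope, intercept };
}

// Each click on the grid would call this: store the point, then refit the whole line.
function addPoint(x, y) {
  points.push({ x, y });
  return fitLine(points);
}

// Made-up illustrative points, NOT real inventory or price figures.
addPoint(1, 3);
addPoint(2, 5);
console.log(addPoint(3, 7)); // { slope: 2, intercept: 1 }
```

The sketch carries exactly the weakness admitted above: it assumes a straight-line relationship and knows nothing beyond the two variables it is fed.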
Of course, there are complex models with their creators claiming to have identified every single factor that affects the crude oil price at present.
Even they must have had tough luck on April 20,
2020, when the price of oil went negative for the first time since records
began. They must have had a bad time during the past two weeks too, when the
price crashed owing to the unexpected spat between two supposedly close allies
in the Middle East; the investment banks had been predicting that the price of oil
would rise above $80 a barrel before that development!
The EIA, the US Energy Information Administration, has identified at
least seven factors that determine the price of crude oil: some of them are quantifiable
and a few are not.
The main challenge appears to be quantifying the
sentiment in the current circumstances.
Of course, there are algorithms for that too, though
no instruments.
I have worked with one of them recently: it just
counts the number of positive words or phrases – or negative ones, depending on the
market mood – to give a ‘score’ from 1 to 10. The frequent presence of highly positive
words, such as positive, remarkable, dramatic and encouraging, understandably gives a high score, and
negative words give a low score.
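A word-counting scorer of that kind might look something like the sketch below. It is not the tool I worked with: the function name sentimentScore, the list of negative words and the formula that maps the counts onto the 1 to 10 range are all assumptions made for illustration. Only the four positive words come from the description above.

```javascript
// Hypothetical sketch of a word-counting sentiment scorer (not any real tool's method).
const POSITIVE = new Set(["positive", "remarkable", "dramatic", "encouraging"]);
// Assumed list of negative words, chosen only for illustration.
const NEGATIVE = new Set(["negative", "weak", "crash", "disappointing"]);

// Returns a score between 1 (all negative) and 10 (all positive).
function sentimentScore(text) {
  // Lower-case the text and split it into plain words.
  const words = text.toLowerCase().match(/[a-z']+/g) || [];

  let pos = 0;
  let neg = 0;
  for (const w of words) {
    if (POSITIVE.has(w)) pos++;
    else if (NEGATIVE.has(w)) neg++;
  }

  // Assumption: with no sentiment words at all, sit at the midpoint of the scale.
  if (pos + neg === 0) return 5.5;

  // Assumption: scale the share of positive words linearly onto 1..10.
  return 1 + 9 * (pos / (pos + neg));
}

console.log(sentimentScore("A remarkable and encouraging week for crude")); // 10
console.log(sentimentScore("A weak market after the crash"));               // 1
```

The sketch also exposes the weakness of the approach: a headline such as "a dramatic crash" scores as neutral, because the counter has no idea what the words refer to.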
To be quite honest with the readers, in my humble
opinion these algorithms do not take us very far either when it comes to accurately – and sensibly
– predicting the price of oil on a given day or in a given
week.
In short, oil price models are going the same way
that most weather models are currently heading; the latter do
not forecast weather more than 5 days ahead for obvious reasons.
In London, UK, today’s forecast was completely
wrong. The morning forecast did not predict rain in the afternoon, but
it rained; yesterday’s forecast for today was thunderstorms, but we didn’t have
any of those either.
In short, the prediction of oil price is going to remain
unfinished science for the foreseeable future, despite the advent of Machine
Learning and endless data crunching; machines do learn from what is
input, but they don’t necessarily teach us something useful or reliable.