**Introduction:**

The aim of this article is to provide an efficient analysis of rare metals, looking at their behaviour and interconnections. Several evaluation metrics and techniques have been applied for this purpose. Great importance has been given to the economic interpretation, providing both a theoretical explanation and an ex-post evaluation on real data. Unfortunately, the amount of information proved to be excessive for a single article. That is why I decided to present a statistical analysis based on precious/rare metals only, without including the code needed to obtain the outputs. The main article is therefore divided into two parts, which give a thorough explanation of the whole subject and guide the reader through their titles: “Introduction: Statistical analysis on rare metals” and “Overview on Gold: Statistical analysis on Gold”. Anyone interested in the statistical procedures and code used to perform the analysis is free to contact the author. The R packages used in the analysis are listed at the end of the article.

Why are precious metals so interesting? Historically they were important as currency, but they are now regarded mainly as investments, industrial commodities, jewellery and stores of value. For this reason their price is higher than that of other metals. Almost everyone knows precious metals such as gold, silver or platinum, but most readers are not aware of palladium, iridium, rhodium, etc. and their uses, even though they are as precious as gold and platinum, if not more.

I selected the rarest metals in the world, and all data were collected from the Quandl website. Quandl is a Toronto-based platform for financial, economic, and alternative data, serving investment professionals. All of Quandl’s data are accessible via an API (Application Programming Interface), with access available through packages for multiple programming languages including R, Python, Matlab, etc. To perform our analysis we are going to use R, an open-source programming language and software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing.

The rare metals considered (see Fig. 1) are: (1) Gold (Au), (2) Silver (Ag), (3) Platinum (Pt), (4) Palladium (Pd), (5) Iridium (Ir), (6) Rhodium (Rh) and (7) Ruthenium (Ru). All prices are expressed in $/toz to allow a fair comparison among them. The unit of measure “toz” (troy ounce) was used during the Middle Ages in Troyes, France, when dealing with precious metals, and it is still used today for pricing rare metals. One troy ounce is equal to 31.1034768 grams and should not be confused with the standard (avoirdupois) ounce.
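As a small worked illustration of the unit, the following Python sketch converts a quote in $/toz into $/g and $/kg. It is not part of the R code used for the article; the function names and the example quote are hypothetical, and only the troy-ounce factor comes from the text above.

```python
# One troy ounce in grams, as stated in the text.
GRAMS_PER_TROY_OUNCE = 31.1034768
# One standard (avoirdupois) ounce in grams, shown only for contrast.
GRAMS_PER_AVOIRDUPOIS_OUNCE = 28.349523125


def usd_per_gram(usd_per_toz: float) -> float:
    """Convert a price quoted in $/toz into $/g."""
    return usd_per_toz / GRAMS_PER_TROY_OUNCE


def usd_per_kilogram(usd_per_toz: float) -> float:
    """Convert a price quoted in $/toz into $/kg."""
    return 1000.0 * usd_per_gram(usd_per_toz)


# Hypothetical quote, not taken from the article's data set.
example_quote = 1250.0
print(f"{example_quote:.2f} $/toz = {usd_per_gram(example_quote):.2f} $/g")
```

Because the troy ounce is roughly 10% heavier than the standard ounce, mixing the two units would distort any comparison between metals.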

Gold and Silver data are downloaded from the London Bullion Market Association (LBMA), an international trade association representing the London market for gold and silver bullion, which has a global client base. It is also the marketplace and clearing house for physical gold and silver traded wholesale between central banks, producers, refiners and fabricators. Platinum, Palladium, Iridium, Rhodium and Ruthenium data, on the other hand, are downloaded from Johnson Matthey (JM), a leading global speciality chemicals company with departments dedicated to environmental technology, precious metals, and fine chemicals.

**Statistical analysis on rare metals:**

The period of interest covers four years, from 1st April 2013 to 31st March 2017 (see Chart 1). To perform a better analysis, I decided to split the period into four sub-periods of exactly one year each.

We introduce our analysis with a short descriptive summary of the trend of every metal selected, reported in table 1. For each metal we show the mean, the standard deviation, and the minimum and maximum values together with the dates on which they were recorded. Reporting data for every sub-period would be too long and tedious, so the descriptive statistics refer to the whole period analysed, from 2013 to 2017.
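The kind of summary described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the R code behind table 1: the function name `describe` and the toy prices are hypothetical, and the standard deviation is assumed to be the sample one.

```python
from datetime import date
from statistics import mean, stdev


def describe(observations):
    """Summarise a list of (date, price) pairs for one metal."""
    prices = [price for _, price in observations]
    # Keep the date attached to each extreme value.
    min_date, min_price = min(observations, key=lambda obs: obs[1])
    max_date, max_price = max(observations, key=lambda obs: obs[1])
    return {
        "mean": mean(prices),
        "sd": stdev(prices),  # sample standard deviation (n - 1 denominator)
        "min": min_price,
        "min_date": min_date,
        "max": max_price,
        "max_date": max_date,
    }


# Hypothetical prices in $/toz, invented for illustration only.
toy_series = [
    (date(2013, 4, 1), 1600.0),
    (date(2013, 4, 2), 1580.0),
    (date(2013, 4, 3), 1560.0),
    (date(2013, 4, 4), 1575.0),
]
summary = describe(toy_series)
print(summary)
```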

It is interesting to observe that the value of most of the metals started to decline from 2nd April 2013 (minimum value), which suggests that all the metals are more or less correlated. If we observe the chart above, we can see that all metals lose value at first (downward trend) and then become stationary.

From the correlation matrix reported in table 2, it is possible to evaluate which metals are connected by a linear association. In detail, the linear correlation coefficient for the pair Gold and Silver is equal to 0.91, while the coefficient for Platinum and Ruthenium is 0.89. Both indicate a strong positive linear dependency between the two metals examined, which tend to move in the same direction and by similar amounts. Having observed the correlations over the whole time span, it is now interesting to study how they behave across different sub-periods of that span.
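For readers who want to see how such a matrix is obtained, here is a minimal Python sketch of the Pearson coefficient and of a correlation matrix built from it. It is not the R code used for table 2; the helper names and the toy series are hypothetical and are not the article's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(series):
    """Nested dict of pairwise correlations for a dict of name -> series."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Toy series invented for illustration only.
toy = {
    "Gold": [1.0, 2.0, 3.0, 4.0],
    "Silver": [2.0, 4.1, 5.9, 8.0],
    "Rhodium": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

A coefficient close to +1 means the two series move together almost linearly, a value close to -1 means they move in opposite directions, and a value near 0 means no linear association.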

We are analysing the series of the rare metals from 2013-04-01 to 2017-03-31. We divide this span into four periods of equal length that we can compare, in order to verify whether the linear relationship between the series changes with the reference period. We have observed some remarkable values:

- The correlation of 0.91 between Gold and Silver over the whole reference period is, in truth, driven by the movements of the two series in the first and second year of the study. The sub-period analysis shows that the correlation coefficient between the two series was equal to 0.93 in the first year and 0.933 in the second, and “only” 0.76 and 0.84 in the third and fourth year. This is evidence of a behavioural detachment between the two series.
- A similar behaviour can be seen in the correlation coefficient between Ruthenium and Platinum, which was over 0.90 in the second and third periods and around 0.50 in the first and the last ones. For Gold and Platinum the coefficient is always higher than 0.80, except in the third period (only 0.60).
- The behaviour of Rhodium is very curious during the second period, because it is the only metal that shows no correlation with the others. Its correlation coefficient is slightly negative most of the time, but very close to zero (from -0.20 to 0.28).
- The last period is very interesting from a statistical perspective (see table 3), because two different clusters are easy to distinguish in the graph. The first is composed of Gold, Silver, Platinum and Ruthenium, while the second is composed of Iridium, Palladium and Rhodium. We can say so because the metals of the first group are positively correlated with one another and negatively correlated with the metals of the second.
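The sub-period comparison summarised in the list above can be sketched as follows in Python. This is an illustration under stated assumptions rather than the article's R code: each sub-period is assumed to run from 1 April to 31 March, and the function names and toy observations are hypothetical.

```python
from datetime import date
from math import sqrt

PERIOD_START = date(2013, 4, 1)  # first day of the sample, from the text
N_PERIODS = 4                    # four one-year sub-periods


def sub_period(day):
    """Index (0..3) of the April-to-March sub-period that contains `day`."""
    # January-March belong to the sub-period opened in the previous calendar year.
    index = day.year - PERIOD_START.year - (1 if day.month < 4 else 0)
    if not 0 <= index < N_PERIODS:
        raise ValueError(f"{day} is outside the sample")
    return index


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_by_period(dates, x, y):
    """One correlation per sub-period; None where fewer than two points exist."""
    buckets = [([], []) for _ in range(N_PERIODS)]
    for day, a, b in zip(dates, x, y):
        xs, ys = buckets[sub_period(day)]
        xs.append(a)
        ys.append(b)
    return [pearson(xs, ys) if len(xs) >= 2 else None for xs, ys in buckets]


# Toy observations invented for illustration: the two series move together
# in the first year and in opposite directions in the second.
dates = [date(2013, 4, 1), date(2013, 10, 1), date(2014, 3, 31),
         date(2014, 4, 1), date(2014, 10, 1), date(2015, 3, 31)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0, 6.0, 4.0, 2.0]
by_period = correlation_by_period(dates, x, y)
print(by_period)
```

The toy data show why a single coefficient over the whole span can hide very different behaviour in individual years.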

After this, a cluster analysis is very useful, because it allows homogeneous groups to be identified within a heterogeneous starting sample. This is achieved by maximizing the variance between the different groups and minimizing the variance within each group.

The grouping is built by an agglomerative algorithm that, starting from n individuals, repeatedly merges the two closest individuals or previously obtained groups. The successive aggregations generate groups and subgroups according to the dissimilarity criterion. Another algorithm, the “Divisive” one, starts from a single group containing all individuals and splits it into subgroups until n groups are created.
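The agglomerative procedure can be illustrated with a short Python sketch. It is a generic illustration, not the article's R code: the text does not say which linkage rule was used, so average linkage is an assumption here, and the function names and toy positions are hypothetical.

```python
def average_linkage(cluster_a, cluster_b, dist):
    """Mean pairwise dissimilarity between the members of two clusters."""
    total = sum(dist(a, b) for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(items, dist, n_clusters):
    """Merge the two closest clusters until only `n_clusters` remain."""
    if not 1 <= n_clusters <= len(items):
        raise ValueError("n_clusters must be between 1 and the number of items")
    clusters = [[item] for item in items]  # start: every individual on its own
    merges = []                            # history of (left, right, height)
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], dist)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges


# Hypothetical one-dimensional positions, invented for illustration only.
positions = {"A": 0.0, "B": 1.0, "C": 10.0, "D": 11.0}
clusters, merges = agglomerate(
    list(positions),
    lambda a, b: abs(positions[a] - positions[b]),
    n_clusters=2,
)
print(clusters)
```

In a time-series setting, `dist` would be replaced by a dissimilarity between two price series, which is where the choice of distance becomes decisive.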

In this article, to perform a cluster analysis, I decided to use two distances: the Markov Operator Distance and the Dynamic Time Warping (DTW) Distance. A Markov process is a stochastic process whose future behaviour depends only on its present state and not on the path that led to it. The behaviour of a business or economy, the flow of traffic and the progress of an epidemic are often modelled as Markov processes. Through the Markov Operator distance it is possible to cluster the time series according to a criterion based on the transition densities, and therefore on the model followed by each rare metal during the period considered. Dynamic time warping, instead, is an algorithm for measuring the similarity between two temporal sequences whose speed may vary. In general, DTW calculates an optimal match between two given sequences under certain restrictions.
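To make the DTW idea concrete, here is a minimal Python sketch of the classic dynamic-programming formulation, with absolute difference as the local cost and no window constraint. These choices are assumptions made for illustration and may differ from the settings of the R implementation used for the article; the toy sequences are invented.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Same shape traced at a different speed: the warping absorbs the difference.
same_shape = dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
# A constant vertical gap cannot be warped away.
shifted = dtw_distance([0.0, 0.0], [1.0, 1.0])
print(same_shape, shifted)
```

Unlike a point-by-point distance, DTW can compare sequences of different lengths, which is why it suits series that follow the same pattern at different speeds.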

In general, even though the period of analysis has been divided into four sub-periods, the clusters obtained with a given metric remain consistent as the reference time span varies. However, a different definition of distance considerably changes the cluster structure.

We notice in chart 2 that no distance whatsoever divides the cluster formed by Gold, Platinum, Palladium, Rhodium and Iridium. It is also important to note that, using any of the other four distances considered, Silver and Ruthenium always stay in a single cluster.

Market participants usually agree that certain pairs of assets (X1, X2) share a lead-lag effect, in the sense that the lagger (or follower) price process X1 tends to partially reproduce the oscillations of the leader (or driver) price process X2 with some temporal delay, or vice versa. However, using the largest group, the picture is not clear and it is not possible to identify a leader. That is why I decided to perform the lead-lag analysis on two different groups, the first made up of Gold, Silver and Platinum, and the second composed of Iridium, Rhodium and Ruthenium. With this strategy the results acquire a deeper meaning.

In the first group, chart 3 shows that Platinum and Palladium are anticipated by Gold. The lead-lag matrices of the four periods clearly show that the values estimated by maximization for our parameter theta are equal to -1/256 in the Gold columns. Consequently, the maximum correlation with the other rare metals lagged at t - 1/256 (a one-step time lag) is reached by the Gold time series. I picked a time step of 1/256 because this commodity market should have approximately 256 trading days in one year. That is why Gold anticipates the other rare metals, making it the most important. The matrix results are shown above, where the red boxes represent the negative time-lag correlation between two rare metals for the first and last period analysed.
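The logic of choosing the lag that maximizes the correlation can be sketched in Python with a plain lagged cross-correlation on an evenly spaced grid. This is a simplified stand-in, not the estimator used for the article, whose exact form and sign convention are not given in the text. The convention below (a positive lag means the first series leads), the function names and the simulated data are all assumptions.

```python
import random
from math import sqrt

DT = 1 / 256  # one trading day as a fraction of a year, as in the text


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lagged_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag], with lag in whole steps."""
    n = len(x)
    if lag >= 0:
        return pearson(x[:n - lag], y[lag:])
    return pearson(x[-lag:], y[:n + lag])


def best_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] with the largest absolute correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: abs(lagged_correlation(x, y, k)))


# Simulated data, invented for illustration: the follower repeats the
# leader's moves one step later.
rng = random.Random(0)
leader = [rng.gauss(0.0, 1.0) for _ in range(300)]
follower = [0.0] + leader[:-1]

steps = best_lag(leader, follower, max_lag=3)
theta = steps * DT  # the lag expressed in years
print(steps, theta)
```

With real prices the same search would normally be run on returns rather than on price levels, because trending levels are correlated at almost any lag.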

Performing the analysis on the second group, I realized that the situation is not as clear as in the first one. The leader seems to be Iridium, but this is probably due to the gaps in the time series of these metals, so we will not consider this result.

Now we are about to statistically analyse the most significant asset, Gold, with a special focus on the most relevant features of its log-returns. Specific tests will be performed over the two periods considered to verify the independence of the log-returns and the normality of their distribution. But before applying this analysis we are going to introduce an overview of Gold and its use.
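For reference, the log-return of a price series is r_t = ln(P_t / P_{t-1}). A minimal Python sketch of this transformation follows; it is not the article's R code, and the prices are invented for illustration.

```python
from math import log


def log_returns(prices):
    """Log-returns r_t = ln(P_t / P_{t-1}) of a strictly positive price series."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [log(cur / prev) for prev, cur in zip(prices, prices[1:])]


# Hypothetical prices in $/toz, not taken from the article's data set.
toy_prices = [1200.0, 1212.0, 1199.88, 1205.0]
returns = log_returns(toy_prices)
print([round(r, 5) for r in returns])
```

Log-returns are additive over time: their sum equals the log-return over the whole interval, which is one reason they are convenient for tests of independence and normality.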
