Abnormal tail asymmetry based on kernel density estimation
factor.formula
Abnormal tail asymmetry (S_φ):
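Kernel density estimation function: $f_1(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{r_i - x}{h}\right)$
Gaussian kernel function: $K(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$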
Formula explanation:
- $S_φ$:
Abnormal tail asymmetry measures how asymmetric the tails of the return distribution are. A positive value indicates a heavier right tail, and a negative value indicates a heavier left tail. The larger the absolute value, the more pronounced the asymmetry of the return distribution.
- $E_φ$:
The difference between the tail deviations of the actual return distribution from the symmetric distribution, used to determine whether the return distribution is left-skewed or right-skewed: $E_φ = \int_{-\infty}^{-k} (f_1(x) - f_2(x))^2 dx - \int_{k}^{+\infty} (f_1(x) - f_2(x))^2 dx$. When $E_φ > 0$, the left-tail deviation is greater than the right-tail deviation and the return distribution is left-skewed; conversely, when $E_φ < 0$, the right-tail deviation is greater than the left-tail deviation and the return distribution is right-skewed.
- $E_{i,d}$:
Idiosyncratic return is the part of an individual stock's return that remains after removing market and industry risk. It is estimated as the residual of a linear regression: $R_{i,d} = \alpha_i + \beta_i R_{m,d} + \gamma_i R_{ind,d} + E_{i,d}$, where $R_{i,d}$ is the return of individual stock i on day d, $R_{m,d}$ is the market return on day d, $R_{ind,d}$ is the industry return on day d, and $E_{i,d}$ is the idiosyncratic return.
- $k$:
The tail threshold defines the extreme regions of the return distribution. It is usually a positive number, such as 1.5 or 2, meaning that returns more than $k$ standard deviations above or below the mean are treated as the tails of the distribution. The choice of $k$ affects the sensitivity of the factor and can be adjusted to the specific situation. Generally, a larger $k$ makes the factor focus on asymmetry in the more extreme tails.
- $f_1(x)$:
The kernel density estimate of the actual returns, that is, a non-parametric estimate of their probability density.
- $f_2(x)$:
The probability density function of a symmetric reference distribution against which the actual return distribution is compared. It usually has a mean of 0 and the same variance as the actual returns, for example a normal distribution.
- $n$:
The sample size used to estimate the kernel density function, that is, the number of trading days in the time window used to calculate the factor.
- $r_i$:
The idiosyncratic return of the $i$-th observation.
- $h$:
The bandwidth of the kernel density estimate, which controls how smooth the estimated density is. A smaller bandwidth gives a finer estimate that may be too sensitive to individual observations; a larger bandwidth gives a smoother estimate that may lose detail. It is usually set by Silverman's rule of thumb: $h ≈ 1.06\hat{\sigma}n^{-1/5}$, where $\hat{\sigma}$ is the standard deviation of the idiosyncratic return.
- $K(z)$:
The Gaussian kernel function weights the influence of each sample point on the target point $x$, where $z$ is the normalized distance, that is, $z = \frac{r_i - x}{h}$. The Gaussian kernel gives greater weight to sample points closer to the target point.
factor.explanation
This factor measures the asymmetry of the return distribution and is an effective supplement to traditional skewness. By comparing the actual return distribution with a symmetric distribution, it captures asymmetry in extreme cases, especially the imbalance between the two tails. Empirical studies have shown a significant cross-sectional relationship between the tail asymmetry of a stock's return distribution and its future returns. Generally speaking, the higher the tail asymmetry (the heavier the right tail), the lower the future return tends to be, and vice versa. However, the factor's predictive ability may be affected by market environment, investor sentiment, and volatility. For example, when market sentiment is optimistic, tail asymmetry may be negatively correlated with future returns, while when market sentiment is pessimistic, it may be positively correlated with future returns. In practical applications the factor should therefore be combined with other factors for a comprehensive analysis. In addition, because the factor is computed with non-parametric methods, it does not impose a parametric form on the return distribution and can reflect its actual shape more faithfully, although the bandwidth $h$ and the tail threshold $k$ still have to be chosen.