Understanding Returns and Assessing Risks with Value at Risk
This guide simplifies the understanding of investment returns and risk management. It starts with how to calculate and interpret percentage and compound returns, then explains monthly and annualized returns. The focus then shifts to understanding and measuring volatility and risk, including practical Python examples. It covers how to assess investment risks and returns using the Sharpe Ratio and real-world data, and introduces concepts like skewness, kurtosis, Value at Risk (VaR), and Conditional Value at Risk (CVaR). These tools help predict potential investment losses and provide a solid foundation for anyone looking to understand the essentials of financial risk management.
Analyzing Returns
Percentage Returns Explained
Percentage return measures the financial gain or loss between two points in time:

$$R_{t,t+1} = \frac{P_{t+1} - P_t}{P_t},$$

where $P_t$ and $P_{t+1}$ are the asset prices at the beginning and the end of the period, respectively.
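As a minimal sketch with illustrative numbers (a price moving from 100 to 103.5):

p_start, p_end = 100.0, 103.5
pct_return = (p_end - p_start) / p_start   # (P_{t+1} - P_t) / P_t
print(f"Percentage return: {pct_return:.2%}")   # 3.50%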
Understanding Compound Returns
Compound Returns over Multiple Periods: The total return over several periods is not merely the sum of the individual returns. Consider two consecutive periods, with prices $P_t$, $P_{t+1}$, $P_{t+2}$ and per-period returns $R_{t,t+1}$ and $R_{t+1,t+2}$. Since $P_{t+2} = P_{t+1}(1 + R_{t+1,t+2}) = P_t(1 + R_{t,t+1})(1 + R_{t+1,t+2})$, the total return over two periods is

$$R_{t,t+2} = (1 + R_{t,t+1})(1 + R_{t+1,t+2}) - 1.$$

In a timeframe spanning $N$ consecutive periods, the compound return generalizes to

$$R_{\text{total}} = \prod_{i=1}^{N}(1 + R_i) - 1.$$
Practical Examples
Two-Day Stock Performance: A stock increasing by 10% on day one and decreasing by 3% on day two has a compound return of $(1 + 0.10)(1 - 0.03) - 1 = 6.7\%$, not simply $10\% - 3\% = 7\%$.
Annualized Return from Quarterly Returns: A stock with consistent 1% quarterly returns yields an annualized return of $(1 + 0.01)^4 - 1 \approx 4.06\%$.
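Both examples can be checked in a few lines of Python (a quick sketch using the numbers above):

# Two-day compound return: +10% then -3%
two_day = (1 + 0.10) * (1 - 0.03) - 1
print(f"Two-day compound return: {two_day:.2%}")   # 6.70%, not 7%

# Annualizing a constant 1% quarterly return
annualized = (1 + 0.01) ** 4 - 1
print(f"Annualized return from 1% per quarter: {annualized:.2%}")   # ~4.06%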
Monthly & Annual Returns
Monthly Returns: Given two monthly returns $R_1$ and $R_2$, the compound total return after two months is $R_{\text{total}} = (1 + R_1)(1 + R_2) - 1$. To find the equivalent constant monthly return $R_m$, we solve $(1 + R_m)^2 = 1 + R_{\text{total}}$, giving $R_m = (1 + R_{\text{total}})^{1/2} - 1$.
Annualized Returns: The annualized return $R_a$ is derived from the monthly return using $R_a = (1 + R_m)^{12} - 1$. For a series of monthly returns, the formula becomes

$$R_a = \left(\prod_{i=1}^{N}(1 + R_i)\right)^{12/N} - 1,$$

where $N$ is the number of months.
Generalizing Annualized Returns
For different time intervals (daily, weekly, monthly), the annualized return formula simply adjusts the exponent, with $P_y$ representing the number of periods per year:

$$R_a = \left(\prod_{i=1}^{N}(1 + R_i)\right)^{P_y/N} - 1,$$

where $P_y$ is typically 252 for daily data, 52 for weekly data, and 12 for monthly data.
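A minimal sketch of this formula, assuming a short series of illustrative monthly returns:

import numpy as np

# Illustrative monthly returns (hypothetical values)
monthly_returns = np.array([0.02, -0.01, 0.015, 0.03, -0.005, 0.01])
periods_per_year = 12                       # P_y for monthly data
n = len(monthly_returns)

total_return = np.prod(1 + monthly_returns) - 1
annualized_return = (1 + total_return) ** (periods_per_year / n) - 1
print(f"Total return over {n} months: {total_return:.2%}")
print(f"Annualized return: {annualized_return:.2%}")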
Assessing Volatility and Risk
Volatility, a common risk measure, is the standard deviation of asset returns:

$$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(R_i - \bar{R}\right)^2},$$

where $R_i$ are the individual periodic returns, $\bar{R}$ is their mean, and $N$ is the number of observations.
When dealing with monthly return data, the calculation of volatility usually focuses on the monthly scale, termed as monthly volatility. However, to understand the asset's risk profile over a longer period, such as a year, this monthly volatility needs to be scaled up. This process is necessary because volatility metrics derived from different time intervals are not directly comparable.
The conversion to annualized volatility is

$$\sigma_a = \sigma_p \sqrt{P_y},$$

where $\sigma_p$ is the volatility measured at the original frequency and $P_y$ is the number of periods per year.
For different time frames, the calculation adjusts as follows:
- Monthly volatility $\sigma_m$: annualize with $\sigma_a = \sigma_m \sqrt{12}$
- Weekly volatility $\sigma_w$: annualize with $\sigma_a = \sigma_w \sqrt{52}$
- Daily volatility $\sigma_d$: annualize with $\sigma_a = \sigma_d \sqrt{252}$
This method standardizes volatility to a yearly scale, allowing for a consistent and comparable measure of risk across different time frames.
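A minimal sketch of this scaling, using illustrative monthly return values:

import numpy as np

# Hypothetical monthly returns, purely for illustration
monthly_returns = np.array([0.02, -0.01, 0.015, 0.03, -0.005, 0.01])

monthly_vol = monthly_returns.std(ddof=1)   # sample standard deviation of monthly returns
annual_vol = monthly_vol * np.sqrt(12)      # scale the monthly figure to a yearly one
print(f"Monthly volatility:    {monthly_vol:.2%}")
print(f"Annualized volatility: {annual_vol:.2%}")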
Zero Volatility Concept
Consider this scenario: Asset A experiences a monthly decrease of 1% over a period of 12 months, while Asset B sees a consistent monthly increase of 1% during the same timeframe.
Which asset exhibits greater volatility?
Interestingly, the answer is that neither of them displays any volatility: because each asset's monthly return is constant (-1% for Asset A, +1% for Asset B), the standard deviation of the returns is zero in both cases. Asset A consistently decreases, whereas Asset B consistently increases.
import pandas as pd
from statsmodels.iolib.table import SimpleTable
# 1) Create your DataFrame
a = [100]
b = [100]
for i in range(12):
a.append(a[-1] * 0.99) # Asset A loses 1% each month
b.append(b[-1] * 1.01) # Asset B gains 1% each month
asset_df = pd.DataFrame({"Asset A": a, "Asset B": b})
asset_df["Return A"] = asset_df["Asset A"].pct_change()
asset_df["Return B"] = asset_df["Asset B"].pct_change()
# 2) Prepare data for SimpleTable
data = asset_df.values.tolist() # Convert the DataFrame to a list of rows
headers = asset_df.columns.tolist() # Use DataFrame column names as headers
# 3) Build and display a statsmodels SimpleTable
table = SimpleTable(data, headers)
print("Asset values and returns:")
print(table)
# 4) Compute & display summary statistics in a second table
total_returns = (1 + asset_df[["Return A", "Return B"]].iloc[1:]).prod() - 1
volatility = asset_df[["Return A", "Return B"]].iloc[1:].std()
stats_data = [
["Total Returns", f"{total_returns['Return A']:.2%}", f"{total_returns['Return B']:.2%}"],
["Volatility", f"{volatility['Return A']:.4f}", f"{volatility['Return B']:.4f}"]
]
stats_headers = ["Metric", "Asset A", "Asset B"]
stats_table = SimpleTable(stats_data, stats_headers)
print("\nSummary Stats:")
print(stats_table)
Python Example: Analyzing Stock Data
# In this analysis, we first generate synthetic stock prices for two stocks with distinct volatilities.
# We then calculate their monthly returns and visualize the data.
# The total compound returns, mean returns, and volatility are computed to understand the risk profile of each stock.
# A later snippet then calculates the Return on Risk (ROR) for each stock, which provides insight into the risk-adjusted performance of the investments.
# That analysis suggests that Stock A offers a better return per unit of risk compared to Stock B, even though their total returns are similar.
import numpy as np
import pandas as pd
import json
# Example of generating stock prices with a DatetimeIndex
np.random.seed(51)
dates = pd.date_range(start='2020-01', periods=10, freq='M') # Monthly dates
stocks = pd.DataFrame({
"Stock A": np.random.normal(10, 1, size=10),
"Stock B": np.random.normal(10, 5, size=10)
}, index=dates)
stocks.index.name = "Months"
stocks = round(stocks, 2)
# Calculating returns
stocks["Stock A Rets"] = stocks["Stock A"] / stocks["Stock A"].shift(1) - 1
stocks["Stock B Rets"] = stocks["Stock B"] / stocks["Stock B"].shift(1) - 1
stocks = round(stocks, 2)
# Print numerical results with proper formatting
total_ret = (1 + stocks[["Stock A Rets", "Stock B Rets"]]).prod() - 1
means = stocks[["Stock A Rets", "Stock B Rets"]].mean()
volatility = stocks[["Stock A Rets", "Stock B Rets"]].std()
ann_volatility = volatility * np.sqrt(12)
print("\nTotal Returns:")
for col in total_ret.index:
print(f"{col}: {total_ret[col]:.2%}")
print("\nMean Returns:")
for col in means.index:
print(f"{col}: {means[col]:.2%}")
print("\nVolatility:")
for col in volatility.index:
print(f"{col}: {volatility[col]:.2%}")
print("\nAnnualized Volatility:")
for col in ann_volatility.index:
print(f"{col}: {ann_volatility[col]:.2%}")
# Define crisis events (optional, adjust dates as needed)
crisis = [
{"date": "2020-03", "name": "Initial Market Shock"},
{"date": "2020-07", "name": "Mid-Year Adjustment"}
]
# Prepare visualization data matching OutputDisplay.vue structure
plot_data = {
# X-Axis: convert DatetimeIndex to string (YYYY-MM)
"dates": stocks.index.strftime('%Y-%m').tolist(),
# "crisis": crisis,
# Stock Prices
"stockPrices": {
"series": {
"Stock A": stocks["Stock A"].tolist(),
"Stock B": stocks["Stock B"].tolist()
},
"type": "line", # Specify chart type
"yAxisName": "Price (USD)" # Y-axis label
},
# Stock Returns
"returns": {
"series": {
"Stock A Returns": (stocks["Stock A Rets"] * 100).fillna(0).tolist(),
"Stock B Returns": (stocks["Stock B Rets"] * 100).fillna(0).tolist()
},
"type": "bar", # Specify chart type as histogram
"yAxisName": "Returns (%)" # Y-axis label
}
}
# Print data for chart visualization
print("\n<ECHARTS_DATA>" + json.dumps(plot_data)) Evaluating Return on Risk
Return on Risk (ROR) measures the reward per unit of risk, calculated as:
$$\text{ROR} = \frac{R_{\text{total}}}{\sigma},$$

where $R_{\text{total}}$ is the total (compound) return and $\sigma$ is the volatility of the returns over the same period.
# This code segment computes the ROR for each stock, helping investors understand which stock offers better returns for the risk taken.
import numpy as np
import pandas as pd
# Example of generating stock prices with a DatetimeIndex
np.random.seed(51)
dates = pd.date_range(start='2020-01', periods=10, freq='M') # Monthly dates
stocks = pd.DataFrame({
"Stock A": np.random.normal(10, 1, size=10),
"Stock B": np.random.normal(10, 5, size=10)
}, index=dates)
stocks.index.name = "Months"
stocks = round(stocks, 2)
# Calculating returns
stocks["Stock A Rets"] = stocks["Stock A"] / stocks["Stock A"].shift(1) - 1
stocks["Stock B Rets"] = stocks["Stock B"] / stocks["Stock B"].shift(1) - 1
stocks = round(stocks, 2)
# Print numerical results with proper formatting
total_ret = (1 + stocks[["Stock A Rets", "Stock B Rets"]]).prod() - 1
means = stocks[["Stock A Rets", "Stock B Rets"]].mean()
volatility = stocks[["Stock A Rets", "Stock B Rets"]].std()
ann_volatility = volatility * np.sqrt(12)
# Calculating Return on Risk
ROR = total_ret / volatility
print("\nTotal Returns:")
for col in ROR.index:
print(f"{col}: {ROR[col]:.2%}")
# Conditional Comment Based on ROR
ror_a = ROR["Stock A Rets"]
ror_b = ROR["Stock B Rets"]
if ror_a > ror_b:
comment = (
"Higher ROR for Stock A indicates that we achieve a better return per unit of risk by investing in this stock.\n"
"In other words, it is more advantageous to invest in Stock A rather than Stock B, even though the total returns "
"from both stocks are similar."
)
elif ror_a < ror_b:
comment = (
"Higher ROR for Stock B indicates that we achieve a better return per unit of risk by investing in this stock.\n"
"In other words, it is more advantageous to invest in Stock B rather than Stock A, even though the total returns "
"from both stocks are similar."
)
else:
comment = (
"Both stocks have identical Return on Risk (ROR), indicating that they offer similar returns per unit of risk."
)
print("\n" + comment) Sharpe Ratio: Assessing Risk-Adjusted Returns
The Sharpe Ratio provides a more nuanced view of an investment's performance by considering the risk-free rate. This ratio adjusts the return on risk by accounting for the returns of a risk-free asset, like a US Treasury Bill. It's defined as the excess return per unit of risk:
$$\text{Sharpe Ratio} = \frac{R - R_f}{\sigma},$$

Here, $R$ is the return of the investment, $R_f$ is the risk-free rate, and $\sigma$ is the volatility of the investment's returns.
# This calculation demonstrates how the Sharpe Ratio can provide
# additional insights into the risk-adjusted performance of an investment.
import numpy as np
import pandas as pd
# Example of generating stock prices with a DatetimeIndex
np.random.seed(51)
dates = pd.date_range(start='2020-01', periods=10, freq='M') # Monthly dates
stocks = pd.DataFrame({
"Stock A": np.random.normal(10, 1, size=10),
"Stock B": np.random.normal(10, 5, size=10)
}, index=dates)
stocks.index.name = "Months"
stocks = round(stocks, 2)
# Calculating returns
stocks["Stock A Rets"] = stocks["Stock A"] / stocks["Stock A"].shift(1) - 1
stocks["Stock B Rets"] = stocks["Stock B"] / stocks["Stock B"].shift(1) - 1
stocks = round(stocks, 2)
# Print numerical results with proper formatting
total_ret = (1 + stocks[["Stock A Rets", "Stock B Rets"]]).prod() - 1
means = stocks[["Stock A Rets", "Stock B Rets"]].mean()
volatility = stocks[["Stock A Rets", "Stock B Rets"]].std()
ann_volatility = volatility * np.sqrt(12)
# Assuming a 3% risk-free rate
risk_free_rate = 0.03
excess_return = total_ret - risk_free_rate
sharpe_ratio = excess_return / volatility
print("\nSharpe Ratio:")
for col in sharpe_ratio.index:
print(f"{col}: {sharpe_ratio[col]:.2%}") Illustrating Financial Concepts Using a Real-World Dataset
To illustrate these concepts, we will use a Kaggle dataset on the performance of Small Cap and Large Cap US stocks, which includes monthly returns from July 1926 to December 2018.
Data Analysis
Dataset Source: The helper module PortfolioOptimizationKit.py is used to load the data from a CSV file. The dataset categorizes US stocks into Small Caps (bottom 10% by market capitalization) and Large Caps (top 10%).
Data Visualization: The code visualizes monthly returns for both categories. This step is crucial for a quick assessment of performance over time.
# This code segment analyzes the monthly returns of Small Cap and Large Cap stocks.
import pandas as pd
import json
import PortfolioOptimizationKit as pok
# ---------------------------------------------------------------------------------
# Load and format data files
# ---------------------------------------------------------------------------------
# Load the dataset without parsing dates initially
file_to_load = pok.path_to_data_folder() + "Portfolios_Formed_on_ME_monthly_EW.csv"
df = pd.read_csv(file_to_load, index_col=0, parse_dates=False, na_values=-99.99)
# Convert the index to string to facilitate datetime conversion
df.index = df.index.astype(str)
# Convert the string index to datetime using the format 'YYYYMM'
try:
df.index = pd.to_datetime(df.index, format='%Y%m')
print("\nConversion to DatetimeIndex successful.")
except ValueError as e:
print("\nDate conversion failed:", e)
print("The index remains as is.")
# Print the first few rows of the dataframe as plain text output
print("\n" + df.head().to_string())
# Focus on Small Cap and Large Cap stocks
small_large_caps = df[["Lo 10", "Hi 10"]] / 100 # Dividing by 100 to convert to actual returns
# Optional: Print descriptive statistics
print("\nDescriptive Statistics:")
print(small_large_caps.describe())
# Aggregate data annually to display only the year on the X-axis
small_large_caps_yearly = small_large_caps.resample('A').mean()
# Verify the aggregation
print("\nAnnual Average Returns:")
print(small_large_caps_yearly.head())
# Prepare visualization data matching OutputDisplay.vue structure
plot_data = {
# X-Axis: convert DatetimeIndex to string (YYYY)
"dates": small_large_caps_yearly.index.strftime('%Y').tolist(),
# Small Cap and Large Cap Returns
"smallLargeCaps": {
"series": {
"Small Cap (Lo 10)": small_large_caps_yearly["Lo 10"].tolist(),
"Large Cap (Hi 10)": small_large_caps_yearly["Hi 10"].tolist()
},
"type": "line", # Specify chart type
"yAxisName": "Annual Average Returns (%)" # Y-axis label
}
}
# Print data for chart visualization
print("\n<ECHARTS_DATA>" + json.dumps(plot_data)) Calculating Volatility
Monthly Volatility: Python's standard deviation function is used to calculate the monthly volatility, a measure of how much stock returns vary from their average value.
Annualizing Volatility: The monthly volatility is then scaled up to an annual figure. This conversion provides a broader view of the stocks' risk over a longer period.
# Calculating monthly volatility
monthly_volatility = small_large_caps.std()
# Annualizing the volatility
annualized_volatility = monthly_volatility * (12 ** 0.5)
print("\nMonthly Volatility:")
for col in monthly_volatility.index:
print(f"{col}: {monthly_volatility[col]:.2%}")
print("\nAnnualized Volatility:")
for col in annualized_volatility.index:
print(f"{col}: {annualized_volatility[col]:.2%}") Returns Analysis
Monthly & Annual Returns: Python code is used to calculate both monthly and annual returns. This step is key to understanding the long-term growth potential of the stocks.
# Number of months in the dataset
n_months = small_large_caps.shape[0]
# Total compound return
total_return = (1 + small_large_caps).prod() - 1
# Monthly and annualized returns
return_per_month = (1 + total_return) ** (1 / n_months) - 1
annualized_return = (1 + return_per_month) ** 12 - 1
print("\nReturn per Month:")
for col in return_per_month.index:
print(f"{col}: {return_per_month[col]:.2%}")
print("\nAnnualized Return:")
for col in annualized_return.index:
print(f"{col}: {annualized_return[col]:.2%}") Risk-Adjusted Returns: Return on Risk and Sharpe Ratio are computed. These metrics help us understand which stocks offer better returns for their level of risk.
# Assuming a risk-free rate
risk_free_rate = 0.03
# Return on Risk and Sharpe Ratio
return_on_risk = annualized_return / annualized_volatility
sharpe_ratio = (annualized_return - risk_free_rate) / annualized_volatility
print("\nReturn on Risk:")
for col in return_on_risk.index:
print(f"{col}: {return_on_risk[col]:.2%}")
print("\nSharpe Ratio:")
for col in sharpe_ratio.index:
print(f"{col}: {sharpe_ratio[col]:.2%}") Drawdown
Drawdown is a crucial metric in portfolio management, indicating the maximum loss from a peak to a trough of a portfolio, before a new peak is achieved. It's a measure of the most significant drop in asset value and is often used to assess the risk of a particular investment or portfolio.
To calculate the drawdown, these steps are needed:
Calculate the Wealth Index: This represents the value of a portfolio as it evolves over time, taking into account the compounding of returns.
Determine the Previous Peaks: Identify the highest value of the portfolio before each time point.
Calculate the Drawdown: It is the difference between the current wealth index and the previous peak, represented as a percentage of the previous peak.
The formula can be expressed as:

$$D_t = \frac{W_t - P_t}{P_t},$$

where $W_t$ is the wealth index at time $t$ and $P_t = \max_{s \le t} W_s$ is the previous peak of the wealth index up to time $t$.
By extension, the Maximum Drawdown quantifies the most substantial loss experienced from a peak to a trough of a portfolio before the emergence of a new peak, making it a key measure of the most severe decline in value. The Maximum Drawdown formula is:

$$\text{MDD} = \min_t D_t,$$

where $D_t$ is the drawdown at time $t$; since drawdowns are zero or negative, the minimum corresponds to the largest peak-to-trough loss.
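The three steps can be sketched in a few lines, using a short, hypothetical return series rather than the real dataset:

import pandas as pd

# Hypothetical monthly returns, purely illustrative
rets = pd.Series([0.05, -0.10, 0.03, -0.02, 0.07, -0.15, 0.04])

wealth_index = 100 * (1 + rets).cumprod()    # 1) wealth index from an initial 100
previous_peaks = wealth_index.cummax()       # 2) running previous peaks
drawdown = (wealth_index - previous_peaks) / previous_peaks   # 3) drawdown as a fraction of the peak

print(drawdown)
print(f"Maximum drawdown: {drawdown.min():.2%}")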
Practical example
To showcase drawdown, we will use the same Small Cap and Large Cap US stocks dataset as before.
import pandas as pd
import numpy as np
import json
import sys
import os
# Ensure "/assets" is in the Python path so we can import your module & CSV files
sys.path.append('/assets')
try:
# ------------------------------------------------------------------------------
# 1) Import your portfolio optimization module
# ------------------------------------------------------------------------------
import PortfolioOptimizationKit as pok
# ------------------------------------------------------------------------------
# 2) Load Fama-French Monthly returns via pok.get_ffme_returns()
# ------------------------------------------------------------------------------
rets = pok.get_ffme_returns()
# Ensure columns are consistent
rets.columns = ["Small Caps", "Large Caps"]
# ------------------------------------------------------------------------------
# 3) Calculate key indices: wealth, peaks, drawdown, difference from peaks
# ------------------------------------------------------------------------------
wealth_index = 100 * (1 + rets).cumprod()
previous_peaks = wealth_index.cummax()
drawdown = (wealth_index - previous_peaks) / previous_peaks # ratio
diff_from_peaks = wealth_index - previous_peaks
# ------------------------------------------------------------------------------
    # 4) Define crisis events in an array (adjusted to match the data's date range)
# ------------------------------------------------------------------------------
crisis = [
{"date": "1929", "name": "Great Depression"},
{"date": "1990", "name": "Dot-Com Bubble Burst"},
{"date": "2005", "name": "Lehman Brothers Crisis"}
]
# ------------------------------------------------------------------------------
    # 5) Build a dictionary matching the chart keys in OutputDisplay.vue
# ------------------------------------------------------------------------------
plot_data = {
# X-Axis: convert DatetimeIndex to string (YYYY)
"dates": rets.index.strftime('%Y').tolist(),
"crisis": crisis,
# Small Caps Wealth with peaks
"smallCapsWealth": {
"series": {
"Wealth": wealth_index["Small Caps"].tolist(),
"Peaks": previous_peaks["Small Caps"].tolist()
},
"type": "line",
"yAxisName": "Wealth Index",
"markLine": {
"data": [
{"xAxis": event["date"], "name": event["name"], "lineStyle": {"color": "#BC1142", "type": "dashed"}}
for event in crisis
],
"label": {
"formatter": "{b}",
"position": "insideEndTop",
"color": "#172E5C",
"rotate": 90
}
}
},
# Large Caps Wealth with peaks
"largeCapsWealth": {
"series": {
"Wealth": wealth_index["Large Caps"].tolist(),
"Peaks": previous_peaks["Large Caps"].tolist()
},
"type": "line",
"yAxisName": "Wealth Index",
"markLine": {
"data": [
{"xAxis": event["date"], "name": event["name"], "lineStyle": {"color": "#BC1142", "type": "dashed"}}
for event in crisis
],
"label": {
"formatter": "{b}",
"position": "insideEndTop",
"color": "#172E5C",
"rotate": 90
}
}
},
# Difference from peaks (nominal) - No markLine
"diffFromPeak": {
"series": {
"Small Caps": diff_from_peaks["Small Caps"].tolist(),
"Large Caps": diff_from_peaks["Large Caps"].tolist()
},
"type": "line",
"yAxisName": "Difference from Peaks"
# Removed markLine to exclude crisis annotations
},
# Drawdown in %
"drawdown": {
"series": {
"Small Caps": (drawdown["Small Caps"] * 100).tolist(),
"Large Caps": (drawdown["Large Caps"] * 100).tolist()
},
"type": "line",
"yAxisName": "Drawdown (%)"
# Removed markLine to exclude crisis annotations
}
}
# ------------------------------------------------------------------------------
    # 6) Print the final <ECHARTS_DATA> block so OutputDisplay.vue can parse it
# ------------------------------------------------------------------------------
print("\n<ECHARTS_DATA>" + json.dumps(plot_data))
except Exception as e:
print(f"Error: {str(e)}")
print(f"Current working directory: {os.getcwd()}")
print(f"Python path: {sys.path}") Insights from Historical Crises
# Calculate max drawdown and corresponding dates for 1929 Crisis (1929-1933)
min_drawdown_1929 = drawdown.loc["1929":"1933"].min().round(2) * 100
date_min_drawdown_1929 = drawdown.loc["1929":"1933"].idxmin()
# Calculate max drawdown and corresponding dates for Dot Com Crisis (1990-2005)
min_drawdown_dotcom = drawdown.loc["1990":"2005"].min().round(2) * 100
date_min_drawdown_dotcom = drawdown.loc["1990":"2005"].idxmin()
# Calculate max drawdown and corresponding dates for Lehman Brothers Crisis (2005 Onwards)
min_drawdown_lehman = drawdown.loc["2005":].min().round(2) * 100
date_min_drawdown_lehman = drawdown.loc["2005":].idxmin()
# Define the headers for the tables
headers = ["Crisis", "Category", "Max Drawdown (%)", "Date of Max Drawdown"]
# Define the data for 1929 Crisis
crisis_1929 = "1929 Crisis"
data_1929 = [
[crisis_1929, "Small Caps", f"{min_drawdown_1929['Small Caps']}%", date_min_drawdown_1929['Small Caps'].strftime('%Y-%m-%d')],
[crisis_1929, "Large Caps", f"{min_drawdown_1929['Large Caps']}%", date_min_drawdown_1929['Large Caps'].strftime('%Y-%m-%d')]
]
# Define the data for Dot Com Crisis
crisis_dotcom = "Dot Com Crisis (1990-2005)"
data_dotcom = [
[crisis_dotcom, "Small Caps", f"{min_drawdown_dotcom['Small Caps']}%", date_min_drawdown_dotcom['Small Caps'].strftime('%Y-%m-%d')],
[crisis_dotcom, "Large Caps", f"{min_drawdown_dotcom['Large Caps']}%", date_min_drawdown_dotcom['Large Caps'].strftime('%Y-%m-%d')]
]
# Define the data for Lehman Brothers Crisis
crisis_lehman = "Lehman Brothers Crisis (2005 Onwards)"
data_lehman = [
[crisis_lehman, "Small Caps", f"{min_drawdown_lehman['Small Caps']}%", date_min_drawdown_lehman['Small Caps'].strftime('%Y-%m-%d')],
[crisis_lehman, "Large Caps", f"{min_drawdown_lehman['Large Caps']}%", date_min_drawdown_lehman['Large Caps'].strftime('%Y-%m-%d')]
]
def print_table(headers, data):
"""
Generates and prints a formatted table given headers and data.
Parameters:
- headers: List of column headers.
- data: List of rows, where each row is a list of cell values.
"""
# Calculate the maximum width for each column
col_widths = [len(header) for header in headers]
for row in data:
for i, cell in enumerate(row):
cell_length = len(str(cell))
if cell_length > col_widths[i]:
col_widths[i] = cell_length
# Create format string for each row
row_format = "| " + " | ".join(f"{{:<{w}}}" for w in col_widths) + " |"
# Print header row
print(row_format.format(*headers))
# Print separator row
separator = "|" + "|".join(["-" * (w + 2) for w in col_widths]) + "|"
print(separator)
# Print data rows
for row in data:
print(row_format.format(*row))
# Print ending separator
print(separator)
# Generate and print the 1929 Crisis table
print_table(headers, data_1929)
print() # Add a blank line for separation
# Generate and print the Dot Com Crisis table
print_table(headers, data_dotcom)
print() # Add a blank line for separation
# Generate and print the Lehman Brothers Crisis table
print_table(headers, data_lehman)
1929 Crisis:
The Great Depression profoundly affected both Small Caps and Large Caps, leading to severe wealth erosion. This case study serves as a reminder of the potential extreme risks in the stock market.
- Interpretation:
  - Small Caps: Small Caps suffered a severe maximum drawdown (see the table above), meaning that at the worst point of the 1929 crisis their value had fallen to only a small fraction of the previous peak before recovering.
  - Large Caps: Similarly, Large Caps had a maximum drawdown of -84.0%, indicating that Large Caps assets fell to 16% of their peak value at the lowest point of the crisis.
  - Both Small Caps and Large Caps experienced their maximum drawdown in May 1932.
Dot Com Crisis (1990-2005):
Characterized by the bursting of the dot-com bubble, this period saw significant losses, particularly in technology-heavy Large Caps, and it reflects the volatility and potential downside of tech-driven market booms.
- Interpretation:
  - Small Caps: Reached their maximum drawdown for this window (see the table above) in December 1990, falling to a fraction of their peak value before recovering.
  - Large Caps: Reached their maximum drawdown in September 2002, at the depth of the dot-com bust, dropping to a fraction of their previous peak.
Lehman Brothers Crisis (2005 Onwards):
Triggered by the collapse of Lehman Brothers, this crisis led to a global financial meltdown and highlights the interconnectedness of modern financial markets and the speed with which shocks can propagate.
- Interpretation:
  - Small Caps: Suffered a deep maximum drawdown (see the table above), meaning their value fell to a fraction of the previous peak before recovering.
  - Large Caps: Experienced a comparable maximum drawdown, falling to a fraction of their peak value at the worst point of the crisis.
  - Both Small Caps and Large Caps experienced their maximum drawdown in February 2009.
Gaussian Density & Distribution
A Gaussian random variable $X \sim \mathcal{N}(\mu, \sigma^2)$ is characterized by:
- A density function:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
- And a cumulative distribution function (CDF):
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
Here, $\mu$ is the mean and $\sigma$ is the standard deviation.
When $\mu = 0$ and $\sigma = 1$, $X$ is a standard normal random variable, and its CDF is conventionally denoted by $\Phi$.
Symmetry Property
For a standard normal random variable $X \sim \mathcal{N}(0, 1)$, the CDF satisfies
$$\Phi(-x) = 1 - \Phi(x).$$
This property reflects the symmetry of the standard normal distribution.
Distribution of Negative Random Variable
If $X \sim \mathcal{N}(0, 1)$, the CDF of $-X$ can be written as
$$F_{-X}(x) = P(-X \le x) = P(X \ge -x) = 1 - \Phi(-x) = \Phi(x),$$
confirming that the cumulative distribution function of $-X$ coincides with that of $X$; in other words, $-X$ is also a standard normal random variable.
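These identities can be checked numerically with scipy (a quick sketch, with an arbitrary test point):

from scipy.stats import norm

x = 1.25                      # arbitrary test point
print(norm.cdf(-x))           # Phi(-x)
print(1 - norm.cdf(x))        # 1 - Phi(x): matches Phi(-x) by symmetry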
Quantiles
Imagine a random variable $X$ with cumulative distribution function $F$. For a level $\alpha \in (0, 1)$, the $\alpha$-quantile $q_\alpha$ is the value such that $P(X \le q_\alpha) = F(q_\alpha) = \alpha$.
- Quantiles in a Standard Normal Distribution:
For a standard normal distribution, where $X \sim \mathcal{N}(0, 1)$, the $\alpha$-quantile is usually written $\phi_\alpha = \Phi^{-1}(\alpha)$, so that $\Phi(\phi_\alpha) = \alpha$.
This leads to a useful identity:
$$\phi_{1-\alpha} = -\phi_\alpha.$$
In essence, this identity illustrates the symmetry of the standard normal distribution.
- Symmetry in Probability:
In a standard normal distribution, the probability of the variable lying within the symmetric bounds $[-\phi_\alpha, \phi_\alpha]$ (for $\alpha > 0.5$) is given by:
$$P(-\phi_\alpha \le X \le \phi_\alpha) = \Phi(\phi_\alpha) - \Phi(-\phi_\alpha) = \alpha - (1 - \alpha) = 2\alpha - 1.$$
This formula is useful for understanding the distribution of values around the mean in a normal distribution.
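Both identities can be verified numerically with scipy (a small sketch at the 0.95 level):

from scipy.stats import norm

alpha = 0.95
phi_alpha = norm.ppf(alpha)

print(norm.ppf(1 - alpha), -phi_alpha)              # phi_{1-alpha} equals -phi_alpha
print(norm.cdf(phi_alpha) - norm.cdf(-phi_alpha))   # 2*alpha - 1 = 0.90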
Practical Application: Finding a Specific Quantile
For example, to determine the 0.9-quantile of a standard normal distribution (i.e., the value below which 90% of the probability mass lies), we look for the value $\phi_{0.9}$ such that $\Phi(\phi_{0.9}) = 0.9$.
Using the norm.ppf() function from scipy.stats, which provides quantiles for the Gaussian distribution, we can calculate this value directly.
import scipy.stats
# Calculating the 0.9-quantile of a standard normal distribution
phi_0_9 = scipy.stats.norm.ppf(0.9, loc=0, scale=1)
# Double-checking by calculating the probability up to the quantile
probability_check = scipy.stats.norm.cdf(phi_0_9, loc=0, scale=1)
# Preparing data for tabulation
data = [
["0.9-quantile (phi_0.9)", "{:.4f}".format(phi_0_9)],
["Probability check for phi_0.9", "{:.4f}".format(probability_check)],
]
# Creating a table with headers
headers = ["Description", "Value"]
def print_table(headers, data):
"""
Generates and prints a formatted table given headers and data.
Parameters:
- headers: List of column headers.
- data: List of rows, where each row is a list of cell values.
"""
# Calculate the maximum width for each column
col_widths = [len(header) for header in headers]
for row in data:
for i, cell in enumerate(row):
cell_length = len(str(cell))
if cell_length > col_widths[i]:
col_widths[i] = cell_length
# Create format string for each row
row_format = "| " + " | ".join(f"{{:<{w}}}" for w in col_widths) + " |"
# Print header row
print(row_format.format(*headers))
# Print separator row with '=' for headers
separator = "|" + "|".join(["=" * (w + 2) for w in col_widths]) + "|"
print(separator)
# Print data rows
for row in data:
print(row_format.format(*row))
# Print ending separator row with '-'
separator_dash = "|" + "|".join(["-" * (w + 2) for w in col_widths]) + "|"
print(separator_dash)
# Generate and print the table
print_table(headers, data)
- Interpretation:
- 0.9-Quantile of a Standard Normal Distribution: "The 0.9-quantile (phi_0.9) of a standard normal distribution: 1.2816"
- Probability Check for the 0.9-Quantile: "Probability check for phi_0.9: 0.9000"
These results confirm that the value of approximately 1.2816 is indeed the 0.9-quantile of a standard normal distribution. Additionally, the probability check verifies that the cumulative distribution function (CDF) up to this quantile is 0.9, as expected.
Exploring Skewness and Kurtosis in Financial Data
Skewness: Assessing Asymmetry in Distributions
Skewness quantifies how asymmetrical a distribution is regarding its mean. It can be:
Positive: Indicating a tail on the right side.
Negative: Showing a tail on the left side.
Formally, skewness is defined using the third centered moment:
$$S(R) = \frac{E\!\left[(R - E[R])^3\right]}{\sigma^3},$$

where $E[R]$ is the mean return and $\sigma$ is the standard deviation of the returns.
Kurtosis: Analyzing Tails and Outliers
Kurtosis measures the "tailedness" or concentration of outliers in a distribution. It's calculated as:
$$K(R) = \frac{E\!\left[(R - E[R])^4\right]}{\sigma^4},$$

where $E[R]$ and $\sigma$ are again the mean and standard deviation of the returns.
A normal distribution has a kurtosis of 3. Thus, "excess kurtosis" is often computed as $K(R) - 3$, so that a normal distribution has excess kurtosis equal to zero.
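A minimal sketch of these two formulas, compared against scipy's built-in estimators (using the population moments, which match the definitions above, and a synthetic return sample):

import numpy as np
import scipy.stats

r = np.random.normal(0, 0.05, size=10_000)   # illustrative return sample

demeaned = r - r.mean()
sigma = r.std(ddof=0)                        # population standard deviation
skewness = (demeaned**3).mean() / sigma**3
kurtosis = (demeaned**4).mean() / sigma**4

print(f"Skewness: {skewness:.3f}  (scipy: {scipy.stats.skew(r):.3f})")
print(f"Kurtosis: {kurtosis:.3f}  (scipy: {scipy.stats.kurtosis(r, fisher=False):.3f})")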
Practical Analysis
Here's a practical demonstration using Python to understand skewness and kurtosis in financial datasets:
import pandas as pd
import numpy as np
import PortfolioOptimizationKit as pok
import json
# Generate normally distributed data
A = pd.DataFrame({"A": np.random.normal(0, 2, size=800)})
# Get market returns
B = pok.get_ffme_returns()
B = B["Hi 10"]
# Calculate histograms
def calculate_histogram(data_series, bins=60):
counts, bin_edges = np.histogram(data_series, bins=bins, density=True)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
return counts.tolist(), bin_centers.tolist()
# Calculate histogram data
normal_counts, normal_bins = calculate_histogram(A["A"])
market_counts, market_bins = calculate_histogram(B)
# Print statistics, including skewness and kurtosis
# (pandas .kurtosis() returns excess kurtosis, hence the +3 to report plain kurtosis)
print(f"Normal Distribution - Mean: {A['A'].mean():.3f}, Std: {A['A'].std():.3f}, "
      f"Skewness: {A['A'].skew():.3f}, Kurtosis: {A['A'].kurtosis() + 3:.3f}")
print(f"Market Returns - Mean: {B.mean():.3f}, Std: {B.std():.3f}, "
      f"Skewness: {B.skew():.3f}, Kurtosis: {B.kurtosis() + 3:.3f}")
# Prepare data in format matching OutputDisplay.vue expectations
plot_data = {
"normalDist": {
"type": "bar",
"yAxisName": "Density",
"series": {
"Density": normal_counts
},
"xAxis": normal_bins
},
"marketDist": {
"type": "bar",
"yAxisName": "Density",
"series": {
"Density": market_counts
},
"xAxis": market_bins
}
}
# Print chart data
print("\n<ECHARTS_DATA>" + json.dumps(plot_data)) In this analysis, the normal distribution is expected to show skewness near zero and kurtosis close to three. In contrast, the market returns, deviating from normality, may exhibit different skewness and higher kurtosis values.
Advanced Analysis Incorporating Hedge Fund Indices
We extend our analysis of skewness and kurtosis to a dataset involving hedge fund indices. This dataset provides a different perspective, often diverging from the standard characteristics of normal distributions.
import pandas as pd
import PortfolioOptimizationKit as pok
import scipy.stats
# Load the hedge fund index data
hfi = pok.get_hfi_returns()
print(hfi.head(3))
# Initialize a DataFrame to store skewness and kurtosis values
hfi_skew_kurt = pd.DataFrame(columns=["Skewness", "Kurtosis"])
# Calculate skewness and kurtosis for each column in the hedge fund index data
hfi_skew_kurt["Skewness"] = hfi.aggregate(pok.skewness)
hfi_skew_kurt["Kurtosis"] = hfi.aggregate(pok.kurtosis)
# Display the calculated skewness and kurtosis
print(hfi_skew_kurt)
When trying to identify Gaussian distributions, CTA Global shows skewness near zero and kurtosis close to three, indicating a possibly normal distribution.
- Using Jarque-Bera Test for Normality
The Jarque-Bera test, a statistical test for normality, helps confirm if an index follows a Gaussian distribution.
# Jarque-Bera test on CTA Global
jb_result_cta_global = scipy.stats.jarque_bera(hfi["CTA Global"])
print("CTA Global:", jb_result_cta_global)
# Check normality using the custom is_normal function from the pok toolkit
is_normal_cta_global = pok.is_normal(hfi["CTA Global"])
print("Is CTA Global Normal?", is_normal_cta_global)
# Jarque-Bera test on Convertible Arbitrage
jb_result_conv_arb = scipy.stats.jarque_bera(hfi["Convertible Arbitrage"])
print("Convertible Arbitrage:", jb_result_conv_arb)
# Check normality for Convertible Arbitrage
is_normal_conv_arb = pok.is_normal(hfi["Convertible Arbitrage"])
print("Is Convertible Arbitrage Normal?", is_normal_conv_arb) - Normality Across Indices
Finally, we examine the normality of all indices in the hedge fund dataset.
# Aggregate normality test across all indices
normality_test_results = hfi.aggregate(pok.is_normal)
print(normality_test_results)
This comprehensive analysis reveals that only the CTA Global index passes the normality test, suggesting it's the most normally distributed among the hedge fund indices.
Understanding Downside Risk Measures
Semivolatility: Focusing on Negative Fluctuations
Semivolatility, distinct from total volatility, zeroes in on the negative side of asset return fluctuations. In investment, the concern often isn't how much returns deviate when they're positive but rather how volatile they are when they're negative.
Semivolatility addresses this by measuring the standard deviation of only the subset of returns that are negative or below the mean. This measure is crucial for investors who prioritize safeguarding against losses over pursuing high returns. Mathematically, it's denoted as:
$$\sigma_{\text{semi}} = \operatorname{std}\left(\{R_t : R_t < 0\}\right),$$

where only the returns $R_t$ that are negative (or, more generally, below the chosen benchmark) enter the standard deviation.
NOTE
This measure can be adapted to include returns below any chosen benchmark, such as the overall mean return.
Calculating Semivolatility in Python: to compute semivolatility, filter the returns to include only the negative (or below-mean) values and then apply the standard deviation formula to that subset. A manual sketch follows, with the toolkit's helper shown right after it.
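A minimal sketch of that filtering step, assuming hfi already holds the hedge fund index returns loaded earlier:

# Manual semivolatility: standard deviation of the negative returns only, per index
semi_vol = hfi[hfi < 0].std(ddof=0)   # population std is one common convention here
print(semi_vol)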
print(pok.semivolatility(hfi))
Value at Risk (VaR): Gauging Maximum Expected Loss
Value at Risk, or VaR, quantifies the maximum anticipated loss over a defined period under normal market conditions. The confidence level for this measure usually lies between 0 and 1 and is often expressed as a percentage.
Conceptualizing VaR with a 99% Confidence Level
For instance, let's consider a VaR with a 99% confidence level (i.e., $\alpha = 0.99$): there is a 99% probability that the loss over the chosen period will not exceed the VaR, or equivalently only a 1% chance of a worse outcome.
Given a small set of monthly returns, say ten observations, the task of determining the 90% monthly VaR implies two steps:
Exclude the 10% Worst Returns: remove the lowest 10% of returns, which, in a dataset of ten, equates to discarding the single worst return.
Identify the Next Worst Return: after excluding the worst observation, the next worst return is the 90% VaR.
It's important to note that even though the VaR is found as a negative return, it is conventionally quoted as a positive number representing the size of the loss.
From a mathematical standpoint, given a confidence level $\alpha$, the VaR is the value satisfying
$$P\left(R \le -\text{VaR}_\alpha\right) = 1 - \alpha,$$
essentially making it the $(1-\alpha)$-quantile of the return distribution with the sign changed:
$$\text{VaR}_\alpha = -q_{1-\alpha},$$
indicating a $(1-\alpha)$ probability of losing more than $\text{VaR}_\alpha$ over the period.
In the illustrated example, a 90% monthly VaR is obtained from the 10% quantile of the monthly returns, which implies a 10% chance of losing more than that amount in any given month.
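The two-step procedure can be sketched in code with a hypothetical set of ten monthly returns (the values below are made up purely for illustration; np.percentile interpolates between observations, matching the approach used later in this guide):

import numpy as np

# Hypothetical monthly returns, purely illustrative
returns = np.array([0.04, -0.02, 0.01, -0.07, 0.03, -0.01, 0.02, -0.04, 0.05, 0.00])

alpha = 0.90
var_90 = -np.percentile(returns, (1 - alpha) * 100)   # sign flipped: VaR reported as a positive loss
print(f"90% monthly VaR: {var_90:.2%}")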
Exploring Conditional Value at Risk (CVaR)
Conditional Value at Risk (CVaR) is a measure used to assess the risk of investments. It predicts the average loss that could occur in the worst-case scenarios beyond the threshold set by Value at Risk (VaR). While VaR tells you how bad a loss might be on a very bad day, CVaR provides insight into the average losses expected if things go even worse than the VaR estimate. Essentially, CVaR calculates the mean of the tail end of the loss distribution, representing the expected loss in the worst $(1-\alpha)$ fraction of cases.
The formula is given by:

$$\text{CVaR}_\alpha = -E\left[R \mid R \le -\text{VaR}_\alpha\right],$$

where the expectation is taken over the returns that are worse than the VaR threshold, and the sign change again expresses the result as a positive loss.
Illustrative Example
Consider again a sample of ten monthly returns; the objective is to determine the 80% monthly CVaR, which implies three steps:
Exclude the Bottom 20% of Returns: identify the worst 20% of returns. For ten returns, this means the two worst observations.
Identify the VaR: the next worst return after removing the bottom 20% is, with the sign flipped, the 80% VaR.
Calculate the Average Beyond VaR: the CVaR is the average of the returns worse than the VaR threshold (i.e., the returns excluded in the first step), again reported with the sign changed.
This process illustrates how CVaR provides a more detailed risk assessment, taking into account the severity of losses beyond the VaR threshold.
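Continuing the hypothetical example above, the three steps translate into a few lines of code (again with made-up numbers):

import numpy as np

# Same hypothetical returns as before
returns = np.array([0.04, -0.02, 0.01, -0.07, 0.03, -0.01, 0.02, -0.04, 0.05, 0.00])

alpha = 0.80
var_80 = -np.percentile(returns, (1 - alpha) * 100)   # 80% VaR as a positive loss
tail = returns[returns < -var_80]                     # returns worse than the VaR threshold
cvar_80 = -tail.mean()                                # average tail loss, sign flipped
print(f"80% monthly VaR:  {var_80:.2%}")
print(f"80% monthly CVaR: {cvar_80:.2%}")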
Estimating VaR and CVaR: A Comparative Overview
Several methods are available to estimate VaR (Value at Risk) and CVaR (Conditional Value at Risk), each with its own approach and implications.
Historical Approach (Non-Parametric)
The historical method is a straightforward, non-parametric approach to estimating VaR. It directly applies the concept of VaR as the $(1-\alpha)$-quantile of the historically observed return distribution (with the sign changed), without assuming any particular model for the returns.
import pandas as pd
import numpy as np
import PortfolioOptimizationKit as pok
import json
try:
# Retrieve returns for the CTA Global index
hfi = pok.get_hfi_returns()
cta_returns = hfi["CTA Global"]
# Calculate histogram data
counts, bin_edges = np.histogram(cta_returns, bins=60, density=True)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
# Print basic statistics
print("CTA Global Returns Statistics:")
print(f"Mean: {cta_returns.mean():.3f}")
print(f"Std Dev: {cta_returns.std():.3f}")
print(f"Skewness: {cta_returns.skew():.3f}")
print(f"Kurtosis: {cta_returns.kurtosis():.3f}")
# Prepare data for visualization
plot_data = {
"ctaDistribution": {
"type": "bar",
"yAxisName": "Density",
"series": {
"Density": counts.tolist()
},
"xAxis": bin_centers.tolist()
}
}
# Print chart data
print("\n<ECHARTS_DATA>" + json.dumps(plot_data))
except Exception as e:
print(f"Error: {str(e)}") Suppose you're interested in calculating the
alpha = np.array([0.90, 0.95, 0.99])
level = 1 - alpha
# The percentile method requires the level in the range 0 to 100
VaRs = -np.percentile(hfi["CTA Global"], level*100)
print("90% VaR: {:.2f}%".format(VaRs[0] * 100))
print("95% VaR: {:.2f}%".format(VaRs[1] * 100))
print("99% VaR: {:.2f}%".format(VaRs[2] * 100)) This implies there's a
However, it's crucial to note that this method's accuracy depends on the timescale of the returns used. A VaR calculated with monthly returns might differ significantly from one calculated using weekly data, highlighting the sensitivity of the historical method to the chosen timescale.
Parametric Approach (Gaussian)
In the Gaussian or parametric approach, returns are assumed to be normally distributed, a presumption which might not always hold true in real-world scenarios. If the returns follow $R \sim \mathcal{N}(\mu, \sigma^2)$, the VaR can be written in closed form.
To determine the specific threshold $-\text{VaR}_\alpha$ such that $P(R \le -\text{VaR}_\alpha) = 1 - \alpha$, we standardize the return.
This leads to a sequence of equalities:
$$1 - \alpha = P\!\left(R \le -\text{VaR}_\alpha\right) = P\!\left(\frac{R - \mu}{\sigma} \le \frac{-\text{VaR}_\alpha - \mu}{\sigma}\right) = \Phi\!\left(\frac{-\text{VaR}_\alpha - \mu}{\sigma}\right),$$
which, when solved, gives us the value $\frac{-\text{VaR}_\alpha - \mu}{\sigma} = \phi_{1-\alpha} = \Phi^{-1}(1-\alpha)$.
Thus, we establish the formula for the Gaussian VaR:
$$\text{VaR}_\alpha = -\left(\mu + \phi_{1-\alpha}\,\sigma\right),$$
where the quantile $\phi_{1-\alpha}$ can be obtained with the norm.ppf function from scipy.stats.
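Before calling the toolkit helper below, the formula can be applied directly with scipy (a sketch, assuming hfi still holds the hedge fund returns loaded earlier):

from scipy.stats import norm

alpha = 0.95
z = norm.ppf(1 - alpha)                              # phi_{1-alpha}, negative for alpha > 0.5
gaussian_var = -(hfi.mean() + z * hfi.std(ddof=0))   # VaR_alpha = -(mu + phi_{1-alpha} * sigma), per index
print(gaussian_var)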
# Compute the 95% monthly Gaussian VaR of the hedge fund indices
alpha = 0.95
print(pok.var_gaussian(hfi, level=1-alpha))
Cornish-Fisher Modification (Semi-Parametric)
This approach modifies the Gaussian method using the Cornish-Fisher expansion, which adjusts the Gaussian quantiles to account for skewness and kurtosis of the return distribution. This makes it a better fit for non-Gaussian distributions:
$$\tilde{\phi}_{1-\alpha} = \phi_{1-\alpha} + \frac{\phi_{1-\alpha}^2 - 1}{6}S + \frac{\phi_{1-\alpha}^3 - 3\phi_{1-\alpha}}{24}(K - 3) - \frac{2\phi_{1-\alpha}^3 - 5\phi_{1-\alpha}}{36}S^2,$$

where $\phi_{1-\alpha}$ is the Gaussian quantile, $S$ is the skewness, and $K$ is the kurtosis of the return distribution.
Thus, using this approach, the Value at Risk at confidence level $\alpha$ becomes
$$\text{VaR}_\alpha = -\left(\mu + \tilde{\phi}_{1-\alpha}\,\sigma\right).$$
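A sketch of the adjusted quantile, again assuming hfi is the hedge fund return DataFrame and using scipy for skewness and kurtosis:

from scipy.stats import norm, skew, kurtosis

alpha = 0.95
z = norm.ppf(1 - alpha)
s = hfi.apply(skew)                                       # skewness per index
k = hfi.apply(lambda col: kurtosis(col, fisher=False))    # kurtosis (not excess) per index

z_cf = (z
        + (z**2 - 1) * s / 6
        + (z**3 - 3*z) * (k - 3) / 24
        - (2*z**3 - 5*z) * s**2 / 36)
cornish_fisher_var = -(hfi.mean() + z_cf * hfi.std(ddof=0))
print(cornish_fisher_var)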
# Compute the 95% monthly Gaussian VaR of the hedge fund indices using the Cornish-Fisher method
print(pok.var_gaussian(hfi, cf=True))
Comparison of VaR Methods
Comparing the VaR computed via different methods provides insights into the sensitivity and suitability of each approach under various market conditions:
import pandas as pd
import numpy as np
import PortfolioOptimizationKit as pok
import json
try:
# Get HFI data
hfi = pok.get_hfi_returns()
# Calculate different VaR measures
comparevars = pd.concat([
pok.var_historic(hfi),
pok.var_gaussian(hfi),
pok.var_gaussian(hfi, cf=True),
pok.cvar_historic(hfi)
], axis=1)
# Name the columns
comparevars.columns = ["Historical", "Gaussian", "Cornish-Fisher", "Conditional VaR"]
# Convert to percentage
comparevars = comparevars * 100
# Print summary statistics
print("VaR Comparison Statistics (%):")
print(comparevars.round(2))
# Prepare data for visualization
plot_data = {
"varComparison": {
"type": "bar",
"title": "Comparison of 95% monthly VaRs for Hedge Fund indices",
"yAxisName": "Value at Risk (%)",
"xAxisName": "Strategy",
"series": {
var_type: comparevars[var_type].tolist()
for var_type in comparevars.columns
},
"xAxis": {
"type": "category",
"data": list(comparevars.index),
"axisLabel": {
"rotate": 45, # Rotate labels for better readability
"interval": 0 # Show all labels
}
}
}
}
# Print chart data
print("\n<ECHARTS_DATA>" + json.dumps(plot_data))
except Exception as e:
print(f"Error: {str(e)}") This visualization generally shows that Conditional VaR tends to estimate higher risk levels compared to the other methods, especially in tail events, while the historical method often presents the lowest VaR estimates. Each method has its place, depending on the risk profile, investment horizon, and market conditions an investor is dealing with.
