
Understanding Returns and Assessing Risks with Value at Risk

This guide simplifies the understanding of investment returns and risk management. It starts with how to calculate and interpret percentage and compound returns, then explains monthly and annualized returns. The focus then shifts to understanding and measuring volatility and risk, including practical Python examples. It covers how to assess investment risks and returns using the Sharpe Ratio and real-world data, and introduces concepts like skewness, kurtosis, Value at Risk (VaR), and Conditional Value at Risk (CVaR). These tools help predict potential investment losses and provide a solid foundation for anyone looking to understand the essentials of financial risk management.

Analyzing Returns

Percentage Returns Explained

Percentage return measures the financial gain or loss between two time points, $t$ and $t+1$. It's calculated as:

$$P_{t+1} = P_t + R_{t,t+1}\,P_t = P_t\,(1 + R_{t,t+1}), \qquad R_{t,t+1} := \frac{P_{t+1} - P_t}{P_t} = \frac{P_{t+1}}{P_t} - 1,$$

where $R_{t,t+1}$ is the return. For example, if a stock price rises from $100 to $104, the return is $R_{t,t+1} = 104/100 - 1 = 0.04 = 4\%$.
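
A quick numerical check of this formula (the prices are just the example values above):

python
# Percentage return between two prices (values from the example above)
p_t, p_t1 = 100, 104
r = p_t1 / p_t - 1
print(f"Return: {r:.2%}")  # 4.00%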

Understanding Compound Returns

Compound Returns over Multiple Periods: The total return over several periods is not merely the sum of individual returns. Consider two consecutive time periods, with prices $P_0$, $P_1$, and $P_2$:

$$P_1 = P_0 + R_{0,1}\,P_0 \quad\text{and}\quad P_2 = P_1 + R_{1,2}\,P_1.$$

So, the total return over two periods,

$$R_{0,2} = \frac{P_2}{P_0} - 1 = 1 + R_{0,1} + R_{1,2} + R_{1,2} R_{0,1} - 1 = (1 + R_{0,1})(1 + R_{1,2}) - 1.$$

Over a timeframe from $t$ to $t+k$, with $k > 1$, this generalizes to:

$$R_{t,t+k} = \prod_{i=0}^{k-1} \left(1 + R_{t+i,t+i+1}\right) - 1,$$

which reduces to $(1+R)^k - 1$ when every period has the same return $R$.

Practical Examples

  1. Two-Day Stock Performance: A stock increasing by 10% on day one and decreasing by 3% on day two has a compound return $R_{0,2} = (1 + 0.10)(1 - 0.03) - 1 = 6.7\%$, not simply $10\% - 3\% = 7\%$.

  2. Annualized Return from Quarterly Returns: A stock with consistent 1% quarterly returns yields an annualized return of $(1 + 0.01)^4 - 1 \approx 4.06\%$. Both examples are verified in the short sketch below.
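
Both examples can be reproduced in a couple of lines (the numbers are the ones used above):

python
# Example 1: two-day compound return
r_2day = (1 + 0.10) * (1 - 0.03) - 1
print(f"Two-day compound return: {r_2day:.2%}")  # 6.70%

# Example 2: annualized return from four 1% quarterly returns
r_year = (1 + 0.01) ** 4 - 1
print(f"Annualized return: {r_year:.2%}")  # 4.06%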

Monthly & Annual Returns

  • Monthly Returns: Given monthly returns, the compound total return after two months, $R_{\text{total}}$, is calculated. To find the equivalent per-month return $R_{pm}$, we solve $R_{\text{total}} = (1 + R_{pm})^2 - 1$, giving $R_{pm} = \sqrt{1 + R_{\text{total}}} - 1$.

  • Annualized Returns: The annualized return $R_{py}$ is derived from the monthly return using $R_{py} = (1 + R_{pm})^{12} - 1$. For a series of $n$ monthly returns, the formula becomes $R_{py} = (1 + R_{\text{total}})^{12/n} - 1$ (see the snippet below).
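
A quick illustration of both formulas (the monthly returns below are arbitrary sample values):

python
import numpy as np

monthly_rets = np.array([0.02, -0.01, 0.015])  # sample monthly returns
n = len(monthly_rets)

r_total = np.prod(1 + monthly_rets) - 1   # compound total return
r_pm = (1 + r_total) ** (1 / n) - 1       # equivalent per-month return
r_py = (1 + r_pm) ** 12 - 1               # annualized return

print(f"Total: {r_total:.2%} | Per month: {r_pm:.2%} | Annualized: {r_py:.2%}")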

Generalizing Annualized Returns

For different time intervals (daily, weekly, monthly), the annualized return formula adjusts the exponent, with $P_y$ representing the number of periods per year:

$$R_{py} = (1 + R_{\text{total}})^{P_y / N_{\text{rets}}} - 1, \qquad \text{where } P_y = \begin{cases} 252 & \text{if daily,} \\ 52 & \text{if weekly,} \\ 12 & \text{if monthly,} \end{cases}$$

and $N_{\text{rets}}$ is the number of returns in the series.
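
A small helper along these lines; the function name annualize_rets is purely illustrative (the PortfolioOptimizationKit used later may expose a similar utility under a different name or signature):

python
def annualize_rets(returns, periods_per_year):
    """Annualize a series of per-period returns (252 daily, 52 weekly, 12 monthly)."""
    total_return = (1 + returns).prod() - 1   # compound total return over the series
    n_rets = returns.shape[0]                 # number of return observations
    return (1 + total_return) ** (periods_per_year / n_rets) - 1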

Assessing Volatility and Risk

Volatility, a risk measure, is the standard deviation of asset returns:

$$\sigma := \sqrt{\frac{1}{N-1} \sum_t (R_t - \mu)^2},$$

where $\mu$ is the mean return and $N$ is the number of returns. For monthly returns, the annualized volatility is $\sigma_{\text{ann}} = \sigma_m \sqrt{12}$.

When dealing with monthly return data, the calculation of volatility usually focuses on the monthly scale, termed as monthly volatility. However, to understand the asset's risk profile over a longer period, such as a year, this monthly volatility needs to be scaled up. This process is necessary because volatility metrics derived from different time intervals are not directly comparable.

The conversion to annualized volatility $\sigma_{\text{ann}}$ involves a simple mathematical adjustment:

$$\sigma_{\text{ann}} = \sigma_p \sqrt{p},$$

where $\sigma_p$ is the volatility calculated over the shorter time period and $p$ is the number of such periods in a year, with $\sigma_{\text{ann}}$ the resulting annualized volatility.

For different time frames, the calculation adjusts as follows:

  • Monthly Volatility $\sigma_m$: To annualize, use $\sigma_{\text{ann}} = \sigma_m \sqrt{12}$
  • Weekly Volatility $\sigma_w$: Annualize by calculating $\sigma_{\text{ann}} = \sigma_w \sqrt{52}$
  • Daily Volatility $\sigma_d$: Convert to annual volatility with $\sigma_{\text{ann}} = \sigma_d \sqrt{252}$

This method standardizes volatility to a yearly scale, allowing for a consistent and comparable measure of risk across different time frames.
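
A minimal sketch of this scaling rule, again with illustrative numbers:

python
import numpy as np

monthly_rets = np.array([0.02, -0.01, 0.015, 0.005])  # sample monthly returns
sigma_m = monthly_rets.std(ddof=1)                    # monthly volatility (N - 1 denominator)
sigma_ann = sigma_m * np.sqrt(12)                     # annualized volatility

print(f"Monthly volatility: {sigma_m:.2%} | Annualized: {sigma_ann:.2%}")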

Zero Volatility Concept

Question: consider this scenario. Asset A experiences a monthly decrease of 1% over a period of 12 months, while Asset B sees a consistent monthly increase of 1% during the same timeframe.

Which asset exhibits greater volatility?

Interestingly, the answer is that neither of them displays any volatility (volatility is zero in both cases), as there are no real fluctuations: Asset A consistently decreases by 1% each month, whereas Asset B consistently increases by 1% each month.

python
import pandas as pd
from statsmodels.iolib.table import SimpleTable

# 1) Create your DataFrame
a = [100]
b = [100]
for i in range(12):
    a.append(a[-1] * 0.99)  # Asset A loses 1% each month
    b.append(b[-1] * 1.01)  # Asset B gains 1% each month

asset_df = pd.DataFrame({"Asset A": a, "Asset B": b})
asset_df["Return A"] = asset_df["Asset A"].pct_change()
asset_df["Return B"] = asset_df["Asset B"].pct_change()

# 2) Prepare data for SimpleTable
data = asset_df.values.tolist()          # Convert the DataFrame to a list of rows
headers = asset_df.columns.tolist()      # Use DataFrame column names as headers

# 3) Build a statsmodels SimpleTable
table = SimpleTable(data, headers)

# 4) Display the price/return table
print(table)

# 5) Compute & display summary statistics in a second table
total_returns = (1 + asset_df[["Return A", "Return B"]].iloc[1:]).prod() - 1
volatility = asset_df[["Return A", "Return B"]].iloc[1:].std()

stats_data = [
    ["Total Returns", f"{total_returns['Return A']:.2%}", f"{total_returns['Return B']:.2%}"],
    ["Volatility",    f"{volatility['Return A']:.4f}",    f"{volatility['Return B']:.4f}"]
]
stats_headers = ["Metric", "Asset A", "Asset B"]
stats_table = SimpleTable(stats_data, stats_headers)

print("\nSummary Stats:")
print(stats_table)

Python Example: Analyzing Stock Data

python
# In this analysis, we first generate synthetic stock prices for two stocks with distinct volatilities. 
# We then calculate their monthly returns and visualize the data. 
# The total compound returns, mean returns, and volatility are computed to understand the risk profile of each stock. 
# Finally, we calculate the Return on Risk (ROR) for each stock, which provides insights into the risk-adjusted performance of the investments. 
# This analysis suggests that Stock A offers a better return per unit of risk compared to Stock B, even though their total returns are similar.

import numpy as np
import pandas as pd
import json

# Example of generating stock prices with a DatetimeIndex
np.random.seed(51)
dates = pd.date_range(start='2020-01', periods=10, freq='M')  # Monthly dates
stocks = pd.DataFrame({
    "Stock A": np.random.normal(10, 1, size=10),
    "Stock B": np.random.normal(10, 5, size=10)
}, index=dates)
stocks.index.name = "Months"
stocks = round(stocks, 2)

# Calculating returns
stocks["Stock A Rets"] = stocks["Stock A"] / stocks["Stock A"].shift(1) - 1
stocks["Stock B Rets"] = stocks["Stock B"] / stocks["Stock B"].shift(1) - 1
stocks = round(stocks, 2)

# Print numerical results with proper formatting
total_ret = (1 + stocks[["Stock A Rets", "Stock B Rets"]]).prod() - 1
means = stocks[["Stock A Rets", "Stock B Rets"]].mean()
volatility = stocks[["Stock A Rets", "Stock B Rets"]].std()
ann_volatility = volatility * np.sqrt(12)

print("\nTotal Returns:")
for col in total_ret.index:
    print(f"{col}: {total_ret[col]:.2%}")

print("\nMean Returns:")
for col in means.index:
    print(f"{col}: {means[col]:.2%}")

print("\nVolatility:")
for col in volatility.index:
    print(f"{col}: {volatility[col]:.2%}")

print("\nAnnualized Volatility:")
for col in ann_volatility.index:
    print(f"{col}: {ann_volatility[col]:.2%}")

# Define crisis events (optional, adjust dates as needed)
crisis = [
    {"date": "2020-03", "name": "Initial Market Shock"},
    {"date": "2020-07", "name": "Mid-Year Adjustment"}
]

# Prepare visualization data matching OutputDisplay.vue structure
plot_data = {
    # X-Axis: convert DatetimeIndex to string (YYYY-MM)
    "dates": stocks.index.strftime('%Y-%m').tolist(),
    # "crisis": crisis,

    # Stock Prices
    "stockPrices": {
        "series": {
            "Stock A": stocks["Stock A"].tolist(),
            "Stock B": stocks["Stock B"].tolist()
        },
        "type": "line",                  # Specify chart type
        "yAxisName": "Price (USD)"       # Y-axis label
    },
    # Stock Returns
    "returns": {
        "series": {
            "Stock A Returns": (stocks["Stock A Rets"] * 100).fillna(0).tolist(),
            "Stock B Returns": (stocks["Stock B Rets"] * 100).fillna(0).tolist()
        },
        "type": "bar",                   # Specify chart type as histogram
        "yAxisName": "Returns (%)"       # Y-axis label
    }
}

# Print data for chart visualization
print("\n<ECHARTS_DATA>" + json.dumps(plot_data))

Evaluating Return on Risk

Return on Risk (ROR) measures the reward per unit of risk, calculated as:

$$\text{ROR} := \frac{\text{RETURN}}{\text{RISK}} = \frac{R}{\sigma},$$

where R is the total compound return. This metric helps compare investments with different risk profiles.

python
# This code segment computes the ROR for each stock, helping investors understand which stock offers better returns for the risk taken.
import numpy as np
import pandas as pd

# Example of generating stock prices with a DatetimeIndex
np.random.seed(51)
dates = pd.date_range(start='2020-01', periods=10, freq='M')  # Monthly dates
stocks = pd.DataFrame({
    "Stock A": np.random.normal(10, 1, size=10),
    "Stock B": np.random.normal(10, 5, size=10)
}, index=dates)
stocks.index.name = "Months"
stocks = round(stocks, 2)

# Calculating returns
stocks["Stock A Rets"] = stocks["Stock A"] / stocks["Stock A"].shift(1) - 1
stocks["Stock B Rets"] = stocks["Stock B"] / stocks["Stock B"].shift(1) - 1
stocks = round(stocks, 2)

# Print numerical results with proper formatting
total_ret = (1 + stocks[["Stock A Rets", "Stock B Rets"]]).prod() - 1
means = stocks[["Stock A Rets", "Stock B Rets"]].mean()
volatility = stocks[["Stock A Rets", "Stock B Rets"]].std()
ann_volatility = volatility * np.sqrt(12)

# Calculating Return on Risk
ROR = total_ret / volatility

print("\nTotal Returns:")
for col in ROR.index:
    print(f"{col}: {ROR[col]:.2%}")

# Conditional Comment Based on ROR
ror_a = ROR["Stock A Rets"]
ror_b = ROR["Stock B Rets"]

if ror_a > ror_b:
    comment = (
        "Higher ROR for Stock A indicates that we achieve a better return per unit of risk by investing in this stock.\n"
        "In other words, it is more advantageous to invest in Stock A rather than Stock B, even though the total returns "
        "from both stocks are similar."
    )
elif ror_a < ror_b:
    comment = (
        "Higher ROR for Stock B indicates that we achieve a better return per unit of risk by investing in this stock.\n"
        "In other words, it is more advantageous to invest in Stock B rather than Stock A, even though the total returns "
        "from both stocks are similar."
    )
else:
    comment = (
        "Both stocks have identical Return on Risk (ROR), indicating that they offer similar returns per unit of risk."
    )

print("\n" + comment)

Sharpe Ratio: Assessing Risk-Adjusted Returns

The Sharpe Ratio provides a more nuanced view of an investment's performance by considering the risk-free rate. This ratio adjusts the return on risk by accounting for the returns of a risk-free asset, like a US Treasury Bill. It's defined as the excess return per unit of risk:

$$\lambda := \frac{ER}{\sigma}, \qquad \text{where } ER := R - R_F.$$

Here, $ER$ is the excess return, calculated by subtracting the risk-free rate $R_F$ from the return $R$.

python
# This calculation demonstrates how the Sharpe Ratio can provide 
# additional insights into the risk-adjusted performance of an investment.

import numpy as np
import pandas as pd

# Example of generating stock prices with a DatetimeIndex
np.random.seed(51)
dates = pd.date_range(start='2020-01', periods=10, freq='M')  # Monthly dates
stocks = pd.DataFrame({
    "Stock A": np.random.normal(10, 1, size=10),
    "Stock B": np.random.normal(10, 5, size=10)
}, index=dates)
stocks.index.name = "Months"
stocks = round(stocks, 2)

# Calculating returns
stocks["Stock A Rets"] = stocks["Stock A"] / stocks["Stock A"].shift(1) - 1
stocks["Stock B Rets"] = stocks["Stock B"] / stocks["Stock B"].shift(1) - 1
stocks = round(stocks, 2)

# Print numerical results with proper formatting
total_ret = (1 + stocks[["Stock A Rets", "Stock B Rets"]]).prod() - 1
means = stocks[["Stock A Rets", "Stock B Rets"]].mean()
volatility = stocks[["Stock A Rets", "Stock B Rets"]].std()
ann_volatility = volatility * np.sqrt(12)

# Assuming a 3% risk-free rate
risk_free_rate = 0.03 
excess_return  = total_ret - risk_free_rate
sharpe_ratio   = excess_return / volatility

print("\nSharpe Ratio:")
for col in sharpe_ratio.index:
    print(f"{col}: {sharpe_ratio[col]:.2%}")

Illustrating Financial Concepts Using a Real-World Dataset

To illustrate these financial concepts, we will use a Kaggle dataset on the performance of Small Cap and Large Cap US stocks. The dataset includes monthly returns from July 1926 to December 2018.

Data Analysis

Dataset Source: The helper module PortfolioOptimizationKit.py is used to load the data from a CSV file. The dataset categorizes US stocks into Small Caps (bottom 10% by market capitalization) and Large Caps (top 10%).

Data Visualization: The code visualizes monthly returns for both categories. This step is crucial for a quick assessment of performance over time.

python
# This code segment analyzes the monthly returns of Small Cap and Large Cap stocks.
import pandas as pd
import json
import PortfolioOptimizationKit as pok

# ---------------------------------------------------------------------------------
# Load and format data files
# ---------------------------------------------------------------------------------

# Load the dataset without parsing dates initially
file_to_load = pok.path_to_data_folder() + "Portfolios_Formed_on_ME_monthly_EW.csv"
df = pd.read_csv(file_to_load, index_col=0, parse_dates=False, na_values=-99.99)

# Convert the index to string to facilitate datetime conversion
df.index = df.index.astype(str)

# Convert the string index to datetime using the format 'YYYYMM'
try:
    df.index = pd.to_datetime(df.index, format='%Y%m')
    print("\nConversion to DatetimeIndex successful.")
except ValueError as e:
    print("\nDate conversion failed:", e)
    print("The index remains as is.")

# Print the first few rows of the dataframe as plain text output
print("\n" + df.head().to_string())

# Focus on Small Cap and Large Cap stocks
small_large_caps = df[["Lo 10", "Hi 10"]] / 100  # Dividing by 100 to convert to actual returns

# Optional: Print descriptive statistics
print("\nDescriptive Statistics:")
print(small_large_caps.describe())

# Aggregate data annually to display only the year on the X-axis
small_large_caps_yearly = small_large_caps.resample('A').mean()

# Verify the aggregation
print("\nAnnual Average Returns:")
print(small_large_caps_yearly.head())

# Prepare visualization data matching OutputDisplay.vue structure
plot_data = {
    # X-Axis: convert DatetimeIndex to string (YYYY)
    "dates": small_large_caps_yearly.index.strftime('%Y').tolist(),

    # Small Cap and Large Cap Returns
    "smallLargeCaps": {
        "series": {
            "Small Cap (Lo 10)": small_large_caps_yearly["Lo 10"].tolist(),
            "Large Cap (Hi 10)": small_large_caps_yearly["Hi 10"].tolist()
        },
        "type": "line",                      # Specify chart type
        "yAxisName": "Annual Average Returns (%)"  # Y-axis label
    }
}

# Print data for chart visualization
print("\n<ECHARTS_DATA>" + json.dumps(plot_data))

Calculating Volatility

Monthly Volatility: Python's standard deviation function is used to calculate the monthly volatility, a measure of how much stock returns vary from their average value.

Annualizing Volatility: The monthly volatility is then scaled up to an annual figure. This conversion provides a broader view of the stocks' risk over a longer period.

python
# Calculating monthly volatility
monthly_volatility = small_large_caps.std()

# Annualizing the volatility
annualized_volatility = monthly_volatility * (12 ** 0.5)

print("\nMonthly Volatility:")
for col in monthly_volatility.index:
    print(f"{col}: {monthly_volatility[col]:.2%}")

print("\nAnnualized Volatility:")
for col in annualized_volatility.index:
    print(f"{col}: {annualized_volatility[col]:.2%}")

Returns Analysis

Monthly & Annual Returns: Python code is used to calculate both monthly and annual returns. This step is key to understanding the long-term growth potential of the stocks.

python
# Number of months in the dataset
n_months = small_large_caps.shape[0]

# Total compound return
total_return = (1 + small_large_caps).prod() - 1

# Monthly and annualized returns
return_per_month = (1 + total_return) ** (1 / n_months) - 1
annualized_return = (1 + return_per_month) ** 12 - 1

print("\nReturn per Month:")
for col in return_per_month.index:
    print(f"{col}: {return_per_month[col]:.2%}")

print("\nAnnualized Return:")
for col in annualized_return.index:
    print(f"{col}: {annualized_return[col]:.2%}")

Risk-Adjusted Returns: Return on Risk and Sharpe Ratio are computed. These metrics help us understand which stocks offer better returns for their level of risk.

python
# Assuming a risk-free rate
risk_free_rate = 0.03

# Return on Risk and Sharpe Ratio
return_on_risk = annualized_return / annualized_volatility
sharpe_ratio = (annualized_return - risk_free_rate) / annualized_volatility

print("\nReturn on Risk:")
for col in return_on_risk.index:
    print(f"{col}: {return_on_risk[col]:.2%}")

print("\nSharpe Ratio:")
for col in sharpe_ratio.index:
    print(f"{col}: {sharpe_ratio[col]:.2%}")

Drawdown

Drawdown is a crucial metric in portfolio management, indicating the maximum loss from a peak to a trough of a portfolio, before a new peak is achieved. It's a measure of the most significant drop in asset value and is often used to assess the risk of a particular investment or portfolio.

To calculate the drawdown, these steps are needed:

Calculate the Wealth Index: This represents the value of a portfolio as it evolves over time, taking into account the compounding of returns.

Determine the Previous Peaks: Identify the highest value of the portfolio before each time point.

Calculate the Drawdown: It is the difference between the current wealth index and the previous peak, represented as a percentage of the previous peak.

The formula can be expressed as:

$$\text{Drawdown at time } t = \frac{P_t - \max_{s \in [0,t]} P_s}{\max_{s \in [0,t]} P_s},$$

where $P_t$ is the portfolio value at time $t$, and $\max_{s \in [0,t]} P_s$ is the maximum portfolio value up to time $t$.

By extension, the Maximum Drawdown quantifies the most substantial loss experienced from a peak to a trough of a portfolio before the emergence of a new peak. This metric is crucial for measuring the most severe decrease in value. The Maximum Drawdown formula is:

$$\text{Max Drawdown} = \max_{t \in [0,T]} \left( \frac{\max_{s \in [0,t]} P_s - P_t}{\max_{s \in [0,t]} P_s} \right),$$

where $T$ is the entire time period under consideration, $P_t$ is the portfolio value at time $t$, and $\max_{s \in [0,t]} P_s$ is the maximum portfolio value up to time $t$.

Practical example

To showcase Drawdown, we will use the identical Small Cap and Large Cap US stocks dataset.

python
import pandas as pd
import numpy as np
import json
import sys
import os

# Ensure "/assets" is in the Python path so we can import your module & CSV files
sys.path.append('/assets')

try:
    # ------------------------------------------------------------------------------
    # 1) Import your portfolio optimization module
    # ------------------------------------------------------------------------------
    import PortfolioOptimizationKit as pok

    # ------------------------------------------------------------------------------
    # 2) Load Fama-French Monthly returns via pok.get_ffme_returns()
    # ------------------------------------------------------------------------------
    rets = pok.get_ffme_returns()
    # Ensure columns are consistent
    rets.columns = ["Small Caps", "Large Caps"]

    # ------------------------------------------------------------------------------
    # 3) Calculate key indices: wealth, peaks, drawdown, difference from peaks
    # ------------------------------------------------------------------------------
    wealth_index = 100 * (1 + rets).cumprod()
    previous_peaks = wealth_index.cummax()
    drawdown = (wealth_index - previous_peaks) / previous_peaks  # ratio
    diff_from_peaks = wealth_index - previous_peaks

    # ------------------------------------------------------------------------------
    # 5) Define crisis events in an array (adjusted to match data's date range)
    # ------------------------------------------------------------------------------
    crisis = [
        {"date": "1929", "name": "Great Depression"},
        {"date": "1990", "name": "Dot-Com Bubble Burst"},
        {"date": "2005", "name": "Lehman Brothers Crisis"}
    ]

    # ------------------------------------------------------------------------------
    # 6) Build a dictionary matching the chart keys in OutputDisplay.vue
    # ------------------------------------------------------------------------------
    plot_data = {
        # X-Axis: convert DatetimeIndex to string (YYYY)
        "dates": rets.index.strftime('%Y').tolist(),
        "crisis": crisis,

        # Small Caps Wealth with peaks
        "smallCapsWealth": {
            "series": {
                "Wealth": wealth_index["Small Caps"].tolist(),
                "Peaks": previous_peaks["Small Caps"].tolist()
            },
            "type": "line",
            "yAxisName": "Wealth Index",
            "markLine": {
                "data": [
                    {"xAxis": event["date"], "name": event["name"], "lineStyle": {"color": "#BC1142", "type": "dashed"}}
                    for event in crisis
                ],
                "label": {
                    "formatter": "{b}",
                    "position": "insideEndTop",
                    "color": "#172E5C",
                    "rotate": 90
                }
            }
        },
        # Large Caps Wealth with peaks
        "largeCapsWealth": {
            "series": {
                "Wealth": wealth_index["Large Caps"].tolist(),
                "Peaks": previous_peaks["Large Caps"].tolist()
            },
            "type": "line",
            "yAxisName": "Wealth Index",
            "markLine": {
                "data": [
                    {"xAxis": event["date"], "name": event["name"], "lineStyle": {"color": "#BC1142", "type": "dashed"}}
                    for event in crisis
                ],
                "label": {
                    "formatter": "{b}",
                    "position": "insideEndTop",
                    "color": "#172E5C",
                    "rotate": 90
                }
            }
        },
        # Difference from peaks (nominal) - No markLine
        "diffFromPeak": {
            "series": {
                "Small Caps": diff_from_peaks["Small Caps"].tolist(),
                "Large Caps": diff_from_peaks["Large Caps"].tolist()
            },
            "type": "line",
            "yAxisName": "Difference from Peaks"
            # Removed markLine to exclude crisis annotations
        },
        # Drawdown in %
        "drawdown": {
            "series": {
                "Small Caps": (drawdown["Small Caps"] * 100).tolist(),
                "Large Caps": (drawdown["Large Caps"] * 100).tolist()
            },
            "type": "line",
            "yAxisName": "Drawdown (%)"
            # Removed markLine to exclude crisis annotations
        }
    }

    # ------------------------------------------------------------------------------
    # 7) Print the final <ECHARTS_DATA> block so OutputDisplay.vue can parse it
    # ------------------------------------------------------------------------------
    print("\n<ECHARTS_DATA>" + json.dumps(plot_data))

except Exception as e:
    print(f"Error: {str(e)}")
    print(f"Current working directory: {os.getcwd()}")
    print(f"Python path: {sys.path}")

Insights from Historical Crises

python
# Calculate max drawdown and corresponding dates for 1929 Crisis (1929-1933)
min_drawdown_1929 = drawdown.loc["1929":"1933"].min().round(2) * 100 
date_min_drawdown_1929 = drawdown.loc["1929":"1933"].idxmin()

# Calculate max drawdown and corresponding dates for Dot Com Crisis (1990-2005)
min_drawdown_dotcom = drawdown.loc["1990":"2005"].min().round(2) * 100 
date_min_drawdown_dotcom = drawdown.loc["1990":"2005"].idxmin()

# Calculate max drawdown and corresponding dates for Lehman Brothers Crisis (2005 Onwards)
min_drawdown_lehman = drawdown.loc["2005":].min().round(2) * 100 
date_min_drawdown_lehman = drawdown.loc["2005":].idxmin()

# Define the headers for the tables
headers = ["Crisis", "Category", "Max Drawdown (%)", "Date of Max Drawdown"]

# Define the data for 1929 Crisis
crisis_1929 = "1929 Crisis"
data_1929 = [
    [crisis_1929, "Small Caps", f"{min_drawdown_1929['Small Caps']}%", date_min_drawdown_1929['Small Caps'].strftime('%Y-%m-%d')],
    [crisis_1929, "Large Caps", f"{min_drawdown_1929['Large Caps']}%", date_min_drawdown_1929['Large Caps'].strftime('%Y-%m-%d')]
]

# Define the data for Dot Com Crisis
crisis_dotcom = "Dot Com Crisis (1990-2005)"
data_dotcom = [
    [crisis_dotcom, "Small Caps", f"{min_drawdown_dotcom['Small Caps']}%", date_min_drawdown_dotcom['Small Caps'].strftime('%Y-%m-%d')],
    [crisis_dotcom, "Large Caps", f"{min_drawdown_dotcom['Large Caps']}%", date_min_drawdown_dotcom['Large Caps'].strftime('%Y-%m-%d')]
]

# Define the data for Lehman Brothers Crisis
crisis_lehman = "Lehman Brothers Crisis (2005 Onwards)"
data_lehman = [
    [crisis_lehman, "Small Caps", f"{min_drawdown_lehman['Small Caps']}%", date_min_drawdown_lehman['Small Caps'].strftime('%Y-%m-%d')],
    [crisis_lehman, "Large Caps", f"{min_drawdown_lehman['Large Caps']}%", date_min_drawdown_lehman['Large Caps'].strftime('%Y-%m-%d')]
]

def print_table(headers, data):
    """
    Generates and prints a formatted table given headers and data.

    Parameters:
    - headers: List of column headers.
    - data: List of rows, where each row is a list of cell values.
    """
    # Calculate the maximum width for each column
    col_widths = [len(header) for header in headers]
    for row in data:
        for i, cell in enumerate(row):
            cell_length = len(str(cell))
            if cell_length > col_widths[i]:
                col_widths[i] = cell_length

    # Create format string for each row
    row_format = "| " + " | ".join(f"{{:<{w}}}" for w in col_widths) + " |"

    # Print header row
    print(row_format.format(*headers))

    # Print separator row
    separator = "|" + "|".join(["-" * (w + 2) for w in col_widths]) + "|"
    print(separator)

    # Print data rows
    for row in data:
        print(row_format.format(*row))

    # Print ending separator
    print(separator)

# Generate and print the 1929 Crisis table
print_table(headers, data_1929)
print()  # Add a blank line for separation

# Generate and print the Dot Com Crisis table
print_table(headers, data_dotcom)
print()  # Add a blank line for separation

# Generate and print the Lehman Brothers Crisis table
print_table(headers, data_lehman)
  • 1929 Crisis:

    The Great Depression profoundly affected both Small Caps and Large Caps, leading to severe wealth erosion. This case study serves as a reminder of the potential extreme risks in the stock market.

    • Interpretation:
      • Small Caps: The maximum drawdown for Small Caps was 83.0%. This means that at its worst point during the 1929 crisis, the value of Small Caps assets decreased to 17% of their peak value before recovering.
      • Large Caps: Similarly, Large Caps had a maximum drawdown of 84.0%. This indicates that Large Caps assets fell to 16% of their peak value at the lowest point of the crisis.
      • Both Small Caps and Large Caps experienced their maximum drawdown in May 1932.
  • Dot Com Crisis (1990-2005):

    Characterized by the bursting of the dot-com bubble, this period saw significant losses, particularly in technology-heavy Large Caps. Reflects the volatility and potential downside of tech-driven market booms.

    • Interpretation:
      • Small Caps: Experienced a maximum drawdown of 38.0%. This indicates that, at its lowest point during the Dot Com crisis, the value of Small Caps assets fell to 62% of their peak value; the trough was reached in December 1990.
      • Large Caps: Had a maximum drawdown of 50.0%. This means that Large Caps assets decreased to 50.0% of their peak value at the worst point of the crisis, in September 2002.
  • Lehman Brothers Crisis (2005 Onwards):

    Triggered by the collapse of Lehman Brothers, this crisis led to a global financial meltdown. Highlights the interconnectedness of modern financial markets and the rapidity with which shocks can propagate.

    • Interpretation:
      • Small Caps: Experienced a maximum drawdown of 63.0%. This indicates that, at its lowest point during the Lehman Brothers crisis, the value of Small Caps assets decreased to 37% of their peak value before recovering.
      • Large Caps: Had a maximum drawdown of 53.0%. This means that Large Caps assets fell to 47% of their peak value at the worst point of the crisis.
      • Both Small Caps and Large Caps experienced their maximum drawdown in February 2009.

Gaussian Density & Distribution

A Gaussian random variable $X$ with mean $\mu$ and variance $\sigma^2$ (written $X \sim N(\mu, \sigma^2)$) is characterized by the following:

  • A Density function:
$$f(x) := \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),$$
  • And a Cumulative distribution function (CDF):
$$F_X(x) := P(X \le x) = \int_{-\infty}^{x} f(t)\,dt = \Phi(x).$$

Here, Φ(x) gives the probability that X is less than or equal to x.

When $\mu = 0$ and $\sigma^2 = 1$, $X$ is called a standard Gaussian random variable. This standardization simplifies many statistical calculations.

Symmetry Property

For a standard normal random variable $X \sim N(0,1)$, it holds that:

$$\Phi(-x) = 1 - \Phi(x).$$

This property reflects the symmetry of the standard normal distribution.

Distribution of Negative Random Variable

The exponential function $\exp(-t^2/2)$ is symmetric (i.e., an even function). Therefore, for a standard normal random variable $X \sim N(0,1)$, the distribution of $-X$ is also $N(0,1)$. Mathematically:

$$P(-X \le x) = P(X \ge -x) = 1 - P(X \le -x) = 1 - \Phi(-x) = \Phi(x) = P(X \le x),$$

confirming that the cumulative distribution function of $-X$ is identical to that of $X$:

$$F_{-X}(x) = F_X(x).$$

Quantiles

Imagine a random variable $X$, and let's consider a number $\alpha$ that lies between 0 and 1 (i.e., $\alpha \in (0,1)$). The quantile of order $\alpha$, denoted $\phi_\alpha \in \mathbb{R}$, is a value such that the probability of $X$ being less than or equal to $\phi_\alpha$ equals $\alpha$. Mathematically, it's defined as:

$$P(X \le \phi_\alpha) = \alpha.$$
  • Quantiles in a Standard Normal Distribution:

For a standard normal distribution, where $X \sim N(0,1)$, the concept of quantiles becomes particularly interesting. If $\phi_\alpha$ is an $\alpha$-quantile of a standard normal distribution, it satisfies:

$$\Phi(-\phi_\alpha) = P(X \le -\phi_\alpha) = P(-X \ge \phi_\alpha) = P(X \ge \phi_\alpha) = 1 - P(X \le \phi_\alpha) = 1 - \Phi(\phi_\alpha) = 1 - \alpha = P(X \le \phi_{1-\alpha}) = \Phi(\phi_{1-\alpha}).$$

This leads to a useful identity:

$$\phi_\alpha = -\phi_{1-\alpha}.$$

In essence, this identity illustrates the symmetry of the standard normal distribution.
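
Both the identity and the underlying symmetry can be checked numerically with scipy.stats:

python
from scipy.stats import norm

alpha = 0.1
phi_alpha = norm.ppf(alpha)              # alpha-quantile of N(0, 1)
phi_1_minus_alpha = norm.ppf(1 - alpha)  # (1 - alpha)-quantile

print(phi_alpha, -phi_1_minus_alpha)      # both approximately -1.2816
print(norm.cdf(-1.0), 1 - norm.cdf(1.0))  # both approximately 0.1587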

  • Symmetry in Probability:

In a standard normal distribution, the probability of the variable lying within certain symmetric bounds is given by:

$$P(|X| \le \phi_{1-\alpha/2}) = P(-\phi_{1-\alpha/2} \le X \le \phi_{1-\alpha/2}) = \Phi(\phi_{1-\alpha/2}) - \Phi(-\phi_{1-\alpha/2}) = \Phi(\phi_{1-\alpha/2}) - \Phi(\phi_{\alpha/2}) = (1 - \alpha/2) - (\alpha/2) = 1 - \alpha.$$

This formula is useful for understanding the distribution of values around the mean in a normal distribution.

Practical Application: Finding a Specific Quantile

For example, to determine the 0.9-quantile of a standard normal distribution (i.e., the value below which 90% of the data lies), we look for $\phi_{0.9}$ such that:

$$\phi_{0.9} = \Phi^{-1}(0.9).$$

Using the norm.ppf() function from scipy.stats, which provides quantiles for the Gaussian distribution, we can calculate this value directly.

python
import scipy.stats

# Calculating the 0.9-quantile of a standard normal distribution
phi_0_9 = scipy.stats.norm.ppf(0.9, loc=0, scale=1)

# Double-checking by calculating the probability up to the quantile
probability_check = scipy.stats.norm.cdf(phi_0_9, loc=0, scale=1)

# Preparing data for tabulation
data = [
    ["0.9-quantile (phi_0.9)", "{:.4f}".format(phi_0_9)],
    ["Probability check for phi_0.9", "{:.4f}".format(probability_check)],
]

# Creating a table with headers
headers = ["Description", "Value"]

def print_table(headers, data):
    """
    Generates and prints a formatted table given headers and data.

    Parameters:
    - headers: List of column headers.
    - data: List of rows, where each row is a list of cell values.
    """
    # Calculate the maximum width for each column
    col_widths = [len(header) for header in headers]
    for row in data:
        for i, cell in enumerate(row):
            cell_length = len(str(cell))
            if cell_length > col_widths[i]:
                col_widths[i] = cell_length

    # Create format string for each row
    row_format = "| " + " | ".join(f"{{:<{w}}}" for w in col_widths) + " |"

    # Print header row
    print(row_format.format(*headers))

    # Print separator row with '=' for headers
    separator = "|" + "|".join(["=" * (w + 2) for w in col_widths]) + "|"
    print(separator)

    # Print data rows
    for row in data:
        print(row_format.format(*row))

    # Print ending separator row with '-'
    separator_dash = "|" + "|".join(["-" * (w + 2) for w in col_widths]) + "|"
    print(separator_dash)

# Generate and print the table
print_table(headers, data)
  • Interpretation:
    • 0.9-Quantile of a Standard Normal Distribution: "The 0.9-quantile (phi_0.9) of a standard normal distribution: 1.2816"
    • Probability Check for the 0.9-Quantile: "Probability check for phi_0.9: 0.9000"

These results confirm that the value of approximately 1.2816 is indeed the 0.9-quantile of a standard normal distribution. Additionally, the probability check verifies that the cumulative distribution function (CDF) up to this quantile is 0.9, as expected.

Exploring Skewness and Kurtosis in Financial Data

Skewness: Assessing Asymmetry in Distributions

Skewness quantifies how asymmetrical a distribution is regarding its mean. It can be:

  • Positive: Indicating a tail on the right side.

  • Negative: Showing a tail on the left side.

Formally, skewness is defined using the third centered moment:

$$S(X) := \frac{E\!\left[(X - E(X))^3\right]}{\sigma^3},$$

where σ is the standard deviation of X.

Kurtosis: Analyzing Tails and Outliers

Kurtosis measures the "tailedness" or concentration of outliers in a distribution. It's calculated as:

$$K(X) := \frac{E\!\left[(X - E(X))^4\right]}{\sigma^4},$$

where σ is the standard deviation of X.

A normal distribution has a kurtosis of 3. Thus, "excess kurtosis" is often computed as $K(X) - 3$, offering a comparison to a normal distribution.
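
A direct translation of these two definitions is sketched below; the PortfolioOptimizationKit used in the next examples presumably implements skewness and kurtosis in the same spirit:

python
import numpy as np
import pandas as pd

def skewness(r):
    """Third centered moment divided by sigma^3 (population std, ddof=0)."""
    demeaned = r - r.mean()
    sigma = r.std(ddof=0)
    return (demeaned ** 3).mean() / sigma ** 3

def kurtosis(r):
    """Fourth centered moment divided by sigma^4 (plain kurtosis, not excess)."""
    demeaned = r - r.mean()
    sigma = r.std(ddof=0)
    return (demeaned ** 4).mean() / sigma ** 4

# Sanity check on simulated Gaussian data: skewness near 0, kurtosis near 3
sample = pd.Series(np.random.normal(0, 1, size=100_000))
print(skewness(sample), kurtosis(sample))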

Practical Analysis

Here's a practical demonstration using Python to understand skewness and kurtosis in financial datasets:

python
import pandas as pd
import numpy as np
import PortfolioOptimizationKit as pok
import json

# Generate normally distributed data
A = pd.DataFrame({"A": np.random.normal(0, 2, size=800)})

# Get market returns
B = pok.get_ffme_returns()
B = B["Hi 10"]

# Calculate histograms
def calculate_histogram(data_series, bins=60):
    counts, bin_edges = np.histogram(data_series, bins=bins, density=True)
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    return counts.tolist(), bin_centers.tolist()

# Calculate histogram data
normal_counts, normal_bins = calculate_histogram(A["A"])
market_counts, market_bins = calculate_histogram(B)

# Print statistics
print(f"Normal Distribution - Mean: {A['A'].mean():.3f}, Std: {A['A'].std():.3f}")
print(f"Market Returns - Mean: {B.mean():.3f}, Std: {B.std():.3f}")

# Prepare data in format matching OutputDisplay.vue expectations
plot_data = {
    "normalDist": {
        "type": "bar",
        "yAxisName": "Density",
        "series": {
            "Density": normal_counts
        },
        "xAxis": normal_bins
    },
    "marketDist": {
        "type": "bar",
        "yAxisName": "Density",
        "series": {
            "Density": market_counts
        },
        "xAxis": market_bins
    }
}

# Print chart data
print("\n<ECHARTS_DATA>" + json.dumps(plot_data))

In this analysis, the normal distribution is expected to show skewness near zero and kurtosis close to three. In contrast, the market returns, deviating from normality, may exhibit different skewness and higher kurtosis values.

Advanced Analysis: Incorporating Hedge Fund Indices

We extend our analysis of skewness and kurtosis to a dataset involving hedge fund indices. This dataset provides a different perspective, often diverging from the standard characteristics of normal distributions.

python
import pandas as pd
import PortfolioOptimizationKit as pok
import scipy.stats

# Load the hedge fund index data
hfi = pok.get_hfi_returns()
print(hfi.head(3))

# Initialize a DataFrame to store skewness and kurtosis values
hfi_skew_kurt = pd.DataFrame(columns=["Skewness", "Kurtosis"])

# Calculate skewness and kurtosis for each column in the hedge fund index data
hfi_skew_kurt["Skewness"] = hfi.aggregate(pok.skewness)
hfi_skew_kurt["Kurtosis"] = hfi.aggregate(pok.kurtosis)

# Display the calculated skewness and kurtosis
print(hfi_skew_kurt)

When looking for Gaussian-like distributions among the indices, CTA Global shows skewness near zero and kurtosis close to three, indicating a possibly normal distribution.

  • Using Jarque-Bera Test for Normality

The Jarque-Bera test, a statistical test for normality, helps confirm if an index follows a Gaussian distribution.

python
# Jarque-Bera test on CTA Global
jb_result_cta_global = scipy.stats.jarque_bera(hfi["CTA Global"])
print("CTA Global:", jb_result_cta_global)

# Check normality using the custom function in the PortfolioOptimizationKit (pok) toolkit
is_normal_cta_global = pok.is_normal(hfi["CTA Global"])
print("Is CTA Global Normal?", is_normal_cta_global)

# Jarque-Bera test on Convertible Arbitrage
jb_result_conv_arb = scipy.stats.jarque_bera(hfi["Convertible Arbitrage"])
print("Convertible Arbitrage:", jb_result_conv_arb)

# Check normality for Convertible Arbitrage
is_normal_conv_arb = pok.is_normal(hfi["Convertible Arbitrage"])
print("Is Convertible Arbitrage Normal?", is_normal_conv_arb)
  • Normality Across Indices

Finally, we examine the normality of all indices in the hedge fund dataset.

python
# Aggregate normality test across all indices
normality_test_results = hfi.aggregate(pok.is_normal)
print(normality_test_results)

This comprehensive analysis reveals that only the CTA Global index passes the normality test, suggesting it's the most normally distributed among the hedge fund indices.

Understanding Downside Risk Measures

Semivolatility: Focusing on Negative Fluctuations

Semivolatility, distinct from total volatility, zeroes in on the negative side of asset return fluctuations. In investment, the concern often isn't how much returns deviate when they're positive but rather how volatile they are when they're negative.

Semivolatility addresses this by measuring the standard deviation of only the subset of returns that are negative or below the mean. This measure is crucial for investors who prioritize safeguarding against losses over pursuing high returns. Mathematically, it's denoted as:

$$\sigma_{\text{semi}} := \sqrt{\frac{1}{N_{\text{semi}}} \sum_{R_t < 0} (R_t - \mu_{\text{semi}})^2},$$

where $\mu_{\text{semi}}$ represents the mean of the negative returns and $N_{\text{semi}}$ is their count.

NOTE

This measure can be adapted to include returns below any chosen benchmark, such as the overall mean return.

Calculating Semivolatility in Python: To compute semivolatility, filter the returns to include only the negative (or below-mean) values, and then apply the standard deviation formula to this subset.

python
print(pok.semivolatility(hfi))
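
For reference, a manual computation along the lines described above (a sketch that filters on strictly negative returns; pok.semivolatility may use a slightly different convention):

python
# Semivolatility computed by hand: standard deviation of the negative returns only
negative_rets = hfi[hfi < 0]            # non-negative entries become NaN and are skipped
semivol_manual = negative_rets.std(ddof=0)
print(semivol_manual)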

Value at Risk (VaR): Gauging Maximum Expected Loss

Value at Risk, or VaR, quantifies the maximum anticipated loss over a defined period under normal market conditions. The confidence level for this measure usually lies between 0 and 1 and is often expressed as a percentage.

Conceptualizing VaR with a 99% Confidence Level

For instance, let's consider a VaR with a 99% confidence level (i.e., α=0.99). This implies the assessment of the highest potential loss for a month, excluding the worst 1% of potential outcomes. Essentially, it answers the question: "What's the maximum amount we could lose with 99% certainty in a month?"

Given a set of monthly returns like:

$$R = (-4\%,\ +5\%,\ +2\%,\ -7\%,\ +1\%,\ +0.5\%,\ -2\%,\ -1\%,\ -2\%,\ +5\%).$$

The task is to determine the 90% monthly VaR, which involves two steps:

Exclude the 10% Worst Returns: This means removing the lowest 10% of returns, which, in a dataset of 10, equates to the single worst return, here $-7\%$.

Identify the Next Worst Return: After excluding the worst, the next worst return is $-4\%$. Therefore, the VaR is 4%.

It's important to note that even though the corresponding return is $-4\%$, VaR is conventionally expressed as a positive figure, hence VaR = 4%.

From a mathematical standpoint, given a confidence level α within (0,1), VaR is defined as:

$$\text{VaR}_\alpha := -\inf\{x \in \mathbb{R} : P(R \le x) \ge 1 - \alpha\} = -\inf\{x \in \mathbb{R} : P(R > x) \le \alpha\},$$

essentially making it the negative of the $(1-\alpha)$-quantile of the return distribution. The aim is to find the number $\text{VaR}_\alpha$ such that:

$$P(R \le -\text{VaR}_\alpha) = 1 - \alpha,$$

indicating a $(1-\alpha)$ probability of a negative return equal to or worse than $-\text{VaR}_\alpha$.

In the illustrated example, a 90% monthly VaR of 4% signifies:

$$-0.04 = -\text{VaR}_{0.9} = \inf\{x \in \mathbb{R} : P(R \le x) \ge 0.1\},$$

which implies a 10% chance of losing more than 4% of the investment (experiencing returns lower than $-4\%$).
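
The worked example can be reproduced directly from the sample returns, following the two steps above:

python
import numpy as np

rets = np.array([-0.04, 0.05, 0.02, -0.07, 0.01, 0.005, -0.02, -0.01, -0.02, 0.05])
rets_sorted = np.sort(rets)

n_exclude = int(0.10 * len(rets))        # worst 10% of observations -> one return (-7%)
var_90 = -rets_sorted[n_exclude]         # next worst return, with the sign flipped
print(f"90% monthly VaR: {var_90:.0%}")  # 4%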

Exploring Conditional Value at Risk (CVaR)

Conditional Value at Risk (CVaR) is a measure used to assess the risk of investments. It predicts the average loss that could occur in the worst-case scenarios beyond the threshold set by Value at Risk (VaR). While VaR tells you how bad a loss might be on a very bad day, CVaR provides insight into the average losses expected if things go even worse than the VaR estimate. Essentially, CVaR calculates the mean of the tail end of the loss distribution, representing the expected loss in the worst $(1-\alpha)$ fraction of cases.

The formula is given by:

$$\text{CVaR} := -E(R \mid R < -\text{VaR}) = -\frac{\int_{-\infty}^{-\text{VaR}} t\, f_R(t)\, dt}{F_R(-\text{VaR})},$$

where $f_R$ represents the density function of the returns $R$, and $F_R$ is their cumulative distribution function. This formula effectively captures the expected losses beyond the VaR threshold, providing a more comprehensive view of potential downside risk.

Illustrative Example

Consider this set of monthly returns:

$$R = (-4\%,\ +5\%,\ +2\%,\ -7\%,\ +1\%,\ +0.5\%,\ -2\%,\ -1\%,\ -2\%,\ +5\%).$$

The objective is to determine the 80% monthly CVaR, which involves three steps:

Exclude the Bottom 20% of Returns: Identify and remove the worst 20% of returns. For 10 returns, this means excluding the two worst, which are $-7\%$ and $-4\%$.

Identify the VaR: The next worst return after removing the bottom 20% is $-2\%$, setting the 80% VaR at 2%.

Calculate the Average Beyond VaR: Consider the average of the returns worse than $-\text{VaR}_{0.8} = -2\%$, which in this case are $-7\%$ and $-4\%$. Their average is $(-7\% - 4\%)/2 = -5.5\%$, so $\text{CVaR}_{0.8} = 5.5\%$.

This process illustrates how CVaR provides a more detailed risk assessment, taking into account the severity of losses beyond the VaR threshold.
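
The same three steps in code, reusing the sample returns from above:

python
import numpy as np

rets = np.array([-0.04, 0.05, 0.02, -0.07, 0.01, 0.005, -0.02, -0.01, -0.02, 0.05])
rets_sorted = np.sort(rets)

n_exclude = int(0.20 * len(rets))           # worst 20% -> two returns (-7%, -4%)
var_80 = -rets_sorted[n_exclude]            # next worst return -> 80% VaR = 2%
cvar_80 = -rets_sorted[:n_exclude].mean()   # average of the excluded tail -> 5.5%

print(f"80% VaR: {var_80:.1%} | 80% CVaR: {cvar_80:.1%}")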

Estimating VaR and CVaR: A Comparative Overview

Several methods are available to estimate VaR (Value at Risk) and CVaR (Conditional Value at Risk), each with its own approach and implications.

Historical Approach (Non-Parametric)

The historical method is a straightforward, non-parametric approach to estimating VaR. It directly applies the concept of VaR as (the negative of) the $(1-\alpha)$-quantile of the return distribution. Here's how it's typically implemented:

python
import pandas as pd
import numpy as np
import PortfolioOptimizationKit as pok
import json

try:
    # Retrieve returns for the CTA Global index
    hfi = pok.get_hfi_returns()
    cta_returns = hfi["CTA Global"]
    
    # Calculate histogram data
    counts, bin_edges = np.histogram(cta_returns, bins=60, density=True)
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    
    # Print basic statistics
    print("CTA Global Returns Statistics:")
    print(f"Mean: {cta_returns.mean():.3f}")
    print(f"Std Dev: {cta_returns.std():.3f}")
    print(f"Skewness: {cta_returns.skew():.3f}")
    print(f"Kurtosis: {cta_returns.kurtosis():.3f}")
    
    # Prepare data for visualization
    plot_data = {
        "ctaDistribution": {
            "type": "bar",
            "yAxisName": "Density",
            "series": {
                "Density": counts.tolist()
            },
            "xAxis": bin_centers.tolist()
        }
    }
    
    # Print chart data
    print("\n<ECHARTS_DATA>" + json.dumps(plot_data))

except Exception as e:
    print(f"Error: {str(e)}")

Suppose you're interested in calculating the 90%, 95%, and 99% monthly VaR. You'd set the corresponding tail levels at $1-\alpha = 0.1$, $0.05$, and $0.01$, respectively. The percentile method comes in handy for this calculation:

python
alpha = np.array([0.90, 0.95, 0.99])
level = 1 - alpha

# The percentile method requires the level in the range 0 to 100
VaRs = -np.percentile(hfi["CTA Global"], level*100)

print("90% VaR: {:.2f}%".format(VaRs[0] * 100))
print("95% VaR: {:.2f}%".format(VaRs[1] * 100))
print("99% VaR: {:.2f}%".format(VaRs[2] * 100))

This implies there's a 10%, 5%, and 1% chance in any given month that losses will exceed approximately 2.4%, 3%, and 5% respectively. Conversely, there's a 90%, 95%, and 99% chance that losses won't surpass these thresholds within the same timeframe.

However, it's crucial to note that this method's accuracy depends on the timescale of the returns used. A VaR calculated with monthly returns might differ significantly from one calculated using weekly data, highlighting the sensitivity of the historical method to the chosen timescale.

Parametric Approach (Gaussian)

In the Gaussian or parametric approach, returns are assumed to be normally distributed, a presumption which might not always hold true in real-world scenarios. If the returns $R$ follow a normal distribution with mean $\mu$ and volatility $\sigma$, we can write $R = \mu + \sigma X$ with $X \sim N(0,1)$, i.e., standardize $R$ into a standard normal form. The VaR is then derived from a quantile of the standard normal distribution.

To determine the specific threshold $z_\alpha$ that satisfies the definition of $\text{VaR}_\alpha$ via quantiles, we need to solve for the value where the probability of returns falling below $z_\alpha$ equals $(1-\alpha)$:

$$P(R \le z_\alpha) = 1 - \alpha.$$

This leads to a sequence of equalities:

$$1 - \alpha = P(R \le z_\alpha) = P(\mu + \sigma X \le z_\alpha) = P\!\left(X \le \frac{z_\alpha - \mu}{\sigma}\right) = \Phi\!\left(\frac{z_\alpha - \mu}{\sigma}\right),$$

which, when solved, gives us the value $z_\alpha = \mu + \Phi^{-1}(1-\alpha)\,\sigma$.

Thus, we establish the formula for $\text{VaR}_\alpha$:

$$\text{VaR}_\alpha = -\left(\mu + \Phi^{-1}(1-\alpha)\,\sigma\right),$$

where $\Phi^{-1}(1-\alpha)$ is the $(1-\alpha)$-quantile of the standard Gaussian distribution, and $\mu$ and $\sigma$ are the mean and volatility of the return series, respectively. The quantile is typically obtained with the norm.ppf function from scipy.stats.

python
# Compute the 95% monthly Gaussian VaR of the hedge fund indices 
alpha = 0.95
print(pok.var_gaussian(hfi, level=1-alpha))
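
The same quantity can also be computed directly from the formula; this is only a sketch of the idea, not necessarily identical to the internals of pok.var_gaussian:

python
from scipy.stats import norm

def var_gaussian_manual(r, alpha=0.95):
    """Parametric Gaussian VaR: -(mu + Phi^{-1}(1 - alpha) * sigma)."""
    z = norm.ppf(1 - alpha)                 # (1 - alpha)-quantile of N(0, 1)
    return -(r.mean() + z * r.std(ddof=0))  # per-column result for a DataFrame

print(var_gaussian_manual(hfi, alpha=0.95))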

Cornish-Fisher Modification (Semi-Parametric)

This approach modifies the Gaussian method using the Cornish-Fisher expansion, which adjusts the Gaussian quantiles to account for skewness and kurtosis of the return distribution. This makes it a better fit for non-Gaussian distributions:

$$\tilde{z}_\alpha = z_\alpha + \frac{1}{6}\left(z_\alpha^2 - 1\right)S + \frac{1}{24}\left(z_\alpha^3 - 3 z_\alpha\right)(K - 3) - \frac{1}{36}\left(2 z_\alpha^3 - 5 z_\alpha\right)S^2,$$

where $\tilde{z}_\alpha$ represents the $\alpha$-quantile of the non-Gaussian distribution (such as the series of returns), while $S$ and $K$ stand for its skewness and kurtosis, respectively. On the other hand, $z_\alpha$ refers to the $\alpha$-quantile of a standard Gaussian distribution. It's important to note that if the return series truly followed a Gaussian distribution, we would expect $S = 0$ and $K = 3$; under these conditions, $\tilde{z}_\alpha$ would naturally coincide with $z_\alpha$, reflecting the Gaussian nature of the data.

Thus, using this approach, the Value at Risk (VaR) at confidence level α is given by:

$$\text{VaR}_\alpha = -\left(\mu + \tilde{z}_\alpha\,\sigma\right).$$

python
# Compute the 95% monthly Gaussian VaR of the hedge fund indices using the Cornish-Fisher method
print(pok.var_gaussian(hfi, cf=True))
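
A hand-rolled version of the adjustment, reusing the skewness and kurtosis helpers sketched earlier; pok.var_gaussian with cf=True presumably performs an equivalent computation internally:

python
from scipy.stats import norm

def var_cornish_fisher(r, alpha=0.95):
    """Semi-parametric VaR based on the Cornish-Fisher adjusted quantile."""
    z = norm.ppf(1 - alpha)
    s = skewness(r)   # third standardized moment (helper defined earlier)
    k = kurtosis(r)   # fourth standardized moment, not excess kurtosis
    z_cf = (z
            + (z**2 - 1) * s / 6
            + (z**3 - 3 * z) * (k - 3) / 24
            - (2 * z**3 - 5 * z) * s**2 / 36)
    return -(r.mean() + z_cf * r.std(ddof=0))

print(var_cornish_fisher(hfi, alpha=0.95))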

Comparison of VaR Methods

Comparing the VaR computed via different methods provides insights into the sensitivity and suitability of each approach under various market conditions:

python
import pandas as pd
import numpy as np
import PortfolioOptimizationKit as pok
import json

try:
    # Get HFI data
    hfi = pok.get_hfi_returns()
    
    # Calculate different VaR measures
    comparevars = pd.concat([
        pok.var_historic(hfi), 
        pok.var_gaussian(hfi), 
        pok.var_gaussian(hfi, cf=True), 
        pok.cvar_historic(hfi)
    ], axis=1)
    
    # Name the columns
    comparevars.columns = ["Historical", "Gaussian", "Cornish-Fisher", "Conditional VaR"]
    
    # Convert to percentage
    comparevars = comparevars * 100
    
    # Print summary statistics
    print("VaR Comparison Statistics (%):")
    print(comparevars.round(2))
    
    # Prepare data for visualization
    plot_data = {
        "varComparison": {
            "type": "bar",
            "title": "Comparison of 95% monthly VaRs for Hedge Fund indices",
            "yAxisName": "Value at Risk (%)",
            "xAxisName": "Strategy",
            "series": {
                var_type: comparevars[var_type].tolist() 
                for var_type in comparevars.columns
            },
            "xAxis": {
                "type": "category",
                "data": list(comparevars.index),
                "axisLabel": {
                    "rotate": 45,  # Rotate labels for better readability
                    "interval": 0   # Show all labels
                }
            }
        }
    }
    
    # Print chart data
    print("\n<ECHARTS_DATA>" + json.dumps(plot_data))

except Exception as e:
    print(f"Error: {str(e)}")

This visualization generally shows that Conditional VaR tends to estimate higher risk levels compared to the other methods, especially in tail events, while the historical method often presents the lowest VaR estimates. Each method has its place, depending on the risk profile, investment horizon, and market conditions an investor is dealing with.

After earning certification from EDHEC Business School, I translated complex financial theories into practical Python modules, openly shared under the MIT License.