A Thesis By
JUAN RUFINO MORGA REYES, 51198208
Master of Public Policy, International Program (MPP/IP)
(Economic Policy, Finance, and Development)
PROFESSOR KONSTANTIN KUCHERYAVYY (PH.D.)
Academic Supervisor
GRADUATE SCHOOL OF PUBLIC POLICY
THE UNIVERSITY OF TOKYO
TOKYO, JAPAN
JUNE 2021
© 2021 Juan Rufino M. Reyes
All Rights Reserved.
DECLARATION
I hereby declare that this thesis is my original work, and I have written it in its entirety.
I have duly acknowledged all the sources of information that have been used in this research.
In addition, this study has not previously been submitted for any degree at any university.
(Sgd.)
JUAN RUFINO M. REYES, 51198208
Master of Public Policy, International Program (MPP/IP)
Graduate School of Public Policy
The University of Tokyo
ACKNOWLEDGEMENT
First of all, I would like to express my deepest gratitude to my academic supervisor,
Professor Konstantin Kucheryavyy (Ph.D.), for imparting his knowledge on data science and
providing technical assistance for this research. I appreciate the effort and encouragement you
conveyed during the entire thesis writing process. Thank you for the patience you have shown
and for the remarkable recommendations for this thesis.
I would also like to thank (1) the Joint Japan/World Bank Graduate Scholarship
Program (JJ/WBGSP) for giving me the opportunity and support to study at the most
prestigious university in Japan, The University of Tokyo (UTokyo); and
(2) the Bangko Sentral ng Pilipinas (BSP) for allowing me to pursue a degree in the field of
public policy (i.e., economic policy, finance, and development) to become a better central banker
who could contribute to the development of monetary policy in the Philippines.
To the significant contributors of data in this thesis: Mr. Justin Parco of the
Investor Relations Office (IRO) and colleagues at the Department of Economic Statistics (DES)
and International Operations Department (IOD) of the BSP, I appreciate your generosity in
providing relevant statistics despite your busy schedules. Thank you for your
kind understanding.
Lastly, I am particularly thankful to Ms. Mia Agcaoili, my parents
(Engr. Rico and Marylen Reyes), and my siblings (Ms. Michelle and Ana Reyes) for
their unending love and support. I am forever grateful for your encouragement that
I can produce a study that is timely and relevant. Thank you for believing that this
thesis could be one of the best!
TABLE OF CONTENTS
DECLARATION
ACKNOWLEDGEMENT
TABLE OF CONTENTS
ABSTRACT
ACRONYMS
LIST OF TABLES
LIST OF FIGURES
PART ONE
RESEARCH FRAMEWORK:
BACKGROUND, THEORY, AND METHODOLOGY OF THE STUDY
CHAPTER I: INTRODUCTION
1.1. Background of the Study
1.1.1. Economic Nowcasting, Big Data, and Machine Learning
1.1.2. The Philippines and Domestic Liquidity
1.2. Statement of the Problem
1.3. Research Objectives
1.4. Significance of the Study
1.5. Scope and Limitations
1.6. Definition of Terms
CHAPTER II: REVIEW OF RELATED LITERATURE
2.1. Primer
2.2. Regularization Methods
2.3. Tree-Based Methods
2.4. The Utilization of Two (2) Machine Learning Methods
CHAPTER III: RESEARCH METHODOLOGY
3.1. Primer
3.2. Models
3.2.1. Benchmark Models
3.2.1.1. Autoregressive Models
3.2.1.1.1. Autoregressive Integrated Moving Average
3.2.1.1.2. Random Walk
3.2.1.2. Vector Autoregression
3.2.1.3. Dynamic Factor Model
3.2.2. Machine Learning Models
3.2.2.1. Regularization Methods
3.2.2.1.1. Ridge Regression
3.2.2.1.2. Least Absolute Shrinkage and Selection Operator
3.2.2.1.3. Elastic Net
3.2.2.2. Tree-Based Methods
3.2.2.2.1. Decision Tree
3.2.2.2.2. Random Forest
3.2.2.2.3. Gradient Boosted Trees
3.3. Nowcast Evaluation Methodology
3.4. Research Tool
PART TWO
RESEARCH ANALYSIS:
DATA AND EMPIRICAL RESULTS
CHAPTER IV: DATA AND DIAGNOSTICS
4.1. Primer
4.2. Data
4.2.1. Target Variable
4.2.2. Input Variables
4.2.2.1. Monetary Indicators
4.2.2.2. Financial Indicators
4.2.2.3. External Indicators
4.2.2.4. Lagged Values of Domestic Liquidity
4.3. Averaging and Interpolation
4.3.1. Averaging of High Frequency Variables
4.3.2. Interpolation of Low Frequency Variables
4.4. Diagnostics and Feature Engineering
4.4.1. Seasonal Adjustment
4.4.2. Logarithmic Transformation
4.4.3. Stationarity
CHAPTER V: EMPIRICAL RESULTS AND ANALYSIS
5.1. Primer
5.2. Calibration and Nowcast Results
5.2.1. One-Step-Ahead (Out-of-Sample) via Expanding Window
5.2.2. Autoregressive Models
5.2.2.1. Model Calibration
5.2.2.2. Nowcast Results
5.2.3. Dynamic Factor Model
5.2.3.1. Model Calibration
5.2.3.2. Nowcast Results
5.2.4. Machine Learning Models
5.2.4.1. Regularization Methods
5.2.4.1.1. Model Calibration
5.2.4.1.2. Nowcast Results
5.2.4.2. Tree-Based Methods
5.2.4.2.1. Model Calibration
5.2.4.2.2. Nowcast Results
5.3. Further Analysis
5.3.1. Variable Importance
5.3.1.1. LASSO and ENET
5.3.1.2. Random Forest and Gradient Boosted Trees
PART THREE
FINAL CHAPTERS
CHAPTER VI: CONCLUSION
6.1. Summary and Conclusion
CHAPTER VII: RECOMMENDATION
7.1. Potential Actions
7.2. Suggestions for Future Research
BIBLIOGRAPHY
ANNEXES
Annex A R Studio Packages
Annex B Unit Root Tests for Input Variables
Annex C Optimal Shrinkage Penalty via Ridge Regression
Annex D Optimal Shrinkage Penalty via LASSO
Annex E Optimal Shrinkage Penalty via ENET
Annex F OOB Error of Training Datasets via Random Forest
Annex G Optimal Number of Trees via Gradient Boosted Trees
Annex H Variable Coefficients via LASSO: January to December 2020
Annex I Variable Coefficients via ENET: January to December 2020
ABSTRACT
1
,
2
Domestic liquidity (also known as broad money) is defined as the sum of all
liquid financial instruments held by money-holding sectors that are used as a
medium of exchange in an economy (IMF, 2016). Changes in the overall growth of this
monetary indicator are among the most important dynamics that numerous central banks
closely monitor. This is because it is an essential element of the
overall transmission mechanism of monetary policy, particularly the impact of
money supply expansion or contraction on aggregate demand, interest rates, inflation, and
overall economic growth (Mankiw, n.d.).
In the Philippines, data on domestic liquidity are used as a primary component
to formulate monetary policy and as a leading indicator to observe
price and financial stability. However, similar to the delayed publication
of data or statistical indicators generated by most government offices, data on domestic liquidity
in the country also suffer from a series of lags and revisions. Due to this predicament,
policymakers in the Central Bank of the Philippines or Bangko Sentral ng Pilipinas (BSP)
typically formulate monetary policies and address different economic phenomena (e.g., inflation,
business cycle) using outdated or lagged values.
The concept of short-term forecasting or nowcasting is among the
methodologies utilized by numerous institutions (e.g., International Financial Institutions (IFIs),
central banks) to address the aforementioned issues in data publication. This approach
has also become prevalent because of the emergence of big data and machine learning,
which augment its overall process (Hassani and Silva, 2015; Richardson et al., 2018).
1
juanrufin[email protected]; juanrufinom[email protected]tokyo.ac.jp
2
The results expressed herein do not represent the views or opinions of GraSPP, UTokyo, or the BSP. Errors and omissions
are the sole responsibility of the author.
That being said, this study aims to utilize machine learning algorithms to provide an
optimal model to nowcast the growth of domestic liquidity in the Philippines.
In particular, the following steps are performed to support this objective:
(1) perform one-step-ahead (out-of-sample) nowcasts through regularization
(i.e., Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO),
Elastic Net (ENET)) and tree-based methods (i.e., Random Forest (RF),
Gradient Boosted Trees (GBT)); (2) recognize and compare the accuracy of each algorithm
vis-à-vis traditional time series models used in economic forecasting, such as
Autoregressive (AR) Models and Dynamic Factor Model (DFM); and (3) systematically identify
important high-frequency variables (i.e., monetary, financial, external sector) that could
accurately nowcast domestic liquidity in the Philippines.
Based on the recursive nowcasts conducted from January to December 2020,
machine learning algorithms were found to provide more accurate estimates than the
traditional time series models utilized in this study. This is due to the consistent
monthly estimates with low forecast errors (i.e., Root Mean Square Error, Mean Absolute Error)
that the machine learning algorithms registered. These models also registered
precise nowcasts for the months in which domestic liquidity growth suddenly expanded
(e.g., increased borrowings and deposits of the National Government with the BSP) due to the impact of
Coronavirus Disease 2019 (COVID-19) in the Philippines. Further, the results indicate that
regularization methods are the optimal machine learning algorithms for nowcasting the
aforementioned monetary indicator.
This study also concludes that regularization methods, such as
LASSO and ENET, as well as tree-based methods, such as RF and GBT, are useful in
filtering out or identifying important indicators that yield parsimonious nowcasting models
with precise results.
Keywords: Domestic Liquidity, Machine Learning, Nowcasting, Philippines
ACRONYMS
ACF Autocorrelation Function
ADB Asian Development Bank
ADF Augmented Dickey-Fuller Test
AIC Akaike Information Criterion
ARC Advance Release Calendar
ARIMA Autoregressive Integrated Moving Average
AT Adaptive Trees
BOP Balance of Payments
BSP Bangko Sentral ng Pilipinas
BVAR Bayesian Vector Autoregression
CBS Central Bank Survey
CDS Credit Default Swap
COVID-19 Coronavirus Disease 2019
CPI Consumer Price Index
DCS Depository Corporations Survey
DES Department of Economic Statistics
DFM Dynamic Factor Model
ENET Elastic Net
EWS Early Warning System
FOF Flow of Funds
FOREX Foreign Exchange Rate
FPI Foreign Portfolio Investment
GBT Gradient Boosted Trees
GDP Gross Domestic Product
HQ Hannan-Quinn Information Criterion
IFI International Financial Institutions
IMF International Monetary Fund
IOD International Operations Department
LASSO Least Absolute Shrinkage and Selection Operator
LIBOR London Interbank Offered Rates
LSM Large-Scale Manufacturing
M1 Narrow Money
M2 M1 and Savings/Time Deposits
M3 Domestic Liquidity
MAE Mean Absolute Error
MAFE Mean Absolute Forecast Error
MFSM Monetary and Financial Statistics Manual
MSFE Mean Squared Forecast Error
NG National Government
NGA National Government Agencies
ODC Other Depository Corporations
OLS Ordinary Least Squares
OOB Out-of-Bag Error
PACF Partial Autocorrelation Function
PBS Philippine Banking System
PHIREF Philippine Interbank Reference Rate
PP Phillips-Perron Test
RF Random Forest
RMSE Root Mean Square Error
RSS Residual Sum of Squares
RW Random Walk
SARIMA Seasonal Autoregressive Integrated Moving Average
SIBOR Singapore Interbank Offered Rates
VAR Vector Autoregression
WB World Bank Group
WEO World Economic Outlook
WMOR Weighted Monetary Operations Rate
YOY Year-on-Year
LIST OF TABLES
Table 1.1 Depository Corporations Survey (Date Accessed: 10 April 2021)
Table 4.1 Summary Statistics of Domestic Liquidity in the Philippines
Table 4.2 List of Data
Table 4.3 Unit Root Tests for Domestic Liquidity in the Philippines
Table 5.1 RMSE of Autoregressive Models
Table 5.2 MAE of Autoregressive Models
Table 5.3 RMSE of DFM
Table 5.4 MAE of DFM
Table 5.5 RMSE of Ridge Regression, LASSO, and ENET
Table 5.6 MAE of Ridge Regression, LASSO, and ENET
Table 5.7 RMSE of RF and GBT
Table 5.8 MAE of RF and GBT
Table 5.9 Variable Coefficients via LASSO and ENET (Jan.-Feb. 2020)
Table 6.1 RMSE of Benchmark and Machine Learning Models (Summary)
Table 6.2 MAE of Benchmark and Machine Learning Models (Summary)
LIST OF FIGURES
Figure 3.1 Decision Tree Growing Process
Figure 3.2 Expanding Window Process
Figure 4.1(a) Domestic Liquidity in the Philippines (Levels, in Million PHP)
Figure 4.1(b) Domestic Liquidity in the Philippines (Growth Rate, in Percent)
Figure 4.2(b) Domestic Liquidity in the Philippines (Growth Rate, in First Diff.)
Figure 4.3 Research Workflow Diagram
Figure 5.1(a) ACF of M3 (Seasonally Adjusted)
Figure 5.1(b) PACF of M3 (Seasonally Adjusted)
Figure 5.2 Residual Plot for ARIMA (4,1,1)
Figure 5.3 Autoregressive Model Nowcasts vs. Actual M3 Growth (in Percent)
Figure 5.4 Eigenvalues of Input Variables via Factor Analysis
Figure 5.5 DFM Nowcasts vs. Actual M3 Growth (in Percent Diff.)
Figure 5.6(a) Overall RMSE of Autoregressive Models and DFM
Figure 5.6(b) Overall MAE of Autoregressive Models and DFM
Figure 5.7 Optimal Shrinkage Penalty via Ridge Regression
Figure 5.8 Regularization Nowcasts vs. Actual M3 Growth (in Percent Diff.)
Figure 5.9(a) Overall RMSE of Benchmark Models and Regularization Methods
Figure 5.9(b) Overall MAE of Benchmark Models and Regularization Methods
Figure 5.10 OOB Error of Training Datasets via Random Forest
Figure 5.11 Optimal Number of Trees via Gradient Boosted Trees
Figure 5.12 Tree-Based Method Nowcasts vs. Actual M3 Growth (in Percent Diff.)
Figure 5.13(a) Overall RMSE of Benchmark Models and Tree-Based Methods
Figure 5.13(b) Overall MAE of Benchmark Models and Tree-Based Methods
Figure 5.14 Node Impurity via Random Forest
Figure 5.15 Variable Importance Plot via Gradient Boosted Trees
Figure 6.1 Overall Forecast Errors of Benchmark and Machine Learning Models
CHAPTER I: INTRODUCTION
CHAPTER II: REVIEW OF RELATED LITERATURE
CHAPTER III: RESEARCH METHODOLOGY
Chapter I:
INTRODUCTION
1.1. Background of the Study
Understanding the current condition of their respective economies is essential
for policymakers around the world. Timely announcements of various
macroeconomic indicators (e.g., monetary, national accounts) are therefore important for them to
monitor the current growth of different economic sectors comprehensively (e.g., households,
other depository corporations) and to formulate and implement strong policy
(e.g., fiscal, monetary) responses. Proponents of high-quality public data management,
such as the International Monetary Fund (IMF), argue that having reliable and sensible
datasets is essential to depict the overall condition of an economy and to monitor closely
whether any negative externalities could cause a financial crisis. Hence, numerous government offices
(e.g., central banks, finance ministries) are transforming their approach to ensure that
macroeconomic indicators are published in a timely and consistent manner
(Carriere-Swallow and Haksar, 2019).
These data management principles, however, cannot be easily adopted in
every country. This is because of the tedious and complicated processes that each
government office must perform to produce numerous macroeconomic indicators promptly.
The proper classification of accounts, changes in the overall compilation framework, and
inevitable delays in receiving input documents are among the reasons for
delays in publishing data at the national level (Dafnai and Sidi, 2010;
Chikamatsu et al., 2018). Recent studies discussed that national government agencies (NGAs)
and central banks from different advanced (e.g., United States (US), Japan, New Zealand) and
emerging economies (e.g., Israel, Lebanon) have encountered this difficulty
(Dafnai and Sidi, 2010; Bragoli and Modugno, 2016; Chikamatsu et al., 2018;
Richardson et al., 2018). Due to this predicament, policymakers from these countries are forced
to formulate policies and address several economic phenomena (e.g., inflation, business cycle)
using non-related, outdated, or lagged datasets (Richardson et al., 2018).
Short-term forecasting or nowcasting is one of the methodologies recently introduced
by different International Financial Institutions (IFIs),
NGAs, and central banks to systematically address this concern. This is because of its strong capacity to observe the overall state of
an economy or any target variable of interest using conventional and unconventional data
as well as high-frequency indicators that are usually published at an earlier date (Tiffin, 2016).
Due to the difficulty in producing official macroeconomic indicators on a real-time basis,
nowcasting has been the alternative approach used by said institutions to systematically
estimate the official figure of a specific set of information before it becomes available
Asian Development Bank (ADB) are
among the IFIs that conducted comprehensive studies regarding the use of nowcasting in
different fields of study (e.g., economics, finance). Meanwhile, central banks of Indonesia, Israel,
Japan, and New Zealand are among the well-known institutions that attempted to use the said
concept to estimate the short-run growth of their respective Gross Domestic Product (GDP) and
Consumer Price Index (CPI).
3
1.1.1. Economic Nowcasting, Big Data, and Machine Learning
In past years, predicting the overall growth of an economy, the progress of a
particular economic sector, and the transmission mechanism of policies has commonly been performed
through economic forecasting using time series analysis. This approach has been the traditional
forecasting methodology in the field of economics (or econometrics) because numerous studies
have already established its capacity to provide a clear and substantial outlook of different
macro and socioeconomic indicators, such as GDP, CPI, and poverty incidence, among others.
Aside from this, the said approach is frequently used by various well-known institutions to
estimate the dynamic effects of policy implementation on the overall economic growth of their
3
See Dafnai and Sidi (2010), Chikamatsu et al. (2018), Richardson et al. (2018), and Tamara et al. (2020).
respective country. Among the numerous time series models used in economic forecasting are
Autoregressive (AR), Vector Autoregressive (VAR), and Dynamic Factor Models (DFM).
4
However, in most cases, time series models used in economic forecasting are
highly dependent on the timeliness of data or information. Therefore, any delay in the
publication process of the explanatory variable(s) included in a particular forecasting model
could hamper the attempt to predict the future condition of the target output.
For instance, to predict the GDP for Q2:2020 using a simple AR(1) model, its figure as of
end-Q1:2020 is required.
5
In a typical situation, however, GDP for
Q1:2020 is not released exactly at the end of said period. The latest figures are typically posted
one (1) or two (2) months after the reference date (e.g., GDP for Q2:2020 is published in
August 2020, rather than end-June 2020).
6
Therefore, an individual or institution that aims to
forecast the economic growth for Q2:2020 using an AR(1) model should wait until the GDP as
of end-Q1:2020 is published.
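The dependence described above can be illustrated with a minimal sketch. It assumes an AR(1) recursion of the form y_t = c + phi * y_(t-1); the function name `ar1_forecast`, the coefficients, and the growth figures are hypothetical and are not estimates from this study. The sketch shows that, without the Q1:2020 figure, a forecast for Q2:2020 has to be projected two steps ahead from Q4:2019 instead of one step ahead from Q1:2020.

```python
def ar1_forecast(last_value, c, phi, steps=1):
    """Iterate y_t = c + phi * y_{t-1} forward `steps` periods from last_value."""
    value = last_value
    for _ in range(steps):
        value = c + phi * value
    return value


# Hypothetical coefficients and growth rates (percent); not estimates from real data.
c, phi = 1.0, 0.5
gdp_q4_2019 = 6.0   # latest published figure while Q1:2020 is still unreleased
gdp_q1_2020 = 3.0   # becomes available only after the publication lag

# With Q1:2020 published, Q2:2020 is a one-step-ahead forecast.
q2_from_q1 = ar1_forecast(gdp_q1_2020, c, phi, steps=1)

# Without it, Q2:2020 must be projected two steps ahead from Q4:2019.
q2_from_q4 = ar1_forecast(gdp_q4_2019, c, phi, steps=2)

print(q2_from_q1, q2_from_q4)
```

The two results differ because the two-step projection replaces the unpublished Q1:2020 figure with its own forecast, so any error in that intermediate value carries over to the Q2:2020 estimate.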
This concern was one of the main reasons that pushed numerous individuals and
institutions to adopt the concept of nowcasting in the field of economics. This is because of its
capacity to exploit multiple real-time data or information (e.g., daily financial data,
survey results) to accurately estimate the present, near future, and recent past of a particular
macro or socioeconomic variable l., 2013, Chikamatsu et al., 2018;
Richardson et al., 2018). For example, to predict the current state of an economy,
high-frequency data or information (e.g., trade balances, financial data) that signals the current
GDP can be utilized before associated official GDP figures are published (Tiffin, 2016).
Moreover, since most conventional macroeconomic indicators are published with lags and
frequent revisions, nowcasting became an essential tool for policymakers to minimize the
usual approach of addressing different economic phenomena using non-related, outdated, or
lagged data (Richardson et al., 2018).
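As a rough illustration of this idea, and not the method used in this study, the sketch below fits a single-regressor least squares line between a timely indicator and the official figure over past periods, then applies it to the indicator value that is already observed for the current period. The names `fit_line` and `nowcast` and all figures are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single regressor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
        (xi - mean_x) ** 2 for xi in x
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


def nowcast(indicator_history, target_history, indicator_now):
    """Estimate the unpublished target from an indicator that is already available."""
    intercept, slope = fit_line(indicator_history, target_history)
    return intercept + slope * indicator_now


# Hypothetical figures for illustration only (not actual data).
indicator_history = [2.0, 3.5, 1.0, 4.0, 2.5]   # timely indicator, past periods
target_history = [5.1, 6.4, 4.2, 7.0, 5.6]      # official figure, same periods
indicator_now = 3.0                              # current period, already observed

estimate = nowcast(indicator_history, target_history, indicator_now)
print(f"Nowcast of the current-period target: {estimate:.2f}")
```

In practice many indicators are available at once, which is where the regularization and tree-based methods examined in this study come in.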
4
See Hang (2010), Ikoku (2014), Doguwa and Alade (2015), and Rajapov and Axmadjonov (2018).
5
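Autoregressive Model of Order 1 or AR(1) model is defined as $y_t = c + \phi y_{t-1} + \varepsilon_t$, where $c$ is a constant, $\phi$ is the autoregressive coefficient, and $\varepsilon_t$ is the error term.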
6
Depending on the statistical calendar (or advance release calendar) of a specific country.
The stu
In particular, the authors mentioned that:
Nowcasting is relevant in economics because key statistics on the
present state of the economy are available with a
significant delay. This is particularly true for those collected
on a quarterly basis, with GDP being a prominent example.
For instance, the first official estimate of GDP in the United States
or in the United Kingdom is published approximately
one month after the end of the reference quarter.
In the Euro area, the corresponding publication lag is two (2) to
three (3) weeks longer. Nowcasting can also be meaningfully applied
to other target variables revealing particular aspects of the state of
the economy and thereby followed closely by markets (p. 2).
Aside from the institutional concern, another factor that contributed to the emergence
of nowcasting is the recent trend in the use of big data and machine learning.
7
,
8
The rise of these concepts improved the overall effectiveness of nowcasting in the
field of economics for two (2) particular reasons. First, big data has
a strong potential to provide complementary information with respect to the macroeconomic
data that government offices usually publish (Baldacci et al., 2016). Second, machine learning has
the capacity to utilize the immense amount of data or information that big data
provides (Hassani and Silva, 2015; Richardson et al., 2018). In addition to economics, conducting
nowcasting through big data and machine learning is also performed by different individuals and
institutions in the fields of energy, medicine, and population dynamics. This is because the said
approach was found to be an essential tool to have an accurate short-term forecast,
7
Big data is defined as large datasets that can be examined computationally to observe different patterns and trends, among others.
8
Machine learning refers to the use of computer systems, algorithms, and/or statistical models to analyze and draw conclusions from
patterns in data.
which further improves the decision-making as well as policy formulation and implementation of
individuals or institutions under these fields (Hassani and Silva, 2015).
1.1.2. The Philippines and Domestic Liquidity
Domestic liquidity is defined as the total amount of money available in an economy that
is usually determined by a central bank and banking system (Mankiw, n.d., p. 623).
9
In particular, as stated under the Monetary and Financial Statistics Manual (MFSM) of the
IMF, the said monetary indicator is the sum of all liquid financial instruments held by
money-holding sectors, such as Other Depository Corporations (ODCs). It can be categorized as
a particular commodity that is widely accepted as (1) a medium of exchange and
(2) a close substitute for the medium of exchange that has a reliable store of value
(IMF, 2016, p. 180).
10
,
11
The change in the overall growth of this monetary indicator is one of the most important
dynamics that most central banks closely monitor, mainly because it is an
essential element of the transmission mechanism of monetary policy, particularly the
impact of money supply expansion or contraction on aggregate demand, interest rates,
inflation, and overall economic growth. For this reason, policymakers in different central banks
closely observe its current and future development to formulate an effective and timely
monetary policy response, especially when foreseen difficulties require them to
adjust policy rates and the overall monetary base (Mankiw, n.d.).
Similar to its role in every economy across regions, domestic liquidity likewise holds a
critical function in the economy of the Philippines. Both the level and growth of said
monetary indicator are usually being monitored by its central bank otherwise known as the
9
The words domestic liquidity, broad money, money supply, money demand, and M3 are interchangeably used in this paper.
10
The MFSM is the official guideline of IMF member countries in compiling and presenting monetary statistics.
11
ODCs refer to financial corporations (other than the central bank) that incur liabilities included in domestic liquidity
(IMF, 2016, p. 405).
Bangko Sentral ng Pilipinas (BSP) because it is also primarily used as the measurement of
liquidity in the country, input for early warning system (EWS) models on the macroeconomy,
and principal data to formulate and implement monetary policy, among others.
12
Money supply in the Philippines has a structure similar to that of most countries with
fractional-reserve banking systems (e.g., US, Japan).
13
This is mainly because bank reserves,
currency deposits (or monetary base), and other liquid financial instruments are likewise its
main components. In particular, based on the Depository Corporations Survey (DCS) conducted
by the BSP, broad money in the country is mainly composed of currency in circulation and
transferable deposits (M1), other deposits such as savings and time deposits (M2), and
deposit substitutes such as debt instruments (BSP, 2018).
14
On a monthly basis, the BSP announces the current level and growth of broad money in
the Philippines. However, for the said monetary indicator to be released in a timely manner,
the institution needs to ensure that the monthly submission of bank reports
(e.g., balance sheets, income statements) is observed promptly. Since the
Philippine Banking System (PBS) is characterized as a fractional-reserve banking system,
the balance sheets of the BSP and the ODCs must be consolidated to
calculate M3 in a given period.
Therefore, for the BSP to achieve its primary mandate of maintaining price and
financial stability in the Philippines, timely and reliable data on money supply, which
require the overall position (e.g., assets, liabilities) of the BSP and ODCs, are critical to support
the overall monetary policy formulation and implementation in the country.
12
See BSP DCS Frequently Asked Questions (FAQs).
13
Fractional-reserve banking system refers to a system in which banks retain a portion of their overall deposits as reserves
(Mankiw, n.d., p. 620).
14
The DCS is a consolidated report based on the balance sheets of BSP and ODCs, such as universal and commercial banks,
thrift banks, rural banks, non-stock savings and loan associations, and non-banks with quasi-banking functions.
1.2. Statement of the Problem
As mentioned in the previous section, delay in data publication is one of the
most common difficulties that government institutions encounter. This scenario, unfortunately,
is also observed in producing domestic liquidity statistics in the Philippines. Even though the
BSP meets the deadline to announce its latest available figure based on its
advance release calendar (ARC), the publicly shared data on M3 are not based on
real-time position. As seen in Table 1.1, although the DCS was retrieved on 10 April 2021, the latest
available domestic liquidity statistics were based on the level and growth as of end-February 2021
(i.e., the current release has a lag of four (4) to six (6) weeks).
Table 1.1: Depository Corporations Survey
(Date Accessed: 10 April 2021)
Source: BSP
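The size of this gap can be checked with simple date arithmetic. The sketch below assumes that end-February 2021 corresponds to 28 February 2021 and measures how stale the latest figure was on the access date; it does not reproduce the actual release schedule of the BSP, and the variable names are illustrative.

```python
from datetime import date

# Dates taken from the text; the exact end-of-month reference day is an assumption.
reference_date = date(2021, 2, 28)   # end-February 2021, latest available M3 figure
access_date = date(2021, 4, 10)      # date the DCS was retrieved

lag_days = (access_date - reference_date).days
lag_weeks = lag_days / 7

print(f"Latest figure was {lag_days} days (about {lag_weeks:.1f} weeks) old on the access date.")
```

Under these assumptions, the gap falls within the lag of four (4) to six (6) weeks noted above.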
Aside from this concern, the official data on money supply also suffer from a series of
revisions. Based on the publication policy of the BSP, the latest statistical reports
(which include the DCS) are treated as preliminary information (Table 1.1).
The initial publication is revised within two (2) months to reflect changes (if any) on the reports
submitted by the banks under its jurisdiction.
15
This procedure is also applicable to the other
key statistical indicators being produced by the said institution, such as the
balance of payments (BOP) and flow of funds (FOF), to name a few. However, in some cases,
the preliminary and revised data have significant numerical discrepancies.
Drawing upon this background, this study aims to address these issues and concerns by
investigating the use of different machine learning algorithms to predict the real-time growth of
broad money in the Philippines. This approach particularly intends to formulate an
accurate quantitative model that the BSP can sustainably use to estimate
domestic liquidity in the said country using regularization and tree-based methods.
For this reason, the overarching research question for this study is:
WHAT IS THE OPTIMAL MACHINE LEARNING ALGORITHM TO ACCURATELY
NOWCAST THE GROWTH OF DOMESTIC LIQUIDITY IN THE PHILIPPINES?
The study also intends to answer these sub-research questions that could further
strengthen the overall finding(s):
a. Does the use of machine learning algorithms improve the overall accuracy in
predicting the real-time growth of domestic liquidity in the Philippines?
b. What are the substantial advantages of using machine learning algorithms vis-à-vis
traditional time series models (e.g., Autoregressive Models, Dynamic Factor Model)
in predicting the current growth of domestic liquidity in the Philippines?
c. By using a wide range of high-frequency monetary, financial, and external sector
indicators as explanatory variables, what are the critical factors that should be
included in the nowcasting model to comprehensively explain and predict the
real-time growth of domestic liquidity in the Philippines?
15
See DCS revision policy https://www.bsp.gov.ph/SitePages/Statistics/Financial%20System%20Accounts.aspx?TabId=2.
1.3. Research Objectives
To comprehensively answer the abovementioned research questions, this study aims to
achieve the following objectives:
a. To develop an accurate nowcasting model that could be used as a
primary method in predicting the real-time growth of money supply in the
Philippines.
b. To utilize various key monetary, financial, and external sector indicators as
input variables.
c. To conduct one-step-ahead (out-of-sample) nowcasts using time series models and
machine learning algorithms.
d. To investigate the performance and accuracy of each time series model and
machine learning algorithm in obtaining nowcasts.
e. To determine the advantages and disadvantages (if any) of using machine learning
algorithms to determine the current state of domestic liquidity in the said country.
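Objectives (c) and (d) can be pictured with a minimal sketch of a one-step-ahead evaluation over an expanding window, scored with the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). The sketch is illustrative only: the function names, the growth series, and the choice of a random walk as the predictor are assumptions, and the models actually evaluated in this study are more elaborate.

```python
import math


def expanding_window_nowcasts(series, first_test_index, predict):
    """For each test point t, predict from series[:t] and pair it with the actual series[t]."""
    pairs = []
    for t in range(first_test_index, len(series)):
        training = series[:t]   # the window grows by one observation at each step
        pairs.append((predict(training), series[t]))
    return pairs


def rmse(pairs):
    """Root Mean Square Error over (prediction, actual) pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def mae(pairs):
    """Mean Absolute Error over (prediction, actual) pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)


def random_walk(training):
    """Benchmark predictor: the next value equals the last observed value."""
    return training[-1]


# Hypothetical monthly growth rates (percent); not actual M3 data.
growth = [10.0, 11.0, 9.0, 12.0, 13.0, 11.0]

pairs_rw = expanding_window_nowcasts(growth, 3, random_walk)
print(f"RMSE: {rmse(pairs_rw):.3f}, MAE: {mae(pairs_rw):.3f}")
```

Any other model can be compared on the same footing by passing a different `predict` function, since each prediction only ever sees observations dated before the month being nowcast.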
1.4. Significance of the Study
In recent years, an increasing number of scholars in the field of economics
have shown interest in using nowcasting as a primary approach to determine the real-time
growth of numerous macroeconomic indicators. Most of these studies are focused on formulating
quantitative models using different time series and machine learning algorithms that could
accurately estimate the movement of numerous macro and socioeconomic indicators using
conventional and unconventional data or information.
In the case of the Philippines, the studies of Rufino (2017), Mapa (2018), and
Mariano and Ozmucur (2015; 2020) already established the use of different
mixed frequency models and machine learning algorithms to nowcast GDP and inflation.
However, none of these published studies have explored the usefulness of nowcasting in
monetary policy, particularly in using different machine learning algorithms to estimate the
growth of broad money in the said country.
Given this literature gap, the researcher considers this study timely and relevant for the
following reasons:
a. The output of this study could serve as a primary tool of the BSP to accurately
nowcast the growth of domestic liquidity, which is considered one of the most critical
inputs for monetary policy formulation (e.g., reserve requirements,
open market operations) in the Philippines.
b. Machine learning algorithms utilized in this study can be replicated to nowcast the
different key economic indicators produced by the said institution
(e.g., balance of payments, financial soundness indicators) and other NGAs within
the country.
c. The result of this study could be a valuable input to the current nowcasting
initiatives performed by the BSP, such as GDP and inflation nowcasting,
among others.
d. The determinants identified as principal components in this study could be used as
additional leading indicators of domestic liquidity growth in the Philippines.
e. Through this study, recommendations can be crafted to mainstream and integrate
big data and machine learning in the monetary policy formulation and
implementation of the BSP.
f. This study could also strengthen the growing body of literature regarding the
application of time series and machine learning models in economic forecasting.
1.5. Scope and Limitations
Although this paper intends to provide a comprehensive analysis in establishing a model
to conduct short-term forecasting or nowcasting using machine learning algorithms, the following
are the scope and limitations of this study:
a. The main objective of this study is to nowcast the growth of domestic liquidity (M3)
in the Philippines. Therefore, its monetary aggregate components, such as
narrow money (M1) and other deposits included in broad money (M2), are not
individually analyzed.
b. The benchmark models used in this study are limited to (1) Autoregressive (AR) models,
such as Autoregressive Integrated Moving Average (ARIMA) and Random Walk
Models, as well as (2) Dynamic Factor Model (DFM).
16
c. To conduct domestic liquidity nowcasting using machine learning algorithms,
the models used in this study are limited to (1) Regularization Methods, such as
Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), and
Elastic Net and (2) Tree-Based Methods, such as Random Forest and
Gradient Boosted Trees.
d. The study initially aims to incorporate numerous variables that can represent
different sectors of the economy (e.g., central bank, financial sector) in the
Philippines. However, the final indicators used in the different nowcasting models
became limited due to (1) data confidentiality, (2) access restrictions, and
(3) time constraints.
e. Due to the limited availability of data (especially data on the explanatory variables),
the overall timeframe of this study is restricted to January 2008 to December 2020
(a mix of daily, weekly, and monthly frequencies).
1.6. Definition of Terms
The following terms, which are frequently cited in this study, are defined operationally
or derived from official or technical sources:
Autoregressive (AR) Model: a time series model whose current value depends
linearly on its most recent value and an unpredictable disturbance (Wooldridge, 2012 p. 844).
16
Vector Autoregression (VAR) is used as part of DFM.
Big Data: large datasets that can be examined computationally to observe
patterns and trends, among others.
Central Bank: an institution responsible for the conduct of monetary policy
(Mankiw, n.d. p. 618).
Domestic Liquidity: the total amount of money available in an economy that is usually
determined by a central bank and banking system (Mankiw, n.d. p. 623).
Liquidity: refers to assets that can be exchanged rapidly without affecting
their overall price (IMF, 2016).
Machine Learning: the use of computer systems, algorithms, and statistical models to
analyze and draw conclusions from patterns in data.
Monetary Policy: refers to the management of money supply and interest rates
(Mishkin, n.d. p. 10).
Other Depository Corporations (ODCs): financial corporations (other than the
central bank) that incur liabilities included in domestic liquidity (IMF, 2016 p. 405).
Time Series Data: refers to any data or information that is collected over time
(Wooldridge, 2012 p. 859).
Vector Autoregressive (VAR) Model: a model for two (2) or more time series.
Each variable is modeled as a linear function of past values of all variables,
plus disturbances that have zero (0) means given all past values of the observed variables
(Wooldridge, 2012 p. 860).
Chapter II:
REVIEW OF RELATED LITERATURE
2.1. Primer
Nowcasting has become one of the alternative methodologies used by numerous
institutions to predict recent developments in various macroeconomic indicators
(e.g., Gross Domestic Product (GDP), inflation) and the potential transmission mechanisms of
fiscal or monetary policies. This quantitative approach emerged because most
economic indicators published by government offices (e.g., national government agencies
(NGAs), central banks) tend to suffer from lags and revisions. Hence, numerous
nowcasting exercises have recently been conducted to avoid the practice of using unrelated,
outdated, or lagged datasets in addressing different predicaments in an economy, such as
hyperinflation and unemployment, among others (Richardson et al., 2018).
Aside from this concern, the popularity of nowcasting has been strongly enhanced by
the recent emergence of big data and machine learning. Big data has the potential
to provide complementary information, such as high-frequency data,
to the macroeconomic data that government offices usually publish
(Baldacci et al., 2016). Machine learning, in turn, has the capacity to provide accurate
estimates even when a nowcasting model includes an immense amount of data or information
(Hassani and Silva, 2015; Richardson et al., 2018).
That being said, to strengthen the foundation of this research, previous studies
that conducted nowcasting through the use of big data (or high-frequency data)
and different machine learning algorithms are discussed in this chapter.
However, this literature review mainly focuses on the studies that used
(1) regularization methods (i.e., Ridge Regression, Least Absolute Shrinkage and
Selection Operator, Elastic Net) and (2) tree-based methods (i.e., Random Forest,
Gradient Boosted Trees) as their primary or secondary approach to nowcast different
macroeconomic variables and other statistical indicators.
2.2. Regularization Methods
Regularization methods are among the most prevalent machine learning algorithms used to
conduct nowcasting. This is because the regression models under this category, like
Ordinary Least Squares (OLS), fit a linear model (James et al., 2013;
Tiffin, 2016). Compared to OLS, however, each of these methods
constrains its coefficient estimates to significantly reduce their variance with the intention of
improving prediction accuracy (James et al., 2013). In other words, Ridge Regression,
Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net (ENET) have the capacity to provide better
forecast output because they reduce model complexity by penalizing the size of their
coefficient(s), which then addresses the issue of bias-variance tradeoff.
17
This approach is called
shrinkage in machine learning literature (Tiffin, 2016; Richardson et al., 2018).
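The effect of shrinkage can be seen in a minimal sketch with a single predictor and no intercept, where the three penalized estimators have simple closed forms. The objective scaling, the function names, and the sample data below are assumptions chosen for illustration; they are not the specifications estimated in this study.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def one_predictor_fits(x, y, lam, alpha=0.5):
    """Closed-form coefficients for y ~ beta * x (no intercept).

    Assumed objective: 0.5 * sum((y - beta*x)^2) + penalty, where
      ridge penalty:       0.5 * lam * beta^2
      LASSO penalty:       lam * |beta|
      elastic net penalty: lam * (alpha*|beta| + 0.5*(1-alpha)*beta^2)
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return {
        "ols": sxy / sxx,
        "ridge": sxy / (sxx + lam),
        "lasso": soft_threshold(sxy, lam) / sxx,
        "enet": soft_threshold(sxy, lam * alpha) / (sxx + lam * (1 - alpha)),
    }


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
for lam in (0.0, 6.0, 100.0):
    fits = one_predictor_fits(x, y, lam)
    print(lam, {name: round(beta, 3) for name, beta in fits.items()})
```

With a zero penalty all four estimates coincide with OLS. As the penalty grows, the Ridge coefficient shrinks but stays non-zero, whereas the LASSO coefficient can be set exactly to zero, which is why LASSO and ENET can also act as variable selection tools.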
The studies of Tiffin (2016), as well as Dafnai and Sidi (2010), are among the
well-known studies in the field of economics that managed to use regularization methods as an
approach to conduct nowcasting. Both of these studies attempted to formulate
nowcasting models that could accurately estimate the GDP growth in Lebanon and Israel,
respectively. Due to the data publication lags that both countries experienced, these authors
similarly agreed that there was a need to conduct an approach wherein the current status of
economic growth can be immediately determined to improve policy decisions. Their attempt to
formulate nowcasting models also aimed to address the difficulty of their stakeholders from the
domestic (e.g., NGAs, central banks) and international (e.g., International Financial Institutions
(IFIs), bilateral partners) landscape in assessing the overall economic health of their respective
countries (Tiffin, 2016; Dafnai and Sidi, 2010).
17
Bias-variance tradeoff is a central concept in forecasting and machine learning (Bolhuis and Rayner, 2020 p. 5). This refers to the
balance between the error due to an overly simple model (bias) and the error due to sensitivity to the training sample (variance) as the flexibility of a (supervised) machine learning model increases (James et al., 2013).
To meet these objectives, the aforementioned authors used high-frequency data or
information as explanatory variables to their corresponding GDP nowcasting models.
Tiffin (2016) used nineteen (19) monthly macroeconomic variables (e.g., customs revenue,
tourist arrivals) to observe economic growth in Lebanon.
18
Using the aforementioned
data through regularization methods, the author found that ENET is the most suitable
machine learning algorithm to predict the short-run economic development of Lebanon.
Mainly because its in-sample and out-of-sample nowcasting results managed to systematically
On the other hand, Dafnai and Sidi (2010) used one hundred forty (140) domestic indicators
and fifteen (15) global indicators as input variables to nowcast the GDP in Israel.
19
The authors similarly found that ENET is the most comprehensive regularization method to
nowcast the economic growth in said country. Compared to other regularization methods used
in their study, Dafnai and Sidi (2010) argued that ENET is the only regularization method that
successfully captured the timing and magnitude of the economic cycle in Israel while only
generating a low Mean Absolute Forecast Error (MAFE).
Hussain et al. (2018) also performed nowcasting using the aforementioned
machine learning algorithms. This study, however, intended to predict the short-run growth of
Large-Scale Manufacturing (LSM) in Pakistan. The authors decided to conduct this research
because the official GDP data in the said country also encounters publication lag.
Therefore, since LSM is published on a monthly basis and strongly depicts the significant
economic activities in Pakistan, predicting its current state could be a valuable tool for the
-changing domestic and
global economic condition (Hussain et al., 2018).
Given this objective, Hussain et al. (2018) also used high-frequency data or information
as explanatory variables to nowcast the aforementioned indicator. This includes
monthly indicators regarding financial markets, confidence surveys, interest rate spreads, credit,
18
See Page 10 of Tiffin (2016).
19
See Annex of Dafnai and Sidi (2010).
and the external sector in Pakistan.
20
Using these data as inputs to their regularization methods,
the authors concluded that Ridge Regression, LASSO, and ENET methods are comprehensive
quantitative tools in predicting the overall growth of LSM. This is because all three (3)
machine learning algorithms scrupulously tracked the overall growth, trends, and
cyclical movement of LSM with small forecast error. Comparing each method,
Hussain et al. (2018) found that LASSO rendered the most accurate nowcasting result since it
comprehensively traced the trends and cycle of LSM in Pakistan while having the lowest RMSE.
The Dynamic Factor Model (DFM) used in the study of said authors provided the smallest
forecasting error in nowcasting the trend. However, it presented inconsistent estimates in
predicting the overall growth and cycle of said macroeconomic indicator (Hussain et al., 2018).
The aforementioned machine learning algorithms were likewise used by
Cepni et al. (2018) as well as Ferrara and Simoni (2019). These authors utilized the said methods
to formulate models that could accurately nowcast the GDP of emerging economies
(i.e., Brazil, Indonesia, Mexico, South Africa, Turkey) and the United States (US), respectively.
Similar to the previous studies discussed in this section, numerous high-frequency data or
information were used as explanatory variables to nowcast the economic growth of said countries.
Cepni et al. (2018), in particular, utilized country-specific (1) macroeconomic indicators such as
industrial production, demand, and consumption indices and (2) survey data from
21
On the other hand, Ferrara and Simoni (2019)
used a large set of data from Google (e.g., Google Trends) to nowcast GDP in the US.
22
The former authors notably used LASSO to augment the nowcasting activity done through
DFM. Meanwhile, the latter authors utilized Ridge Regression and compared it with their
bridge equation benchmark model since numerous variables were included in their model.
Both studies concluded that these machine learning models are
convenient and comprehensive quantitative approaches to predict GDP in the short run
20
See Page 13 of Hussain et al. (2018).
21
See Page 2 of Cepni et al. (2018).
22
See Page 7 of Ferrara and Simoni (2019).
accurately. This is because LASSO has the capacity to filter out
the insignificant variables, while Ridge Regression shrinks their coefficients toward zero, which could provide a parsimonious set of nowcasting models with
precise results (Cepni et al., 2018; Ferrara and Simoni, 2019).
The use of nowcasting is not only popular to estimate future values of different
macroeconomic indicators, such as GDP. Recent studies showed that this quantitative approach
could also be used to predict firm-level and sectoral data. The paper of Fornano et al. (2017)
was among the few studies that fall under this category. In particular, the authors applied the
three (3) regularization methods to nowcast the turnover indices growth of the main economic
sectors (e.g., services, manufacturing) in Finland.
23
Individual results of these methods were
compared with traditional time series models, such as Autoregressive Integrated Moving Average
(ARIMA), to estimate their respective prediction accuracy. Based on the conducted analysis,
Fornano et al. (2017) found that these machine learning algorithms outperformed ARIMA in
predicting the turnover indices growth of all sectors in Finland. This is because Ridge Regression,
LASSO, and ENET provided low Mean Squared Forecast Errors (MSFE) compared to the said
time series benchmark (Fornano et al., 2017).
Aside from predicting macroeconomic and firm-level indicators, nowcasting was also
utilized in the field of energy and medicine. The papers of Ziel (2020) as well as
Lan and Subramanian (2019) were among the studies in these fields that used
regularization methods to conduct nowcasting. In particular, the former author used
the said quantitative approach to predict the current state of electricity or power consumption
in Europe. Meanwhile, the latter authors applied the said concept to formulate a
nowcasting model to estimate the recent dengue occurrence in Puerto Rico and Peru.
Both of the authors mentioned that their attempt to estimate these
circumstances was due to the increasing concerns regarding publication lag on the official data
of electricity consumption and dengue occurrence in Europe as well as Puerto Rico and Peru,
23
See Page 5 of Fornano et al. (2017).
respectively. This is because different stakeholders strongly use the two (2) indicators for
economic and public health reasons (Ziel, 2020; Lan and Subramanian, 2019).
To perform their corresponding nowcasting exercises, these authors likewise used
high-frequency data or information. Ziel (2020) used daily energy load values provided
by the European Transmission System Operators (TSO) from 2014 to 2019, while
Lan and Subramanian (2019) employed climatic variables and data from Google Trends as
explanatory variables.
24, 25
Based on their analysis, both authors concluded that
regularization methods could accurately nowcast the two (2) aforementioned circumstances
with ease. This is because the machine learning algorithms used in their respective model could
handle and incorporate a large number of predictors with a low level of Mean Absolute Error
(MAE) and RMSE. Ziel (2020), as well as Lan and Subramanian (2019), specifically found that
Ridge Regression and LASSO are the most accurate regularization models to nowcast electricity
consumption in Europe and dengue occurrence in Puerto Rico and Peru, respectively.
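The two error measures cited most often in the studies above, RMSE and MAE, can be computed as in the following minimal sketch. The function names and the sample values are hypothetical and are not taken from any of the cited papers.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large errors more heavily."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return sqrt(sum(e * e for e in errors) / len(errors))


def mae(actual, predicted):
    """Mean Absolute Error: average size of the errors, ignoring sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values, for illustration only.
actual = [2.0, 4.0, 6.0]
predicted = [1.0, 4.0, 8.0]
print(f"RMSE={rmse(actual, predicted):.3f} MAE={mae(actual, predicted):.3f}")
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same set of nowcasts, and the gap widens when a few errors are unusually large.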
2.3. Tree-Based Methods
Aside from regularization methods, numerous studies also introduced the use of
tree-based methods to conduct nowcasting. The said approach is one of the well-known options
to perform nowcasting through machine learning algorithms. This is because of its
strong capacity, similar to regularization methods, in being flexible and interpretable.
26
However, in contrast to Ridge Regression, LASSO, and ENET, tree-based methods
involve stratifying or segmenting the predictor space into a number of simple regions.
In order to make a prediction for a given observation, the mean or mode of the training
observations in the region to which it belongs is typically used (James et al., 2013 p. 303).
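The idea of segmenting the predictor space and predicting with a region mean can be sketched with the simplest possible regression tree, a single split (or "stump") on one predictor. The function names and the data are hypothetical; an actual Random Forest or Gradient Boosted Trees model combines many deeper trees.

```python
from statistics import mean


def fit_stump(x, y):
    """Find the single split on x that minimizes the residual sum of squares."""
    best = None
    xs = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        cut = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi < cut]
        right = [yi for xi, yi in zip(x, y) if xi >= cut]
        m_left, m_right = mean(left), mean(right)
        rss = sum((yi - m_left) ** 2 for yi in left)
        rss += sum((yi - m_right) ** 2 for yi in right)
        if best is None or rss < best[0]:
            best = (rss, cut, m_left, m_right)
    _, cut, m_left, m_right = best
    return cut, m_left, m_right


def predict_stump(stump, x_new):
    """Predict with the mean of the training observations in the same region."""
    cut, m_left, m_right = stump
    return m_left if x_new < cut else m_right


# Made-up data with a level shift between x = 3 and x = 4.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
stump = fit_stump(x, y)
print(stump, predict_stump(stump, 2.5), predict_stump(stump, 5.5))
```

In this example the split falls where the level of the series changes, so each region's mean summarizes one regime. This piecewise-constant form is what lets tree-based methods capture nonlinearities that a linear model would miss.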
24
See Page 8 of Ziel (2020).
25
See Page 5 of Lan and Subramanian (2019).
26
Similar to regularization methods, tree-based methods in machine learning also address the issue of bias-variance tradeoff.
recognized studies
that used tree-based methods to predict economic growth. These authors, in particular, utilized
Random Forest (RF) algorithm to forecast the short-term GDP growth in Europe.
The analysis of said authors was complemented by the numerous datasets under the
European Union Business and Consumer Survey to strongly utilize the capacity of said machine
learning model in handling a large number of input variables with robust prediction accuracy.
27
Using the aforementioned data through RF, the
said approach is a well-performing machine learning algorithm to predict the short-term growth
of GDP in Europe. This is because RF provided more accurate estimates than the projections
registered by the traditional time series model, such as the Autoregressive (AR) Model,
to forecast the said macroeconomic indicator. In particular, forecasting the GDP in Europe using
the said tree-based approach only generated an MSE of 0.43 while the AR produced 0.64.
The authors also cited that RF is an effective tool to create a parsimonious model.
Since the aforementioned had identified which among the predictive variables included in their
This approach was similarly performed under the study of
Adriansson and Mattson (2015). The authors, in particular, used the concept of
GDP growth of Sweden. To attain this objective, these authors similarly used a large amount
of survey dataset to predict the said macroeconomic variable. The data or information under the
Economic Tendency Survey conducted by the National Institute of Economic Research (NIER)
were mainly used as explanatory variables in their forecasting model using RF.
28
This survey consists of different confidence indicators and questions to private firms and
households regarding their economic outlook and perception of economic activity in the said
country (Adriansson and Mattson, 2015).
27
28
See Page 5 of Adriansson and Mattson (2015).
Using these data as inputs for their tree-based method nowcasting,
Adriansson and Mattson (2015) found that RF provides a better prediction performance against
the ad hoc linear model and AR model in forecasting the GDP growth of Sweden.
RF had the most precise forecasting results since it has the lowest RMSE of 0.75 compared to
the 0.79 and 0.95 of the two (2) time series benchmark models, respectively
(Adriansson and Mattson, 2015). Therefore, similar to the recommendation of
udy of Adriansson and Mattson (2015) proposed that
RF is a valuable quantitative approach that could bring forecasting improvements when applied
to economic time series data.
Aside from RF, Adaptive Trees (AT), which builds on Gradient Boosted Trees
(GBT), was also utilized as a primary machine learning model to conduct forecasting.
This is because of its strong capacity to deal with nonlinearities and structural changes,
among others (James et al., 2013; Woloszko, 2020). The paper of Woloszko (2020) was one of
the recent studies that specifically used AT to provide three (3)- to twelve (12)-month-ahead
GDP growth forecasts for the Group of Seven (G7) countries.
29
In this study, the author employed
country-specific information (e.g., expectation surveys, consumer confidence) and
macroeconomic data (e.g., housing prices, employment rate) as explanatory variables to the
tree-based forecasting model.
30
Based on the conducted forecast simulations, Woloszko (2020) similarly concluded that
the said machine learning algorithm is a valuable tool in economic forecasting.
This was attributable to the accurate prediction results it generates compared to the traditional
time series models. In contrast to AR models, the 3- and 6-month-ahead GDP growth forecasts
for the US, United Kingdom (UK), France, and Japan using AT displayed lower RMSEs.
The author, however, found that this level of accuracy was only applicable in short-run
forecasting. This is because the forecasting results of AT became uninformative when it was used
to conduct the one (1)-year-ahead forecast. Due to this reason, Woloszko (2020) argued that
29
Canada, however, was not included in the analysis of Woloszko (2020).
30
See Page 11 of Woloszko (2020).
despite having the advantage to handle a large number of variables in economic forecasting,
AT might not be a suitable model to predict long-run effects.
Other empirical studies utilized both RF and GBT as machine learning algorithms to
forecast economic growth. Among these were the papers of Bolhuis and Rayner (2020) as well as
Soybilgen and Yazgan (2021). In particular, these authors used the said methods to forecast the
GDP growth in Turkey and the US, respectively. Similar to the previous studies discussed in
this section, the studies of these authors also aimed to determine the optimal
tree-based method to predict economic growth using high-frequency data or information.
The study of Bolhuis and Rayner (2020) used two hundred thirty-four (234) country-specific and
global indicators from Haver Analytics. These include macroeconomic indicators regarding the
financial, labor, and external sectors.
31
Meanwhile, Soybilgen and Yazgan (2021) utilized more
than one hundred (100) financial and macroeconomic variables, which include data on the labor
market, money and credit, and stock market, among others.
32
Using the aforementioned input variables, Bolhuis and Rayner (2020) as well as
Soybilgen and Yazgan (2021) concluded that the tree-based methods provide
superior forecasts compared to benchmark models, such as DFM and linear models.
This is because RF and GBT produced lower forecast errors than the benchmark models.
Bolhuis and Rayner (2020) mentioned that the RMSE of RF was 1.26 while GBT produced 1.29.
Both of these results were lower than that of the benchmark linear model, which registered an
RMSE of 1.66. Likewise, Soybilgen and Yazgan (2021) discussed that, compared to the DFM,
the tree-based methods provided the lowest average RMSE and MAE.
33
Aside from their outstanding individual accuracy, these authors also cited that RF and GBT
have the strength to predict economic volatility and the capacity to determine which among the
variables included in the forecasting model are the most essential.
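To put the RMSE figures quoted above (1.26 for RF, 1.29 for GBT, and 1.66 for the linear benchmark) on a common scale, the improvement can be expressed as a percentage reduction relative to the benchmark. The helper function below is a hypothetical illustration of that arithmetic and is not a calculation reported by the cited authors.

```python
def relative_reduction(model_rmse, benchmark_rmse):
    """Percent by which model_rmse falls below benchmark_rmse."""
    return 100.0 * (benchmark_rmse - model_rmse) / benchmark_rmse


benchmark = 1.66  # RMSE of the linear benchmark, as quoted in the text
for name, value in {"RF": 1.26, "GBT": 1.29}.items():
    reduction = relative_reduction(value, benchmark)
    print(f"{name}: {reduction:.1f}% lower RMSE than the benchmark")
```

On these figures, RF and GBT reduce the RMSE by roughly 24 and 22 percent, respectively, relative to the linear model.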
31
See Tables A5.1 and A5.2, Pages 24-25 of Bolhuis and Rayner (2020).
32
See Appendix 1, Page 23 of Soybilgen and Yazgan (2021).
33
See Table 1 and 2, Page 13 of Soybilgen and Yazgan (2021).
2.4. The Utilization of Two (2) Machine Learning Methods
Several studies also attempted to utilize the strengths of both regularization and
tree-based methods to perform nowcasting. Authors of these studies have considered this
research approach because most of them intended to distinguish the accuracy of each
machine learning method to nowcast or forecast the growth of a specific macroeconomic indicator
or the possible impact of policy implementation (Richardson et al., 2018; Tamara et al., 2020;
Aguilar et al., 2019).
One of the studies that fall under this category is the research produced by
Richardson et al. (2018). In particular, the authors attempted to use both regularization and
tree-based methods to formulate a model that can precisely nowcast the GDP in New Zealand.
The objective of this study was drawn from the difficulty of their policymakers in addressing
various economic vulnerabilities. This is because policy formulations in the said country are
highly dependent on the non-related, outdated, or lagged data (Richardson et al., 2018).
Given this scenario, Richardson et al. (2018) used a number of real-time vintages of a
range of macroeconomic and financial market statistics as explanatory variables to their
simulated nowcasting models. This includes data from business surveys, consumer and
producer prices, and general domestic activity production, among others.
34
By using these as
inputs for the different machine learning algorithms, Richardson et al. (2018) concluded that
either a regularization or a tree-based approach could be used as a primary methodology to nowcast the
economic growth in New Zealand, mainly because the RMSE and Mean Absolute Deviation
(MAD) of these machine learning algorithms are lower than those of the traditional time series models
used to forecast the GDP in the said country. However, comparing these methods,
Richardson et al. (2018) argued that LASSO (0.45) had the lowest average forecast errors.
35
34
See Page 8 of Richardson et al. (2018).
35
Richardson et al. (2018) also found that Support Vector Machines (SVM) and Neural Network (NN) both have low forecast errors
compared to AR and BVAR.
The authors also found that GBT (0.47) and Ridge Regression (0.57) provided lower RMSE
compared to Bayesian VAR (BVAR) model (0.61).
This research methodology is also utilized under the study of Tamara et al. (2020).
These authors used regularization and tree-based methods to nowcast the GDP growth in
Indonesia. Similar to the objective of Richardson et al. (2018), Tamara et al. (2020) conducted
this research to provide accurate estimates on the output growth of the said country.
This is because the quarterly GDP data for Indonesia are released with a lag of five (5) weeks after
the end of the reference period (Tamara et al., 2020).
Based on this objective, Tamara et al. (2020) used eighteen (18) predictor variables
in their model. These data are comprised of quarterly macroeconomic
(e.g., consumption expenditure, current account) and financial market statistics
(e.g., change in stocks).
36
Using these indicators as explanatory variables, the authors concluded
that regularization and tree-based methods precisely estimate the short-run growth of GDP in
Indonesia, mainly because these machine learning algorithms reduce the average forecast errors
by thirty-eight (38) to sixty-three (63) percent relative to the AR benchmark.
Tamara et al. (2020) also found that the values forecasted using these methods follow a
pattern close to the actual values. However, comparing these methods, the authors cited
that RF (1.27) and ENET (1.31) have the lowest average forecast errors.
The potential of regularization and tree-based methods was also used to provide
estimates on global poverty. The paper of Aguilar et al. (2019) utilized these machine learning
algorithms to formulate a quantitative model to improve the accuracy of the current poverty
nowcasting model of the World Bank (WB). In particular, the authors applied LASSO, RF, and
GBT to predict the mean welfare and back out poverty rates. This study was motivated by the need for a
more reliable and cost-effective method to predict the current state of poverty across regions
(Aguilar et al., 2019).
36
See Appendix of Tamara et al. (2020).
Taking this into consideration, Aguilar et al. (2019) used similar datasets utilized under
the current forecasting model of WB to predict the current level and growth of global poverty.
These datasets include macroeconomic and social indicators, which were extracted from the
World Economic Outlook (WEO) database and World Development Indicators (WDI).
37
Using these as inputs, the authors found that using regularization and tree-based methods to
nowcast the said indicator decreased the overall nowcast error by 5.7 percent from
2.8 percentage points (Aguilar et al., 2019). However, Aguilar et al. (2019) argued that despite
having accurate estimates, LASSO, RF, and GBT only provide minor improvement vis-à-vis the
current method used by the WB to nowcast global poverty.
37
See Page 6 of Aguilar et al. (2019).
Chapter III:
RESEARCH METHODOLOGY
3.1. Primer
The overall methodology of this study is comprehensively discussed in this chapter.
In particular, each section presents detailed information about (1) benchmark models,
(2) machine learning algorithms, (3) nowcast evaluation methodology, and (4) statistical tool or
software used to formulate a nowcasting model that aims to accurately estimate the growth and
development of domestic liquidity in the Philippines.
3.2. Models
Time series models and machine learning algorithms are utilized to support the
main objective of this research systematically. The former models are used as benchmarks since
these are the most commonly used econometric models to predict the current and future growth
of a particular macroeconomic indicator or economic phenomenon. Meanwhile, the latter
algorithms are used as the alternative quantitative methods to nowcast domestic liquidity growth
in the Philippines. This approach is conducted because of two (2) main reasons. The first reason
is to establish which quantitative models could accurately estimate the real-time growth of said
monetary indicator. Another reason is to determine the strength of machine learning algorithms
to precisely nowcast vis-à-vis traditional time series models.
Drawing upon this background, the properties of each time series and machine learning
models which are utilized in this study are comprehensively discussed in this chapter.
The former includes traditional forecasting models such as (1) Autoregressive Models
(e.g., Autoregressive Integrated Moving Average and Random Walk),
(2) Vector Autoregression, and (3) Dynamic Factor Model. On the other hand,
the latter models are comprised of (1) Regularization Methods such as
Ridge Regression, Least Absolute Shrinkage and Selection Operator, and Elastic Net,
as well as (2) Tree-Based Methods such as Decision Trees, Random Forest, and
Gradient Boosted Trees.
3.2.1. Benchmark Models
3.2.1.1. Autoregressive Models
Autoregressive (AR) models are among the most frequently used approaches to predict the growth
and development of a particular macroeconomic indicator or scenario, mainly because of their
strong ability to perform forecasting despite using a single time series. Numerous studies argued
that AR models are highly utilized in time series forecasting because they offer a simple but
powerful method of using past values to identify the future growth and development of a
particular indicator (Meyler et al., 1998; Medel and Pincheira, 2015).
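The way an AR model uses past values can be shown with a minimal sketch of an AR(1) model, $y_t = c + \phi y_{t-1} + e_t$, estimated by ordinary least squares on the lagged series. The function names and the noise-free sample series are assumptions made for illustration; in practice a statistical package would be used and the lag order would be selected from the data.

```python
def fit_ar1(series):
    """OLS estimates (c, phi) for y_t = c + phi * y_{t-1} + e_t."""
    x = series[:-1]  # lagged values y_{t-1}
    y = series[1:]   # current values y_t
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    phi = sxy / sxx
    c = y_bar - phi * x_bar
    return c, phi


def forecast_ar1(series, c, phi):
    """One-step-ahead forecast from the last observed value."""
    return c + phi * series[-1]


# Noise-free series generated with c = 2 and phi = 0.8, for illustration only.
series = [0.0]
for _ in range(9):
    series.append(2.0 + 0.8 * series[-1])

c, phi = fit_ar1(series)
print(f"c={c:.3f} phi={phi:.3f} next={forecast_ar1(series, c, phi):.3f}")
```

Since the sample series contains no disturbance term, the estimates recover the generating parameters up to rounding; with real data the fit would only be approximate.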
3.2.1.1.1. Autoregressive Integrated Moving Average
There are various AR models that are specifically used depending on the nature of a
time series. The Autoregressive Integrated Moving Average (ARIMA) is one of the general
models under this approach. This univariate time series model is frequently used in most
forecasting studies when a specific time series data is non-stationary, previous values are
significant to predict its current state, or errors are autocorrelated. This is because ARIMA can
be interpreted as a filter that aims to separate the signal from the noise, and the signal is then
generalized into the future to acquire forecasts (Nau, 2014). The general forecasting equation
using ARIMA is structured as follows:
Under equation 3.1, $p$ represents the order of the autoregression, which takes the
overall effect(s) of past values into consideration. The notation $q$, on the other hand, denotes
the order of the moving average, constructing the error of ARIMA as a linear combination of
28
the error values observed at the previous time points in the past (Meyler et al. 1998;
Fan, 2019 pp. 10-11).
3.2.1.1.2. Random Walk
Another popular univariate model used in economic forecasting is the Random Walk.
Its properties are quite similar to those of ARIMA, mainly because both
models use previous data points as a reference for the future trend of a specific
time series. However, compared to ARIMA, the Random Walk model assumes that the
next step is determined only by the last data point plus an independent random step away
from it (Fan, 2019 pp. 11-12). This univariate model is also utilized if a particular
time series is non-stationary.[38][39]
The general forecasting equation using Random Walk is written below:
$$ y_t = y_{t-1} + \varepsilon_t \tag{3.2} $$
In Equation 3.2, $y_t$ and $y_{t-1}$ represent the observations of the time series and
$\varepsilon_t$ is the white noise with zero mean and constant variance (Fan, 2019 p. 12).
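As a minimal illustration of the Random Walk logic described above, the sketch below carries the last observation forward as the one-step-ahead nowcast. It is written in Python for illustration only (the study itself uses R), and the function name and the growth figures are hypothetical.

```python
def random_walk_nowcast(series):
    """One-step-ahead Random Walk nowcast: the last observed value is carried forward."""
    if not series:
        raise ValueError("series must contain at least one observation")
    return series[-1]


# Hypothetical year-on-year growth rates (in percent), not data from the study.
growth = [11.2, 11.5, 12.1, 13.0]
nowcast = random_walk_nowcast(growth)
print(nowcast)
```

Because the white noise term has zero mean, its expected value drops out of the forecast, which is why only the last data point remains.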
3.2.1.2. Vector Autoregression
Using univariate models as the principal approach to forecast a particular time series
has a limitation: they rely heavily on previous data points of the series itself.
In other words, when ARIMA or Random Walk is used as a
forecasting technique, other determinants that could influence the growth and development of
an indicator are not taken into account.
[38] Random walk is equivalent to an ARIMA(0,1,0) model.
[39] Random walk is a prevalent forecasting model for non-stationary time series data such as foreign exchange rates (FOREX).
To address this concern, many studies in the field of economics use multivariate
time series models such as Vector Autoregression (VAR). Its advantage over
univariate time series models is well established in the literature.
This is because it can create structural equations with other influential features
and incorporate two (2) or more time series to forecast the growth and development of a
particular indicator. Hence, compared to ARIMA or Random Walk, VAR can be
considered a more comprehensive forecasting model. The general form of the VAR model with
deterministic terms and exogenous variables can be expressed as:
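$$ Y_t = \Pi_1 Y_{t-1} + \Pi_2 Y_{t-2} + \dots + \Pi_p Y_{t-p} + \Phi D_t + G X_t + \varepsilon_t \tag{3.3} $$
Under Equation 3.3, $Y_t$ is the vector of endogenous time series with coefficient matrices
$\Pi_1, \dots, \Pi_p$, $D_t$ denotes the matrix of deterministic terms such as a linear
time trend or seasonal dummy variables, and $X_t$ represents the matrix of stochastic
exogenous components. The notations $\Phi$ and $G$ are the parameter matrices
(Fan, 2019 pp. 12-13).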
3.2.1.3. Dynamic Factor Model
The Dynamic Factor Model (DFM) is also a prevalent choice among econometricians
who aim to predict the future growth of a particular macroeconomic variable using
numerous explanatory variables. This is because it has the capacity to handle
large datasets with no practical or computational limits (Stock and Watson, 2016).
Mariano and Ozmucur (2020) also note that DFM is a valuable tool for forecasting a
specific indicator with numerous explanatory variables because it addresses the difficulty of
achieving convergence in a state-space framework.
Compared to VAR, where the set of variables can be included in the model directly,
the DFM first reduces the dimension of the dataset by summarizing the available information
into a small number of common factors. Each variable is represented as the sum of a common and
an idiosyncratic component. The former is constructed as a linear combination of the
common factors that explain the main part of the variance of the time series,
while the latter contains the remaining variable-specific information (Fan, 2019 p. 13).
The DFM can be expressed as:
$$ X_t = \lambda(L) f_t + e_t \tag{3.4} $$
Under Equation 3.4, $X_t$ represents the vector of observed time series variables,
which depends on a reduced number of latent factors $f_t$ and an idiosyncratic component $e_t$.
The notation $\lambda(L)$ denotes the lag polynomial matrix, which represents the vector of
dynamic factor loadings (Stock and Watson, 2016; Fan, 2019).
3.2.2. Machine Learning Models
3.2.2.1. Regularization Methods
As discussed in the previous chapter, regularization methods are among the well-known
machine learning algorithms used to conduct nowcasting. This is because they closely
resemble Ordinary Least Squares (OLS)
in fitting a linear model (James et al., 2013; Tiffin, 2016). However, in contrast with OLS,
regularization methods constrain their coefficient estimates to significantly reduce their
variance, with the intention of improving the overall model fit (James et al., 2013).
3.2.2.1.1. Ridge Regression
One of the regularization methods used in nowcasting is Ridge Regression.
This method is very similar to least squares, mainly because it also aims to obtain
coefficients that fit the data well by making the residual sum of squares (RSS) as small as
possible. However, it also minimizes a second term, called the shrinkage penalty,
which is small when the regression coefficients are close to zero (Tiffin, 2016 p. 7)
(Equation 3.5).
$$ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \tag{3.5} $$
Equation 3.5 depicts the RSS and the penalty term of the said regularization method.
The notation $n$ represents the total number of observations included in the model, while $p$
is the number of candidate predictors. The essential factor in this equation is the tuning
parameter $\lambda$, which controls the relative impact of the two terms on the regression
coefficient estimates (James et al., 2013 p. 215). When $\lambda = 0$, the penalty has no
effect, and Ridge Regression produces the OLS estimates. However, as $\lambda \to \infty$,
the impact of the shrinkage penalty increases, and the coefficient estimates approach zero (0)
(Tiffin, 2016).
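The effect of the tuning parameter can be seen in a small sketch. It assumes a deliberately simplified case with one predictor and no intercept, for which the ridge minimizer has the closed form sum(x*y) / (sum(x*x) + lambda). The function name and the data are hypothetical, and the code is in Python for illustration only.

```python
def ridge_slope(x, y, lam):
    """Minimizer of sum((y_i - b * x_i)^2) + lam * b^2 for one predictor, no intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


# Hypothetical data, roughly y = 2x.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

for lam in (0.0, 1.0, 10.0, 1000.0):
    print(lam, round(ridge_slope(x, y, lam), 4))
```

With a penalty of zero the slope equals the least squares estimate, and it shrinks towards zero (without reaching it) as the penalty grows.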
3.2.2.1.2. Least Absolute Shrinkage and Selection Operator
Another form of regularization method is the Least Absolute Shrinkage and
Selection Operator (LASSO). Similar to Ridge Regression, LASSO also adds a
penalty term to its RSS (Equation 3.6).
$$ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \tag{3.6} $$
In contrast with Ridge Regression, which shrinks all of its
coefficients towards zero (0) but does not set any of them exactly to zero (0), LASSO forces
some of its coefficients to be exactly equal to zero (0) when the tuning parameter $\lambda$
is sufficiently large (James et al., 2016).[40]
Therefore, due to the form of its penalty, the main advantage of LASSO
over Ridge Regression is its ability to select important variables and produce a parsimonious
model with fewer predictors.
[40] Except if the penalty of Ridge Regression is $\lambda = \infty$.
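The variable selection property can be illustrated with a hedged sketch in Python. It assumes one predictor and no intercept, so that minimizing sum((y - b*x)^2) + lambda*|b| reduces to soft-thresholding sum(x*y) at lambda/2. The function name and the data are hypothetical.

```python
def lasso_slope(x, y, lam):
    """Minimizer of sum((y_i - b * x_i)^2) + lam * |b| for one predictor, no intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    threshold = lam / 2.0
    if sxy > threshold:
        return (sxy - threshold) / sxx
    if sxy < -threshold:
        return (sxy + threshold) / sxx
    return 0.0  # the penalty dominates, so the coefficient is set exactly to zero


# Hypothetical data, roughly y = 2x.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

for lam in (0.0, 10.0, 50.0):
    print(lam, round(lasso_slope(x, y, lam), 4))
```

Unlike a ridge-type penalty, which only shrinks the coefficient, a sufficiently large penalty here removes the predictor from the model altogether.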
3.2.2.1.3. Elastic Net
Numerous studies also used Elastic Net (ENET) as their primary approach to
perform nowcasting in order to combine the strengths of the two (2) aforementioned methods.[41]
ENET is a regularization method that contains the properties of both Ridge Regression and
LASSO (Equation 3.7).
$$ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} \big[ \alpha |\beta_j| + (1 - \alpha) \beta_j^2 \big] \tag{3.7} $$
In particular, this regularization method combines the penalties of Ridge Regression
and LASSO: it selects the best predictors to provide parsimonious models while still identifying
groups of correlated predictors. The respective weights of the two (2) penalties are determined
through the additional tuning parameter $\alpha$ (Richardson et al., 2018).
3.2.2.2. Tree-Based Methods
Numerous studies also utilized tree-based methods as a primary approach to conduct
nowcasting. These studies particularly used Random Forest and Gradient Boosted Trees
because, like regularization methods, they are popular for their
capacity to address the bias-variance tradeoff, while also providing an intuitive and
easy-to-implement way of modeling non-linear relationships.
However, in contrast with Ridge Regression, LASSO, and ENET, these methods are
considered non-parametric models that do not require the underlying relationship between the
dependent and independent variables to be specified (Fan, 2019). Tree-based methods involve
stratifying or segmenting the predictor space into a number of simple regions. Therefore,
in order to make a prediction for a given observation, tree-based methods utilize the mean or
mode of the training observations in the region to which it belongs (James et al., 2013 p. 303).
[41] See the studies of Tiffin (2016), Richardson et al. (2018), and Tamara et al. (2020).
3.2.2.2.1. Decision Tree
Decision Tree is the fundamental structure of any tree-based machine learning method,
and it can be used for classification and regression problems (James et al., 2013; Fan, 2019).
Basically, this approach systematically divides categorical (e.g., name, address) or
continuous (e.g., level, growth rate) data into two (2) groups in order to reduce the prediction
error of the target variable of interest. This procedure is repeated until the number of training
samples in a branch falls to the minimum node size (Figure 3.1). The algorithm then
makes the prediction by using the mean or mode of the training observations in that particular
region (James et al., 2013).
Figure 3.1: Decision Tree Growing Process
(Recursive Binary Splitting of Two-Dimensional Feature Space)
Source: James et al. (2013)
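A single step of the recursive binary splitting described above can be sketched as follows. This is an illustrative Python example, not the implementation used in the study: it searches one continuous predictor for the split that minimizes the total residual sum of squares and predicts with the mean of each region. All names and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def rss(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y, min_node_size=1):
    """Return (threshold, left_mean, right_mean) for the split with the lowest total RSS."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(min_node_size, len(pairs) - min_node_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical predictor values cannot be separated
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        total = rss(left) + rss(right)
        if best is None or total < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (total, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("no valid split for this node")
    return best[1:]


def predict(stump, x_new):
    threshold, left_mean, right_mean = stump
    return left_mean if x_new <= threshold else right_mean


# Hypothetical data with a clear break between x = 3 and x = 4.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
stump = best_split(x, y)
print(stump, predict(stump, 2), predict(stump, 10))
```

A full tree would apply `best_split` again within each resulting region until the minimum node size is reached.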
3.2.2.2.2. Random Forest
One of the most well-known tree-based machine learning algorithms is the
Random Forest (RF), mainly because this model is computationally simple to use,
requires little tuning of model parameters, and is suited to forecasting time series data with
relatively few observations (James et al., 2013).
RF is a machine learning algorithm that combines multiple
decision trees to formulate a comprehensive forecast. Notably, it modifies the approach of a
decision tree in order to minimize the problem of overfitting and maximize the information
content of the data by using subsamples of observations and predictors (Tiffin, 2016;
Bolhuis and Rayner, 2020). To do this, RF uses bootstrap aggregation (also known as
bagging): each decision tree is grown on a random sample of observations drawn from the
training dataset. This procedure is repeated a number of times, and the results are averaged
to reduce the overall variance without materially increasing the bias. It also randomly samples
the predictors considered at each split
to ensure that the multiple trees that go into the final collection are relatively diverse.
Using these approaches, RF generates an aggregate prediction that is strong and accurate
(Tiffin, 2016; Bolhuis and Rayner, 2020).
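The two sources of randomness described above, bootstrap samples of observations and random selection of predictors, can be sketched in Python as follows. This is a heavily simplified illustration under stated assumptions: each "tree" is a single-split stump, one predictor is drawn at random per tree, and all names, data, and settings (such as 50 trees) are hypothetical rather than taken from the study.

```python
import random


def mean(values):
    return sum(values) / len(values)


def fit_stump(rows, targets, feature):
    """Best single split on one feature by RSS: (feature, threshold, left_mean, right_mean)."""
    order = sorted(range(len(rows)), key=lambda i: rows[i][feature])
    best = None
    for k in range(1, len(order)):
        lo, hi = rows[order[k - 1]][feature], rows[order[k]][feature]
        if lo == hi:
            continue
        left = [targets[i] for i in order[:k]]
        right = [targets[i] for i in order[k:]]
        lm, rm = mean(left), mean(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, (feature, (lo + hi) / 2.0, lm, rm))
    if best is None:  # the bootstrap sample has no variation in this feature
        m = mean(targets)
        return (feature, float("inf"), m, m)
    return best[1]


def predict_stump(stump, row):
    feature, threshold, lm, rm = stump
    return lm if row[feature] <= threshold else rm


def fit_forest(rows, targets, n_trees=50, seed=0):
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of observations
        feature = rng.randrange(n_features)         # random predictor for the split
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    return mean([predict_stump(s, row) for s in forest])  # average over all trees


# Hypothetical data: two predictors that both rise with the target.
rows = [[1.0, 10.0], [2.0, 12.0], [3.0, 15.0], [4.0, 19.0], [5.0, 20.0], [6.0, 26.0]]
targets = [1.0, 2.0, 3.5, 4.0, 6.0, 7.5]
forest = fit_forest(rows, targets)
print(predict_forest(forest, rows[0]), predict_forest(forest, rows[-1]))
```

Averaging many trees grown on different resamples is what reduces the variance of the aggregate prediction relative to any single tree.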
3.2.2.2.3. Gradient Boosted Trees
Gradient Boosted Trees (GBT) is another tree-based model that is often used
to conduct nowcasting, because of its ability
to capture complex non-linear functions (Fan, 2019). However, in contrast with RF,
GBT builds decision trees sequentially rather than combining independently grown trees
to construct an aggregate forecast. This tree-based model does not involve
the bootstrap sampling that RF conducts. Instead, GBT trains an initial decision tree on the
time series data. It then uses the prediction errors from the said decision tree to train a
second decision tree. The errors from the second decision tree are used to train the third tree,
and so on. After the final iteration, the algorithm uses the summation of these predictions to
provide a final forecast (James et al., 2013; Bolhuis and Rayner, 2020).
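The sequential fitting on prediction errors can be sketched as below. This Python example is illustrative only and rests on assumptions not stated in the text: each tree is a single-split stump on one predictor, and a learning rate of 0.5 scales each tree's contribution. Names and data are hypothetical.

```python
def fit_stump(x, residuals):
    """Best single split by RSS on one predictor: (threshold, left_mean, right_mean)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue
        left = [residuals[i] for i in order[:k]]
        right = [residuals[i] for i in order[k:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, ((lo + hi) / 2.0, lm, rm))
    return best[1]


def boost(x, y, n_rounds=20, learning_rate=0.5):
    """Fit stumps sequentially, each one on the errors left by the previous ones."""
    stumps, history = [], []
    fitted = [0.0] * len(y)
    for _ in range(n_rounds):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        history.append(sum(r * r for r in residuals))  # training RSS before this round
        t, lm, rm = fit_stump(x, residuals)
        stumps.append((t, lm, rm))
        fitted = [fi + learning_rate * (lm if xi <= t else rm) for xi, fi in zip(x, fitted)]
    return stumps, history


def predict(stumps, x_new, learning_rate=0.5):
    # The final forecast is the (scaled) sum of the predictions of all trees.
    return sum(learning_rate * (lm if x_new <= t else rm) for t, lm, rm in stumps)


# Hypothetical non-linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.0, 1.5, 1.2, 4.0, 4.5, 9.0, 9.5, 9.2]
stumps, history = boost(x, y)
print(round(history[0], 3), round(history[-1], 3), round(predict(stumps, 4.0), 3))
```

Each round can only lower (or leave unchanged) the training error, which is why the number of rounds and the learning rate are usually tuned to avoid overfitting.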
3.3. Nowcast Evaluation Methodology
In this study, the performance of time series and machine learning algorithms is
evaluated based on their one-step-ahead (out-of-sample) nowcasts. The models are trained over
an expanding window (also known as a recursive window) to estimate domestic liquidity growth
from January to December 2020 (Figure 3.2). For instance, for the first nowcast in January 2020,
the dataset used covers January 2008 to December 2019. For the second nowcast in
February 2020, the dataset used covers January 2008 to January 2020. This process continues
until the last out-of-sample period. Overall, twenty-four (24) nowcasts are generated for
each time series and machine learning algorithm used in this research, two (2) per month,
with the end-month nowcast being the principal prediction result.[42]
Figure 3.2: Expanding Window Process
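The expanding window can be sketched as a list of (training period, test month) pairs. The Python example below is a simplified illustration: it works with monthly labels only, whereas the study generates two nowcasts within each test month. The function names are hypothetical.

```python
def month_labels(start_year, start_month, end_year, end_month):
    """Return labels such as '2008-01' for every month in the inclusive range."""
    labels = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        labels.append(f"{year}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels


def expanding_windows(labels, first_test):
    """Pair each test month with all months that precede it."""
    start = labels.index(first_test)
    return [(labels[:i], labels[i]) for i in range(start, len(labels))]


months = month_labels(2008, 1, 2020, 12)
windows = expanding_windows(months, "2020-01")
for train, test in windows[:2]:
    print(train[0], "to", train[-1], "->", test)
```

The training set grows by one month at each step while the starting point stays fixed at January 2008.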
After the individual performance is evaluated, the forecast accuracy of each
model is gauged through forecast error measures, namely the Root Mean Square Error (RMSE)
(Equation 3.8) and the Mean Absolute Error (MAE) (Equation 3.9). The RMSE and MAE of each
machine learning algorithm are compared against those of the benchmark models (i.e., AR, DFM).
This comparison is performed to determine whether the nowcast results obtained from
the former are significantly superior to those of the latter, or vice versa.
$$ RMSE = \sqrt{ \frac{1}{n} \sum_{t=1}^{n} ( \hat{y}_t - y_t )^2 } \tag{3.8} $$
$$ MAE = \frac{1}{n} \sum_{t=1}^{n} | \hat{y}_t - y_t | \tag{3.9} $$
Here, $\hat{y}_t$ is the nowcast, $y_t$ is the actual value, and $n$ is the number of nowcasts.
[42] The data of the target and input variables have unbalanced frequencies (e.g., monthly for the target variable, daily/weekly for
input variables). Averaging and interpolation are therefore conducted to align the data properly. This is further discussed in
Chapter 4: Data and Diagnostics.
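The two forecast error measures can be computed as in the following Python sketch. The function names and the sample values are hypothetical and serve only to illustrate the standard definitions of RMSE and MAE.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error of paired actual and predicted values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(actual, predicted):
    """Mean Absolute Error of paired actual and predicted values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical actual and nowcast growth rates (in percent).
actual = [11.0, 12.0, 13.0]
nowcast = [11.0, 12.0, 15.0]
print(round(rmse(actual, nowcast), 4), round(mae(actual, nowcast), 4))
```

Because RMSE squares the errors before averaging, it penalizes large misses more heavily than MAE, so the two measures can rank models differently.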
3.4. Research Tool
The R environment is the primary statistical software used in this study.
It is a well-known software environment for statistical computation, mathematical calculation,
and data visualization. In particular, this study relied on RStudio to
perform the whole research process, including data integration,
data cleaning, model building, and statistical validation.[43]
[43] The R packages used in this study are listed in Annex A.
Chapter IV:
DATA AND DIAGNOSTICS
4.1. Primer
The activities performed to prepare datasets and enhance the overall performance of
benchmark and machine learning models used in this study are presented in this chapter.
In particular, each section presents the (1) dataset and variables, (2) averaging and interpolation
conducted, and (3) diagnostics and feature engineering efforts performed in this research.
4.2. Data
4.2.1. Target Variable
Driven by the objective and nature of this study, the dependent variable utilized is
the domestic liquidity in the Philippines. This monetary indicator represents the
total amount of money available in the economy of the said country. The numerical figures
(i.e., level, growth rate) of domestic liquidity are acquired from the monthly
Depository Corporations Survey (DCS) that the Bangko Sentral ng Pilipinas (BSP) publishes
on its official website, covering January 2008 to December 2020.[44][45]
Figure 4.1 depicts the level
(in million PHP) and year-on-year (YOY) growth rate (in percent), while Table 4.1 presents the
summary statistics of domestic liquidity in the Philippines.
4.2.2. Input Variables
Similar to previous studies that formulate nowcasting models to
estimate recent developments in various macroeconomic indicators and the transmission
mechanisms of policies through machine learning algorithms, this study uses high-frequency
data as independent variables. These comprise numerous
high-frequency monetary, financial, and external sector indicators that are typically
used to monitor the growth of domestic liquidity.
[44] Official BSP Website: https://www.bsp.gov.ph.
[45] To ensure that the data on domestic liquidity are not subject to any revisions, the last figure used in this study was as of
end-December 2020.
Figure 4.1: Domestic Liquidity in the Philippines (January 2008-December 2020)
(a) Levels (in Million PHP); (b) Growth Rate (in Percent)
Table 4.1: Summary Statistics of Domestic Liquidity in the Philippines
                             MIN.       1ST QU.    MEDIAN     MEAN       3RD QU.     MAX.
M3 (Level, in Million PHP)   3,101,926  4,357,222  7,118,632  7,395,092  10,203,734  14,211,479
M3 (Growth, in %)            2.550      8.615      11.200     12.292     13.365      37.970
4.2.2.1. Monetary Indicators
The numerical data of monetary variables used in this study are formally requested from
the Department of Economic Statistics (DES) and obtained from the official website of the
BSP.[46] A formal request is made because daily figures of these variables are neither
published nor shared publicly. The monetary indicators requested from the DES are the daily
(1) available reserves (i.e., required reserves, excess reserves) and (2) reserve money
(i.e., currency-in-circulation, central bank liabilities). Meanwhile, the central bank
(3) claims on National Government (NG) and (4) claims on other sectors are obtained from the
monthly data published by the BSP, covering January 2008 to December 2020.
[46] The DES is the technical arm of the BSP that generates monetary and economic statistics needed in the formulation and
implementation of monetary policy (2020 BSP Organization Primer, p. 25).
Table 4.2: List of Data
NO.  VARIABLE                                TYPE             FREQ.    PUBLICATION DELAY (DAYS AFTER REF. DATE)
1    Domestic Liquidity (M3) Growth          Target Variable  Monthly  30
2    M3 Growth (T-1)                         Input Variable   Monthly  -
3    BSP Liabilities on National Government  Input Variable   Monthly  15
4    BSP Claims on Other Sectors             Input Variable   Monthly  15
5    Foreign Portfolio Investment (In)       Input Variable   Weekly   30
6    Foreign Portfolio Investment (Out)      Input Variable   Weekly   30
7    Available Reserves                      Input Variable   Daily    1
8    Reserve Money                           Input Variable   Daily    1
9    CBOE Volatility Index                   Input Variable   Daily    1
10   Credit Default Swap                     Input Variable   Daily    1
11   London Interbank Reference Rate         Input Variable   Daily    1
12   Singapore Interbank Reference Rate      Input Variable   Daily    1
13   Philippine Interbank Reference Rate     Input Variable   Daily    1
14   Philippine Government Bond Rate         Input Variable   Daily    1
15   BSP Discount Rate                       Input Variable   Daily    1
16   Bank Savings Rate                       Input Variable   Daily    1
17   Bank Prime Rate                         Input Variable   Daily    1
18   Money Market Rate (Promissory Note)     Input Variable   Daily    1
19   Treasury Bill Rate                      Input Variable   Daily    1
20   Interbank Call Rate                     Input Variable   Daily    1
21   Philippine Peso per US Dollar (FOREX)   Input Variable   Daily    1
22   Weighted Monetary Operations Rate       Input Variable   Daily    1
4.2.2.2. Financial Indicators
Data on the financial indicators are obtained from sources including Bloomberg.
These comprise the daily (1) Weighted Monetary Operations Rate (WMOR),
(2) BSP Discount Rate, (3) CBOE Volatility Index, (4) Credit Default Swap (CDS),
(5) London Interbank Offered Rates (LIBOR), (6) Singapore Interbank Offered Rates (SIBOR),
(7) Philippine Interbank Reference Rate (PHIREF), (8) Government Bond Rate,
(9) Interbank Call Loan Rate, (10) Bank Prime Rate, (11) Treasury Bill Rate, and
(12) Promissory Note Rate from January 2008 to December 2020.
4.2.2.3. External Indicators
Statistics for the external sector indicators are also obtained from Bloomberg.
However, the weekly figures of Foreign Portfolio Investment (FPI) are formally requested from
the International Operations Department (IOD) of the BSP.[47] Similar to the case of
available reserves and reserve money, its historical high-frequency values are neither
published nor shared publicly. Aside from (1) FPI, the (2) daily foreign exchange rate
(i.e., Philippine Peso per US Dollar) is also used as an external sector indicator in this
study. These data cover January 2008 to December 2020.
4.2.2.4. Lagged Values of Domestic Liquidity[48]
Although this study captures numerous monetary, financial, and external indicators as
input variables to predict the future movement of domestic liquidity in the Philippines,
other determinants that are not included in the dataset could also influence its growth.
To address this concern, the lagged value of domestic liquidity is also considered as an
input variable. The lagged value used in this study is the first lag (t-1) of the target variable.
4.3. Averaging and Interpolation
Policies that address economic phenomena are usually formulated based on outdated or
lagged data. Given that the main objective of this study is to provide useful and advance
information that reduces this reliance, this study aims to nowcast
domestic liquidity in the Philippines on a bi-monthly basis (i.e., twice a month), with the
second nowcast being the principal prediction result. This is to maximize the explanatory
power of each high-frequency input variable (i.e., variables with daily frequency). Aside from
this, utilizing regressors with high-frequency data typically helps address the overfitting
problem caused by a limited number of observations.
However, based on the data publication release of each indicator (Table 4.2),
there is an unbalanced frequency problem. Standard regression models
require datasets to have the same level of granularity. Therefore, to align all of the
data correctly, averaging and interpolation are conducted in this study.
[47] The IOD supports the BSP in maintaining monetary stability and external sustainability through the management of
external debt, foreign investments, and other foreign exchange transactions (2020 BSP Organization Primer, p. 25).
[48] Lagged values of domestic liquidity are only utilized under machine learning algorithms.
4.3.1. Averaging of High-Frequency Variables
Data averaging is performed on variables with a daily and weekly frequency.
The input variables (e.g., monetary, financial indicators) with daily frequency are aggregated
and averaged into two (2) numerical values per month. The first value is the average from the
1st until the 15th day of the month, while the second value is the mean from the 16th until the
last day of the month (e.g., available reserves data from 1 to 15 January and 16 to 31 January
are averaged, respectively). On the other hand, explanatory variables with weekly frequency are
averaged based on the first- and second-week as well as the third- and fourth-week data
releases, respectively (e.g., first- and second-week data of foreign portfolio investment are averaged).
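The averaging of daily data into two values per month can be sketched as follows. This Python example is illustrative only: the function name, the key layout (year, month, half), and the sample values are hypothetical.

```python
from datetime import date


def half_month_means(observations):
    """Average (date, value) pairs into two values per month.

    Keys are (year, month, half), where half is 1 for days 1-15 and 2 for day 16 onwards.
    """
    sums = {}
    for day, value in observations:
        key = (day.year, day.month, 1 if day.day <= 15 else 2)
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + value, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical daily observations for January 2020.
daily = [
    (date(2020, 1, 2), 10.0),
    (date(2020, 1, 15), 20.0),
    (date(2020, 1, 16), 30.0),
    (date(2020, 1, 31), 50.0),
]
averaged = half_month_means(daily)
print(averaged)
```

The same grouping idea applies to weekly data, with the first two and the last two weekly releases of each month forming the two groups.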
4.3.2. Interpolation of Low-Frequency Variables
Data interpolation is conducted on the variables with low frequency (i.e., monthly),
such as domestic liquidity, BSP liabilities on NG, and BSP claims on other sectors.
Since these are published on a monthly basis, their official data are treated as
month-end figures. The data points that fall between them and correspond to the averaged
input variable data (e.g., mid-month data) are considered missing values and are interpolated
using a spline interpolation method, which is commonly used for non-linear data estimation.
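As an illustration of how mid-month values can be filled in, the Python sketch below fits a natural cubic spline through month-end figures and evaluates it halfway between them. The text does not state which spline variant was used, so the natural boundary condition is an assumption, and the function name and data are hypothetical.

```python
def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1  # number of intervals; xs must be strictly increasing
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives at the knots; zero at both ends
    if n >= 2:
        a = [h[i - 1] for i in range(1, n)]                # sub-diagonal
        b = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]  # diagonal
        c = [h[i] for i in range(1, n)]                    # super-diagonal
        d = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
             for i in range(1, n)]
        for k in range(1, n - 1):  # forward elimination (Thomas algorithm)
            w = a[k] / b[k - 1]
            b[k] -= w * c[k - 1]
            d[k] -= w * d[k - 1]
        sol = [0.0] * (n - 1)
        sol[-1] = d[-1] / b[-1]
        for k in range(n - 3, -1, -1):  # back substitution
            sol[k] = (d[k] - c[k] * sol[k + 1]) / b[k]
        m[1:n] = sol

    def evaluate(x):
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x is outside the range of the knots")
        i = 0
        while i < n - 1 and x > xs[i + 1]:
            i += 1
        left, right = xs[i + 1] - x, x - xs[i]
        return (m[i] * left ** 3 / (6.0 * h[i])
                + m[i + 1] * right ** 3 / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * left
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right)

    return evaluate


# Hypothetical month-end growth rates at months 0..4; mid-month points are interpolated.
month_end = [0.0, 1.0, 2.0, 3.0, 4.0]
growth = [11.2, 11.5, 12.1, 13.0, 12.4]
spline = natural_cubic_spline(month_end, growth)
mid_month = [spline(k + 0.5) for k in range(4)]
print([round(v, 3) for v in mid_month])
```

The spline passes exactly through the published month-end figures, so only the in-between points are estimated.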
4.4. Diagnostics and Feature Engineering
The raw dataset is refined to improve the performance of time series and machine
learning algorithms used in this study. In particular, data of target and input variables are
(1) seasonally adjusted, (2) log-transformed, and (3) individually assessed for stationarity.
4.4.1. Seasonal Adjustment
Since most published data in the Philippines are not seasonally adjusted,
the data of domestic liquidity and most input variables used in this study are deseasonalized
accordingly. This includes data that were requested from the DES and IOD as well as the other
statistics obtained from the official website of the BSP and Bloomberg
(e.g., BSP liabilities to NG, BSP discount rate). This adjustment was performed
so that seasonal components (e.g., holidays) are not present in the model simulations and
do not distort the estimates from the time series and machine learning models.
4.4.2. Logarithmic Transformation
The normality of data is also an important factor in economic and statistical modeling.
Most real-life datasets do not follow a normal distribution and are often
skewed, which can make empirical results or analysis misleading. Therefore, to address this
concern, the numerical figures of target and input variables in this study are transformed
into their respective logarithmic equivalents.[49]
4.4.3. Stationarity
In order to develop an accurate forecasting model, it is crucial to establish that
the time series data of each indicator is stationary, that is, that the statistical
properties of each time series do not change over time.
In this study, the stationarity of target and input variables is verified through the
Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) tests.
Based on the conducted unit root tests, the level, growth rate, or logarithmic equivalent
of domestic liquidity and the input variables are non-stationary (Table 4.3).[50]
This is because their individual p-values are greater than the five (5) percent significance
level, so the null hypothesis of a unit root cannot be rejected (except for central bank
liabilities to NG). However, when transformed into their respective first differences, the ADF
and PP tests show that these variables are stationary. Therefore, to formulate a nowcasting
model to estimate domestic liquidity growth in the Philippines, the first difference values of
target and input variables (except for BSP liabilities to NG) are used in this study.[51]
[49] If the data of a variable is an index or growth rate, it is not transformed to its logarithmic equivalent.
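The logarithmic transformation and first differencing described in this section can be sketched in Python as follows. The function names and the level figures are hypothetical, and the unit root tests themselves are not reproduced here.

```python
import math


def log_transform(levels):
    """Natural logarithm of each (strictly positive) level."""
    return [math.log(v) for v in levels]


def first_difference(series):
    """Change from one period to the next; the result is one element shorter."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical level data growing by 10 percent per period.
levels = [100.0, 110.0, 121.0]
dlog = first_difference(log_transform(levels))
print([round(v, 4) for v in dlog])
```

For level data, the first difference of the logarithm approximates the period-on-period growth rate, which is one reason the two transformations are often applied together.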
Table 4.3: Unit Root Tests for Domestic Liquidity in the Philippines
VARIABLE                 TEST  LEVEL OF SIG.  P-VALUE (LEVEL/GROWTH/LOG)  P-VALUE (FIRST DIFF.)
Domestic Liquidity (M3)  ADF   0.05           0.14                        0.01
                         PP    0.05           0.61                        0.01
Figure 4.2: Domestic Liquidity in the Philippines (January 2008-December 2020)
(a) Growth Rate (in %); (b) Growth Rate (in %, First Difference)
[50] See Annex B for the individual ADF and PP test results of input variables.
[51] For univariate models, the first differencing of the target variable is conducted within the ARIMA and
RW process. For DFM and machine learning models (i.e., regularization, tree-based methods), data of target and input variables
are transformed into their first difference prior to model simulation.
Figure 4.3: Research Workflow Diagram
Chapter V:
EMPIRICAL RESULTS AND ANALYSIS
5.1. Primer
In this chapter, results of the simulated nowcasts using time series and
machine learning algorithms are presented. The sections of this chapter mainly discuss the
(1) calibration method performed in each model, (2) individual performance of
benchmark and machine learning models through the expanding window validation, and
(3) critical high-frequency indicators (i.e., monetary, financial, external sectors) that are
considered important to accurately nowcast the real-time growth of domestic liquidity in the
Philippines.
5.2. Calibration and Nowcast Results
5.2.1. One-Step-Ahead (Out-of-Sample) via Expanding Window
Since the main objective of this study is to accurately determine the growth of
domestic liquidity in the short run, one-step-ahead (out-of-sample) nowcasts are performed.
This approach is preferred over multi-step-ahead (out-of-sample) estimates
for two (2) primary reasons. The first is to ensure that the most
recent numerical figures of target and input variables are part of
the training datasets. The second is to maximize the forecasting ability of
time series models, specifically Autoregressive Integrated Moving Average (ARIMA) and
Random Walk, because these univariate models place heavier emphasis on the recent
past than on the distant past in producing a forecast. Therefore, to appropriately compare
the accuracy of benchmark models vis-à-vis machine learning algorithms, their respective
one-step-ahead (out-of-sample) nowcasts should be considered one of the bases of evaluation.
It is also crucial to determine whether the precision of the simulated nowcasting models is
consistent. Therefore, the benchmark and machine learning models are trained over an expanding
window (also known as the recursive method) to provide a series of one-step-ahead
(out-of-sample) nowcasts. The bi-monthly dataset covering thirteen (13) years from 2008 to 2020
is divided into twelve (12) different training and test datasets to perform the said approach.
The first training dataset covers the numerical figures of the target and input variables from
January 2008 to December 2019, while its corresponding test dataset comprises the
figures of the target and input variables for January 2020. This process is
repeated until the test dataset covers the figures of the target and input variables
for December 2020. Overall, twenty-four (24) nowcasts are generated for each
time series model and machine learning algorithm, with the end-month one-step-ahead
(out-of-sample) nowcast being the principal prediction result. The estimates of benchmark
models and machine learning algorithms under the said approach are then evaluated
individually and collectively based on their Root Mean Squared Error (RMSE) and
Mean Absolute Error (MAE).
5.2.2. Autoregressive Models
5.2.2.1. Model Calibration
In this study, the trained models under univariate or Autoregressive (AR) methods are
simulated based on three (3) different approaches. The first simulated model has the
parameters (0,1,0) of an ARIMA structure, otherwise known as Random Walk (RW).
This model was formulated because the time series data of domestic liquidity was found to be
non-stationary in the conducted Augmented Dickey-Fuller (ADF) and
Phillips-Perron (PP) tests. To address this concern, one of the best strategies is to predict the
change that occurs from one period to the next rather than directly predicting the level of the
series at each period. In other words, it is essential to observe the first difference of the
time series to see whether predictable patterns can be determined (Nau, 2014).
The second univariate model simulated has the parameters (4,1,1) of an ARIMA model.
This is formulated since the Partial Autocorrelation Function (PACF) as well as the
Akaike Information Criterion (AIC) suggest that four (4) autoregressive (AR) lags should be
considered to forecast domestic liquidity in the Philippines (Figure 5.1). It is also simulated
because the time series data of the said monetary indicator was found to be non-stationary.
In some cases of non-stationary time series, it is essential to use the average of the
last few observations to filter out the noise and accurately estimate the local mean (Nau, 2014).
Figure 5.1: ACF and PACF of Domestic Liquidity Growth in the Philippines (Seasonally Adjusted)
(a) ACF of M3 (Seasonally Adjusted); (b) PACF of M3 (Seasonally Adjusted)
Figure 5.2: Residual Plot for ARIMA (4,1,1)[52]
[52] The red-colored line under the ACF of ARIMA(4,1,1) indicates that a seasonal lag should be included in the overall model.
Lastly, the parameters of the third univariate model are established based on the
built-in function of the statistical software, RStudio. The decision to use this automated
process is due to the seasonal lag that was found to be relevant under the
Autocorrelation Function (ACF) of ARIMA (4,1,1) (Figure 5.2). For this reason,
the third univariate model utilized in this study is a seasonal ARIMA (SARIMA) whose
parameters are selected based on the characteristics of each of the twelve (12) training
datasets.[53]
5.2.2.2. Nowcast Results
Figure 5.3: Autoregressive Model Nowcasts vs. Actual M3 Growth (January to December 2020)
(In Percent, Year-on-Year Seasonally Adjusted)
Based on the three (3) univariate models conducted, results indicate that their respective
one-step-ahead (out-of-sample) nowcasts from January to December 2020 closely follow the
overall trend of domestic liquidity growth in the Philippines (Figure 5.3).
The ARIMA, RW, and auto-SARIMA models provided decent estimates in the months
(i.e., March, April, May) when the growth of the said monetary indicator suddenly expanded
due to the increase in the borrowings of the National Government (NG) to minimize the negative
impact of the Coronavirus Disease 2019 (COVID-19) pandemic on the economy of the said country.[54]
[53] The parameters under auto-SARIMA models can be different from January to December 2020. This is because RStudio selects
the optimal lag orders to forecast domestic liquidity in each time period. For example, the univariate model to nowcast January 2020
has the parameters ARIMA(2,1,4)(1,0,1), while for February 2020 the model has the parameters ARIMA(5,1,1)(1,0,1).
However, a comparison of their respective monthly forecast errors shows that
no single univariate model is consistently the most accurate throughout
the expanding window. Tables 5.1 and 5.2 show that auto-SARIMA registered the lowest
RMSE and MAE in the highest number of months (i.e., March, May, September, November,
December). This was followed by Random Walk (i.e., January, February, June, July) and
ARIMA (i.e., April, August, October), respectively. The accurate nowcasts from auto-SARIMA
are expected since the statistical software R Studio selects its parameters automatically.
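The month-by-month comparison can be reproduced directly from the reported figures. The sketch below uses the monthly RMSE values transcribed from Table 5.1 and identifies, for each month, the model with the lowest error; the variable names are illustrative only.

```python
# Monthly RMSE values transcribed from Table 5.1 (M1 to M12).
rmse_by_model = {
    "ARIMA": [0.716, 1.422, 0.936, 1.663, 0.196, 1.636,
              0.474, 0.102, 0.649, 0.117, 0.452, 0.577],
    "Random Walk": [0.288, 0.722, 1.470, 2.415, 0.434, 1.095,
                    0.425, 0.403, 0.669, 0.199, 0.880, 0.895],
    "auto-SARIMA": [1.622, 1.879, 0.556, 1.986, 0.134, 1.535,
                    0.702, 0.428, 0.299, 0.174, 0.222, 0.057],
}
months = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# For each month, record which model has the lowest RMSE.
best_months = {model: [] for model in rmse_by_model}
for i, month in enumerate(months):
    winner = min(rmse_by_model, key=lambda m: rmse_by_model[m][i])
    best_months[winner].append(month)

for model, won in best_months.items():
    print(model, len(won), won)
```

The count gives five (5) months for auto-SARIMA, four (4) for Random Walk, and three (3) for ARIMA, in line with the months listed above.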
Table 5.1: RMSE of Autoregressive Models
55
Model      M1     M2     M3     M4     M5     M6     M7     M8     M9     M10    M11    M12    OVR.
ARIMA      0.716  1.422  0.936  1.663  0.196  1.636  0.474  0.102  0.649  0.117  0.452  0.577  0.917
R. Walk    0.288  0.722  1.470  2.415  0.434  1.095  0.425  0.403  0.669  0.199  0.880  0.895  1.016
A. SARIMA  1.622  1.879  0.556  1.986  0.134  1.535  0.702  0.428  0.299  0.174  0.222  0.057  1.066
Table 5.2: MAE of Autoregressive Models
Model      M1     M2     M3     M4     M5     M6     M7     M8     M9     M10    M11    M12    OVR.
ARIMA      0.715  1.395  0.762  1.537  0.194  1.527  0.467  0.088  0.544  0.106  0.389  0.537  0.688
R. Walk    0.273  0.669  1.319  2.327  0.428  0.996  0.416  0.380  0.543  0.149  0.825  0.862  0.766
A. SARIMA  1.609  1.801  0.405  1.854  0.134  1.411  0.650  0.355  0.244  0.162  0.194  0.050  0.739
54
https://www.bsp.gov.ph/SitePages/MediaAndResearch/MediaDisp.aspx?ItemId=5297
55
M1 to M12 refer to the months included in the expanding window validation (e.g., January, February 2020).
The overall forecast errors of the three (3) univariate models, on the other hand,
lead to a different conclusion. Based on their overall RMSE and
MAE, ARIMA (4,1,1) is the most appropriate univariate time series
model to estimate the growth of domestic liquidity. This is because the said model registered
the most accurate overall nowcasts, with an RMSE of 0.917 and an MAE of 0.688.
Both of these indicators are lower than the forecast errors registered by
RW (1.016 and 0.766) and auto-SARIMA (1.066 and 0.739), respectively
(Tables 5.1 and 5.2).
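The relationship between the monthly and overall forecast errors can be checked numerically. The sketch below is an observation from the reported figures rather than a description of the procedure actually followed: it assumes that each month carries equal weight, so that the overall RMSE is the square root of the mean of the squared monthly RMSE values and the overall MAE is the mean of the monthly MAE values. The function names are illustrative.

```python
import math

# Monthly forecast errors of ARIMA (4,1,1) transcribed from Tables 5.1 and 5.2.
arima_rmse = [0.716, 1.422, 0.936, 1.663, 0.196, 1.636,
              0.474, 0.102, 0.649, 0.117, 0.452, 0.577]
arima_mae = [0.715, 1.395, 0.762, 1.537, 0.194, 1.527,
             0.467, 0.088, 0.544, 0.106, 0.389, 0.537]

def pooled_rmse(monthly_rmse):
    """Overall RMSE when every month has the same number of nowcasts (assumption)."""
    return math.sqrt(sum(r * r for r in monthly_rmse) / len(monthly_rmse))

def pooled_mae(monthly_mae):
    """Overall MAE when every month has the same number of nowcasts (assumption)."""
    return sum(monthly_mae) / len(monthly_mae)

overall_rmse = pooled_rmse(arima_rmse)
overall_mae = pooled_mae(arima_mae)
print(round(overall_rmse, 3), round(overall_mae, 3))  # 0.917 0.688
```

Under this assumption, the computed values agree with the overall figures reported for ARIMA in the OVR. column of both tables.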
5.2.3. Dynamic Factor Model
5.2.3.1. Model Calibration
A Dynamic Factor Model (DFM) is also utilized in this study to systematically include the
wide range of high-frequency monetary, financial, and external sector indicators as
input variables. This study followed the methodology used by
Mariano and Ozmucur (2020) in implementing the said approach, wherein:
(1) the number of indicators is reduced through factor analysis; (2) the identified factors are modeled
under a Vector Autoregressive (VAR) framework; and (3) the predicted values from the
VAR are then used to nowcast the target variable.
Figure 5.4: Eigenvalues of Input Variables via Factor Analysis
By performing factor analysis, three (3) factors were extracted from the initial
twenty (20) input variables using the method of maximum likelihood. The decision to retain
these factors was based on each factor's eigenvalue and cumulative variance.
56
Figure 5.4 indicates that factors one (1) to three (3)
(i.e., the first three (3) blue points) have larger eigenvalues than the remaining
seventeen (17) factors. Although using a higher number of factors is still acceptable,
the first three (3) factors already explain sixty-four (64) percent of the variance in the
twenty (20) different monetary, financial, and external sector indicators used in this study.
57
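The selection rule described above (retain the smallest number of factors whose cumulative share of variance reaches a chosen threshold) can be sketched as follows. The eigenvalues in this example are hypothetical and are not those plotted in Figure 5.4; the sixty (60) percent threshold follows the range mentioned in the footnote.

```python
def cumulative_variance_share(eigenvalues):
    """Cumulative share of total variance explained by the first k factors."""
    total = sum(eigenvalues)
    shares, running = [], 0.0
    for value in eigenvalues:
        running += value
        shares.append(running / total)
    return shares

def factors_needed(eigenvalues, threshold):
    """Smallest number of factors whose cumulative share reaches the threshold."""
    for k, share in enumerate(cumulative_variance_share(eigenvalues), start=1):
        if share >= threshold:
            return k
    return len(eigenvalues)

# Hypothetical eigenvalues sorted from largest to smallest; not the thesis values.
eigenvalues = [6.0, 4.0, 3.0, 2.0, 2.0, 1.5, 1.0, 0.5]
shares = cumulative_variance_share(eigenvalues)
n_factors = factors_needed(eigenvalues, threshold=0.60)
print(n_factors, [round(s, 3) for s in shares])
```

With these illustrative values, the first three (3) factors account for sixty-five (65) percent of the total variance, so three (3) factors are retained.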
After the aforementioned process, the three (3) identified factors are then utilized under
a VAR framework to complete the method of estimating the growth of
domestic liquidity in the Philippines. The optimal lags for this model are selected based on the
AIC and the Hannan-Quinn (HQ) Information Criterion. Based on these selection criteria,
five (5) autoregressive lags are used in each of the twelve (12) training models to
determine the estimates from January to December 2020.
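For reference, the two (2) lag selection criteria can be written in their common textbook form for a VAR with $K$ endogenous variables (here, the three (3) factors), $p$ lags, and $T$ observations. This is stated as an assumption about the standard definitions; the exact variant reported by the statistical software may differ by a constant or a scaling factor.

```latex
\mathrm{AIC}(p) = \ln\left|\hat{\Sigma}_u(p)\right| + \frac{2}{T}\,pK^{2},
\qquad
\mathrm{HQ}(p) = \ln\left|\hat{\Sigma}_u(p)\right| + \frac{2\ln\ln T}{T}\,pK^{2}
```

Here $\hat{\Sigma}_u(p)$ denotes the estimated residual covariance matrix of the VAR with $p$ lags. The selected lag order is the value of $p$ that minimizes the criterion; HQ penalizes additional lags more heavily than AIC whenever $\ln\ln T > 1$.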
5.2.3.2. Nowcast Results
Compared with the three (3) univariate models, the DFM, as a nowcasting model,
provides inconsistent estimates of the overall movement of domestic liquidity in the
first semester of 2020. In particular, the one-step-ahead (out-of-sample) nowcasts of the said model
did not precisely estimate the expansion of domestic liquidity due to the sharp increase in the
borrowings and deposits of NG to the central bank from March to May 2020
(Figure 5.5).
In contrast, the DFM provides more accurate results in the latter half of the year.
Tables 5.3 and 5.4 show that the monthly forecast errors of the said model are
relatively low compared with those under ARIMA, Random Walk, and auto-SARIMA, particularly from
August to December 2020. This outcome is also reflected in the overall forecast errors of the DFM.
The said multivariate model registered an overall RMSE and MAE of only 0.825 and 0.619,
respectively. These forecast errors are lower than the overall RMSE and MAE
of the univariate models (Figure 5.6).
56
An eigenvalue refers to the total amount of variance that can be explained by a given principal component/factor.
57
Sixty (60) to sixty-five (65) percent of variance is the common figure used in economic analysis (Mariano and Ozmucur, 2020).
Figure 5.5: DFM Nowcasts vs. Actual M3 Growth (January to December 2020)
(In Percent Difference, Seasonally Adjusted)
Table 5.3: RMSE of DFM
Model      M1     M2     M3     M4     M5     M6     M7     M8     M9     M10    M11    M12    OVR.
DFM        0.557  1.093  0.565  1.458  0.247  1.678  0.965  0.184  0.513  0.182  0.078  0.267  0.825
Table 5.4: MAE of DFM
Model      M1     M2     M3     M4     M5     M6     M7     M8     M9     M10    M11    M12    OVR.
DFM        0.526  1.091  0.509  1.446  0.237  1.649  0.918  0.138  0.452  0.136  0.077  0.246  0.619
Figure 5.6: Overall (a) RMSE and (b) MAE of Autoregressive Models and DFM
5.2.4. Machine Learning Models
Before using machine learning algorithms, it is common practice to assess their
stability through cross-validation. This ensures that the models balance the
bias-variance tradeoff and provide accurate estimates for new observations based on the training
or historical data (James et al., 2013). In this study, therefore, cross-validation is
performed before conducting a series of recursive nowcasts on the growth of domestic liquidity
in the Philippines via regularization (i.e., Ridge Regression, Least Absolute Shrinkage and
Selection Operator, Elastic Net) and tree-based (i.e., Random Forest, Gradient Boosted Trees)
methods.
Although there are various ways to cross-validate machine learning methods
(e.g., holdout method, stratified K-Fold cross-validation), this study specifically utilized
(1) K-Fold cross-validation and (2) leave-one-out cross-validation for the
twelve (12) training datasets of target and input variables. Specifically, the training datasets
under Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO),
Elastic Net (ENET), and Gradient Boosted Trees (GBT) are tuned based on
ten (10)-fold cross-validation. In contrast, the training datasets under Random Forest (RF) are
calibrated based on their out-of-bag (OOB) scores.
58, 59
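The mechanics of K-Fold cross-validation can be sketched as follows. This is a simplified illustration that splits the observations into contiguous folds; the software used in this study may assign observations to folds differently (for example, at random), and the sample size of twenty-five (25) is hypothetical.

```python
def k_fold_indices(n_obs, k):
    """Split indices 0..n_obs-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_obs, k)
    folds, start = [], 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def train_test_splits(n_obs, k):
    """Pair each held-out fold with the remaining observations used for training."""
    splits = []
    for held_out in k_fold_indices(n_obs, k):
        held = set(held_out)
        train = [i for i in range(n_obs) if i not in held]
        splits.append((train, held_out))
    return splits

# Hypothetical sample size; ten (10) folds as in the tuning described above.
splits = train_test_splits(n_obs=25, k=10)
for train, test in splits[:2]:
    print(len(train), test)
```

Setting `k` equal to the number of observations yields leave-one-out cross-validation, in which every observation is held out exactly once.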
5.2.4.1. Regularization Methods
5.2.4.1.1. Model Calibration
The optimal shrinkage penalty for each algorithm under regularization methods is
determined based on ten (10)-fold cross-validation. Under this approach, twelve (12)
different values of the said parameter are determined for each regularization algorithm
since twelve (12) training datasets are used. In other words, the value of the shrinkage penalty
is specifically tailored to the attributes of each training dataset and the norm of regularization
(i.e., Ridge Regression, LASSO, ENET). Figure 5.7 illustrates this scenario.
It shows that the optimal shrinkage penalty for estimating domestic liquidity for
January 2020 differs from the optimal shrinkage penalty for predicting the said
monetary indicator for February 2020. In particular, Panel A shows that the former has an
optimal shrinkage penalty value of 0.772, while Panel B shows that the latter has an
optimal shrinkage penalty value of 1.012.
60
58
10-Fold cross-validation is the standard cross-validation technique used in machine learning exercises.
59
OOB is virtually equivalent to leave-one-out cross-validation (James et al., 2013).
Figure 5.7: Optimal Shrinkage Penalty via Ridge Regularization (January and February 2020)
(a) Training Dataset to Estimate M3 Jan. 2020; (b) Training Dataset to Estimate M3 Feb. 2020
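The way a shrinkage penalty is selected through cross-validation can be sketched with a deliberately simplified case: a Ridge Regression with a single centered predictor and no intercept, for which the estimate has the closed form sum(xy) / (sum(x^2) + penalty). The data, the penalty grid, and the interleaved fold assignment are all hypothetical and serve only to show the mechanics; they do not reproduce the values in Figure 5.7.

```python
def ridge_slope(x, y, penalty):
    """Closed-form ridge estimate for one centered predictor (no intercept)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + penalty)

def cv_error(x, y, penalty, k):
    """Mean squared prediction error of ridge over k interleaved folds."""
    n = len(x)
    squared_errors = []
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        slope = ridge_slope([x[i] for i in train], [y[i] for i in train], penalty)
        squared_errors.extend((y[i] - slope * x[i]) ** 2 for i in test)
    return sum(squared_errors) / len(squared_errors)

def best_penalty(x, y, grid, k):
    """Penalty in the grid with the lowest cross-validated error."""
    return min(grid, key=lambda penalty: cv_error(x, y, penalty, k))

# Hypothetical (approximately centered) data; not the thesis data.
x = [-2.0, -1.0, 0.0, 1.0, 2.0, -1.5, 1.5, 0.5, -0.5, 0.0]
y = [-4.1, -1.8, 0.2, 2.1, 3.9, -3.2, 2.8, 1.1, -0.9, -0.1]
grid = [0.0, 0.1, 1.0, 10.0]
chosen = best_penalty(x, y, grid, k=5)
print(chosen, round(ridge_slope(x, y, chosen), 3))
```

Because the cross-validated error depends on the training dataset, repeating this search on each of the twelve (12) training datasets can return a different penalty each time, which is the pattern shown in Panels A and B.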
5.2.4.1.2. Nowcast Results
After being calibrated based on their specific shrinkage penalties, the models under
regularization methods then estimate domestic liquidity growth using the test datasets from
January to December 2020. The results of the recursive nowcasts show that
Ridge Regression, LASSO, and ENET provide more consistent and accurate projections
than the benchmark models used in this study.
In particular, the monthly estimates of the three (3) machine learning algorithms generally
have lower forecast errors than the individual nowcasts of the
benchmark models used in this study (i.e., ARIMA, RW, auto-SARIMA, and DFM)
(Tables 5.5 and 5.6), except for September and October 2020 (Figure 5.8). Ridge Regression,
LASSO, and ENET also provided accurate nowcasts of the unexpected increase in the growth
of domestic liquidity due to the increase in NG borrowings and deposits to the BSP in March and
April 2020 (Tables 5.5 and 5.6).
The aforementioned result can also be observed from the overall forecast errors of
the three (3) machine learning algorithms. This is mainly because Ridge Regression, LASSO, and ENET
registered lower overall RMSE and MAE than
ARIMA (0.917 and 0.688), Random Walk (1.016 and 0.766), auto-SARIMA (1.066 and 0.739),
and DFM (0.825 and 0.619) (Figure 5.9).
60
See Annexes C to E for the complete list of optimal shrinkage penalties for each training dataset via regularization methods.
Figure 5.8: Regularization Method Nowcasts vs. Actual M3 Growth (January to December 2020)
(In Percent Difference, Seasonally Adjusted)
Table 5.5: RMSE of Ridge Regression, LASSO, and ENET
Model      M1     M2     M3     M4     M5     M6     M7     M8     M9     M10    M11    M12    OVR.
Ridge      0.292  0.372  0.928  1.163  0.173  0.258  0.261  0.248  0.596  0.449  0.123  0.349  0.529
LASSO      0.264  0.237  0.964  1.348  0.046  0.185  0.179  0.215  0.621  0.416  0.115  0.286  0.551
ENET       0.262  0.259  0.973  1.328  0.048  0.199  0.206  0.187  0.631  0.390  0.099  0.291  0.549
However, by comparing the three (3) models under the regularization method,
it can be observed that LASSO is the most accurate machine learning model to nowcast the
growth of domestic liquidity in the Philippines on a monthly basis. This is mainly because the
said machine learning algorithm registered the highest number of months with the lowest
forecast errors from January to December 2020. Despite the strong monthly accuracy of LASSO, however,
Ridge Regression and ENET registered the most accurate overall estimates. The
former provided an overall RMSE of 0.529, while the latter registered an overall MAE of 0.391,
both of which are lower than the corresponding overall forecast errors of LASSO (Tables 5.5 and 5.6).
Table 5.6: MAE of Ridge Regression, LASSO, and ENET
Model      M1     M2     M3     M4     M5     M6     M7     M8     M9     M10    M11    M12    OVR.
Ridge      0.292  0.364  0.887  1.136  0.156  0.245  0.259  0.209  0.596  0.325  0.116  0.345  0.411
LASSO      0.257  0.234  0.909  1.340  0.040  0.182  0.179  0.202  0.620  0.345  0.114  0.281  0.392
ENET       0.255  0.257  0.916  1.321  0.036  0.196  0.206  0.171  0.631  0.318  0.099  0.286  0.391
Figure 5.9: Overall (a) RMSE and (b) MAE of Benchmark Models and Regularization Methods
5.2.4.2. Tree-Based Methods
5.2.4.2.1. Model Calibration
Similar to regularization methods, RF and GBT are tuned under the cross-validation
method to provide accurate estimates of domestic liquidity growth from January to
December 2020. The methods used to calibrate these two (2) algorithms are OOB scores and
10-Fold cross-validation, respectively. By doing this, each of the twelve (12) training datasets under
RF and GBT has an optimal number of variables randomly sampled as candidates
at each split and an optimal number of trees to grow, respectively.
The results of these calibration techniques illustrate this point.
Figure 5.10 depicts the OOB errors of the training datasets under RF for January and
February 2020. Panel A shows that five (5) indicators are already sufficient to estimate domestic
liquidity growth for January 2020 since this setting has the lowest OOB error of 1.018. On the other hand,
Panel B indicates that ten (10) indicators are necessary to accurately nowcast the growth of the said
monetary indicator for February 2020 because this setting registered the lowest OOB error of 1.014.
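The idea behind the OOB error can be sketched as follows. Each tree in a Random Forest is grown on a bootstrap sample drawn with replacement, and the observations that are never drawn (the out-of-bag observations) serve as a validation set for that tree. The sample size and the random seed below are hypothetical, and the function names are introduced only for illustration.

```python
import random

def bootstrap_split(n_obs, seed):
    """Draw a bootstrap sample and return (in-bag draws, out-of-bag indices)."""
    rng = random.Random(seed)
    in_bag = [rng.randrange(n_obs) for _ in range(n_obs)]  # with replacement
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_obs) if i not in drawn]
    return in_bag, out_of_bag

def expected_oob_share(n_obs):
    """Probability that a given observation is never drawn in n_obs draws."""
    return (1.0 - 1.0 / n_obs) ** n_obs

# Hypothetical number of training observations; not the thesis sample size.
in_bag, out_of_bag = bootstrap_split(n_obs=144, seed=0)
print(len(set(in_bag)), len(out_of_bag), round(expected_oob_share(144), 3))
```

On average, a little over one-third of the observations are left out of each bootstrap sample, which is why the OOB error can be computed without setting aside a separate validation dataset.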
Figure 5.10: OOB Error of Training Datasets via Random Forest
61
(a) Training Dataset to Estimate M3 Jan. 2020; (b) Training Dataset to Estimate M3 Feb. 2020
Figure 5.11: Optimal Number of Trees via Gradient Boosted Trees
62
(a) Training Dataset to Estimate M3 Jan. 2020; (b) Training Dataset to Estimate M3 Feb. 2020
61
See Annex F for the complete list of OOB errors for each training dataset via Random Forest.
62
See Annex G for the complete list of the optimal number of trees for each training dataset via Gradient Boosted Trees.
Meanwhile, Figure 5.11 illustrates the optimal number of trees that should be considered
to accurately nowcast the growth of domestic liquidity under GBT. Panel A shows
that sixty-seven (67) iterations are necessary to provide a precise estimate of
domestic liquidity growth for January 2020. On the other hand, Panel B shows that fifteen (15)
iterations are already sufficient for the GBT model to accurately nowcast domestic liquidity
growth for February 2020.
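The selection of the number of boosting iterations can be sketched with a small example. The following is a simplified illustration, not the implementation used in this study: it boosts single-split regression trees (stumps) on one predictor under squared-error loss and, for brevity, tracks the error on a single validation set instead of ten (10) folds. All data and parameter values are hypothetical.

```python
def fit_stump(x, residuals):
    """Best single-split stump for squared error (needs at least two distinct x values)."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for v, r in zip(x, residuals) if v <= threshold]
        right = [r for v, r in zip(x, residuals) if v > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda v: left_mean if v <= threshold else right_mean

def mse(y, predictions):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(y, predictions)) / len(y)

def boost(x_train, y_train, x_valid, y_valid, n_trees, learning_rate):
    """Return the validation MSE recorded after each boosting iteration."""
    base = sum(y_train) / len(y_train)
    train_pred = [base] * len(x_train)
    valid_pred = [base] * len(x_valid)
    history = []
    for _ in range(n_trees):
        # Each new stump is fitted to the current residuals of the training data.
        residuals = [a - p for a, p in zip(y_train, train_pred)]
        stump = fit_stump(x_train, residuals)
        train_pred = [p + learning_rate * stump(v) for p, v in zip(train_pred, x_train)]
        valid_pred = [p + learning_rate * stump(v) for p, v in zip(valid_pred, x_valid)]
        history.append(mse(y_valid, valid_pred))
    return history

def optimal_trees(history):
    """Number of trees (counted from one) with the lowest validation error."""
    return history.index(min(history)) + 1

# Hypothetical training and validation data; not the thesis data.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_train = [1.2, 1.9, 3.2, 3.8, 5.3, 5.9, 7.1, 8.2]
x_valid = [1.5, 3.5, 5.5, 7.5]
y_valid = [1.4, 3.6, 5.4, 7.7]
history = boost(x_train, y_train, x_valid, y_valid, n_trees=50, learning_rate=0.3)
print(optimal_trees(history), round(min(history), 4))
```

Since the validation error depends on the training dataset, the number of iterations that minimizes it can vary considerably from one training dataset to another, as seen in Panels A and B.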
5.2.4.2.2. Nowcast Results
Similar to the results under regularization methods, utilizing RF and GBT as
primary nowcasting models also yields more consistent and accurate estimates than
the benchmark models used in this study. The monthly forecast errors of the
two (2) machine learning models are also generally lower than those under ARIMA, RW,
auto-SARIMA, and DFM, except for the nowcast result of RF in September 2020.
Based on the recursive nowcasts, it can also be found that RF and GBT provide decent
projections for the months (e.g., March, April, May) where the growth of domestic liquidity
unexpectedly expanded due to the increased borrowings and deposits of NG to the BSP
(Tables 5.7 and 5.8).
Figure 5.12: Tree-Based Method Nowcasts vs. Actual M3 Growth (January to December 2020)
(In Percent Difference, Seasonally Adjusted)
Aside from their robust monthly estimates, the overall nowcasts of RF and GBT based
on the expanding window also registered a lower set of RMSE and MAE.
The results indicate that RF registered forecast errors of only 0.595 and 0.432 for RMSE and
MAE, respectively. Meanwhile, GBT registered a slightly higher RMSE of 0.632 and MAE of 0.469.
These figures are lower than the overall forecast errors of the
univariate and multivariate models performed in this study (Figure 5.13).
Table 5.7: RMSE of RF and GBT
Model      M1     M2     M3     M4     M5     M6     M7     M8     M9     M10    M11    M12    OVR.
RF         0.346  0.389  0.879  1.455  0.265  0.208  0.167  0.265  0.855  0.203  0.077  0.307  0.595
GBT        0.180  0.686  0.986  1.536  0.060  0.495  0.305  0.241  0.636  0.248  0.201  0.216  0.632
Table 5.8: MAE of RF and GBT
Model      M1     M2     M3     M4     M5     M6     M7     M8     M9     M10    M11    M12    OVR.
RF         0.345  0.377  0.830  1.454  0.242  0.201  0.140  0.235  0.852  0.147  0.058  0.307  0.432
GBT        0.179  0.684  0.972  1.530  0.060  0.490  0.243  0.201  0.636  0.218  0.200  0.215  0.469
Figure 5.13: Overall (a) RMSE and (b) MAE of Benchmark Models vs. Tree-Based Methods
Based on the aforementioned discussion, it can also be established that RF is the most
accurate tree-based model to nowcast the growth of domestic liquidity despite having an
inaccurate estimate in September 2020. This is mainly because the said model provided the
highest number of months with precise estimates from January to December 2020.
These include the nowcasts for January, February, March, April, June, July, November, and
December 2020 (Tables 5.7 and 5.8).
5.3. Further Analysis
5.3.1. Variable Importance
One of the main advantages of using machine learning algorithms in economic nowcasting
is their strong capability to identify the critical factors that explain the
movement or growth of a particular macroeconomic indicator. Numerous studies
have already established that these algorithms can formulate quantitative models
with accurate estimates despite using a limited number of indicators.
63
Among the machine learning models that specifically have this ability are regularization and
tree-based methods, such as LASSO, ENET, RF, and GBT.
64
5.3.1.1. LASSO and ENET
Based on the recursive nowcasts conducted by LASSO and ENET for
January and February 2020, it was found that (1) foreign exchange rate (FOREX),
(2) inflow of FPI, (3) LIBOR, (4) bank savings rate, (5) NG deposits to the central bank, and
(6) liabilities of other sectors to the central bank are among the critical indicators that should
be considered in estimating the growth of domestic liquidity in the Philippines.
This is mainly because, among the twenty-one (21) indicators used as input variables, these are the
determinants that consistently have non-zero coefficients under LASSO and ENET
in January and February 2020 (Table 5.9).
65
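The reason LASSO and ENET can set coefficients exactly to zero can be illustrated with the soft-thresholding operator. The sketch below relies on a simplifying assumption that does not hold for the actual dataset: if the predictors were orthonormal and the objective were written as one-half of the residual sum of squares plus the penalty times the sum of absolute coefficients, each LASSO coefficient would equal the unpenalized estimate shrunk toward zero by the penalty and set to zero when it is smaller than the penalty in absolute value. The indicator names and values are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink a coefficient toward zero; return exactly zero if |value| <= penalty."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

# Hypothetical unpenalized coefficients; not the thesis estimates.
unpenalized = {
    "indicator_a": 0.30,
    "indicator_b": -0.12,
    "indicator_c": 0.04,
    "indicator_d": -0.02,
}
penalty = 0.05

lasso_like = {name: soft_threshold(beta, penalty) for name, beta in unpenalized.items()}
selected = [name for name, beta in lasso_like.items() if beta != 0.0]
print(selected)
```

In this illustration, only the two (2) indicators with the largest unpenalized coefficients keep non-zero values, which mirrors how the blank entries in Table 5.9 correspond to indicators that are dropped from the model.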
63
See the studies of Cepni et al. (2018), Richardson et al. (2018), Ferrara and Simoni (2019), and Tamara et al. (2020).
64
See Chapter 3 for the comprehensive discussion on these models.
65
Other months identified BSP Discount Rate, Bank Savings Rate, and WMOR as important indicators (See Annex H and I).
5.3.1.2. Random Forest and Gradient Boosted Trees
The critical indicators identified under RF and GBT are similar to the input variables
selected by LASSO and ENET. The main difference, however, is that both of the tree-based
methods used in this study have identified that lagged values of the target variable,
as an input variable, are also crucial to provide an accurate estimate of domestic liquidity growth
in the Philippines. In particular, Figures 5.14 and 5.15 indicate that (1) lagged M3 growth (T-1),
(2) liabilities of other sectors to the central bank (OSC), and (3) NG deposits to the central
bank (NGD) are by far the three (3) most important variables that should be considered in
estimating the growth of domestic liquidity in the Philippines.
Table 5.9: Variable Coefficients via LASSO and ENET (January-February 2020)
NO.  VARIABLE                                LASSO        LASSO        ENET         ENET
                                             (JAN. 2020)  (FEB. 2020)  (JAN. 2020)  (FEB. 2020)
-    Intercept                                0.016        0.015        0.016        0.015
1    M3 Growth (T-1)                          -            -            -            -
2    BSP Liabilities on National Government  -0.015       -0.015       -0.014       -0.014
3    BSP Claims on Other Sectors              0.235        0.235        0.216        0.216
4    Foreign Portfolio Investment (In)       -0.003       -0.004       -0.010       -0.010
5    Foreign Portfolio Investment (Out)       -            -            -            -
6    Available Reserves                       -            -            -            -
7    Reserve Money                            -            -            -            -
8    CBOE Volatility Index                    -            -            -            -
9    Credit Default Swap                      -            -            -            -
10   London Interbank Reference Rate          0.111        0.114        0.097        0.100
11   Singapore Interbank Reference Rate       -            -            -            -
12   Philippine Interbank Reference Rate      -            -            -            -
13   Philippine Government Bond Rate          -            -            -            -
14   BSP Discount Rate                        -            -            -            -
15   Bank Savings Rate                       -0.103       -0.110       -0.080       -0.087
16   Bank Prime Rate                          -            -            -            -
17   Money Market Rate (Promissory Note)      -            -            -            -
18   Treasury Bill Rate                       -            -            -            -
19   Interbank Call Rate                      -            -            -            -
20   Philippine Peso per US Dollar (FOREX)    0.124        0.124        0.111        0.119
21   Weighted Monetary Operations Rate        -            -            -            -
Figure 5.14: Node Impurity via Random Forest
Figure 5.15: Variable Importance Plot via Gradient Boosted Trees
Chapter VI:
CONCLUSION
6.1. Summary and Conclusion
Domestic liquidity (also known as broad money) is defined as the sum of all
liquid financial instruments held by money-holding sectors that are used as a
medium of exchange in an economy (IMF, 2016). Changes in the overall growth of this
monetary indicator are among the most important dynamics that numerous central banks
closely monitor. This is because domestic liquidity is an essential element of the
overall transmission mechanism of monetary policy, particularly the impact of
money supply expansion or contraction on aggregate demand, interest rates, inflation, and
overall economic growth (Mankiw, n.d.).
In the Philippines, data on domestic liquidity is used as a primary component
in formulating monetary policy and as a leading indicator for monitoring
price and financial stability. However, similar to the concerns regarding the delayed publication
of data or statistical indicators generated by most government offices, data on domestic liquidity
in the said country also suffers from a series of lags and revisions. Due to this predicament,
policymakers in the Central Bank of the Philippines or Bangko Sentral ng Pilipinas (BSP)
typically formulate monetary policies and address different economic phenomena (e.g., inflation,
business cycle) using outdated or lagged values of this indicator.
The concept of short-term forecasting, or nowcasting, is among the
methodologies utilized by numerous institutions (e.g., International Financial Institutions (IFIs),
central banks) to address the aforementioned issues in data publication. This approach
has also become prevalent because of the emergence of big data and
machine learning. These two (2) methodologies augment the overall process of providing a solution to the
difficulty of producing data on a real-time basis. This is mainly because they
provide complementary information concerning the macroeconomic data that government offices
usually publish and produce accurate estimates using an immense amount of data or
information, respectively (Hassani and Silva, 2015; Richardson et al., 2018).
Drawing upon this background, the concept of nowcasting using different
machine learning algorithms is utilized in this study to address the aforementioned issues,
particularly the lagged release of data on domestic liquidity in the Philippines.
The objective is to formulate an accurate quantitative model that the BSP can sustainably
use to estimate the short-run growth of the said monetary indicator. To this end, five (5) popular
machine learning algorithms under regularization methods (i.e., Ridge Regression,
Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net (ENET)) and
tree-based methods (i.e., Random Forest (RF), Gradient Boosted Trees (GBT)) are estimated using different
high-frequency monetary, financial, and external sector indicators from January 2008 to
December 2020. The performances of these
algorithms are then compared against traditional time series models such as Autoregressive (AR)
and Dynamic Factor Models (DFM). In particular, their respective one-step-ahead
(out-of-sample) nowcasts under an expanding window process are evaluated based on monthly
and overall Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).
The results demonstrate that machine learning algorithms provide more accurate
estimates than the benchmark models used in this study. This is mainly because the said approaches
registered consistent monthly estimates with low forecast errors. Tables 6.1 and 6.2 show that
the nowcasts of machine learning algorithms are more accurate than the estimates provided by
AR models and DFM. It can also be observed that the overall RMSE and MAE of
all machine learning models used in this study are lower than those of the benchmark models.
These algorithms, in addition, registered precise estimates for the months (i.e., March, April,
May) where domestic liquidity growth suddenly expanded (e.g., increased borrowings and deposits
of the National Government (NG) to BSP) due to the impact of the Coronavirus Disease 2019
(COVID-19) in the Philippines. Based on these outcomes, it can be concluded that both
regularization and tree-based machine learning algorithms could be used as alternative models
to estimate the growth of domestic liquidity in the Philippines.
Table 6.1: RMSE of Benchmark and Machine Learning Models (Summary)
66
Table 6.2: MAE of Benchmark and Machine Learning Models (Summary)
Figure 6.1: Overall (a) RMSE and (b) MAE of Benchmark and Machine Learning Models
However, among the quantitative models, LASSO and RF registered the highest number
of months (i.e., three/four out of twelve) with low forecast errors from January to
December 2020. Ridge Regression and ENET, on the other hand, registered the lowest
overall RMSE and MAE with 0.529 and 0.391, respectively (Figure 6.1). These results provide
solid evidence that nowcasting through regularization methods is the most appropriate
approach to nowcast the said monetary indicator using machine learning algorithms.
66
The red-colored cells represent high forecast errors, while yellow- and green-colored cells represent moderate to low forecast errors.
Using machine learning algorithms as a primary nowcasting approach also provides
substantial advantages over traditional time series models such as AR and DFM.
This is because the regularization and tree-based machine learning models can identify the
important indicators and filter out the rest, which yields parsimonious nowcasting models with precise
results. The results of the recursive nowcasts based on LASSO, ENET, Random
Forest, and Gradient Boosted Trees indicate that (1) BSP Liabilities on National Government,
(2) BSP Claims on Other Sectors, (3) Foreign Exchange Rate, and (4) Lagged Values of M3 are
among the critical indicators that should be considered in estimating the growth of domestic
liquidity in the Philippines.
Chapter VII:
RECOMMENDATION
7.1. Potential Actions
Since the results of the recursive nowcasts established the superiority of
different machine learning algorithms in estimating domestic liquidity growth in the Philippines,
this study highly recommends that the departments (e.g., statistics, research departments) under
the Central Bank of the Philippines or Bangko Sentral ng Pilipinas (BSP) adopt and
utilize the concepts of big data and machine learning. Implementing these concepts could support
the objective of the BSP of conducting data-based monetary policy in the country.
Furthermore, the additional data or information that can be gathered by the
different departments in the said institution could further improve the individual and
overall accuracy of each machine learning algorithm used in this study.
Although this improvement cannot be guaranteed, it is generally better to calibrate models using an
immense amount of data or information than to operate with a limited number of indicators.
Among the possible determinants that the BSP could explore and collect over time are
high-frequency (e.g., daily, weekly) unconventional data or information regarding the
credit condition of the Philippine Banking System (PBS) and the overall demand of the general
public to hold or forego money. This is mainly because domestic credit, which is composed of
loans outstanding for production and household consumption, is considered a significant
contributor to the monthly change in domestic liquidity in the Philippines.
The study also recommends a regular and sustainable way of accumulating other
statistics related to the critical indicators identified in this study. This could include
high-frequency data or information regarding (1) debt securities issued by the
National Government (NG) and the BSP, (2) the amount of loans granted by the BSP to
Other Depository Corporations (ODCs), (3) the amount of loans granted by the BSP to
Other Sectors (e.g., Other Financial Corporations), and (4) Nominal Effective Exchange Rate
(NEER) Indices of the Philippine Peso.
7.2. Suggestions for Future Research
As mentioned in the previous chapters, this study has limitations in formulating the
different nowcasting models using time series and machine learning algorithms.
Therefore, the following are suggested to enhance the results and comprehensiveness of this
research:
a. It is recommended to combine the different machine learning algorithms with
low monthly and overall forecast errors. This approach (known as the
ensemble method) produces a single model that combines the strengths of
each algorithm. The studies of Tiffin (2016), Richardson et al. (2018),
Mariano and Ozmucur (2020), and Tamara et al. (2020) have already utilized this
approach.
b. Other robust econometric approaches such as Mixed Data Sampling (MIDAS)
Regression and Mixed Frequency Vector Autoregression (MF-VAR) are
recommended to be part of the benchmark models. These particular methods are
mainly used for models whose target and input variables have a large number of
observations and are observed at different frequencies.
c. Non-parametric machine learning algorithms, such as Neural Networks and
Support Vector Machines (SVM), could also be included as models to
nowcast domestic liquidity in the Philippines.
d. The use of more granular data or information regarding the critical indicators
identified in this study is recommended as part of the input variables under the
machine learning algorithms used in this study. In particular, the daily volume or
amount of (1) BSP Liabilities on NG, (2) BSP Claims on Other Sectors,
and (3) Other Foreign Exchange Rates (e.g., PHP per JPY) could enhance
the results of this research.
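The ensemble approach suggested in item (a) can be sketched in its simplest form as an average of the individual nowcasts. In the illustration below, the monthly nowcasts are hypothetical, while the overall RMSE values are taken from Tables 5.5 and 5.7. Weighting each model by the inverse of its overall RMSE is only one (1) possible choice and is not a method prescribed by the studies cited above.

```python
def ensemble_nowcast(nowcasts, weights=None):
    """Weighted average of model nowcasts; equal weights if none are given."""
    names = list(nowcasts)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    total = sum(weights[name] for name in names)
    return sum(weights[name] * nowcasts[name] for name in names) / total

def inverse_error_weights(errors):
    """Weights proportional to 1 / overall RMSE (one possible weighting choice)."""
    return {name: 1.0 / err for name, err in errors.items()}

# Hypothetical nowcasts for a single month (percent growth); not thesis results.
nowcasts = {"Ridge": 12.0, "LASSO": 12.4, "ENET": 12.2, "RF": 11.8}
# Overall RMSE values taken from Tables 5.5 and 5.7.
overall_rmse = {"Ridge": 0.529, "LASSO": 0.551, "ENET": 0.549, "RF": 0.595}

simple = ensemble_nowcast(nowcasts)
weighted = ensemble_nowcast(nowcasts, inverse_error_weights(overall_rmse))
print(round(simple, 3), round(weighted, 3))
```

A weighted combination of this kind always lies between the lowest and highest individual nowcasts, so it cannot be more extreme than any single model.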
BIBLIOGRAPHY
Adriansson, N., & Mattsson, I. (2015). Forecasting GDP Growth, or How Can Random Forests
Improve Predictions in Economics. Uppsala University - Department of Statistics.
Aguilar, R. A., Mahler, D., & Newhouse, D. (2019). Nowcasting Global Poverty.
IARIW - World Bank.
Baldacci, E., Buono, D., Kapetanios, G., Krische, S., Marcellino, M. M., & Papailias, F. (2016).
Big Data and Macroeconomic Nowcasting: From Data Access to Modelling.
Eurostat Statistical Book.
Banbura, M., Giannone, D., Modugno, M., & Reichlin, L. (2013). Nowcasting and
The Real-Time Data Flow. European Central Bank - Working Paper Series No. 1564.
Bangko Sentral ng Pilipinas (BSP). (2018). Depository Corporations Survey (DCS) -
Frequently Asked Questions. Manila, Philippines: Bangko Sentral ng Pilipinas.
Bangko Sentral ng Pilipinas (BSP). (2020, July). BSP Organization Primer.
https://www.bsp.gov.ph/About%20the%20Bank/BSP%20Org%20Primer.pdf
Biau, O., & D'Elia, A. (2010). Euro Area GDP Forecast Using Large Survey Dataset -
A Random Forest Approach. Euroindicators Working Papers.
Bolhuis, M., & Rayner, B. (2020). Deus ex Machina? A Framework for Macro Forecasting with
Machine Learning. IMF Working Paper.
Carriere-Swallow, Y., & Haksar, V. (2019). The Economics and Implications of Data:
An Integrated Perspective. Washington, D.C., USA: International Monetary Fund
(IMF).
Cepni, O., Guney, E., & Swanson, N. (2018). Forecasting and Nowcasting Emerging Market
GDP Growth Rate: The Role of Latent Global Economic Policy Uncertainty and
Macroeconomic Data Surprise Factors. Journal of Forecasting.
Chan-Lau, J. (2017). Lasso Regression and Forecasting Models in Applied Stress Testing.
IMF Working Paper.
Chikamatsu, K., Hirakata, N., Kido, Y., & Otaka, K. (2018). Nowcasting Japanese GDPs.
Bank of Japan Working Paper Series.
Dafnai, G., & Sidi, J. (2010). Nowcasting Israel GDP Using High-Frequency Macroeconomic
Disaggregates. Bank of Israel Discussion Paper No. 2010.16.
Doguwa, S., & Alade, S. (2015). On Time Series Modeling of Nigeria's External Reserves.
CBN Journal of Applied Statistics.
Fan, J. (2019). Real-Time GDP Nowcasting in New Zealand. Massey University -
School of Natural and Computational Sciences.
Ferrara, L., & Simoni, A. (2019). When are Google Data Useful to Nowcast GDP?
An Approach via Pre-Selection and Shrinkage. EconomiX - Universite Paris Nanterre.
Fornaro, P., Luomaranta, H., & Saarinen, L. (2017). Nowcasting Finnish Turnover Indexes
Using Firm-Level Data. ETLA Working Papers No. 46.
Hang, Q. (2010). Vector Autoregression with Varied Frequency Data.
Munich Personal RePEc Archive.
Hassani, H., & Silva, E. (2015). Forecasting with Big Data: A Review. Ann. Data Science, 5-19.
Hussain, F., Hyder, S., & Rehman, M. (2018). Nowcasting LSM Growth in Pakistan.
SBP Working Paper Series No.98.
Ikoku, A. (2014). Modeling and Forecasting Currency in Circulation for Liquidity Management
in Nigeria. CBN Journal of Applied Statistics.
International Monetary Fund. (2016). Monetary and Financial Statistics Manual and
Compilation Guide. Washington, D.C., USA.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to
Statistical Learning, With Applications in R. Springer.
Mankiw, N. (n.d.). Macroeconomics (9th Edition).
Mapa, D. (2018). Nowcasting Inflation Rate in the Philippines using Mixed-Frequency Models.
University of the Philippines - School of Statistics.
Mariano, R., & Ozmucur, S. (2015). High-Mixed-Frequency Dynamic Latent Factor Forecasting
Models for GDP in the Philippines. Estudios de Economia Aplicada, 451-462.
Mariano, R., & Ozmucur, S. (2020). Predictive Performance of Mixed-Frequency Nowcasting
and Forecasting Models (with Application to Philippine Inflation and GDP Growth).
University of Pennsylvania, Department of Economics.
Medel, C., & Pincheira, P. (2015). Forecasting Inflation with a Simple and Accurate
Benchmark: The Case of the US and a Set of Inflation Targeting Countries.
Czech Journal of Economics and Finance.
Meyler, A., Kenny, G., & Quinn, T. (1998). Forecasting Irish Inflation Using ARIMA Models.
Munich Personal RePEc Archive.
Mishkin, F. (n.d.). The Economics of Money, Banking, and Financial Markets
(11th Edition).
Nau, R. (2014). Notes on ARIMA Models for Time Series Forecasting. Fuqua School of Business,
Duke University.
Pincheira, P., & Medel, C. (2016). Forecasting with a Random Walk. Czech Journal of
Economics and Finance, 539-564.
Rajapov, S., & Axmadjonov, A. (2018). The Forecasting Budget Revenues in ARDL Approach:
A Case of Uzbekistan. International Journal of Innovative Technologies in Economy.
Richardson, A., van Florenstein Mulder, T., & Vehbi, T. (2018). Nowcasting New Zealand GDP using
Machine Learning Algorithms. Irving Fisher Committee on Central Bank Statistics -
Bank for International Settlements.
Rufino, C. (2017). Nowcasting Philippine Economic Growth using MIDAS Regression Modeling.
DLSU Angelo King Institute for Economic and Business Studies.
Soybilgen, B., & Yazgan, E. (2021). Nowcasting US GDP Using Tree-Based Ensemble Models
and Dynamic Factors. Computational Economics, 387-417.
Tamara, N., Muchisha, D., Andriansyah, & Soleh, A. (2020). Nowcasting Indonesia's GDP
Growth Using Machine Learning Algorithms. Munich Personal RePEc Archive (MPRA)
No. 105235.
Tiffin, A. (2016). Seeing in the Dark: A Machine-Learning Approach to Nowcasting in Lebanon.
IMF Working Paper.
Woloszko, N. (2020). Adaptive Trees: A New Approach to Economic Forecasting.
OECD Economics Department Working Papers No. 1593.
Wooldridge, J. M. (2012). Introductory Econometrics: A Modern Approach (5th Edition).
Cengage Learning.
Ziel, F. (2020). Load Nowcasting: Predicting Actuals with Limited Data. Energies.
ANNEX A
R Studio Packages
NO.  PACKAGE       AUTHOR/S           SOURCE URL
1    caret         Kuhn et al.        https://cran.r-project.org/web/packages/caret/vignettes/caret.html
2    dplyr         -                  https://cran.r-project.org/web/packages/dplyr/dplyr.pdf
3    forecast      Hyndman et al.     https://cran.r-project.org/web/packages/forecast/forecast.pdf
4    gbm           Greenwell et al.   https://cran.r-project.org/web/packages/gbm/gbm.pdf
5    ggplot2       Wickham et al.     https://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf
6    glmnet        Friedman et al.    https://cran.r-project.org/web/packages/glmnet/glmnet.pdf
7    hrbrthemes    Rudis et al.       https://cran.r-project.org/web/packages/hrbrthemes/hrbrthemes.pdf
8    leaps         Lumley, T.         https://cran.r-project.org/web/packages/leaps/leaps.pdf
9    lubridate     Spinu et al.       https://cran.r-project.org/web/packages/lubridate/lubridate.pdf
10   maptree       White and Gramacy  https://cran.r-project.org/web/packages/maptree/maptree.pdf
11   Metrics       Hamner et al.      https://cran.r-project.org/web/packages/Metrics/Metrics.pdf
12   mFilter       Balcilar, M.       https://cran.r-project.org/web/packages/mFilter/mFilter.pdf
13   pls           Mevik et al.       https://cran.r-project.org/web/packages/pls/pls.pdf
14   psych         Revelle, W.        https://cran.r-project.org/web/packages/psych/psych.pdf
15   randomForest  Breiman et al.     https://cran.r-project.org/web/packages/randomForest/randomForest.pdf
16   repr          Angerer, P.        https://cran.r-project.org/web/packages/repr/repr.pdf
17   tidyverse     Wickham, H.        https://cran.r-project.org/web/packages/tidyverse/tidyverse.pdf
18   tree          Ripley, B.         https://cran.r-project.org/web/packages/tree/tree.pdf
19   tsDyn         Di Narzo et al.    https://cran.r-project.org/web/packages/tsDyn/tsDyn.pdf
20   tseries       Trapletti et al.   https://cran.r-project.org/web/packages/tseries/tseries.pdf
21   TSstudio      Krispin, R.        https://cran.r-project.org/web/packages/TSstudio/TSstudio.pdf
22   urca          Pfaff et al.       https://cran.r-project.org/web/packages/urca/urca.pdf
23   vars          Pfaff and Stigler  https://cran.r-project.org/web/packages/vars/vars.pdf
24   xgboost       Chen et al.        https://cran.r-project.org/web/packages/xgboost/xgboost.pdf
ANNEX B
Unit Root Tests for Input Variables
VARIABLE                      TEST  LEVEL OF SIGNIF.  P-VALUE (LEVEL/GROWTH/LOG)  P-VALUE (FIRST DIFF.)
BSP Liabilities on NG         ADF   0.05              0.01                        0.01
BSP Liabilities on NG         PP    0.05              0.01                        0.01
BSP Claims on Other Sectors   ADF   0.05              0.80                        0.01
BSP Claims on Other Sectors   PP    0.05              0.79                        0.01
FPI (In)                      ADF   0.05              0.32                        0.01
FPI (In)                      PP    0.05              0.01                        0.01
FPI (Out)                     ADF   0.05              0.17                        0.01
FPI (Out)                     PP    0.05              0.01                        0.01
Available Reserves            ADF   0.05              0.99                        0.01
Available Reserves            PP    0.05              0.97                        0.01
Reserve Money                 ADF   0.05              0.99                        0.01
Reserve Money                 PP    0.05              0.98                        0.01
CBOE Volatility Index         ADF   0.05              0.07                        0.01
CBOE Volatility Index         PP    0.05              0.01                        0.01
Credit Default Swap           ADF   0.05              0.22                        0.01
Credit Default Swap           PP    0.05              0.05                        0.01
LIBOR                         ADF   0.05              0.26                        0.01
LIBOR                         PP    0.05              0.34                        0.01
SIBOR                         ADF   0.05              0.73                        0.01
SIBOR                         PP    0.05              0.66                        0.01
PHIREF                        ADF   0.05              0.22                        0.01
PHIREF                        PP    0.05              0.01                        0.01
Phil. Government Bond Rate    ADF   0.05              0.34                        0.01
Phil. Government Bond Rate    PP    0.05              0.66                        0.01
BSP Discount Rate             ADF   0.05              0.16                        0.01
BSP Discount Rate             PP    0.05              0.28                        0.01
Bank Savings Rate             ADF   0.05              0.28                        0.01
Bank Savings Rate             PP    0.05              0.97                        0.01
Bank Prime Rate               ADF   0.05              0.92                        0.01
Bank Prime Rate               PP    0.05              0.93                        0.01
Money Market Rate (P. Note)   ADF   0.05              0.10                        0.01
Money Market Rate (P. Note)   PP    0.05              0.01                        0.01
Treasury Bill Rate            ADF   0.05              0.60                        0.01
Treasury Bill Rate            PP    0.05              0.67                        0.01
Interbank Call Rate           ADF   0.05              0.56                        0.01
Interbank Call Rate           PP    0.05              0.88                        0.01
PHP per USD (FOREX)           ADF   0.05              0.77                        0.01
PHP per USD (FOREX)           PP    0.05              0.82                        0.01
WMOR                          ADF   0.05              0.48                        0.01
WMOR                          PP    0.05              0.87                        0.01
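The p-values in Annex B can be read with a simple decision rule: at the 5 percent level of significance, the unit-root null is rejected when the p-value falls below 0.05. The short Python sketch below applies that rule to three variables from the annex. It is an illustration only: the tests themselves were run with the R packages listed in Annex A, and the function names and data structure here are hypothetical. The p-values are copied from Annex B.

```python
# Illustrative sketch only; function and variable names are hypothetical.
ALPHA = 0.05  # level of significance reported in Annex B

# (variable, test) -> (p-value in level/growth/log form, p-value in first difference)
P_VALUES = {
    ("BSP Liabilities on NG", "ADF"): (0.01, 0.01),
    ("BSP Liabilities on NG", "PP"): (0.01, 0.01),
    ("FPI (In)", "ADF"): (0.32, 0.01),
    ("FPI (In)", "PP"): (0.01, 0.01),
    ("Reserve Money", "ADF"): (0.99, 0.01),
    ("Reserve Money", "PP"): (0.98, 0.01),
}


def rejects_unit_root(p_value, alpha=ALPHA):
    """Return True when the unit-root null is rejected at level alpha."""
    return p_value < alpha


def summarize(p_values):
    """Map each (variable, test) pair to the first form in which the null is rejected."""
    summary = {}
    for key, (p_level, p_diff) in p_values.items():
        if rejects_unit_root(p_level):
            summary[key] = "stationary as reported"
        elif rejects_unit_root(p_diff):
            summary[key] = "stationary after first differencing"
        else:
            summary[key] = "unit root not rejected"
    return summary


if __name__ == "__main__":
    for (variable, test), verdict in summarize(P_VALUES).items():
        print(f"{variable:<24} {test:<4} {verdict}")
```

Under this rule the two tests can disagree for the same series, as they do for FPI (In), where only the PP test rejects the null in the reported form.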
ANNEX C
Optimal Shrinkage Penalty via Ridge Regression
MONTH            OPTIMAL SHRINKAGE PENALTY
January 2020     0.772
February 2020    1.012
March 2020       0.577
April 2020       0.700
May 2020         0.691
June 2020        0.523
July 2020        0.589
August 2020      0.491
September 2020   0.411
October 2020     0.415
November 2020    0.313
December 2020    0.600
ANNEX D
Optimal Shrinkage Penalty via LASSO
MONTH            OPTIMAL SHRINKAGE PENALTY
January 2020     0.737
February 2020    0.073
March 2020       0.060
April 2020       0.080
May 2020         0.060
June 2020        0.060
July 2020        0.068
August 2020      0.051
September 2020   0.047
October 2020     0.048
November 2020    0.052
December 2020    0.069
ANNEX E
Optimal Shrinkage Penalty via ENET
MONTH            OPTIMAL SHRINKAGE PENALTY
January 2020     0.147
February 2020    0.146
March 2020       0.091
April 2020       0.147
May 2020         0.110
June 2020        0.110
July 2020        0.112
August 2020      0.103
September 2020   0.095
October 2020     0.095
November 2020    0.087
December 2020    0.126
ANNEX F
OOB Error of Training Datasets via Random Forest
MONTH            NO. OF VARIABLES   OOB ERROR
January 2020     5                  1.018
February 2020    10                 1.014
March 2020       7                  1.026
April 2020       10                 1.018
May 2020         10                 1.028
June 2020        7                  1.024
July 2020        7                  1.019
August 2020      5                  1.025
September 2020   5                  1.007
October 2020     5                  1.004
November 2020    5                  0.996
December 2020    5                  0.982
ANNEX G
Optimal Number of Trees via Gradient Boosted Trees
MONTH            OPTIMAL NUMBER OF TREES (ITERATIONS)
January 2020     67
February 2020    15
March 2020       8
April 2020       10
May 2020         2
June 2020        4
July 2020        13
August 2020      10
September 2020   22
October 2020     28
November 2020    17
December 2020    7
ANNEX H
Variable Coefficients via LASSO: January to December 2020
NO.  VARIABLE                                1/2020   2/2020   3/2020   4/2020   5/2020   6/2020   7/2020   8/2020   9/2020   10/2020  11/2020  12/2020
-    Intercept                               0.016    0.015    0.010    0.020    0.020    0.021    0.022    0.017    0.017    0.013    0.016    0.020
1    M3 Growth (T-1)                         -        -        -        -        -        -        -        -        -        -        -        -
2    BSP Liabilities on National Government  -0.015   -0.015   -0.017   -0.014   -0.017   -0.017   -0.016   -0.018   -0.018   -0.018   -0.017   -0.015
3    BSP Claims on Other Sectors             0.235    0.235    0.257    0.226    0.265    0.265    0.255    0.284    0.291    0.294    0.284    0.254
4    Foreign Portfolio Investment (In)       -0.003   -0.004   -0.042   -0.003   -0.050   -0.047   -0.018   -0.064   -0.070   -0.063   -0.026   -
5    Foreign Portfolio Investment (Out)      -        -        -        -        -        -        -        -        -        -        -        -
6    Available Reserves                      -        -        -        -        -        -        -        -        -        -        -        -
7    Reserve Money                           -        -        -        -        -        -        -        -        -        -        -        -
8    CBOE Volatility Index                   -        -        -        -        -        -        -        -        -        -        -        -
9    Credit Default Swap                     -        -        -        -        -        -        -        -        -        -        -        -
10   London Interbank Offered Rate           0.111    0.114    0.203    0.013    0.116    0.115    0.052    0.182    0.219    0.220    0.184    0.043
11   Singapore Interbank Offered Rate        -        -        -        -        -        -        -        -        -0.013   -        -        -
12   Philippine Interbank Reference Rate     -        -        -        -        -        -        -        -        -        -        -        -
13   Philippine Government Bond Rate         -        -        -        -        -        -        -        -        -        -        -        -
14   BSP Discount Rate                       -        -        0.039    -        0.023    0.020    -        0.086    0.108    0.102    0.064    -
15   Bank Savings Rate                       -0.103   -0.110   -0.396   -        -        -        -        -0.178   -0.243   -0.247   -0.157   -
16   Bank Prime Rate                         -        -        -        -        -        -        -        -        -        -        -        -
17   Money Market Rate (Promissory Note)     -        -        -        -        -        -        -        -        -        -        -        -
18   Treasury Bill Rate                      -        -        -        -        -        -        -        -        -        -        -        -
19   Interbank Call Rate                     -        -        -        -        -0.062   -0.061   -0.036   -0.050   -0.049   -0.040   -0.038   -0.024
20   Philippine Peso per US Dollar (FOREX)   0.124    0.124    0.149    0.106    0.134    0.133    0.121    0.155    0.160    0.158    0.147    0.110
21   Weighted Monetary Operations Rate       -        -        -        -0.052   -0.844   -0.817   -0.645   -0.935   -1.030   -1.019   -0.920   -0.557
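Many entries in Annex H are shown as "-". Assuming these mark coefficients that LASSO shrank exactly to zero (the annex does not state this explicitly), the mechanism can be illustrated with the soft-thresholding operator, which gives the LASSO solution in closed form for a single standardized predictor. The Python sketch below is not the estimation code behind the annex, which relied on the R packages listed in Annex A. The function name, the penalty, and the input values are hypothetical.

```python
# Illustrative sketch only; names and numbers are hypothetical.
def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    # Any unpenalized estimate within [-lam, lam] is set exactly to zero.
    return 0.0


if __name__ == "__main__":
    lam = 0.06  # hypothetical shrinkage penalty
    for z in (0.30, 0.05, -0.02, -0.25):
        print(f"unpenalized = {z:+.2f} -> LASSO = {soft_threshold(z, lam):+.3f}")
```

A larger penalty widens the band in which estimates are set to zero, which would be consistent with more variables dropping out of the model as the penalty rises.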
ANNEX I
Variable Coefficients via ENET: January to December 2020
NO.  VARIABLE                                1/2020   2/2020   3/2020   4/2020   5/2020   6/2020   7/2020   8/2020   9/2020   10/2020  11/2020  12/2020
-    Intercept                               0.016    0.015    0.007    0.019    0.019    0.020    0.020    0.017    0.017    0.014    0.014    0.019
1    M3 Growth (T-1)                         -        -        -        -        -        -        -        -        -        -        -        -
2    BSP Liabilities on National Government  -0.014   -0.014   -0.017   -0.014   -0.016   -0.016   -0.016   -0.017   -0.017   -0.017   -0.017   -0.015
3    BSP Claims on Other Sectors             0.216    0.216    0.268    0.218    0.257    0.257    0.257    0.267    0.274    0.277    0.283    0.246
4    Foreign Portfolio Investment (In)       -0.010   -0.010   -0.086   -0.026   -0.068   -0.065   -0.053   -0.067   -0.072   -0.065   -0.056   -0.001
5    Foreign Portfolio Investment (Out)      -        -        -        -        -        -        -        -        -        -        -        -
6    Available Reserves                      -        -        -        -        -        -        -        -        -        -        -        -
7    Reserve Money                           -        -        -        -        -        -        -        -        -        -        -        -
8    CBOE Volatility Index                   -        -        -        -        -        -        -        -        -        -        -        -
9    Credit Default Swap                     -        -        -        -        -        -        -        -        -        -        -        -
10   London Interbank Offered Rate           0.097    0.100    0.301    0.054    0.142    0.141    0.127    0.161    0.201    0.199    0.249    0.074
11   Singapore Interbank Offered Rate        -        -        -        -        -        -        -        -        -0.033   -0.007   -0.053   -
12   Philippine Interbank Reference Rate     -        -        -        -        -        -        -        -        -        -        -        -
13   Philippine Government Bond Rate         -        -        -        -        -        -        -        -        -        -        -        -
14   BSP Discount Rate                       -        -        0.142    -        0.053    0.050    0.041    0.074    0.094    0.089    0.115    -
15   Bank Savings Rate                       -0.080   -0.087   -0.617   -        -0.079   -0.082   -0.065   -0.164   -0.229   -0.231   -0.309   -
16   Bank Prime Rate                         -        -        -        -        -        -        -        -        -        -        -        -
17   Money Market Rate (Promissory Note)     -        -        -        -        -        -        -        -        -        -        -        -
18   Treasury Bill Rate                      -        -        -        -        -        -        -        -        -        -        -        -
19   Interbank Call Rate                     -        -        -0.015   -0.012   -0.082   -0.081   -0.075   -0.070   -0.069   -0.061   -0.061   -0.056
20   Philippine Peso per US Dollar (FOREX)   0.111    0.119    0.177    0.115    0.142    0.141    0.139    0.151    0.156    0.153    0.162    0.119
21   Weighted Monetary Operations Rate       -        -        -0.285   -0.151   -0.877   -0.851   -0.795   -0.847   -0.936   -0.929   -1.012   -0.590