finRes is home to a number of dataset packages. Each is self-contained and useful on its own, but the datasets they host also play important roles across the finRes suite, mostly in data collection, storage and wrangling, but also in analytics, asset pricing in particular. At the time of writing, the dataset packages in finRes are BBGsymbols, fewISOs, GICS, FFresearch and factors.
The BBGsymbols package plays a critical role in finRes where it provides both the pullit and the storethat packages with support for interacting with Bloomberg through the Rblpapi interface (Armstrong, Eddelbuettel, and Laing 2021).
library(BBGsymbols)
data(list = c("fields", "months", "rolls", "tickers_cftc", "tickers_futures"), package = "BBGsymbols")
The fields dataset is the workhorse in BBGsymbols; it gathers Bloomberg data fields that have been carefully selected over time through experience. It provides popular historical and contemporaneous data fields that should cover, and often exceed, the information needed for rigorous research or more applied work in finance and financial economics. Financial instruments covered at the time of writing include ‘equity’, referring to any equity-like security, ‘fund’, encompassing any money-managing entity, and ‘futures’, covering the futures markets. The author welcomes pull requests that could help expand the current coverage.
#> # A tibble: 961 x 10
#> instrument book type subtype section subsection name id symbol
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 equity key … adju… <NA> <NA> <NA> mark… RR250 HISTO…
#> 2 equity key … adju… <NA> <NA> <NA> ente… RR472 ENTER…
#> 3 equity key … adju… <NA> <NA> <NA> adju… IS010 SALES…
#> 4 equity key … adju… <NA> <NA> <NA> adju… RR861 GROSS…
#> 5 equity key … adju… <NA> <NA> <NA> adju… RR009 EBITDA
#> 6 equity key … adju… <NA> <NA> <NA> adju… RR530 EARN_…
#> 7 equity key … adju… <NA> <NA> <NA> adju… IS147 IS_DI…
#> 8 equity key … adju… <NA> <NA> <NA> cash… CF015 CF_CA…
#> 9 equity key … adju… <NA> <NA> <NA> capi… RR014 CAPIT…
#> 10 equity key … adju… <NA> <NA> <NA> free… RR008 CF_FR…
#> # … with 951 more rows, and 1 more variable: description <chr>
The months dataset details the symbols used to refer to calendar months in Bloomberg and in financial markets in general. It is particularly useful when working with financial derivatives such as futures contracts.
#> name symbol
#> 1: January F
#> 2: February G
#> 3: March H
#> 4: April J
#> 5: May K
#> 6: June M
#> 7: July N
#> 8: August Q
#> 9: September U
#> 10: October V
#> 11: November X
#> 12: December Z
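As a minimal sketch of how these symbols are typically used, the snippet below builds a single-contract futures ticker from a delivery month and year. The helper `contract_ticker()` is hypothetical and not part of BBGsymbols. The ticker layout (root, month symbol, last digit of the year, market sector) and the space-padded corn root `"C "` are assumptions to be checked against Bloomberg before use.

```r
# Month symbols copied from the months dataset printed above.
months <- data.frame(
  name   = month.name,
  symbol = c("F", "G", "H", "J", "K", "M", "N", "Q", "U", "V", "X", "Z"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build a contract ticker from a root symbol, a delivery
# month name, a delivery year and a market sector.
# Assumed layout: <root><month symbol><last digit of year> <sector>.
contract_ticker <- function(root, month, year, sector = "Comdty") {
  symbol <- months$symbol[match(month, months$name)]
  if (anyNA(symbol)) stop("unknown month name: ", month)
  paste0(root, symbol, year %% 10, " ", sector)
}

contract_ticker("C ", "December", 2021)
#> [1] "C Z1 Comdty"
```

Looking the symbol up by month name rather than by position keeps the helper readable and makes a misspelled month fail loudly.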
The rolls dataset details the symbols used to refer to the various roll types and adjustments available in Bloomberg when working with futures term structure contracts. These symbols can be used to construct bespoke tickers that allow the user to query Bloomberg for futures term structure data with the desired roll characteristics.
#> roll symbol name
#> 1: type A With active future
#> 2: type B Bloomberg default
#> 3: type D At first delivery
#> 4: type F Fixed day of month
#> 5: type N Relative to first notice
#> 6: type O At option expiration
#> 7: type R Relative to expiration
#> 8: adjustment D Difference
#> 9: adjustment N None
#> 10: adjustment R Ratio
#> 11: adjustment W Average
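The sketch below illustrates how such a bespoke ticker might be assembled from the roll symbols. Both helpers are hypothetical and not part of BBGsymbols or pullit. The ticker layout is an assumption that the text above does not confirm, so it should be verified against Bloomberg's own documentation before use.

```r
# Roll symbols copied from the rolls dataset printed above.
rolls <- data.frame(
  roll   = c(rep("type", 7), rep("adjustment", 4)),
  symbol = c("A", "B", "D", "F", "N", "O", "R", "D", "N", "R", "W"),
  name   = c("With active future", "Bloomberg default", "At first delivery",
             "Fixed day of month", "Relative to first notice",
             "At option expiration", "Relative to expiration",
             "Difference", "None", "Ratio", "Average"),
  stringsAsFactors = FALSE
)

# Look up the symbol for a roll type or adjustment by its descriptive name.
roll_symbol <- function(kind, name) {
  symbol <- rolls$symbol[rolls$roll == kind & rolls$name == name]
  if (length(symbol) != 1L) stop("unknown ", kind, ": ", name)
  symbol
}

# Hypothetical helper. Assumed layout (not confirmed by the text above):
# <root><position> <type>:<days>_<months>_<adjustment> <sector>
roll_ticker <- function(root, position, type, adjustment,
                        days = 0L, months = 0L, sector = "Comdty") {
  sprintf("%s%d %s:%02d_%d_%s %s", root, position,
          roll_symbol("type", type), days, months,
          roll_symbol("adjustment", adjustment), sector)
}

roll_ticker("C ", 1L, "Relative to expiration", "Ratio", days = 5L)
#> [1] "C 1 R:05_0_R Comdty"
```

Note that the symbols D, N and R each appear twice in the dataset, once as a roll type and once as an adjustment, which is why the lookup filters on the roll column as well as on the name.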
The tickers_cftc dataset gathers Bloomberg position data tickers for a number of futures series. These tickers allow direct retrieval from Bloomberg, via pullit, of the corresponding position data as reported by the US Commodity Futures Trading Commission (CFTC) in a collection of weekly market reports including the ‘legacy’, ‘disaggregated’, ‘supplemental’ and ‘traders in financial futures’ (TFF) reports. See ?tickers_cftc for details.
#> name asset class
#> 1: California Carbon Allowance Vintage 2014 climate
#> 2: California Carbon Allowance Vintage 2014 climate
#> 3: California Carbon Allowance Vintage 2014 climate
#> 4: California Carbon Allowance Vintage 2014 climate
#> 5: California Carbon Allowance Vintage 2014 climate
#> ---
#> 22215: LIBOR rate - 1-month fixed income
#> 22216: LIBOR rate - 1-month fixed income
#> 22217: LIBOR rate - 1-month fixed income
#> 22218: LIBOR rate - 1-month fixed income
#> 22219: LIBOR rate - 1-month fixed income
#> active contract ticker MIC format underlying unit
#> 1: <NA> IFUS disaggregated futures & options contracts
#> 2: <NA> IFUS disaggregated futures & options contracts
#> 3: <NA> IFUS disaggregated futures & options contracts
#> 4: <NA> IFUS disaggregated futures & options contracts
#> 5: <NA> IFUS disaggregated futures & options contracts
#> ---
#> 22215: EMA Comdty XCME legacy futures only traders
#> 22216: EMA Comdty XCME legacy futures only traders
#> 22217: EMA Comdty XCME legacy futures only traders
#> 22218: EMA Comdty XCME legacy futures only traders
#> 22219: EMA Comdty XCME legacy futures only traders
#> participant position ticker
#> 1: managed money long CC21DMML Index
#> 2: managed money net CC21DMMN Index
#> 3: managed money short CC21DMMS Index
#> 4: managed money spreading CC21DMMD Index
#> 5: other reportables long CC21DORL Index
#> ---
#> 22215: non-commercial short IMM11TNS Index
#> 22216: non-commercial spreading IMM11TNP Index
#> 22217: total long IMM11TTL Index
#> 22218: total short IMM11TTS Index
#> 22219: total total IMM11TTO Index
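As a small illustration of how the dataset can be queried, the sketch below rebuilds the first five rows printed above as a plain data frame (a subset of the columns only) and selects the ticker for one participant and position combination. With the package loaded, the same filter would presumably apply to tickers_cftc itself.

```r
# First five rows of tickers_cftc as printed above (selected columns only).
cftc <- data.frame(
  name        = "California Carbon Allowance Vintage 2014",
  format      = "disaggregated",
  underlying  = "futures & options",
  unit        = "contracts",
  participant = c(rep("managed money", 4), "other reportables"),
  position    = c("long", "net", "short", "spreading", "long"),
  ticker      = c("CC21DMML Index", "CC21DMMN Index", "CC21DMMS Index",
                  "CC21DMMD Index", "CC21DORL Index"),
  stringsAsFactors = FALSE
)

# Pick the ticker for the net managed-money position.
net_mm <- subset(cftc, participant == "managed money" & position == "net")
net_mm$ticker
#> [1] "CC21DMMN Index"
```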
The tickers_futures dataset gathers Bloomberg active contract tickers, together with a collection of qualitative information, for several popular futures series including commodity, currency, financial and index futures with underlyings from various asset classes.
#> # A tibble: 229 x 15
#> ticker name `asset class` sector subsector currency MIC `term structure…
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 BSDA … Carb… climate <NA> <NA> USD IFED 50
#> 2 V6A C… Butt… commodity agric… dairy USD XCME 24
#> 3 CHEA … Chee… commodity agric… dairy USD XCME 25
#> 4 DAA C… Milk… commodity agric… dairy USD XCME 24
#> 5 KVA C… Milk… commodity agric… dairy USD XCME 23
#> 6 LEA C… Milk… commodity agric… dairy USD XCME 24
#> 7 WBSA … Whea… commodity agric… financial USD XCBT 12
#> 8 QKA C… Whea… commodity agric… grains GBP IFLX 10
#> 9 RSA C… Cano… commodity agric… grains CAD IFUS 11
#> 10 C A C… Corn… commodity agric… grains USD XCBT 17
#> # … with 219 more rows, and 7 more variables: `contract size` <int>, `trading
#> # unit` <chr>, `point value` <dbl>, `tick size` <dbl>, `tick value` <dbl>,
#> # FIGI <chr>, description <chr>
fewISOs provides a collection of ISO code datasets related to financial economics, conveniently packaged for consumption in R. Beyond their standalone value, these datasets belong to finRes, where they help with data wrangling and exploration. At the time of writing, fewISOs hosts the countries, currencies and exchanges datasets.
The countries dataset corresponds to the ISO 3166-1 sub-standard, part of the ISO 3166 standard published by the International Organization for Standardization (ISO) that defines codes for the names of countries, dependent territories, special areas of geographical interest, and their principal subdivisions (e.g., provinces or states). The sub-standard comes in three sets of country codes, all provided in the dataset:
#> name alpha 2 alpha 3 numeric capital
#> 1: Afghanistan AF AFG 004 Kabul
#> 2: Åland Islands AX ALA 248 Mariehamn
#> 3: Albania AL ALB 008 Tirana
#> 4: Algeria DZ DZA 012 Algiers
#> 5: American Samoa AS ASM 016 Pago Pago
#> ---
#> 246: Wallis and Futuna Islands WF WLF 876 Mata Utu
#> 247: Western Sahara EH ESH 732 El-Aaiun
#> 248: Yemen YE YEM 887 Sanaa
#> 249: Zambia ZM ZMB 894 Lusaka
#> 250: Zimbabwe ZW ZWE 716 Harare
The currencies dataset corresponds to the ISO 4217 standard that defines codes for worldwide currencies. Each currency has a three-letter alphabetic code and an alternative three-digit numeric code, both provided in the dataset. The ISO 4217 three-letter alphabetic code is based on the ISO 3166-1 code standard for countries: when possible, the first two letters correspond to the ISO 3166-1 alpha-2 code of the country issuing the currency and the third to the first letter of the currency name. The three-digit numeric code is the same as the ISO 3166-1 numeric code for the issuing country when possible.
#> name alphabetic numeric minor unit country
#> 1: UAE Dirham AED 784 2 AE
#> 2: Afghani AFN 971 2 AF
#> 3: Lek ALL 008 2 AL
#> 4: Armenian Dram AMD 051 2 AM
#> 5: Netherlands Antillean Guilder ANG 532 2 CW
#> ---
#> 151: CFP Franc XPF 953 0 PF
#> 152: Yemeni Rial YER 886 2 YE
#> 153: Rand ZAR 710 2 LS
#> 154: Zambian Kwacha ZMW 967 2 ZM
#> 155: Zimbabwe Dollar ZWL 932 2 ZW
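The naming rule and its "when possible" caveat can be checked on the ten rows printed above. The sketch below rebuilds those rows as a plain data frame (three columns only) and flags the currencies whose first two letters do not match the country code listed against them.

```r
# Rows copied from the currencies dataset printed above.
currencies <- data.frame(
  name       = c("UAE Dirham", "Afghani", "Lek", "Armenian Dram",
                 "Netherlands Antillean Guilder", "CFP Franc", "Yemeni Rial",
                 "Rand", "Zambian Kwacha", "Zimbabwe Dollar"),
  alphabetic = c("AED", "AFN", "ALL", "AMD", "ANG",
                 "XPF", "YER", "ZAR", "ZMW", "ZWL"),
  country    = c("AE", "AF", "AL", "AM", "CW",
                 "PF", "YE", "LS", "ZM", "ZW"),
  stringsAsFactors = FALSE
)

# TRUE when the first two letters of the currency code equal the country code.
currencies$follows_rule <-
  substr(currencies$alphabetic, 1, 2) == currencies$country

# Rows where the rule does not hold.
currencies[!currencies$follows_rule, c("name", "alphabetic", "country")]
```

In this sample seven of the ten rows follow the rule. The three exceptions (ANG, XPF and ZAR) are listed against a country whose alpha-2 code differs from the prefix of the currency code.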
The exchanges dataset corresponds to the ISO 10383 standard that defines four-character alphanumeric Market Identifier Codes (MICs). These are unique identification codes used to identify securities trading exchanges, trading platforms and regulated or non-regulated markets as sources of prices and related information in order to facilitate automated processing.
#> name MIC country city
#> 1: zobex ZOBX DE Berlin
#> 2: zurcher kantonalbank securities exchange ZKBX CH Zurich
#> 3: jse currency derivatives market ZFXM ZA Johannesburg
#> 4: zar x ZARX ZA Johannesburg
#> 5: zagreb stock exchange - apa ZAPA HR Zagreb
#> ---
#> 1665: athens exchange - apa AAPA GR Athens
#> 1666: credit agricole cib - systematic internaliser AACA FR Paris
#> 1667: a2x A2XX ZA Johannesburg
#> 1668: 360t 360T DE Frankfurt
#> 1669: ssy futures ltd - freight screen 3579 GB London
#> website
#> 1: www.boerse-berlin.de
#> 2: www.zkb.ch
#> 3: www.jse.co.za
#> 4: www.zarx.co.za
#> 5: www.zse.hr
#> ---
#> 1665: www.athexgroup.gr
#> 1666: www.ca-cib.com
#> 1667: www.a2x.co.za
#> 1668: www.360t.com
#> 1669: www.ssyonline.com
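A quick format check can catch malformed codes before they are joined against the dataset. The validator below is a hypothetical helper, not part of fewISOs; it assumes codes are written in upper case, as they are in the output above.

```r
# Hypothetical validator: exactly four characters, each an upper-case ASCII
# letter or a digit. perl = TRUE keeps the character classes locale-independent.
is_mic <- function(x) grepl("^[A-Z0-9]{4}$", x, perl = TRUE)

is_mic(c("ZOBX", "360T", "3579", "XCME", "zobx", "XCMEX"))
#> [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE
```

This only tests the shape of a code; whether a well-formed code is actually registered still has to be checked against the MIC column.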
GICS packages the Global Industry Classification Standard (GICS) dataset for consumption in R. Beyond its standalone value, GICS belongs to finRes where, along with BBGsymbols and fewISOs, it helps with data wrangling and exploration.
The GICS is a standardized classification system for equities developed jointly by Morgan Stanley Capital International (MSCI) and Standard & Poor’s. The GICS methodology is used by the MSCI indexes, which include domestic and international stocks, as well as by a large portion of the professional investment management community. The GICS hierarchy begins with 11 sectors, followed by 24 industry groups, 68 industries, and 157 sub-industries. Each classified stock is coded at all four levels, and all four are provided in the standards dataset.
#> # A tibble: 157 x 9
#> `sector id` `sector name` `industry group… `industry group… `industry id`
#> <dbl> <chr> <dbl> <chr> <dbl>
#> 1 10 Energy 1010 Energy 101010
#> 2 10 Energy 1010 Energy 101010
#> 3 10 Energy 1010 Energy 101020
#> 4 10 Energy 1010 Energy 101020
#> 5 10 Energy 1010 Energy 101020
#> 6 10 Energy 1010 Energy 101020
#> 7 10 Energy 1010 Energy 101020
#> 8 15 Materials 1510 Materials 151010
#> 9 15 Materials 1510 Materials 151010
#> 10 15 Materials 1510 Materials 151010
#> # … with 147 more rows, and 4 more variables: `industry name` <chr>,
#> # `subindustry id` <dbl>, `subindustry name` <chr>, description <chr>
FFresearch conveniently packages Fama/French asset pricing research data for consumption in R. The data is pulled directly from Kenneth French’s online data library.
library(FFresearch)
data(list = c("factors", "portfolios_univariate", "portfolios_bivariate", "portfolios_trivariate",
"portfolios_industries", "variables", "breakpoints"), package = "FFresearch")
The portfolios_univariate dataset provides various feature time series for Fama/French portfolios formed on single-variable sorts. Sorting variables include size, book-to-market, operating profitability and investment.
#> region frequency sort variable dividend weights portfolio
#> 1: US day market capitalization Y value Dec 2
#> 2: US day market capitalization Y value Dec 2
#> 3: US day market capitalization Y value Dec 2
#> 4: US day market capitalization Y value Dec 2
#> 5: US day market capitalization Y value Dec 2
#> ---
#> 2212298: US year residual variance Y equal Qnt 4
#> 2212299: US year residual variance Y equal Qnt 4
#> 2212300: US year residual variance Y equal Qnt 4
#> 2212301: US year residual variance Y equal Qnt 4
#> 2212302: US year residual variance Y equal Qnt 4
#> field period value
#> 1: return 19710104 -0.29
#> 2: return 19710105 1.65
#> 3: return 19710106 1.37
#> 4: return 19710107 0.11
#> 5: return 19710108 -0.19
#> ---
#> 2212298: return 2016 19.61
#> 2212299: return 2017 17.84
#> 2212300: return 2018 -12.37
#> 2212301: return 2019 19.54
#> 2212302: return 2020 37.81
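Judging from the output above, the period column mixes layouts depending on the frequency: eight digits for days, six for months (as in the monthly datasets further down) and four for years. The sketch below converts such codes to dates. `period_to_date()` is a hypothetical helper, not part of FFresearch, and mapping months and years to the first day of the period is an arbitrary choice.

```r
# Hypothetical helper: convert period codes to Date, inferring the layout from
# the number of digits (YYYY, YYYYMM or YYYYMMDD).
period_to_date <- function(period) {
  code <- as.character(period)
  padded <- ifelse(nchar(code) == 4L, paste0(code, "0101"),
            ifelse(nchar(code) == 6L, paste0(code, "01"), code))
  as.Date(padded, format = "%Y%m%d")
}

period_to_date(c(19710104, 197101, 2020))
#> [1] "1971-01-04" "1971-01-01" "2020-01-01"
```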
The portfolios_bivariate dataset provides various feature time series for Fama/French portfolios formed on two-variable sorts. Sorting variables include size, book-to-market, operating profitability and investment.
#> region frequency sort variable 1 sort variable 2 dividend
#> 1: US day market capitalization book/market Y
#> 2: US day market capitalization book/market Y
#> 3: US day market capitalization book/market Y
#> 4: US day market capitalization book/market Y
#> 5: US day market capitalization book/market Y
#> ---
#> 6262922: North America year market capitalization momentum Y
#> 6262923: North America year market capitalization momentum Y
#> 6262924: North America year market capitalization momentum Y
#> 6262925: North America year market capitalization momentum Y
#> 6262926: North America year market capitalization momentum Y
#> weights portfolio field period value
#> 1: value BIG HiBM return 20110103 4.81
#> 2: value BIG HiBM return 20110104 0.16
#> 3: value BIG HiBM return 20110105 1.80
#> 4: value BIG HiBM return 20110106 -0.40
#> 5: value BIG HiBM return 20110107 -0.71
#> ---
#> 6262922: equal SMALL LoPRIOR return 2016 33.83
#> 6262923: equal SMALL LoPRIOR return 2017 27.42
#> 6262924: equal SMALL LoPRIOR return 2018 -27.13
#> 6262925: equal SMALL LoPRIOR return 2019 12.85
#> 6262926: equal SMALL LoPRIOR return 2020 57.99
The portfolios_trivariate dataset provides various feature time series for Fama/French portfolios formed on three-variable sorts. Sorting variables include size, book-to-market, operating profitability and investment.
#> region frequency sort variable 1 sort variable 2
#> 1: US month market capitalization book/market
#> 2: US month market capitalization book/market
#> 3: US month market capitalization book/market
#> 4: US month market capitalization book/market
#> 5: US month market capitalization book/market
#> ---
#> 1165820: North America year market capitalization operating profitability
#> 1165821: North America year market capitalization operating profitability
#> 1165822: North America year market capitalization operating profitability
#> 1165823: North America year market capitalization operating profitability
#> 1165824: North America year market capitalization operating profitability
#> sort variable 3 dividend weights portfolio field
#> 1: operating profitability Y value BIG HiBM.HiOP return
#> 2: operating profitability Y value BIG HiBM.HiOP return
#> 3: operating profitability Y value BIG HiBM.HiOP return
#> 4: operating profitability Y value BIG HiBM.HiOP return
#> 5: operating profitability Y value BIG HiBM.HiOP return
#> ---
#> 1165820: investment N equal SMALL LoOP.LoINV return
#> 1165821: investment N equal SMALL LoOP.LoINV return
#> 1165822: investment N equal SMALL LoOP.LoINV return
#> 1165823: investment N equal SMALL LoOP.LoINV return
#> 1165824: investment N equal SMALL LoOP.LoINV return
#> period value
#> 1: 197101 18.7986
#> 2: 197102 4.1366
#> 3: 197103 0.6142
#> 4: 197104 0.9330
#> 5: 197105 2.6881
#> ---
#> 1165820: 2016 40.9400
#> 1165821: 2017 29.5600
#> 1165822: 2018 -24.7600
#> 1165823: 2019 27.9800
#> 1165824: 2020 127.9700
The portfolios_industries dataset provides various feature time series for Fama/French industry portfolios (Fama and French 1997).
#> region frequency dividend weights portfolio field period value
#> 1: US month Y value Aero return 197101 20.39
#> 2: US month Y value Aero return 197102 4.36
#> 3: US month Y value Aero return 197103 2.49
#> 4: US month Y value Aero return 197104 6.54
#> 5: US month Y value Aero return 197105 -4.19
#> ---
#> 2416940: US day Y equal Wood return 20210426 1.60
#> 2416941: US day Y equal Wood return 20210427 0.69
#> 2416942: US day Y equal Wood return 20210428 -1.39
#> 2416943: US day Y equal Wood return 20210429 0.58
#> 2416944: US day Y equal Wood return 20210430 -2.28
The factors dataset provides the return (factors) and level (risk free rate) time series for the classic Fama/French asset pricing factors as used in their three-factor (Fama and French 1992, 1993, 1995) and, more recently, five-factor (Fama and French 2015, 2016, 2017) asset pricing models, which are very popular among asset pricing enthusiasts.
#> region frequency factor period value
#> 1: US month CMA 197101 -0.14
#> 2: US month CMA 197102 -0.72
#> 3: US month CMA 197103 -2.69
#> 4: US month CMA 197104 0.72
#> 5: US month CMA 197105 0.30
#> ---
#> 466114: North America year WML 2016 -17.97
#> 466115: North America year WML 2017 5.16
#> 466116: North America year WML 2018 8.96
#> 466117: North America year WML 2019 -0.47
#> 466118: North America year WML 2020 21.95
The variables dataset is a helper dataset that provides details, including construction methods, for the variables used to construct the portfolios and asset pricing factors above.
#> # A tibble: 23 x 3
#> name symbol description
#> <chr> <chr> <chr>
#> 1 market capitaliz… ME Market equity (size) is price times shares outstand…
#> 2 book value BE Book equity is constructed from Compustat data or c…
#> 3 book/market ME/BE The book-to-market ratio used to form portfolios in…
#> 4 operating profit… OP The operating profitability ratio used to form port…
#> 5 investment INV The investment ratio used to form portfolios in Jun…
#> 6 earnings/price E/P Earnings is total earnings before extraordinary ite…
#> 7 cash flow/price CF/P Cashflow is total earnings before extraordinary ite…
#> 8 dividend/price D/P The dividend yield used to form portfolios in June …
#> 9 accruals ACCR AC for June of year t is the change in operating wo…
#> 10 univariate marke… BETA β for June of year t is estimated using the precedi…
#> # … with 13 more rows
Finally, the breakpoints dataset is a helper dataset that provides the time series for the variable breakpoints used to construct the variables that, in turn, allow the construction of the portfolios and asset pricing factors mentioned above.
#> variable frequency percentile period value
#> 1: size month # positive 202104 1142.00
#> 2: size month 5% 202104 191.41
#> 3: size month 10% 202104 469.18
#> 4: size month 15% 202104 689.71
#> 5: size month 20% 202104 1035.99
#> ---
#> 168: pior returns month 80% 202104 126.58
#> 169: pior returns month 85% 202104 149.93
#> 170: pior returns month 90% 202104 182.36
#> 171: pior returns month 95% 202104 245.90
#> 172: pior returns month 100% 202104 3212.74
The factors package gathers various asset pricing research factor time series for convenient consumption in R, with the data pulled directly from the authors’ websites. The current version includes the factor data from Kenneth French, also available in the FFresearch package described above, as well as factor data from Robert F. Stambaugh.
The fama_french dataset provides the return (factors) and level (risk free rate) time series for the classic Fama/French asset pricing factors as used in their three-factor (Fama and French 1992, 1993, 1995) and, more recently, five-factor (Fama and French 2015, 2016, 2017) asset pricing models, which are very popular among asset pricing enthusiasts:
#> region frequency factor period value
#> 1: US month CMA 197101 -0.14
#> 2: US month CMA 197102 -0.72
#> 3: US month CMA 197103 -2.69
#> 4: US month CMA 197104 0.72
#> 5: US month CMA 197105 0.30
#> ---
#> 466114: North America year WML 2016 -17.97
#> 466115: North America year WML 2017 5.16
#> 466116: North America year WML 2018 8.96
#> 466117: North America year WML 2019 -0.47
#> 466118: North America year WML 2020 21.95
The stambaugh dataset provides the return (factors) and level (risk free rate) time series for various research asset pricing factors put together by Robert F. Stambaugh and collaborators including Ľuboš Pástor and Yu Yuan. The factors include traded and non-traded liquidity (Pástor and Stambaugh 2003), as well as market, size and two ‘mispricing’ factors, management and performance (Stambaugh and Yuan 2016):
#> frequency factor period value
#> 1: month non-traded liquidity 196208 0.00426023
#> 2: month non-traded liquidity 196209 0.01172080
#> 3: month non-traded liquidity 196210 -0.07442466
#> 4: month non-traded liquidity 196211 0.02854555
#> 5: month non-traded liquidity 196212 0.01435009
#> ---
#> 72608: month market 201608 0.00520000
#> 72609: month market 201609 0.00270000
#> 72610: month market 201610 -0.02000000
#> 72611: month market 201611 0.04870000
#> 72612: month market 201612 0.01850000
Armstrong, Whit, Dirk Eddelbuettel, and John Laing. 2021. Rblpapi: R Interface to ’Bloomberg’. https://CRAN.R-project.org/package=Rblpapi.
Fama, Eugene F., and Kenneth R. French. 1992. “The Cross-Section of Expected Stock Returns.” The Journal of Finance 47 (2): 427–65. https://doi.org/10.1111/j.1540-6261.1992.tb04398.x.
———. 1993. “Common Risk Factors in the Returns on Stocks and Bonds.” Journal of Financial Economics 33 (1): 3–56. https://doi.org/10.1016/0304-405X(93)90023-5.
———. 1995. “Size and Book-to-Market Factors in Earnings and Returns.” The Journal of Finance 50 (1): 131–55. https://doi.org/10.1111/j.1540-6261.1995.tb05169.x.
———. 1997. “Industry Costs of Equity.” Journal of Financial Economics 43 (2): 153–93. https://doi.org/10.1016/S0304-405X(96)00896-3.
———. 2015. “A Five-Factor Asset Pricing Model.” Journal of Financial Economics 116 (1): 1–22. https://doi.org/10.1016/j.jfineco.2014.10.010.
———. 2016. “Dissecting Anomalies with a Five-Factor Model.” The Review of Financial Studies 29 (1): 69–103. https://doi.org/10.1093/rfs/hhv043.
———. 2017. “International Tests of a Five-Factor Asset Pricing Model.” Journal of Financial Economics 123 (3): 441–63. https://doi.org/10.1016/j.jfineco.2016.11.004.
Pástor, Ľuboš, and Robert F. Stambaugh. 2003. “Liquidity Risk and Expected Stock Returns.” Journal of Political Economy 111 (3): 642–85.
Stambaugh, Robert F., and Yu Yuan. 2016. “Mispricing Factors.” The Review of Financial Studies 30 (4): 1270–1315.