Bibliography

AM07

Ryan Prescott Adams and David JC MacKay. Bayesian online changepoint detection. arXiv preprint arXiv:0710.3742, 2007.

ALTdJ+23

Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, and Sumit Sanghai. GQA: training generalized multi-query transformer models from multi-head checkpoints. arXiv preprint arXiv:2305.13245, 2023.

BPC20

Iz Beltagy, Matthew E Peters, and Arman Cohan. Longformer: the long-document transformer. arXiv preprint arXiv:2004.05150, 2020.

CGRS19

Rewon Child, Scott Gray, Alec Radford, and Ilya Sutskever. Generating long sequences with sparse transformers. arXiv preprint arXiv:1904.10509, 2019.

DCL21

Yihe Dong, Jean-Baptiste Cordonnier, and Andreas Loukas. Attention is not all you need: pure attention loses rank doubly exponentially with depth. In International Conference on Machine Learning, 2793–2803. PMLR, 2021.

GG16

Yarin Gal and Zoubin Ghahramani. Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In International Conference on Machine Learning, 1050–1059. PMLR, 2016.

GIG17

Yarin Gal, Riashat Islam, and Zoubin Ghahramani. Deep Bayesian active learning with image data. In International Conference on Machine Learning, 1183–1192. PMLR, 2017.

Gal86

Francis Galton. Regression towards mediocrity in hereditary stature. The Journal of the Anthropological Institute of Great Britain and Ireland, 15:246–263, 1886.

Gre03

William H Greene. Econometric analysis. Pearson Education India, 2003.

Gun22

Gregory Gundersen. Bayesian online changepoint detection. 2022. URL: https://gregorygundersen.com/blog/2019/08/13/bocd/.

HZRS16

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 770–778. 2016.

HBL22

Yuzi He, Keith A Burghardt, and Kristina Lerman. Leveraging change point detection to discover natural experiments in data. EPJ Data Science, 11(1):49, 2022.

HHuszarGL11

Neil Houlsby, Ferenc Huszár, Zoubin Ghahramani, and Máté Lengyel. Bayesian active learning for classification and preference learning. arXiv preprint arXiv:1112.5745, 2011.

JSM+23

Albert Q Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, and others. Mistral 7B. arXiv preprint arXiv:2310.06825, 2023.

KC15

Taehoon Kim and Jaesik Choi. Reading documents for Bayesian online change point detection. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1610–1619. 2015.

LXT+18

Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer, and Tom Goldstein. Visualizing the loss landscape of neural nets. Advances in Neural Information Processing Systems, 2018.

LWLQ22

Tianyang Lin, Yuxin Wang, Xiangyang Liu, and Xipeng Qiu. A survey of transformers. AI Open, 2022.

Mur12

Kevin P Murphy. Machine learning: a probabilistic perspective. MIT Press, 2012.

NW87

Whitney K Newey and Kenneth D West. Hypothesis testing with efficient method of moments estimation. International Economic Review, pages 777–787, 1987.

PY09

Pierre Perron and Tomoyoshi Yabu. Testing for shifts in trend with an integrated or stationary noise component. Journal of Business & Economic Statistics, 27(3):369–396, 2009.

SAL+23

Jianlin Su, Murtadha Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. RoFormer: enhanced transformer with rotary position embedding. Neurocomputing, page 127063, 2023.

VSP+17

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 2017.

Whi14

Halbert White. Asymptotic theory for econometricians. Academic Press, 2014.

Woo10

Jeffrey M Wooldridge. Econometric analysis of cross section and panel data. MIT Press, 2010.

Woo15

Jeffrey M Wooldridge. Introductory econometrics: a modern approach. Cengage Learning, 2015.

ZS19

Biao Zhang and Rico Sennrich. Root mean square layer normalization. Advances in Neural Information Processing Systems, 2019.

ZHZ+20

Bin Zuo, Zhaolu Hou, Fei Zheng, Lifang Sheng, Yang Gao, and Jianping Li. Robustness assessment of the RSD t-test for detecting trend turning in a time series. Earth and Space Science, 7(5):e2019EA001042, 2020.

ZLSZ19

Bin Zuo, Jianping Li, Cheng Sun, and Xin Zhou. A new statistical method for detecting trend turning. Theoretical and Applied Climatology, 138(1):201–213, 2019.