There is an enormous amount of information in the public domain about building models. In fact, once you get into the weeds you'll realize there is too much and in many cases (not all, but many) the very specific way something was done or what framework they used or what hardware configuration they had was just a function of what they have or have experience with etc. One could spend a lifetime just trying to repro olmo's work or a lot of the huggingface stuff....