Baseball Sim Methodology
Data sources
Baseball Sim is powered by imported Lahman database tables and app-generated simulation results.
Historical data comes from Lahman
The app imports historical baseball data from the Lahman database. Lahman provides the player, batting, pitching, fielding, team, park, home-game, and appearance tables that power the historical stats browser and roster builder.
Those tables give the simulator the raw material it needs: player identity, handedness, seasonal batting lines, pitching lines, fielding records by position, team-season totals, ballpark information, and games played by position. Baseball Sim turns that imported data into search pages, roster candidates, player ratings, team summaries, and park context.
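As a rough illustration of the import step, the sketch below reads Lahman-style CSV text and converts empty cells to None so that later steps can tell a missing value from a real zero. The function name `read_lahman_rows` and the sample rows are hypothetical, not taken from the app. The column names (`playerID`, `yearID`, `stint`, `AB`, `H`, `SF`) follow the Lahman batting table.

```python
import csv
import io


def read_lahman_rows(csv_text, int_columns):
    """Parse CSV text into dicts, turning blank numeric cells into None.

    Hypothetical helper: the app's real import code is not shown in this page.
    """
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = dict(raw)
        for col in int_columns:
            value = row.get(col, "")
            # A blank cell means "not recorded", which is different from 0.
            row[col] = int(value) if value not in ("", None) else None
        rows.append(row)
    return rows


# Illustrative data only; "example01" is not a real player ID.
sample = (
    "playerID,yearID,stint,AB,H,SF\n"
    "example01,1905,1,400,120,\n"
    "example01,1960,1,500,150,4\n"
)
rows = read_lahman_rows(sample, int_columns=["yearID", "stint", "AB", "H", "SF"])
```

Keeping missing values as None, rather than coercing them to zero at import time, leaves the decision about fallbacks to the rating step.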
Ratings are derived from season records
When the simulator needs a player, it uses the specific season version selected for that player. The same player can rate differently across seasons because the inputs are different. A young speed-focused season and a later power-focused season are treated as distinct versions.
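One way to picture season versions is as records keyed by player and year. The sketch below is an assumption about how that might look, not the app's actual data model: it sums Lahman batting rows (which can include several stints per season) into one entry per `(playerID, yearID)` pair. The function name `season_versions` and the sample numbers are hypothetical.

```python
from collections import defaultdict


def season_versions(batting_rows, stat_columns=("SB", "HR")):
    """Combine a player's stints into one record per (playerID, yearID)."""
    totals = defaultdict(lambda: {col: 0 for col in stat_columns})
    for row in batting_rows:
        key = (row["playerID"], row["yearID"])
        for col in stat_columns:
            totals[key][col] += row[col]
    return dict(totals)


# Illustrative data only: a speed-focused early season split across two
# stints, and a later power-focused season for the same player.
rows = [
    {"playerID": "example01", "yearID": 1990, "stint": 1, "SB": 20, "HR": 2},
    {"playerID": "example01", "yearID": 1990, "stint": 2, "SB": 15, "HR": 1},
    {"playerID": "example01", "yearID": 1998, "stint": 1, "SB": 3, "HR": 30},
]
versions = season_versions(rows)
# versions[("example01", 1990)] and versions[("example01", 1998)] are
# separate entries, so each can be rated on its own inputs.
```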
Rate stats are recalculated from their underlying components rather than copied from display tables. Where data from early seasons is missing, the rating system uses conservative fallbacks based on the stats that are available and on league context. The public pages show historical stats; the hidden ratings convert those stats into game behavior.
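A minimal sketch of recalculating rate stats from components, using the standard definitions of batting average, on-base percentage, and slugging percentage. The function name `rate_stats` is hypothetical, and the fallback shown here (treating an unrecorded hit-by-pitch or sacrifice-fly count as zero) is a simplifying assumption; the simulator's actual fallbacks also draw on league context and are not specified in this page.

```python
def rate_stats(ab, h, doubles, triples, hr, bb, hbp=None, sf=None):
    """Recompute AVG, OBP, and SLG from counting stats.

    hbp and sf may be None when the source season did not record them.
    """
    # Assumed fallback for this sketch: an unrecorded component counts as zero.
    hbp = 0 if hbp is None else hbp
    sf = 0 if sf is None else sf

    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    on_base_chances = ab + bb + hbp + sf

    return {
        "avg": round(h / ab, 3) if ab else 0.0,
        "obp": round((h + bb + hbp) / on_base_chances, 3) if on_base_chances else 0.0,
        "slg": round(total_bases / ab, 3) if ab else 0.0,
    }


# Illustrative inputs only, with and without the optional components.
full = rate_stats(ab=500, h=150, doubles=30, triples=5, hr=20, bb=50, hbp=5, sf=5)
sparse = rate_stats(ab=500, h=150, doubles=30, triples=5, hr=20, bb=50)
```

Because the function works from counting stats, the same code path handles complete modern seasons and sparse early ones; only the fallback branch differs.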
Simulated records are app data
The app also stores results created by Baseball Sim itself. Game, series, season, challenge, and record pages are based on simulated outcomes, not historical MLB results. That separation matters: historical pages explain what actually happened in baseball history, while records pages track what has happened inside this simulator.
As the simulator changes, generated results and all-time records may need to be rebuilt or revalidated. The project keeps tuning notes and reference matchup checks so changes to the engine can be compared against known examples.
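A reference matchup check could be as simple as comparing fresh results against stored expectations and flagging anything that drifts too far. The sketch below assumes each matchup is summarized by a single win rate and that a fixed tolerance is acceptable; the function name `compare_to_reference`, the matchup labels, and the numbers are all hypothetical.

```python
def compare_to_reference(results, reference, tolerance=0.02):
    """Return the reference matchups whose new result is missing or drifted."""
    drifted = []
    for matchup, expected in reference.items():
        observed = results.get(matchup)
        if observed is None or abs(observed - expected) > tolerance:
            drifted.append(matchup)
    return drifted


# Illustrative values only: expected win rates recorded before an engine
# change, and the rates observed after it.
reference = {"Team A vs Team B": 0.540, "Team C vs Team D": 0.480}
results = {"Team A vs Team B": 0.551, "Team C vs Team D": 0.430}
needs_review = compare_to_reference(results, reference)
```

A non-empty list does not necessarily mean the engine change is wrong; it marks the matchups whose stored results and records are worth rebuilding or revalidating.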